Comment on wcd-2021-61

This study examines the effects of stochastic parameterizations on the link between November sea ice and the winter NAO in historical coupled model simulations. The authors find that in the control simulations the connection is weak and opposite to what is seen in observations. When they add the stochastic parameterizations to the ocean and sea ice model components, the connection between the November sea ice and winter NAO switches sign and becomes closer to the observed value. They attribute the differences to improved ice-ocean-atmosphere coupling. I think that this is an interesting study with potentially important results worthy of publication. However, I do have a number of issues that need to be addressed, so my recommendation is for major revisions.

Major comments:
1. The use of different sea ice regions for the model and observations is problematic. The authors have correlated the NAO with sea ice concentration at all gridpoints and cherry-picked the regions with the largest correlations (which differ between the model and observations). Given the weak correlations combined with large internal variability, there is a good chance that internal variability is contributing to the regions with the highest correlations. This means all the subsequent analysis and discussion about statistical significance is not reliable, because the region was not selected a priori. The authors should use the Barents-Kara (BK) Sea for both the observed and model correlations. I don't even think this will have that large an effect on the analysis and conclusions, because there are clearly differences in correlations over just the BK Sea (Figure 4).
The justification for this is not at all convincing. The authors claim that because models have different biases, the regions with the most sea ice variability differ across models and the real world. However, the sea ice in the BK Sea in OCE does not look that different from ERA5, so I don't see why they cannot use the same region. The leading EOF in ERA5 looks very similar around the BK region (Figure 2). I can see maybe shifting the regions slightly to account for biases (e.g. if the model ice edge is 1° too far south, shift the region definition 1° to the south), but using a very different region is not justifiable and introduces additional issues.
2. The model the authors use may be an outlier, and the results may not be that relevant to other models. This is very briefly mentioned in the discussion, but I think there are reasons to think this may not work as well in other models. Most models tend to have a weak connection between reduced sea ice and a negative NAO. In addition, as mentioned in the introduction, model experiments forced with reduced sea ice also tend to show a weak negative NAO response. However, the control model used here shows the opposite sign of correlation compared to most models, and a previous study (Ringgaard et al. 2020, doi:10.1007/s00382-020-05174-w) shows that a version of this model has no NAO response to reduced sea ice in the BK Sea. In addition, the improved correlations in the OCE version are still weak. Could it not be the case that OCE is just correcting the flaws in this particular model, which brings it more in line with other models? This would then mean that applying the same methods in other models may not have as large an effect.
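The concern in major comment 1 about choosing the region a posteriori can be illustrated with a minimal synthetic sketch. It does not use the authors' data or method: the function name, the 40-year record length, the 200 independent gridpoints and the white-noise fields are all assumptions made for illustration. Real sea ice fields are spatially correlated, so the inflation would be smaller in practice than in this idealized case.

```python
import numpy as np

def mean_abs_correlation(n_years=40, n_gridpoints=200, n_trials=500, seed=0):
    """Compare |r| at a fixed gridpoint with |r| at the gridpoint chosen
    for having the largest correlation, when no true link exists.

    All inputs are hypothetical: the NAO index and the sea ice field are
    independent white noise, so every nonzero correlation is spurious.
    """
    rng = np.random.default_rng(seed)
    fixed, picked = [], []
    for _ in range(n_trials):
        nao = rng.standard_normal(n_years)
        ice = rng.standard_normal((n_years, n_gridpoints))
        # Standardize so that the mean product equals the Pearson correlation.
        nao_std = (nao - nao.mean()) / nao.std()
        ice_std = (ice - ice.mean(axis=0)) / ice.std(axis=0)
        r = nao_std @ ice_std / n_years  # one correlation per gridpoint
        fixed.append(abs(r[0]))          # region chosen a priori
        picked.append(np.abs(r).max())   # region chosen after seeing r
    return float(np.mean(fixed)), float(np.mean(picked))

fixed_r, picked_r = mean_abs_correlation()
print(f"a priori region: {fixed_r:.2f}, a posteriori region: {picked_r:.2f}")
```

In this idealized setting the a posteriori region shows a much larger correlation than the fixed one even though there is no signal at all, which is why significance tests applied after the selection are hard to interpret.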
3. The authors claim that mean state changes cannot explain the differences, but I don't find their arguments that convincing. They argue that the AMIP ensemble with prescribed SSTs and sea ice shows weak correlations. First of all, taking the correlations of the AMIP ensemble at face value would suggest that close to half of the difference can be explained by the mean state. Second, there are many other differences related to the coupling of sea ice and SSTs that could cancel out the improvements made by correcting the mean state biases in the AMIP experiments. It is likely that the improved mean state explains at least some of the differences, and it can't be ruled out that it is the entire explanation.
4. The authors conclude that the link between sea ice and the NAO is stronger because of improved ice-ocean-atmosphere coupling. This is a bit vague and could be investigated a little further. What about the coupling is actually being improved? Because the authors argue that coupling on short timescales can explain the difference, there could be a lot of value in doing an analysis similar to that in Figure 7, but with other variables. For example, does the OCE ensemble have a stronger upward heat flux and temperature response following reduced sea ice?
5. The title and abstract need to be more specific. Many different links between the Arctic and the midlatitudes have been hypothesized via a number of different mechanisms. It is misleading to refer to Arctic-midlatitude links very generally when the authors have only investigated one specific link, between November Barents-Kara sea ice and the winter NAO in interannual variability. Even for this correlation, the authors have only looked at one mechanism (they have not investigated the stratospheric mechanism).
L30-42: Somewhere in this discussion it should be mentioned that the observed correlation seems to be highly intermittent when looking at the much longer record (Kolstad and Screen 2019, doi:10.1029/2019GL083059). In the middle of the 20th century, the sign of the connection appears to be opposite to that of the recent period.
L38-42: This is not an accurate description of Blackport et al. 2019. That study has nothing to do with the connection between November BK sea ice and the winter NAO and is not that relevant for this study. A much more relevant study, which argues that the correlation between November BK sea ice and the winter NAO may not be causal, is Peings 2019 (doi:10.1029/2019GL082097).
L41: Warner et al. 2020 do not suggest tropical forcing as a common driver of sea ice and the NAO. They did suggest this may be the case for other aspects of the midlatitude circulation, but not the NAO.
L198-207/Figure 1: The main takeaway from this is that OCE reduces the sea ice everywhere. The changes in variability are also entirely consistent with just a reduction in sea ice extent everywhere.
Figures 1 and 3: I think that it would be more useful to show plots for OCE-ERA5 as well, to make the improvements easier to see.
L307: I don't think any study, including Koenigk and Brodeau (2017), states that the observed signal is spurious. This study, and others like it, express caution that it could be. There is a lot of internal variability, and spurious signals can arise in model simulations of similar length to the observed record even when there is no signal, or only a weak one, overall. It is also the case that the recent observed correlation appears to be unusually high compared to the longer record (Kolstad and Screen 2019).
Figure 5a: The fact that all simulations start off with higher correlations than over the whole period intrigues me. Because all simulations start from the same ocean state, is it possible that they happened to be initialized in a particular state of low-frequency variability that contributes to a stronger correlation?
L317-319: I don't understand why that would suggest it is coincidental. You wouldn't be able to rule it out, but that is very different from suggesting that it is.
L322-323: Is it actually the case that the correlation in each 30-year period is statistically different from 0? I doubt that this is the case, given that some 30-year periods show correlations close to 0.
L328: How often do they attain correlations that exceed the observed correlation?
Figure 5b: I think it is misleading to plot it this way, because the overlapping 30-year periods are obviously not independent. There are really only about 6 independent data points in the OCE distribution. I don't doubt that the differences are statistically significant, but this plot likely exaggerates the perceived significance.
L350-352: Isn't it more relevant to know whether or not these correlations are statistically different from the correlations in OCE or CTRL?
L360-368: The regression of November zg500 on November sea ice is likely not the response to the sea ice anomalies (at least not entirely). Instead, a large part of it is the atmospheric circulation that forces the sea ice anomalies. The sign of the NAO is opposite to what would be expected if it were the response, unless the authors are arguing that the initial response to reduced sea ice is a positive NAO, but that contradicts what is shown in Figure 7.
L425-456/Figure 9: I am not sure I understand the point of this analysis. The authors have already established the feedback between sea ice and the NAO, so I don't see how the NAO forcing of the sea ice could explain the difference between OCE and CTRL. There could potentially be a stratospheric pathway where there are causality issues, as suggested by Peings 2019, but the authors have effectively argued against this being the reason for the improvement by showing that the difference can be entirely explained by the daily coupling. The authors should explain the motivation for this analysis more clearly, or remove it.
L463: Figure 9 -> Figure 10
L516: How would the varying model biases contribute to the inconsistencies within long simulations from a single model? Note that there also appear to be large inconsistencies between short periods in observations (Kolstad and Screen 2019).
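The point about Figure 5b, that overlapping 30-year windows are not independent samples, can be sketched with synthetic data. This is only an illustration under stated assumptions: the 180-year record length, the white-noise series and the helper name `running_correlation` are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def running_correlation(x, y, window=30):
    """Pearson correlation of x and y in every overlapping window."""
    n_windows = len(x) - window + 1
    return np.array([
        np.corrcoef(x[i:i + window], y[i:i + window])[0, 1]
        for i in range(n_windows)
    ])

rng = np.random.default_rng(1)
n_years = 180                       # hypothetical record length
ice = rng.standard_normal(n_years)  # stand-in for a November sea ice index
nao = rng.standard_normal(n_years)  # stand-in for a winter NAO index

r_all = running_correlation(ice, nao)  # one value per overlapping window
r_independent = r_all[::30]            # windows that share no years
# Neighbouring windows share 29 of 30 years, so their correlations
# are themselves strongly correlated.
lag1 = np.corrcoef(r_all[:-1], r_all[1:])[0, 1]

print(len(r_all), len(r_independent), round(float(lag1), 2))
```

A histogram built from `r_all` looks far better sampled than one built from `r_independent`, although both come from the same record. Plotting or testing only the non-overlapping windows gives a more honest impression of the sample size.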
- What do the trends in the NAO look like? If the improved correlations represent a response to sea ice loss, it may be expected that there are more negative NAO trends in the OCE simulations. This could have implications for the midlatitude response to sea ice loss and global warming, not only for seasonal predictions. This may be a bit beyond the scope of the study, and a larger ensemble may be needed to find robust differences, but it would be really simple to check.