Comment on wcd-2021-51

This paper studies and contrasts impacts of increasing atmospheric model resolution vs increasing the resolution of orography on the NH wintertime mid-latitude atmospheric dynamics in a general circulation model. As climate modelling moves towards higher and higher resolution, studies like this are important and interesting to the community. The simulations are appropriate and the analysis is clear and well presented, with key implications for future research. I am keen to see this paper published in Weather and Climate Dynamics. I have a few more significant comments that should be addressed prior to publication, followed by some minor suggestions.

Major Comments
Major Comment 1. I think you need to justify more clearly, and discuss in more detail the implications of, your choice to turn off the subgrid-scale parameterizations, since this is not representative of resolution changes in climate models. Part of the discussion of implications should include the fact that this study almost certainly overestimates the importance of orographic resolution, because with the parameterizations turned off:
- Zonal winds are overestimated, and therefore the role of orography is overestimated.
- Parameterizations are designed to fill the unresolved gap, so they should be, by definition, less important at higher resolutions.
The true importance of orographic resolution in climate models lies somewhere between your results and the case of perfect parameterizations (which they are not!), in which there should be no difference between high- and low-resolution orography.
If feasible I also highly recommend, as in Kanehama et al. (2019), that you repeat one of your high resolution experiments with the subgrid scale parameterizations on, to confirm that your main conclusions hold true even when the parameterizations are included (though for the above reasons I would expect the improvement with increased orographic resolution to be reduced).
Major Comment 2. I think the last section could be restructured to make your conclusion clearer and more concise. I agree with your conclusion that re-tuning high-resolution models is important, but I have a couple of concerns:
- I'm not convinced you have shown the link to the biases in atmospheric dynamics sufficiently clearly. This requires more analysis, or perhaps emphasis on the differences between TL511_orog255 and TL799_orog255 in Figure 6 (including analysis to confirm that the differences are statistically significant, see Major Comment 3), to illustrate the impact of the radiative budget biases on the dynamics, which is the focus of this paper (as stated on line 68).
- You have shown the radiative budget impact clearly, but your introduction is missing a lot of the literature on this if the radiative budget is to be your focus (e.g. https://link.springer.com/article/10.1007/s00382-018-4547-y, https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2019JD032184, https://www.jstage.jst.go.jp/article/jmsj/98/1/98_2020-005/_article, https://journals.ametsoc.org/view/journals/bams/98/3/bams-d-15-00135.1.xml to list a few recent papers).
Alternatively (the option I would recommend), you could limit your analysis more to the dynamics, spend less time discussing the exact differences in the radiative budget, and summarize these results and the importance of tuning in one or two paragraphs (including reference to previous papers that make this argument too), particularly for the dynamics (the point you are making). After a strong focus on dynamics in the rest of the paper, I was a little surprised by the extensive analysis of the radiative budget in section 6 until I read the discussion section. I agree with your assessment that this is likely related to the lack of tuning for the higher resolutions.
I recommend motivating this analysis a bit more with some of the discussion in lines 395-396 about typical high-resolution initiatives lacking tuning for the high-resolution version, either at the beginning of section 6 or in the introduction.

Major Comment 3.
Statistical significance: some of your changes are quite small, e.g. the changes in zonal wind in Fig. 2 and the differences between TL511_orog255 and TL799_orog255 in Figure 6. Could you give an estimate of the fractional change and/or the statistical significance of these changes, perhaps based on assuming independence between consecutive winters (a reasonable assumption for climatological fixed SSTs)?
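As an illustration of the kind of check I have in mind, here is a minimal Python sketch, assuming each winter mean can be treated as an independent sample. It computes the fractional change, a Welch t statistic, and a permutation-test p-value for the difference between two experiments. The function names and the sample values are hypothetical and are not taken from the manuscript. The permutation test is simply my choice for the sketch, and other standard tests would be equally acceptable.

```python
import random
from statistics import mean, variance


def fractional_change(a, b):
    """Change in the mean of sample a relative to the mean of sample b."""
    return (mean(a) - mean(b)) / abs(mean(b))


def welch_t(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom.

    Assumes the values within each sample (one per winter) are independent.
    """
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / (va + vb) ** 0.5
    dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, dof


def permutation_p_value(a, b, n_perm=10000, seed=0):
    """Two-sided p-value for the difference in means.

    Winters are shuffled between the two experiments, which is only valid
    if consecutive winters are independent (the assumption stated above).
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical winter-mean zonal winds (m/s), one value per winter.
    exp_a = [21.3, 20.8, 22.1, 21.7, 20.9, 21.5, 22.0, 21.1]
    exp_b = [20.6, 20.9, 21.2, 20.4, 21.0, 20.7, 21.3, 20.5]
    print("fractional change:", fractional_change(exp_a, exp_b))
    print("Welch t, dof:", welch_t(exp_a, exp_b))
    print("permutation p:", permutation_p_value(exp_a, exp_b))
```

Applied at each grid point or latitude, a test of this kind would allow the significant differences to be stippled or otherwise marked in the figures.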

Minor comments
Line 72: Sections 3 and 4 present analysis of the mean climate and of the mid-latitude variability respectively.
Line 135: Define GHGS and GHGN to make it easier for readers (e.g. geopotential height gradient at a southern (GHGS) and northern (GHGN) latitude).
Line 227: You have never explicitly mentioned that the increase in resolution will result in an increase in the maximum heights of mountains; this would be good to add (https://doi.org/10.1029/2020AV000343 might be an interesting paper to reference).
Lines 245-255: The improvement of the tri-modal distribution with increased orographic resolution is perhaps consistent with this paper: https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2019GL084780
Figures 3 and 6: It took me a while to notice that there were colour groups for the high- and lower-resolution topography. This is useful, and I would recommend informing the reader you have done this, and perhaps making it clearer by using different line styles to differentiate between the high- and low-resolution topography versions. This would emphasise the benefit of increasing orographic resolution. Also, please consider colour-blind readers: asking readers to distinguish between red and green is not the best for those readers (e.g. https://www.ascb.org/science-news/how-to-make-scientific-figures-accessible-to-readerswith-color-blindness/).
Line 371: As you have pointed out the lack of tuning for the higher-resolution experiments, which will particularly affect precipitation and cloud cover, I think it is hard to make robust conclusions about the different basins.
Line 377: Suggested re-wording: "the deterioration of the radiative budget counteracts any potential improvements provided by the refinement of the atmospheric grid". The original wording, "most of", implies that there is a definite improvement in the dynamics with the increased atmospheric resolution, which I don't think you have shown, as you don't have a re-tuned simulation.
Line 397: Suggested re-wording: potential mechanism responsible …. is increased in our experiments.