I appreciate the authors’ efforts in addressing my comments. However, I still have major reservations, as outlined below.
The line numbers and sections refer to the revised manuscript.
After my first review, I came across Dorrington and Strommen 2020 [doi:10.1029/2020GL087907, hereafter DS2020] and Dorrington et al. 2021 [doi:10.5194/wcd-2021-71], which answered some of my questions about WTD.
Fitness of WTD: DS2020 found that the raw Z500 phase space (before removing the jet speed variability) is quite Gaussian (DS2020’s Fig 1a). In other words, the clustering is not very clear cut. DS2020 found the standard/classical k=4 clustering to be problematic: “When setting K=4 in our residual space, we found K-means clustering consistently returns Clusters 3, 4, and 5 of Figure 4 but in different 30-year windows switches between including Clusters 1 and 2” (quoting from the last paragraph of section 3.4 in DS2020). Such decadal variability in the cluster centroids was considered not a real signal, but an artifact of the choice of k=4.
Interpretation of WTD: Back to your work, your response to my previous comment 1c did not address my questions. (Maybe I was not clear enough.) I still cannot understand what is different in the *raw model output* that causes the differences in the WTD analysis. Part of my concern relates to your choice to allow the cluster centroids to differ, and to the fact that the clustering is not clear cut. DS2020 made the clustering clear cut by removing the jet speed variability, and they worked towards requiring “dynamically relevant regimes to be approximately stationary features of the midlatitude circulation over centennial time scales, at least in terms of spatial patterns if not in residence times or transition probabilities” (quoting from section 3.2 in DS2020).
When you find the “cluster centroid” to be different in some models, could it be that the models behave like the different 30-year windows in the DS2020 example? And could that difference be exaggerated by the poor choice of k=4?
On the interpretation of “blocking frequency” and “blocking center”, I would like to see the output of the WTD clustering (the weather types and their frequencies) as an intermediate step before the “blocking frequency” and “blocking center”. I think that would help readers interpret the results. More fundamentally, I am looking for more clear-cut clustering, or for the use of the same cluster centroids. That way, I can better trust that the WTD clustering faithfully summarizes the Z500 variability and does not exaggerate noise.
2. Insignificant results
Related to my previous comment 1d. Now, after taking away results using DeltaZ500_HIST, most results are not statistically significant. If you cannot exclude that non-clear-cut clustering might have a role in making the results insignificant, I think you should acknowledge that in your manuscript.
While I agree that “Finding not statistically significant changes is a result itself”, I encourage the authors to cite papers whose results contrast with those here. Huguenin et al. 2020 [doi:10.1029/2019GL086132] also used a form of circulation type classification and found a lack of change in frequency and persistence, but they cited papers that contrast with their results. You may also read Kautz et al. 2021 [doi:10.5194/wcd-2021-56] and Nabizadeh et al. 2021 [doi:10.1175/JCLI-D-21-0141.1], where you may find some discussion of blocking under climate change.
3. Line 14-15: You may also refer to Kautz et al. 2021 [doi:10.5194/wcd-2021-56].
4. Line 124: Consider changing “net impact” to “total impact” or “gross impact”.
5. Section 3.3: Your response to my previous comment 10 helps a bit, but the revised manuscript is still not clear on the treatment of holes.
Line 128. “longer than five days” -> “at least five days long”
Line 129. Remove “and separated by at least two non-blocking days”, because this is not true for the 2nd example I gave last time (001110101011100).
Line 129. Consider changing “is assumed to represent” -> “might represent” to soften this sentence, because it is not true for the two examples I gave last time.
Line 130. Consider changing “Therefore” to “Concretely”, because this sentence describes what the code does, not only examples.
Below are more subtle details. One option is to reply here and, in the manuscript, refer to this discussion.
If you are doing find-and-replace in place, the order in which (11011, 11101, 10111) are searched matters, e.g., for 00110110100 and 001110101100. It would be good to state the search order explicitly.
Overlapping matches can trip up code, e.g., 110111011 contains two overlapping instances of 11011. Does your code find one or two matches of 11011? What will your code say about 0011011101110100?
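To make the two points above concrete, here is a minimal Python sketch. It assumes that the hole filling is a single-pass, in-place find-and-replace in which each pattern is replaced by 11111 using non-overlapping, left-to-right matching (the behaviour of Python's `str.replace`). This is my assumption for illustration and not necessarily what your code does. The names `fill_holes_single_pass`, `count_matches`, `ORDER_A`, and `ORDER_B` are hypothetical.

```python
import re

# Two possible search orders for the three hole patterns (assumed for illustration).
ORDER_A = ("11011", "11101", "10111")
ORDER_B = ("10111", "11101", "11011")


def fill_holes_single_pass(days, patterns):
    """Replace each pattern once, in the given order, by a run of 1s.

    str.replace scans left to right and never revisits text it has
    already replaced, so later patterns see the output of earlier ones.
    """
    for pattern in patterns:
        days = days.replace(pattern, "1" * len(pattern))
    return days


def count_matches(days, pattern, overlapping):
    """Count occurrences of pattern, with or without overlap."""
    escaped = re.escape(pattern)
    # A zero-width lookahead lets the regex engine report overlapping hits.
    regex = f"(?=({escaped}))" if overlapping else escaped
    return len(re.findall(regex, days))


if __name__ == "__main__":
    for days in ("00110110100", "001110101100"):
        print(days,
              fill_holes_single_pass(days, ORDER_A),
              fill_holes_single_pass(days, ORDER_B))
    print(count_matches("110111011", "11011", overlapping=False),
          count_matches("110111011", "11011", overlapping=True))
```

Under these assumptions, the two search orders give different filled sequences for both example strings, and a non-overlapping search reports one match of 11011 in 110111011 where an overlapping search reports two.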
6. Fig. 1 caption: Related to my previous comment 15, how about you say in the caption that it is the CRMSD, not RMSD?
7. Supplement Step A: Related to my previous comment 23, can more than one blob contain a DG-grid box? If so, how are they treated?