Articles | Volume 6, issue 4
https://doi.org/10.5194/wcd-6-1661-2025
© Author(s) 2025. This work is distributed under the Creative Commons Attribution 4.0 License.
Signal, noise and skill in sub-seasonal forecasts: the role of teleconnections
Download
- Final revised paper (published on 04 Dec 2025)
- Supplement to the final revised paper
- Preprint (discussion started on 30 Jun 2025)
- Supplement to the preprint
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on egusphere-2025-2556', Anonymous Referee #1, 03 Aug 2025
- RC2: 'Comment on egusphere-2025-2556', Anonymous Referee #2, 15 Aug 2025
- AC1: 'Authors' responses to Referees comments', Alexey Karpechko, 01 Sep 2025
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
AR by Alexey Karpechko on behalf of the Authors (15 Sep 2025)
Author's response
Author's tracked changes
Manuscript
ED: Referee Nomination & Report Request started (24 Sep 2025) by Gwendal Rivière
RR by Anonymous Referee #1 (04 Oct 2025)
RR by Anonymous Referee #2 (17 Oct 2025)
ED: Publish subject to technical corrections (03 Nov 2025) by Gwendal Rivière
AR by Alexey Karpechko on behalf of the Authors (11 Nov 2025)
Review of "Signal, noise and skill in sub-seasonal forecasts: the role of teleconnections" by Karpechko et al.
This study uses a set of ensemble relaxation experiments to explore the relationship between tropical and stratospheric teleconnections, forecast skill, and signal-to-noise relationships. Relaxing either the tropics or the stratosphere increases the forecast skill for SLP, and to a lesser degree for T2m and precipitation, in many regions; these effects are mostly consistent with previous work. The novel part is that the study then tries to diagnose whether the increase is associated with a signal in the ensemble mean, with a reduction in the ensemble spread, or both. While in many regions the answer is "both", there are numerous exceptions (including the Northern Europe SLP response to stratospheric nudging, where the ensemble mean signal is weak and most of the skill increase comes from a reduction in ensemble spread). The authors then diagnose how large an ensemble is needed before it is possible to reliably extract signal from noise, and find that larger ensembles than are used in this study would be needed to identify sub-seasonal predictability; this last part is where I think the study could be improved the most.
Overall, the required revisions could be relatively minor if the authors decide to tone down the statements I found most objectionable, or more major if they disagree with my assessment and provide additional evidence supporting their statements. Either way, revisions are needed before I consider the final version.
There are three major comments that are somewhat related to one another and concern how to interpret the signal-to-noise metrics presented in this paper:
1a. As alluded to above, I think the conclusions drawn from the analysis on the minimal ensemble size are likely overstated. I am particularly bothered by lines 23-24 in the abstract and 71-73 in the introduction. The discussion section (lines 513-518) is a little more careful, but even there I think the wording can be refined.
The minimal ensemble size derived in this paper is valid for the S2N and perfect-model definitions used here. But there are other ways of extracting subseasonal signals from forecast ensembles, and skill can be demonstrated from much smaller ensembles in many situations.
Using long hindcasts, we can extract teleconnection signals from the tropics using <5 ensemble members (e.g. Stan et al. 2022). Domeisen et al. 2020 (already cited) also showed that <5 members are enough to extract signals from the stratosphere for many models. Both of these studies use long hindcasts from several models and demonstrate some skill at representing teleconnections using far fewer members, even though the skill will of course increase as ensemble sizes increase. I think the authors' results demonstrate that signal exceeds noise only for ensemble sizes larger than 20, and such a signal-to-noise analysis is essential for deciding on the ensemble size of real-time operational forecasts. But real-time forecasts use 50 members or more, at least for IFS, so it would seem that operational forecasts are already large enough to extract signals in most regions. Rewording the text in the three locations noted above should be enough to resolve this issue, unless the authors disagree with me, in which case additional work is needed.
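To illustrate the point about small ensembles, here is a minimal toy sketch. It is not the authors' equations 12 and 13: it assumes an idealized perfect-model setting in which truth and every member share one Gaussian signal and have independent unit-variance noise. The function names and the signal-to-noise variance ratio of 0.1 are hypothetical choices for illustration only. Under these assumptions the ensemble-mean correlation with truth is nonzero even for 5 members and grows with ensemble size.

```python
import numpy as np


def expected_em_correlation(s2n, n_members):
    """Expected correlation between an n-member ensemble mean and truth.

    Toy perfect-model assumption: s2n = signal variance / noise variance,
    with the noise variance set to 1.
    """
    signal_var = s2n
    noise_var = 1.0
    em_var = signal_var + noise_var / n_members  # noise is averaged down in the mean
    truth_var = signal_var + noise_var           # truth behaves like a single member
    return signal_var / np.sqrt(em_var * truth_var)


def simulated_em_correlation(s2n, n_members, n_forecasts=20000, seed=0):
    """Monte Carlo check of the formula above with synthetic Gaussian data."""
    rng = np.random.default_rng(seed)
    signal = rng.normal(0.0, np.sqrt(s2n), size=n_forecasts)
    truth = signal + rng.normal(size=n_forecasts)
    members = signal[:, None] + rng.normal(size=(n_forecasts, n_members))
    return np.corrcoef(members.mean(axis=1), truth)[0, 1]


if __name__ == "__main__":
    for n in (5, 20, 50):
        print(n, expected_em_correlation(0.1, n), simulated_em_correlation(0.1, n))
```

In this toy setting the correlation rises monotonically with ensemble size toward a finite limit, so a threshold on ensemble size depends on which criterion (here correlation, in the manuscript S2N) is used to define "extractable" signal.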
1b. A related issue is that equations 12 and 13 work in the limit that CTRL has no skill. If I understand equations 12 and 13 correctly, the residual skill in CTRL in weeks 5-6 will lead to an overestimate of the minimal ensemble size, because sigma^2 in CTRL is nonzero. Is there a way to account for this effect in the derivation of equation 13, or at least to quantify how important this effect might be?
1c. An alternative way of thinking about "perfect model" and signal to noise is the ratio of predictable components (RPC) from Smith and Scaife 2018 (already cited). This definition seems to be more robust to ensemble size, and can identify S2N issues with relatively small ensembles (see figure 1 of Smith and Scaife and figure S17 of Garfinkel et al. 2024; already cited), though bigger ensemble sizes certainly help. I hate to add yet another metric to this already comprehensive paper, but I think the authors need to compute RPC if they really think their statements in the three locations outlined above are correct. Otherwise, the statements in the abstract and at the end of the discussion about minimum ensemble size need to be restricted explicitly to the particular method of ascertaining signal to noise used here. On a related note, it isn't clear to me whether the RPC and S2N metrics are actually the same thing, or even closely related, despite the fact that they both use similar terminology; hence the closing paragraph on lines 535-540 seems overly speculative at the moment.
(Given that STRAT nudging increases skill in Northern Europe despite not increasing EM variability, I strongly suspect there is an RPC>1 issue in this region. This is likely to be similar to the RPC>1 issue shown by Garfinkel et al. 2024 for this model in polar cap height.)
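For concreteness, a minimal sketch of how an RPC estimate could be computed from a hindcast array is given below. It assumes the commonly used form, namely the correlation between the ensemble mean and observations divided by the square root of the ratio of ensemble-mean variance to the average variance of individual members; whether this matches the exact definition in the cited papers should be checked. The function name `rpc`, the array layout and the synthetic data are hypothetical and serve only to illustrate the calculation.

```python
import numpy as np


def rpc(members, obs):
    """Estimate the ratio of predictable components.

    members: array of shape (n_forecasts, n_members), anomalies
    obs:     array of shape (n_forecasts,), verifying anomalies
    Assumed form: r(ensemble mean, obs) / sqrt(var(ensemble mean) / var(member)).
    """
    members = np.asarray(members, dtype=float)
    obs = np.asarray(obs, dtype=float)
    ens_mean = members.mean(axis=1)
    r = np.corrcoef(ens_mean, obs)[0, 1]             # predictable component vs. obs
    signal_var = ens_mean.var(ddof=1)                # variance of the ensemble mean
    total_var = members.var(axis=0, ddof=1).mean()   # mean variance of single members
    return r / np.sqrt(signal_var / total_var)


def synthetic_case(model_signal_scale, n_forecasts=5000, n_members=50, seed=1):
    """Synthetic data: obs = s + noise, members = scale * s + noise (illustrative only)."""
    rng = np.random.default_rng(seed)
    s = rng.normal(0.0, np.sqrt(0.5), size=n_forecasts)
    obs = s + rng.normal(size=n_forecasts)
    members = model_signal_scale * s[:, None] + rng.normal(size=(n_forecasts, n_members))
    return members, obs


if __name__ == "__main__":
    print("perfect model RPC:", rpc(*synthetic_case(1.0)))
    print("weak-signal model RPC:", rpc(*synthetic_case(0.3)))
```

In this synthetic setup a model whose signal amplitude matches the observations gives an RPC close to 1, while a model with a too-weak signal gives RPC>1. With a finite ensemble the ensemble-mean variance still contains some averaged noise, which biases this simple estimate slightly low.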
Minor comments
Line 19/20: an additional possibility is that the model isn't fully utilizing the predictable signal, or possibly is misrepresenting the predictable signal.
Line 44: missing word in "some state-of-art can capture"
Line 58: I suggest adding Stan et al 2022
Table 1: is there tapering for the stratospheric nudging below
Line 139: the "(\rho)" belongs two words earlier in the sentence
Line 380-381: is it possible to provide a more physically meaningful interpretation? For example, is there overly strong downward coupling from the stratosphere to Northern Europe in control?
Figure 8 and similar other figures: suggest masking regions without skill with a different color than white, since white is used for topography.
Stan, Cristiana, Cheng Zheng, Edmund Kar-Man Chang, Daniela I. V. Domeisen, Chaim I. Garfinkel, Andrea M. Jenney, Hyemi Kim, et al. "Advances in the prediction of MJO teleconnections in the S2S forecast systems." Bulletin of the American Meteorological Society 103, no. 6 (2022): E1426-E1447.