Articles | Volume 7, issue 2
https://doi.org/10.5194/wcd-7-767-2026
© Author(s) 2026. This work is distributed under the Creative Commons Attribution 4.0 License.
A spread-versus-error framework to reliably quantify the potential for subseasonal windows of forecast opportunity
Download
- Final revised paper (published on 12 May 2026)
- Supplement to the final revised paper
- Preprint (discussion started on 23 Oct 2025)
- Supplement to the preprint
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on egusphere-2025-4925', Anonymous Referee #1, 28 Nov 2025
  - AC1: 'Reply on RC1', Philip Rupp, 25 Mar 2026
- RC2: 'Comment on egusphere-2025-4925', Tim Woollings, 09 Jan 2026
  - AC2: 'Reply on RC2', Philip Rupp, 25 Mar 2026
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
AR by Philip Rupp on behalf of the Authors (26 Mar 2026)
Author's response
Author's tracked changes
Manuscript
ED: Referee Nomination & Report Request started (09 Apr 2026) by Tim Woollings
RR by Tim Woollings (20 Apr 2026)
RR by Anonymous Referee #1 (21 Apr 2026)
ED: Publish subject to minor revisions (review by editor) (21 Apr 2026) by Tim Woollings
AR by Philip Rupp on behalf of the Authors (26 Apr 2026)
Author's response
Author's tracked changes
Manuscript
ED: Publish as is (29 Apr 2026) by Tim Woollings
AR by Philip Rupp on behalf of the Authors (03 May 2026)
Manuscript
The manuscript “A spread-versus-error framework to reliably quantify the potential for subseasonal windows of forecast opportunity” by Rupp et al. explores the relationship between ensemble spread and forecast error in sub-seasonal ensemble forecasts (days 14-46) from the ECMWF system and in a statistical toy model. The authors propose an approach, based on the spread-error relationship, to identify regions where variations in ensemble spread correlate with variations in forecast error, and demonstrate, using a simple statistical model, that the spread-error relationship can be degraded by insufficient sampling, by a lack of physical processes that modulate predictability, and by model deficiencies.
The paper provides several interesting ideas, in particular exploring the connection between the intra-forecast and inter-forecast variability of the spread, and illustrating several critical issues of sub-seasonal forecasting (such as under-sampling) using the toy model. I have no doubt that the paper should be published in WCD. However, I ask the authors to clarify several critical points before publication.
Major points:
Specific points:
L61-64: Are these assertions supported by prior research, or are they your hypothesis? If the former, a reference is needed; if the latter, please say so explicitly.
L113: Provide full reference for Leutbecher et al.
L114-115: “A comparison between the IFS model and the CNRM model further shows qualitatively robust patterns (discussed in Section 6).” Robust patterns of what? Also, more information about the CNRM data used is needed.
L115-116: It is quite difficult to comprehend what exactly “forecast spread reliability is influenced by the potential for windows of opportunity” means, and I am not sure which definition of “reliability” the authors are using. A reliable ensemble forecast system (or any other forecast system that provides probabilistic forecasts) is one whose predicted probabilities correspond to the observed frequencies; this is what a reliability diagram illustrates. It would help if the authors stated the definition of reliability they are using. In addition, what is the difference between “windows of opportunity” and “potential for windows of opportunity”? “Opportunity” and “potential” sound synonymous to me.
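For reference, the notion of reliability I have in mind is the standard one for probabilistic forecasts, which can be sketched as follows (my own minimal synthetic example, not the authors' code or data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic probabilistic forecasts of a binary event and matching outcomes.
# The forecast probabilities are taken as the true event probabilities, so
# this toy system is perfectly reliable by construction.
n = 200_000
p_forecast = rng.uniform(0.0, 1.0, size=n)
outcome = rng.uniform(0.0, 1.0, size=n) < p_forecast

# Reliability diagram: bin cases by predicted probability and compare the
# mean forecast probability in each bin with the observed event frequency.
bins = np.linspace(0.0, 1.0, 11)
idx = np.digitize(p_forecast, bins) - 1
mean_forecast = np.array([p_forecast[idx == k].mean() for k in range(10)])
obs_frequency = np.array([outcome[idx == k].mean() for k in range(10)])

# For a reliable system the two curves coincide up to sampling noise.
print(np.abs(mean_forecast - obs_frequency).max())
```

If the authors mean something different by "reliability" (e.g. a spread-error consistency condition), this should be stated explicitly.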
L125-127: “However, if the ensemble size is small, sampling errors will be relatively large. In such a case, some forecast/time step with, e.g., low spread, could be also associated with comparably large error, as the spread is simply underestimated due to sampling error.” You assume that spread is then not a good predictor of accuracy, but has this been studied? Also, how do you define whether an ensemble size is small or not? The size you are using (at least 50 members) does not sound small to me.
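To make concrete what I mean by sampling noise in the spread, here is a toy sketch of my own (not the authors' setup): every "forecast" is an ensemble drawn from the same distribution, so any case-to-case variation of the sample spread is pure sampling error.

```python
import numpy as np

rng = np.random.default_rng(1)

# 5000 independent "forecast cases", all drawn from N(0, 1), so the true
# spread is identical (sigma = 1) in every case.
n_cases = 5000

def spread_scatter(n_members):
    members = rng.standard_normal((n_cases, n_members))
    spread = members.std(axis=1, ddof=1)  # per-case sample ensemble spread
    return spread.std()                   # case-to-case scatter of the spread

small = spread_scatter(5)    # small ensemble: large spurious spread variations
large = spread_scatter(50)   # 50 members: scatter shrinks roughly as 1/sqrt(2(n-1))
print(small, large)
```

With 50 members the spurious spread variability is already several times smaller than with 5 members, which is why a quantitative criterion for "small" would be useful.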
Figure 2: Have you tried plotting only the “inter” component of your variance separation, rather than showing daily spread and error, which are mostly noise?
Figure 2 caption: “Red dashed line”, not “Orange dashed line”.
L151: How do you define “anomaly”? Figure 2 shows only positive values. For anomalies I would expect both positive (above climatology) and negative (below climatology) values.
L175: Do you assume that the ensemble mean is well represented in the toy model, or do you also assume it is well represented in operational forecasts? Is this assumption justified?
L242: Does your assumption hold? I understand that, as you under-sample the forecast distribution, the variability of the spread will in general increase. However, I believe that the variability of the ensemble mean would also increase, leading to increased error. Why would this not be the case?
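A minimal sketch of why I expect this (my own toy setup with a perfect model, not the authors' experiment): the expected squared error of an n-member ensemble mean against a verifying draw from the same distribution is sigma^2 * (1 + 1/n), so under-sampling inflates the error as well as the spread variability.

```python
import numpy as np

rng = np.random.default_rng(2)

# "Truth" is a draw from the same N(0, sigma^2) distribution the ensemble
# samples, so the only error sources are the verifying draw itself and the
# sampling error of the finite ensemble mean.
n_cases, sigma = 20_000, 1.0
truth = rng.normal(0.0, sigma, size=n_cases)

def ens_mean_mse(n_members):
    members = rng.normal(0.0, sigma, size=(n_cases, n_members))
    return np.mean((members.mean(axis=1) - truth) ** 2)

mse5 = ens_mean_mse(5)    # expected sigma**2 * (1 + 1/5)  = 1.2
mse50 = ens_mean_mse(50)  # expected sigma**2 * (1 + 1/50) = 1.02
print(mse5, mse50)
```

If the toy model behaves differently, an explanation of which mechanism suppresses this effect would strengthen the argument.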
L251: If the error is overestimated, then how can this lead to a lower error?
L235-255: I cannot understand your explanation for the decreased SRS in experiment (b), and I am not sure that you can explain it without analysing the variability of the ensemble mean.
L262-270: Do you mean that a larger ensemble size than 100 members would be required to capture the spread-error relationship in the case shown in panel “c”? Have you tested this with your toy model?
L271: “intrincic” -> “intrinsic”
L289-290: Can you be more specific about which effects are unsystematic? I understand that an insufficient number of cases leads to unsystematic effects, but can, for example, a small sample size lead to unsystematic effects, or does it always lead to a decreased SRS?
L324-329: Can you provide equations for the inter- and intra- variability?
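For instance, something along the lines of the standard law-of-total-variance split (my notation, my assumption of what is meant; the authors' definitions may differ): with $s_{i,t}$ the ensemble spread of forecast $i$ at time step $t$, and $\bar{s}_i$ its average over $t$ within forecast $i$,

```latex
% Hypothetical sketch: inter-forecast and intra-forecast variability of the spread
\sigma^2_{\mathrm{inter}} = \operatorname{Var}_i\!\left(\bar{s}_i\right), \qquad
\sigma^2_{\mathrm{intra}} = \frac{1}{N}\sum_{i=1}^{N} \operatorname{Var}_t\!\left(s_{i,t}\right), \qquad
\sigma^2_{\mathrm{total}} = \sigma^2_{\mathrm{inter}} + \sigma^2_{\mathrm{intra}}.
```

Stating the definitions in this explicit form would remove any ambiguity about what the two components measure.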
L341: I do not know what the journal’s policy is, but I would prefer to see the definition of the theoretical sampling error estimate in the text rather than in figure captions.
L351-352: I presume you refer to Figure 4d? It would be nice to explicitly refer to this figure in the text, for clarity.
L388-389: It took me a while to realise that you are using different colour scales for Figs. 9b and 9d. I suggest using the same scale, because you are making a point about the smallness of the anomalies in Fig. 9d, which cannot be seen with the present scales.