Deficient ocean–atmosphere feedbacks constrain seasonal NAO prediction

Kolstad, Erik W.

doi:10.5194/wcd-7-507-2026

Articles | Volume 7, issue 1

https://doi.org/10.5194/wcd-7-507-2026


Research article

24 Mar 2026


Abstract

As the North Atlantic Oscillation (NAO) accounts for a dominant share of wintertime weather variability across the North Atlantic, it is a coveted target for seasonal prediction. Yet dynamical forecast systems exhibit limited skill. Here I build on previous results linking November sea-surface temperature (SST) anomalies to the subsequent winter NAO via ocean–atmosphere feedback mechanisms involving baroclinicity and surface heat fluxes. I hypothesise that limited model skill is partly attributable to a deficient representation of these mechanisms. While remote influences such as tropical or stratospheric forcing can affect both SSTs and the NAO, thereby contributing to apparent but non-causal relationships, I find that the seasonal prediction system SEAS5's internal lagged SST–NAO relationship nonetheless correlates with its NAO forecast skill. Since this skill reflects the combined effects of all sources of predictability – including tropical and stratospheric forcing – this correlation is an important finding. Using mediation analysis to contrast the behaviour of SEAS5 with that of the ERA5 reanalysis, I find that SEAS5 produces weaker mediated effects via both fluxes and baroclinicity than those found in ERA5. Critically, the strength of these mediated effects in the model correlates with its NAO forecast skill. This suggests that models reproducing realistic mediation pathways for ocean–atmosphere interactions are likely to achieve higher NAO skill than models that do not.


Received: 14 Oct 2025 – Discussion started: 17 Oct 2025 – Revised: 08 Mar 2026 – Accepted: 09 Mar 2026 – Published: 24 Mar 2026

1 Introduction

It is no wonder that the North Atlantic Oscillation (NAO) has received considerable attention in studies of climate dynamics, given that it accounts for roughly half of the interannual wintertime tropospheric pressure variance over the North Atlantic (Ambaum et al., 2001). Hence, it serves as a good proxy for fluctuations in the strength and latitudinal position of the jet stream (Woollings and Blackburn, 2012) and storm tracks (Rivière and Orlanski, 2007), and by extension for variations in weather and associated impacts over and around the North Atlantic basin (e.g. Athanasiadis et al., 2017; Degenhardt et al., 2023).

Owing to its wide-ranging influence, the NAO is routinely used as a benchmark for mid-latitude seasonal prediction skill. Statistical (empirical) methods have a long history and have achieved potentially useful levels of skill using predictors such as autumn Arctic sea ice, Eurasian snow cover, tropical and regional sea-surface temperatures (SSTs), and stratospheric variables (e.g. Hall et al., 2017; Wang et al., 2017). While empirical models offer interpretability and have at times appeared to outperform dynamical systems, their reliance on historical relationships makes them vulnerable to non-stationarity and shifts in climate regimes (Hertig et al., 2015; Kolstad and Screen, 2019). Demonstrated skill in one period therefore provides no guarantee of consistent predictability in others (Weisheimer et al., 2017; Baker et al., 2024). More recently, empirical models have been enriched with machine-learning and hybrid techniques, yielding higher forecast skill, including for the NAO (e.g. Mu et al., 2023; Sun et al., 2024).

In sum, empirical models are susceptible to including non-causal predictors, either because background-state changes render previous relationships invalid or due to spurious correlations. In principle, dynamical coupled prediction systems are not constrained by these limitations, as they aim to reproduce the behaviour of the climate system from first principles. Such systems should ideally be able to integrate any initial condition to a realistic future state. In practice, however, as Suckling and Smith (2013) pointed out, even physics-based models are not independent of the data used in their design. Their apparent ability, evaluated primarily through retroactive forecasts, or reforecasts (also known as hindcasts), to reproduce historical variability may therefore overstate true predictive skill. Moreover, in a changing climate, even out-of-sample performance offers no guarantee of future success, given the nonlinear nature of the system's response to external forcing (Stott et al., 2013). This highlights the need to assess not only the overall skill of prediction systems, but also the physical consistency of the processes and feedbacks they represent.

About a decade ago there was a surge of enthusiasm over the high surface-defined NAO skill reported in some dynamical systems (Scaife et al., 2014). However, there is a wide range of performance between systems and system upgrades have not significantly improved overall skill (Baker et al., 2024). Several studies have convincingly demonstrated that the performance of dynamical prediction systems depends on how well they represent crucial physical processes. For instance, Patrizio et al. (2025) showed that skill in decadal NAO forecasts depends on how models represent feedbacks between subpolar SST anomalies and the NAO. Haarsma et al. (2019) found that increasing the oceanic resolution of a coupled system strengthened air–sea interactions and enhanced seasonal predictability in the North Atlantic storm track entrance region east of Newfoundland. Similarly, Hardiman et al. (2022) showed that most forecast systems consistently underestimate positive feedback mechanisms between transient eddies and the large-scale flow, leading to weaker eddy forcing of the mean circulation. A related problem concerns the representation of mesoscale oceanic eddies and associated SST gradients in “eddy-rich regions, including the Gulf Stream” (Zhang et al., 2021). Limited horizontal resolution tends to smooth these gradients and weaken the coupling between SST, surface heat fluxes, and low-level atmospheric baroclinicity (Hewitt et al., 2017; Bellucci et al., 2021; Athanasiadis et al., 2022).

Together, these findings highlight that two-way ocean–atmosphere coupling is fundamental to NAO variability across timescales, yet it remains misrepresented in current prediction systems. Indeed, for the same family of forecast systems considered here, Roberts et al. (2021) demonstrated that biases in the location and structure of the Gulf Stream substantially degrade subseasonal forecast skill, and that correcting these SST errors online improves the mean state and circulation anomalies across the North Atlantic and downstream into Europe.

A large body of work has convincingly shown that, in the North Atlantic, the atmospheric impact on the ocean surface generally dominates the feedback loop. For instance, Patterson et al. (2024) recently demonstrated that the atmosphere primarily drives surface heat flux variability in the Labrador Sea. However, the same study identified a strong oceanic imprint on the atmosphere further south in the Gulf Stream region. Similarly, Joyce et al. (2019) demonstrated that meridional shifts of the Gulf Stream front and its associated SST gradients tend to lead changes in storm tracks and Greenland blocking by one to three months, suggesting an evolving pathway from autumn ocean conditions to wintertime atmospheric variability. Both observational studies (Czaja and Frankignoul, 2002; Wang et al., 2004; Hall et al., 2017) and modelling experiments (Rodwell et al., 1999; Watanabe and Kimoto, 2000; Baker et al., 2019; Sun et al., 2024) have shown that characteristic SST patterns can precondition the atmosphere on subseasonal to seasonal timescales. In particular, the North Atlantic SST tripole (or the similar “horseshoe” pattern) has been linked to a feedback loop involving the NAO itself (Peng et al., 2002; Pan, 2005; Mosedale et al., 2006; Cassou et al., 2007; Gastineau and Frankignoul, 2015).

A useful distinction emerging from this literature is that observational studies identify these SST–atmosphere linkages directly in the climate record, whereas model-based studies generally recover similar patterns only when the predictable component of variability is isolated. This reflects signal-to-noise issues (Scaife and Smith, 2018; Weisheimer et al., 2024): models can reproduce the relevant mechanisms, but the forced signal is often weaker than in observations, making the pathways more difficult to detect.

Kolstad and O'Reilly (2024) used reanalysis data to demonstrate that the correlation between November SSTs and the subsequent NAO increases gradually through the winter season, peaking in January and February. They showed that surface heat fluxes in the western part of the Subpolar Gyre region and baroclinicity in the storm track entrance region in the western North Atlantic act as key mediators in this feedback. The latter result is consistent with the well-established role of diabatic heating and eddy feedbacks in maintaining storm track baroclinicity (Hardiman et al., 2022).

The statistical framework used by Kolstad and O'Reilly (2024) is known as mediation analysis (e.g. MacKinnon et al., 2000; Nguyen et al., 2021). It is particularly well-suited to climate science applications where feedback loops and mediated pathways are common but difficult to isolate using traditional correlation-based methods. Recent examples of its use, in addition to some of my own work, include the studies by Maybee et al. (2023) and Risser et al. (2025). The framework belongs to the broader family of causal inference methods (Pearl et al., 2016), so named because they are designed to identify and quantify causal relationships. Several such approaches have been successfully used in climate science, ranging from easily interpretable approaches like Granger causality (e.g. Granger, 1969; Mosedale et al., 2006; McGraw and Barnes, 2018) to more complex methods (e.g. Ebert-Uphoff and Deng, 2012; Hannart et al., 2016; Runge et al., 2019; Docquier et al., 2024).

A key advantage of causal inference approaches is that they allow pathways to be investigated without manipulating model boundary conditions for sensitivity experiments. Perturbation-based methods, though widely used, can produce unintended consequences. For example, perturbing greenhouse gas concentrations triggers numerous feedbacks on diverse timescales, complicating attribution of the climate system's response and adjustments (Knutti and Rugenstein, 2015). Even more localised interventions can have undesirable side effects: Lewis et al. (2024) showed that modifying albedo or applying surface heating to force sea-ice loss can generate spurious warming and exaggerate the atmospheric circulation response. Similarly, O'Reilly et al. (2023) demonstrated that active SST-restoring in the tropical North Atlantic can systematically drive upward surface heat fluxes that are unrepresentative of observations, leading to an exaggerated precipitation and remote circulation response. As Palmer and Weisheimer (2011) noted, multiple model errors can compensate for one another, making it difficult to diagnose the underlying causes of biases. These considerations further motivate the use of mediation analysis, which relies solely on observed covariances and avoids imposing artificial perturbations.

This study extends Kolstad and O'Reilly (2024) in three ways. First, it examines whether the strength of the November-to-winter SST–NAO linkage in a state-of-the-art seasonal forecast system is related to its skill in predicting the NAO, thereby motivating the subsequent mediation analysis. Second, it quantifies and clarifies causal directionality in the relationships between November SSTs and surface heat fluxes, baroclinicity, and the winter NAO. Third, it applies the mediation framework to the forecast system to assess whether biases in these relationships can help explain its limited NAO prediction skill.

It is important to emphasise that the mediation pathways examined here do not account for the full influence of November SSTs on the winter NAO, which can be viewed as the combined effect of all possible pathways operating throughout the climate system, both locally and remotely. The present analysis focuses on only a small subset of this much broader interaction network. It zooms in on surface heat fluxes and baroclinicity because there are good physical reasons to expect these mechanisms to participate in SST-induced adjustments of the North Atlantic circulation. In other words, the analysis should be interpreted as isolating two components of the total SST influence: it has the potential to reveal where these specific pathways reinforce or oppose the SST–NAO relationship, without implying that they represent the full climate system’s response. It is furthermore acknowledged that remote effects such as the El Niño–Southern Oscillation (ENSO) may influence both the SSTs and the NAO and thereby contribute to apparent but non-causal associations within the SST–mediator–NAO relationship.

The following section gives an overview of mediation and partial-correlation analysis, before the data and methods are described in Sect. 3. Section 4 presents the results, and Sect. 5 discusses their implications for understanding and improving NAO predictability.

2 Mediation analysis

Adopting the naming convention of MacKinnon et al. (2000), a causal pathway linking a predictor variable X to an outcome variable Y is written as:

X \to Y .

In the analysis to follow, X is an index representing SST anomalies in November and Y is the winter NAO index. Due to the temporal offset, the correlation between these two variables must be mediated by other processes, referred to as mediators and denoted Z. Here Z is a gridded spatial field representing either surface heat fluxes or a metric for baroclinicity. These two mediators are investigated separately through the pathway

X \to Z \to Y .

It is customary to quantify the mediating role of Z and to categorise it as either a perfect or partial mediator, if it fully or partially accounts for X→Y (Baron and Kenny, 1986), or a suppressor, if the correlation between X and Y is strengthened when Z is accounted for (Conger, 1974).

As mentioned in the Introduction, a pixel value of any one variable cannot uniquely mediate the lagged effect of SSTs on the NAO. In reality, a practically infinite web of interacting processes combine to realise that relationship. Nevertheless, the approach used here is useful for providing a spatial fingerprint of where a single variable exerts the strongest mediating influence. Equally important, the method can be used to identify where a forecast model incorrectly mediates or even suppresses the SST–NAO correlation.

2.1 Regression equations

To test for mediation or suppression, three regression equations are defined (ignoring intercepts and residuals for simplicity). Prior to estimating the coefficients, X, Y, and Z were standardised. The first equation describes the total effect τ of the predictor X on the predictand Y:

\begin{matrix} (1) & Y = τ X . \end{matrix}

Here, τ represents the correlation between the standardised November SST index and the standardised winter NAO index. The second regression describes $X \to Z \to Y$ by accounting for the standardised mediator variables. The effect of X on Y changes to τ^′, known as the direct effect (not through the mediator), and the effect of Z on Y when accounting for X is denoted as β:

\begin{matrix} (2) & Y = τ^{'} X + β Z . \end{matrix}

The total effect of X on the mediator Z is labelled here as α in the third equation:

\begin{matrix} (3) & Z = α X . \end{matrix}

An important thing to note is that α encapsulates not just the direct forcing X→Z, but also all the indirect forcing through intermediate variables, crucially including via the pathway $X \to Y \to Z$ .

A central concept is the product αβ, known as the indirect or mediated effect (of X on Y through Z). The total effect is the sum of the direct and mediated effects: $τ = τ^{'} + α β$ . This also follows from Eqs. (1)–(3). Scaling the mediated effect by the total effect yields:

\begin{matrix} (4) & \frac{α β}{τ} = 1 - \frac{τ^{'}}{τ} . \end{matrix}

According to the standard criteria for mediation laid out by Baron and Kenny (1986), τ, α, and β must all be significantly different from zero. If $τ^{'} = 0$ (or is not significantly different from zero), it follows from Eq. (4) that the total and mediated effects are identical. In this case, the pathway X→Y is fully accounted for by Z, indicating that $X \to Z \to Y$ represents a valid causal pathway – though not necessarily the only one.
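As an illustration of Eqs. (1)–(4), the following is a minimal Python sketch that estimates the four coefficients by ordinary least squares on standardised series. The function names (`standardise`, `mediation_coefficients`) and the synthetic data are assumptions made for this example and are not taken from the study's code; Z is treated as a single time series (one grid point).

```python
import numpy as np

def standardise(a):
    """Return a zero-mean, unit-variance copy of a 1-D array."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean()) / a.std()

def mediation_coefficients(x, z, y):
    """OLS estimates of tau, tau', alpha and beta for standardised inputs."""
    x, z, y = standardise(x), standardise(z), standardise(y)
    tau = np.dot(x, y) / np.dot(x, x)        # Eq. (1): Y = tau X
    alpha = np.dot(x, z) / np.dot(x, x)      # Eq. (3): Z = alpha X
    design = np.column_stack([x, z])         # Eq. (2): Y = tau' X + beta Z
    (tau_prime, beta), *_ = np.linalg.lstsq(design, y, rcond=None)
    return {"tau": tau, "tau_prime": tau_prime, "alpha": alpha, "beta": beta}

# Synthetic series (illustrative values only, not ERA5 or SEAS5 data)
rng = np.random.default_rng(0)
n = 43
x = rng.standard_normal(n)
z = 0.6 * x + rng.standard_normal(n)
y = 0.5 * z + 0.2 * x + rng.standard_normal(n)

coef = mediation_coefficients(x, z, y)
mediated = coef["alpha"] * coef["beta"]      # indirect (mediated) effect
print(coef, mediated)
```

Because the residual of Eq. (2) is orthogonal to X in an OLS fit, the sample coefficients satisfy the decomposition of the total effect into direct and mediated parts to within rounding error, which is a convenient check on any implementation.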

2.2 Identifying the SST-forced component

Mediation often represents forward-directed pathways where X changes Z, and Z subsequently affects Y. In climate dynamics, however, feedback mechanisms are common, and ambiguities may arise because Y and Z are evaluated contemporaneously. This implies that both the hypothesised $X \to Z \to Y$ and the alternative $X \to Y \to Z$ pathways may be active. It is nevertheless possible to assess the degree to which Z responds directly to X rather than indirectly through Y by regressing out the concurrent variability of Y:

\begin{matrix} (5) & Z = α^{'} X + γ Y . \end{matrix}

Here, α^′ represents the NAO-independent SST-to-mediator influence, to be compared with α from Eq. (3), which includes all routes from X to Z (including those via Y). The product α^′β is then interpreted as the SST-forced component of the mediated effect; that is, the influence that would arise if the mediator responded only to direct SST forcing, while the NAO retained its full sensitivity to the mediator through β. This is a complementary diagnostic to the full mediated effect αβ, not a replacement; it helps distinguish SST-forced mediation from mediation that is predominantly atmospheric in origin.

To quantify the SST forcing onto the mediator, the sign consistency between α^′ and α is assessed. In regions where the SST forcing aligns with the full forcing (which includes NAO feedbacks on Z), the ratio $α^{'} / α$ should be positive. Values near zero indicate that the SST forcing is weak. Because the mediators (Z) are inherently noisy, Ordinary Least Squares (OLS) estimates of α and α^′ are both subject to attenuation bias (Greene, 2003), which biases coefficients towards zero and increases the chance of sign flips across different sample sets.

Accordingly, I use a conservative hypothesis test: the null hypothesis is that the SST-forced component (α^′) is zero or has the opposite sign to the total forcing onto the mediator (α). Rejecting this null hypothesis indicates that the SST forcing is sufficiently robust to maintain a consistent physical direction despite attenuation. I emphasise that it does not imply that the SST forcing dominates over NAO → mediator feedbacks.
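One possible reading of this sign-consistency test is sketched below in Python. The text does not specify the test statistic, so the choice made here (the fraction of bootstrap resamples in which the product of the two coefficients is zero or negative) is an assumption, as are the function names and the synthetic data.

```python
import numpy as np

def standardise(a):
    """Return a zero-mean, unit-variance copy of a 1-D array."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean()) / a.std()

def alpha_pair(x, z, y):
    """Return (alpha, alpha') from Eqs. (3) and (5)."""
    x, z, y = standardise(x), standardise(z), standardise(y)
    alpha = np.dot(x, z) / np.dot(x, x)          # Eq. (3): Z = alpha X
    design = np.column_stack([x, y])             # Eq. (5): Z = alpha' X + gamma Y
    (alpha_prime, _gamma), *_ = np.linalg.lstsq(design, z, rcond=None)
    return alpha, alpha_prime

def sign_consistency_pvalue(x, z, y, n_boot=2000, seed=0):
    """Fraction of resamples where alpha' is zero or opposes alpha (assumed statistic)."""
    rng = np.random.default_rng(seed)
    x, z, y = (np.asarray(a, dtype=float) for a in (x, z, y))
    n = x.size
    failures = 0
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)         # sample years with replacement
        alpha, alpha_prime = alpha_pair(x[idx], z[idx], y[idx])
        if alpha * alpha_prime <= 0.0:
            failures += 1
    return failures / n_boot

# Synthetic series in which Z responds to X both directly and via Y
rng = np.random.default_rng(1)
x = rng.standard_normal(200)
y = 0.3 * x + rng.standard_normal(200)
z = 0.8 * x + 0.3 * y + 0.5 * rng.standard_normal(200)
p_value = sign_consistency_pvalue(x, z, y)
print(p_value)
```

A small value would lead to rejection of the null hypothesis at the corresponding level; as noted above, this says nothing about whether the SST forcing dominates over the NAO feedback onto the mediator.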

Accounting for the NAO's autocorrelation is another prudent step to prevent potential confounding of the results. Kolstad and O'Reilly (2024) showed that in ERA5, this autocorrelation was only significant from November to December and not from November to DJF; this was confirmed to be valid for the shorter period examined here for both ERA5 and SEAS5. Labelling the NAO index in November as Y₀, a new regression equation could be defined as:

Z = α^{''} X + γ_{0} Y_{0} + γ^{'} Y .

However, as the coefficient γ₀ was found to be negligible for both mediators, which was expected in light of the missing NAO autocorrelation, Eq. (5) is used in the analysis.

2.3 Suppression

An interesting special case occurs when $τ^{'} / τ > 1$ , which means that the mediated effect αβ has the opposite sign to the total effect τ (Eq. 4). In these cases, Z is referred to as a suppressor because the regression coefficient linking X and Y is inflated when Z is accounted for (Muniz and MacKinnon, 2025). In the context of this study, this could mean that X (November SST anomalies) drives changes in Z (e.g. flux anomalies), but the response of Y (the NAO) to those fluxes is of opposite sign to the direct X→Y pathway. This can occur because Y itself feeds back onto Z, helping to make β negative. In other words, Z acts as a negative feedback, transmitting a damping influence on Y that partly cancels (suppresses) the predictive signal from X. In the raw correlation, this feedback reduces the apparent strength of X as a predictor of Y, but once Z is controlled for, the hidden strength of the X→Y link is revealed. Put differently, had it not been for the negative feedback through Z, X would have exerted stronger predictive power on Y.
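The role of the ratio in Eq. (4) can be made concrete with a small Python helper. This is only a sketch: the labels follow the terminology above, but the tolerance `tol` is a hypothetical stand-in for the significance tests used in the analysis, and the function name is invented for this example.

```python
def classify_mediator(tau, tau_prime, tol=0.05):
    """Label the role of Z from the ratio tau'/tau and return the mediated fraction."""
    if tau == 0:
        raise ValueError("total effect tau must be non-zero")
    ratio = tau_prime / tau
    mediated_fraction = 1.0 - ratio              # Eq. (4): alpha*beta / tau
    if ratio > 1.0:
        label = "suppressor"                     # mediated effect opposes total effect
    elif abs(ratio) <= tol:
        label = "perfect mediator"               # direct effect is negligible
    elif 0.0 < ratio < 1.0:
        label = "partial mediator"
    else:
        label = "unclassified"
    return label, mediated_fraction

print(classify_mediator(0.4, 0.6))   # direct effect exceeds total effect
print(classify_mediator(0.5, 0.25))  # half of the total effect is mediated
```

In the first call, the direct effect is larger than the total effect, so the mediated fraction is negative and Z is labelled a suppressor.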

2.4 Scope

The mediation framework is applied to X, Z and Y as defined over the North Atlantic sector. Controls for remote precursors such as ENSO or stratospheric anomalies are not included. Consequently, any shared influence of such processes on both pre-winter SST and the winter NAO can appear implicitly in the estimated relationships; the results should therefore be interpreted as structural diagnostics rather than formal causal attribution across the full suite of teleconnections.

2.5 Sample coefficient notation

Throughout the paper, sample coefficients in Eqs. (1)–(5) (i.e. coefficients estimated through OLS fitting) are denoted by carets; for instance, $\hat{τ}$ is the estimated τ value.

3 Data and methods

3.1 Data

Reanalysis and seasonal forecast data are used. The reanalysis reference is ERA5 (Hersbach et al., 2020), produced by the European Centre for Medium-Range Weather Forecasts (ECMWF), and the forecast system is SEAS5, the ECMWF's seasonal prediction system (Johnson et al., 2019). The reason only one model is investigated here is that its reforecast period extends back to 1981, while reforecasts are only available from 1993 and onwards for comparable systems – this shorter period would render the mediation analysis less robust. The analysis covers the winters from 1981/82 to 2023/24 (hereafter referred to as 1981–2023).

The atmospheric component of SEAS5 is the Integrated Forecasting System (IFS) atmosphere model. The grid spacing for the ocean model in SEAS5 is 0.25 degrees, which has been shown to yield a decent representation of air–sea interaction along the Gulf Stream front compared to lower-resolution models (Jin and Yu, 2013; Athanasiadis et al., 2022; Patrizio et al., 2023). It seems the resolution will not change in the new SEAS6 system due to be released soon, but the new ocean model nevertheless appears to yield multiple improvements, including large reductions in SST errors along the Gulf Stream (Keeley et al., 2024, their Fig. 2a).

A well-documented feature of SEAS5 relevant for this study is a warm SST bias in the western North Atlantic up to the mid-1990s (Stockdale et al., 2018; Tietsche et al., 2020). Inherited from issues with the ocean reanalysis, this bias allowed SST errors to grow rapidly and produced a local warm anomaly that affected near-surface temperature and surface heat fluxes. To verify that it did not affect the conclusions of this paper, the core analysis was repeated using only the period 2001–2023; the results were practically unchanged.

SEAS5 reforecasts were used from 1981 to 2016 with 25 ensemble members, and real-time forecasts from 2017 to 2023 with only the first 25 of 51 members used to ensure consistency with the reforecasts. The analysis was based on individual ensemble members (i.e. not ensemble means) unless otherwise specified.

SEAS5 forecasts are issued once per month. In this study, the November forecasts and reforecasts are used, corresponding by convention to lead times of 1–4 months for November through February. A potential drawback of using the November initialisations is that the SST fields are similar across ensemble members due to oceanic inertia. However, repeating the complete analysis with October initialisations (for which the November SST fields are more diverse) produced qualitatively similar results. I chose to base the analysis on the November runs, as this allows evaluation of the model's skill for the set of forecasts used operationally for predicting winter conditions.

The variables considered are SST, mean sea level pressure (SLP), and the sum of sensible and latent heat flux, hereafter referred to as surface heat flux, or SHF (positive upwards). Baroclinicity is quantified by the Eady growth rate maximum (e.g. Hoskins and Valdes, 1990), defined for the 700–850 hPa layer as

σ_{E} = c f |\frac{\partial v}{\partial z}| / N,

where the unit is day⁻¹, c=86 400×0.3098, f is the Coriolis parameter, v is the wind vector, z is the geopotential height, and N is the Brunt-Väisälä frequency, given by

N = \sqrt{\frac{g}{θ} \frac{\partial θ}{\partial z}},

with θ the potential temperature and g the gravitational acceleration.
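For readers who wish to reproduce the metric, a minimal Python sketch of the two expressions above is given here, using a single finite difference between the two bounding pressure levels. The discretisation, the layer-mean potential temperature, the constants and the function name are assumptions of this example; the text does not state how the derivatives were evaluated.

```python
import numpy as np

G = 9.80665        # gravitational acceleration (m s-2)
OMEGA = 7.292e-5   # Earth's rotation rate (rad s-1)
C = 86400 * 0.3098 # converts s-1 to day-1 and applies the Eady factor

def eady_growth_rate(lat_deg, u_lo, v_lo, u_hi, v_hi,
                     theta_lo, theta_hi, z_lo, z_hi):
    """Eady growth rate maximum (day-1) for one layer.

    'lo' is the lower level (e.g. 850 hPa), 'hi' the upper level (e.g. 700 hPa);
    winds in m s-1, potential temperature in K, geopotential height in m.
    """
    dz = z_hi - z_lo
    f = 2.0 * OMEGA * np.sin(np.deg2rad(lat_deg))          # Coriolis parameter
    shear = np.hypot(u_hi - u_lo, v_hi - v_lo) / dz        # |dv/dz| of the wind vector
    theta_mid = 0.5 * (theta_lo + theta_hi)                # layer-mean theta (assumption)
    n = np.sqrt(G / theta_mid * (theta_hi - theta_lo) / dz)  # Brunt-Vaisala frequency
    return C * f * shear / n

# Illustrative mid-latitude values (not taken from ERA5 or SEAS5)
sigma = eady_growth_rate(45.0, 10.0, 0.0, 15.0, 0.0, 290.0, 296.0, 1500.0, 3000.0)
print(sigma)
```

The function accepts NumPy arrays as well as scalars, so it can be applied to gridded fields; statically unstable layers (potential temperature decreasing with height) yield NaN and would need separate handling.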

Anomalies were calculated by subtracting the overall mean and dividing by the overall standard deviation, spanning all years and ensemble members.

3.2 Climate indices

Two scalar indices are central to the analysis: the DJF NAO index, and an SST-based index representing the November SST anomaly pattern in the extratropical North Atlantic most strongly correlated with the following winter's NAO index.

To construct the NAO index, the first Empirical Orthogonal Function (EOF) of interannual DJF mean ERA5 SLP anomalies was computed over the domain 20–80° N, 90° W–40° E, using the eofs Python package (Dawson, 2016) and applying $\sqrt{\cos ϕ}$ latitude weighting. For both ERA5 and SEAS5, the corresponding NAO index time series were obtained by projecting their respective gridded SLP anomalies onto the ERA5-based spatial EOF pattern. It was a deliberate choice to use the ERA5 loading pattern for both datasets, as the purpose of this study is to assess how SEAS5 represents the real-world NAO pattern.
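The construction of the loading pattern and the projection step can be sketched in Python as follows. Note that this example uses a plain NumPy singular value decomposition rather than the eofs package mentioned above, and that the function names and the synthetic field are assumptions made for illustration.

```python
import numpy as np

def lat_weights(lat):
    """sqrt(cos(latitude)) weights, broadcastable over (time, lat, lon)."""
    return np.sqrt(np.cos(np.deg2rad(lat)))[None, :, None]

def leading_eof(anom, lat):
    """Leading EOF (in weighted space) of anomalies with shape (time, lat, lon)."""
    nt = anom.shape[0]
    flat = (anom * lat_weights(lat)).reshape(nt, -1)
    flat = flat - flat.mean(axis=0)               # remove the time mean
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    return vt[0].reshape(anom.shape[1:])          # sign is arbitrary

def project_index(anom, pattern, lat):
    """Standardised index from projecting weighted anomalies onto a fixed pattern."""
    nt = anom.shape[0]
    raw = (anom * lat_weights(lat)).reshape(nt, -1) @ pattern.ravel()
    return (raw - raw.mean()) / raw.std()

# Synthetic field: one dominant pattern plus weak noise (illustrative only)
rng = np.random.default_rng(0)
lat = np.linspace(20.0, 80.0, 13)
lon = np.linspace(-90.0, 40.0, 27)
true_pattern = np.sin(np.deg2rad(lat - 20.0) * 3.0)[:, None] * np.ones(lon.size)
amplitude = rng.standard_normal(60)
field = amplitude[:, None, None] * true_pattern + 0.1 * rng.standard_normal((60, 13, 27))

pattern = leading_eof(field, lat)                 # plays the role of the reference pattern
index = project_index(field, pattern, lat)        # same call would be used for a second dataset
r = np.corrcoef(index, amplitude)[0, 1]
print(abs(r))
```

Projecting a second dataset onto the same `pattern`, rather than recomputing an EOF from that dataset, mirrors the choice described above of using one reference loading pattern for both datasets. Because the sign of an EOF is arbitrary, it would normally be fixed by convention before the index is interpreted.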

The SST index was calculated in a similar way. November SST anomalies from ERA5 were first regressed onto the interannual ERA5 NAO index to obtain a spatial regression pattern. SST anomalies were then projected onto this pattern within a reference domain extending from 20 to 70° N and from 100° W to 20° E (Czaja and Frankignoul, 2002; Kolstad and O'Reilly, 2024), after which the resulting series was standardised to form the SST index. As with the NAO index, the SST index for SEAS5 was computed by projection onto the ERA5-based pattern, not a model-specific optimal pattern, to maintain consistency across the datasets.
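The two steps (regression onto the NAO index, then projection onto the resulting pattern) might look as follows in Python. This is a sketch under simplifying assumptions: no area weighting is applied in the projection, the field contains no missing values, and the function names and synthetic data are invented for the example.

```python
import numpy as np

def regression_pattern(field, index):
    """Least-squares slope of each grid point on a standardised index.

    field has shape (time, lat, lon); index has shape (time,).
    """
    idx = (index - index.mean()) / index.std()
    anom = field - field.mean(axis=0)
    return np.tensordot(idx, anom, axes=(0, 0)) / idx.size

def pattern_index(field, pattern):
    """Standardised index obtained by projecting a field onto a fixed pattern."""
    nt = field.shape[0]
    raw = field.reshape(nt, -1) @ pattern.ravel()
    return (raw - raw.mean()) / raw.std()

# Synthetic November SST field partly tied to a winter index (illustrative only)
rng = np.random.default_rng(0)
nt, nlat, nlon = 43, 10, 20
nao = rng.standard_normal(nt)
imprint = rng.standard_normal((nlat, nlon))
sst = 0.5 * nao[:, None, None] * imprint + rng.standard_normal((nt, nlat, nlon))

pattern = regression_pattern(sst, nao)     # unit: field unit per index standard deviation
sst_index = pattern_index(sst, pattern)    # the same pattern would be reused for a second dataset
tau_hat = np.corrcoef(sst_index, nao)[0, 1]
print(tau_hat)
```

Because the pattern in this toy example is fitted and evaluated on the same years, the resulting correlation is optimistic by construction.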

No masking for sea ice was applied. In both ERA5 and SEAS5, grid cells covered by sea ice are not missing values but contain subzero SSTs, which remain valid anomalies in this framework. Masking would risk introducing artificial discontinuities in space and time, since the ice edge varies between months and years.

3.3 Statistical significance

Bootstrapping was used to estimate statistical significance by creating 10 000 randomised series through sampling with replacement. To ensure comparability between the two datasets, the bootstrap sample length was set equal to the number of years in the study period for both datasets (i.e. 43). This avoids giving SEAS5 an artificial advantage with respect to ERA5 due to its larger sample size (25 members per year). When assessing the significance of a metric (e.g. a correlation) at a significance level of 5 % (used throughout this study), the 2.5th and 97.5th percentiles of the correlation coefficient across those 10 000 randomised series were computed, and if the interval between these percentiles did not include zero, the correlation was deemed significant.
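The percentile-based test described above can be sketched in Python as follows. The function name, its arguments and the synthetic data are assumptions of this example; the `sample_len` argument is a hypothetical way of fixing the resample length (e.g. 43) when the pooled sample is larger.

```python
import numpy as np

def bootstrap_correlation(a, b, n_boot=10_000, sample_len=None, seed=0):
    """Percentile interval (2.5th, 97.5th) of the correlation under resampling.

    Pairs (a[i], b[i]) are drawn with replacement; returns (lower, upper,
    significant), where significant means the interval excludes zero.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    rng = np.random.default_rng(seed)
    m = a.size if sample_len is None else sample_len
    idx = rng.integers(0, a.size, size=(n_boot, m))      # one row per randomised series
    ra = a[idx] - a[idx].mean(axis=1, keepdims=True)
    rb = b[idx] - b[idx].mean(axis=1, keepdims=True)
    r = (ra * rb).sum(axis=1) / np.sqrt((ra**2).sum(axis=1) * (rb**2).sum(axis=1))
    lower, upper = np.percentile(r, [2.5, 97.5])
    return lower, upper, not (lower <= 0.0 <= upper)

# Two strongly related synthetic series of length 43 (illustrative only)
rng = np.random.default_rng(2)
a = rng.standard_normal(43)
b = a + 0.1 * rng.standard_normal(43)
lower, upper, significant = bootstrap_correlation(a, b)
print(lower, upper, significant)
```

Fixing the resample length to the number of years keeps the width of the interval comparable between a single-realisation dataset and a pooled ensemble.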

4 Results

4.1 SST–NAO relationship

The ERA5 SST anomaly regression pattern (i.e. November SST anomalies regressed onto the DJF NAO index) is shown with shading in Fig. 1a. As expected, it is similar to the pattern in Fig. 1f in Kolstad and O'Reilly (2024), which was also computed based on ERA5 but for a longer period (1940–2022). The contours in Fig. 1a display the regression of DJF SLP anomalies onto the NAO index.

https://wcd.copernicus.org/articles/7/507/2026/wcd-7-507-2026-f01

Figure 1. (a) Colours: November SST anomalies in ERA5 projected onto the ERA5 DJF NAO index. The unit is K, and dots mark coefficients significantly different from zero at the 5 % level. Contours: DJF SLP anomalies projected onto the same NAO index (unit: hPa). The contour interval is 1 hPa; solid (dashed) contours indicate positive (negative) coefficients, and the zero contour is omitted. The map extent corresponds to the region used to define the November SST index. (b) Time series of the November SST index (orange) and the DJF NAO index (blue) in ERA5. Years on the x-axis correspond to the December month at the start of the winter (i.e. DJF 1981/82 is labelled 1981). (c) As in (b), but for SEAS5, based on the ensemble mean each year.

Figure 1b shows the interannual November SST index, obtained by projecting the SST anomalies onto the regression pattern in Fig. 1a, together with the winter NAO index, both from ERA5 data. Although only a few of the local SST coefficients in Fig. 1a are significant, the sample correlation between the two indices is relatively high ( $\hat{τ} = 0.49$ , p≈0.001), underscoring the strong link between late-autumn SSTs and the subsequent winter NAO. The SST index captures well the two exceptionally negative NAO winters of 2009/10 and 2010/11, as well as the extended positive NAO phase around 1990, though there are also seasons with weak correspondence, such as 2000/01. It is emphasised that $\hat{τ}$ does not represent a skill score, as no independent training and evaluation periods were defined.

Turning to the (ensemble mean) SEAS5 indices shown in Fig. 1c, it is evident that the SST index covaries with the SST index in ERA5 (r=0.91). However, several differences between the datasets are also apparent. Most important, the SST–NAO correlation is substantially lower ( $\overline{τ} = 0.21$ , p=0.18, with the overbar signifying that the ensemble mean was used) than in ERA5, demonstrating a discrepancy in the linkages between the observed “NAO-optimal” SST pattern and the winter NAO. Here it is acknowledged that the SST–NAO correlation is weaker than in ERA5 at least partly because both indices were deliberately derived from ERA5-based spatial patterns.

A second point is the non-significant NAO skill: the anomaly correlation coefficient between the NAO index in SEAS5 and ERA5, denoted henceforth as ρ, is not significant for the ensemble mean ( $\overline{ρ} = 0.29$ , p=0.06). This low predictive power is consistent with Baker et al. (2024).

When the SST–NAO correlation and the NAO skill are evaluated for all the ensemble members instead of for the ensemble mean, both metrics deteriorate, revealing higher internal noise. This behaviour is consistent with the “signal-to-noise paradox” (e.g. Scaife and Smith, 2018). The sample parameter based on all members, which is used in the remainder of the analysis, is $\hat{τ} = 0.06$ , which is not significantly positive and much lower than $\overline{τ} = 0.21$ . The member-level NAO skill also decreases from $\overline{ρ} = 0.29$ to the non-significant value of $\hat{ρ} = 0.07$ .

4.2 Linking the SST–NAO relationship to NAO prediction skill

One of the three main purposes of this paper is to assess whether the total SST–NAO relationship (τ in Eq. 1) in the model is proportional to its NAO skill. That skill is defined here as ρ: the correlation between the DJF NAO index in SEAS5 and the corresponding DJF NAO index in ERA5, with ERA5 years selected to match the year of each SEAS5 member. If there had been no relationship between τ and ρ, it would be of limited interest to scrutinise the causal pathways through which the SSTs influence the NAO.

It is important to emphasise that ρ is an external quantity: it depends solely on the correlation between the SEAS5 and ERA5 DJF NAO indices and contains no information about the model’s internal relationships among X, Z, and Y. By contrast, the total effect τ and the mediated effect αβ derive entirely from SEAS5’s internal covariance structure. There is therefore no algebraic or definitional link between skill and any aspect of the model’s SST–NAO relationship. Any association between the two reflects actual co-variation between an external validation measure and internal model dynamics, rather than an outcome expected by construction.

To investigate the association between τ and ρ, bootstrapping was used to generate an ensemble of 10 000 SEAS5 series, each with the same length as the number of years in the study period (43). The results are not sensitive to this choice of length, and the same bootstrap ensemble is analysed further in Sect. 4.6.

For each bootstrap series, two quantities were computed: (1) $\hat{τ}$ , the sample SEAS5-internal correlation between the November SST index and the DJF NAO index; and (2) $\hat{ρ}$ , the sample NAO skill. Figure 2 shows a scatterplot of these parameters for a subset of the series. The correlation across all the 10 000 series is positive (r=0.33) and significant at the 5 % level. This does not imply that a strong SST–NAO relationship is sufficient or strictly required for high NAO skill, since the NAO is influenced by many processes unrelated to North Atlantic SST variability. Rather, the result provides support for a central premise of this study: the extent to which the model reproduces the observed influence of November SST anomalies on the winter NAO contributes meaningfully to its overall NAO skill.
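The bootstrap procedure described above can be sketched as follows. This is a hedged illustration, not the archived analysis code: the resampling scheme (drawing 43 year–member pairs with replacement for each series), the ensemble size of 25 and all index values are assumptions, and the synthetic data are not intended to reproduce the reported r=0.33.

```python
import numpy as np


def rowwise_corr(a, b):
    """Pearson correlation between matching rows of two 2-D arrays."""
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    return (a * b).sum(axis=1) / np.sqrt((a**2).sum(axis=1) * (b**2).sum(axis=1))


rng = np.random.default_rng(1)
n_years, n_members, n_boot = 43, 25, 10_000  # n_members is an assumption

# Synthetic stand-ins for the indices (not real data).
nao_obs = rng.standard_normal(n_years)               # "ERA5" DJF NAO index
sst_mod = rng.standard_normal((n_years, n_members))  # model November SST index
nao_mod = (0.1 * sst_mod + 0.1 * nao_obs[:, None]
           + rng.standard_normal((n_years, n_members)))  # model DJF NAO index

# Assumed scheme: each bootstrap series is 43 (year, member) pairs drawn
# with replacement; observed years are matched to the sampled years.
years = rng.integers(0, n_years, size=(n_boot, n_years))
mems = rng.integers(0, n_members, size=(n_boot, n_years))

tau_hat = rowwise_corr(sst_mod[years, mems], nao_mod[years, mems])  # internal SST-NAO link
rho_hat = rowwise_corr(nao_mod[years, mems], nao_obs[years])        # NAO skill

# Association between the two quantities across the bootstrap ensemble.
r = float(np.corrcoef(tau_hat, rho_hat)[0, 1])
print(f"correlation between tau_hat and rho_hat: {r:.2f}")
```

Vectorising the resampling through integer index arrays keeps the 10 000 series in memory at once, which is feasible here because each series holds only 43 values.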

https://wcd.copernicus.org/articles/7/507/2026/wcd-7-507-2026-f02

Figure 2. Results for a 10 000-member bootstrap ensemble of SEAS5 series, each of length 43. For each bootstrap sample, the x-axis shows the SEAS5-internal correlation between the November SST index and the DJF NAO index, and the y-axis shows the NAO skill $\hat{ρ}$, defined as the correlation between the DJF NAO indices in SEAS5 and ERA5 (with ERA5 years matched to the bootstrap sample). To enhance readability, only 1000 randomly chosen points are shown.


4.3 ERA5 climatology and SEAS5 bias

Before examining the role of SHF and baroclinicity in mediating the SST–NAO relationship, it is useful to consider the climatological context. In Fig. 3, ERA5 climatologies and SEAS5 biases are therefore shown, starting with the mean November SSTs in the North Atlantic in Fig. 3a. A prominent feature is the strong SST gradient along the boundary between the warm Gulf Stream waters and the much colder waters along the North American coastline. These gradients give rise to intense SHF on the warm side of the front (Fig. 3b), and they also coincide with strong low-level baroclinicity (Fig. 3c). The last panel in the top row, Fig. 3d, shows the climatological SLP pattern, which is characterised by a dipole between the Icelandic Low and the Azores High.

https://wcd.copernicus.org/articles/7/507/2026/wcd-7-507-2026-f03

Figure 3. Top row (a–d): ERA5 climatologies for (a) November SST (K); (b) DJF surface heat fluxes (W m⁻²); (c) DJF Eady growth rate maximum σ_E (d⁻¹); (d) DJF SLP (hPa). The bottom row (e–h) shows the SEAS5 biases (SEAS5 minus ERA5) for the same variables as in the top row.

The aforementioned warm SST bias in the western North Atlantic (Stockdale et al., 2018; Tietsche et al., 2020) is visible as a tongue-like feature in the east–west direction south of Greenland in Fig. 3e. This is also linked to a clearly defined positive DJF SHF bias in Fig. 3f. The poor SEAS5 representation of the SST gradient along the Gulf Stream seen in Fig. 3a is also of interest. Figure 3e reveals a pronounced warm bias on the cold side of the front and a weaker cold bias on the warm side, resulting in an overall weakened gradient. The SHF biases in Fig. 3f reflect these SST errors, generally showing fluxes that are too strong in warm-biased regions and too weak in cold-biased sectors. Although not shown here, these SHF biases project strongly onto the DJF SST bias in the western part of the basin; these are larger in magnitude than the ones for November in Fig. 3e, but for the most part they have the same sign, suggesting a growth of the model's SST bias with lead time. Figure 3g indicates that the underestimated SST gradients along the Gulf Stream are associated with too weak baroclinicity in the storm track entrance region, while σ_E is too high to the south.

These errors imply a distorted storm track: too few cyclones over the northern North Atlantic and too many further south, which is consistent with the IFS cyclone bias investigations by Jung et al. (2006) and Büeler et al. (2024). This interpretation is supported by the mean SLP bias pattern in Fig. 3h, which shows that SEAS5 underestimates the amplitude of the observed NAO-like dipole. Sampled at representative grid points near the two NAO centres of action (Stykkishólmur, Iceland, and Ponta Delgada, Azores), the mean SLP bias amounts to +1.0 and −0.4 hPa, respectively, giving a bias in the north–south difference of +1.4 hPa. This confirms that the model’s climatological pressure contrast is weaker than observed, implying westerlies that are too weak across the subpolar North Atlantic. The most distinct weakening of the westerlies occurs between the mid-basin negative SLP bias and the positive bias near Iceland. This corresponds to the weak negative SHF bias observed in the same region (Fig. 3f). In this area with suppressed westerlies, surface fluxes are likely underestimated because reduced wind speeds dampen the intensity of the cold-air advection from the west.
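As a quick check of the arithmetic above, let $b_{\mathrm{N}}$ and $b_{\mathrm{S}}$ denote the mean SLP biases at the northern (Iceland) and southern (Azores) points; these symbols are introduced here for illustration only. The bias in the north–south difference is then

$$ b_{\mathrm{N}} - b_{\mathrm{S}} = (+1.0) - (-0.4) = +1.4\ \mathrm{hPa}. $$

Because the climatological pressure is lower in the north than in the south, a positive value means that the model's north–south pressure contrast is too small.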

4.4 Mediated effects

Figures 4 and 5 show the sample parameters $\hat{α}$ and $\hat{β}$, as well as their product $\hat{α} \hat{β}$, for both mediators. The unit is standard deviations (SD), as all the variables in the regression equations were standardised prior to estimating the coefficients. It is repeated for emphasis that α, the regression coefficient linking November SSTs to the mediator (X→Z) in Eq. (3), captures all routes through which SST anomalies influence Z. This includes the indirect effect via the NAO (i.e. the pathway $X \to Y \to Z$), as well as other pathways not explicitly considered here. The NAO-independent contribution of SSTs to Z, denoted α^′ in Eq. (5), is examined in Sect. 4.5.
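For readers unfamiliar with mediation analysis, the coefficients can be illustrated at a single grid point with a minimal sketch. It assumes the standard linear form in the spirit of Baron and Kenny (1986), in which α comes from regressing Z on X and β from regressing Y on X and Z together; whether this matches Eqs. (1)–(5) term by term is an assumption, and the series are synthetic. The name `tau_direct` for the X coefficient in the joint regression is introduced here for illustration.

```python
import numpy as np


def standardise(v):
    """Scale a 1-D series to zero mean and unit standard deviation."""
    return (v - v.mean()) / v.std()


def mediation(x, z, y):
    """Return (tau, alpha, beta, tau_direct) for standardised x, z, y."""
    x, z, y = (standardise(v) for v in (x, z, y))
    tau = float(x @ y / (x @ x))    # total effect: Y regressed on X
    alpha = float(x @ z / (x @ x))  # X -> Z: Z regressed on X
    # Y regressed on X and Z together gives the direct effect and beta.
    coefs, *_ = np.linalg.lstsq(np.column_stack([x, z]), y, rcond=None)
    tau_direct, beta = (float(c) for c in coefs)
    return tau, alpha, beta, tau_direct


rng = np.random.default_rng(2)
n = 43
x = rng.standard_normal(n)                      # stand-in for the November SST index
z = 0.5 * x + rng.standard_normal(n)            # stand-in for a mediator at one grid point
y = 0.1 * x + 0.4 * z + rng.standard_normal(n)  # stand-in for the DJF NAO index

tau, alpha, beta, tau_direct = mediation(x, z, y)
mediated = alpha * beta  # mediated effect in SD units
print(f"tau={tau:.3f}, direct={tau_direct:.3f}, mediated={mediated:.3f}")
```

With ordinary least squares fitted on the same sample, the total effect decomposes exactly into the direct effect plus the mediated effect, which is a convenient consistency check when the calculation is repeated at every grid point.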

https://wcd.copernicus.org/articles/7/507/2026/wcd-7-507-2026-f04

Figure 4. The top row shows sample parameters for ERA5 for the surface heat flux pathway: (a) $\hat{α}$; (b) $\hat{β}$; (c) the mediated effect ($\hat{α} \hat{β}$). Panels (d)–(f) show the corresponding parameters for SEAS5. Dots indicate where the parameters differ significantly from zero at the 5 % level. Unit in all panels: standard deviations (SD).

https://wcd.copernicus.org/articles/7/507/2026/wcd-7-507-2026-f05

Figure 5. As Fig. 4, but for the baroclinicity parameter σ_E.

4.4.1 Surface heat fluxes

From Fig. 4a, it emerges that the November SST index yields positive SHF coefficients in large parts of the Subpolar Gyre in ERA5. These positive $\hat{α}$ values largely coincide with positive $\hat{β}$ values (Fig. 4b). The product $\hat{α} \hat{β}$ therefore yields a pronounced mediated effect in the reference region (Fig. 4c), consistent with Kolstad and O'Reilly (2024). This implies that heat fluxes in this area play an important role in mediating the effect of November SSTs on the winter NAO.

Limited suppression occurs in the mid-basin area, suggesting a negative feedback mechanism. This happens because $\hat{α}$ and $\hat{β}$ have opposing signs; in other words, the SST index generates a flux response that counteracts the contemporaneous NAO–SHF relationship.

Figure 4d reveals that SEAS5 yields barely any significant $\hat{α}$ values. Although there is a fair degree of spatial correspondence with the findings for ERA5, the uniformly positive structure south of Greenland seen in the reanalysis is lacking. Instead, $\hat{α}$ is partly negative; this roughly overlaps with the positive SST and SHF biases identified in Fig. 3e, f. The spatial match is not exact, however, and there is no obvious mechanistic link between the bias and the sign reversal. Regardless of its origin, the model produces the wrong sign of the SST-flux relationship in a dynamically important region, in marked contrast to ERA5.

In comparison, the pattern of $\hat{β}$ (Fig. 4e) does resemble the one in ERA5, demonstrating that SEAS5 has a strong and mainly correct contemporaneous SHF–NAO relationship. However, the mediated effect $\hat{α} \hat{β}$ in SEAS5 shown in Fig. 4f diverges from ERA5, with no significant mediation in the Subpolar Gyre region. In light of the strong $\hat{β}$ pattern, this suggests that the SHF-related part of this weak SST–NAO correlation is due to the inadequate $\hat{α}$ representation. This is discussed further in Sect. 4.5.

Lastly, it is noteworthy that SEAS5 exhibits suppression in the same mid-basin domain as ERA5.

4.4.2 Baroclinicity

The top row of Fig. 5 (panels a–c) shows a distinctly positive mediated effect in ERA5 in the western storm track entrance region, in a wide corridor further south, and near Iceland. In all these areas, the sign of $\hat{α}$ and $\hat{β}$ is the same, and the spatial structures of these parameters are similar. This likeness could indicate that the mediated effect is mainly due to the effect of the NAO on the baroclinicity (i.e. Y→Z). In that case, $\hat{α^{'}}$ in Eq. (5) is expected to be near-zero; this is explored in the next section. Even so, feedbacks between the NAO and baroclinicity (i.e. eddy-mean flow feedbacks) still appear to be an important mechanism for maintaining the NAO.

The picture for SEAS5 (Fig. 5d–f) is similar to ERA5 in the sense that the signs of $\hat{α}$ and $\hat{β}$ overlap in two bands across the North Atlantic. However, the magnitude of $\hat{β}$ is distinctly larger than that of $\hat{α}$ ; clearly the magnitude of the muted mediated effect in panel (f) is dictated by $\hat{α}$ . Neither $\hat{α}$ nor $\hat{α} \hat{β}$ is significant anywhere.

4.5 Disentangling forcing and feedback

The findings in the previous section raised questions about the directionality of the mediated effects associated with both SHF and baroclinicity. Although the pathway $X \to Z \to Y$ is not meant to be interpreted as strictly unidirectional, evaluating the X→Z link in isolation helps determine the extent to which November SST anomalies generate responses independently of the NAO. To that end, the leftmost panels in Fig. 6 show the sample parameter $\hat{α^{'}}$ from Eq. (5), which isolates the X→Z influence with the NAO regressed out, for SST and SLP. These variables, which are not considered as mediators, are analysed here because they indicate changes in the lower boundary (SST) and circulation (SLP).
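The idea of regressing out the NAO can be sketched as follows, assuming that Eq. (5) takes the form of a joint regression of Z on both X and Y, with α′ being the coefficient on X. That form, the function name `forced_coefficients` and the synthetic series are assumptions made for illustration. The example constructs a chain in which Z responds only to Y, so that α is clearly non-zero while α′ is close to zero.

```python
import numpy as np


def standardise(v):
    """Scale a 1-D series to zero mean and unit standard deviation."""
    return (v - v.mean()) / v.std()


def forced_coefficients(x, y, z):
    """Return (alpha, alpha_prime) for standardised series.

    alpha:       Z regressed on X alone (all routes from X to Z).
    alpha_prime: coefficient on X when Z is regressed on X and Y jointly,
                 i.e. the X -> Z link with Y regressed out.
    """
    x, y, z = (standardise(v) for v in (x, y, z))
    alpha = float(x @ z / (x @ x))
    coefs, *_ = np.linalg.lstsq(np.column_stack([x, y]), z, rcond=None)
    alpha_prime = float(coefs[0])
    return alpha, alpha_prime


rng = np.random.default_rng(3)
n = 5000  # deliberately long synthetic series so that sampling noise is small
x = rng.standard_normal(n)            # stand-in for the November SST index
y = 0.7 * x + rng.standard_normal(n)  # stand-in for the NAO, driven by X
z = 0.8 * y + rng.standard_normal(n)  # stand-in for a mediator responding to Y only

alpha, alpha_prime = forced_coefficients(x, y, z)
print(f"alpha={alpha:.2f}, alpha_prime={alpha_prime:.2f}")
```

A near-zero α′ alongside a sizeable α is the signature of a response that reaches Z through Y rather than directly from X, which is the distinction this section draws between forcing and feedback.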

https://wcd.copernicus.org/articles/7/507/2026/wcd-7-507-2026-f06

Figure 6. The left column shows the November SST-forced effect ($\hat{α^{'}}$; Eq. 5) on DJF SST (colours) and SLP (contours with interval 0.1 SD; positive solid, negative dashed; zero omitted) in ERA5 (a) and SEAS5 (d). Dots indicate where $\hat{α^{'}}$ for SST is significantly different from zero at the 5 % level. The remaining panels show the November SST-forced mediated effect ($\hat{α^{'}} \hat{β}$) on ERA5 SHF (b), ERA5 baroclinicity (c), SEAS5 SHF (e), and SEAS5 baroclinicity (f), all in DJF. Dots denote where the ratio $\hat{α^{'}} / \hat{α}$ is significantly positive at the 5 % level.

Starting with ERA5, Fig. 6a shows that, once the NAO contribution is removed, November SST anomalies induce an SLP pattern dominated by positive coefficients over the south-western North Atlantic. This pattern implies anomalous northerly advection in positive phases and southerly advection in negative phases. The associated SST response resembles the November antecedent in Fig. 1a (as expected from oceanic inertia), but there is more pronounced mid-basin dominance with positive values.

In SEAS5, the $\hat{α^{'}}$ field in Fig. 6d shows an SST structure broadly similar to the ERA5 pattern in panel (a), with one notable exception: significant negative values appear south of Greenland. A similar sign discrepancy was already seen for the SHF $\hat{α}$ coefficient (Fig. 4d), indicating that the SST–flux response in this region is systematically misrepresented in SEAS5. These negative $\hat{α^{'}}$ values lie near the well-documented positive SST bias during the early reforecast period (Stockdale et al., 2018; Tietsche et al., 2020), and when the analysis is repeated for 2001–2023, when this bias was much smaller, the negative values largely disappear. This suggests that the sign error may be linked to compensating adjustments associated with the bias, although the spatial correspondence is not exact and the mechanism cannot be established here. However, this issue is not central to the present study – while the bias alters some spatial details in SEAS5, the mediated effect in SEAS5 is not significant in this region in either period, and the skill–mediation covariability discussed in Sect. 4.6 is unaffected.

The SST-forced mediated effect $\hat{α^{'}} \hat{β}$ via surface heat fluxes in ERA5 is shown in Fig. 6b. In this panel the dots indicate areas where the null hypothesis – that α^′ and α have opposite signs – can be rejected at the 5 % level across the 10 000 bootstrap samples. In these areas the SST-forced component contributes in the same direction as the full mediated effect, which includes contemporaneous feedbacks from the NAO onto the fluxes. Over parts of the Subpolar Gyre, where β is positive (Fig. 4b), $\hat{α^{'}} \hat{β}$ is partly positive or near-neutral, but few areas are marked with dots. This indicates that the strong total mediation seen in Fig. 4c is largely attributable to the Y→Z pathway; that is, NAO feedbacks on the fluxes dominate in this region. Further south in the North Atlantic, $\hat{α^{'}} \hat{β}$ is negative over a broad area. The density of dots there indicates that the NAO-independent SST-forced component contributes to suppressing the SST–NAO correlation. A similar but less extensive pattern appears in the full mediated effect $\hat{α} \hat{β}$ in Fig. 4c.
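One way to implement a sign-agreement test of this kind is sketched below. It assumes that the null hypothesis is rejected at a grid point when fewer than 5 % of the bootstrap samples give α′ and α opposite signs; this decision rule, the function name `same_sign_mask` and the synthetic bootstrap values are assumptions rather than a description of the archived code.

```python
import numpy as np


def same_sign_mask(alpha_boot, alpha_prime_boot, level=0.05):
    """Flag grid points where alpha' and alpha robustly share the same sign.

    Both inputs have shape (n_boot, n_points). A point is flagged when the
    fraction of bootstrap samples with opposite (or zero) sign is below `level`.
    """
    opposite = (alpha_boot * alpha_prime_boot) <= 0
    p_opposite = opposite.mean(axis=0)
    return p_opposite < level, p_opposite


rng = np.random.default_rng(4)
n_boot = 10_000

# Synthetic bootstrap distributions at two hypothetical grid points.
alpha_boot = rng.normal(0.4, 0.1, size=(n_boot, 2))
alpha_prime_boot = np.column_stack([
    rng.normal(0.3, 0.1, size=n_boot),  # point 0: alpha' clearly positive
    rng.normal(0.0, 0.1, size=n_boot),  # point 1: alpha' centred on zero
])

dots, p_opposite = same_sign_mask(alpha_boot, alpha_prime_boot)
print(dots, p_opposite)
```

Testing the sign of the product is equivalent to testing the sign of the ratio α′/α, but it avoids dividing by values of α that happen to fall close to zero in individual bootstrap samples.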

In SEAS5 (Fig. 6e), the spatial structure of $\hat{α^{'}} \hat{β}$ resembles the pattern of the full mediated effect $\hat{α} \hat{β}$ in Fig. 4c. Some areas are marked with dots, including the mid-basin region exhibiting suppression. This implies that the SST-forced component plays a role in this suppression, matching the ERA5 result in Fig. 6b.

For baroclinicity, Fig. 6c shows that the SST-forced mediated effect is mainly positive in the two bands where the total mediated effect $\hat{α} \hat{β}$ is positive and significant (Fig. 5c), albeit noticeably weaker in magnitude. Parts of these bands are marked with dots, indicating where the SST-forced component plays a limited role in the full mediation. Figure 6f demonstrates that SEAS5 similarly produces weak, positive $\hat{α^{'}} \hat{β}$ in the two bands, but no dots appear where the mediated effect is strongest in magnitude. This suggests that the SST-driven component of the mediated effect in the crucial areas is negligible.

In summary, the directional picture is heterogeneous but broadly consistent with a dominant NAO → mediator pathway. For SHF, the SST-driven contribution mainly projects onto the mid-basin area where suppression dominates, while over parts of the Subpolar Gyre the strong total mediation appears to be largely attributable to NAO feedbacks onto the fluxes. For baroclinicity, ERA5 indicates a modest SST-forced contribution aligned with the total response in two bands – albeit noticeably weaker than the full mediation – whereas SEAS5 shows no such contribution. These results motivate the next step: to assess whether variations in these internally generated mediation patterns are associated with variations in external NAO forecast skill.

4.6 Relating mediated effects to NAO prediction skill

In Sect. 4.2, a modest but significant association was identified between the November-to-DJF SST–NAO correlation and the model's NAO skill $\hat{ρ}$ (r=0.33). This suggests that the mediated effect associated with the SST–NAO linkage may also relate to forecast skill. Although the mediation signal in SEAS5 is weak overall, it is not absent: the positive $\hat{α} \hat{β}$ values for baroclinicity (Fig. 5f) broadly overlap with those in ERA5 (Fig. 5c). For SHF (Fig. 4c, f), there is likewise some agreement, apart from the negative values south of Greenland noted earlier.

This section examines whether variations in mediation strength across subsets of SEAS5 realisations are associated with variations in NAO skill. To do so, the 10 000-member bootstrap ensemble introduced in Sect. 4.2 is revisited. For each bootstrap sample, the SEAS5 mediated effect $\hat{α} \hat{β}$ is estimated separately for SHF and σ_E, alongside the NAO skill $\hat{ρ}$ and the model-internal SST–NAO correlation from Sect. 4.2.

The maps in Fig. 7 show where, geographically, the mediated effect $\hat{α} \hat{β}$ co-varies with the NAO skill $\hat{ρ}$ across the SEAS5 bootstrap samples. The most prominent feature is that the strongest positive correlations occur in the regions where ERA5 exhibits robust positive mediation. For SHF (Fig. 7a), this positive covariability appears across the Subpolar Gyre, even south of Greenland, where the overall $\hat{α} \hat{β}$ in SEAS5 is negative (Fig. 4f). This indicates that, within SEAS5, bootstrap subsets in which the model produces a mediation pattern more closely resembling ERA5 are also the subsets with higher NAO skill. Conversely, samples that yield negative $\hat{α} \hat{β}$ in these regions tend to have lower skill. Thus, even though SEAS5 does not reproduce the magnitude or sign of the mediated effect perfectly, its internal covariability shows that more realistic mediation pathways are associated with improved NAO prediction skill.
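The type of map discussed above can be computed with a short vectorised routine. The sketch below is illustrative only: it assumes that the bootstrap estimates of the mediated effect are stored as an array of shape (samples, latitude, longitude) alongside a vector of skill values, and it uses random numbers with one artificially constructed grid point in place of SEAS5 output. The function name `skill_mediation_map` is hypothetical.

```python
import numpy as np


def skill_mediation_map(ab_boot, rho_boot):
    """Correlate skill with the mediated effect at every grid point.

    ab_boot:  (n_boot, n_lat, n_lon) mediated effect per bootstrap sample.
    rho_boot: (n_boot,) NAO skill per bootstrap sample.
    Returns an (n_lat, n_lon) map of Pearson correlations.
    """
    ab_anom = ab_boot - ab_boot.mean(axis=0)
    rho_anom = rho_boot - rho_boot.mean()
    cov = np.tensordot(rho_anom, ab_anom, axes=(0, 0))
    norm = np.sqrt((rho_anom**2).sum() * (ab_anom**2).sum(axis=0))
    return cov / norm


rng = np.random.default_rng(5)
n_boot, n_lat, n_lon = 1000, 4, 6  # small synthetic grid for illustration
rho_boot = rng.standard_normal(n_boot)
ab_boot = rng.standard_normal((n_boot, n_lat, n_lon))
ab_boot[:, 0, 0] += 0.5 * rho_boot  # one grid point with built-in covariability

corr_map = skill_mediation_map(ab_boot, rho_boot)
print(np.round(corr_map, 2))
```

Contracting over the bootstrap dimension with `np.tensordot` gives the full map in one pass, which scales better than looping over grid points when the sample count is 10 000.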

https://wcd.copernicus.org/articles/7/507/2026/wcd-7-507-2026-f07

Figure 7. Results for the 10 000 bootstrap samples introduced in Sect. 4.2. Shading shows the correlation between $\hat{ρ}$ and the mediated effect $\hat{α} \hat{β}$, computed from SEAS5 for each bootstrap sample, for (a) surface heat fluxes (SHF) and (b) baroclinicity. Dots indicate correlations that do not significantly differ from zero at the 5 % level. White contours reproduce the ERA5 mediated effect $\hat{α} \hat{β}$ from Figs. 4c and 5c in panels (a) and (b), respectively (contour interval 0.1 SD; positive solid, negative dashed; zero omitted).

A few regions also display negative correlations between $\hat{ρ}$ and $\hat{α} \hat{β}$ , but these do not overlap with the key regions where ERA5 exhibits strong SHF-mediated effects in the Subpolar Gyre. For baroclinicity, the correspondence between $\hat{ρ}$ and $\hat{α} \hat{β}$ is more uniformly related to the mediated effect in ERA5 (Fig. 7b).

Taken together, these patterns reinforce the main conclusion: the clearest and most physically interpretable skill-mediation covariability occurs in the regions where ERA5 displays robust positive mediation. In these areas, SEAS5 achieves higher NAO skill when it incidentally reproduces the observed mediation pathways, underscoring the importance of representing these air–sea feedbacks realistically in seasonal prediction systems.

As noted in Sect. 4.2, the skill and mediation metrics derive from entirely different sources of information. The fact that they co-vary in physically meaningful regions therefore supports the view that the ERA5-identified pathways correspond to mechanisms that matter for NAO predictability in the model.

5 Summary and discussion

In this paper, feedback pathways linking the state of the North Atlantic sea surface in late autumn and the NAO during the following winter have been explored. As in Kolstad and O'Reilly (2024), these pathways were investigated using mediation analysis, a branch of statistical causal inference methods that has seen little use in climate dynamics so far. The results demonstrate that feedbacks previously identified through idealised perturbation experiments in dynamical models can also be diagnosed directly from observational or reanalysis data. One advantage of this approach is that it avoids the need to manipulate boundary conditions like SSTs. Such manipulations can elicit compensatory model adjustments that complicate interpretation, particularly when the models themselves suffer from systematic biases. Mediation analysis instead infers causal structure directly from observed covariability, offering a complementary perspective on internal feedback pathways.

It must nevertheless be acknowledged that reanalysis products are themselves produced with models – in the case of ERA5, from the same model lineage as SEAS5. Thus, reanalyses are not free from biases, and their depiction of physical relationships may be influenced by model behaviour. Mediation analysis cannot fully resolve such issues, but by contrasting reanalysis-based and model-based feedbacks, it can help to pinpoint where key processes diverge.

Additional limitations should be kept in mind. For one, the mediation framework as applied here is linear and does not adequately capture nonlinear feedbacks. Further, SEAS5 is only one dynamical system; different models likely represent feedback differently. Future work could extend this examination to other models, some of which exhibit higher NAO skill than SEAS5 (Baker et al., 2024), or indeed multi-model ensembles, incorporating nonlinear mediation analysis techniques. Another limitation is that the analysis does not condition on remote precursors. For instance, ENSO can influence both late-autumn North Atlantic SSTs and the winter NAO, potentially giving rise to apparent, but non-causal, links between pre-winter SST and the NAO. An idea for future research could be to extend the framework by explicitly conditioning on, or stratifying by, for instance tropical predictors or stratosphere–troposphere interactions.

Notwithstanding these caveats, this study has extended Kolstad and O'Reilly (2024), where processes linking late-autumn SST anomalies and the winter NAO were analysed based on ERA5 data, by investigating these processes in the forecast system SEAS5. It was hypothesised that the SST–NAO relationship has a bearing on the NAO prediction skill in that model, and this was confirmed. Although the observed correlation (r=0.33) is modest, it still represents a non-trivial association in light of the many other processes that influence NAO skill, including the stratosphere, tropical SST variability, Arctic sea-ice anomalies, and internal atmospheric dynamics, to mention but a few.

Having established this link between the SST–NAO relationship and NAO skill, the analysis extended Kolstad and O'Reilly (2024) further by revealing a physically coherent sequence of processes underpinning the SST–NAO connection. Independently of the NAO, November SST anomalies induce a surface-pressure pattern that preconditions the atmosphere for anomalies in two mediators: surface heat flux (SHF) and baroclinicity in the western North Atlantic. These anomalies in turn nudge the NAO, which subsequently feeds back on both SHF and baroclinicity. A key outcome concerns the directionality of these causal pathways. Across large parts of the North Atlantic, forcing from the NAO onto the mediators was found to dominate. Crucially, however, in the regions with the strongest mediated effects through SHF and baroclinicity, the directionality was ambiguous, consistent with the existence of a two-way feedback mechanism.

It is important to emphasise that these feedbacks do not account for all aspects of NAO variability. The processes identified here represent one pathway among many, complementing the aforementioned influences from, for example, the stratosphere. Rather than providing a complete explanation, the results demonstrate how even a single coupled feedback sequence can shape NAO variability and how its misrepresentation in a prediction system may limit its ability to capture the full range of NAO behaviour.

A key finding is that these pathways are substantially weakened in SEAS5. This is likely linked to the muted SST–NAO relationship in the model relative to ERA5. When this link is weak, the total effect of SSTs on the mediators, labelled α herein, is also necessarily weak. Figure 7, where bootstrap resampling was used to explore the relationship between NAO prediction skill and the mediated SST–NAO effect, illustrates this succinctly. Model samples that exhibit a stronger mediated effect also show higher NAO prediction skill. Conversely, samples that by chance yield higher NAO skill also display a stronger mediated effect. This mutual dependence suggests that if the model were able to reproduce the SST–NAO pathways via SHF and baroclinicity more realistically, it would likely predict the NAO more accurately as well.

However, this counterfactual hypothesis cannot be tested directly because the model does not currently reproduce these pathways. Targeted experiments that enforce more realistic air-sea interactions could help clarify whether strengthening these pathways would indeed improve NAO prediction skill. The study by Roberts et al. (2021) provides a prime example of such experiments. Other improvements, such as enhancing the resolution of the ocean (e.g. Haarsma et al., 2019) or the atmosphere (e.g. Czaja et al., 2019; Wills et al., 2024), or improving the representation of eddy–mean flow feedbacks (e.g. Hardiman et al., 2022), have also been shown to produce more precise atmospheric responses to SST forcing.

The results presented here raise interesting questions for the emerging class of ML-based seasonal and subseasonal prediction systems (e.g. Chen et al., 2024; Kent et al., 2025). If trained on model-generated data or on reanalyses influenced by model biases, such systems risk inheriting some of the deficiencies documented here. Conversely, ML approaches trained directly on observations might bypass some of these problems – but whether they would be able to capture the same preconditioning and feedback structures as the real climate system is currently unknown.

Mediation analysis offers a powerful and versatile framework for tackling these research challenges. It can help pinpoint where models fail to represent key causal pathways, assess whether targeted improvements translate into more realistic coupled feedbacks and higher predictive skill, and evaluate whether ML-based forecasts reproduce the same physical linkages observed in nature. In a broader sense, mediation analysis can serve as a bridge between statistical diagnostics and both process studies and model development/evaluation, advancing our understanding of how both unidirectional and feedback mechanisms shape climate predictability.

Code availability

The software code used to generate the figures and analysis in this study is persistently archived on Zenodo at https://doi.org/10.5281/zenodo.19096274 (Kolstad, 2026).

Data availability

The atmospheric reanalysis and seasonal forecast data used in this study were obtained from the Copernicus Climate Change Service (C3S) Climate Data Store (CDS). Specifically, I used ERA5 monthly averaged data on pressure levels (https://doi.org/10.24381/cds.6860a573, Hersbach et al., 2023a) and single levels (https://doi.org/10.24381/cds.f17050d7, Hersbach et al., 2023b), as well as Seasonal forecast monthly statistics on single levels (https://doi.org/10.24381/cds.68dd14c3, Copernicus Climate Change Service, Climate Data Store, 2018a) and pressure levels (https://doi.org/10.24381/cds.0b79e7c5, Copernicus Climate Change Service, Climate Data Store, 2018b). All datasets are publicly accessible through the CDS. No new primary data were generated during this research.

Author contributions

EWK is the sole author and responsible for all aspects of this work.

Competing interests

The author has declared that there are no competing interests.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. The authors bear the ultimate responsibility for providing appropriate place names. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Acknowledgements

I wish to extend my gratitude to the editor, David Battisti, as well as two reviewers for improving the manuscript. I also thank the ECMWF for providing access to both ERA5 and SEAS5 data, and Timothy Stockdale at ECMWF for pointing me to the SST bias inherited from the ocean reanalysis. The figures use colour maps developed by Fabio Crameri (Crameri et al., 2020). Parts of the text were refined with the help of various Large Language Models (LLMs), which were used to improve clarity and language and to assist with coding tasks, but not to generate scientific content or conclusions.

Financial support

This research has been supported through Climate Futures, funded by Norges Forskningsråd (grant no. 309562).

Review statement

This paper was edited by David Battisti and reviewed by Robert Jnglin Wills and one anonymous referee.

References

Ambaum, M. H. P., Hoskins, B. J., and Stephenson, D. B.: Arctic Oscillation or North Atlantic Oscillation?, J. Climate, https://doi.org/10.1175/1520-0442(2001)014<3495:AOONAO>2.0.CO;2, 2001. a

Athanasiadis, P. J., Bellucci, A., Scaife, A. A., Hermanson, L., Materia, S., Sanna, A., Borrelli, A., MacLachlan, C., and Gualdi, S.: A Multisystem View of Wintertime NAO Seasonal Predictions, J. Climate, https://doi.org/10.1175/JCLI-D-16-0153.1, 2017. a

Athanasiadis, P. J., Ogawa, F., Omrani, N.-E., Keenlyside, N., Schiemann, R., Baker, A. J., Vidale, P. L., Bellucci, A., Ruggieri, P., Haarsma, R., Roberts, M., Roberts, C., Novak, L., and Gualdi, S.: Mitigating Climate Biases in the Midlatitude North Atlantic by Increasing Model Resolution: SST Gradients and Their Relation to Blocking and the Jet, J. Climate, https://doi.org/10.1175/JCLI-D-21-0515.1, 2022. a, b

Baker, H. S., Woollings, T., Forest, C. E., and Allen, M. R.: The Linear Sensitivity of the North Atlantic Oscillation and Eddy-Driven Jet to SSTs, J. Climate, 32, 6491–6511, https://doi.org/10.1175/JCLI-D-19-0038.1, 2019. a

Baker, L. H., Shaffrey, L. C., Johnson, S. J., and Weisheimer, A.: Understanding the Intermittency of the Wintertime North Atlantic Oscillation and East Atlantic Pattern Seasonal Forecast Skill in the Copernicus C3S Multi-Model Ensemble, Geophys. Res. Lett., 51, e2024GL108472, https://doi.org/10.1029/2024GL108472, 2024. a, b, c, d

Baron, R. M. and Kenny, D. A.: The Moderator–Mediator Variable Distinction in Social Psychological Research: Conceptual, Strategic, and Statistical Considerations, J. Pers. Soc. Psychol., 51, 1173–1182, https://doi.org/10.1037/0022-3514.51.6.1173, 1986. a, b

Bellucci, A., Athanasiadis, P. J., Scoccimarro, E., Ruggieri, P., Gualdi, S., Fedele, G., Haarsma, R. J., Garcia-Serrano, J., Castrillo, M., Putrahasan, D., Sanchez-Gomez, E., Moine, M.-P., Roberts, C. D., Roberts, M. J., Seddon, J., and Vidale, P. L.: Air-Sea Interaction over the Gulf Stream in an Ensemble of HighResMIP Present Climate Simulations, Clim. Dynam., 56, 2093–2111, https://doi.org/10.1007/s00382-020-05573-z, 2021. a

Büeler, D., Sprenger, M., and Wernli, H.: Northern Hemisphere Extratropical Cyclone Biases in ECMWF Subseasonal Forecasts, Q. J. Roy. Meteorol. Soc., 150, 1096–1123, https://doi.org/10.1002/qj.4638, 2024. a

Cassou, C., Deser, C., and Alexander, M. A.: Investigating the Impact of Reemerging Sea Surface Temperature Anomalies on the Winter Atmospheric Circulation over the North Atlantic, J. Climate, https://doi.org/10.1175/JCLI4202.1, 2007. a

Chen, L., Zhong, X., Li, H., Wu, J., Lu, B., Chen, D., Xie, S.-P., Wu, L., Chao, Q., Lin, C., Hu, Z., and Qi, Y.: A Machine Learning Model That Outperforms Conventional Global Subseasonal Forecast Models, Nat. Commun., 15, 6425, https://doi.org/10.1038/s41467-024-50714-1, 2024. a

Conger, A. J.: A Revised Definition for Suppressor Variables: A Guide To Their Identification and Interpretation, Educ. Psychol. Meas., 34, 35–46, https://doi.org/10.1177/001316447403400105, 1974. a

Copernicus Climate Change Service, Climate Data Store: Seasonal forecast monthly statistics on single levels, Copernicus Climate Change Service (C3S) Climate Data Store (CDS) [data set], https://doi.org/10.24381/cds.68dd14c3, 2018a. a

Copernicus Climate Change Service, Climate Data Store: Seasonal forecast monthly statistics on pressure levels, Copernicus Climate Change Service (C3S) Climate Data Store (CDS) [data set], https://doi.org/10.24381/cds.0b79e7c5, 2018b. a

Crameri, F., Shephard, G. E., and Heron, P. J.: The Misuse of Colour in Science Communication, Nat. Commun., 11, 5444, https://doi.org/10.1038/s41467-020-19160-7, 2020. a

Czaja, A. and Frankignoul, C.: Observed Impact of Atlantic SST Anomalies on the North Atlantic Oscillation, J. Climate, 15, 606–623, https://doi.org/10.1175/1520-0442(2002)015<0606:OIOASA>2.0.CO;2, 2002. a, b

Czaja, A., Frankignoul, C., Minobe, S., and Vannière, B.: Simulating the Midlatitude Atmospheric Circulation: What Might We Gain From High-Resolution Modeling of Air-Sea Interactions?, Curr. Clim. Change Rep., 5, 390–406, https://doi.org/10.1007/s40641-019-00148-5, 2019. a

Dawson, A.: Eofs: A Library for EOF Analysis of Meteorological, Oceanographic, and Climate Data, J. Open Res. Softw., 4, https://doi.org/10.5334/jors.122, 2016. a

Degenhardt, L., Leckebusch, G. C., and Scaife, A. A.: Large-Scale Circulation Patterns and Their Influence on European Winter Windstorm Predictions, Clim. Dynam., 60, 3597–3611, https://doi.org/10.1007/s00382-022-06455-2, 2023. a

Docquier, D., Di Capua, G., Donner, R. V., Pires, C. A. L., Simon, A., and Vannitsem, S.: A comparison of two causal methods in the context of climate analyses, Nonlin. Processes Geophys., 31, 115–136, https://doi.org/10.5194/npg-31-115-2024, 2024. a

Ebert-Uphoff, I. and Deng, Y.: Causal Discovery for Climate Research Using Graphical Models, J. Climate, https://doi.org/10.1175/JCLI-D-11-00387.1, 2012. a

Gastineau, G. and Frankignoul, C.: Influence of the North Atlantic SST Variability on the Atmospheric Circulation during the Twentieth Century, J. Climate, https://doi.org/10.1175/JCLI-D-14-00424.1, 2015. a

Granger, C. W.: Investigating Causal Relations by Econometric Models and Cross-Spectral Methods, Econometrica, pp. 424–438, https://doi.org/10.2307/1912791, 1969. a

Greene, W. H.: Econometric Analysis, Pearson Education, 2003. a

Haarsma, R. J., García-Serrano, J., Prodhomme, C., Bellprat, O., Davini, P., and Drijfhout, S.: Sensitivity of Winter North Atlantic-European Climate to Resolved Atmosphere and Ocean Dynamics, Sci. Rep., 9, 13358, https://doi.org/10.1038/s41598-019-49865-9, 2019. a, b

Hall, R. J., Scaife, A. A., Hanna, E., Jones, J. M., and Erdélyi, R.: Simple Statistical Probabilistic Forecasts of the Winter NAO, Weather Forecast., https://doi.org/10.1175/WAF-D-16-0124.1, 2017. a, b

Hannart, A., Pearl, J., Otto, F. E. L., Naveau, P., and Ghil, M.: Causal Counterfactual Theory for the Attribution of Weather and Climate-Related Events, B. Am. Meteorol. Soc., https://doi.org/10.1175/BAMS-D-14-00034.1, 2016. a

Hardiman, S. C., Dunstone, N. J., Scaife, A. A., Smith, D. M., Comer, R., Nie, Y., and Ren, H.-L.: Missing Eddy Feedback May Explain Weak Signal-to-Noise Ratios in Climate Predictions, npj Clim. Atmos. Sci., 5, 57, https://doi.org/10.1038/s41612-022-00280-4, 2022. a, b, c

Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., Muñoz-Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D., Simmons, A., Soci, C., Abdalla, S., Abellan, X., Balsamo, G., Bechtold, P., Biavati, G., Bidlot, J., Bonavita, M., De Chiara, G., Dahlgren, P., Dee, D., Diamantakis, M., Dragani, R., Flemming, J., Forbes, R., Fuentes, M., Geer, A., Haimberger, L., Healy, S., Hogan, R. J., Hólm, E., Janisková, M., Keeley, S., Laloyaux, P., Lopez, P., Lupu, C., Radnoti, G., de Rosnay, P., Rozum, I., Vamborg, F., Villaume, S., and Thépaut, J.-N.: The ERA5 Global Reanalysis, Q. J. Roy. Meteorol. Soc., 146, 1999–2049, https://doi.org/10.1002/qj.3803, 2020. a

Hersbach, H., Bell, B., Berrisford, P., Biavati, G., Horányi, A., Muñoz Sabater, J., Nicolas, J., Peubey, C., Radu, R., Rozum, I., Schepers, D., Simmons, A., Soci, C., Dee, D., and Thépaut, J.-N.: ERA5 monthly averaged data on pressure levels from 1940 to present, Copernicus Climate Change Service (C3S) Climate Data Store (CDS) [data set], https://doi.org/10.24381/cds.6860a573, 2023a. a

Hersbach, H., Bell, B., Berrisford, P., Biavati, G., Horányi, A., Muñoz Sabater, J., Nicolas, J., Peubey, C., Radu, R., Rozum, I., Schepers, D., Simmons, A., Soci, C., Dee, D., and Thépaut, J.-N.: ERA5 monthly averaged data on single levels from 1940 to present, Copernicus Climate Change Service (C3S) Climate Data Store (CDS) [data set], https://doi.org/10.24381/cds.f17050d7, 2023b. a

Hertig, E., Beck, C., Wanner, H., and Jacobeit, J.: A Review of Non-Stationarities in Climate Variability of the Last Century with Focus on the North Atlantic–European Sector, Earth-Sci. Rev., 147, 1–17, https://doi.org/10.1016/j.earscirev.2015.04.009, 2015. a

Hewitt, H. T., Bell, M. J., Chassignet, E. P., Czaja, A., Ferreira, D., Griffies, S. M., Hyder, P., McClean, J. L., New, A. L., and Roberts, M. J.: Will High-Resolution Global Ocean Models Benefit Coupled Predictions on Short-Range to Climate Timescales?, Ocean Modell., 120, 120–136, https://doi.org/10.1016/j.ocemod.2017.11.002, 2017. a

Hoskins, B. J. and Valdes, P. J.: On the Existence of Storm-Tracks, J. Atmos. Sci., 47, 1854–1864, 1990. a

Jin, X. and Yu, L.: Assessing High-Resolution Analysis of Surface Heat Fluxes in the Gulf Stream Region, J. Geophys. Res.-Oceans, 118, 5353–5375, https://doi.org/10.1002/jgrc.20386, 2013. a

Johnson, S. J., Stockdale, T. N., Ferranti, L., Balmaseda, M. A., Molteni, F., Magnusson, L., Tietsche, S., Decremer, D., Weisheimer, A., Balsamo, G., Keeley, S. P. E., Mogensen, K., Zuo, H., and Monge-Sanz, B. M.: SEAS5: the new ECMWF seasonal forecast system, Geosci. Model Dev., 12, 1087–1117, https://doi.org/10.5194/gmd-12-1087-2019, 2019. a

Joyce, T. M., Kwon, Y.-O., Seo, H., and Ummenhofer, C. C.: Meridional Gulf Stream Shifts Can Influence Wintertime Variability in the North Atlantic Storm Track and Greenland Blocking, Geophys. Res. Lett., 46, 1702–1708, https://doi.org/10.1029/2018GL081087, 2019. a

Jung, T., Gulev, S. K., Rudeva, I., and Soloviov, V.: Sensitivity of Extratropical Cyclone Characteristics to Horizontal Resolution in the ECMWF Model, Q. J. Roy. Meteorol. Soc., 132, 1839–1857, https://doi.org/10.1256/qj.05.212, 2006. a

Keeley, S., Mogensen, K., Bidlot, J., Balmaseda, M. A., and Hatfield, S.: Introduction of a New Ocean and Sea-Ice Model Based on NEMO4-SI3, ECMWF Newsletter, 180, 24–29, https://doi.org/10.21957/sk4928ds0a, 2024. a

Kent, C., Scaife, A. A., Dunstone, N. J., Smith, D., Hardiman, S. C., Dunstan, T., and Watt-Meyer, O.: Skilful Global Seasonal Predictions from a Machine Learning Weather Model Trained on Reanalysis Data, npj Clim. Atmos. Sci., 8, 314, https://doi.org/10.1038/s41612-025-01198-3, 2025. a

Knutti, R. and Rugenstein, M. A. A.: Feedbacks, Climate Sensitivity and the Limits of Linear Models, Philos. T. R. Soc. A, 373, 20150146, https://doi.org/10.1098/rsta.2015.0146, 2015. a

Kolstad, E. W.: plasmaman/feedbacks-nao: Code for paper Deficient ocean–atmosphere feedbacks constrain seasonal NAO prediction (Version v1), Zenodo [code], https://doi.org/10.5281/zenodo.19096274, 2026. a

Kolstad, E. W. and O'Reilly, C. H.: Causal Oceanic Feedbacks onto the Winter NAO, Clim. Dynam., 62, 4223–4236, https://doi.org/10.1007/s00382-024-07128-y, 2024. a, b, c, d, e, f, g, h, i, j

Kolstad, E. W. and Screen, J. A.: Nonstationary Relationship Between Autumn Arctic Sea Ice and the Winter North Atlantic Oscillation, Geophys. Res. Lett., 46, 7583–7591, https://doi.org/10.1029/2019GL083059, 2019. a

Lewis, N. T., England, M. R., Screen, J. A., Geen, R., Mudhar, R., Seviour, W. J. M., and Thomson, S. I.: Assessing the Spurious Impacts of Ice-Constraining Methods on the Climate Response to Sea Ice Loss Using an Idealized Aquaplanet GCM, J. Climate, https://doi.org/10.1175/JCLI-D-24-0153.1, 2024. a

MacKinnon, D. P., Krull, J. L., and Lockwood, C. M.: Equivalence of the Mediation, Confounding and Suppression Effect, Prev. Sci., 1, 173–181, https://doi.org/10.1023/A:1026595011371, 2000. a, b

Maybee, B., Ward, N., Hirons, L. C., and Marsham, J. H.: Importance of Madden–Julian Oscillation Phase to the Interannual Variability of East African Rainfall, Atmos. Sci. Lett., 24, e1148, https://doi.org/10.1002/asl.1148, 2023. a

McGraw, M. C. and Barnes, E. A.: Memory Matters: A Case for Granger Causality in Climate Variability Studies, J. Climate, https://doi.org/10.1175/JCLI-D-17-0334.1, 2018. a

Mosedale, T. J., Stephenson, D. B., Collins, M., and Mills, T. C.: Granger Causality of Coupled Climate Processes: Ocean Feedback on the North Atlantic Oscillation, J. Climate, https://doi.org/10.1175/JCLI3653.1, 2006. a, b

Mu, B., Jiang, X., Yuan, S., Cui, Y., and Qin, B.: NAO Seasonal Forecast Using a Multivariate Air–Sea Coupled Deep Learning Model Combined with Causal Discovery, Atmosphere, 14, 792, https://doi.org/10.3390/atmos14050792, 2023. a

Muniz, F. B. and MacKinnon, D. P.: Three Approaches to Testing for Statistical Suppression, Multiv. Behav. Res., 60, 817–839, https://doi.org/10.1080/00273171.2025.2483245, 2025. a

Nguyen, T. Q., Schmid, I., and Stuart, E. A.: Clarifying Causal Mediation Analysis for the Applied Researcher: Defining Effects Based on What We Want to Learn, Psychol. Method., 26, 255–271, https://doi.org/10.1037/met0000299, 2021. a

O'Reilly, C. H., Patterson, M., Robson, J., Monerie, P. A., Hodson, D., and Ruprich-Robert, Y.: Challenges with Interpreting the Impact of Atlantic Multidecadal Variability Using SST-restoring Experiments, npj Clim. Atmos. Sci., 6, 14, https://doi.org/10.1038/s41612-023-00335-0, 2023. a

Palmer, T. N. and Weisheimer, A.: Diagnosing the Causes of Bias in Climate Models – Why Is It so Hard?, Geophys. Astrophys. Fluid Dynam., 105, 351–365, https://doi.org/10.1080/03091929.2010.547194, 2011. a

Pan, L.-L.: Observed Positive Feedback between the NAO and the North Atlantic SSTA Tripole, Geophys. Res. Lett., 32, 1–4, https://doi.org/10.1029/2005GL022427, 2005. a

Patrizio, C. R., Athanasiadis, P. J., Frankignoul, C., Iovino, D., Masina, S., Paolini, L. F., and Gualdi, S.: Improved Extratropical North Atlantic Atmosphere – Ocean Variability with Increasing Ocean Model Resolution, J. Climate, https://doi.org/10.1175/JCLI-D-23-0230.1, 2023. a

Patrizio, C. R., Athanasiadis, P. J., Smith, D. M., and Nicolì, D.: Ocean-Atmosphere Feedbacks Key to NAO Decadal Predictability, npj Clim. Atmos. Sci., 8, 146, 2025. a

Patterson, M., O'Reilly, C., Robson, J., and Woollings, T.: Disentangling North Atlantic Ocean – Atmosphere Coupling Using Circulation Analogs, J. Climate, 37, 3791–3805, https://doi.org/10.1175/JCLI-D-23-0602.1, 2024. a

Pearl, J., Glymour, M., and Jewell, N. P.: Causal Inference in Statistics: A Primer, John Wiley & Sons, ISBN 978-1-119-18686-1, 2016. a

Peng, S., Robinson, W. A., and Li, S.: North Atlantic SST Forcing of the NAO and Relationships with Intrinsic Hemispheric Variability, Geophys. Res. Lett., 29, 117-1–117-4, https://doi.org/10.1029/2001GL014043, 2002. a

Risser, M. D., Ombadi, M., and Wehner, M. F.: Granger Causal Inference for Climate Change Attribution, Environ. Res.-Clim., 4, 022001, https://doi.org/10.1088/2752-5295/add046, 2025. a

Rivière, G. and Orlanski, I.: Characteristics of the Atlantic Storm-Track Eddy Activity and Its Relation with the North Atlantic Oscillation, J. Atmos. Sci., https://doi.org/10.1175/JAS3850.1, 2007. a

Roberts, C. D., Vitart, F., and Balmaseda, M. A.: Hemispheric Impact of North Atlantic SSTs in Subseasonal Forecasts, Geophys. Res. Lett., 48, e2020GL091446, https://doi.org/10.1029/2020GL091446, 2021. a, b

Rodwell, M. J., Rowell, D. P., and Folland, C. K.: Oceanic Forcing of the Wintertime North Atlantic Oscillation and European Climate, Nature, 398, 320–323, 1999. a

Runge, J., Bathiany, S., Bollt, E., Camps-Valls, G., Coumou, D., Deyle, E., Glymour, C., Kretschmer, M., Mahecha, M. D., Muñoz-Marí, J., van Nes, E. H., Peters, J., Quax, R., Reichstein, M., Scheffer, M., Schölkopf, B., Spirtes, P., Sugihara, G., Sun, J., Zhang, K., and Zscheischler, J.: Inferring Causation from Time Series in Earth System Sciences, Nat. Commun., 10, 2553, https://doi.org/10.1038/s41467-019-10105-3, 2019. a

Scaife, A. A. and Smith, D.: A Signal-to-Noise Paradox in Climate Science, npj Clim. Atmos. Sci., 1, 28, https://doi.org/10.1038/s41612-018-0038-4, 2018. a, b

Scaife, A. A., Arribas, A., Blockley, E., Brookshaw, A., Clark, R. T., Dunstone, N., Eade, R., Fereday, D., Folland, C. K., Gordon, M., Hermanson, L., Knight, J. R., Lea, D. J., MacLachlan, C., Maidens, A., Martin, M., Peterson, A. K., Smith, D., Vellinga, M., Wallace, E., Waters, J., and Williams, A.: Skillful Long-Range Prediction of European and North American Winters, Geophys. Res. Lett., 41, 2514–2519, https://doi.org/10.1002/2014GL059637, 2014. a

Stockdale, T., Balmaseda, M. A., Johnson, S., Ferranti, L., Molteni, F., Magnusson, L., Tietsche, S., Vitart, F., Decremer, D., Weisheimer, A., Roberts, C., Balsamo, G., Keeley, S., Mogensen, K., Zuo, H., Mayer, M., and Monge-Sanz, B.: SEAS5 and the Future Evolution of the Long-Range Forecast System, Tech. Rep. 835, ECMWF, https://doi.org/10.21957/z3e92di7y, 2018. a, b, c

Stott, P., Good, P., Jones, G., Gillett, N., and Hawkins, E.: The Upper End of Climate Model Temperature Projections Is Inconsistent with Past Warming, Environ. Res. Lett., 8, 014024, https://doi.org/10.1088/1748-9326/8/1/014024, 2013. a

Suckling, E. B. and Smith, L. A.: An Evaluation of Decadal Probability Forecasts from State-of-the-Art Climate Models, J. Climate, https://doi.org/10.1175/JCLI-D-12-00485.1, 2013. a

Sun, Y., Simpson, I., Wei, H.-L., and Hanna, E.: Probabilistic Seasonal Forecasts of North Atlantic Atmospheric Circulation Using Complex Systems Modelling and Comparison with Dynamical Models, Meteorol. Appl., 31, e2178, https://doi.org/10.1002/met.2178, 2024. a, b

Tietsche, S., Balmaseda, M., Zuo, H., Roberts, C., Mayer, M., and Ferranti, L.: The Importance of North Atlantic Ocean Transports for Seasonal Forecasts, Clim. Dynam., 55, 1995–2011, https://doi.org/10.1007/s00382-020-05364-6, 2020. a, b, c

Wang, L., Ting, M., and Kushner, P. J.: A Robust Empirical Seasonal Prediction of Winter NAO and Surface Climate, Sci. Rep., 7, 279, https://doi.org/10.1038/s41598-017-00353-y, 2017. a

Wang, W., Anderson, B. T., Kaufmann, R. K., and Myneni, R. B.: The Relation between the North Atlantic Oscillation and SSTs in the North Atlantic Basin, J. Climate, 17, 4752–4759, https://doi.org/10.1175/JCLI-3186.1, 2004. a

Watanabe, M. and Kimoto, M.: Atmosphere-Ocean Thermal Coupling in the North Atlantic: A Positive Feedback, Q. J. Roy. Meteorol. Soc., 126, 3343–3369, 2000. a

Weisheimer, A., Schaller, N., O'Reilly, C., MacLeod, D. A., and Palmer, T.: Atmospheric Seasonal Forecasts of the Twentieth Century: Multi-Decadal Variability in Predictive Skill of the Winter North Atlantic Oscillation (NAO) and Their Potential Value for Extreme Event Attribution, Q. J. Roy. Meteorol. Soc., 143, 917–926, https://doi.org/10.1002/qj.2976, 2017. a

Weisheimer, A., Baker, L. H., Bröcker, J., Garfinkel, C. I., Hardiman, S. C., Hodson, D. L. R., Palmer, T. N., Robson, J. I., Scaife, A. A., Screen, J. A., Shepherd, T. G., Smith, D. M., and Sutton, R. T.: The Signal-to-Noise Paradox in Climate Forecasts: Revisiting Our Understanding and Identifying Future Priorities, B. Am. Meteorol. Soc., https://doi.org/10.1175/BAMS-D-24-0019.1, 2024. a

Wills, R. C. J., Herrington, A. R., Simpson, I. R., and Battisti, D. S.: Resolving Weather Fronts Increases the Large-Scale Circulation Response to Gulf Stream SST Anomalies in Variable-Resolution CESM2 Simulations, J. Adv. Model. Earth Syst., 16, e2023MS004123, https://doi.org/10.1029/2023MS004123, 2024. a

Woollings, T. and Blackburn, M.: The North Atlantic Jet Stream under Climate Change and Its Relation to the NAO and EA Patterns, J. Climate, https://doi.org/10.1175/JCLI-D-11-00087.1, 2012. a

Zhang, W., Kirtman, B., Siqueira, L., Clement, A., and Xia, J.: Understanding the Signal-to-Noise Paradox in Decadal Climate Predictability from CMIP5 and an Eddying Global Coupled Model, Clim. Dynam., 56, 2895–2913, https://doi.org/10.1007/s00382-020-05621-8, 2021. a

Short summary

Winter weather over the North Atlantic remains difficult to predict. I studied a seasonal forecast system and found that its deficient representation of lagged ocean–atmosphere associations impedes its predictive skill. Specifically, the model's inadequate representation of feedback mechanisms involving surface heat fluxes and storm formation directly limits its skill. This analysis provides entry points for improving the model's performance.