Origins of Multi-decadal Variability in Sudden Stratospheric Warmings

Sudden Stratospheric Warmings (SSWs) are major disruptions of the Northern Hemisphere (NH) stratospheric polar vortex and occur on average approximately 6 times per decade in observation based records. However, within these records, intervals of significantly higher and lower SSW rates are observed suggesting the possibility of low frequency variations in event occurrence. A better understanding of factors that influence this decadal variability may help to improve predictability of NH mid-latitude surface climate, through stratosphere-troposphere coupling. In this work, multi-decadal variability of 5 SSW events is examined in a 1000-yr pre-industrial simulation of a coupled global climate model. Using a wavelet spectral decomposition method, we show that hiatus events (intervals of a decade or more with no SSWs) and consecutive SSW events (extended intervals with at least one SSW in each year) vary on multi-decadal timescales of period between 60 and 90 years. Signals on these timescales are present for approximately 450 years of the simulation. We investigate the possible source of these long-term signals and find that the direct impact of variability in tropical sea surface temperatures, as well as the asso10 ciated Aleutian Low, can account for only a small portion of the SSW variability. Instead, the major influence on long-term SSW variability is associated with long-term variability in amplitude of the stratospheric quasi biennial oscillation (QBO). The QBO influence is consistent with the well known Holton-Tan relationship, with SSW hiatus intervals associated with extended periods of particularly strong, deep QBO westerly phases. The results support recent studies that have highlighted the role of vertical coherence in the QBO when considering coupling between the QBO, the polar vortex and tropospheric circulation. 15


Introduction
Major Sudden Stratospheric Sudden Warming (SSW) events involve significant disruption of the Northern Hemisphere (NH) stratospheric polar vortex and represent the largest mode of interannual variability in the boreal winter stratosphere (Butler et al., 2017;Baldwin et al., 2020). They are associated with an equatorward shift and deceleration of the North Atlantic jet stream (Kidston et al., 2015), negative phases of the North Atlantic Oscillation (NAO) (Baldwin and Dunkerton, 2001) as 20 well as cold snaps over Eurasia and North America (Thompson et al., 2002;Lehtonen and Karpechko, 2016;Tomassini et al., 2012;Kretschmer et al., 2018). SSWs also play a key role in seasonal to sub-seasonal forecasts (Domeisen et al., 2020a, b).
In reanalysis datasets, SSWs occur at an average rate of 0.6 events/winter but this varies markedly over the record (Butler et al., 2015) suggesting the possibility of variability on much longer timescales. For example, observational studies have noted a hiatus in the 1990s when very few major SSW events occurred (Butler et al., 2015;Pawson and Naujokat, 1999;Shindell 25 et al., 1999) while, in contrast, the early 21st century displayed a remarkable number of consecutive winters containing SSW events (Manney et al., 2005).
Despite a significant body of work aimed at understanding the nature of SSWs and their impacts on mid-latitude surface climate, variability of their occurrence on decadal to multi-decadal timescales is not well understood. Multi-decadal stratospheric variability has been considered in the context of forced anthropogenic warming signals in GCMs. For example Garfinkel ternal variability in these studies was not fully established but Garfinkel et al. (2015) used a subset of the simulations analysed in Garfinkel et al. (2017) and linked a decadal trend  in late winter vortex strength to sea surface temperature (SST) variability. On the other hand, Seviour (2017) analysed observed variations between 1980 and 2016 and concluded that the vortex variability was primarily internally generated. Schimanke et al. (2011) noted multidecadal scale variations in SSW occurence with periods of approximately 52 years in a multi-century GCM integration and demonstrated coherent variability 40 in other parts of the climate system, including vertically propagating planetary wave activity, Eurasian snow cover and Atlantic SSTs. However, despite providing some indications of externally driven variability, results from this study are not conclusive, since the GCM used (EGMAM: ECHO-G with Middle Atmosphere Model) exhibits significant bias in mean SSW rate compared to reanalyses (2 events per decade). This means that their findings may not be fully representative of the observed stratosphere and the authors note that further simulations are required to understand this variability. Manzini et al. (2012) ex-45 plore causes of 20 year period variability in a simulation with prescribed, pre-industrial SSTs. They propose that, given the boundary conditions in the simulations are fixed, such variability must be internally generated. Butchart et al. (2000) suggest that decadal variability in vortex strength as well as SSW frequency may originate from feedbacks caused by the non linear nature of boreal winter stratospheric dynamics. Both works show that these internally induced signals significantly influence mid-latitude surface variability, forcing similar period signals in the NAO and North Atlantic SSTs. 50 A region often considered in studies of vortex strength variability is the equatorial stratosphere. The primary mechanism for coupling between these regions is between the Quasi Biennial Oscillation (QBO) and the vortex. An association between the phase of the QBO and the strength of the polar vortex was first proposed by Holton and Tan (1980) and Holton and Tan (1982) who found that the polar vortex exhibited a strengthening when the QBO near the 50 hPa level was in its westerly phase (QBO-W) compared to its easterly phase (QBO-E). This link, usually referred to as the Holton-Tan (HT) effect, has been reported 55 in subsequent studies with more comprehensive observations as well as in modelling studies using GCMs including the Met Office Hadley Centre Model 2 (HadGEM2) (Watson and Gray, 2014) and other Met Office models (Garfinkel et al., 2018) based on predecessors of the model considered in this study. Separate modelling studies also report a HT effect (Baldwin and Dunkerton, 1991;Pascoe et al., 2005;Lu et al., 2008). A number of physical mechanisms have been proposed to account for the observed coupling between the QBO and the stratospheric polar vortex that involve a QBO influence on wave propagation 60 into the winter stratosphere (Baldwin et al., 2001).
The QBO is typically defined by the equatorial zonal-mean zonal wind (ZMZW) at a single level in the mid-stratosphere.
The 50 hPa level is usually used for NH observational studies (Baldwin et al., 2001;Baldwin and Dunkerton, 1998) but some studies have also noted the importance of characterising the vertical structure of the QBO (Fraedrich et al., 1993;Wallace et al., 1993;Baldwin and Dunkerton, 1998;Dunkerton, 2017;Gray et al., 2018;Andrews et al., 2019). In an observationalbased study Gray et al. (2018) find an enhanced association between the QBO and polar vortex when a metric incorporating the vertical coherence of equatorial winds via empirical orthogonal functions is utilised (Schenzinger, 2016). In a modelbased study Andrews et al. (2019) introduce a similar but simpler methodology by defining the QBO as the average ZMZW between two vertical levels, which preferentially selects time intervals that display a vertically coherent QBO phase between the specified levels. These studies suggest the importance of vertical QBO metrics when considering QBO-vortex coupling 70 although the influence mechanisms are not well understood.
Decadal to multi-decadal scale variability in the QBO and the HT relationship has also been examined. There are clear variations in QBO period and phase transition timing (Pascoe et al., 2005;Anstey and Shepherd, 2008;Yang and Yu, 2016).
These may be linked to variations in the degree of 'stalling' of the QBO phase descent, which can cause more or less persistent wind direction at a given level. A number of studies have also noted the transient nature of the strength of the HT relationship 75 (Lu et al., 2008(Lu et al., , 2014Anstey and Shepherd, 2008;Osprey et al., 2010). Lu et al. (2008Lu et al. ( , 2014 note that the mid-latitude wave-guide is modulated by the shape of the vortex so that planetary waves are diverted further equatorwards when the vortex is anomalously strong and wide, and this could temporarily reduce the influence of the QBO on the vortex. Vortex variability has also been closely associated with variations in winter surface climate that can determine the strength of mid-latitude tropospheric wave driving. Among the most notable of these is the climatological low pressure system over 80 the Aleutian Islands in the Bering sea -the Aleutian low (AL). The intensity of the AL has been shown to modulate vertical planetary wave propagation into the vortex region (Woo et al., 2015;Garfinkel et al., 2010;Manzini et al., 2006). The effect has been found in reanalysis (Hu and Guan, 2018) as well as modelling studies (Kren et al., 2016;Kang and Tziperman, 2017; mid-latitude climate (Tsuyoshi and Shingo, 1989;Trenberth and Hurrell, 1994;Zhang et al., 1997) and it varies significantly 85 on decadal to multi-decadal timescales. Overland et al. (1999) note that 10 year mean values of SLP over the AL region exhibit fluctuations of up to 35% of the climatological mean. Subsequent studies corroborate the presence of these decadal scale fluctuations: Sugimoto and Hanawa (2009) and Minobe (1999) show 20 year fluctuations in intensity and centre of action of the AL while Raible et al. (2005) propose a 50-60 year trend in AL intensity, suggesting the existence of even longer timescale variability.

90
Further surface features linked to vortex variability involve tropical SSTs. For example the SST anomalies over the Eastern Pacific region associated with the El Niño Southern Oscillation (ENSO) have been shown to induce a stratospheric vortex circulation response via a pathway involving the AL (Domeisen et al., 2019). A positive ENSO phase is associated with a deepening of the AL which promotes stronger planetary wave forcing of the middle atmosphere. This teleconnection has been found extensively in observation based studies (Garfinkel and Hartmann, 2008;Ineson and Scaife, 2009;Smith and Kushner, 95 2012) as well as modelling studies (Bell et al., 2009;Domeisen et al., 2014;Manzini et al., 2006;Richter et al., 2015). However, the connection's robustness has also been shown to vary between ENSO events (Deser et al., 2017;Iza et al., 2016) and between decades (Osprey et al., 2019) suggesting elements of non-stationarity in the teleconnection. SSTs in other tropical regions also exhibit coherence with the vortex. Rao and Ren (2017) show that Tropical Atlantic SSTs give rise to a vortex response although it is highly variable throughout the season while Kushner (2011), Fletcher andKushner (2013) and Rao and Ren 100 (2015) propose a tropical Indian Ocean (TIO) connection. Positive TIO SST anomalies lead to a reduced strength of the AL that weakens the Rossby wave forcing of the vortex, an opposite effect to the ENSO-vortex connection where positive SST anomalies leads to vortex weakening.
Other surface forcings have been shown to modulate vortex variability via alterations of tropospheric stationary wave activity. For example, Eurasian snow coverage in October has been shown to exhibit a connection with the mid winter vortex in 105 observational and model data Cohen et al., 2007), although Henderson et al. (2018) highlight that many of the underlying processes behind this connection are not well understood. Sea ice extent over the Kara and Barent regions has also been proposed as an influence on planetary wave forcing of the vortex via alterations in ocean-atmosphere heat flux Nakamura et al., 2016). The resulting influence on SSW occurrence forms the basis of a proposed pathway whereby increased sea ice reduction leads to a negative Arctic Oscillation (AO) and severe NH winter weather. Finally, Hirota Although substantial effort has been applied to characterising SSWs and their underlying mechanisms, there is little understanding of periods of hiatus (such as that in the 1990s) and consecutive-event years (early-mid 2000s), primarily because of the short record of reliable observations as well as the complexities and often non-stationary nature of the multiple observed 115 teleconnections. Much longer time-series are required to successfully identify and understand the source of decadal and multidecadal scale variability. In the meantime, analysis of variability and teleconnections in long climate model simulations may help to understand these processes.
In this work, we analyse long-term variability of the stratospheric polar vortex in a 1000-yr pre-industrial (piControl) simulation of the UK Earth System Model (UKESM). The absence of external forcings such as greenhouse gas increases, volcanic and 120 solar variations allows us to examine sources of long-term variability that are internally generated within the climate system.
We identify intervals containing high and low SSW rates and analyse their variability, with a focus on multi-decadal scales.
Improved understanding and representation of stratospheric variability will help to improve predictions of NH winter surface weather and climate (Kidston et al., 2015;Gray et al., 2020). A variety of techniques are employed to examine associations with those parts of the climate system that are known to exhibit long-term memory, namely the tropical SSTs and the related 125 Aleutian Low. In particular, a wavelet spectral decomposition and cross-spectrum analysis is employed to overcome some of the difficulties with non stationary signals that may arise. We also investigate interactions between the polar vortex and the QBO as a potential source of internally-driven variability. The latter reveals a source of multi-decadal scale variability associated with the amplitude and vertical depth of the QBO. The paper is structured as follows: Section 2 sets out the GCM used in the investigation, the spectral analysis method (wavelet analysis) and relevant climate indices. Section 3 presents results from 130 the analysis. Section 4 discusses findings and concludes.

Model Configuration
The first version of the UK Earth System Model (henceforth referred to as UKESM) is the most recent configuration of the MetOffice unified model (the UM) (Mulcahy et al., 2018). UKESM is a stratosphere resolving coupled ocean-atmosphere-  Ridley et al., 2018) models respectively, while ocean biochemistry is added through MEDUSA (Yool et al., 2013). UKESM 140 also includes a fully interactive chemistry scheme via coupling with the UK Chemistry and Aerosols model (UKCA, Mulcahy et al., 2018). We utilise a 1000 year pre-industrial (PI) control simulation of UKESM submitted to CMIP6 which is spun-up to achieve initial model equilibrium following the method outlined in Yool et al. (2020). This run is forced using CMIP6 pre-industrial values for concentrations of major GHGs (global mean 284.317ppm CO 2 , 808.25ppb CH 4 , 273.02ppb N 2 O).
While there are no volcanic eruptions in the simulation, background stratospheric volcanic aerosols are set to climatological 145 values between 1850 and 2014 estimated from satellite products and other model simulations (Menary et al., 2018). We choose a PI control for this analysis to examine internal variability in SSWs on multi-decadal timescales. To verify that the model reproduces relevant features of the climate system we compare with the ERA-Interim reanalysis dataset (Dee et al., 2011).

Linear Regression Analysis
We employ a multi-linear regression technique to give an estimate of the relative contributions to SSW variability from the 150 QBO, ENSO and the AL following the method outlined in Krzywinski (2015). We model an SSW timeseries of length n which we denote y aŝ where β j denotes the coefficient of the corresponding index andŷ is the prediction of y. We calculate the best estimate for each β using an ordinary least square (OLS) estimator which minimises the sum of squared error (SSE) between the predicted 155ŷ and the real time series y with respect to each coefficient. the SSE is given by SSE = n i (ŷ i − y i ) 2 . We can compare the estimated magnitude of the coefficient for each index to analyse the respective contributions to SSW variability. We can also calculate standard error ranges for β estimates. The standard error on an estimated value of a true β j (denoted byβ j ) is given by where SSR is the sum of squared residuals which measures the sum of squared deviations of predicted values from the mean y value, y. This is given by SSR = n i (ŷ i − y) 2 ). k is the number of predictors used in the linear model (in this case 3) and X is an n × k matrix consisting of the predictor indices. We also define significance levels forβ j using a t statistic with n − k degrees of freedom to test the null hypothesis thatβ j = 0. The t statistic is given by t =β j se(βj ) .

165
In order to study possible multi-decadal variability in SSW occurrence, we utilise a wavelet analysis method based on Torrence and Compo (1998). Such a wavelet analysis can be used to examine time series which displays non-stationary spectral power over multiple frequencies (Daubechies, 1990) giving it a useful advantage over more traditional fourier methods for spectral analysis. The wavelet transform of a uniform 1-dimensional time series, x, of length N and timestep δt is given by the convolution between the series and a scaled and translated version of a wavelet function ψ 0 (equation 3) where * denotes the complex conjugate and s is the wavelet scale indicating the frequency of the wavelet. Varying s and translating along the time scale (the index n), W n indicates the amplitude of signals at different scales and their variation in time. Torrence and Compo (1998) suggest an approach to varying the scale s as increasing in powers of 2 according to s j = s 0 2 jδj , j = 0, 1, ..., J where s 0 is the shortest resolvable scale of a signal, J corresponds to the longest and δj is the scale resolution. The translated and scaled wavelet has the form and we select the form of ψ 0 following the recommendation of Torrence and Compo (1998) as a Morlet wavelet, an oscilla-180 tory function enveloped by a Gaussian which is expressed as The advantages of using a Morlet wavelet for analysing signals in climate time-series is discussed in Lau and Weng (1995) in which the authors acknowledge that while truly physical signals should be detected regardless of which wavelet basis is chosen, for best results one should adopt a wavelet function reminiscent of the real signal. They show that when a Morlet 185 wavelet form is utilised, spectral decomposition methods can detect common forms of behaviour exhibited in the variability of time series associated with the Earth's climate. These include time variations in period and amplitude of signals, abrupt changes in periodicity (sudden regime shift to different spectral behaviour) and some forms of rapid changes in series over time. These forms of behaviour are most likely relevant for our analysis of SSWs, therefore we proceed with a wavelet of this form.

190
It is computationally quicker to compute the wavelet transform in discrete Fourier space. By the convolution theorem, the transform reduces to multiplication wherex k andψ are the discrete Fourier transforms of the time series x (equation 9) and the wavelet function (equation 10) respectively, H(ω k ) is the Heavyside function andψ is normalised to have unit energy when integrated over all ω. The square modulus of the wavelet transform gives the wavelet power spectrum which indicates relative strength of signals in the time series as a function of signal period and discretised time. In order to directly compare spectra of different indices we normalise all spectra 200 by the variance of the corresponding time series. We also define a confidence interval for wavelet power observed at a given period and time for a series by assuming a mean background spectrum corresponding to that of a first order autoregressive (AR1, red noise) process modelled by where α is the lag-1 autocorrelation of the time series and z n is Gaussian white noise. Torrence and Compo (1998) show 205 that such a process's wavelet power spectrum is χ 2 distributed and therefore can be used to define a 95% confidence interval for any observed power.

Cross Wavelet Spectra
The cross wavelet spectrum of two time series x and y with associated wavelet spectra W x n and W y n gives a measure of coincident power (the same period at the same timepoints) between the series. It is given by where W x * n (s) is the complex conjugate of the wavelet power spectrum of x (Grinsted et al., 2004). The complex argument of W xy n (s) gives the local phase difference between signals in x and y in frequency-time space. The phase relationship between the two time-series can be represented by a vector that subtends an angle representing the phase difference: On all plots of cross spectra, arrows to the right (left) denoted signals which are in-phase and correlated (anti-correlated). Vertical arrows indicate 215 a phase relationship of π 2 between the time-series, so that the evolution of one is correlated with the rate-of-change of the other. As for individual power spectra, we define a confidence interval for which cross power of a larger amplitude is deemed significant (>95% confidence interval) by comparing power exhibited by actual series with a theoretical red noise process. The cross power of two such AR1 processes is theoretically distributed such that the probability of obtaining cross power greater than a set of red-noise processes is where σ denotes the standard deviation of the time series, Z is the confidence interval defined by p (Z = 3.999 for 95% confidence), ν is the degrees of freedom for a real wavelet spectrum (ν = 2) and P x k is the theoretical Fourier spectrum of the AR1 process. For a given wavenumber k, this can be expressed as 225

Hilbert Transform
We utilise a signal processing method known as a Hilbert transform to calculate the instantaneous phasor amplitude of a QBO time series. The Hilbert transform of a time series x(t) can be expressed as where˜denotes the transformed series, * signifies a convolution and t is discretised time. Conversely, the original time series 230 can be recovered using an inverse transform expressed as A complex signal which consists of x(t) and its transform is known as the analytic signal of x and can be used to calculate an instantaneous phasor amplitude, A(t), of the signal. X(t) can be expressed as

235
where A(t) is the instantaneous amplitude of the signal and θ(t) is the instantaneous phase angle -a measure of signal progression through a cycle at time t.

Model Diagnostics
We utilise the definition of an SSW event from Butler et al. (2015). An event is recorded when the ZMZW at 60 • N on the 10 hPa level transitions from westerly to easterly during NH winter months (November -March). The day on which this 240 reversal occurs is referred to as the central date. After this date, the ZMZW must recover to westerly for a period of at least 10 consecutive days (which is the approximate radiative timescale of the mid-stratosphere) before another event can be recorded.
If, after the central date, the ZMZW does not recover to westerly for at least 20 consecutive days before the end of April, the warming is classified as a final warming. While we do record all events in extended winter (Nov-Mar) for an initial analysis of mean SSW rates, we use mid-late winter warmings (Dec-Mar) for our analysis of multi-decadal variability and interaction 245 with other other climate variables (this choice is addressed in section 3.1).
We analyse variability in tropical SSTs in four regions identified by Scaife et al. (2017) as key to affecting Rossby wave propagation and interactions with stratospheric winds. The regions are defined as the Tropical Atlantic (5  (2019)) and between 20-50 hPa to identify QBO phases that exhibit winds of the same sign over a relatively large vertical extent.

Modes of Stratospheric Variability
We begin by analysing the representation of modes of stratospheric variability in the UKESM piControl simulation. As described in section 1, the winter polar stratospheric vortex exhibits substantial variability. In some years the westerly winds of the vortex are relatively strong and undisturbed while in other years the vortex is weakened by wave disturbances that in extreme cases can lead to SSWs. The average Nov-Mar SSW rate over the full 1000 years of the UKESM simulation is 0.54 presented in the following sections, tests have been performed to ensure that the results are not sensitive to the inclusion or exclusion of November SSW rates. The model exhibits variability in SSW frequency comparable to observations, including both hiatus and consecutive SSW intervals. Figure 2 shows a sample 40-yr interval of the polar vortex zonal wind strength from the UKESM simulation compared with a similar length from the ERA Interim reanalyses. An extended interval of mainly westerly anomalies indicating a strengthened vortex and lack of SSWs can be seen towards the end of the 40-yr interval, similar to the 1990s in ERA-

285
Interim when only 2 SSW events were recorded in the decade. The simulation contains 8 such hiatus intervals with at least 10 consecutive years with no SSWs, the longest of which lasts 16 years. On the other hand, the simulation only contains 2 intervals in which 10 consecutive years exhibit at least 1 SSW. However, if the threshold interval width for identifying hiatus and consecutive-SSW intervals is shortened from 10 to 5 years, then 9 consecutive-SSW intervals and 25 hiatus intervals are found. These statistics indicate that UKESM is not only able to reproduce the mean state characteristics of SSW events but 290 also decadal-scale variations in SSW rate, underlining its suitability for this study. Figure 3 shows the equatorial wind time-series from a sample 40-yr interval of the simulation compared with the ERA-Interim dataset. The mean period of the oscillation is longer than observed, at ∼38 months compared to ∼28 months in ERA-Interim (Kawatani, 2016). As a result the vertical shear zones descend less rapidly than observed. There is also a westerly bias at low 295 levels where the QBO-E phase does not extend sufficiently deep into the lower stratosphere, which is a common bias in many models (Bushell et al., 2020). The descending shear zones also appear more regular than observed but there is nevertheless some evidence of decadal-scale variations e.g. in the degree of stalling at 30 hPa, although not as pronounced as in the observations.  There is evidence of coupling between the two major modes of stratospheric variability in the model, giving rise to a Holton-Tan relationship (Anstey et al., 2020). Figure 4 shows height-latitude cross-sections of NH winter zonal wind differences between QBO E-W composites defined at various equatorial levels. The familiar pancake structure of alternating easterly / westerly differences is present at equatorial latitudes, indicative of the QBO phase but there is also a response at high latitudes.
In good agreement with observations the largest high latitude response amplitude is seen when the QBO is defined at 50hPa,

315
The presence of the Holton-Tan relationship is also seen in the modelled frequency of SSWs (figures 5). Significantly higher rates are observed in QBO-E winters than QBO-W. Also notable is the asymmetry in abundance of QBO-E and QBO-W winters -nearly twice as many QBO-E winters are observed compared to QBO-W under all phase definitions ( figure 5, legends). This suggests an element of phase locking between the QBO and the seasonal cycle possibly associated with seasonally variations 320 in the strength of mean equatorial upwelling or mid-latitude planetary wave forcing in winter (Pascoe et al., 2005;Gruzdev and Bezverkhny, 2000;Rajendran et al., 2015) resulting in QBO phase transitions that occur preferentially in certain months.

Regression Analysis
We next employ a multi-linear regression analysis to measure the relative contributions to the time-series of SSWs per year (as by the shorter (inter-annual) timescales and any small amplitude variations at longer timescales will not be revealed. The latter can be addressed to some extent by smoothing or filtering the time-series, as discussed in the next section, but this requires prior knowledge of which frequencies are of interest. An alternative and superior approach to this problem employs wavelet analysis, described more fully in the next section, which successively examines the frequency intervals to identify the presence of signals thus avoiding dominance by one particular frequency, and also examines the time evolution of the signal so that 340 non-stationary signals can also be identified. than 5 m s −1 . Coloured shading indicates differences significant above the 95% confidence level under a 2 tail student's t-test.

Long-term Variability of the Polar Vortex
A more comprehensive assessment of the long-term variability of SSWs can be made using a wavelet power spectrum approach.
We count the number of SSWs in each winter season (Dec-Mar) and calculate the corresponding wavelet power spectrum, shown in figure 6. As described above, the analysis highlights the presence of power in the signal as a function of frequency  (period in years, along the y-axis) and as a function of time (year of simulation along the x-axis). As expected, there is an intermittent but relatively persistent signal with period around 2-4 years throughout the simulation, corresponding to the period of the QBO which supports the presence of a Holton-Tan relationship between the QBO and the polar vortex in the model.
The so-called 'global power spectrum' (i.e. the time average of the wavelet spectrum) shown on the right of figure 6 shows that the signal is on the boundary of the 95% statistical significance. Other signals at periods near 20-30 years are similarly 350 intermittent and manifest as a peak in the time-averaged spectrum that is also near the 95% significance boundary. The most persistent feature of the series appears at periods between ∼60-90 years in the interval between 400-800 yrs. This feature shows statistical significance (based on comparisons between power in the spectrum and that of an AR1 process with the same autocorrelation structure as the series being analysed) for around 350 years of the 1000-yr simulation but does not cross the significance threshold for the time-averaged spectra. There is a possible limitation of this wavelet methodology due to the 355 discrete nature of the time series being analysed (time points take values 0/1/2). The Morlet wavelet is a continuous function and, as a result, convolution with a highly discretised series may alias features on the resulting wavelet spectra. This limitation must be considered when drawing conclusions from the wavelet spectra and is discussed further below. The focus of this study is on the longer-term time variations to understand the source of variability characterised by hiatus 360 intervals (no SSWs over an extended period) and consecutive-event intervals (at least one SSW every year for an extended period). We therefore apply low-pass filtering to the time series of SSWs per season using a 5-year rolling window and examine the spectral characteristics of this smoothed series (which we refer to henceforth as SSW 5yr ). This averaging is similar to the standard practice of smoothing daily data to remove the noise associated with daily weather variations, thus isolating longer seasonal timescales. It also decreases the impact of the time series discretisation by reducing the chance of introducing spurious 365 spectral features on the wavelet power spectrum which could be otherwise encountered when analysing the un-smoothed time series. The wavelet power spectrum of SSW 5yr (figure 7) shares many of the characteristics of the spectra of the un-smoothed series (figure 6), but the longer period signals are now more clearly evident, as expected. The SSW 5yr wavelet spectrum shows the 370 two broad regions of statistically significant maxima corresponding to signal periods of ∼20-30 years and ∼60-90 years, but with increased significance both locally and in the time-average. For example, the feature around 90 year period appears significant for 450 years in SSW 5yr compared to 350 years before smoothing. One possible explanation for this increase lies in our definition of the significance level on power which is dependent on the lag-1 autocorrelation of the time-series. Introducing a 5 year averaging window will increase the autocorrelation, possibly leading to a less strict significance level. However, this is unlikely because the significance level is constructed using a red noise process with the same autocorrelation as the series.
This means that for SSW 5yr , the threshold for 95% confidence level increases with increasing period more steeply than in the un-smoothed case and yet the power exhibited at those long periods in SSW 5yr nevertheless achieves higher statistical significance. This indicates that the smoothing has enhanced the visibility of a real signal in the SSW 5yr time series that was less visible in the un-smoothed time-series. As a check for robustness, we also include the SSW 5yr wavelet spectrum 380 including November SSW events (supp figure A1). it looks similar to the spectrum shown in figure 7 particularly on ∼60-90 years timescales with persistent power for around 450 years of the simulation at these periods.

Surface Forcing of Polar Vortex Variability
In the absence of external forcing mechanisms such as greenhouse gas or anthropogenic aerosol forcing, the presence of longterm variability such as the 60-90 year periodicity seen in SSW 5yr (figure 7) suggests a source of long-term internal variability 385 from within the climate system.
The most obvious potential driver of such long timescale variability is the ocean due to its high degree of thermal inertia.
Previous work has identified coupling between tropical SSTs and the polar vortex, such as the relationship to ENSO conditions (see section 1). The model exhibits an expected connection between ENSO and the vortex on interannual timescales indicated by the regression analysis results (table 1) and ZMZW composites for El Niño and La Niña winters (supp figure A2). Figure   390 8a shows the wavelet power spectrum for the 5 year smoothed Sep-Nov ENSO 3.4 index as well as the cross power spectrum with SSW 5yr . We use the early NH winter ENSO index to capture the lagged response of the vortex to this mode of variability.
The ENSO index is slowly varying so will likely remain in the same state between early and mid-winter. We also smooth the   The strength of the Aleutian Low (AL) has also been used as an indicator of tropospheric wave forcing and its influence 415 on the polar vortex (Woo et al., 2015). A similar wavelet and cross-spectrum analysis was therefore performed using an index based on the strength of the modelled NH winter (Dec-Mar) AL (see section 2 for details). The wavelet power spectrum for the 5-year smoothed AL index (figure 9a) exhibits elements of periodic signals with maximum power corresponding to a period of around 55 years (between 40-60 years) but with fairly minimal overlap with the regions enclosed by the 95% confidence level in the corresponding SSW wavelet analysis (green contours). AL indices derived from different winter months give similar 420 spectral patterns (not shown). The cross spectrum analysis between the AL and SSW 5 yr (figure 9b) highlights this relatively small region of overlap in the interval between years 400-500. However, the phase relationship, indicated by the arrows in that region of overlap is difficult to interpret. The proposed physical mechanism of coupling between the AL and the vortex (Woo et al., 2015) involves an association between a deeper AL (i.e. lower pressure and hence a negative anomaly) with increased frequency of SSWs. This negative correlation would give rise to arrows pointing to the left if the relationship was present.

425
In contrast, the upward arrows in figure 9b indicate a π 2 phase difference between the indices on these 60 year timescales, suggesting that peaks in SSW 5yr variations are associated with maximum rates of change of the AL index at the same periods.
As with ENSO3.4, the spectra of the AL shares some features with that of the PDO (supp figure A4). This is consistent with studies into these modes of variability which find significant correlation of the PDO and AL (Mantua et al., 1997;Rodionov et al., 2005) as well as studies that examine influence of PDO on vortex strength through a pathway involving the AL and ENSO 430 (Rao et al., 2019). Despite this possible pathway, the relatively short time interval of overlap between the AL and SSW 5yr signals at the 60-yr period, the absence of any significant signal around the 90-yr period, together with the inconsistent phase relationships, points to a conclusion that AL forcing is unlikely to be the primary driver of long-term variability in SSW 5yr .
Indeed, examination of the cross-spectrum between the un-smoothed AL and SSW indices (figure A7) shows little indication of a coherent relationship between the two indices at any timescale. Finally, while the regression results of the un-smoothed 435 indices give a significant coefficient for the AL (table 1), its magnitude is small compared to that of ENSO3.4 and the deep QBO, the uncertainty on the coefficient is large and the associated p value is close to the 95% significance boundary. The weak relationship between the AL and SSWs is unexpected due to the well acknowledged influence of the AL over planetary wave flux in the upper troposphere (Woo et al., 2015). To attempt to address this, we also analyse an AL metric evaluated as the area weighted average over a box recommended by Garfinkel et al. (2012) who use an SSW precursor region in 500hPa height 440 defined by 52.5 • -72.5 • N , 165 • -195 • E. However, we find a lower correlation between this measure and our SSW timeseries than between the original EOF based metric and SSWs (r = −0.21 for the EOF based AL and r = −0.13 for the box based AL). Despite some coincident signals between tropical SSTs, AL and SSW 5yr , long-term variability in these surface indices are unable to fully account for the multidecadal signals in SSW frequency. An additional potential source of internally generated long-term variability may reside within the stratosphere. Studies have noted relatively long-term variations in the strength of the Holton-Tan relationship (Lu et al., 2008(Lu et al., , 2014Osprey et al., 2010) although the cause of these variations is not well understood. In order to investigate this figure 10 shows the wavelet power spectrum of early winter (Sep-Nov) QBO winds 450 evaluated at selected levels. Since the QBO evolves relatively slowly, employing Sep-Nov averaged winds provides a reasonable representation of the QBO and also allows us to evaluate the in-season lagged relationship between the QBO and subsequent occurrence of an SSW. There is a clear signal between 2 and 4 years for the majority of the simulation, as expected, but no prominent power at longer periods, confirming that there is no significant long-term variability in the periodicity of the QBO winds that could explain the long-term variations in SSW 5yr via the Holton-Tan relationship.  this lag-zero relationship and is discussed below). The anti-phase relationship is consistent with the HT relationship in which a westerly (positive) QBO anomaly corresponds to a reduction in the frequency of SSWs. In our earlier discussion we linked long-term variability in SSW frequency to the existence of extended hiatus periods, during which the vortex is relatively undisturbed with no SSW events (figure 2). The cross-spectrum analysis with deep QBO 480 amplitude modulation suggests a possible physical interpretation involving the Holton-Tan relationship varying on longer timescales, in which a series of consecutive years that exhibit a large amplitude, deep westerly QBO in early winter leads to a series of winters with reduced SSW frequency i.e. a hiatus period. Correspondingly a series of large-amplitude deep easterly QBO years would lead to a series of consecutive-event years.
We verify the results of the wavelet analyses described above by repeating the multi-linear regression (table 1) but using the 485 5-year smoothed QBO, ENSO and AL indices to measure their relative contributions to the 5-year smoothed SSW 5 yr timeseries. The resulting regression coefficients for the deep QBO amplitude and AL remain significant at the 95% level, although the AL's contribution remains small and close to the significance boundary. The coefficient for ENSO3.4 is not significant, suggesting that the connection between ENSO and the vortex variability is dominated by timescales less than 5 years. If we further isolate multi-decadal signals by fourier filtering each timeseries, so that only periodicities longer than 60 years are 490 retained, the ENSO3.4 coefficient is near 0 while the deep QBO amplitude signal is near -0.2. This is consistent with our wavelet analysis which suggest a dominant role for QBO amplitude variations on these long timescales. The AL coefficient remains significant but smaller than that of the QBO (and outside error ranges). For completeness, we repeated the regression analysis using a 5-year smoothed PDO index instead of the AL but there was no significant change in the coefficients (not shown). This is consistent with the fact that the AL and PDO indices exhibit similar spectra and there is a high correlation 495 between them (-0.45 unfiltered and -0.68 filtered), as also found by Mantua et al. (1997) and Rodionov et al. (2005).
To further clarify the role of the QBO, we note that an examination of figure 10 shows that the majority of the long-term amplitude variability in the 15-30 hPa deep QBO index lies in the amplitude of the westerly phase (the easterly phase amplitude is relatively constant with time). Also, as noted earlier, the simulation exhibits more hiatus intervals than consecutive-event intervals, which suggests that the observed long-term variability may arise primarily from the westerly QBO phase. To explore 500 this hypothesis, we isolate the SSW hiatus intervals by modifying the SSW 5 yr time-series in the following way. All SSW rates above 0.54 events per season (the climatological mean) are re-set to 0.54 thereby removing variability in 5 year intervals that exhibit anomalously high SSW rates. Figure 13 shows the cross power spectrum between this modified SSW 5 yr timeseries and the time-series of deep QBO amplitude. It retains significant cross power within the portion of significant SSW 5 yr power (figure 13 green contours) when compared with figure 12b in which the full time-series is used and also shows a phase 505 relationship significantly closer to anti-phased (i.e. pointing to the left). This is further support that the deep QBO -SSW relationship on these long timescales in the model arises primarily from the SSW hiatus periods.
An obvious question is whether this sensitivity to deep QBO westerlies that we find in the model is also present in the real atmosphere. Examination of the ERA-Interim dataset shows limited support. Some winters in the 1990s are characterised by

Summary and Discussion
In this study, we have examined variability in the appearance of hiatus and intervals of consecutive SSWs in a 1000-yr preindustrial control simulation of the UK Earth System Model. While there is much observational evidence for an impact of SSWs on the underlying tropospheric weather and climate, their multi-decadal variability and the associated forcing mechanisms are 520 not well understood due to the short observational record. Analysis of long climate model simulations is currently the only available tool for understanding this variability.
We found realistic decadal and multi-decadal variability in the model, with hiatus intervals of 10 years or more in which no SSWs occurred, similar to the observed SSW record in the 1990s (Pawson and Naujokat, 1999;Shindell et al., 1999) and also intervals of consecutive-event periods in which at least one SSW occurred every year, as observed in the early 2000s (Manney 525 et al., 2005). A 5-yr smoothed representation of SSW frequency (SSW 5yr ) was found to vary periodically for approximately 450 years of the 1000-yr simulation with maxima in wavelet power corresponding to periodicity of around 60-90 years.
A possible tropical SST source of this long-term variability was investigated. Wavelet and cross-spectrum analyses were performed using a variety of different tropical SST indices, including the ENSO 3.4 index, and also an index of the strength of the Aleutian Low (AL) which is linked to large-scale planetary wave forcing of the winter stratosphere. While all of these 530 indices displayed long-term variability, some of which overlapped with the periodicity and time intervals seen in the SSW 5yr spectrum, none of them could fully account for the extended 450 year interval of significant power at 60-90 years seen in the SSW 5yr spectrum. The weak relationship between the AL and SSW occurrence is unexpected and modifying the metric by using a box average SLP measure utilised in Garfinkel et al. (2012) did not recover a stronger connection. Further analysis of the AL-SSW relationship in the simulation would be useful (but outside the scope of this study) to explore whether the AL 535 exhibits a connection with the mean vortex strength even though there is no apparent connection with SSW occurrence.
A second possible source of long-term variability involving variations in the QBO was also investigated. A range of QBO indices were considered, including the standard approach of using equatorial winds at a specified pressure level e.g. 50 hPa and also a 'deep QBO' index which takes the average QBO wind over 15-30 hPa, designed to capture the degree of vertical coherence in the QBO winds (following Gray et al. (2018)  Our analysis has therefore revealed an unexpected relationship between the strength and vertical coherence of the QBO and long-term variability in the frequency of SSWs. The relationship was found to be particularly sensitive to the QBO westerly 550 phase. Extended periods of deep westerly QBO phases were associated with hiatus periods with few or no SSWs, consistent with the Holton Tan relationship. While this result appears compelling, it should be noted that the model showed some biases in its QBO associated with the period and descent rate of shear zones. The extended period of close to 3 years could introduce an element of phase locking between the QBO and seasonal cycle causing winter months to occur preferentially in one QBO phase over the other (evident from figure 5, legends). This may influence QBO-vortex coupling. However, these biases are 555 common in modern GCMs (Bushell et al., 2020) and Rao et al. (2020) show UKESM's representation of the HT effect is better than most major GCMs submitted to CMIP6 indicating this pi-control remains one of the most effective tools for studying multi-decadal variability in the stratosphere. Recent work has also shown a large degree of inter-model variability exists in representations of the QBO as well as SSWs (Bushell et al., 2020;Ayarzagüena et al., 2020) so it is possible the result from this study is model dependant. Additional analysis of long simulations from different models is required to verify these results.

560
Combining the results of all these analyses, our overall conclusion is that multi-decadal variability in SSW frequency in UKESM is primarily accounted for by long term variability in QBO-SSW coupling, particularly at periodicities of around 90 years and, to a lesser extent, by variability in the intensity of the Aleutian Low at periodicities around 60 years, although coherence with the AL signals is far less persistent than with the QBO. Given the observed impact of SSWs on the underlying tropospheric weather and climate, improved understanding of the source and mechanisms of long-term variability in QBO-565 SSW interactions is likely to help improve future seasonal weather forecasts and decadal-scale climate predictions. The precise nature of QBO-SSW interaction mechanisms are still not fully understood (Anstey and Shepherd, 2014). While the importance of wave-mean flow interactions is widely recognised, further studies are required to explore the relevance and usefulness of the deep QBO index highlighted in this study, that identifies a vertically coherent QBO phase. It appears to be especially relevant to long-term QBO-SSW interactions during the QBO-W phase.

570
Further exploration of the source of the long-term variability in amplitude of the QBO-W phase is also required. While a direct influence of tropical SSTs on long-term variability in SSW frequency has not been supported by this study there may nevertheless be an important teleconnection via the QBO, in which the SSTs influence the QBO which subsequently influences the SSW frequency via the Holton Tan relationship. Initial investigation through cross spectrum analysis of the deep QBO index with ENSO and selected SST indices shows some contribution from each of the regions (supp figures A6 and A7), not 575 surprisingly because of the tropical source of equatorial waves that are known to drive the QBO. A closer examination of the precise nature of forcing of the QBO-W phase in the model, in terms of Kelvin and gravity wave forcing would be helpful (but outside the scope of this study).
Time series analysis such as those presented here can highlight associations between modes of variability but are less able to determine causality. Where possible we have selected indices that confirm well-known in-season causality, such as an 580 early winter QBO index compared with a mid-winter SSW index, but determining causality on longer timescales is difficult and would require well designed climate model experiments. The climate system is extremely complex, with many different interactions between modes of variability. The climate system is also clearly non-stationary, as evident in our simulation where the QBO-SSW interaction shows power at periodicities of 60-90 years for 450 years but is absent in the early half of the simulation. While this complexity means that it is extremely challenging to disentangle the influences or to attribute causality, Author contributions. OBDM conducted the analyses, and LJG and SO directed the research. All authors were fully involved in preparing and revising the text.
Competing interests. The authors declare that they have no conflict of interest.    Cross power spectrum. Shading indicates cross power, yellow contours the 95% confidence interval and arrows the relative phase angle between signals in the time series. Green contours on both spectra represent the 95% confidence intervals for the wavelet power spectrum of SSW5yr.