The substructure of extremely hot summers in the Northern Hemisphere

In the last decades, extremely hot summers (hereafter extreme summers) have challenged societies worldwide through their adverse ecological, economic and public health effects. In this study, extreme summers are identified at all grid points in the Northern Hemisphere in the upper tail of the June-August (JJA) seasonal mean 2-meter temperature (T2m) distribution, separately in ERA-Interim reanalyses and in 700 simulated years with the Community Earth System Model (CESM) large ensemble for present-day climate conditions. A novel approach is introduced to characterize the substructure of extreme summers, i.e., to elucidate whether an extreme summer is mainly the result of the warmest days being anomalously hot, or of the coldest days being anomalously mild, or of a general shift towards warmer temperatures on all days of the season. Such a statistical characterization can be obtained from so-called rank day anomalies for each extreme summer, that is, by sorting the 92 daily mean T2m values of an extreme summer and by calculating, for every rank, the deviation from the climatological mean rank value of T2m. Applying this method in the entire Northern Hemisphere reveals extreme summer substructures that vary strongly in space and agree remarkably well between the reanalysis and climate model data sets. For example, in eastern India the hottest 30 days of an extreme summer contribute more than 65% to the total extreme summer T2m anomaly, while the colder days are close to climatology. In the high Arctic, however, extreme summers occur when the coldest 30 days are substantially warmer than climatology. Furthermore, in roughly half of the Northern Hemisphere land area, the coldest third of summer days contributes more to extreme summers than the hottest third, which highlights that milder than normal coldest summer days are a key ingredient of many extreme summers. In certain regions, e.g., over western Europe and western Russia, the substructure of different extreme summers shows large variability and no common characteristic substructure emerges. Furthermore, we show that the typical extreme summer substructure in a certain region is directly related to the region’s overall T2m rank day variability pattern. This indicates that in regions where the warmest summer days vary particularly strongly from one year to the next, these warmest days are also particularly anomalous in extreme summers (and analogously for regions where variability is largest for the coldest days). Finally, for three selected regions, thermodynamic and dynamical causes of extreme summer substructures are briefly discussed, indicating that, for instance, the onset of monsoons, physical boundaries like the sea ice edge, or the frequency of occurrence of Rossby wave breaking strongly determine the substructure of extreme summers in certain regions.


Introduction
During the last decades, numerous high-impact hot temperature extremes occurred on approximately seasonal time scales, including the extremely hot European summer in 2003 (Fink et al., 2004; Schär and Jendritzky, 2004), the 2010 Russian heat wave (Barriopedro et al., 2011), the hot and dry summer 2015 in Europe (Dong et al., 2016; Hoy et al., 2017; Orth et al., 2016), the hot and humid summer 2015 in western India and Pakistan (Wehner et al., 2016), and the concurrent heat waves across the Northern Hemisphere in the summer 2018 (Vogel et al., 2019). It is well known that individual heat waves on time scales of up to a few weeks cause societal challenges, for example serious public health issues (e.g., Fouillet et al., 2006). However, the large socio-economic and ecological impacts of the seasonal events listed above (e.g., Ciais et al., 2005; Buras et al., 2019) illustrated that many economic sectors such as agriculture, tourism and re-insurance are particularly susceptible to temperature extremes on seasonal (as opposed to synoptic) time scales. Therefore, understanding the statistical properties of entire extremely hot summers (hereafter referred to as "extreme summers") as well as their physical causes is a research topic of high societal relevance.

The concept of an extreme summer [as a particular type of an "extreme season", cf. Wernli et al. (in prep.)] is closely related to the concept of a heat wave, even though there are important differences. An individual heat wave is commonly understood to be a single, quasi-continuous episode of abnormally hot surface weather with a duration ranging from days to weeks (Russo et al., 2015; Zschenderlein et al., 2019). Heat waves are thus strongly influenced by individual synoptic flow features such as atmospheric blocks (Brunner et al., 2017; Pfahl and Wernli, 2012; Zschenderlein et al., 2019), stationary ridges (Sousa et al., 2018) or recurrent Rossby wave patterns. In contrast, extreme summers have a fixed duration (of three months), which is beyond the time scale of these synoptic flow features. Consequently, extreme summers require a temporal organization of the relevant synoptic flow features, which can occur either "by chance" (internal atmospheric variability) or favored by more slowly varying processes. Possible candidates for the latter are soil moisture fluctuations (Fischer et al., 2007; Lorenz et al., 2010; Seneviratne et al., 2010), sea ice dynamics (Cohen et al., 2014) or large-scale modes of variability in the ocean and atmosphere (e.g., Schneidereit et al., 2012). Understanding how this temporal organization of weather within seasons occurs is challenging as it requires a seamless approach (Hoskins, 2013), which couples weather system dynamics to these more slowly varying processes.
Like any other summer, an extreme summer will inevitably contain cooler and hotter days, which constitute the lower and upper parts of the T2m distribution during that summer. However, it is currently not known which part of the T2m distribution is particularly anomalous during an extreme summer. Thus, extreme summers with distinct "substructures" might occur, some of which are schematically illustrated in Fig. 1. For example, a summer might be an extreme summer because the hottest days of the season are particularly anomalous, with the remainder of the summer days being only moderately warmer than or even close to climatology. Such an extreme summer substructure was observed in large parts of Europe in the summer 2015, when [...]

Knowledge about the extreme summer substructure is relevant for at least two reasons. Firstly, the societal impact of an extreme summer featuring one (or several) periods of extremely hot temperatures (i.e., hottest summer days being hotter than normal) will likely differ from the societal impact of an extreme summer resulting primarily from a suppression of cool summer days (i.e., coldest summer days being milder than normal), or from an extreme summer characterized by a uniform shift in the entire temperature distribution (i.e., all summer days warmer than normal). Secondly, the physical and meteorological causes of extreme summers with such distinct substructures conceivably differ as well. Thus, identifying the substructure of extreme summers is likely a starting point for understanding their physical causes.
The purpose of this study is to characterize extreme summers statistically by quantifying their substructure. To do so, we define extreme summers in the upper tail of the June-August (JJA) mean two-meter temperature (T2m) distribution. Thereafter, the extreme summer substructure is assessed by decomposing the seasonal mean T2m anomaly of a particular extreme summer into the contributions from all rank days of that season (i.e., the contribution from the coldest day, the second coldest day etc.). This decomposition thus makes it possible to quantify the contributions from all parts of the T2m distribution (e.g., the coldest, middle and hottest thirds of summer days) to the seasonal T2m anomaly of an extreme summer.

Here we use the ERA-Interim re-analysis data set to study the substructure of past extreme summers. However, extreme summers are by definition extremely rare events. Thus, in order to yield robust results, a climatological investigation of the extreme summer substructure requires much longer data records than provided by ERA-Interim or any other currently available high-quality re-analysis data set. We therefore complement ERA-Interim with a 700-year present-day climate simulation (for details, see Sect. 2.2) to address the following research goals:
1. Propose and illustrate a simple method for decomposing at each grid point the seasonal mean temperature anomaly into its contributions from each rank day.
2. Use this decomposition to analyze the substructure of extreme summers separately at selected grid points.
3. Quantify and compare the spatial variability in extreme summer substructures in the Northern Hemisphere in both reanalysis and climate model data.
4. Illustrate physical causes of the observed (and simulated) extreme summer substructures in selected regions.

ERA-Interim
We use ERA-Interim re-analysis data (Dee et al., 2011) covering the period 1979-2018. ERA-Interim is originally produced with a T255 spectral horizontal resolution and 60 hybrid σ-p levels in the vertical. We interpolated the data horizontally to a 1° by 1° grid and vertically to pressure and isentropic levels. The ERA-Interim data are provided at 6-hourly time intervals; in this study, however, we aggregated all data to a daily temporal resolution. Besides the T2m fields, we also use potential vorticity (PV), total precipitation, 250 hPa meridional winds and sea ice concentration. Furthermore, we remove a (40-year) linear trend from all JJA T2m data at each grid point. Our analyses hereafter are based on the detrended data except for Figs. 2, 8 and 9, which are more easily understood based on the non-detrended data (Figs. 2 and 8) or where the absolute T2m values are important (Fig. 9).
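The detrending step can be illustrated with a minimal sketch. The text does not specify how the linear trend is fitted, so the choices below are assumptions made purely for illustration: the trend is fitted to the JJA seasonal means by least squares, centred so that the long-term mean is unchanged, and then subtracted from every day of the respective season. The function name `detrend_jja` and the synthetic data are hypothetical.

```python
import numpy as np

def detrend_jja(t2m):
    """Remove a linear trend from JJA daily T2m at one grid point.

    t2m has shape (n_years, n_days). Assumption: the trend is estimated from
    the seasonal means and applied uniformly to all days of a season.
    """
    t2m = np.asarray(t2m, dtype=float)
    years = np.arange(t2m.shape[0])
    seasonal_mean = t2m.mean(axis=1)
    slope, _ = np.polyfit(years, seasonal_mean, 1)   # least-squares linear fit
    trend = slope * (years - years.mean())           # centred: long-term mean is preserved
    return t2m - trend[:, None]

# Synthetic example: 40 seasons of 92 days with an imposed warming of 0.05 K per year.
rng = np.random.default_rng(0)
raw = 20.0 + rng.normal(0.0, 2.0, size=(40, 92)) + 0.05 * np.arange(40)[:, None]
detrended = detrend_jja(raw)
residual_slope = np.polyfit(np.arange(40), detrended.mean(axis=1), 1)[0]
```

After detrending, the seasonal means no longer carry a linear trend, while the climatological mean at the grid point is unchanged.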

CESM
Besides ERA-Interim, the Community Earth System Model version 1 (CESM, Hurrell et al., 2013) is used to perform present-day climate simulations using restart files from the CESM large ensemble project (CESM-LE, Kay et al., 2015). We use atmospheric fields at daily temporal resolution, with a horizontal resolution of approximately 1° and 30 vertical levels. The original CESM-LE data contain a 35-member ensemble of simulations started on 1 January 1920 and integrated forward in time until 2100. These 35 "macro ensemble" members were rerun for the period from 1 January 1990 to 31 December 1999 in order to obtain temporally high-resolution three-dimensional model output. To further increase the number of simulated JJA seasons, a "micro ensemble" with 35 additional members was branched off from member one of the macro ensemble on 1 January 1980, by adding an $\mathcal{O}(10^{-13})$ perturbation to the initial atmospheric temperature field of each micro ensemble member. These additional micro ensemble runs are then integrated forward in time until 31 December 1999. Fischer et al. (2013) have shown that after a decade at the latest, the micro ensemble members exhibit a similar spread in atmospheric variables compared to members of the macro ensemble. Thus, for the period 1990-1999, the micro ensemble members can be regarded as additional independent members, yielding a total of 70 ensemble members covering the 10-year period from 1990-1999, i.e., 700 years of present-day climate. As for ERA-Interim data, a linear trend is removed from all JJA T2m data at each grid point and in each ensemble member. Note, however, that due to the ensemble set-up, this trend is calculated over only 10 years.

Decomposing a seasonal T2m anomaly to quantify the season's substructure
To examine the substructure of a particular June-August (JJA) season $k$, we decompose its seasonal T2m anomaly ($\Delta T_k$) into contributions from the $N$ ranked daily T2m values of season $k$, where $N$ is the number of days in season $k$ (e.g., for JJA $N = 92$). We thus aim to quantify how much each rank day (i.e., coldest day, second coldest day, etc.) of season $k$ contributes to $\Delta T_k$. The decomposition is applied in the same way to both data sets and therefore, a superscript $X \in \{\mathrm{ERAI}, \mathrm{CESM}\}$ will only be used where it is necessary to explicitly distinguish between the two data sets. All the important statistical quantities used in this study are summarized in Tab. 1. Furthermore, bear in mind that all these quantities are calculated at each grid point individually.

We start by ranking all daily mean T2m values within their respective season (Figs. 2a,b) and compute seasonal means ($\overline{T}_k$), i.e.,

$$\overline{T}_k = \frac{1}{N} \sum_{d=1}^{N} T_{d,k},$$

where $T_{d,k}$ is the daily mean T2m value with rank $d$ in season $k$ (i.e., the temporal ordering of the days is lost, see Fig. 2b). At each grid point we thus compute $K^{\mathrm{ERAI}} = 40$ seasonal mean values for ERA-Interim and $K^{\mathrm{CESM}} = 700$ values for CESM. The climatological seasonal mean ($\overline{T}$) is also calculated from the ranked daily mean T2m values ($T_{d,k}$) as

$$\overline{T} = \frac{1}{K} \sum_{k=1}^{K} \overline{T}_k = \frac{1}{N} \sum_{d=1}^{N} \overline{T}_d, \qquad \overline{T}_d = \frac{1}{K} \sum_{k=1}^{K} T_{d,k},$$

where $\overline{T}_d$ is the climatological mean T2m of the day with rank $d$. Using the $\overline{T}_d$, the seasonal T2m anomaly of any season $k$ ($\Delta T_k$) can be decomposed into contributions from each of the rank days:

$$\Delta T_k = \overline{T}_k - \overline{T} = \frac{1}{N} \sum_{d=1}^{N} \left( T_{d,k} - \overline{T}_d \right) = \frac{1}{N} \sum_{d=1}^{N} a_{d,k},$$

where in the last equality the rank day anomaly of the day with rank $d$ in season $k$ is introduced as $a_{d,k} = T_{d,k} - \overline{T}_d$. In other words, the seasonal mean anomaly $\Delta T_k$ is expressed as the average rank day anomaly (see also Fig. 2c).
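The decomposition described above can be sketched in a few lines of NumPy. This is only an illustrative sketch for a single grid point, not the code used in the study; the function name `rank_day_anomalies` and the synthetic input are assumptions. It sorts the daily values within each season, subtracts the climatological mean of each rank, and allows a check that the average rank day anomaly of a season equals its seasonal mean anomaly.

```python
import numpy as np

def rank_day_anomalies(t2m):
    """Rank day anomalies at one grid point.

    t2m has shape (K seasons, N days). The returned array has the same shape;
    column d holds the anomalies of the day with rank d (coldest first)
    relative to the climatological mean of that rank.
    """
    ranked = np.sort(np.asarray(t2m, dtype=float), axis=1)  # temporal ordering is lost here
    rank_day_mean = ranked.mean(axis=0)                      # climatological mean of each rank
    return ranked - rank_day_mean

# Synthetic example with 40 seasons of 92 days (values are arbitrary).
rng = np.random.default_rng(1)
t2m = 15.0 + rng.normal(0.0, 3.0, size=(40, 92))
a = rank_day_anomalies(t2m)

# Seasonal mean anomaly computed directly from the unsorted data.
seasonal_anomaly = t2m.mean(axis=1) - t2m.mean()
```

Averaging `a` along the day axis reproduces `seasonal_anomaly`, which is the identity expressed by the decomposition, and averaging along the season axis gives zero for every rank by construction.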

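This decomposition of $\Delta T_k$ thus makes it possible to assess the exact contribution from each (ranked) day of season $k$ to $\Delta T_k$. For example, if for a particular season $\Delta T_k = 1$ K and $a_{92,k} = 3$ K (i.e., the hottest day of season $k$ is 3 K warmer than the respective rank day mean), this day contributed $3/92 = 0.0326$ K or 3.26% to the seasonal anomaly $\Delta T_k$. In the following we split the 92 days of each JJA season into three parts according to their rank and focus on the relative contributions to $\Delta T_k$ from the coldest, middle and hottest third of the 92 days of season $k$ by calculating, e.g.,

$$C_{\mathrm{cold},k} = \frac{1}{N \, \Delta T_k} \sum_{d \in D_{\mathrm{cold}}} a_{d,k},$$

where $D_{\mathrm{cold}}$ denotes the ranks belonging to the coldest third of the season, and analogously $C_{\mathrm{middle},k}$ and $C_{\mathrm{hot},k}$ for the middle and hottest third, so that the three contributions add up to 1.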
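The worked example and the split into thirds can be reproduced with a short sketch. The exact assignment of the 92 ranks to the three thirds is not recoverable from the text, so the split used here (`np.array_split`, which yields 31, 31 and 30 days) is an assumption, as are the function name `third_contributions` and the constructed season.

```python
import numpy as np

def third_contributions(a_k):
    """Relative contributions of the coldest, middle and hottest third of one season.

    a_k holds the rank day anomalies of a single season, ordered from the
    coldest to the hottest rank. The result is undefined if the seasonal
    anomaly is zero, because the contributions are normalized by it.
    """
    a_k = np.asarray(a_k, dtype=float)
    cold, middle, hot = np.array_split(a_k, 3)   # assumed split: 31, 31 and 30 days
    total = a_k.sum()
    return cold.sum() / total, middle.sum() / total, hot.sum() / total

# Constructed season: seasonal anomaly of 1 K, hottest day 3 K above its rank day mean.
a_k = np.full(92, 89.0 / 91.0)
a_k[-1] = 3.0
seasonal_anomaly = a_k.mean()
hottest_day_share = a_k[-1] / a_k.size / seasonal_anomaly   # 3/92, i.e. about 3.26%
c_cold, c_middle, c_hot = third_contributions(a_k)
```

Because all three contributions are normalized by the same seasonal anomaly, they add up to 1 regardless of how the ranks are assigned to the thirds.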

Identification and substructure of extreme summers
Extremely hot summers at each grid point in the Northern Hemisphere are identified in the ERA-Interim (CESM) data set as the 5 (35) JJA seasons with the highest seasonal mean T2m. [...] The quantities $C_{\mathrm{cold}}$, $C_{\mathrm{middle}}$ and $C_{\mathrm{hot}}$ again add up to 1 and quantify the relative contributions from the three thirds to the average T2m anomaly of all extreme summers at a particular grid point. Note that the quantities $C_{\mathrm{cold}}$, $C_{\mathrm{middle}}$ and $C_{\mathrm{hot}}$ characterize the mean extreme summer substructure at a particular grid point, while $C_{\mathrm{cold},k}$, $C_{\mathrm{middle},k}$ and $C_{\mathrm{hot},k}$ characterize the substructure of a single season $k$.
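A minimal sketch of this step at a single grid point is given below. It assumes that extreme summers are the `n_extreme` seasons with the highest seasonal mean T2m (5 for ERA-Interim, 35 for CESM) and that the mean contributions are obtained by first averaging the rank day anomalies over all extreme summers and then normalizing by the mean extreme summer anomaly. The function name, the split into thirds of 31, 31 and 30 days, and the synthetic data are all assumptions for illustration.

```python
import numpy as np

def extreme_summer_substructure(t2m, n_extreme):
    """Identify extreme summers and their mean substructure at one grid point.

    t2m has shape (K seasons, N days). Returns the indices of the extreme
    seasons, their mean seasonal T2m anomaly, and the relative contributions
    of the coldest, middle and hottest third to that mean anomaly.
    """
    ranked = np.sort(np.asarray(t2m, dtype=float), axis=1)
    a = ranked - ranked.mean(axis=0)                      # rank day anomalies
    seasonal_anomaly = a.mean(axis=1)
    extreme = np.argsort(seasonal_anomaly)[-n_extreme:]   # seasons with the largest anomalies
    mean_a = a[extreme].mean(axis=0)                      # mean rank day anomalies of extreme summers
    thirds = np.array_split(mean_a, 3)                    # assumed split: 31, 31 and 30 days
    contributions = np.array([part.sum() for part in thirds]) / mean_a.sum()
    return extreme, mean_a.mean(), contributions

# Synthetic example: 40 seasons with some year-to-year variability.
rng = np.random.default_rng(2)
t2m = 18.0 + rng.normal(0.0, 3.0, size=(40, 92)) + rng.normal(0.0, 1.0, size=(40, 1))
extreme, mean_anomaly, contributions = extreme_summer_substructure(t2m, n_extreme=5)
```

Applied to every grid point, the three entries of `contributions` would correspond to maps of the mean contributions from the coldest, middle and hottest third.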

Extreme summer T2m anomalies
Figures 3a and 4a depict the average T2m anomalies during extreme summers in the two data sets ($\Delta T_{\mathrm{ext}}^{\mathrm{ERAI}}$ and $\Delta T_{\mathrm{ext}}^{\mathrm{CESM}}$, respectively). In both data sets, $\Delta T_{\mathrm{ext}}$ exhibits considerable spatial variability. The ERA-Interim extreme summers have temperature anomalies of up to 3 K over western Russia, while over some tropical ocean areas $\Delta T_{\mathrm{ext}}^{\mathrm{ERAI}}$ is less than 0.5 K (Fig. 3a). The $\Delta T_{\mathrm{ext}}^{\mathrm{CESM}}$ field exhibits a generally similar spatial pattern to $\Delta T_{\mathrm{ext}}^{\mathrm{ERAI}}$, with larger values over land than over the oceans (Fig. 4a). However, $\Delta T_{\mathrm{ext}}^{\mathrm{CESM}}$ generally exceeds $\Delta T_{\mathrm{ext}}^{\mathrm{ERAI}}$, as the CESM extreme summers (the 35 hottest of 700 seasons) are statistically more extreme than the ERA-Interim extreme summers (the 5 hottest of 40 seasons). In the following, we decompose the extreme summer T2m anomalies ($\Delta T_{\mathrm{ext}}$) shown in Figs. 3a and 4a using the methodology described in Sect. 2.3 and 2.4, first at a few selected grid points and then for all Northern Hemisphere grid points.

Extreme summer substructures at selected grid points
The rank day anomalies [...] (Fig. 4c). Thus, at this grid point, all thirds of the T2m distribution contribute to extreme summers, but the contribution from the coldest third is disproportionately large (i.e., considerably larger than 33%). Hence, the re-analysis and the climate model data both suggest that the suppression of cool summer days (leading to coldest days of the summer that are milder than usual) is a key ingredient for extreme summers at 116°W/39°N.

Yet a further extreme summer substructure is apparent at the grid point closest to Paris, France (2°E/49°N, Figs. 3d, 4d). At this grid point, the ERA-Interim extreme summer of 2018 was characterized by [...]

In summary, the mean extreme summer substructure at these four grid points is qualitatively remarkably similar for the 5 hottest ERA-Interim summers and the 35 hottest CESM summers. On the one hand, this similarity implies that the rank day anomaly patterns presented in Figs. 3b-e are not artefacts of the rather short ERA-Interim period, but rather must result from physical processes that shape the local extreme summer substructure. On the other hand, these similarities suggest that CESM is able to correctly capture the processes that generate the distinct extreme summer substructures at these example grid points. We next compare the mean ERA-Interim and mean CESM extreme summer substructures at all grid points in the Northern Hemisphere by considering the spatial patterns of the mean contributions from the coldest and hottest thirds, $C_{\mathrm{cold}}^{\mathrm{ERAI}}$, $C_{\mathrm{hot}}^{\mathrm{ERAI}}$, $C_{\mathrm{cold}}^{\mathrm{CESM}}$ and $C_{\mathrm{hot}}^{\mathrm{CESM}}$.

Spatial variability of ERA-Interim and CESM extreme summer substructure
If extreme summers resulted from a uniform shift in the entire T2m distribution, all three thirds of the T2m distribution would contribute equally (i.e., 33%) to the mean extreme summer anomaly $\Delta T_{\mathrm{ext}}^{\mathrm{ERAI}}$. However, the $C_{\mathrm{hot}}^{\mathrm{ERAI}}$ field (Fig. 5a) reveals a complex pattern of coherent regions with increased (> 33%) or decreased (< 33%) contributions from the hottest third of extreme summer days to $\Delta T_{\mathrm{ext}}^{\mathrm{ERAI}}$. Land areas where particularly large $C_{\mathrm{hot}}^{\mathrm{ERAI}}$ values are found include the central US, the UK, parts of northeastern Europe, India and southeast Asia as well as the southern Sahel region (Fig. 5a). In some of these areas, $C_{\mathrm{hot},k}^{\mathrm{ERAI}}$ exceeded $C_{\mathrm{middle},k}^{\mathrm{ERAI}}$ and $C_{\mathrm{cold},k}^{\mathrm{ERAI}}$ during at least 4 out of 5 ERA-Interim extreme summers (stippling in Fig. 5a). In these regions, at least 4 out of 5 extreme summers thus exhibited a similar substructure. However, it is important to bear in mind that in other regions the substructure of individual extreme seasons (i.e., $C_{\mathrm{cold},k}$, $C_{\mathrm{middle},k}$ and $C_{\mathrm{hot},k}$) may differ from the mean extreme season substructure characterized by $C_{\mathrm{cold}}$, $C_{\mathrm{middle}}$ and $C_{\mathrm{hot}}$. Furthermore, in parts of the northern North Pacific and northern North Atlantic, $C_{\mathrm{hot}}^{\mathrm{ERAI}}$ is also substantially increased and reaches up to 60%. In many regions, however, $C_{\mathrm{hot}}^{\mathrm{ERAI}}$ is less than 33%, indicating that in these regions, extreme summers do not arise primarily from the hottest 30 days of the summer being hotter than climatology. In fact, in many regions it is the contribution to $\Delta T_{\mathrm{ext}}^{\mathrm{ERAI}}$ from the coldest third of the summer ($C_{\mathrm{cold}}^{\mathrm{ERAI}}$) that is substantially increased (Fig. 5c). [...] (Figs. 5c,d). Moreover, over the northern North Pacific as well as the high Arctic, the $C_{\mathrm{hot}}^{\mathrm{CESM}}$ and $C_{\mathrm{hot}}^{\mathrm{ERAI}}$ patterns agree only qualitatively, but not quantitatively (Figs. 5a,b). It is important to note, though, that some differences in the $C_{\mathrm{cold}}$ and $C_{\mathrm{hot}}$ fields for the two data sets are expected due to the different sample sizes, even if the model were perfect.

In the remainder of this paper we aim to explain statistical and physical reasons behind selected aspects of the spatial variability in $C_{\mathrm{cold}}$ and $C_{\mathrm{hot}}$.

A statistical explanation for the observed extreme summer substructures
Figures 3b,c and 4b,c clearly illustrate that, at the selected grid points in India (81°E/21°N) and in the US (116°W/39°N), some rank days are climatologically much more variable than others. Importantly, this is the case not just for extreme summers; rather, it is a climatological characteristic of the local temperature variability. For example, at 81°E/21°N the hottest 30 days of the summer are much more variable than the colder days. The 5th to 95th percentile range of the CESM rank day anomalies at a high rank (among the hottest days) is roughly four times larger than that at a low rank (among the coldest days; Fig. 4b). At 116°W/39°N the largest rank day variability is found for lower ranks, and the 5th to 95th percentile range of the CESM rank day anomalies at the same high rank is roughly half of the percentile range at the low rank (Fig. 4c). Similar ratios are found when comparing the spread of the corresponding ERA-Interim rank day anomalies for these two grid points (Figs. 3b,c). Moreover, at both grid points extreme summers occur when the most variable rank days are particularly hot (Figs. 3b,c and 4b,c). Hence, from a statistical point of view, the extreme summer substructure at these two particular grid points appears to be largely determined by the local "rank day variability pattern". That is, the contributions to the mean extreme summer anomaly $\Delta T_{\mathrm{ext}}$ from the distinct rank days during extreme summers depend on how variable the respective rank day values $T_{d,k}$ are climatologically.
We next assess whether the local rank day variability pattern also explains the extreme summer substructure at other Northern Hemisphere grid points. [...] Here we have used the fact that the mean of the rank day anomalies $a_{d,k}$ is by construction equal to zero and thus their variance reduces to the average of the squared $a_{d,k}$-values of all $d$ and all $k$. The contributions from the coldest, middle and hottest third to this variance are then computed, e.g., for the coldest third [...] and analogously for the middle and hottest third of the summer days. [...] variability pattern in most regions, differences in the local rank day variability patterns between the two data sets also lead to differences in the extreme summer substructures.
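The rank day variability pattern can be sketched as follows. Because the equations of this step did not survive in the text, the normalization below is an assumption: the share of each third is taken as its sum of squared rank day anomalies divided by the sum over all ranks and seasons, which follows from the variance being the average of the squared anomalies. The function name and the synthetic data with a long upper tail are hypothetical.

```python
import numpy as np

def rank_day_variance_shares(t2m):
    """Shares of the total rank day variance from the coldest, middle and hottest third.

    t2m has shape (K seasons, N days) at one grid point. The rank day anomalies
    have zero mean for every rank, so the variance is the mean squared anomaly.
    """
    ranked = np.sort(np.asarray(t2m, dtype=float), axis=1)
    a = ranked - ranked.mean(axis=0)
    squared = (a ** 2).sum(axis=0)            # summed over seasons, one value per rank
    parts = np.array_split(squared, 3)        # assumed split: 31, 31 and 30 days
    return np.array([part.sum() for part in parts]) / squared.sum()

# Synthetic example: daily values with a long upper tail, so hot ranks vary most.
rng = np.random.default_rng(3)
t2m = 25.0 + rng.exponential(scale=2.0, size=(700, 92))
v_cold, v_middle, v_hot = rank_day_variance_shares(t2m)
```

In this synthetic case the hottest third dominates the rank day variance, which is the situation described for the Indian grid point.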
It is interesting to compare the patterns of the contributions from the coldest and hottest thirds presented in Figs. 6 and 7 with the skewness of the local daily temperature distributions, which has been studied extensively in the past (Donat and Alexander, 2012; Garfinkel and Harnik, 2017; Linz et al., 2018; Loikith et al., 2018; Loikith and Neelin, 2015; Ruff and Neelin, 2012). The upper tail of, e.g., a positively skewed JJA T2m distribution is longer than the lower tail, which is the case if the hottest summer days are more variable than the coldest summer days (cf. Figs. 5b,c and Fig. S1). Hence, explanations of distinct skewness in daily T2m distributions also help to understand differences in the rank day variability patterns and, subsequently, extreme summer substructures. Garfinkel and Harnik (2017) showed that the winter low-level temperature distributions are positively skewed on the cold side of the Northern Hemisphere storm tracks, primarily because there the magnitude of warm air advection exceeds that of cold air advection. Vice versa, the winter low-level temperature distributions are negatively skewed on the warm side of the Northern Hemisphere storm tracks, where the magnitude of cold air advection exceeds that of warm air advection. While this argument explains differences in the rank day variability and the extreme summer substructures in regions of strong surface temperature gradients, Figs. 5-7 also reveal numerous rather small-scale features that do not necessarily occur in regions of strong surface temperature gradients. We therefore next analyze the extreme summer substructure and its causes in three example regions in more detail. Due to the similarity between the ERA-Interim and CESM extreme summer substructures, we restrict this analysis to ERA-Interim data (except where mentioned otherwise).
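The qualitative link between skewness and rank day variability mentioned above can be illustrated with synthetic data. The sketch below is not taken from the study; the moment-based skewness estimator, the function name and the exponential test data are assumptions chosen for illustration only.

```python
import numpy as np

def sample_skewness(x):
    """Moment-based skewness of all daily values (third standardized moment)."""
    x = np.asarray(x, dtype=float).ravel()
    dev = x - x.mean()
    return (dev ** 3).mean() / (dev ** 2).mean() ** 1.5

# Synthetic daily T2m with a long upper tail, and its mirror image with a long lower tail.
rng = np.random.default_rng(4)
long_upper_tail = 25.0 + rng.exponential(scale=2.0, size=(700, 92))
long_lower_tail = 50.0 - long_upper_tail

skew_pos = sample_skewness(long_upper_tail)
skew_neg = sample_skewness(long_lower_tail)

# Spread of each rank day across seasons (coldest rank first, hottest rank last).
spread_pos = np.sort(long_upper_tail, axis=1).std(axis=0)
spread_neg = np.sort(long_lower_tail, axis=1).std(axis=0)
```

For the positively skewed sample the hottest rank varies more than the coldest rank, and the opposite holds for the negatively skewed sample, in line with the qualitative argument in the text.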

(Examples of) physical causes of extreme summer substructures
A particularly striking feature of Fig. 5 is the large contribution from the hottest third of the summer days to the mean ERA-Interim extreme summer T2m anomaly in India, illustrated for the grid point at 81°E/21°N in Fig. 3b. The general temperature evolution in JJA (i.e., considering all JJA seasons) at this grid point follows a particular sub-seasonal pattern (Fig. 8a). In early June, ERA-Interim T2m values are highly variable and range from 27°C to almost 40°C, with a mean of 35°C on 1 June. Throughout June and the first half of July the climatological T2m drops to approximately 26°C and remains at this level until the end of August. Moreover, during that period, the variability in T2m is much smaller than in early June. The extreme summers exhibit comparatively high temperatures primarily in June, while in July and August their T2m evolution does not differ substantially from other JJA seasons (Fig. 8a). The drop of T2m in June is associated with the onset of the Indian summer monsoon (Fig. 8b; e.g., Slingo, 1999). During most JJA seasons, precipitation starts to fall already during the first half of June. However, the extreme summers each featured very little precipitation for at least the first 20 days of June, which suggests that extreme summers at this grid point occur when there is an unusually late onset of the Indian summer monsoon at this particular location. Moreover, the rank day variability pattern at 81°E/21°N is easily understood from Fig. 8: The hottest days of the season mostly occur in June and are associated with dry conditions. The onset date of the monsoon determines how many dry (and thus very hot) days occur in a JJA season, i.e., an early onset of the Indian monsoon suppresses a large number of very hot days and a late onset increases this number, which leads to the large temperature variability seen in the warmest 30 days of the JJA season.

A further noteworthy feature in Fig. 5 is the sharp boundary in the extreme summer substructure around 75°N-80°N, for example in the North Atlantic sector. North of this boundary, the coldest third of all extreme summer days contributes up to 60% to the extreme summer anomaly (Figs. 5c,d). South of it, the contribution from the coldest third of extreme summer days is much smaller. (Quantitatively, there is some disagreement between the CESM and ERAI extreme summer substructures, but both data sets agree about the general pattern.) This sharp boundary in the extreme summer substructure is co-located with the climatological sea ice edge in JJA (Fig. 9a). Examining the JJA T2m distributions at three grid points across this boundary (42°W/83°N, 42°W/81°N and 42°W/79°N) reveals that for T2m below -1°C, their probability density functions (pdfs) of the daily T2m values are almost identical, which is not surprising given their close spatial proximity. However, large differences in the three pdfs are found for T2m at about 0°C and above. At 83°N, i.e., north of the climatological sea ice edge (Fig. 9a), the pdf exhibits a very short upper tail with very little probability density exceeding +2°C (i.e., the pdf is strongly negatively skewed), while at 79°N (i.e., south of the climatological sea ice edge) the upper tail is much more variable. The geographical co-location of this extreme summer substructure boundary and of the climatological sea ice edge is striking and suggests that the contrasting substructures arise because the sea ice buffers "warm" temperatures at 0°C, that is, air with T2m > 0°C is cooled down to close to 0°C by the induced sea ice melting. The same effect has also been shown to shorten the upper tail of the surface temperature pdf over snow covered areas (Loikith et al., 2018).
As a third example, we return to the grid point in Nevada, US (at 116°W/39°N), where the rank day variability is largest for the cold summer days and extreme summers occur when the coldest 30 days exhibit mostly large positive rank day anomalies (Figs. 3c and 4c). Thus, at this grid point, milder than normal coldest days of the summer (or, equivalently, suppressed cool summer days) are a key ingredient for extreme summers. We therefore briefly explore why, at this grid point, the coldest summer days during extreme summers are warmer than normal.
We first investigate what makes the climatologically coldest summer days at 116°W/39°N particularly cold and then contrast them with the coldest summer days during extreme summers at 116°W/39°N. A composite analysis of the upper-level flow during the 100 climatologically coldest ERA-Interim days of all 1979-2018 summers reveals a characteristic upper-level flow pattern: a highly amplified Rossby wave pattern over the eastern North Pacific and North America, with a breaking synoptic-scale trough covering 116°W/39°N (Fig. 10a). The breaking Rossby wave causing the trough is part of a synoptic-scale and transient wave packet (Fig. 10b), which has just the right phasing such that the trough axis crosses 116°W/39°N when the amplitude of the trough is largest (Fig. 10b). This type of relatively small-scale trough, shown here with contours of potential vorticity on an isentrope in the upper troposphere (Fig. 10a), is relatively slow moving (Fig. 10b), such that the induced northwesterly low-level flow along its western flank can lead to strong and persistent cold-air advection to the western US. Additionally, the low-level flow induced by the trough impinges on the topography at the US west coast. Consequently, low-level air masses that are advected into the western US are most likely forced to ascend, which leads to adiabatic cooling of these already cool air masses and finally results in the climatologically coldest summer days at 116°W/39°N.

The composites for the 100 coldest days during extreme summers, in contrast, do not reveal such a wave pattern (Figs. 10a and 10c). This indicates that the flow pattern characteristic of the climatologically coldest days at this grid point, i.e., the Rossby wave breaking and trough formation with the phasing discussed above, simply did not occur very often during extreme summers. Furthermore, a synoptic analysis of these 100 coldest extreme summer days (not shown) reveals that the associated upper-level flow configurations are rather variable, some featuring troughs while others even exhibit low-amplitude ridges, resulting in the rather zonal composite upper-level flow apparent in Figs. 10a and 10c.
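The composite analysis described above amounts to averaging a field over a selected set of days. A minimal sketch is given below, assuming a daily T2m series at the reference grid point and a daily field (e.g., upper-level PV) on a latitude-longitude grid; the function name, the array layout and the synthetic data are hypothetical and only serve to show the selection and averaging step.

```python
import numpy as np

def composite_of_coldest_days(t2m_point, field, n_days=100):
    """Average `field` over the n_days with the lowest T2m at a reference grid point.

    t2m_point has shape (n_total_days,), field has shape (n_total_days, n_lat, n_lon).
    Returns the composite field and the indices of the selected days.
    """
    t2m_point = np.asarray(t2m_point, dtype=float)
    field = np.asarray(field, dtype=float)
    coldest = np.argsort(t2m_point)[:n_days]     # indices of the coldest days
    return field[coldest].mean(axis=0), coldest

# Synthetic check: 1000 days on a 3 x 4 grid, with the field equal to T2m everywhere.
rng = np.random.default_rng(5)
t2m_point = rng.permutation(1000).astype(float)
field = t2m_point[:, None, None] * np.ones((1, 3, 4))
composite, coldest = composite_of_coldest_days(t2m_point, field)
```

To composite only the coldest days of extreme summers, the same function could be applied after restricting both inputs to the days of those seasons.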
Why such highly amplified troughs with the right phasing did not occur in extreme summers at 116°W/39°N is currently unclear, and at the same time challenging to assess. Possibly, the exact longitude where the synoptic-scale waves were triggered (Röthlisberger et al., 2018) as well as the strength and longitudinal extent of the North Pacific jet, which modulates the waves' downstream propagation and breaking behavior (e.g., Drouard et al., 2015), might have played a role. However, both the jet strength and the characteristics of the transient waves propagating along the jet are strongly modulated by lower-frequency processes such as the Madden-Julian Oscillation (Moore et al., 2010) and the El Niño Southern Oscillation (Drouard et al., 2015; Shapiro et al., 2001). This example thus illustrates that a seamless approach, combining processes on different time scales, is most likely required to fully reveal the physical causes of extreme summers.

Summary and concluding remarks
In this study, extreme summers are defined in the upper tail of the JJA seasonal mean T2m distribution at each grid point in the Northern Hemisphere and then analyzed with regard to their substructure. To this end, the extreme summer T2m anomaly is decomposed into its contribution from each rank day. First, all days are ranked within their respective season (i.e., from rank 1 to 92 for JJA) and then compared to the climatological T2m of all days with the same rank. The resulting rank day anomalies exactly quantify how much each (rank) day contributes to the T2m anomaly of the respective season and therefore allow for very intuitive statements about the characteristics of extreme summers. For example, we show that during the 2010 summer at the ERAI grid point at 35°E/58°N the 31 hottest days contributed 53% to the seasonal anomaly of 3.13 K and were each at least 4 K warmer than climatology. This decomposition is applied to T2m data from ERA-Interim as well as data from 700 simulated years with CESM for present-day climate conditions. In this way, the contributions from the coldest, middle and hottest third of extreme summers to the extreme summer T2m anomalies are quantified at each Northern Hemisphere grid point ($C_{\mathrm{cold}}$, $C_{\mathrm{middle}}$ and $C_{\mathrm{hot}}$).
This analysis reveals clearly distinct extreme summer substructures, occurring in coherent geographical regions. Despite the relatively small scale of the structures in the $C_{\mathrm{cold}}^{\mathrm{ERAI}}$ and $C_{\mathrm{hot}}^{\mathrm{ERAI}}$ fields (the mean ERA-Interim contributions from the coldest and hottest thirds) as well as different numbers of extreme summers in the two data sets, CESM is able to reproduce these fields to a remarkable degree. This result firstly underlines that the ERA-Interim extreme summer substructures and their spatial variability result from physical processes rather than a too short data record and, secondly, testifies to the model's ability to reproduce the physical processes responsible for the occurrence of extreme summers in most regions in the Northern Hemisphere. Areas where CESM and ERA-Interim extreme summer substructures differ include Greenland, the northern North Atlantic as well as the Arabian Peninsula.
Furthermore, a key finding of this study is that the mean extreme summer substructure is consistent with the shape of the underlying local T2m distribution. The extreme summer substructure is largely determined by which of the 92 JJA rank days are most variable (i.e., the rank day variability pattern), which is qualitatively related to the skewness of the T2m distribution. Simply speaking, in regions where the coldest days of the summer are most variable (i.e., negatively skewed T2m distribution), extreme summers occur when the coldest days of the summer are unusually warm, and analogously for the case where the hottest days vary the most (i.e., positively skewed T2m distribution). This finding is relevant for two reasons. Firstly, it constrains what kind of extreme summer substructures can be expected locally, in particular in regions with strongly skewed daily temperature distributions. For example, extreme summers arising primarily from extremely hot summer days (i.e., heat waves) are unlikely to occur in regions with strongly negatively skewed temperature distributions. Secondly, some individual extreme summers, such as the 2010 summer at the grid point at 35°E/58°N, featured clear temperature regime shifts, with rank day anomalies far outside of what could be expected from their climatological variability (e.g., almost twice as large as the second-largest anomalies for the same ranks). The general consistency between the mean extreme summer substructure and the skewness of the underlying T2m distribution illustrates that such regime shifts in the temperature variability during extreme summers are the exception rather than the norm.
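The qualitative link between the rank day variability pattern and the skewness of the daily T2m distribution can be sketched as follows. The example uses synthetic, negatively skewed data rather than reanalysis or model output, treats seasons as independent samples, and uses the hypothetical helper names `rank_day_variability` and `skewness`; it is an illustration under these assumptions, not the procedure used in the study.

```python
import numpy as np

def rank_day_variability(t2m):
    """Standard deviation across years of each rank day (coldest rank first)."""
    return np.sort(t2m, axis=1).std(axis=0)

def skewness(values):
    """Sample skewness (third standardized moment) of all daily values pooled."""
    z = (values - values.mean()) / values.std()
    return float((z ** 3).mean())

# Synthetic, negatively skewed daily temperatures capped near 0 degC (273.15 K),
# loosely mimicking a buffered upper tail; these are not real T2m data.
rng = np.random.default_rng(1)
t2m = 273.15 - rng.gamma(shape=2.0, scale=1.5, size=(200, 92))

variability = rank_day_variability(t2m)
cold_var = variability[:30].mean()    # mean variability of the coldest 30 ranks
hot_var = variability[-30:].mean()    # mean variability of the hottest 30 ranks
skew = skewness(t2m)
```

In this synthetic case the pooled distribution is negatively skewed and the coldest ranks vary more from year to year than the hottest ranks, which is the situation in which extreme summers would be expected to arise mainly from unusually warm cold days.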
This consistency furthermore allows us to rely on previous work on the physical causes of skewed surface temperature distributions for interpreting our results. Consistent with the findings of Garfinkel and Harnik (2017), we find distinct extreme summer substructures relative to the location of large surface temperature gradients, in particular in the Northern Hemisphere storm track regions. Extreme summers occurring north of the Northern Hemisphere storm tracks have large contributions from the hottest third of summer days, whereas south of the storm tracks the contributions from the coldest days are largest. This is primarily because on the cold side of a temperature gradient, warm air advection can reach much larger magnitudes than cold air advection, and vice versa on the warm side (e.g., Garfinkel and Harnik, 2017; Linz et al., 2018; Tamarin-Brodsky et al., 2019).
Moreover, the few areas where the ERA-Interim and CESM extreme summer substructures differ also have differing rank day variability patterns in ERA-Interim and CESM. Thus, the climate model's ability to reproduce the ERA-Interim extreme summer substructures in most places results largely from the model's ability to produce local rank day variability patterns that agree with ERA-Interim.
However, three case studies illustrate that the extreme summer substructure cannot always be explained by temperature advection alone. In eastern India, more than 65% of the extreme summer T2m anomaly results from the hottest 30 days of JJA being hotter than climatology. At the considered grid point, T2m exhibits a distinct sub-seasonal pattern, as it typically drops by almost 10 K with the onset of the Indian summer monsoon. Thus, the hottest days of the season (occurring in June) are highly variable, and extreme summers occur in seasons with particularly late monsoon onsets.

In the high Arctic the highest surface temperatures are buffered around 0°C, as excess heat would result in sea ice melting and subsequent latent cooling. Hence, the cold part of the T2m distribution accounts for most of the rank day anomaly variance and, consequently, extreme summers occur when the coldest summer days are warmer than normal. This buffering effect of the Arctic sea ice leads to a strong boundary in the extreme summer substructure around 75°N-80°N, i.e., near the climatological JJA sea ice edge.
At a grid point in the western United States, all parts of the T2m distribution contribute significantly to extreme summers; however, a disproportionately large fraction comes from the coldest third of the extreme summer days (i.e., the coldest extreme summer days are warmer than their rank day mean). Composites of the upper-level flow during the 100 climatologically coldest summer days reveal that an amplified upper-level flow pattern, with a particular phasing of a prominent trough and its associated cold air advection, is characteristic of the climatologically coldest summer days at this grid point. This particular flow pattern did not occur frequently during the extreme summers, leading to milder than normal cool summer days.
This result is consistent with previous work on physical causes of non-Gaussian temperature distributions (Garfinkel and Harnik, 2017; Linz et al., 2018; Tamarin-Brodsky et al., 2019), as it highlights the role of temperature advection by transient waves in generating a non-uniform rank day variability pattern, or similarly, a skewed T2m distribution.
Overall, the case studies illustrate that for understanding the physical causes of extreme summers, a seamless approach is necessary, which combines weather system dynamics, local thermodynamics and surface-atmosphere interactions as well as lower frequency variability in the atmosphere and the ocean. Clearly, distinct physical causes might lead to similar extreme summer substructures, in particular when comparing regions that are far apart (e.g., the northern Sahel region and the high Arctic, Fig. 5). However, similar extreme summer substructures in neighboring regions conceivably also point to similar physical causes of extreme summers (e.g., the Asian Monsoon region). Therefore, the extreme summer substructure is a helpful tool for discriminating between neighboring regions with distinct physical causes of extreme summers and might also be helpful for identifying coherent regions with similar physical causes of extreme summers.

A further key result of this study is that in most places, the cool summer days contribute substantially to extreme summer T2m anomalies [more than 25% over 83% (86%) of the Northern Hemisphere land area in ERAI (CESM)]. In fact, Fig. 5 reveals that for ERA-Interim (CESM) in 46% (49%) of the Northern Hemisphere land area, the coldest third of the summer contributes more to the extreme summer anomaly than the hottest third. Thus, large positive seasonal temperature anomalies (i.e., extreme summers as opposed to individual heat waves) cannot be understood and explained by considering only the physical drivers of heat waves. Rather, the processes that suppress the occurrence of cold summer days must also be considered. Yet, these processes are so far virtually unexplored and thus possibly offer untapped potential for improving our understanding of extreme summers. However, as illustrated by the example of extreme summers in the western US, the processes that suppress the occurrence of cold summer days sometimes seem rather intangible, as they do not necessarily manifest themselves in the occurrence of an unusual flow pattern, but rather in the non-occurrence of the particular flow that typically produces the coldest summer days.
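A land area fraction of the kind quoted above could be computed along the following lines. This is only a sketch: it assumes a regular latitude-longitude grid on which cell area scales with the cosine of latitude, and the function name `area_fraction`, the tiny grid and the contribution values are all made up for illustration and do not reproduce the numbers of the study.

```python
import numpy as np

def area_fraction(condition, lat_deg, land_mask):
    """Area-weighted fraction of land grid cells where `condition` is True.

    condition, land_mask : boolean arrays of shape (n_lat, n_lon)
    lat_deg              : latitudes in degrees, shape (n_lat,)
    """
    # On a regular latitude-longitude grid, cell area scales with cos(latitude).
    weights = np.cos(np.deg2rad(lat_deg))[:, None] * land_mask
    return float((weights * condition).sum() / weights.sum())

# Tiny hypothetical grid: two latitudes, two longitudes, all land.
lat = np.array([0.0, 60.0])
land = np.ones((2, 2), dtype=bool)
cold_frac = np.array([[0.5, 0.4], [0.2, 0.3]])   # made-up cold-third contributions
hot_frac = np.array([[0.2, 0.3], [0.5, 0.4]])    # made-up hot-third contributions

fraction = area_fraction(cold_frac > hot_frac, lat, land)
```

In this toy grid the condition holds only in the equatorial row, which carries twice the area weight of the row at 60°N, so the resulting fraction is two thirds rather than one half.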
This study has illustrated that extreme summers across the Northern Hemisphere have distinct substructures, which result directly from the physical causes of the extreme summers. However, the concept of the extreme season substructure has applications beyond what has been presented in this study and thus calls for subsequent studies. Firstly, the presented analyses could be extended to the Southern Hemisphere and to other seasons and variables. (The application of the technique is most promising for variables that are potentially unbounded and variable on both ends, i.e., not for a positive definite variable like precipitation.) Secondly, the concept of a "season substructure" can be relevant for field campaigns, as the representativeness of the campaigns' measurements depends on how representative the time period of the campaign was (Wernli et al., 2010).
Thirdly, extreme summers with distinct substructures conceivably have different societal effects, and thus future research should assess whether and where the extreme summer substructure is affected by climate change. The results of this study suggest that CESM is a suitable tool for this task, as it is largely able to reproduce the observed (ERA-Interim) extreme summer substructure in the current climate. However, some of the extreme summers observed within the last 40 years appear to be outside of the spectrum of 700 years of CESM. Hence, while CESM is able to reproduce the local extreme summer substructures, it may not be able to reproduce the most extreme summers that are physically possible in some regions. Clearly, this finding requires detailed and critical further investigation. Finally, changes in the extreme summer substructure with climate change must be related to changes in the physical causes of extreme summers, as a uniform warming would not affect the local rank day variability pattern. Therefore, contrasting extreme summer substructures in present and future climate simulations might also help to identify regions where the physical causes of extreme summers are altered by climate change.

Data availability. ERA-Interim data can be downloaded from the ECMWF webpage (https://apps.ecmwf.int/datasets/data/interim-full-daily/levtype=sfc/). The CESM T2m data used here is available upon request from the authors.
Author contributions. MR and HW conceived the study, MS provided technical support, UB performed the CESM simulations, MR analyzed the data and wrote the major part of the manuscript. HW, EF, MS, and UB also contributed to writing the manuscript and commented on earlier versions of it.
Competing interests. The authors declare no conflict of interest.
Figure 1. Schematic surface temperature evolution during extreme summers with different substructures: an extreme summer arising from just one heat wave (orange), from a suppression of cool summer days (green), from a shift in the entire T2m distribution (blue), and from a general shift towards higher temperatures combined with a heat wave (red). The schematic climatological surface temperature evolution is depicted in gray.