Synoptic-scale drivers of the Mistral wind: link to Rossby wave life cycles and seasonal variability

The mistral is a northerly low-level jet blowing through the Rhône valley in southern France and down to the Gulf of Lions. It is co-located with the cold sector of a low-level lee-cyclone in the Gulf of Genoa, behind an upper-level trough north of the Alps. The mistral wind has long been associated with extreme weather events in the Mediterranean, and while extensive research has focused on the low-tropospheric mistral and lee-cyclogenesis, the different upper-tropospheric large- and synoptic-scale settings involved in producing the mistral wind are not generally known. Here, the isentropic potential vorticity (PV) structures governing the occurrence of the mistral wind are classified using a self-organizing map (SOM) clustering algorithm. Based upon a 36-year (1981-2016) mistral database and daily ERA-Interim isentropic PV data, 16 distinct mistral-associated PV structures emerge. Each classified flow pattern corresponds to a different type or stage of the Rossby wave lifecycle, from broad troughs, through thin PV streamers, to distinguished cut-offs. Each of these PV patterns exhibits a distinct surface impact in terms of the surface cyclone, surface turbulent heat fluxes, wind, temperature and precipitation. A clear seasonal separation between the clusters is evident, and transitions between the clusters correspond to different Rossby wave-breaking processes. This analysis provides a new perspective on the variability of the mistral, and of the Genoa lee-cyclogenesis in general, linking the upper-level PV structures to their surface impact over Europe, the Mediterranean and north Africa.

To identify mistral days, a Genoa cyclone database is first defined based on the presence of a cyclone in the CYC domain (Fig. 1) in ERA-Interim, using the sea-level pressure field at 1-degree horizontal resolution and 6-hourly time intervals. The cyclone masks are identified in ERA-Interim as the area within the outermost closed contour of the sea-level pressure field at 0.5-hPa intervals, adapted from Wernli and Schwierz (2006). Then, wind direction and speed criteria are applied to days on which a Genoa cyclone is detected, using the higher-resolution WRF-ORCHIDEE model data: NW to NE (i.e., within ±45° of north) wind direction at 900 hPa and 10-m wind speed of at least 2 m/s, averaged over the GOL domain (Fig. 1). The objective identification yielded 2734 mistral days, corresponding to a year-round frequency of 21%, in agreement with Burlando et al. (2009).
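The wind criteria above can be illustrated with a minimal sketch. It assumes that domain-averaged wind components and a cyclone flag are already available for each day; the function names and argument names are hypothetical and are not taken from the study's processing chain.

```python
import math


def wind_direction_deg(u, v):
    """Meteorological wind direction: degrees the wind blows FROM (0 = north)."""
    return math.degrees(math.atan2(-u, -v)) % 360.0


def is_mistral_day(u900, v900, speed10m, cyclone_present,
                   max_dev_deg=45.0, min_speed=2.0):
    """Apply the direction and speed criteria to GOL-averaged values (assumed inputs).

    u900, v900: eastward and northward wind components at 900 hPa (m/s)
    speed10m:   10-m wind speed (m/s)
    """
    if not cyclone_present:
        return False
    direction = wind_direction_deg(u900, v900)
    # Angular distance from due north, folded into [0, 180] degrees.
    deviation = min(direction, 360.0 - direction)
    return deviation <= max_dev_deg and speed10m >= min_speed
```

For example, a north-north-westerly flow (u = 3 m/s, v = -10 m/s) with a 5 m/s 10-m wind passes the test only when a Genoa cyclone is also present on that day.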
Consecutive mistral days were grouped into mistral events. The duration and monthly frequency of the identified mistral events are shown in Fig. 2. Mistral events typically peak in Jan-Feb at 30% frequency, while the mistral is less frequent in summer (~10%). Most mistral events last a single day. Durations of more than 4-8 days occur exclusively in the autumn and winter months. This distribution generally agrees with the climatological lifetime properties of the Genoa low (e.g., Campins et al. 2011).
https://doi.org/10.5194/wcd-2021-7 Preprint. Discussion started: 29 January 2021 © Author(s) 2021. CC BY 4.0 License.
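Grouping consecutive mistral days into events, and deriving event durations from them, can be sketched as follows. The helper name `group_into_events` and the example dates are illustrative assumptions, not data from the mistral database.

```python
from datetime import date, timedelta


def group_into_events(mistral_days):
    """Group calendar days into events made of consecutive days."""
    events = []
    for day in sorted(set(mistral_days)):
        if events and day - events[-1][-1] == timedelta(days=1):
            events[-1].append(day)   # extends the current event
        else:
            events.append([day])     # starts a new event
    return events


# Illustrative (made-up) list of mistral days.
days = [date(1995, 8, 25), date(1995, 8, 26), date(1995, 8, 27),
        date(1995, 9, 2), date(1995, 9, 10), date(1995, 9, 11)]
events = group_into_events(days)
durations = [len(event) for event in events]  # event lengths in days
```

Histograms of `durations` per calendar month would then give a duration and frequency summary of the kind described for Fig. 2.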

SOM Set-up and Validation
The SOM algorithm is provided by MathWorks (https://www.mathworks.com/help/deeplearning/gs/cluster-data-with-a-self-organizing-map.html). The setting used in the present study yielded a shallow neural network with a single layer, using the sigmoid activation function, in a hexagonal grid (Fig. A1). The hexagonal grid setting implies that each cluster interacts with the 6 surrounding clusters in terms of similarity; however, the results are displayed on a rectangular grid to simplify the presentation. The map dimension (and therefore the number of clusters) was set as the number beyond which the classification begins to deteriorate or does not add relevant information (i.e., near-empty clusters or highly similar ones). Here we aimed to classify the mistral events into clusters that present dynamically meaningful PV distributions, are representative of their daily individual members, and have a considerable annual mean frequency (i.e., on the order of 5%). Eventually, the number of clusters was set to 16 in a 4x4 configuration (Fig. A1), which satisfied these demands. The SOM learning process optimizes a chosen function indicating each cluster's inner variance, and despite the wide variety of relevant functions, well-performing SOM processes are usually quite indifferent to the chosen function (Sheridan and Lee, 2011). The same usually holds for other parameters, such as the neighborhood size and the number of calculation time steps, within a reasonable range of values (e.g., Cassano et al., 2006; Johnson et al., 2008). For the present study we chose the intuitive RMSE function as the optimization parameter, with the initial neighborhood distance set to 2 and the number of training steps for initial covering of the input space set to 365. Indeed, the identified patterns were only weakly affected by the choice of these parameters. Furthermore, the SOM readily reproduced similar average patterns for several different mistral datasets.
For example, the process was repeated for a subset of mistral dates in which single-day mistral events were removed and events separated by a single day were joined. This modified subset included 248 fewer days than the original, reducing the sample size by about 9%, yet the identified patterns were nearly identical.
Statistical significance is assessed by Student's t-test between each cluster and the total averaged mistral flow. Following Wilks et al. (2016), an additional criterion was added to the statistical test to account for the multiple testing problem.
Specifically, the Walker criterion was applied, where a threshold of $p_{\mathrm{W}} = 1 - (1 - \alpha_0)^{1/N_0}$ is set on the p-value obtained by the t-test. Here $\alpha_0$ is the required significance level (e.g., 0.05 for 95% confidence) and $N_0$ is the number of individual t-tests or, in this case, the number of grid points in the domain of interest. If the p-value of an individual t-test is larger than $p_{\mathrm{W}}$, the null hypothesis cannot be rejected at a level of $\alpha_0$.
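As a worked illustration of the Walker threshold, the sketch below evaluates 1 - (1 - alpha_0)^(1/N_0) and applies it to a list of grid-point p-values. The function names are hypothetical, and the p-values are assumed to come from t-tests computed elsewhere.

```python
def walker_threshold(alpha0, n_tests):
    """Walker threshold on individual p-values for n_tests simultaneous tests."""
    return 1.0 - (1.0 - alpha0) ** (1.0 / n_tests)


def significant_points(p_values, alpha0=0.05):
    """Flag grid points whose p-value does not exceed the Walker threshold."""
    threshold = walker_threshold(alpha0, len(p_values))
    return [p <= threshold for p in p_values]
```

With a significance level of 0.05 and 100 grid points the threshold is roughly 5.1e-4, which shows how strongly the criterion tightens the per-point test as the domain grows.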

PV Distribution Clusters
The classification of isentropic PV resulted in 16 clusters, set in a 4x4 hexagonal grid, where similar clusters are placed closer together in the SOM space. The identified mean PV patterns defining each cluster are displayed in Fig. 3, alongside the mean 500-hPa geopotential heights. The panel order represents each cluster's location in the SOM space, i.e., the least similar clusters are placed farther apart, and more similar ones are placed adjacent to one another. Some clusters correspond to exotic upper-tropospheric PV structures, such as thin streamers and cut-offs, attributed to different Rossby-wave breaking (RWB) events. Dotted regions indicate statistical significance, i.e., the main features on which the classification is established. For instance, a high-PV tongue to the west of a cut-off appears to define cluster 8, while a westward-stretching PV streamer defines cluster 9, suggesting that these clusters correspond to cyclonic and anticyclonic RWB lifecycles (Thorncroft et al., 1993), respectively. A southwesterly-oriented cut-off defines cluster 5, while a northerly thin streamer defines cluster 2, and so on. Together, the clusters illustrate a thematic separation of the PV continuum responsible for mistral events, and one can easily envision how the waves propagate by switching from one cluster to another.
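How a SOM arranges similar clusters next to one another can be illustrated with a minimal, generic sketch. It is an online Kohonen update on a rectangular 4x4 grid with a Gaussian neighbourhood, written for illustration only: it is not the MathWorks implementation used in the study, it does not use a hexagonal topology, and the learning-rate and radius schedules as well as the synthetic two-value "fields" are assumptions.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x (Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda k: sum((w - v) ** 2 for w, v in zip(weights[k], x)),
    )


def train_som(samples, rows=4, cols=4, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small online SOM; returns one weight vector per map node."""
    rng = random.Random(seed)
    # Initialise every node from a randomly drawn sample.
    weights = [list(rng.choice(samples)) for _ in range(rows * cols)]
    coords = [(k // cols, k % cols) for k in range(rows * cols)]
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
        x = rng.choice(samples)
        bmu = best_matching_unit(weights, x)
        for k, w in enumerate(weights):
            d2 = ((coords[k][0] - coords[bmu][0]) ** 2
                  + (coords[k][1] - coords[bmu][1]) ** 2)
            h = math.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood weight
            for j, xj in enumerate(x):
                w[j] += lr * h * (xj - w[j])
    return weights


# Synthetic stand-in for flattened daily PV fields (two values per day).
data_rng = random.Random(1)
samples = [[data_rng.random(), data_rng.random()] for _ in range(300)]
weights = train_som(samples)
labels = [best_matching_unit(weights, x) + 1 for x in samples]  # clusters 1..16
```

Because every update also pulls the neighbours of the best-matching node towards the sample, nodes that are close on the grid tend to end up with similar weight vectors, which is the property that the panel ordering relies on.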
Upstream of the high-PV anomaly, most clusters exhibit an amplified ridge in the upper troposphere over the Atlantic, a common precursor for intense Mediterranean cyclones (Raveh-Rubin and Flaounas 2017). As expected, the primary mode of variance is the seasonal cycle, manifested by the meridional shift of the dynamical tropopause (subsection 3.2). Another mode picked up by the SOM becomes apparent when comparing clusters 12 and 16 to 11 and 15. The SOM was able to distinguish between mountain-passed PV anomalies (11 and 15) and blocked ones (12 and 16), and the impact on the trough properties is evident from the tilting of the trough axis upon the passage across the Alps from NE to N, respectively. The standard deviation (STD) of the PV distribution within the clusters is presented in Fig. A2 in the appendix, emphasizing the different active regions between the clusters.
Note that the composites presented in Fig. 3 each comprise ~100 days. While some variance within the clusters is inevitable, these composites help to illustrate the SOM clustering process, especially in statistically significant regions. Detailed features of the actual patterns, as picked up by the SOM, can be better understood by carefully examining the cluster members with respect to their mean values. Such examples are provided in Sect. 3.6, aiding in establishing the cluster features, detailed in Table 1.
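Cluster composites and the within-cluster spread around them can be computed along the following lines. This is a sketch under stated assumptions: each daily field is a flattened list of grid-point values, `cluster_composites` is a hypothetical helper, and the population standard deviation is used (the study does not state which estimator was applied).

```python
from statistics import mean, pstdev


def cluster_composites(fields, labels):
    """Per-cluster, per-grid-point mean and standard deviation of daily fields."""
    members = {}
    for field, label in zip(fields, labels):
        members.setdefault(label, []).append(field)
    composites = {}
    for label, group in members.items():
        columns = list(zip(*group))  # one tuple of member values per grid point
        composites[label] = {
            "n": len(group),
            "mean": [mean(c) for c in columns],
            "std": [pstdev(c) for c in columns],  # population STD (assumption)
        }
    return composites
```

Grid points where `std` is large relative to the composite signal mark regions in which individual members deviate most from the cluster mean.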

Seasonal Variation
The climatological monthly occurrence frequency of each cluster is displayed in Fig. 4, demonstrating the strong seasonal affiliation of the clusters, i.e., all clusters have a clear seasonal peak in their occurrence. Clusters 1-4 occur mostly between June and October, while clusters 9-16 occur mainly between November and April. The low-PV background clearly dominates the summer clusters (Fig. 3, panels 1-4), while much broader wave amplitudes characterize the winter clusters (9-16). In between, clusters 5-8 peak mainly in the transition seasons. Overall, higher frequencies are obtained for the winter clusters, as expected from the higher frequency of mistral events in winter.
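A monthly occurrence frequency per cluster can be derived from the dated cluster labels as sketched below. The normalization by the number of calendar days in each month over the record is an assumption made for illustration; the function name and example dates are hypothetical.

```python
import calendar
from collections import Counter
from datetime import date


def monthly_cluster_frequency(dated_labels, first_year, last_year):
    """Fraction of all calendar days in each month that belong to each cluster.

    dated_labels: iterable of (date, cluster_label) pairs for mistral days.
    Returns a dict keyed by (month, cluster_label).
    """
    counts = Counter((d.month, label) for d, label in dated_labels)
    days_in_month = Counter()
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            days_in_month[month] += calendar.monthrange(year, month)[1]
    return {key: n / days_in_month[key[0]] for key, n in counts.items()}


# Made-up example: two January days in cluster 9, one July day in cluster 2.
example = [(date(2000, 1, 1), 9), (date(2000, 1, 2), 9), (date(2000, 7, 5), 2)]
freq = monthly_cluster_frequency(example, 2000, 2000)
```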

Surface Circulation and Surface Impact
The surface impact of the differently classified mistral events is presented in terms of composites of sea-level pressure (SLP) and precipitation (Fig. 5), and of surface heat fluxes (SHF) along with 10-m winds and 900-hPa equivalent potential temperature (Fig. 6).
The SLP patterns reveal the typical westward tilt with height, and suggest that some clusters favour a phase lock, usually corresponding to the deepest cyclones (e.g., clusters 5, 8, and 14-16). It is probable that each PV cluster is linked to a different stage of the cyclone lifecycle, which can be centered to the east or west of Italy, or even south in the Ionian Sea, with varying depths. The cyclones are closest to the lee of the Alps in clusters 4, 8, 12 and 16, associating the right column in Fig. 3 with the initial stages of cyclogenesis, while the left column (clusters 1, 5, 9 and 13) likely corresponds to the termination stage of the cyclones and their easternmost location. The anticyclone extending from the Atlantic is highly variable among the clusters in its strength, thereby affecting the surface pressure gradient, and in its spatial extension towards Europe. At times, the high-pressure system dominates the region (e.g., clusters 1, 6, 9, 13), such that a weak cyclone is sufficient for producing the strong mistral winds (see red arrows in Fig. 6). In other cases, the deep Mediterranean cyclone is the dominant feature (e.g., clusters 11, 12, 15, 16).
The mean distribution of precipitation is unique for every cluster, with the location of the precipitation maxima differing among clusters more than across the seasons. In summer, precipitation varies sharply between the eastern and northern Alps (i.e., clusters 1-4), while in winter it is distributed differently between the Dolomite and Balkan Mountains (i.e., clusters 12, 16) and the Alps (8), with notable precipitation occurring along the African shoreline as well (10, 14, 15 and more). Generally, precipitation is distributed along the northern and eastern sides of the cyclone, and roughly correlates with its intensity, which is consistent with previous work (Flaounas et al. 2015; Wernli 2015, 2016).
The surface heat flux pattern associated with each cluster is relatively localized. Most clusters exhibit the familiar heat loss hotspot in the GOL; however, it can extend to different lengths south into the Mediterranean and is absent from several clusters [...] changes between the different mistral clusters, rather than deviations from climatology. For example, referring to Fig. 6, the SHF maxima in the GOL are often not statistically significant, as they are a standard mistral feature (clusters 6, 11 and 16). However, statistical significance arises when the signal extends further south (clusters 2, 7, and 14) or west (clusters 5 and 10), or if it is exceptionally weak (clusters 1, 12, 13). The clusters' main characteristics are summarized in Table 1.

Time Evolution of Mistral Events
Frequent cluster transitions and cluster persistence can be visualized using the transition probability matrix (TPM, Fig. 7).
Column 0 shows the likelihood of a mistral event ending at any cluster with no further transition. The strong amplitude along the shifted main diagonal suggests that every cluster has a tendency to sustain itself, with varying likelihood. We interpret this feature as a demonstration of persistence by the algorithm, as the time scale for the evolution and migration of the PV structures driving the mistral events is often longer than a day. The fact that the PV-based SOM is able to consistently classify consecutive days of slow-developing waves under the same classification reinforces the robustness of the method, given proper SOM constraints (such as the number of clusters), as only significant (by SOM interpretation) differences in the daily fields within a mistral event can force a transition. After Espinoza et al. (2012), a transition is deemed statistically significant if its frequency exceeds the 90th percentile of the corresponding transition frequency derived from 1000 random redistributions of the original sequence. We constructed a reference random distribution by considering all mistral days (recall that the clustering is performed only for mistral days and not for all other days). While Huang et al. (2017) [...]
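The transition probability matrix with its extra column 0, together with a shuffle-based significance test of the kind described above, can be sketched as follows. The details are assumptions made for illustration: events are given as lists of daily cluster numbers, labels are shuffled across all mistral days while event lengths are kept, a nearest-rank percentile is used, and all function names are hypothetical.

```python
import math
import random


def transition_counts(events, n_clusters):
    """counts[i][j]: days in cluster i followed by cluster j; j = 0 means the event ended."""
    counts = [[0] * (n_clusters + 1) for _ in range(n_clusters + 1)]
    for labels in events:
        for today, tomorrow in zip(labels, labels[1:]):
            counts[today][tomorrow] += 1
        counts[labels[-1]][0] += 1  # last day of the event goes to column 0
    return counts  # row 0 is unused padding so that indices match cluster numbers


def transition_probabilities(counts):
    """Normalise each row of the count matrix to probabilities."""
    return [[c / sum(row) if sum(row) else 0.0 for c in row] for row in counts]


def significant_transitions(events, n_clusters, n_shuffles=1000, pct=90, seed=0):
    """Flag transitions whose count exceeds the pct-th percentile of shuffled counts."""
    rng = random.Random(seed)
    observed = transition_counts(events, n_clusters)
    pool = [label for labels in events for label in labels]
    lengths = [len(labels) for labels in events]
    size = n_clusters + 1
    null = [[[] for _ in range(size)] for _ in range(size)]
    for _ in range(n_shuffles):
        rng.shuffle(pool)  # redistribute cluster labels over all mistral days
        it = iter(pool)
        shuffled = [[next(it) for _ in range(n)] for n in lengths]
        counts = transition_counts(shuffled, n_clusters)
        for i in range(size):
            for j in range(size):
                null[i][j].append(counts[i][j])
    rank = math.ceil(pct * n_shuffles / 100) - 1  # nearest-rank percentile index
    return [[observed[i][j] > sorted(null[i][j])[rank] for j in range(size)]
            for i in range(size)]
```

In this layout, entry `[i][i]` measures the persistence of cluster i and entry `[i][0]` the likelihood that an event ends in cluster i, mirroring the diagonal and column-0 features discussed in the text.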

Note that the transitions are distributed mostly around the main diagonal and the 0 column, the latter primarily due to frequent single-day mistral events. Nonetheless, some recurring cluster transitions show a considerable amplitude, such as transitions 14→9 and 2→5. These transitions become clearer when viewed separately for each season (Fig. 8).
Very different amplitudes along the main diagonal and the 0 column suggest that some clusters are self-sustaining only in certain seasons and are more likely to be the end of a mistral event in the other seasons (for example, cluster 5 in autumn compared to winter-spring). It is clear from the seasonal TPMs that some transitions are absent from certain seasons, whereas others may occur in any month of the year. Studied carefully, these transitions reveal many details about the development of upper-level PV anomalies over the Alps. For instance, transitions 12→7 (amplifying ridge behind streamer) and 10→5 (formation of Genoa cut-off) are more abundant in the transition seasons, while transition 14→9 (strong anticyclonic Rossby wave breaking, AWB) occurs in any season except summer, suggesting that this transition and several others are more resilient to the changes of the seasons. Some of these transitions, and indeed some individual clusters, can be directly related to AWB (14, 15 leading to 9, and 13), cyclonic Rossby wave breaking (CWB, cluster 8), a cut-off low migrating into the domain from the north (2→1/5), or being cut out of a north-easterly streamer (10→5). Other transitions imply an equatorward stretching of a trough (12→ [...] 9-11 and 14-16, with preferable paths illustrated by the thick blue lines in the corresponding panels. This view also highlights the directionality of the transitions. For instance, note that the clusters with the largest numbers of initiated events that last 3 or more days are the blocked clusters 12 and 16, and that the "relieved" clusters rarely jump back to a blocked cluster, i.e., the transitions 11/15→12 are scarce. Considering the cluster configuration displayed in Fig. 3, the general direction of flow within a mistral event is from right to left and from top to bottom, diverging mostly from clusters 8 and 12 and converging towards clusters 9, 14 and 15 (see Fig. S1 in supplementary material). Overall, the clustering analysis identified robust and distinct isentropic PV patterns, and the transition analysis offers additional perspective on the evolution of these upper-level PV structures as they interact with the Alpine ridges during mistral events.
These TPMs, combined with the surface impact of each identified PV cluster, can potentially be utilized to improve weather [...]
This spring mistral event begins with a blocked trough and a weak cyclone in the lee of the Alps. The trough is stretched into a thin NE streamer, captured by transition 12→10, which is a common transition. Upon this transition, which marks a first AWB, the SHF intensify dramatically, along with precipitation in the Adriatic region. The trough then stretches further and breaks cyclonically (10→8) to form a cut-off, as the cyclone attains a deep symmetric structure (Tous and Romero, 2013).
Note that the streamer is channelled above the Rhône Valley just before breaking, illustrating the wrapping up of PV banners generated in the mistral region (Aebischer and Schär, 1998). The classification of 11 April 2005 into cluster 8, capturing the CWB pattern despite the PV streamer extending to the north-east rather than the north-west as suggested by the composite, emphasizes the ability of the SOM to identify distinct geometrical features, rather than only geographical ones.

3.5.3 Cut-off Low Mistral, 25-31 August 1995
The summer mistral event is exceptionally long for the season. It begins with the weakest upper-level forcing recorded by the present analysis, with a 2→1 transition, demonstrating the cut-off of a 3-PVU northerly streamer over the Alps. The propagation of a second wave into the domain is identified as transition 1→3, and this second wave is again stretched southward to form a summer streamer (3→2). This transition is accompanied by an intensification of the mistral, the deepening of a primary cyclone south-east of the Baltic Sea, and a lee cyclone in the Adriatic Sea. The streamer then tilts and breaks to form a cut-off low (2→5), with weakening of the mistral intensity. A noticeable characteristic is the persistent precipitation response of the days corresponding to cluster 2, centered just between the eastern Alps and northern Dolomites, as suggested by the precipitation composites (Fig. 5, cluster 2).
Figure A1: The 4x4 hexagonal SOM map with the numbered clusters and neighbor distances, or similarity between clusters (dimensionless) in colored connections. Brighter colors represent short distances, or more similar clusters, and dark colors indicate large distances, or dissimilarities.

A.2 Intra-cluster variability
The variability among the members of each cluster can be quantified with the standard deviation (STD) map of the PV fields (Fig. A2). The resulting patterns demonstrate the uncertainty in the magnitude and exact location of the identified PV feature.
As such, the largest STD is either along the boundary of the streamer (e.g., clusters 3, 6, 10, 14) or within a cut-off (1, 5, 8). It is apparent that some clusters' mean signal is reasonably well aligned with the individual members, while others exhibit larger inner variance. For example, most members of cluster 9 fit right in the composite, as opposed to cluster 1, where the depth of [...]