Interactive 3-D visual analysis of ERA5 data: improving diagnostic indices for marine cold air outbreaks and polar lows

Recent advances in visual data analysis are well suited to gain insights into dynamical processes in the atmosphere. We apply novel methods for three-dimensional (3-D) interactive visual data analysis to investigate Marine Cold Air Outbreaks (MCAOs) and Polar Lows (PLs) in the recently released ERA5 reanalysis data. Our study aims at revealing 3-D perspectives on MCAOs and PLs in ERA5 and at improving the diagnostic indices to capture these weather events in long-term assessments on seasonal and climatological time-scales. Using an extended version of the open-source visualization framework Met.3D, we explore 3-D perspectives on the structure and dynamics of MCAOs and PLs and relate these to previously used diagnostic indices. Motivated by the 3-D visual analysis of selected MCAO and PL cases, we conceptualize alternative index variants that capture the vertical extent of the lower-level instability induced by MCAOs and its distance to the dynamical tropopause. The new index variants are evaluated, along with previously used indices, with a focus on their skill as a proxy for the occurrence of PLs. Testing the association of diagnostic indices with observed PLs in the Barents and the Nordic Seas for years 2002-2011 shows that the new index variants based on the vertical structure of cold air masses are more skillful in distinguishing the times and locations of PLs, compared with conventional indices based on sea-air temperature difference only. We thus propose using the new diagnostics for further analyses in climate predictions and climatological studies. The methods for visual data analysis applied here are available as an open-source tool and can be used generically for interactive 3-D visual analysis of atmospheric processes in ERA5 and other gridded meteorological data.


Introduction
Marine Cold Air Outbreaks (MCAOs) are transport events of cold air from sea-ice or snow-covered regions towards relatively warmer oceans (Rasmussen, 1983; Kolstad and Bracegirdle, 2008; Gryschka, 2018). Understanding MCAOs is relevant because they represent conditions favourable for extreme weather phenomena known as Polar Lows (PLs; Rasmussen, 1983; Ese et al., 1988; Kolstad, 2011; Michel et al., 2018). PLs are intense mesoscale cyclones, which have been called "Arctic hurricanes" (Nordeng, 1992; Føre et al., 2012; Bracegirdle, 2012) due to similarities with tropical hurricanes including symmetric vortex-like cloud patterns. PLs in the Northern Hemisphere usually occur during winter and are characterized by strong winds, heavy precipitation and severe marine icing, which pose substantial risks to marine activities and infrastructures (Aarnes et al., 2018). Improving the understanding of MCAOs and their relation to PLs could improve marine services in the polar regions.
In addition to representing conditions conducive to extreme weather, MCAOs are also important in the context of deep water formation as they contribute to cooling of the ocean surface (Papritz and Spengler, 2017).
To facilitate statistical statements about atmospheric processes in, e.g., climatological or climate prediction studies, weather phenomena are commonly characterized by means of diagnostic indices that abstract a complex atmospheric structure into a simple numerical value. To define such indices for MCAOs and PLs, various interdependent factors need to be considered, and understanding the link between MCAOs and PLs remains an area of active research (Claud et al., 2007; Kolstad and Bracegirdle, 2008; Kolstad, 2011; Terpstra et al., 2016; Afargan-Gerstman et al., 2020; Stoll et al., 2021). To diagnose MCAOs, most previous studies used an index that represents a simplified version of the Brunt-Väisälä frequency for quantifying static stability, by considering only the sign of the vertical temperature gradient between potential skin temperature of the ocean and potential air temperature aloft (e.g., Papritz et al., 2015; Kolstad, 2017). This temperature difference is termed the MCAO index, where a positive potential temperature difference between the ocean surface and the air aloft indicates vertical instability.
Previously used MCAO indices (Papritz et al., 2015; Kolstad, 2017) have the form

$$m_\theta = \theta_{\mathrm{skin}} - \theta_{850\,\mathrm{hPa}} \qquad (1)$$

or variations thereof (Kolstad et al., 2009; Fletcher et al., 2016; Papritz and Sodemann, 2018; Landgren et al., 2019), where $\theta_{\mathrm{skin}}$ is the potential skin temperature (which is the sea surface potential temperature over the ocean), and $\theta_{850\,\mathrm{hPa}}$ is the potential air temperature at 850 hPa. Other related indices use the difference between the temperature at a certain height and the sea surface temperature (e.g., Zappa et al., 2014). In what follows, we will summarize the different variants of previously used MCAO indices using the term conventional MCAO index (to distinguish these from other metrics considered here). The vertical level at which air aloft is considered for calculation of the conventional MCAO index (850 hPa in Eq. 1) will be referred to as the characteristic pressure level.
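As a minimal sketch of how the conventional MCAO index of Eq. 1 could be evaluated from temperature fields, the following Python snippet converts temperatures to potential temperatures with the Poisson equation and takes their difference. The reference pressure of 1000 hPa, the value of kappa, the function names and the sample values are illustrative assumptions and are not taken from the study.

```python
import numpy as np

P_REF_HPA = 1000.0  # reference pressure for potential temperature (assumed)
KAPPA = 0.2857      # R_d / c_p for dry air (approximate, assumed)


def potential_temperature(temp_k, pressure_hpa):
    """Poisson equation: theta = T * (p_ref / p) ** kappa."""
    temp_k = np.asarray(temp_k, dtype=float)
    pressure_hpa = np.asarray(pressure_hpa, dtype=float)
    return temp_k * (P_REF_HPA / pressure_hpa) ** KAPPA


def conventional_mcao_index(skin_temp_k, surface_pressure_hpa,
                            temp_aloft_k, level_hpa=850.0):
    """theta_skin - theta_aloft; positive values indicate instability."""
    theta_skin = potential_temperature(skin_temp_k, surface_pressure_hpa)
    theta_aloft = potential_temperature(temp_aloft_k, level_hpa)
    return theta_skin - theta_aloft


# Two hypothetical grid cells: one with cold air aloft, one without.
m_theta = conventional_mcao_index(
    skin_temp_k=np.array([276.0, 272.0]),
    surface_pressure_hpa=np.array([1000.0, 1000.0]),
    temp_aloft_k=np.array([255.0, 262.0]),
)
```

Changing `level_hpa` in this sketch corresponds to choosing a different characteristic pressure level.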

The choice of the characteristic pressure level used for computing the conventional MCAO index varies substantially amongst previous studies, from 500 hPa (Landgren et al., 2019), to 700 hPa (Kolstad et al., 2009), 800 hPa (Fletcher et al., 2016), 850 hPa (Papritz et al., 2015; Kolstad, 2017; Polkova et al., 2019, 2021), to 900 hPa (Papritz and Sodemann, 2018). [...] On the other hand, it also complicates comparison and interpretation of results from different studies. Open questions remain on the appropriate choice of a characteristic pressure level and the sensitivity of results to this choice (e.g., the frequency of occurrence of MCAOs in climatological assessments). Since MCAOs have been reported to be a necessary, but not a sufficient condition for the occurrence of PLs (Kolstad, 2011), an open question is whether a diagnosed presence of MCAOs can be useful as a proxy for the occurrence of PLs. A recent climatological study finds a weak relation between MCAOs and the occurrence of polar mesoscale cyclones, including PLs (Michel et al., 2018). In this study, we investigate the effect of the choice of the characteristic pressure level for the MCAO index and the vertical structure of MCAOs in relation to PLs. We develop and test alternative indices that do not rely on a subjective element in the choice of a characteristic pressure level, but instead take into account the vertical structure of MCAOs and PLs. We quantify the link between MCAO indices and the time and location of observed PLs and propose a simple method for determining the characteristic pressure level that maximizes this link.
For our analyses we make use of recently released reanalysis data and innovative methods for 3-dimensional (3-D) interactive visual analysis (IVA). The global reanalysis dataset ERA5, recently released by the European Centre for Medium-Range Weather Forecasts (ECMWF), is considered to be the most detailed and highest quality global reanalysis data available (Hersbach et al., 2020; Copernicus Climate Change Service (C3S), 2017). The increased spatial and temporal resolution of ERA5, compared to its predecessors, along with its large temporal coverage, allows for both 3-D analysis of mesoscale atmospheric phenomena during single weather events and statistical analysis over multiple decades. Recent advances in graphics hardware along with innovative methods for visual data analysis facilitate the interactive 3-D visual exploration of ERA5 and other gridded meteorological data. Rautenhaus et al. (2018) provide a comprehensive overview of methods and potential benefits. Examples of 3-D IVA include supercell tornados (Orf et al., 2017), jet-stream core lines (Kern et al., 2018), and synoptic-scale fronts (Kern et al., 2019). 3-D IVA provides more comprehensive impressions and can help improve our understanding of meteorological phenomena, for example by rapid visual investigation of dynamical processes and by exploration of unknown features for formulating new hypotheses. The potential of 3-D depictions has long been appreciated by meteorologists (see e.g., Uccellini, 1990; Rautenhaus et al., 2018). However, despite novel 3-D visualization being available in several frameworks (open-source examples include ParaView (Ayachit, 2015), Vapor (Li et al., 2019) and Met.3D (Rautenhaus et al., 2015b, a)), meteorological studies are still mainly conducted by means of visualizing static 2-D slices. Rautenhaus et al. (2018) discuss reasons for the slow uptake of modern visualization methods in the atmospheric sciences, including usability aspects as well as too few studies in the atmospheric community that demonstrate the potential benefits to be gained from, e.g., 3-D IVA. In this study, we extend and apply the open-source interactive visualization framework Met.3D (Rautenhaus et al., 2015b, a) for interactive 3-D investigation of the structure of MCAOs and PLs as represented in ERA5. We investigate to which level of detail the 3-D structure of MCAOs and PLs is resolved in ERA5, demonstrate the potential of 3-D IVA as a tool to understand atmospheric processes, and use insights from 3-D IVA as inspiration for conceptualizing improvements to existing MCAO and PL indices.

The objectives for this study are (a) to obtain a 3-D perspective on the structure and dynamics of MCAOs and PLs, (b) to relate the 3-D structure to previously used diagnostic indices for representing these weather events on seasonal and climatological time-scales, and (c) to evaluate diagnostic indices in the context of observed PLs for the main purpose of testing if these indices could serve as proxies for PLs. The article is structured as follows. Sect. 2 describes the data, the visual analysis setup, the candidates for improved indices and the methodology to evaluate these. In Sect. 3 we illustrate insights gained from the 3-D visual analysis and describe the evaluation of diagnostic indices. The article is concluded in Sect. 4.

Data and Methods
Our analysis starts with the interactive 3-D visual exploration of selected cases of MCAOs and PLs in ERA5 (Sect. 2.1 and 2.2). Subsequently, we conceptualize alternative diagnostic indices (Sect. 2.3) using insights from the 3-D IVA as inspiration.
The performance of the new indices is tested in comparison with conventional MCAO indices by assessing associations with observed PLs (Sect. 2.4).

Interactive visual analysis of ERA5 with Met.3D
Met.3D is a meteorology-specific 3-D visualization framework that provides the user with various methods for IVA of gridded meteorological data (Rautenhaus et al., 2015b, a). The framework focuses on interactive rapid exploration of the 3-D atmosphere and on uncertainty represented by ensemble simulations. It is designed to bridge the gap between 2-D visualizations (including horizontal maps, vertical cross-sections, vertical profiles), 3-D visualizations (including isosurfaces, direct volume rendering, 3-D streamlines and trajectories) and novel feature-based displays (Kern et al., 2018, 2019).

We apply Met.3D to investigate the 3-D structure of MCAOs and PLs in ERA5. For the initial explorative phase of the investigation, a set of exemplary MCAO and PL cases was selected based on previous studies (Kolstad, 2011; Føre et al., 2012; Bracegirdle, 2012) about strong MCAO events and symmetric PLs (e.g., the case on Dec. 18, 2002 in Figs. 1 and 2). The following exploratory visualization methods were used to study the ERA5 atmosphere: interactive sliding of 2-D horizontal and vertical cross-sections through the 3-D atmosphere, exploration of the shape and location of 3-D isosurfaces of selected variables including potential temperature and wind speed, direct volume rendering to inspect cloud liquid and ice water, and computation of 3-D streamlines of wind fields. During the initial phase of the visual case-analyses, we explored more than 10 ERA5 variables over the Northern Hemisphere (north of 25° N; grid dimension in lat-lon-height: 261×1441×137) for several cases of MCAOs and PLs to obtain a picture of the large-scale atmospheric situation. During the second phase, we visually analysed single cases of MCAOs and PLs in more detail by inspecting ERA5 variables (t, pv, u, v, w, z, q, cc, ciwc, clwc) on a smaller domain (grid dimension 440×440×137; polar stereographic projection) covering the Barents and the Nordic Seas. More technical details are given in Appendix A.

New indices for MCAOs and PLs
Conventional MCAO indices have been calculated with a variety of different characteristic pressure levels, ranging from 500 to 900 hPa, at which air aloft is considered for calculation of the MCAO index (Landgren et al., 2019; Kolstad et al., 2009; Fletcher et al., 2016; Papritz et al., 2015; Kolstad, 2017; Papritz and Sodemann, 2018; Polkova et al., 2019). For understanding the 3-D structure of MCAOs and the sensitivity of conventional MCAO indices to the choice of the characteristic pressure level, we have implemented a functionality in Met.3D that allows for IVA of the effect of varying the characteristic pressure level.
For this purpose, we introduce a simple 3-D extension of conventional MCAO indices. We compute the temperature difference between the surface and each vertical pressure level, p, instead of considering only one particular characteristic pressure level, as was usually done in previous studies on MCAOs. That is, instead of the conventional MCAO index (Eq. 1), we compute its 3-D variant,

$$m_\theta^p = \theta_{\mathrm{skin}} - \theta_p,$$

and use methods for interactive visual data exploration for its analysis (e.g. sliding a horizontal cross-section through all vertical levels, p, of $m_\theta^p$). In a similar way, we implement in Met.3D the conventional MCAO index variants described in Kolstad and [...] Table 1.
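The 3-D extension amounts to evaluating the surface-minus-aloft difference at every vertical level instead of a single one. A minimal NumPy sketch follows, assuming potential temperature is already available on pressure levels with array layout (level, lat, lon); the function name, layout and sample values are hypothetical and do not reflect the Met.3D implementation.

```python
import numpy as np


def mcao_index_3d(theta_skin, theta_levels):
    """3-D MCAO index: theta_skin minus theta at every level.

    theta_skin:   array of shape (ny, nx)
    theta_levels: array of shape (nlev, ny, nx)
    Returns an array of shape (nlev, ny, nx).
    """
    theta_skin = np.asarray(theta_skin, dtype=float)
    theta_levels = np.asarray(theta_levels, dtype=float)
    # Broadcasting subtracts each level from the same 2-D surface field.
    return theta_skin[np.newaxis, :, :] - theta_levels


# Hypothetical 2 x 3 grid with three levels (lowest level first).
theta_skin = np.full((2, 3), 275.0)
theta_levels = np.stack([np.full((2, 3), t) for t in (270.0, 274.0, 280.0)])
m3d = mcao_index_3d(theta_skin, theta_levels)
```

A horizontal cross-section at level index `k` is then simply `m3d[k]`.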
Two new diagnostic indices are introduced and tested: (i) the new MCAO index, and (ii) the new PL index. Put simply, the new MCAO index approximates the vertical extent (in what follows also termed "height") of MCAOs, and the new PL index measures the vertical distance between the upper boundary of MCAOs and the dynamical tropopause. These metrics are designed to address shortcomings in conventional indices (e.g., a subjective choice of pressure level and weak relation between MCAOs and PLs) by taking into account 3-D features of the atmospheric circulation, while remaining simple and computationally cheap, and hence feasible for use in further climatological studies that rely on processing of large amounts of data.

(i) The new MCAO index approximates the vertical extent of the lower-level instability induced by MCAOs (expressed as a pressure difference). It is calculated for each horizontal grid-cell and time-step (hourly in ERA5) as the pressure difference between the surface and the upper boundary of the lower-level instability caused by an MCAO,

$$m_p = p_0 - p^*,$$

where $p_0$ is defined here as the standard constant surface pressure 1013.25 hPa. For the computation of the upper boundary of the MCAO, $p^*$, we determine the pressure level at which the potential air temperature equals the potential skin temperature.
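One possible way to determine $p^*$ in a single column is to scan the profile upward from the surface, locate the first level at which the potential temperature aloft reaches the potential skin temperature, and interpolate linearly in pressure. The sketch below makes several assumptions that are not specified in the text: the lowest crossing is used, interpolation is linear in pressure, and the index is set to zero where no instability is found. All names are hypothetical.

```python
import numpy as np

P0_HPA = 1013.25  # standard constant surface pressure used in the text


def upper_boundary_pressure(theta_skin, theta_profile, pressure_hpa):
    """Pressure p* where theta aloft first equals theta_skin.

    Profiles are ordered from the lowest level (high pressure) upward.
    Returns NaN if the column is not unstable at the lowest level or if
    no crossing is found within the given levels (assumed convention).
    """
    theta = np.asarray(theta_profile, dtype=float)
    pressure = np.asarray(pressure_hpa, dtype=float)
    diff = theta_skin - theta          # 3-D MCAO index in this column
    if diff[0] <= 0:
        return np.nan
    crossed = np.where(diff <= 0)[0]
    if crossed.size == 0:
        return np.nan
    k = crossed[0]
    # Linear interpolation between level k-1 (diff > 0) and k (diff <= 0).
    frac = diff[k - 1] / (diff[k - 1] - diff[k])
    return pressure[k - 1] + frac * (pressure[k] - pressure[k - 1])


def new_mcao_index(theta_skin, theta_profile, pressure_hpa):
    """m_p = p0 - p*; zero where no boundary is found (assumed)."""
    p_star = upper_boundary_pressure(theta_skin, theta_profile, pressure_hpa)
    return 0.0 if np.isnan(p_star) else P0_HPA - p_star


pressure = [1000.0, 900.0, 800.0, 700.0, 600.0]
theta = [268.0, 269.0, 271.0, 275.0, 282.0]
p_star = upper_boundary_pressure(273.0, theta, pressure)
m_p = new_mcao_index(273.0, theta, pressure)
```

In this example the sign change lies halfway between 800 and 700 hPa, so the sketch places the upper boundary at 750 hPa.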

To investigate if the diagnostic indices may be used to distinguish times and locations at risk for PL occurrence, it is important to not only analyse index values at times and locations when PLs have occurred, but also when no PLs have occurred. For that purpose, we calculate all diagnostics during a set of randomly selected "pseudo-events". Pseudo-events are defined as hypothetical PL events with the same frequency, and the same temporal and spatial scale as the actual PL events observed in STARS.

This provides an experimental setup for testing if diagnostic indices are able to distinguish actual from hypothetical PL events.
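The temporal part of such a setup can be sketched as drawing, without replacement, as many random hours as there are observed PL hours from a pool of candidate hours in which no PL was observed. This is only an illustration of the idea: the hour indexing, the function name and the sampling scheme are assumptions, and matching the spatial scale of the events is not covered here.

```python
import numpy as np


def sample_pseudo_event_times(candidate_hours, pl_hours, rng):
    """Draw as many non-PL hours as there are PL hours, without replacement."""
    candidate_hours = np.asarray(candidate_hours)
    pl_hours = np.asarray(pl_hours)
    # Hours that are candidates but not occupied by an observed PL.
    free_hours = np.setdiff1d(candidate_hours, pl_hours)
    return rng.choice(free_hours, size=pl_hours.size, replace=False)


rng = np.random.default_rng(seed=0)
candidate_hours = np.arange(1000)               # hypothetical winter hours
pl_hours = np.array([10, 11, 12, 500, 501])     # hypothetical PL hours
pseudo_hours = sample_pseudo_event_times(candidate_hours, pl_hours, rng)
```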

A PL that was described as an "Arctic hurricane" (Bracegirdle, 2012) formed west of the coast of Novaya Zemlya during 19-20 December 2002, towards the end of the MCAO illustrated in Fig. 1. The capacity for quickly analysing various data variables from different angles using different visualization methods is a key advantage of 3-D IVA, because it makes it easier to explore new perspectives and discover potentially interesting features. One such example emerging from our case-studies is a "slow wind perspective" on PLs (Fig. 2, Movie 2). During some of the PL events we analysed, there is a coherent tube-like [...]

Figure 2. [...] (a) In case 1, 2002/12/19, 16:00, a symmetric slow wind "eye" is observed (gray isosurface) in the area where a PL is reported (Bracegirdle, 2012; Føre et al., 2012), with a pronounced vortex in the wind field around it (see inlay). The tube-like, coherent volume of air with slow winds extends up into the stratosphere. As expected, fast winds are observed near the surface. Complex 3-D structures of the wind field in the lower troposphere are visualized by means of isosurfaces constrained to different bounding boxes for selective illustration of various wind speeds in different volumes of air in close proximity to the observed PL. The illustrated aspects of the 3-D wind field show similarities to previously described, average wind-patterns during reverse shear PLs (Fig. 8 in Michel et al., 2018; Terpstra et al., 2016). Movie 2 demonstrates the interactive 3-D data exploration of PL case 1 using Met.3D. (b) In case 2, 2011/03/24, 11:00, no symmetric slow wind eye is observed in the lower troposphere in the area of the reported PL (Noer et al., 2011). The jet-stream is stronger and located further south.

MCAO index values decrease with increasing altitude (Fig. 3, Movie 3). At low altitudes, close to the ocean surface, potential air temperature is lower than potential skin temperature, leading to positive MCAO indices. At higher altitudes, potential air temperature increases, leading to negative MCAO indices (set to zero in Fig. 3a-c, Movie 3). Note that areas with extreme values of the conventional MCAO index, i.e. maximum temperature difference between the sea surface and a certain pressure level, do not necessarily coincide with areas where the conventional MCAO index is positive also at high altitudes (i.e. $m_\theta^p > 0$ at high vertical levels p; see [...]).

The sensitivity of conventional MCAO indices to the choice of the characteristic pressure level can be expected, considering standard vertical profiles of potential temperature along with a dynamic 3-D shape of the volume of cold air and its complex mixing with air masses above the ocean. With the aim of formulating a diagnostic index that is not sensitive to the choice of a characteristic pressure level, we consider in more detail the 3-D structure of potential air temperature.

The upper boundary of MCAOs
The interactive visual analyses of the 3-D MCAO index, along with standard vertical profiles of potential air temperature, suggest that the vertical profile of potential air temperature in the column of air above each location inside an MCAO may be sketched as follows: in the lower troposphere, there is an unstable layer of air, in which the potential air temperature decreases with altitude, followed by a layer with approximately constant potential air temperature, and then a stable layer of air in which potential temperature increases with altitude. Fig. 3d-f show the vertical profile of potential temperature at some exemplary locations. This implies that, in the column of air above each location (grid-cell) within the area of an MCAO, there should be at least one pressure level at which the 3-D MCAO index, $m_\theta^p$, changes its sign. The critical pressure level, $p^*$, at which $m_\theta^p$ changes its sign is the altitude at which potential temperature aloft equals potential skin temperature,

$$\theta_{p^*} = \theta_{\mathrm{skin}}.$$

The column of air below $p^*$ is unstable and the air column above it is stable, with respect to the simple static stability criterion based on the difference in potential temperature aloft and at the surface, as used in conventional MCAO indices (Landgren et al., 2019; Kolstad et al., 2009; Fletcher et al., 2016; Papritz et al., 2015; Kolstad, 2017; Papritz and Sodemann, 2018; Polkova et al., 2019). We therefore define the critical pressure level, $p^*$, as a simple measure of the upper boundary of MCAOs.

The upper boundary of the lower-level instability caused by MCAOs can be visualized by computing the zero-isosurface of $m_\theta^p$. Visual analyses of the zero-isosurface reveal interesting spatio-temporal dynamics in the upper boundary of MCAOs (Fig. 4, Movies 4-6). Investigation of several MCAO cases indicates a tendency for the upper boundary of the lower-level instability to rise with distance from the sea-ice edge. This is in accordance with conceptual descriptions of strong organized convective processes and convective overturning that cause the MCAO depth to increase with increasing distance from the sea-ice (e.g., Gryschka, 2018). However, there are substantial spatio-temporal variations during the course of single MCAOs and particularly between different MCAOs. Interestingly, visual comparison of the upper boundary of MCAOs and the position of observed PLs during several of our use-cases shows that geographical areas with the highest vertical extent of MCAOs coincide with geographical areas of PLs.

3.2.3 The vertical distance between lower-level instability and upper-level anomaly during PLs

Both a lower-level instability and an upper-level forcing of the dynamical tropopause are required for PL genesis (Kolstad, 2011; Grønås and Kvamstø, 1995). Along these lines, Grønås and Kvamstø (1995) showed that, for 2 out of 4 PLs, the distance between the surface and the top of the atmospheric boundary layer was smaller than 2500 m. In related previous work, Kolstad [...]

Interactive 3-D visual analysis of single cases indicates that the distance between the dynamical tropopause and the upper boundary of MCAOs (Sect. 3.2.2) is smaller in geographic proximity to areas where PLs occur. During some of the PL cases, the dynamical tropopause extends downward into the lower-level instability region induced by MCAOs (Fig. 4e, crossing the zero-isosurface of $m_\theta^p$ with the dynamical tropopause; Movie 5).
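The vertical distance discussed here can be sketched as a pressure difference between the upper boundary of the MCAO and the dynamical tropopause in the same column. The snippet below assumes that the tropopause is identified as the first level, scanning upward, at which potential vorticity reaches 2 PVU, and that the PL index is the difference $p^* - p_{tr}$; both the threshold and the exact form of the index are assumptions made for illustration and are not stated in the surrounding text.

```python
import numpy as np

PV_TROPOPAUSE_PVU = 2.0  # commonly used threshold; assumed here


def tropopause_pressure(pv_profile_pvu, pressure_hpa):
    """Pressure of the first upward crossing of the PV threshold.

    Profiles are ordered from the lowest level (high pressure) upward.
    Returns NaN if the threshold is never reached.
    """
    pv = np.asarray(pv_profile_pvu, dtype=float)
    pressure = np.asarray(pressure_hpa, dtype=float)
    reached = np.where(pv >= PV_TROPOPAUSE_PVU)[0]
    if reached.size == 0:
        return np.nan
    k = reached[0]
    if k == 0:
        return pressure[0]
    # Linear interpolation between level k-1 (below threshold) and k.
    frac = (PV_TROPOPAUSE_PVU - pv[k - 1]) / (pv[k] - pv[k - 1])
    return pressure[k - 1] + frac * (pressure[k] - pressure[k - 1])


def pl_index(p_star_hpa, p_tr_hpa):
    """Distance (hPa) between MCAO upper boundary and tropopause.

    Negative values mean the tropopause lies below the upper boundary.
    """
    return p_star_hpa - p_tr_hpa


pressure = [1000.0, 800.0, 600.0, 400.0, 300.0]
pv = [0.3, 0.4, 0.8, 1.5, 3.5]
p_tr = tropopause_pressure(pv, pressure)
m_tr = pl_index(750.0, p_tr)   # 750 hPa: hypothetical upper boundary
```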

Evaluating diagnostic indices by comparison with observed PLs
In previous sections, we summarized our case-studies of MCAOs and PLs along with the definition and motivation for new [...]

The mean distance between the upper boundary of the lower-level instability and the dynamical tropopause, as captured by $m_{tr}$, during all PL events is approximately 345 hPa, but substantially smaller distances are also observed (Fig. 5h). Interestingly, in 41 % of all PLs, there is a short time of at least one hour, during which the dynamical tropopause extends downward into the lower-level instability ($p_{tr}(t) > p^*(t)$) within the area of observed PLs (separate analysis not shown in the figure). In comparison, a larger mean distance of approximately 510 hPa is observed in the areas outside of observed PLs (Fig. 5).
Investigating diagnostic indices during "pseudo-events", defined as randomly selected times when no PL occurred with an area that matches in scale the area of actual PL events (see Sect. 2.4), shows that low values for the new MCAO index ("shallow" MCAOs) and larger values of the new PL index (low or absent forcing from the dynamical tropopause) occur more often during normal weather conditions than during PLs (gray bars in Fig. 5e-f and h-i). This suggests that the new indices capture features that are useful for distinguishing meteorological conditions during PLs from meteorological conditions during a randomly selected set of days in the Nordic winter.

Association between diagnostic indices and locations of observed PLs
The locations with high values of the new indices, $m_p$ and $m_{tr}$, resemble the locations of observed PLs for selected cases. For example, the area with high values of the new MCAO index during the MCAO case that we previously illustrated (Figs. 1, 3, and 4) matches the area of the observed PL rather well (Fig. 5), considering that it is a very simple index computed without [...]
To assess if the number of matches that we obtain is substantially different from the number of matches one would expect by random choice of high index values somewhere in the geographical region of interest, we count the number of overlaps between the areas of randomly selected "pseudo-events" (Sect. 2.4) and the areas of high index values. This shows that the number of matches between high index values and observed PLs is substantially higher than the number of matches between high index values and randomly chosen pseudo-events (Fig. 5), which provides additional supportive evidence for the robustness of the new diagnostics.
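Counting such overlaps can be sketched with boolean masks: one mask per event marks the area of high index values, another marks the event area, and an event counts as a match if the two masks share at least one grid cell. The mask layout (event, lat, lon), the definition of a match and the toy data are assumptions for illustration only.

```python
import numpy as np


def count_matches(high_index_masks, event_masks):
    """Number of events whose area overlaps the high-index area.

    Both inputs are boolean arrays of shape (n_events, ny, nx).
    """
    overlap = np.logical_and(high_index_masks, event_masks)
    return int(overlap.any(axis=(1, 2)).sum())


# Two hypothetical events on a 3 x 3 grid.
high = np.zeros((2, 3, 3), dtype=bool)
high[0, 0, 0] = True
high[1, 2, 2] = True

pl_areas = np.zeros((2, 3, 3), dtype=bool)
pl_areas[0, 0, 0:2] = True     # overlaps the high-index cell of event 0
pl_areas[1, 0, 0] = True       # no overlap for event 1

pseudo_areas = np.zeros((2, 3, 3), dtype=bool)
pseudo_areas[:, 1, 1] = True   # no overlap for either event

n_pl_matches = count_matches(high, pl_areas)
n_pseudo_matches = count_matches(high, pseudo_areas)
```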
Results in this section suggest that the new diagnostic indices are useful and informative for distinguishing the location of PLs, given knowledge about the time of occurrence of PLs. However, for the diagnostics to be useful in predictability studies or marine services as PL proxies, it is necessary to demonstrate that they are able not only to identify locations with higher risk for PLs, conditional on knowledge about the time of occurrence, but that they are able to distinguish times and locations of PLs without any prior knowledge from observations. This is tested in the next section.

Performance of diagnostic indices in distinguishing the time and location of PLs
For distinguishing times and locations with higher risk for PL occurrence from times and locations with lower risk for PL occurrence based on the simple diagnostic indices, it is necessary that these take on sufficiently different values during PL events compared with no-PL events. As a first step to test this, we conduct a composite analysis of the difference between the long-term average of index values during PL events and non-PL events (random pseudo-events) and compare it visually with observed PL tracks (Fig. 6c-d). The difference between the new MCAO index during PL events compared with non-PL events is particularly large in those areas where many PLs have been observed, whereas the difference is not as large in those areas for the conventional MCAO index (Fig. 6c-d; compare, e.g., the area between Iceland and Norway in panels c, d). This indicates that the new MCAO index may serve as an approximate classifier to distinguish times and locations of PLs, whereas the conventional MCAO index is not well suited for this task.
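The composite difference described above can be sketched as the difference of two per-grid-cell averages, one taken over PL events and one over pseudo-events. The array layout (event, lat, lon) and the sample values are hypothetical.

```python
import numpy as np


def composite_difference(index_pl_events, index_pseudo_events):
    """Mean index over PL events minus mean index over pseudo-events.

    Both inputs have shape (n_events, ny, nx); the result has shape (ny, nx).
    """
    mean_pl = np.mean(np.asarray(index_pl_events, dtype=float), axis=0)
    mean_pseudo = np.mean(np.asarray(index_pseudo_events, dtype=float), axis=0)
    return mean_pl - mean_pseudo


# Two events each on a 1 x 2 grid (values in hPa, made up).
index_pl = np.array([[[300.0, 100.0]], [[400.0, 100.0]]])
index_pseudo = np.array([[[100.0, 100.0]], [[200.0, 100.0]]])
diff_map = composite_difference(index_pl, index_pseudo)
```

In this toy example only the first grid cell shows a difference between PL events and pseudo-events.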
For systematically testing the performance of index values in distinguishing PL occurrence in the observational STARS data, [...]. By applying thresholds to the index values, we obtain binary indices for all grid-cells during all events (i.e. for observed PLs and for randomly chosen pseudo-events when no PL was observed).
We then test the performance in distinguishing PL occurrence based on these binary indices. In the following paragraphs, we first inspect the performance of indices in distinguishing the time of occurrence of PLs (task 1) and second the performance in distinguishing both the time and location of PLs (task 2).

Task 1: Distinguish the time of PL occurrence
Figure 7. [...] (Eq. 6) that maximize Youden's index (Youden, 1950). These are used for calculating the accuracy of indices in distinguishing times and areas of PLs (see text and Table 2).

For distinguishing the time of occurrence of PLs, we use the following classification (summary in Appendix E, Table E1): (i) if the binary index values are positive ($M_i = 1$) anywhere in the geographical domain of interest at the time of the event, we define this as a "prediction" that a PL occurs at this time anywhere in the domain (true positives: a PL occurs during that time; false positives: a PL does not occur during that time); (ii) if the binary index, $M_i$, is zero everywhere in the domain, we define this as a "prediction" that no PL occurs (true negatives: no PL occurs during that time; false negatives: a PL occurs during that time).
By computing sensitivity and specificity values for the set of critical threshold values, we obtain one ROC curve for each index (see Fig. 7). The fraction of PL events that an index correctly identifies defines the true positive rate (sensitivity), and the fraction of non-events that it incorrectly identifies as PL events defines the false positive rate (1 − specificity). For the ROC analysis, the two rates are then plotted against each other (see e.g., Fawcett, 2006). A perfect true positive rate of 1.0 would imply that all PL events are identified correctly. In our experimental setup with 50% observed PL events and 50% non-events, an index needs an area under the ROC curve (AUC score) higher than 0.5 to perform better than random chance and to count as a potentially useful proxy for PLs. All three diagnostic indices perform substantially better than random chance, suggesting that they could be useful for indicating the time of occurrence of PLs (AUC values of 0.78, 0.80 and 0.83, for the conventional MCAO index, the new PL index and the new MCAO index, respectively).
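The construction of a ROC curve and its AUC can be sketched as follows. Each event is reduced to a single score (for task 1 this could be the maximum index value in the domain, so that "positive anywhere" is equivalent to "score above threshold"), and the true and false positive rates are computed for a list of thresholds. The scores, thresholds and the assumption that larger values indicate PLs are invented for illustration; for an index where smaller values indicate PLs, the inequality would be reversed.

```python
import numpy as np


def roc_points(scores, is_pl, thresholds):
    """False and true positive rates for each threshold."""
    scores = np.asarray(scores, dtype=float)
    is_pl = np.asarray(is_pl, dtype=bool)
    fpr, tpr = [], []
    for thr in thresholds:
        predicted = scores > thr
        tpr.append(np.sum(predicted & is_pl) / np.sum(is_pl))
        fpr.append(np.sum(predicted & ~is_pl) / np.sum(~is_pl))
    return np.array(fpr), np.array(tpr)


def auc_trapezoid(fpr, tpr):
    """Area under the ROC curve via the trapezoidal rule."""
    order = np.lexsort((tpr, fpr))          # sort by fpr, then by tpr
    x = np.concatenate(([0.0], fpr[order], [1.0]))
    y = np.concatenate(([0.0], tpr[order], [1.0]))
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))


# Four hypothetical PL events followed by four pseudo-events.
scores = [400, 350, 300, 200, 380, 150, 100, 250]
is_pl = [True, True, True, True, False, False, False, False]
thresholds = [50, 175, 275, 325, 375, 450]

fpr, tpr = roc_points(scores, is_pl, thresholds)
auc = auc_trapezoid(fpr, tpr)
```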
From the set of threshold values, $M_i^{\mathrm{crit}}$, which are used in the ROC analysis for distinguishing the time of PLs, we select as the best threshold the one that maximizes Youden's index (Youden, 1950), which is defined as sensitivity + specificity − 1.
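Selecting the threshold by Youden's index reduces to an argmax over sensitivity + specificity − 1. The following sketch uses made-up sensitivity and specificity values purely to illustrate the selection step; the function name is hypothetical.

```python
import numpy as np


def best_threshold_youden(thresholds, sensitivity, specificity):
    """Return the threshold maximizing J = sensitivity + specificity - 1."""
    j = np.asarray(sensitivity, dtype=float) + np.asarray(specificity, dtype=float) - 1.0
    best = int(np.argmax(j))
    return thresholds[best], float(j[best])


# Invented values for four candidate thresholds (not results of the study).
thresholds = [100, 200, 300, 400]
sensitivity = [0.95, 0.80, 0.70, 0.40]
specificity = [0.30, 0.60, 0.85, 0.95]

best_thr, best_j = best_threshold_youden(thresholds, sensitivity, specificity)
```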

The best threshold for the conventional MCAO index to distinguish times of occurrence of PLs is $M_\theta^{\mathrm{crit}} = 8$ K. For the new MCAO index, it is $M_p^{\mathrm{crit}} = 390$ hPa, and for the PL index, it is $M_{tr}^{\mathrm{crit}} = 250$ hPa (as shown in Fig. 7a). Using these threshold values we compute the accuracy score, which is defined as the sum of true positives (TP) and true negatives (TN) divided by the number of all events (see e.g., Tharwat, 2021), for measuring how well the diagnostic indices with the selected binary classifier perform in distinguishing the times of PLs. In our experimental setup, with the same number of PL events as non-PL "pseudo-events", an [...]

Results of the ROC analyses show that the conventional MCAO index performs poorly (close to random chance) in distinguishing the time and location of PLs (AUC value of 0.52). This underlines that the magnitude of the conventional MCAO index is not useful for identifying times and locations at risk for PLs. In contrast, the newly introduced indices perform substantially better in distinguishing the time and location of PLs compared with the conventional MCAO index (AUC values of 0.67 and 0.74 for the new PL index and the new MCAO index, respectively). Interestingly, the new MCAO index performs better than the new PL index, despite being simpler and requiring less meteorological input data (the PL index requires the potential vorticity field as an additional 3-D data field). The best threshold value for the new MCAO index that maximizes the sum of sensitivity and specificity in distinguishing the times and locations of PLs in the ROC analysis is $M_p^{\mathrm{crit}} = 330$ hPa (thresholds for the other indices in Fig. 7b). With this threshold, we obtain a sensitivity of 0.78, a specificity of 0.58, and an accuracy of 0.67.
In light of the complexity of PL genesis and the simplicity of the new diagnostic index, the aforementioned performance is noteworthy (though still far from maximally attainable reference values of the respective performance metrics). [...] as it exhibits an association with the times and locations of observed PLs, and hence may serve as a simple proxy for PL occurrence. The new index incorporates more information about the 3-D structure of cold air intrusions from the Arctic and is more skilful in identifying areas and timing favourable for PL development.

Determining a region-specific characteristic pressure level from observational data
The new MCAO index was designed to be a simple metric that requires processing of only one 3-D input data field (air temperature), so that it can be used in computationally expensive, long-term assessments that require processing large amounts of data.
However, compared with the 2-D conventional MCAO index, the new MCAO index has the disadvantage that it requires more input data. This means that its use in, e.g., predictability studies that compare multiple data sets might be computationally challenging. In this section, we address this disadvantage of the new MCAO index, along with the disadvantage of the conventional MCAO index regarding the subjective element in the choice of a characteristic pressure level. Both of these challenges can be addressed by determining a characteristic pressure level for the conventional MCAO index from observational data about PLs.
Determining the characteristic pressure level for the conventional MCAO index from observational data results in a region-specific MCAO index that has the same form as the conventional MCAO index,

$m^{\mathrm{crit}}_{\theta} = \theta_{\mathrm{skin}} - \theta_{p_{\mathrm{crit}}}$, (7)

but is based on the critical characteristic pressure level, $p_{\mathrm{crit}}$, that maximizes the link to observed PLs. The critical characteristic pressure level can be determined directly from the best classifier ($M^{\mathrm{crit}}_{p}$) for the new MCAO index as $p_{\mathrm{crit}} = p_0 - M^{\mathrm{crit}}_{p}$.

The region-specific MCAO index, $m^{\mathrm{crit}}_{\theta}$, has the advantage that it is computationally cheaper than the new MCAO index, $m_p$, as it can be calculated from 2-D meteorological data fields only, at the ocean surface and at $p_{\mathrm{crit}}$, while maintaining the same skill in distinguishing the times and locations of PLs as the new MCAO index. Table 1 summarizes key differences between the region-specific MCAO index and the other diagnostics considered in this study.
The equivalence in skill to distinguish PLs from non-PLs of the region-specific index, $m^{\mathrm{crit}}_{\theta}$, and the new MCAO index, $m_p$, is evident from the definition of the indices. Similar to the conventional MCAO index, the region-specific index has positive values if the potential skin temperature is larger than the potential temperature aloft. The region-specific MCAO index takes on positive values in all grid-cells in which the new MCAO index, $m_p$, is higher than the threshold value, $M^{\mathrm{crit}}_{p}$. This is the case because in these grid-cells the critical pressure level, $p_{\mathrm{crit}}$, is vertically located below the upper boundary of the MCAO. The upper boundary of the MCAO, $p^*$, is defined as the vertical level at which potential temperature aloft equals potential skin temperature. As potential temperature increases with height in the top vertical layers of the MCAO, the potential temperature at the critical pressure level below the upper boundary of the MCAO is lower than the potential temperature at the upper boundary, and hence also lower than the potential skin temperature, which means that the region-specific MCAO index is positive in these grid-cells (in summary: if $m_p > M^{\mathrm{crit}}_{p}$, then $p_{\mathrm{crit}} > p^*$, which implies that $\theta_{p_{\mathrm{crit}}} < \theta_{p^*} = \theta_{\mathrm{skin}}$, so that $m^{\mathrm{crit}}_{\theta} > 0$). This was confirmed by numerical analysis (see Appendix D, Fig. D1). As the skill for distinguishing PLs from non-PLs is based on the binary index values obtained via the critical thresholds (see Eq. 6), the skill based on $m_p > M^{\mathrm{crit}}_{p}$ is equivalent to the skill based on $m^{\mathrm{crit}}_{\theta} > 0$.
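The chain of implications above can be checked on an idealized profile. The following sketch rests on several assumptions that are not taken from the analysis code of this study: the new MCAO index is written as $m_p = p_0 - p^*$, the region-specific index as $\theta_{\mathrm{skin}} - \theta_{p_{\mathrm{crit}}}$, and the potential temperature profile is synthetic and strictly increasing with height. All function names are hypothetical.

```python
import numpy as np

P0 = 1013.25          # reference surface pressure (hPa)
M_CRIT_P = 330.0      # critical threshold of the new MCAO index (hPa)
P_CRIT = P0 - M_CRIT_P

# Synthetic profile (assumption): pressure decreases and potential temperature
# increases strictly with height, as required by the argument in the text.
p_levels = np.linspace(P0, 300.0, 200)
dp = P0 - p_levels
theta = 260.0 + 0.03 * dp + 1.0e-5 * dp**2

def new_mcao_index(theta_skin):
    """Assumed form m_p = p0 - p*, with p* the level where theta equals theta_skin."""
    # theta is ascending, so it can serve as the interpolation coordinate.
    # Below the lowest theta value, np.interp returns P0, which gives m_p = 0.
    p_star = np.interp(theta_skin, theta, p_levels)
    return P0 - p_star

def regional_mcao_index(theta_skin):
    """Assumed form m_theta^crit = theta_skin - theta(p_crit)."""
    # np.interp needs ascending x values, hence the reversed arrays.
    theta_at_p_crit = np.interp(P_CRIT, p_levels[::-1], theta[::-1])
    return theta_skin - theta_at_p_crit

# Random potential skin temperatures (K), covering cases with and without MCAO.
rng = np.random.default_rng(1)
theta_skin = rng.uniform(255.0, 285.0, 1000)

flag_new = new_mcao_index(theta_skin) > M_CRIT_P
flag_regional = regional_mcao_index(theta_skin) > 0.0
print("identical classification:", bool(np.array_equal(flag_new, flag_regional)))
```

Because both flags are derived from the same monotonic profile, they select the same cases. In real data, differences can arise where the profile is not monotonic or where the two indices are computed on different vertical grids.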
The region-specific MCAO index introduced in this section with a threshold of 0 can be used just as well as the new MCAO index with a threshold of $M^{\mathrm{crit}}_{p}$ for distinguishing the times and locations of observed PLs. While the procedure of determining the region-specific MCAO index requires processing of the 3-D potential temperature field, this procedure only has to be conducted once in a baseline study for a specific geographical region. The critical characteristic pressure level obtained here is $p_{\mathrm{crit}} = p_0 - M^{\mathrm{crit}}_{p} = 1013.25 - 330 = 683.25$ hPa for the geographical region of the Barents and the Nordic Seas. For other geographical regions and observational data about PLs, the calculation should be repeated to account for regional differences.
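As a small worked example, the critical pressure level and the closest standard pressure level can be computed as follows. The list of levels is an assumed subset of the standard ERA5 pressure levels, and the function names are hypothetical. Using the nearest available level instead of $p_{\mathrm{crit}}$ itself is an approximation.

```python
P0 = 1013.25      # reference surface pressure (hPa)
M_CRIT_P = 330.0  # critical threshold of the new MCAO index (hPa)

# Assumed subset of the standard ERA5 pressure levels (hPa), troposphere only.
ERA5_PRESSURE_LEVELS = [
    1000, 975, 950, 925, 900, 875, 850, 825, 800, 775,
    750, 700, 650, 600, 550, 500, 450, 400, 350, 300,
]

def critical_pressure_level(m_crit_p, p0=P0):
    """Return p_crit = p0 - M_crit_p (hPa)."""
    return p0 - m_crit_p

def nearest_level(pressure, levels):
    """Return the available pressure level closest to the given pressure."""
    return min(levels, key=lambda level: abs(level - pressure))

p_crit = critical_pressure_level(M_CRIT_P)
nearest = nearest_level(p_crit, ERA5_PRESSURE_LEVELS)
print(f"p_crit = {p_crit:.2f} hPa, nearest level = {nearest} hPa")
```

For a different region, only `M_CRIT_P` would change, after repeating the threshold selection with regional PL observations.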
The resulting region-specific MCAO index, with the parameter p crit "fitted" to empirical data, is computationally cheap and thus feasible for use in climatological assessments and for quick operational risk assessments as part of marine services.
In our analysis, we consider a set of pseudo-events for analysing the behaviour of indices during randomly selected weather conditions. A broader climatological assessment, computing hourly indices for time-intervals of several decades, is beyond the scope of this study. Considering that PLs are rare events, the association between areas and times with high index values and areas and times of observed PLs might be overestimated. Future studies could build on this analysis, for example, by analysing longer PL data sets, such as provided in Rojo et al. (2019), taking into account different geographical areas and considering alternative performance metrics, such as the extremal dependence index (EDI, Wulff and Domeisen, 2019). If the performance of diagnostic indices shown in this study for a limited time-interval (2002-2011) and geographical region proves to be robust in other times and regions, then the new indices might be a useful complement for marine services on PLs.
The methods for 3-D interactive visual analysis of ERA5 data introduced here (Sect. 2.2, Sect. 3.1) are publicly available (Met.3D - Homepage, 2021; Met.3D - Documentation, 2021; Met.3D - Documentation ERA5, 2021) and can be used generically for interactive visual analysis of meteorological phenomena resolved in ERA5 data. We see great potential in using methods for interactive 3-D visual data exploration during the explorative phases of scientific workflows, as performed in this study, for detailed meteorological case-analyses, diagnosis of model simulations and development of new hypotheses.
The Python and bash scripts for pre- and post-processing, statistical analyses and visualizations are available upon request.
Video supplement. The following movies illustrate interactive visual data analysis using Met.3D and provide supplementary insights into the 3-D dynamics of MCAOs and PLs in ERA5:

-Movie 1: Interactive visual data analysis of a Marine Cold Air Outbreak in ERA5 data.
-Movie 2: Interactive visual data analysis of a Polar Low in ERA5 data.
-Movie 3: Interactive visual data analysis of the characteristic pressure level in the conventional MCAO index.

Research Centre (SFB/TRR165) "Waves to Weather". We would like to thank the German Climate Computing Center (DKRZ) for providing an excellent research infrastructure, and Frank Toussaint for guidance regarding the ERA5 data archive. Many thanks to Daniel Runow from the Basis-Infrastruktur Group at the Regional Computing Center, Universität Hamburg, for maintaining the virtual GPU setup that was used for conducting parts of the analyses described here. We highly appreciate the detailed comments from the referees during the review process, which have greatly improved the manuscript.

(b) Regional index ($m^{\mathrm{crit}}_{\theta}$, Eq. 7). Orange area: the regional index is larger than zero. As described in Sect. 3.3.4, the critical pressure level for computation of the regional index can be obtained as $p_{\mathrm{crit}} = p_0 - M^{\mathrm{crit}}_{p} = 1013.25 - 330 = 683.25$ hPa. The comparison provides a numerical example to illustrate that $m_p > M^{\mathrm{crit}}_{p}$ implies $m^{\mathrm{crit}}_{\theta} > 0$ (assuming that potential temperature increases with height in the top layers of MCAOs). This means that the skill of the new MCAO index in distinguishing times and locations of PLs with a critical threshold of 330 hPa also holds for the regional MCAO index with a critical threshold of 0, because the classification is based on the dichotomized index values (Eq. 6, Sect. 3.3.3). Note that for the example illustrated here we compute the regional index based on ERA5 data on the nearest pressure level (700 hPa ≈ 683.25 hPa) to show that this may be used as a computationally cheaper approximation. The new MCAO index is calculated based on ERA5 data on model levels with interpolation to the pressure level where $\theta_{p^*} = \theta_{\mathrm{skin}}$. This explains the small deviations between the two orange areas.

Table E1. Confusion matrix summarizing the classification scheme for task 1. In task 1 we test the performance of indices in distinguish-

Table E2. Confusion matrix summarizing the classification scheme for task 2. In task 2 we test the performance of indices in distinguishing the time and location of PL occurrence.