The importance of model resolution on simulated precipitation in Europe – from global to regional model

Precipitation, and especially extreme precipitation, is a key climate variable as it affects large parts of society. It is difficult to simulate in a climate model because of its large variability in time and space. This study investigates the importance of model resolution for simulated precipitation in Europe across a wide range of climate model ensembles: from global climate models (GCMs) at a horizontal resolution of around 300 km to regional climate models (RCMs) at a horizontal resolution of 12.5 km. The aim is to investigate the differences between models and model ensembles, but also to evaluate their performance compared to gridded observations from E-OBS. Model resolution has a clear effect on precipitation. Generally, extreme precipitation is more intense and more frequent in high-resolution models than in low-resolution models. Models of low resolution tend to underestimate intense precipitation. This is improved in high-resolution simulations, but there is a risk that high-resolution models overestimate precipitation. This effect is seen in all ensembles, and GCMs and RCMs of similar resolution give similar results. The number of precipitation days, which is governed more by the large-scale atmospheric flow, does not depend on model resolution, while the number of days with heavy precipitation does. The difference between different models is often larger than that between the low- and high-resolution versions of the same model, which makes it difficult to quantify the improvement. In this sense the quality of an ensemble depends more on the models it consists of than on the average resolution of the ensemble. Furthermore, the difference in simulated precipitation between an RCM and the driving GCM depends more on the choice of RCM and less on the downscaling itself, as different RCMs driven by the same GCM may give different results.
The results presented here are in line with previous similar studies, but this is the first time an analysis like this has been done across such relatively large model ensembles of different resolutions, and with a method that examines all parts of the precipitation distribution.


Introduction
Precipitation is a key climate variable affecting the environment and human society in different ways and on different temporal and spatial scales. In particular, precipitation extremes (heavy precipitation events) may lead to extensive damage caused by floods or landslides, while the absence of precipitation may cause droughts and has an impact on water and hydropower supply. In recent decades there has therefore been extensive study, and considerable advancement in our understanding, of the response of extreme precipitation to climate change (O'Gorman, 2012; Kharin et al., 2013; Donat et al., 2016; Pfahl et al., 2017). For example, it is widely held through theoretical considerations and model experiments that extremes will respond differently than mean precipitation (e.g. Allen and Ingram, 2002; Pall et al., 2007).
https://doi.org/10.5194/wcd-2020-31 Preprint. Discussion started: 6 August 2020 © Author(s) 2020. CC BY 4.0 License.
Still, the simulation of precipitation in climate models is challenging because of the wide range of processes involved that act and interact on widely different temporal and spatial scales. In order to accurately represent precipitation, models need skill in simulating (1) the large-scale circulation, (2) the interaction of the flow with the surface, and (3) convection and cloud processes. With the typical horizontal grid resolution of O(100 km) of global climate models (GCMs), factor (1) can to a large extent be properly represented, but less so factors (2) and (3). In particular, convection is not resolved and needs to be treated statistically with convection parameterizations. As the range of resolved scales is broadened by decreasing the horizontal grid spacing, the simulation of precipitation generally improves. This is achieved through a more realistic representation of surface characteristics (such as complex topography and coastlines) and through more accurate solution of the equations of motion, resulting in more accurate horizontal moisture transport and moisture convergence (Giorgi and Marinucci, 1996; Gao et al., 2006; Prein et al., 2013a). Indeed, GCMs with ~25-50 km grid spacing show promise for improving the simulation of precipitation (Haarsma et al., 2016; Baker et al., 2019). In addition to model resolution, model skill is sensitive to the choice of convection parameterization, which can affect not only the occurrence and amount of precipitation but also its onset timing and location (e.g. Dai et al., 1999).
Dynamical downscaling of GCMs with regional climate models (RCMs) allows for even finer grids, which leads to more detailed information on, and further improvements in, regional and local climate features, for example spatial patterns and distributions of precipitation in areas of complex terrain (Rauscher et al., 2010; Di Luca et al., 2011; Prein et al., 2013b).
This can also have important implications for climate change signals. Giorgi et al. (2016) found that an ensemble of RCMs at ~12 km resolution consistently showed an increase in summer precipitation over the Alps region, in contrast to the forcing GCMs, which instead showed a decrease. The different responses were attributed to increased convective rainfall in the RCMs due to enhanced potential instability caused by surface heating and moistening at high altitudes, which is not captured by the GCMs. RCMs are constrained by the lateral boundary conditions provided by the forcing GCM, and studies of RCM ensembles have shown that the choice of forcing GCM introduces the major part of the overall uncertainty in regional climate (e.g. Déqué et al., 2007; Kjellström et al., 2011). This effect is relatively more important for large-scale precipitation systems, for example frontal systems associated with extra-tropical cyclones. In seasons and regions where smaller-scale processes like convection dominate, for example in summer over mid-latitudes, simulated precipitation depends to a larger degree on the RCM itself, in terms of grid resolution and sub-grid-scale parameterizations.
Even at grid spacings of around 10 km convection is not resolved by the model dynamics and needs to be parameterized.
However, models with parameterized convection often exhibit common biases in certain precipitation characteristics (Liang, [...] resolution often suffer from giving precipitation over too large an area compared to observations, and usually also too many days with weak precipitation (the "drizzle" problem) (e.g. Dai, 2006; Stephens et al., 2010). Another deficiency is that convection starts too early (e.g. Dai and Trenberth, 2004; Dai, 2006; Brockhaus et al., 2008). At sufficiently high resolution (< 4 km) models start to largely resolve deep convection, enabling the parameterization to be turned off; these are the so-called "convection-permitting" regional climate models (CPRCMs) (Prein et al., 2015). CPRCM simulations are widely shown to reduce, at least to some extent, these biases, most evidently by improving the match of the diurnal cycle to observations (e.g. Prein et al., 2013a; Ban et al., 2014; Brisson et al., 2016; Gao et al., 2017; Leutwyler et al., 2017; Belušić et al., 2020) and by better representing sub-daily high-intensity precipitation events (e.g. Ban et al., 2014; Kendon et al., 2014; Fosser et al., 2015; Lind et al., 2020) than models with parameterized convection. A major drawback of these high-resolution climate models is the very high computational cost, which is why their use in ensembles has only recently emerged (Coppola et al., 2018).
The goal of this study is to examine how precipitation in Europe depends on model resolution and type of model, from RCMs of relatively high resolution to GCMs of standard resolution. CMIP5 (Coupled Model Intercomparison Project phase 5, Taylor et al., 2012) GCMs of standard resolution are compared with newer GCMs which participated in the HighResMIP (Haarsma et al., 2016) experiment within the H2020 EU project PRIMAVERA.
These models are: ECMWF-IFS (Roberts et al., 2018), HadGEM3-GC31 (Roberts et al., 2019), MPI-ESM1.2 (Gutjahr et al., 2019), CNRM-CM6.1 (Voldoire et al., 2019) and EC-Earth3P (Haarsma et al., 2020). Furthermore, the first results from the CMIP6 (Coupled Model Intercomparison Project phase 6, Eyring et al., 2016) GCMs are included in the analysis. The GCMs are compared with RCMs from CORDEX (COordinated Regional Downscaling EXperiment, Gutowski et al., 2016). This allows for comparisons of different generations of models, of global versus regional models, and of the impact of model horizontal grid resolution. For a few cases, the same model version has been applied at two different grid resolutions, which allows for investigating the impact of resolution alone. The simulated precipitation is analysed both in terms of precipitation intensity distributions and through a collection of standard precipitation-based indices.
The aims of this study are to:
i. Investigate to what extent a large number of global and regional climate models can reproduce observed precipitation climatologies and characteristics over Europe.
ii. Investigate how model horizontal grid resolution in either global or regional models affects the simulated precipitation in Europe: are there systematic differences and, if so, do they persist across different parts of Europe and different seasons?

Method
The models used in this study are a selection of CMIP5 global models (~100-300 km resolution); the high (~40-80 km) and low (~80-160 km) resolution versions of the PRIMAVERA global models and the first models from CMIP6 (~100-300 km); and a selection of CORDEX regional models (at 12.5 and 50 km resolution). The low-resolution versions in each model ensemble are called LR, and the high-resolution versions HR. Note that the full CMIP5, CMIP6 and CORDEX ensembles are not used, but rather "ensembles of opportunity" for which daily precipitation was easily available. To evaluate the models we compare them with data from E-OBS version 19.0e (Cornes et al., 2018). Table 1 lists the ensembles used. The simulated precipitation for all models is analysed over the PRUDENCE regions in Europe (Fig. 1).
Prior to analysis all grid points over sea are filtered out, and then for each region and model we calculate precipitation characteristics for all remaining land grid points.
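The land-only selection described above can be sketched as follows. This is a minimal illustration, not the processing chain used in the study: the function name `land_points`, the dictionary-based data layout and the 0.5 land-fraction threshold are all assumptions made for the example.

```python
def land_points(precip, land_fraction, region_mask, threshold=0.5):
    """Return {grid_point: daily series} for land points within a region.

    precip        : dict mapping (i, j) -> list of daily precipitation (mm/day)
    land_fraction : dict mapping (i, j) -> land fraction in [0, 1]
    region_mask   : set of (i, j) grid points belonging to the region
    threshold     : assumed cut-off above which a cell counts as land
    """
    return {
        point: series
        for point, series in precip.items()
        if point in region_mask and land_fraction[point] > threshold
    }


# Toy data: three grid points, one at sea and one outside the region.
precip = {(0, 0): [0.0, 3.2], (0, 1): [1.1, 0.0], (1, 0): [5.0, 0.4]}
land_fraction = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.7}
region = {(0, 0), (0, 1)}

selected = land_points(precip, land_fraction, region)
print(sorted(selected))  # [(0, 0)]
```

Because the selection is done on each model's own grid points, no regridding is needed before the regional statistics are computed.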
To investigate the effect of model grid resolution on the full distributions of daily precipitation intensities, we use the ASoP (Analysing Scales of Precipitation) method (Klingaman et al., 2017; Berthou et al., 2018). ASoP involves splitting precipitation distributions into bins of different intensities and then provides information on the contribution from each precipitation intensity separately to the total mean precipitation rate (i.e. given by all intensities taken together). The ASoP method is here applied to each model grid point, and the results are then averaged over the desired regions. In the first step, precipitation intensities are binned in such a way that each bin contains a similar number of events, with the exception of the most intense events, which are rare. The actual contribution (in mm) of each bin to the mean precipitation rate is obtained by multiplying the frequency of events in the bin by the mean precipitation rate of the bin. The sum of the actual contributions from all bins gives the total mean precipitation rate. Then, the fractional contribution (in %) of each bin is obtained by dividing the actual contribution by the total mean precipitation rate. The sum of all fractional contributions is equal to one (100 %), so the information provided by the fractional contributions is predominantly about the shape of the distribution. The result of the ASoP analysis is a distribution for each model showing the probability of different precipitation intensities based on daily precipitation. The ASoP distributions of all analysed models are used to compare model behaviour and performance, in particular to see how changing the grid resolution affects different parts of the distribution, for example whether contributions from low and high precipitation intensities differ. Taking the absolute differences between two fractional distributions and summing over all bins gives a measure of the difference in the shapes of the precipitation distributions.
This is here called the "Index of fractional contributions".
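The calculation of actual and fractional contributions, and of the index of fractional contributions, can be sketched in a few lines for a single grid point. This is an illustrative sketch only: the function names are hypothetical, and the fixed bin edges used in the example are an assumption (the method as described chooses bins so that each holds a similar number of events).

```python
def asop_contributions(daily_precip, bin_edges):
    """Actual (mm/day) and fractional contributions per intensity bin.

    daily_precip : list of daily precipitation values at one grid point (mm/day)
    bin_edges    : increasing bin edges; bin k covers [edges[k], edges[k+1])
    """
    n_days = len(daily_precip)
    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins
    for value in daily_precip:
        for k in range(n_bins):
            if bin_edges[k] <= value < bin_edges[k + 1]:
                sums[k] += value
                break
    # frequency * bin-mean intensity = (count / n_days) * (sum / count) = sum / n_days
    actual = [s / n_days for s in sums]
    total = sum(actual)
    fractional = [a / total for a in actual] if total > 0 else [0.0] * n_bins
    return actual, fractional


def fractional_index(frac_a, frac_b):
    """Sum over bins of the absolute difference between two fractional distributions."""
    return sum(abs(a - b) for a, b in zip(frac_a, frac_b))


# Assumed, purely illustrative bin edges (mm/day) and toy daily series.
edges = [0.0, 1.0, 5.0, 20.0, float("inf")]
lr = [0.0, 0.5, 2.0, 4.0, 8.0, 0.0, 1.5, 3.0, 6.0, 0.0]
hr = [0.0, 0.5, 1.0, 2.0, 25.0, 0.0, 0.5, 1.0, 0.0, 0.0]

actual_lr, frac_lr = asop_contributions(lr, edges)
actual_hr, frac_hr = asop_contributions(hr, edges)
index = fractional_index(frac_lr, frac_hr)
print(actual_lr, index)
```

The actual contributions sum to the mean precipitation rate of the series, while the index depends only on the shape of the two distributions: it is zero for identical shapes and at most two when the distributions do not overlap at all.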
In addition to ASoP, a number of indices based on daily precipitation are calculated for the same regions (Table 2 lists the indices and their definitions). All daily precipitation values for all grid points within a region (land points only) are used to calculate the indices, so that each grid point gets one value of the index representing the time period. These values are then pooled to calculate percentiles representing the region for each model. In this way we do not have to interpolate the models to a common grid, but can study them on their native grids. This is an advantage, since interpolation may alter precipitation characteristics (Klingaman et al., 2017). It also means that the calculated model spread reflects geographical and not temporal variability. These percentiles are used in the box plots (Sect. 3).
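As a sketch of this two-step procedure (one index value per grid point, then percentiles pooled over the region), consider the following. The index definitions are assumptions for illustration, since Table 2 is not reproduced here: a wet day is taken as a day with at least 1 mm, R20mm counts days with at least 20 mm, and Rx1day is the maximum over the whole series. The function names and the `n_years` argument are likewise hypothetical.

```python
import statistics

WET_DAY = 1.0  # mm/day; assumed wet-day threshold


def grid_point_indices(daily_precip, n_years=1):
    """One value per index for a single grid point over the analysed period."""
    wet = [p for p in daily_precip if p >= WET_DAY]
    return {
        "RR1": len(wet) / n_years,                                      # wet days per year
        "R20mm": sum(1 for p in daily_precip if p >= 20.0) / n_years,   # heavy days per year
        "SDII": sum(wet) / len(wet) if wet else 0.0,                    # mean wet-day intensity
        "Rx1day": max(daily_precip),                                    # maximum one-day amount
    }


def regional_percentiles(series_by_point, index_name):
    """Pool the grid-point values of one index and return regional percentiles."""
    values = [grid_point_indices(s)[index_name] for s in series_by_point.values()]
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    return {"p10": deciles[0], "p50": deciles[4], "p90": deciles[8]}


# Toy region with three land grid points on the model's native grid.
region = {
    "a": [0.0, 2.0, 25.0, 0.5],
    "b": [1.0, 1.0, 0.0, 0.0],
    "c": [30.0, 22.0, 5.0, 0.0],
}
indices_a = grid_point_indices(region["a"])
rx1day = regional_percentiles(region, "Rx1day")
print(indices_a, rx1day)
```

Since the percentiles are taken across grid points rather than across time, their spread describes spatial variability within the region, which is the quantity shown in the box plots.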
The ASoP analysis provides information on how a range of precipitation intensities contribute to the total precipitation amount. Figure 2 presents results for annual daily precipitation over four of the PRUDENCE regions: Scandinavia, mid-Europe, the Alps and the Mediterranean. In general, the model ensembles have higher amounts of precipitation compared to E-OBS, signified by larger contributions at low (< 2-3 mm day⁻¹) and moderate-to-high (> 5-10 mm day⁻¹) intensities. An exception is the CMIP6 ensemble, which instead shows lower contributions for moderate-to-high precipitation intensities, i.e. above 10-20 mm day⁻¹ (Scandinavia, mid-Europe and the Alps) or between 5 and 20 mm day⁻¹ (Mediterranean). This ensemble also tends to have the largest overestimates of contributions from low intensities (below c. 5 mm day⁻¹).

Another consistent feature is that the probabilities for the higher intensities (above c. 15 mm day⁻¹) increase with increasing grid resolution of the respective model ensemble, and consequently the contributions become increasingly larger than in E-OBS (Fig. 2). This is most evident for the Alps region, where the CMIP6 models (100-300 km resolution) clearly give smaller contributions than E-OBS and the PRIMAVERA models (40-160 km), the latter having smaller contributions than the CORDEX LR models (50 km) and the CORDEX HR models (12.5 km). The higher-resolution models peak at higher intensities and have wider distributions with more intense extreme precipitation. The sensitivity of precipitation amounts and variability to model grid resolution in areas with complex and steep topography (e.g. Prein et al., 2015) is most likely the main reason for the large differences between model ensembles in the Alps region. For example, the upper end of the CMIP6 distributions is around 30 mm day⁻¹, while the corresponding part in the CORDEX HR models is around 100 mm day⁻¹ (bottom right panel in Fig. 2).
It is worth noting that the reliability of ground-based observations in mountainous areas is relatively low, especially in wintertime when the fraction of snowfall, which is more sensitive to wind-induced undercatch, is largest (e.g. Yang et al., 2005; Rasmussen et al., 2012). Further, there are considerable issues with the spatial representation, as stations are often located in valleys. Therefore, E-OBS data (which are not bias-corrected for undercatch) almost certainly underestimate precipitation amounts in regions with steep topography, for example in the Alps (Prein and Gobiet, 2017).

Seasonal precipitation
Further insight can be gained by investigating seasonal differences. First of all, in all regions the spread within each model ensemble is larger in summer (JJA) than in winter (DJF) (see coloured shadings in Fig. 3). The difference can be related to the large-scale circulation. In winter it is dominated by the presence and higher activity of the North Atlantic storm track, with frequent passages of synoptic weather systems over Europe. These features are generally well represented in climate models, hence the larger consistency in the associated precipitation across models. In summer, on the other hand, synoptic activity is reduced and convective processes (either as isolated or organized systems, or embedded in larger-scale features like fronts) become more prominent in precipitation events. Sensitivity to model grid resolution and physics parameterizations (e.g. convection parameterization) is larger during this season. The larger summertime spread in the ensembles seen in Fig. 3 might then reflect larger uncertainties associated with model resolution and formulation. It is further noted that the ensemble spread does not increase as much from winter to summer over northern and north-western Europe, which is relatively more affected by synoptic-scale events during summer than the southern parts of Europe (not shown).
Model ensemble differences for all regions and seasons are summarized in Fig. 4, using the model ensemble with the highest grid resolution, CORDEX HR, as the reference. Common to all seasons and regions is that the CMIP ensembles consistently exhibit larger differences in fractional contributions (compared to CORDEX HR) than CORDEX LR and PRIMAVERA (larger fractional values on the y-axis). The differences are largest in the Alps (AL), the Iberian Peninsula (IP) and the Mediterranean (MD), while they are smallest in central and eastern Europe (Fig. 4). The former regions are characterized by complex and steep topography (e.g. the Alps and the Pyrenees) and/or by dry environments dominated by precipitation of a convective nature, which could to some extent explain the larger differences between the coarse CMIP GCMs and the higher-resolution PRIMAVERA GCMs and CORDEX RCMs. It is also noted that for all regions PRIMAVERA HR and CORDEX LR give comparable distributions, as they are of similar resolution.
Regional total seasonal precipitation amounts (averaged within each ensemble) are either mostly within ±20 % of CORDEX HR (e.g. eastern Europe, EA) or more strongly biased low (e.g. the Alps, AL) (x-axis in Fig. 4). An exception is the British Isles (BI), where most ensembles instead have higher total precipitation amounts than CORDEX HR [...] (MD), in contrast, there is a large spread between the ensembles, and further, the differences to E-OBS are not only the largest overall but also give more support to the coarser CMIP models (Fig. 4). This region has a large fraction of coastal grid points and also complex topography (Fig. 1), both factors contributing to uncertainties in the quality and representativeness of observational data.
On the model ensemble scale, it is only possible to make a qualitative assessment of the sensitivity to model grid resolution, since other aspects, such as differences in model formulation, can also contribute to differences in model performance. In other words, it cannot be definitely stated whether any difference in performance comes from higher resolution or from other differences in the model code. For the PRIMAVERA models, however, it is possible to directly compare the low- and high-resolution model versions. For some CORDEX models this is also possible, namely when the low- and high-resolution versions of the RCMs were forced by the same GCMs. This is the case for 8 RCM-GCM combinations (6 different RCMs driven by 4 different GCMs).
For both the PRIMAVERA and CORDEX models the high-resolution versions consistently simulate larger contributions from the higher precipitation intensities, above 10-20 mm day⁻¹ (Fig. 5). For some models this increase in high-intensity contributions is "compensated" by a decrease at lower intensities (1-10 mm day⁻¹), as seen for example in the AL region. In other regions, for example IP, this feature is not as prominent, and in some cases model responses seem relatively insensitive to grid resolution. Differentiating between seasons (not shown) reveals that in summer, high-resolution models most often have larger contributions for nearly all intensities, while in winter there is a stronger indication of a "phase shift" of the distributions, with larger contributions from high-intensity events compensated by lower contributions from more moderate events (below 10 mm day⁻¹). In most regions, the difference between low- and high-resolution models is also more pronounced in the CORDEX ensemble than in PRIMAVERA.
It is worth noting that the differences between different RCM simulations, and how they respond to differences in resolution, may very well be explained by the driving GCM and the state of the atmospheric general circulation in it (Kjellström et al., 2018; Sørland et al., 2018). Higher resolution is expected to give a better described and more detailed climate, with for example deeper cyclones and more intense local showers; in a sense, with more pronounced weather events.
If two models are in different states, for example when it comes to where storm tracks cross Europe, and if these states become more pronounced, this may lead to even larger model differences. Instead of a weak storm track in the south and a weak storm track in the north in the low-resolution models, we may now have strong storm tracks, which means that the difference between the models increases. To fully answer this question would require an analysis of the circulation patterns in the different models. This is not done here, but should be a topic for further studies.

Model ensemble comparison
When do intense precipitation events, which have become more frequent in the recent past and are projected to increase in the future, occur in the high-resolution models, that is, the kind of events that are rarely seen or absent in the low-resolution simulations? Figure 6 shows the number of precipitation days (RR1, Table 2) as simulated by all models for each PRUDENCE region. The number of precipitation days does not differ much between the model ensembles. There are clear differences between individual models, but it is difficult to establish any significant differences between the model ensembles. This is the case both for regions with more precipitation days (e.g. SC) and regions with fewer precipitation days (e.g. IP). All models show about the same number of precipitation events over the whole year, which may suggest that the large-scale weather patterns are not influenced much by higher resolution. The differences between ensembles are also small when looking at individual seasons. Note, however, that the large-scale circulation in the RCMs is to a large extent governed by the driving GCMs, which have typical resolutions of around 200 km. Most models overestimate the number of precipitation days compared to observations. It is a well-known feature of climate models, particularly those that use parameterized convection, that they tend to have too many wet days (e.g. Dai, 2006; Stephens et al., 2010).
The number of days with large precipitation amounts, above 10 mm day⁻¹ and 20 mm day⁻¹, increases with higher model resolution. For example, the number of days with precipitation over 20 mm (R20mm, Table 2) increases from just a few in CMIP5 to 5-10, or even more, in CORDEX HR (Fig. 7). The 10th to 90th inter-percentile range increases, due to a larger increase in the 90th percentile. Generally, the spread is larger for models with high resolution. This could partly be explained by the higher number of data points in the high-resolution models (i.e. the larger number of grid points); a high-resolution model is more likely to represent the spatial variations of precipitation within a region well, while in coarser-scale models precipitation fields are smoother due to fewer grid points. Compared to E-OBS, the average number of days with more than 10 mm day⁻¹ is more accurately simulated in the high-resolution ensembles, but the spread is highly exaggerated. The PRIMAVERA models have an average similar to E-OBS and also a more similar spread.
The fact that the number of wet days is similar between LR and HR models (Fig. 6), while the frequency of heavy (extreme) precipitation is higher in HR models (Fig. 7), suggests that the precipitation intensity on wet days is higher in the HR models. This is shown by the simple precipitation intensity index (SDII, Table 2, Fig. 8). SDII is indeed affected by resolution; the wet-day average precipitation is larger in the HR simulations than in the LR models, and the intra-model spread (spread between models within the ensemble) is also larger. For all regions, SDII is higher in the HR models.
The relative increase in SDII is perhaps higher in regions with large spatial variations (for example because of complex orography or coastlines), such as IP and AL. The median SDII values in the high-resolution models are in all regions closer to E-OBS than those of the low-resolution models, even though the spread is generally larger in the climate models than in E-OBS.
The higher intensities of extreme precipitation in high-resolution models compared to low-resolution models are also seen in the maximum one-day (Rx1day, Table 2, Fig. 9) and maximum five-day precipitation (not shown). There is a clear increase in both intensities and intra-model spread in the high-resolution models. It is debatable whether this increase is an improvement, since the CORDEX HR models give a maximum one-day precipitation that is significantly larger than in E-OBS. On the other hand, it is also debatable whether E-OBS is able to capture the extremes (Hofstra et al., 2009; Prein and Gobiet, 2017).

One-to-one comparison
We let the mid-Europe region (ME) represent the whole domain, as the same conclusions can be drawn for all regions, with only small differences in the number of models that give significant differences. A one-to-one comparison of the selected indices is made for the models for which there is both a low and a high grid resolution version (Fig. 10). The LR and HR versions are compared with Welch's t-test (Welch, 1947) at the 0.05 significance level to see if the simulated indices are significantly different. This corroborates the analysis above, and adds some further detail by quantifying the differences.
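For reference, the test statistic and degrees of freedom of Welch's unequal-variance t-test can be computed as in the sketch below. It assumes, for illustration only, that the two samples are the pooled grid-point values of one index from the LR and HR versions of a model; the function name and the toy numbers are hypothetical.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # unbiased sample variance
    var_b = statistics.variance(sample_b)
    se2_a, se2_b = var_a / n_a, var_b / n_b  # squared standard errors of the means
    t = (statistics.fmean(sample_a) - statistics.fmean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Toy R20mm values (days per year) at a handful of grid points.
lr_values = [3.0, 4.0, 5.0, 4.0, 4.0]
hr_values = [6.0, 8.0, 7.0, 9.0, 10.0]

t_stat, dof = welch_t(lr_values, hr_values)
print(t_stat, dof)
```

The p-value follows from a Student's t distribution with the computed degrees of freedom and is compared with the 0.05 level. In practice a library routine such as SciPy's `scipy.stats.ttest_ind` with `equal_var=False` carries out the same test and returns the p-value directly.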
Although the difference in the number of precipitation days (RR1, Fig. 10, top left) is significant for most models, it is not clear how it is affected by resolution. The differences are small (~10 days year⁻¹), and the sign of the difference is negative for some models and positive for others. The differences between different models are larger than the differences between resolutions. It is clear, however, that all models overestimate the number of precipitation days compared to E-OBS.
The number of days with more than 20 mm of precipitation (R20mm, Fig. 10, top right) is significantly different between HR and LR for all models and E-OBS. For the CORDEX models R20mm is higher in most HR versions, while the difference is less clear in the PRIMAVERA models. All simulations with the RCA4 RCM, regardless of the driving GCM, clearly show higher R20mm in the HR version than in the LR version, which indicates that the difference in the index is mainly a result of the changed grid resolution in the RCM. CORDEX LR is close to E-OBS, while CORDEX HR generally overestimates R20mm.

The simple precipitation intensity index (SDII, Fig. 10, bottom left) is significantly different in one out of four PRIMAVERA models and four out of nine CORDEX models. Differences are small, tenths of mm day⁻¹, for most models.
The maximum one-day precipitation (Rx1day, Fig. 10, bottom right) is significantly different in the HR version in all but one model (a PRIMAVERA model). The HR versions have higher precipitation values and larger spread in all but two PRIMAVERA models and one CORDEX model. The CORDEX HR models in particular have a higher maximum one-day precipitation. This seems to be driven by the RCM rather than the driving GCM. As an example, three RCMs are forced with the MPI-ESM-LR GCM. When forced by this GCM, Rx1day in the CCLM4-8-17 RCM is lower in the HR version, while in the REMO2009 and RCA4 RCMs Rx1day is higher in the HR version. In RCA4 the difference is particularly large, regardless of the driving GCM.

The one-to-one comparison of selected indices shows that there are significant differences between the LR and HR versions of the models. It also shows that for some indices the largest difference occurs between CMIP5/6 and PRIMAVERA HR, rather than between PRIMAVERA and CORDEX. This means that some of the differences seen in Figs. 6-9 are not as clear in Fig. 10. The comparison also shows that even though there are significant differences between LR and HR, it is in some cases difficult to establish significant differences between two ensembles, since the difference between two different models is often larger than the difference between the LR and HR versions of the same model.

Discussion and conclusions
This study investigates the importance of model resolution for the simulated precipitation in Europe. The aim is to investigate the differences between models and model ensembles, but also to evaluate their performance compared to gridded observations. In a similar study, Demory et al. (2020) compare PRIMAVERA models mainly with CORDEX LR, but to some extent also with CORDEX HR. They conclude that CORDEX indisputably improves the data from the driving CMIP5 models, but that the differences between CORDEX LR and PRIMAVERA are generally small. Both ensembles perform well, but tend to overestimate precipitation in winter and spring. The largest differences between the ensembles are found for high precipitation intensities, especially in summer, where PRIMAVERA gives less heavy precipitation, which makes it agree better with observations than CORDEX. Iles et al. (2020) compare the effect of resolution on extreme precipitation in Europe in GCMs and RCMs. They conclude that high-resolution models systematically give extremes that are heavier and more frequent. Our interpretation of this, given the results in our study, is that this may also mean that in some cases the overestimation of precipitation increases with higher resolution. The findings in this study support the conclusions from the above-mentioned studies, and add details based on a wider range of model ensembles and precipitation metrics.
The ASoP analysis in this study shows that all model ensembles have larger contributions from heavy precipitation in winter compared to E-OBS, and that the higher values become most prominent for the ensemble with the highest grid resolution, CORDEX HR. The biases compared to E-OBS are generally smaller in summer. The PRIMAVERA ensemble is in good agreement with observations and has a smaller bias than CORDEX for many regions.
CMIP6 mostly underestimates 315 contributions from moderate-to-high precipitation intensities in summer while overestimating low-intensity events. Overall, in the summer season, the spread is large between ensembles and between models within the ensembles. This is indicative of large uncertainties which are most likely related to uncertainties in how models are able to treat smaller scale precipitation events involving convection. With respect to E-OBS, the ASoP results partly show that high resolution does not necessarily mean better. However, in coastal regions and regions with steep or complex topography there are uncertainties in both 320 models and observations. Particularly in winter observations suffer from undercatch when precipitation falls as snow during windy conditions and in summer, smaller scale convective precipitation may be smoothed considerably or missed completely by ground rain gauges (which E-OBS is based on). Therefore, it is not always obvious which model or ensemble of models is closest to reality. https://doi.org/10.5194/wcd-2020-31 Preprint. Discussion started: 6 August 2020 c Author(s) 2020. CC BY 4.0 License.
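The idea of a fractional contribution can be illustrated with a minimal Python sketch: daily precipitation is sorted into intensity bins, and each bin's share of the total precipitation is computed, so that the shares sum to one. This is not the ASoP implementation used in the study; the function name `fractional_contributions`, the bin edges and the sample values are assumptions made for illustration only.

```python
from bisect import bisect_right


def fractional_contributions(daily_precip, bin_edges):
    """Return the fraction of total precipitation falling in each intensity bin.

    daily_precip: daily precipitation amounts (mm/day).
    bin_edges: increasing lower edges of the intensity bins (mm/day); the last
               bin is open-ended. Hypothetical edges, not the ASoP bin set.
    """
    totals = [0.0] * len(bin_edges)
    for amount in daily_precip:
        if amount < bin_edges[0]:
            continue  # below the lowest edge: treated as a dry day
        # Index of the last edge that is <= amount
        totals[bisect_right(bin_edges, amount) - 1] += amount
    grand_total = sum(totals)
    if grand_total == 0:
        return [0.0] * len(bin_edges)
    return [t / grand_total for t in totals]


# Illustrative values only, not data from the study
series = [0.0, 0.5, 2.0, 3.0, 12.0, 25.0, 0.0, 7.5]
edges = [1.0, 5.0, 10.0, 20.0]
fractions = fractional_contributions(series, edges)
```

In this toy series the bin of 20 mm/day and above holds about half of the total precipitation although it contains a single day, which is the kind of weight on heavy intensities that the comparison against E-OBS examines.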
It is clear that the horizontal resolution of a model has a large effect on precipitation, mostly on heavier precipitation. The number of precipitation days does not depend much on resolution, as it depends mostly on large-scale weather patterns and not so much on local topography and convection. For extreme precipitation events, which are more local and short-lived, model resolution is more important. A high-resolution model resolves such events better and distinguishes better between different parts of a region. Thus, extreme precipitation is more intense and more frequent in HR models than in LR models. With the same number of wet days, this means that precipitation intensifies so that the wet days get wetter. The impact of increased model grid resolution on precipitation is most evident for the coarser-scale models; increasing the resolution from CMIP5/6 to PRIMAVERA HR has a greater effect than increasing from CORDEX LR/PRIMAVERA HR to CORDEX HR. This does not, however, mean that increased resolution becomes less and less worthwhile; further refining the grid until convection-permitting resolutions are reached (less than ~5 km grid spacing), at which point convection parameterizations may be turned off, will have a large positive effect (e.g. Prein et al., 2015). This is not shown here, as the smallest grid spacing among the models in this study is 12.5 km. The effect of higher resolution is seen in regions with small amounts of precipitation as well as in regions with large amounts, and in regions with small and large geographical differences. The higher percentiles change more than the low percentiles for all studied indices. Increasing resolution has about the same effect on both GCMs and RCMs; furthermore, GCMs and RCMs of comparable resolution simulate comparable precipitation climates.
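Part of this resolution dependence follows from area averaging alone, which a minimal Python sketch can illustrate: averaging a row of fine grid cells into coarser cells conserves the mean precipitation but flattens local peaks. This is a simplification for illustration, not a description of how a coarse model behaves and not the regridding used in the study; the function `block_average` and all values are hypothetical.

```python
def block_average(values, factor):
    """Average consecutive groups of `factor` fine grid cells into one coarse cell."""
    if len(values) % factor != 0:
        raise ValueError("number of cells must be a multiple of factor")
    return [
        sum(values[i:i + factor]) / factor
        for i in range(0, len(values), factor)
    ]


# Hypothetical daily precipitation (mm) along a row of eight fine grid cells
fine = [0.0, 0.0, 40.0, 4.0, 0.0, 2.0, 0.0, 2.0]
coarse = block_average(fine, 4)  # two coarse cells, each covering four fine cells

mean_fine = sum(fine) / len(fine)
mean_coarse = sum(coarse) / len(coarse)

# Number of cells reaching a 20 mm threshold at each resolution
heavy_fine = sum(1 for p in fine if p >= 20.0)
heavy_coarse = sum(1 for p in coarse if p >= 20.0)
```

Here the mean is 6 mm in both cases, but the 40 mm peak on the fine grid becomes 11 mm on the coarse grid, so the 20 mm threshold is reached only at the finer resolution.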
Higher resolution does not necessarily mean better results. If a model is already too wet, the increase in heavy precipitation induced by the higher resolution means that the HR version agrees less with observations than the LR version. For an individual model it is possible to quantify the difference and improvement between LR and HR. At the ensemble level this is more difficult, because the difference between different models is often larger than that between the LR and HR versions of the same model. In this sense the quality of an ensemble depends more on the models it consists of than on the average resolution of the ensemble. Furthermore, when downscaling with an RCM, the simulated extreme precipitation, and the differences between GCM and RCM, depend more on the RCM used and less on the downscaling itself. The results presented here are in line with previous similar studies (Demory et al., 2020; Iles et al., 2020), but this is the first time such an analysis has been done across such relatively large model ensembles of different resolutions, and with a method studying all parts of the precipitation distribution.

Acknowledgements
This work has been funded by the PRIMAVERA project, which is funded by the European Union's Horizon 2020 programme, Grant Agreement no. 641727 (PRIMAVERA). This work used JASMIN, the UK collaborative data analysis facility. Some analysis was performed on the Swedish climate computing resource Bi provided by the Swedish National Infrastructure for Computing (SNIC) at the Swedish National Supercomputing Centre (NSC) at Linköping University. We acknowledge the E-OBS dataset from the EU-FP6 project UERRA (http://www.uerra.eu) and the Copernicus Climate Change Service, and the data providers in the ECA&D project (https://www.ecad.eu).
Data: The data are stored on the JASMIN infrastructure, http://www.ceda.ac.uk/projects/jasmin/. The simulations are part of the High Resolution Model Intercomparison project (HiResMIP) and will be uploaded to the ESGF: https://esgf-node.llnl.gov. Scripts for analyzing the data will be available from the corresponding authors upon reasonable request.

Figure 4. The index of fractional contributions (y-axis) plotted as a function of the fractional difference in seasonal total precipitation (x-axis). The CORDEX HR ensemble is the reference ensemble. E-OBS average annual total precipitation (mm) is shown in the lower right of each panel.
Figure 10. Number of precipitation days (RR1 (days year⁻¹), top left), number of days with precipitation amount over 20 mm (R20mm (days year⁻¹), top right), simple precipitation intensity index (SDII (mm day⁻¹), bottom left) and maximum one-day precipitation (Rx1day (mm day⁻¹), bottom right) in the Mid-European region (ME) in the PRIMAVERA LR (pink) and HR (red) models, CORDEX LR (light blue) and HR (purple) models, as well as E-OBS LR (grey) and HR (black). Boxes mark the 25th and 75th percentiles, with the median inside; whiskers go from the 10th to the 90th percentile. If the high-resolution version of a model is significantly different from the low-resolution version, this is marked with a vertical line in the high-resolution box.
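The four indices in Figure 10 can be sketched in a few lines of Python for one year of daily precipitation at a single grid point. The thresholds (1 mm for a wet day, 20 mm for a heavy day) and the inclusive comparisons follow the common ETCCDI convention and are assumptions here, since the exact implementation used in the study is not given; the function name and the sample series are hypothetical.

```python
def precipitation_indices(daily_precip, wet_threshold=1.0, heavy_threshold=20.0):
    """Compute RR1, R20mm, SDII and Rx1day for one year of daily data (mm/day).

    Thresholds are assumed defaults following the ETCCDI convention.
    """
    wet_days = [p for p in daily_precip if p >= wet_threshold]
    heavy_days = [p for p in daily_precip if p >= heavy_threshold]
    return {
        "RR1": len(wet_days),        # number of precipitation days
        "R20mm": len(heavy_days),    # number of heavy precipitation days
        # Mean precipitation on wet days; 0.0 if there are no wet days
        "SDII": sum(wet_days) / len(wet_days) if wet_days else 0.0,
        # Largest one-day amount; 0.0 for an empty series
        "Rx1day": max(daily_precip) if daily_precip else 0.0,
    }


# Illustrative values only, not data from the study
sample = [0.0, 0.4, 1.0, 5.5, 22.0, 0.0, 30.5, 3.0]
indices = precipitation_indices(sample)
```

With an unchanged RR1, a larger annual total can only come from a higher SDII, which is the sense in which the wet days get wetter at higher resolution.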