The monsoon hydroclimates in HadGEM3 model configurations GA3.0 and GA4.0: Impact of remote versus local circulations errors and horizontal resolution

State-of-the-science general circulation models (GCMs) are the primary tools for making policy-relevant climate calculations. Yet these models face challenges in monsoon regions, where more than 70% of the world’s population lives, because of the complex interplay of local and remote influences across a spectrum of space and time scales. This work examines how faithfully the climatological features of regional and global monsoons are reproduced by the third and fourth generations of the Met Office Unified Model (MetUM) Global Atmosphere, GA3.0 and GA4.0, two configurations of the HadGEM3 system developed for seamless use across climate and weather time scales. Results are compared against multiple gridded observational datasets and against outputs from 20 atmosphere-only GCM simulations from the CMIP5 campaign. Furthermore, we investigate the influence of remote versus local atmospheric circulation errors by realistically constraining HadGEM3 circulations over a prescribed monsoon domain and examining the consequences outside and inside this domain using the “grid-point nudging” method. GA3.0 largely captures global monsoon features, including the monsoon precipitation patterns and the extent of regional monsoon domains, when integrated at low (~135 km in mid-latitudes), medium (~60 km in mid-latitudes) and high (~25 km in mid-latitudes) horizontal resolutions. GA4.0 and GA3.0 results are closely similar and compare reasonably well against the best available CMIP5 models. The common failure of the HadGEM3 configurations is the weak magnitude and extent of the simulated Asian Summer Monsoon (ASM) precipitation pattern and of the associated low-level Somali jet. This deficiency is also apparent in HadGEM2-A, ACCESS1-0, and CSIRO-Mk3-6-0, three GCMs sharing dynamical and physical components. HadGEM3 performance over the ASM improves significantly when atmospheric circulations are realistically constrained over the tropics and over the West African and Asian Summer monsoon domains.
Conversely, constraining atmospheric circulations over other remote monsoon domains has little influence on the ASM precipitation. We argue that the poor GA4.0 and GA3.0 simulations over the ASM domain are attributable partly to local atmospheric circulation errors and excessive
https://doi.org/10.5194/wcd-2020-38 Preprint. Discussion started: 28 September 2020 © Author(s) 2020. CC BY 4.0 License.

systems from remote regions, suggesting that both regional and global perspectives are needed to improve the understanding and prediction of monsoon climates (Bielli et al., 2010; Flaounas et al., 2012; Rowell et al., 2013; Chadwick et al., 2017a).
Climate simulations performed with Coupled General Circulation Models (CGCMs) are the primary tools for assessing the Earth system's response to large-scale forcings and for developing policies to address future global climate change, through hypothesis testing, sensitivity studies, and internationally coordinated multi-model experiments and evaluation campaigns. The Coupled Model Intercomparison Project Phases 5 (CMIP5) and 6 (CMIP6) provide the basis for multi-model evaluation and reveal a variety of systematic differences between models and observations, many of which persist from one model generation to the next (Sperber et al., 2013; Eyring et al., 2016b; Annamalai et al., 2017; Anand et al., 2018; Xue et al., 2018). Evaluating historical simulations and estimating the uncertainty in future projections are prerequisites for assessing future changes. Confronting CGCM outputs with observations provides further insight into model shortcomings and into the ways in which the various processes are represented within climate models. Several studies suggest that the CMIP5 multi-model ensemble mean captures the observed spatial patterns and interannual variability of global monsoon (GM) rainfall well, though most individual models underestimate its extent and intensity (Flato et al., 2013; Wang et al., 2017). CMIP6 models simulate global monsoon intensity and precipitation better than CMIP5 models, but common biases and large inter-model spreads persist (Wang et al., 2020). Moreover, projected CMIP6 monsoon precipitation indicates that an important uncertainty arises from circulation changes, which can be partly explained by the model-dependent response to uniform sea surface temperature warming (Chen et al., 2020).
Long-standing CMIP5 errors include biases in the sea surface temperature (SST) field that lead to insufficient South Asian monsoon rainfall and too-weak monsoon circulations (Annamalai et al., 2017; Găinuşă-Bogdan et al., 2018); a systematic southward shift of the West African monsoon (WAM) rainfall (Roehrig et al., 2013); unrealistic simulations of the easterly low-level moisture flux across the Caribbean region, which can lead to the well-known "monsoon retreat bias", or excessive North American monsoonal rainfall in the fall (Pascale et al., 2016, 2017); and deficiencies in simulating the East Asian Summer monsoon rainfall, including topography-related cold biases, excessive rainfall, an unrealistic southeast-northwest rainfall gradient, overestimated interannual variability of temperature and precipitation, and an unrealistic magnitude of the monsoon circulation (Boo et al., 2011; Feng et al., 2014; Li et al., 2014). The analysis of 28 CMIP5 models highlights difficulties in adequately resolving the spatial and temporal characteristics of South Asian monsoon precipitation, and finds that biases in precipitation and 2 m surface air temperature (T2m) decrease as the model horizontal resolution increases (Anand et al., 2018). Furthermore, model systematic errors are often reduced in atmosphere-only model (AGCM) simulations forced by observed historical SSTs, suggesting that many CGCM biases are not inherent and instead arise from the challenge of capturing interactions between monsoon systems and oceanic processes (Wang et al., 2017; Jiang and Zhou, 2019).
This paper is motivated by the opportunity to advance our understanding of GCM simulations of the monsoon hydroclimates.
The main purpose is to assess the capability of the Met Office Unified Model (MetUM) Global Atmosphere (GA) modelling systems GA3.0 and GA4.0, two configurations of the HadGEM3 system employed for seamless prediction across climate and weather time scales. To this end, we attempt to disentangle the relative influence of physical formulation, horizontal resolution, and remote versus local atmospheric circulations on the HadGEM3 system's performance, relative to that of state-of-the-science experiments from the CMIP5 dataset. This analysis provides a framework for further understanding the interactions between individual monsoon systems and the associated mechanisms, including teleconnections, though it may not fully address the issue of attributing specific causes to specific systematic errors. We argue that this is crucial for improving the credibility of climate predictions and projections of future GM changes, and for increasing confidence that model developments seen to improve performance in an individual monsoon region do so by accurately modelling the climate processes underpinning GM variability. Thus, refining the understanding of climate model uncertainty is an important step towards mitigating the impact of monsoon-dominated climate extremes on human and natural systems, for a large population living mostly in developing countries (Cerezo-Mota et al., 2016; Novello et al., 2017; Chang et al., 2018; Huang et al., 2018; Mishra et al., 2018; Talento et al.,
the three-dimensional wind components, potential temperature, Exner pressure, and density. Moist prognostics such as specific humidity, prognostic cloud fields, and other atmospheric loadings are advected as free tracers. The vertical discretization uses terrain-following hybrid height coordinates and Charney-Phillips staggering (Charney and Phillips, 1953).
Large-scale clouds are represented using prognostic cloud fractions and prognostic condensate (Wilson et al., 2008), and hydrology-soil-vegetation-atmosphere interactions are calculated using the Global Land 3.0 (GL3.0) configuration of the Joint UK Land Environment Simulator (Best et al., 2011).
The GA4.0 version uses dynamical and physical cores that are largely identical to those of GA3.0, but it has a slightly more advanced formulation of sub-grid scale processes. The changes include: the replacement of specific quantities for moist prognostics with mass mixing ratios; a correction to the treatment of shortwave fluxes in the coupling to the sea ice component of coupled modelling systems at grid points with fractional land cover, to ensure that shortwave fluxes to sea ice points are accounted for on every atmospheric model time step; a modification of the large-scale precipitation scheme to improve the particle size distribution for rain and to mitigate a spurious feedback caused by a long-standing and explicit link between the rate of sub-grid homogenisation in the cloud erosion parametrization and the relative humidity (Abel and Boutle, 2012); a new formulation of cloud erosion that relates the rate of mixing to the cloud fraction and reaches maximum mixing whenever the sky is half covered by clouds (Morcrette, 2012); and improvements in the boundary-layer scheme to correct a significant and long-standing radiative bias within GA3.0 (Bodas-Salcedo et al., 2012). The aerosol scheme is modified to include the effect of seasonal vegetation dieback on dust emissions.

Grid-point nudging technique
The grid-point nudging, or Newtonian relaxation, strategy is a form of data assimilation that adjusts the dynamical variables of free-running GCMs towards meteorological reanalyses. The basic idea is to realistically constrain the AGCM's solution, or trajectory, over or outside a prescribed domain using meteorological reanalyses, so as to capture the daily variability of key climate phenomena, and to examine the consequences outside or inside this domain. This technique was first applied successfully in the mid-1990s to validate the Hamburg climate model (ECHAM), where it enabled the comparison of model outputs with observations over short periods of time (Jeuken et al., 1996). Since then, numerous nudged climate experiments have been carried out worldwide (Telford et al., 2008; Bielli et al., 2010; Pohl and Douville, 2011; Flaounas et al., 2012; Omrani et al., 2015; Hardiman et al., 2017). The IRCAAM (Influence Réciproque des Climats d'Afrique de l'ouest, du sud de l'Asie et du bassin Méditerranéen) project, a research initiative coordinated by the French National Research Agency (ANR), used the grid-point nudging technique to explore remote versus regional causes of GCMs' systematic errors over the West African Monsoon domain and associated teleconnections (Pohl and Douville, 2011).
The grid-point nudging capability of the MetUM was initially developed at the University of Cambridge under the Met Office-NCAS (National Centre for Atmospheric Sciences) partnership and has been used for the following short-term applications: to allow comparisons of model outputs with episodic satellite campaigns (Voulgarakis et al., 2011); to produce hindcasts over a short period of interest, such as the eruption of Mt. Sarychev (Haywood et al., 2010); to ensure that the large-scale circulation is the same in model simulations and reanalysis; and to compare tropical convection characteristics for a given year in a range of atmospheric models, including numerical weather prediction (NWP) models, chemistry transport models (CTMs), and chemistry-climate models (Russo et al., 2011).
For the MetUM, only a few variables are nudged towards the 6-hourly ERA-Interim reanalysis (Dee et al., 2011): the horizontal wind components (u and v) and potential temperature (θ). This is achieved by adding non-physical relaxation terms to the model equations describing the rate of change of a given prognostic variable:
$$\frac{\partial X}{\partial t} = F_m(X) + G\,(X_{ana} - X)$$
where Fm is the rate of change of the variable due to all other factors, X is the state of the model, Xana is the reference field toward which the model is relaxed, and G is the strength of the relaxation (Jeuken et al., 1996). Interpolation onto the MetUM model grid is done through a bi-linear interpolation with respect to the logarithm of pressure, ln(P), from the ERA-Interim reanalysis hybrid pressure levels to the UM hybrid height levels. The selection of the parameter G is critical: if it is too small, nudging is ineffective; if it is too large, the model becomes unstable. Here, we have chosen the "natural" value of 1/6 h−1, which corresponds to the time spacing of the ERA-Interim reanalysis data. However, the relaxation parameter is set to be weaker at the 3 lowest and 5 highest levels, to allow for a smooth model adjustment to the nudging.
Table 1 summarizes the experiments performed in this study. Three simulations are carried out continuously through 1982-2008 (27 years) using the GA3.0 model configuration with a fixed lid at 85 km above the surface and 85 vertical levels, with 50 levels below 18 km and 35 levels above. First, GA3.0 is integrated at the N96 horizontal grid resolution, i.e. 1.875° longitude by 1.25° latitude, which corresponds roughly to 135 km in mid-latitudes.
This experiment, referred to as GA3-N96, uses daily sea surface temperatures (SSTs) and sea ice fractions from the Met Office Operational Sea Surface Temperature and Sea Ice Analysis (OSTIA) data (Donlon et al., 2012), together with time-varying CO2 concentrations and other external forcings consistent with the Atmospheric Model Intercomparison Project (AMIP) component of CMIP5 (Taylor et al., 2012; Ackerley et al., 2018). SSTs are specified such that the monthly means computed from the model outputs agree with the observations (Mizielinski et al., 2014). A second simulation, referred to as GA3-N216, is performed under conditions identical to those of GA3-N96, except that the horizontal grid resolution is increased to N216 (~60 km in mid-latitudes). This simulation is repeated at the N512 horizontal grid resolution (~25 km in mid-latitudes) and referred to as GA3-N512. The aim of the GA3-N96, GA3-N216, and GA3-N512 experiments is to document the influence of horizontal resolution on the simulated climatological features of monsoon precipitation worldwide. One simulation is carried out continuously through 1982-2008 (27 years) using the GA4.0 model configuration, with spatial resolution and forcing conditions identical to GA3-N96. This experiment, referred to as GA4-N96, aims to explore the sensitivity of model performance to changes in the physical formulation.
Figure 1 illustrates the geographical areas considered for the regional nudging.
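The relaxation term used for the grid-point nudging can be illustrated with a minimal Python sketch. It is not the MetUM implementation: the forward-Euler update, the 20-minute time step, the linear ramp of the relaxation strength at the 3 lowest and 5 highest levels, and the function names `relaxation_profile` and `nudge_step` are all assumptions made for illustration. Only the form of the tendency (model tendency plus G times the departure from the analysis), the value G = 1/6 h−1, and the count of 85 levels come from the text.

```python
def relaxation_profile(n_levels, g_max, n_bottom=3, n_top=5):
    """Per-level relaxation strength, weaker at the lowest and highest levels.

    The linear ramp is an assumption; the text only states that nudging is
    weaker at the 3 lowest and 5 highest levels.
    """
    profile = []
    for k in range(n_levels):
        weight = 1.0
        if k < n_bottom:
            weight = (k + 1) / (n_bottom + 1)
        elif k >= n_levels - n_top:
            weight = (n_levels - k) / (n_top + 1)
        profile.append(g_max * weight)
    return profile


def nudge_step(x, x_ana, model_tendency, g, dt_hours):
    """One forward-Euler step of dX/dt = F_m + G * (X_ana - X)."""
    return x + dt_hours * (model_tendency + g * (x_ana - x))


G = 1.0 / 6.0  # relaxation strength in h^-1, as chosen in the text
profile = relaxation_profile(85, G)

# Relax a 10 m/s zonal wind toward a 4 m/s analysis value for 6 hours,
# with no other tendency and a hypothetical 20-minute time step.
u = 10.0
for _ in range(18):
    u = nudge_step(u, 4.0, 0.0, G, dt_hours=1.0 / 3.0)
```

With this choice of G, the departure from the analysis decays with an e-folding time of about 6 hours, which is why the value matches the spacing of the reanalysis fields.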

Validation datasets
To contextualize the results of the MetUM Global Atmosphere configurations, we have extended the analysis to include outputs of 20 randomly selected atmosphere-only GCM experiments from the CMIP5 campaign (Table 2). These outputs, from simulations forced with observed SSTs and sea ice concentrations, were accessed through cmip-pcmdi.llnl.gov/cmip5/data_portal.html and chosen to illustrate the broad range of CMIP5 behaviour. By comparing the MetUM Global Atmosphere configurations against the CMIP5 multi-model ensemble, we anticipate gaining further insight into the models' uncertainties in the monsoon regions.
The model validation process requires observational variables covering various time and space scales. But given the limited availability of routine meteorological observations and in situ data in monsoon regions, we base the validation on the gridded datasets listed in Table 3. These include satellite estimates, gauge-based estimates, and reanalysis products, including the ERA-Interim dataset, which is also employed to supply initial and atmospheric boundary conditions for the grid-point nudging strategy. ERA-Interim consists of 6-hourly estimates of three-dimensional (3D) meteorological variables and 3-hourly estimates of surface parameters, archived with a horizontal grid resolution of approximately 79 km and 60 vertical levels (with a top at ~40 km). Land precipitation is primarily assessed using the merged satellite estimates and in situ observations from the Climate Hazards Group InfraRed Precipitation with Stations (CHIRPS) data archive (Funk et al., 2015). The CHIRPS dataset is a quasi-global (50°S-50°N, 180°E-180°W), 0.05° resolution, 1981-to-present gridded precipitation time series, which blends in more station data than other precipitation products and uses a high-resolution background climatology, providing better estimates of precipitation means and variations (Rivera et al., 2018).

Tropical climatology
This section assesses the ability of GA3.0 and GA4.0 to reproduce basic features of the mean tropical climate. It draws largely on the comparison between modelled quantities and corresponding observational estimates, using a series of performance metrics and 'process-based statistics' developed for the routine evaluation, against gridded observations and reanalysis datasets, of GCMs participating in the fifth and sixth phases of the Coupled Model Intercomparison Project (Gleckler et al., 2008; Anav et al., 2013; Eyring et al., 2016; Stouffer et al., 2016). These performance diagnostics are widely used to quantify similarities between models and observations and changes between different generations of models, and to explore uncertainties arising from internal variability, climate forcings and model formulations (Klein et al., 2013; James et al., 2018).
We begin by analysing the ability of GA3.0 and GA4.0 to reproduce contemporary climatic conditions at every grid point, and then aggregate the results over the tropical domain (30°N-30°S) for the period 1984-2005 (Fig. 2). The large-scale climate responses of GA3-N96 and GA4-N96 are compared against CMIP5 models and observations, with a view to portraying the models' behaviour relative to each other. The analysis is based on the space-time root-mean-square error (RMSE) metric, in which each model's RMSE is normalised by the median model error, accounting for spatial patterns and annual cycles. Blue shading indicates performance better than the median of all simulations, and red shading indicates performance worse than the median. The median rather than the mean model error is used to prevent outliers from influencing the results. Contributing models are displayed along the horizontal axis, while large-scale climatic fields (as described in the caption) are shown on the vertical axis. Two reference datasets are used whenever possible to estimate the observational uncertainty, and the panel's grid squares are split diagonally to illustrate the relative error with respect to the primary (upper left triangle) and the alternate (lower right triangle) references. White triangles indicate that no alternate reference dataset is available. For instance, precipitation reference datasets include merged satellite-gauge estimates from the Global Precipitation Climatology Project (GPCP) and the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS). Wind is assessed against the ERA-Interim and MERRA2 (Modern-Era Retrospective analysis for Research and Applications, Version 2) reanalyses.
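The median-normalised error described above can be sketched in a few lines of Python. This is an illustrative simplification, not the evaluation code used for Fig. 2: the RMSE here is unweighted over a short list of values, whereas the space-time metric would also involve area weighting and the annual cycle. The model names and numbers are synthetic, and the normalisation (E − median)/median is assumed to follow the relative-error convention of Gleckler et al. (2008).

```python
import math
import statistics


def rmse(model, reference):
    """Unweighted root-mean-square error over paired samples."""
    squared = [(m - r) ** 2 for m, r in zip(model, reference)]
    return math.sqrt(sum(squared) / len(squared))


def relative_errors(rmse_by_model):
    """Normalise each RMSE by the ensemble median: (E - median) / median."""
    median = statistics.median(rmse_by_model.values())
    return {name: (e - median) / median for name, e in rmse_by_model.items()}


# Synthetic reference and model values (not real data).
reference = [2.0, 4.0, 6.0, 8.0]
simulations = {
    "model_a": [2.5, 4.5, 6.5, 8.5],
    "model_b": [3.0, 5.0, 7.0, 9.0],
    "model_c": [4.0, 6.0, 8.0, 10.0],
}
errors = {name: rmse(values, reference) for name, values in simulations.items()}
scores = relative_errors(errors)
```

A negative score corresponds to the blue shading (error smaller than the median model), and a positive score to the red shading. Using the median keeps a single poorly performing model from shifting every other score.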
In general, the performance of GA3-N96 varies across vertical levels and climate variables, and it consistently outperforms most CMIP5 models. GA3-N96 wind components compare reasonably well against the ERA-Interim dataset and display relative errors 10% to 20% smaller than the median at most vertical levels. Such high skill is matched only by EC-EARTH and GFDL-CM3, highlighting some benefits of using a seamless prediction system that forges weather forecasting and Earth-system modelling into a single modelling framework (Hazeleger et al., 2010). At the surface, GA3-N96 also reproduces the basic features of the observed precipitation and air temperature fields, and its performance matches that of the best CMIP5 models, including EC-EARTH, MPI-ESM-LR, and GFDL-CM3, with relative errors 10-20% smaller than the ensemble median. The simulation improves as the horizontal resolution increases, and GA3-N216 shows RMSE values ranging between 0-10% of the ensemble median. GA4-N96's performance closely resembles that of GA3-N96, but there are also important differences with respect to wind components at 200 hPa, precipitation and clouds. There is a tendency for the multi-model ensemble mean and median to agree more favourably with observations than any individual climate model. Furthermore, the choice of reference dataset does not alter the evaluation of model performance, indicating that inter-model uncertainty is larger than observational uncertainty. This finding is consistent with the long-recognised behaviour of the ensemble mean of coupled ocean-atmosphere models participating in the IPCC AR5 assessment report (Gleckler et al., 2008; Flato et al., 2013).
In the tropics, the large-scale behaviour of GA3-N96 and GA4-N96 is within the range of CMIP5 uncertainty and realistically captures the trends of precipitation.
On the figure's x-axis, GA3-N96 falls to the right of the GPCP dataset, indicating a wet bias of ~17% (250 mm/year), which is comparable to biases found in CMIP5 models including HadGEM2-A, EC-EARTH, FGOALS-s2, CSIRO-Mk3-6-0, MPI-ESM-LR, MIROC5, BNU-ESM, and CESM1-CAM5. Similar biases are found in the GA3-N216, GA3-N512, and GA4-N96 simulations. While no model exhibits MVI values close to the threshold for best-performing simulations (0.5), GA3-N96 tends to outperform most CMIP5 models, with values ranging between 1.32 and 1.48. The best-performing models in terms of MVI include GA3-N512 and CESM1-CAM5, while the worst results are found for FGOALS-g2, CMCC-CM and BCC-CSM1-1.
The GPCP dataset indicates an equatorially symmetric structure of the GMI, with more intense precipitation in the tropical band, centred over the core monsoon regions, and key features such as the intertropical convergence zone (ITCZ), the South Pacific convergence zone and the Indo-Pacific warm pool. These features are well captured in the GA3-N96 reference experiment. GA4-N96 errors have virtually the same magnitude and spatial structure as those of GA3-N96 and tend to be larger over the ocean than over land, with a marked underestimate of precipitation intensity over the West African monsoon and South Asian summer monsoon regions, from India to Southeast Asia and extending eastward over the Indochina peninsula and Indonesia. Conversely, precipitation is overestimated over most tropical land, with positive bias values of 4 mm/day within the oceanic ITCZ and 6 mm/day over land, particularly in South Africa, South and Central America, and East Asia. In the western Pacific, there is a tripole error pattern from the equator to 45°N and 45°S. Precipitation intensity is generally too high over the equatorial Indian Ocean, south of India. The situation improves slightly with either increased spatial resolution (GA3-N216 and GA3-N512) or improved sub-grid scale parametrizations (GA4-N96).
The dry bias over the South Asian summer monsoon region is also a striking feature of the HadGEM2-A, ACCESS1-0, CSIRO-Mk3-6-0, GISS-E2-R, and MIROC5 experiments. By contrast, most CMIP5 models overestimate precipitation over land in the South Asian summer monsoon region (Fig. 5).

Global and regional monsoon features
To further assess the agreement of the precipitation pattern between models and observations in the major regional monsoon areas, we use the Taylor diagram approach. This provides a concise statistical summary of model behaviour in terms of root-mean-square difference (RMSD), pattern correlation coefficient (PCC), and ratio of variances of the model errors for seasonal mean precipitation, with respect to reference datasets (Taylor, 2001). By "model errors" we refer to the departure of the simulation from observations, implicitly assuming that observational uncertainty is smaller than model errors.
diagram, the PCC between any model and the reference data is related to the azimuthal angle, and the RMSD between a model and the reference data is proportional to the distance between these two points. Biases, defined as the spatially averaged differences between the modelled and observed time-mean, are not shown on this diagram. In West Africa, a region in which monsoon precipitation is highly sensitive to the ITCZ latitudinal migration in summer and to organised mesoscale convective systems, GA3-N96 agrees favourably with the observations, and model performance is comparable to that of the best-performing CMIP5 models, including MPI-ESM-LR, CMCC-CM, HadGEM2-A, and ACCESS1-0. The PCC ranges from 0.9 to 0.95 and the normalised RMSD is lower than 0.5, indicating high fidelity in reproducing the observed patterns of precipitation. GA3.0 also performs well over the Australian monsoon domain, with the PCC ranging between 0.95 and 0.99 and the ratio of variances close to 1. Conversely, GA3.0 shows medium performance over the South African and North American monsoon regions, with the PCC dropping to values between 0.6 and 0.8 and the RMSD ranging between 0.5 and 1. There is an overall improvement in GA3.0 spatial variability with increasing horizontal resolution.
Conversely, GA3.0 and GA4.0 struggle to capture monsoon precipitation patterns in South Asia, East Asia, and South America (Fig. 7). These are regions with complex topographic features, where performance is comparable to the tier of CMIP5 models and the spatial variability is overestimated. GA3.0 and GA4.0 show relatively low spatial correlation coefficients (0.1 to 0.7) and high RMSD values (1 to 1.5).
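The three statistics summarised on a Taylor diagram can be computed as in the following Python sketch, which follows the definitions of Taylor (2001) under simplifying assumptions: the fields are flattened to short lists, no area weighting is applied, and the function name `taylor_statistics` and the sample values are hypothetical. The sketch returns the ratio of standard deviations; the ratio of variances is its square.

```python
import math


def taylor_statistics(model, reference):
    """Pattern correlation, standard-deviation ratio and normalised centred RMSD."""
    n = len(reference)
    model_mean = sum(model) / n
    ref_mean = sum(reference) / n
    model_anom = [m - model_mean for m in model]
    ref_anom = [r - ref_mean for r in reference]
    model_std = math.sqrt(sum(a * a for a in model_anom) / n)
    ref_std = math.sqrt(sum(a * a for a in ref_anom) / n)
    covariance = sum(a * b for a, b in zip(model_anom, ref_anom)) / n
    centred_rmsd = math.sqrt(
        sum((a - b) ** 2 for a, b in zip(model_anom, ref_anom)) / n
    )
    return {
        "pcc": covariance / (model_std * ref_std),
        "std_ratio": model_std / ref_std,
        "rmsd_norm": centred_rmsd / ref_std,
    }


# Synthetic example: the model doubles the reference pattern and adds a bias.
reference = [1.0, 3.0, 2.0, 6.0]
model = [2.0 * r + 1.0 for r in reference]
stats = taylor_statistics(model, reference)
```

In this example the pattern correlation is 1 while the amplitude is doubled, and the constant offset of 1 has no effect on any of the three statistics, which is why biases do not appear on the diagram. The three quantities are linked by the law of cosines, rmsd_norm² = std_ratio² + 1 − 2 · std_ratio · pcc, which is what allows them to be drawn on a single two-dimensional plot.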
Since the spatial pattern of monsoon precipitation varies across regions, examining the timing and seasonality of simulated precipitation provides further insight into the model's ability. Figure 8 shows the monthly-mean annual cycle of modelled and observed precipitation in three monsoon regions, computed by averaging over land points only. GA3-N96 reproduces the basic features of the time evolution of North American monsoon precipitation, including the dry season from November to May, the peak in August, and the subsequent rapid decline, though this analysis can be hampered by observational uncertainty (Fig. 8, top). Precipitation is overestimated during the monsoon peak and, since similar results are found in the simulations with higher spatial resolution (GA3-N216, GA3-N512) and in GA4-N96, we argue that the issue is rooted in the model physics. Interestingly, the HadGEM2-A and ACCESS1-0 models agree more favourably with the observations and show improvement over the GA3.0 and GA4.0 experiments.
The West African monsoon, particularly in the Sahel region (10°W-10°E, 10°N-20°N), is dominated by a mono-modal precipitation cycle that begins in May, reaches a peak of 4.5 mm/day in August when the ITCZ reaches its northernmost position, and terminates in October (Fig. 8, centre). GA3-N96 captures the timing and peak of precipitation within the observational uncertainty, but also indicates a monsoon onset that is too early, occurring from March to June. This behaviour is consistent with most CMIP5 models and persists even when the horizontal resolution is increased to N216 and N512. GA3-N96 generates too little precipitation in August, though the situation improves slightly with increasing horizontal resolution. Over South America, GA3-N96 shows high fidelity in reproducing the seasonal cycle of monsoon precipitation (Fig. 8, bottom). The model's most noticeable bias is the overestimation of precipitation during the peak season, December-March (DJFM), compared to the GPCP-SG and CHIRPS observations. This behaviour is consistent with that of most CMIP5 models.
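The land-only monthly-mean cycles discussed above rest on two simple operations, sketched here in Python: a spatial mean restricted to land points, and the collapse of a monthly series into 12 calendar-month means. The sketch is illustrative only. The spatial mean is unweighted (an assumption; an area-weighted mean would normally be preferred on a latitude-longitude grid), the helper names are hypothetical, and the numbers are synthetic rather than taken from any dataset.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def land_mean(field, land_mask):
    """Unweighted mean of field[j][i] over points where land_mask[j][i] is True."""
    values = [
        value
        for row, mask_row in zip(field, land_mask)
        for value, is_land in zip(row, mask_row)
        if is_land
    ]
    return sum(values) / len(values)


def monthly_climatology(monthly_series):
    """Collapse a January-first monthly series (at least one year) into 12 means."""
    sums = [0.0] * 12
    counts = [0] * 12
    for index, value in enumerate(monthly_series):
        sums[index % 12] += value
        counts[index % 12] += 1
    return [s / c for s, c in zip(sums, counts)]


# Synthetic two-year precipitation series (mm/day) with a single summer peak.
year_one = [0.0, 0.0, 0.2, 0.5, 1.5, 3.0, 4.0, 4.4, 3.0, 1.0, 0.1, 0.0]
year_two = [0.0, 0.1, 0.2, 0.7, 1.7, 3.2, 4.2, 4.6, 3.2, 1.0, 0.1, 0.0]
cycle = monthly_climatology(year_one + year_two)
peak_month = MONTHS[cycle.index(max(cycle))]

# Land-only average on a toy 2 x 2 grid in which one point is ocean.
box_value = land_mean([[4.0, 5.0], [9.0, 6.0]], [[True, True], [False, True]])
```

Comparing the position and height of the peak in such a cycle between model and observations is what distinguishes a timing error, such as an early onset, from an amplitude error, such as too little rain in the peak month.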
To reflect the monsoon variability from a circulation perspective, we assess the mean seasonal cycle of the Webster and Yang index (WYI) in models and reanalysis, computed as the time-mean zonal wind (U) shear between 850 and 200 hPa, U850−U200, averaged over South Asia from the equator to 20°N and from 40°E to 110°E (Fig. 9). The WYI provides a first-order measure of the strength of the South Asian Summer Monsoon and represents the combined effect of the low-level (850 hPa) westerly jet and the upper-level (200 hPa) easterly jet, two important circulation features in this region. Generally, GA3-N96 reproduces the timing and peak of monsoon strength well and is consistent with most CMIP5 models.
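As a minimal sketch of the index defined above, the following Python code averages U850 and U200 over the box from the equator to 20°N and from 40°E to 110°E and takes their difference. The cosine-of-latitude weighting, the helper names `box_mean` and `webster_yang_index`, and the coarse grid with uniform wind values are assumptions for illustration; only the box limits and the definition U850 − U200 come from the text.

```python
import math


def box_mean(field, lats, lons, lat_range=(0.0, 20.0), lon_range=(40.0, 110.0)):
    """Cosine-latitude weighted mean of field[j][i] inside a lat-lon box."""
    total = 0.0
    weight_sum = 0.0
    for j, lat in enumerate(lats):
        if not lat_range[0] <= lat <= lat_range[1]:
            continue
        weight = math.cos(math.radians(lat))
        for i, lon in enumerate(lons):
            if lon_range[0] <= lon <= lon_range[1]:
                total += weight * field[j][i]
                weight_sum += weight
    return total / weight_sum


def webster_yang_index(u850, u200, lats, lons):
    """Zonal wind shear U850 - U200 averaged over the South Asian box."""
    return box_mean(u850, lats, lons) - box_mean(u200, lats, lons)


# Toy grid with uniform summer-like winds (m/s); not real data.
lats = [-10.0, 0.0, 10.0, 20.0, 30.0]
lons = [20.0, 60.0, 100.0, 140.0]
u850 = [[8.0 for _ in lons] for _ in lats]    # low-level westerlies
u200 = [[-20.0 for _ in lons] for _ in lats]  # upper-level easterlies
wyi = webster_yang_index(u850, u200, lats, lons)
```

Because westerlies are positive and easterlies negative, a stronger low-level jet and a stronger upper-level easterly jet both increase the index, so a weak simulated monsoon circulation appears as a reduced summer peak.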

Impact of remote versus local atmospheric circulation errors
To disentangle the impact of remote versus local atmospheric circulation errors on the GA3.0 and GA4.0 performance, we focus on the Asian Summer Monsoon, a system whose complexity poses great challenges to many AGCMs in simulating the associated climatological seasonal means and annual cycles of precipitation (Sperber et al., 2013; Singh, 2015; Cherchi et al., 2016; Annamalai et al., 2017; Anand et al., 2018).
between wet summers and dry winters, but without land-ocean thermal contrasts, and which is not regarded as a monsoon region in this analysis.
Globally, GA3-N96 reproduces the extent of most domains experiencing dry-wet alternation in tropical Asia, Australia, Africa, and the Indian Ocean. But the model falls short in simulating the South Asian monsoon domain, where the ratio of local summer precipitation to annual rainfall is less than 0.55. The failure over the ASM domain persists with increased model horizontal resolution (GA3-N216 and GA3-N512) and with enhanced parametrizations of sub-grid scale processes (GA4-N96), and it is common to CSIRO-Mk3-6-0 and ACCESS1-0, two climate models sharing some dynamical and physical components with the MetUM Global Atmosphere configurations (not shown). Meanwhile, constraining the atmospheric circulations toward ERA-Interim reanalysis values leads to substantial improvement, especially if the grid-point nudging technique is applied over the tropical and ASM domains (GA3-NUDG-TROPICS and GA3-NUDG-ASM).
Figure 13 shows that the observed and GA3-N96 core South Asian summer precipitation occurs over land during June-July-August-September (JJAS). The observations show that the monsoon onset occurs with an abrupt increase in precipitation during early June, peaks during July, and is followed by a retreat that is slower than the onset. GA3-N96 captures the timing of the climatological annual cycle of precipitation, but there is a significant dry bias in the total amount. The annual cycle improves considerably when the ASM and tropical atmospheric circulations are realistically constrained.
We hypothesize that the poor simulations of precipitation over South Asia by GA3.0 and GA4.0 are attributable partly to atmospheric circulation errors over, and excessive precipitation over, the southwest equatorial Indian Ocean, rather than to tropical atmospheric responses to varying forcing fields, such as SST over the Arabian Sea, aerosols, and growing greenhouse gas emissions, as suggested by several studies for some CMIP5 models (Hsu, 2016; Kitoh et al., 2013; Levine et al., 2013). But this needs further examination, which is beyond the scope of the present paper.
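The monsoon-domain criterion mentioned in this section, a ratio of local summer precipitation to annual rainfall of at least 0.55, can be sketched as follows in Python. Only the 0.55 threshold comes from the text. The definition of local summer as May-September in the Northern Hemisphere and November-March in the Southern Hemisphere is an assumption, as are the function names and the synthetic monthly totals; any additional criterion on the annual range of precipitation that a full domain definition might include is not reproduced here.

```python
NH_SUMMER = (4, 5, 6, 7, 8)    # May-September (zero-based month indices), assumed
SH_SUMMER = (10, 11, 0, 1, 2)  # November-March, assumed


def summer_fraction(monthly_precip, latitude):
    """Fraction of annual precipitation that falls in local summer."""
    months = NH_SUMMER if latitude >= 0.0 else SH_SUMMER
    annual = sum(monthly_precip)
    if annual == 0.0:
        return 0.0
    return sum(monthly_precip[m] for m in months) / annual


def in_monsoon_domain(monthly_precip, latitude, threshold=0.55):
    """True when the local-summer fraction reaches the threshold."""
    return summer_fraction(monthly_precip, latitude) >= threshold


# Synthetic monthly totals (mm/month) at a Northern Hemisphere grid point.
strong_cycle = [10.0, 10.0, 15.0, 20.0, 50.0, 200.0,
                300.0, 280.0, 180.0, 60.0, 20.0, 10.0]
# A flatter cycle, as might result from a dry summer bias.
weak_cycle = [60.0, 60.0, 60.0, 60.0, 70.0, 100.0,
              120.0, 110.0, 90.0, 70.0, 60.0, 60.0]

strong_ok = in_monsoon_domain(strong_cycle, latitude=20.0)
weak_ok = in_monsoon_domain(weak_cycle, latitude=20.0)
```

The second example shows how a model with too little summer rainfall can drop a grid point out of the diagnosed monsoon domain even when the timing of the seasonal peak is correct.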

Conclusion
Based on the Coupled Model Intercomparison Project/Atmospheric Model Intercomparison Project (CMIP/AMIP) framework, which uses observed SST and sea ice to drive AGCM simulations, we have evaluated the features of regional and global monsoon hydroclimates in the MetUM Global Atmosphere configurations GA3.0 and GA4.0, two configurations developed for seamless use across climate and weather time scales. GA4.0 uses dynamical and physical cores that are largely identical to those of GA3.0 but includes some improvements in the physics. First, GA3.0 is integrated to explore the influence of resolution on model performance, using three separate horizontal resolutions: low (~135 km in mid-latitudes), medium (~60 km in mid-latitudes) and high (~25 km in mid-latitudes). Second, GA4.0 is integrated under exactly the same conditions as the GA3.0 reference experiment to explore the impact of changes in the parametrization of sub-grid scale processes. Third, we further investigate, for the first time, the impact of remote versus local atmospheric circulation errors on the GA3.0 performance in monsoon regions using the "grid-point nudging" or "Newtonian relaxation" technique, whereby the AGCM's atmospheric circulation fields are fully constrained by reanalysis over or outside a prescribed domain in order to examine the consequences outside or inside this domain, and to provide the most realistic representation of the atmosphere at a given time.
To provide further context, the results are compared against 20 atmosphere-only GCM simulations from the CMIP5 dataset.