Articles | Volume 5, issue 2
Research article
17 Apr 2024

Quantifying uncertainty in simulations of the West African monsoon with the use of surrogate models

Matthias Fischer, Peter Knippertz, Roderick van der Linden, Alexander Lemburg, Gregor Pante, Carsten Proppe, and John H. Marsham

Simulating the West African monsoon (WAM) system using numerical weather and climate models suffers from large uncertainties, which are difficult to assess due to nonlinear interactions between different components of the WAM. Here we present a fundamentally new approach to the problem by approximating the behavior of a numerical model – here the Icosahedral Nonhydrostatic (ICON) model – through a statistical surrogate model based on universal kriging, a general form of Gaussian process regression, which allows for a comprehensive global sensitivity analysis. The main steps of our analysis are as follows: (i) identify the most important uncertain model parameters and their probability density functions, for which we employ a new strategy dealing with non-uniformity in the kriging process. (ii) Define quantities of interest (QoIs) that represent general meteorological fields, such as temperature, pressure, cloud cover and precipitation, as well as the prominent WAM features, namely the tropical easterly jet, African easterly jet, Saharan heat low (SHL) and intertropical discontinuity. (iii) Apply a sampling strategy with regard to the kriging method to identify model parameter combinations which are used for numerical modeling experiments. (iv) Conduct ICON model runs for identified model parameter combinations over a nested limited-area domain from 28° W to 34° E and from 10° S to 34° N. The simulations are run for August in 4 different years (2016 to 2019) to capture the peak northward penetration of rainfall into West Africa, and QoIs are computed based on the mean response over the whole month in all years. (v) Quantify sensitivity of QoIs to uncertain model parameters in an integrated and a local analysis.

The results show that simple isolated relationships between single model parameters and WAM QoIs rarely exist. Changing individual parameters affects multiple QoIs simultaneously, reflecting the physical links between them and the complexity of the WAM system. The entrainment rate in the convection scheme and the terminal fall velocity of ice particles show the greatest effects on the QoIs. Larger values of these two parameters reduce cloud cover and precipitation and intensify the SHL. The entrainment rate primarily affects 2 m temperature and 2 m dew point temperature and causes latitudinal shifts, whereas the terminal fall velocity of ice mostly affects cloud cover. Furthermore, the parameter that controls the evaporative soil surface has a major effect on 2 m temperature, 2 m dew point temperature and cloud cover. The results highlight the usefulness of surrogate models for the analysis of model uncertainty and open up new opportunities to better constrain model parameters through a comparison of the model output with selected observations.

1 Introduction

The West African monsoon (WAM) is a prominent seasonal large-scale circulation feature associated with a deep northward penetration of rainfall into West Africa during the boreal summer months, usually peaking in August (Hastenrath, 1991). The precipitation associated with the WAM is crucial for the livelihoods of hundreds of millions of people and has great socioeconomic impacts through effects on agriculture, energy production, water resources and health (Haile, 2005; Paeth et al., 2008). The WAM, conceptually depicted in Fig. 1, constitutes a complex deep overturning circulation whose formation, maintenance and variability are governed by various regional and remote forcings (Hall and Peyrillé, 2006). One of its main initial drivers is the large temperature and thus pressure gradient between the hot, dry and often dusty Sahara manifested in the Saharan heat low (SHL) and cooler, moister conditions over the tropical Gulf of Guinea. The marked discontinuity between these fundamentally different air masses, the intertropical discontinuity (ITD), which lies around 20° N during boreal summer, is associated with shallow and dry overturning only (Nicholson, 2009; Thorncroft et al., 2011). Abundant deep convection is instead observed in a band south of the ITD, often called the monsoonal rain belt. There, mainly between 8 and 13° N, the bulk of summertime precipitation is produced by frequently passing large convective systems with a high degree of organization (Mathon et al., 2002; Lebel et al., 2003; Lebel and Ali, 2009).

Figure 1. Schematic illustration of the WAM system in a height–latitude display (inspired by Fink et al., 2017), including the TEJ, the AEJ, the SHL, the ITD, 2 m temperature ($T_{2\,\mathrm{m}}$) and 2 m dew point temperature ($T_{\mathrm{d}\,2\,\mathrm{m}}$). The main rainfall area is indicated by light blue shading. Circulation in the height–latitude plane is depicted through streamlines. The approximate latitudinal position of the Guinea Coast is also given.


The monsoonal rain belt is enclosed by two distinctive dynamical features, the African easterly jet (AEJ) to the north and the tropical easterly jet (TEJ) to the south. The AEJ, a pronounced easterly jet at around 600–700 hPa maintained by the low-tropospheric meridional temperature gradient, regularly features wave disturbances. These so-called African easterly waves (AEWs) with wavelengths between 2000 and 5000 km and periods of 2–7 d (Burpee, 1972; Reed et al., 1977; Kiladis et al., 2006) strongly modulate convection, mainly by enhancing vertical wind shear to levels favorable for the generation of organized squall lines (Fink and Reiner, 2003). In the upper troposphere, the WAM circulation is characterized by a jet-like intensification of the tropical easterlies. This distinct easterly current observed between 5 and 20° N, called the TEJ, evolves over the South Asian monsoon system, where it is also the strongest, and extends westward to Africa while gradually weakening (Flohn, 1964). Previous studies have demonstrated that seasonal-mean WAM rainfall is strongly correlated with the intensity of the TEJ over West Africa (Grist and Nicholson, 2001). At least on shorter timescales, the TEJ is, however, mainly thought of as a passive feature, which can intensify after periods of increased convective activity through the enhanced divergent outflow at upper levels (Lemburg et al., 2019).

Despite its outstanding importance for the region, simulations of the WAM spanning timescales from weather to climate are fraught with substantial uncertainties. With respect to weather forecasts, Vogel et al. (2018, 2020) showed that ensemble predictions of rainfall over tropical Africa have the lowest skill throughout the tropics and are often barely better than climatological forecasts (Walz et al., 2021), even after the removal of systematic errors through statistical post-processing. This poor performance is partly related to errors stemming from initial condition uncertainty in a region known for a sparse operational network (e.g., Parker et al., 2008; Fink et al., 2011). Moreover, there appear to be issues with data assimilation, as the availability of additional observations during field campaigns yields only relatively small improvements (Agustí-Panareda et al., 2010; van der Linden et al., 2020). In weather forecasts, but also in mean-state focused simulations (beyond the problem of initial state uncertainty), the representation of the WAM and its features is affected by various model uncertainties. Shortcomings in adequately simulating small-scale diabatic processes such as deep moist convection not only directly impact rainfall prediction skill but may further impose errors in the entire WAM circulation, as it is – like many tropical large-scale flows – strongly driven by the diabatic heating of the troposphere (Marsham et al., 2013; Martin et al., 2017). Model-related uncertainties regarding the representation of deep convection and other physical processes are also reflected on climate timescales where many models struggle to realistically reproduce the rainfall distribution over the WAM region and its seasonal evolution (Cook and Vizy, 2006; Xue et al., 2010; Vellinga et al., 2013). Considerable problems are also evident on paleoclimate timescales, with many models struggling to accurately describe the magnitude and timing of precipitation changes of the African humid period during the Holocene, which amongst other things led to a green Sahara (Claussen et al., 2017; Brierley et al., 2020).

How can we improve model simulations over West Africa? The most obvious way is trying to improve the numerical model itself. Janicot et al. (2011) argued that biases and uncertainties can be substantially reduced if processes on weather timescales are better understood and defined. They therefore underlined the necessity of analysis on shorter timescales to improve not only weather forecasts but also climate predictions. As mentioned in the paragraph above, the correct representation of diabatic processes in particular, most of which act on short timescales and rather small spatial scales, still constitutes a major challenge. In this regard, a key problem is the model uncertainty associated with grid resolution and parameter choices in the representation of sub-grid-scale processes. For example, the explicit or parameterized representation of deep convection has a large effect on the amount, spatial distribution and diurnal cycle of precipitation, with substantial impacts on the large-scale dynamics and thermodynamics, even beyond the African continent (Marsham et al., 2013; Pante and Knippertz, 2019; Kendon et al., 2019). Matsui et al. (2018) found that the treatment of radiation in their model affects precipitation, low clouds and the entire WAM circulation, while Tchotchou and Kamga (2009) highlighted the deficiencies in selected convection schemes in simulating the monsoon rainfall accurately. Gbode et al. (2018), Flaounas et al. (2011) and Klein et al. (2015) considered microphysical, convective and boundary layer processes and found substantial influences of process parameter variations on the accuracy and spread of precipitation and other outputs. In other studies, effects of different meteorological phenomena and boundary conditions on the WAM were investigated. For instance, Kniffka et al. (2019) highlighted that variations in low-level clouds can have a substantial impact on precipitation. Zheng and Eltahir (1998) and Hopcroft et al. (2017) investigated the influence of vegetation, where the former considered variations in the meridional distribution of vegetation on a weather timescale and the latter revealed the relationship between past vegetation coverage and climate for the mid-Holocene. Messager et al. (2004) found that the sea surface temperature (SST) appears to be a major factor in the seasonal and interannual monsoon precipitation regime.

The abovementioned studies were conducted to assess isolated relationships between certain model parameters and simulated WAM quantities. A general problem of this approach is that it is very challenging to study the combined effects of several sources of uncertainty at once. Nonlinear interactions and buffering effects will make it nearly impossible to deduce such effects from single-parameter perturbation experiments. Ideally one should conduct experiments across a wide range of parameter combinations, but this will very quickly become too expensive, as a certain simulation period is required to separate differences from day-to-day weather noise.

An attractive alternative to such a costly approach is surrogate models – also known as emulators or meta-models – which allow for a comprehensive but resource-friendly statistical investigation of the sensitivity of QoIs to uncertain model parameters (Cheng et al., 2020). This approach has gained increasing popularity in nearly all scientific fields, such as engineering (e.g., Sudret, 2014), chemistry and economics, within the past 3 decades (Cheng et al., 2020). For this purpose, outputs of simulations with a numerical model are used as training data to develop the surrogate models, which can then be used for a comprehensive sensitivity analysis. In meteorology, many different weather and climate models have been used for the application of surrogate models. For instance, methane-emission-related parameters (Müller et al., 2015) and hydrological parameters (Ray et al., 2015) have been considered for model calibration. Fletcher et al. (2018) studied the impact of aerosol forcing and atmospheric parameters on climate sensitivity, where two cloud- and convection-related parameters showed the strongest impacts. Lee et al. (2011) studied cloud condensation nuclei (CCN) sensitivity to eight emission and microphysical process parameters and found that uncertainty in the sulfur emissions explains 80 % of the output variance.

There exists a range of methodological approaches for surrogate models. Among these, Gaussian process regression, also known as kriging, is the most popular one in meteorological literature and has for instance been applied by Williamson (2015), Lee et al. (2011) and Fletcher et al. (2018). Alternatives include polynomial regression (Holden et al., 2009), polynomial chaos expansion (Massoud, 2019), radial basis functions (Müller et al., 2015), neural networks (Lu and Ricciuto, 2019) and combinations of these (Ray et al., 2015). For the construction, appropriate sampling strategies are used to define the training points for the surrogate model. In most cases in meteorological literature, Latin hypercube sampling (Morris and Mitchell, 1995) is used (e.g., Lee et al., 2011; Lu and Ricciuto, 2019), but in some studies other methods are applied, such as quasi-Monte Carlo sampling (e.g., Ray et al., 2015) and polynomial chaos-based approaches (e.g., Massoud, 2019).

Universal kriging (Matheron, 1969) is a general form of Gaussian process regression, where explicit basis functions can be incorporated to describe certain relationships in the regression technique based on prior knowledge of the problem. In meteorology, universal kriging has been applied in several studies such as by Glassmeier et al. (2019), Wellmann et al. (2020) and Diamond et al. (2020), where either linear or quadratic basis functions are used as trend functions. However, to the best of the authors' knowledge, universal kriging with explicit nonlinear basis functions other than polynomials has not been applied in connection with meteorological applications. Furthermore, there have not been many studies regarding criteria for the choice of basis functions for universal kriging.

This study aims at quantifying the uncertainty contributions and effects of selected model parameters on a variety of QoIs and output fields that characterize the WAM system. There has been no such study that also includes potential interactions of multiple model parameters. The Icosahedral Nonhydrostatic (ICON) model, the operational weather prediction model of the German Weather Service (DWD), is used to simulate the rainy seasons in 4 years in limited-area mode. We investigate the influence of six model parameters that are expected to have substantial impacts on the WAM characteristics. For each of them, probability density functions (PDFs) are assigned based on the literature and expert knowledge. Maximin Latin hypercube sampling (Morris and Mitchell, 1995) is applied in order to define optimal parameter combinations. From the output fields of each run, QoIs are computed that represent the characteristics of the WAM system, namely monthly accumulated precipitation, latitudinal position of the WAM rain belt, location and strength of the TEJ and the AEJ, location and extent of the SHL, latitude of the ITD, and spatially averaged output fields (e.g., 2 m temperature, cloud cover). Universal kriging is then used to obtain a surrogate model for each QoI, which describes the relationship between all uncertain model parameters and the QoI. The surrogate models serve to carry out sensitivity and parameter studies. Here, the global sensitivity analysis (GSA) evaluates the quantitative influence of the PDFs of all model parameters on the variability of the QoIs, whereas the parameter studies involve varying one parameter at a time to observe the relationship between the parameter and each QoI. The results indicate for which parameters (and thus processes) uncertainties need to be reduced to lower the spread in simulated QoIs.

The paper is organized as follows: in Sect. 2, the applied methods are explained, including surrogate modeling methods and the ICON model setup. In Sect. 3, results of the conducted analyses are presented and discussed, including model validation, GSA and the parameter studies. Section 4 provides a summary, the main conclusions and a short outlook.

2 Data and methods

This section details the applied methods and employed datasets. In Sect. 2.1, PDFs are assigned to considered uncertain model parameters. The surrogate modeling procedure is explained in Sect. 2.2, including the definition of training points, universal kriging, model validation and global sensitivity analysis. In Sect. 2.3, the ICON model setup and considered model outputs are presented. Considered QoIs and their computation are shown in Sect. 2.4. Finally, a procedure for local parameter studies is presented in Sect. 2.5.

2.1 Selected uncertain model parameters

A crucial first step in developing surrogate models is to identify relevant uncertain model parameters and to define meaningful PDFs representing our best knowledge of the associated epistemic uncertainty. Based on experience from sensitivity studies, literature review and expert judgment, we take into consideration six parameters which cover a fairly broad spectrum of the model's physics. These relate to grid-scale microphysics (zvz0i), turbulence (tkhmin), land–surface interaction (c_soil) and, in particular, the parameterization of deep convection (entrorg, rhebc_land_trop, rcucov_trop). For the purpose of the analysis in this work, the parameters are grouped into three pairs with regard to their physical implication, namely deep-cloud (entrorg, zvz0i), below-cloud (rhebc_land_trop, rcucov_trop) and boundary layer (tkhmin, c_soil) parameters (see Table 1).

Table 1. Selected uncertain model parameters and a short description, the assumed PDF and physical unit.

* For lognormal distribution: μ and σ are the mean and standard deviation of the variable's natural logarithm.


The entrainment rate (entrorg) controls the mixing of ambient air into convective plumes. Depending on the free-tropospheric humidity, higher entrorg values may lead to decreased buoyancy within the convective plumes and possibly reduced convective rainfall. The terminal fall speed of ice crystals (zvz0i) determines the lifetime of cirrus clouds and therefore average high-level cloud cover. Particularly in the tropics, this parameter may strongly influence cloud-radiative heating rates, which can, in turn, feed back on the large-scale circulation. Despite the different physical influence of the entrainment rate and the terminal fall velocity of ice, the overall effects are known to be similar: Reinert et al. (2019) found that less entrainment increases the tops of tropical convection and thus the production of cloud ice in the upper tropical troposphere. This needs to be accompanied by faster cloud ice sedimentation in order to keep the radiative forcing at a similar level. This is why the DWD varies these two parameters inversely in the ensemble physics perturbations (a probabilistic forecast where model parameters are varied to generate a range of possible outcomes to account for uncertainties in the model's physics) to keep the systematic impact on the model climate small (Reinert et al., 2019). The below-cloud parameters concern the computation of evaporation in convective regions. The parameter rhebc_land_trop refers to a relative humidity threshold below which evaporation occurs below the cloud base in convectively active grid cells over tropical land areas. The parameter rcucov_trop estimates – again specifically for the tropics – the areal fraction of convection within a grid cell that is used for the calculation of evaporation below the cloud base. In contrast to rhebc_land_trop, which uses a threshold value for relative humidity and thus mostly affects areas where relative humidity is close to that threshold, the parameter rcucov_trop affects evaporation in a more general sense and thus over most of the domain. The choice of tkhmin influences the turbulent diffusion of heat and moisture, which can in some situations impact cloud formation. Some level of vertical diffusion is practically always present in reality. In the case of highly stable conditions and weak vertical wind shear, however, this diffusion is underestimated by the turbulence parameterization. Therefore a minimum vertical diffusion (i.e., tkhmin) is set in the model to counteract this underestimation (Raschendorfer, 2012). The parameter c_soil denotes the evaporating fraction of soil in the form of a unitless fraction. Higher values lead to higher relative humidity, which can possibly increase cloud cover. Particularly for entrorg, rhebc_land_trop and rcucov_trop the net effect on area- and time-integrated precipitation is uncertain, as it strongly depends on the meteorological context.

In various meteorological studies and applications, uniform parameter distributions over an estimated range of plausible values are assumed (e.g., Wan et al., 2014), as is also the case for the operational ensemble forecast of DWD (Reinert et al., 2019). This is reasonable if limited information is available about the considered parameters and where the main purpose of the parameter variation is to induce spread in the ensemble forecast to better reflect the full forecast uncertainty. Since the definition of PDFs is often challenging and vague, we emphasize that using uniform ranges is often preferable. However, in the case of a fundamental sensitivity analysis, a uniform distribution is not necessarily a good choice, as there is no physical foundation for assuming a jump in the PDF from a constant value to 0 at the upper and lower limits. Values near the uniform range limits would have a disproportionate influence in the global sensitivity analysis, while those just outside the range would contribute nothing. Although defining alternative distributions presents a challenge in its own right, these are considered to be more appropriate in this study. Non-uniform PDFs for the parameters considered in this study have already been used by Lang et al. (2021) and Ollinaho et al. (2017), where normal and lognormal distributions are applied to represent parameter uncertainties. In our study, one source for the definition of PDFs is the mean values and ranges that are used for operational ensemble forecasts by the DWD (DWD, 2019), including further expert knowledge. Generally, the functions are defined in such a way that physical constraints or symmetry preferences are fulfilled; e.g., parameters which are strictly positive are described by functions that can only attain positive values (e.g., lognormal PDFs), and parameters which are bounded between 0 and 1 (i.e., that describe percentages of a certain quantity) are well described by a beta distribution. The selected model parameters and the assigned PDFs are shown in Table 1. Illustrations of the PDFs are shown in Sect. 3 with the results (see Fig. 5).
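The choice of distribution families described above can be illustrated with a short sketch. It assumes SciPy's `lognorm` and `beta` distributions; all numerical values are hypothetical placeholders and are not the entries of Table 1. The lognormal is specified through the mean and standard deviation of the variable's natural logarithm, matching the convention stated in the table footnote.

```python
import numpy as np
from scipy import stats

# Hypothetical values for illustration only; NOT the entries of Table 1.
mu_log, sigma_log = np.log(2.0e-3), 0.3   # mean and std of ln(parameter)
a, b = 4.0, 4.0                           # beta shape parameters

# SciPy parameterizes the lognormal with s = sigma and scale = exp(mu),
# where mu and sigma describe the natural logarithm of the variable.
positive_param = stats.lognorm(s=sigma_log, scale=np.exp(mu_log))

# The beta distribution has support [0, 1], which suits unitless fractions.
fraction_param = stats.beta(a, b)

rng = np.random.default_rng(0)
pos_samples = positive_param.rvs(size=10_000, random_state=rng)
frac_samples = fraction_param.rvs(size=10_000, random_state=rng)

# Empirical moments of the log-transformed samples recover mu and sigma.
log_mean = np.log(pos_samples).mean()
log_std = np.log(pos_samples).std()
```

Unlike a uniform range, both densities decay smoothly towards the limits of their support, so no artificial jump to 0 enters the sensitivity analysis.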

2.2 Surrogate modeling procedure

In order to represent the relationship between the uncertain model parameters listed in Table 1 and QoIs in a computationally effective way, surrogate models, also known as emulators or meta-models, are used. In this work, we use Gaussian process regression, also known as kriging, due to its flexibility and robustness. In order to build a surrogate model, training points (i.e., sets of combinations of the uncertain model parameters) are defined through an experimental design, and ICON model runs are conducted for these points. The individual steps necessary to develop the surrogate models are explained in the following subsections.

2.2.1 Training points

In order to build a surrogate model, training points for the model parameters have to be defined based on the PDFs specified in Sect. 2.1. Hereafter, we refer to the model parameter space as input space, as commonly done in the scientific discipline of uncertainty quantification (UQ). Since probability varies substantially across the input space, the density of points selected in the parameter space for global sensitivity analysis corresponds to the probability density function (PDF), resulting in regions of higher probability being sampled more frequently. Therefore, it is considered meaningful to train the model with higher accuracy in these regions. However, this method inherently leads to a reduced focus on areas of lower probability, which, despite being sampled less frequently, are still essential for a comprehensive global sensitivity analysis. The assumption that prioritizing areas of higher probability leads to more accurate sensitivity analysis outcomes requires further scientific investigation. Furthermore, sequential algorithms can be employed to supplement the base design with additional training points in regions where enhanced model accuracy is required. Additionally, if the sole objective was to develop a surrogate model with uniform accuracy across the entire parameter space, including the tails of the PDFs, then employing a uniform density of training points would be more appropriate. In our case, using more training points in regions with higher probability leads to an experimental design with inhomogeneous space-filling properties where surrogate modeling methods may struggle. As a consequence, the trained surrogate models may have problems with predicting QoIs in the tails of the PDFs. Therefore, we transform the physical (hereinafter used to denote parameter PDFs according to Table 1) input space to an independent and identically distributed (i.i.d.) uniform input space. In the transformed uniform input space, which can be thought of as a multidimensional unit hypercube, every region is associated with equal probability, and thus we can apply a space-filling sampling technique. In particular, we use maximin Latin hypercube sampling (Morris and Mitchell, 1995) to define 60 training points. We use the recommendation given by Loeppky et al. (2009) to choose the number of training points as $n = 10p$, where $p$ is the number of input dimensions ($p = 6$ in our case).

For the sake of simplicity and interpretability of the results, the model parameters are kept temporally and spatially constant during individual model runs. Therefore, one training point corresponds to a fixed set of the six model parameters which is used for one ICON model run.

The number of necessary training points strongly depends on the nonlinearity of the investigated problem. Therefore, validation (see Sect. 2.2.3) of the surrogate model remains indispensable. The training points obtained from the experimental design are transformed back into the physical input space and are then used for the configuration of the respective ICON model runs. From the outputs of the ICON model simulations for all training points, QoIs are computed as described in Sect. 2.4.
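The sampling and back-transformation steps can be sketched as follows. This is a minimal illustration under stated assumptions: the maximin criterion is approximated by brute force (the best of several random Latin hypercube designs is kept) rather than by the optimization algorithm of Morris and Mitchell (1995), the helper name `maximin_lhs` is hypothetical, and the six marginal distributions are placeholders, not those of Table 1.

```python
import numpy as np
from scipy import stats
from scipy.stats import qmc
from scipy.spatial.distance import pdist


def maximin_lhs(n, p, n_candidates=200, seed=0):
    """Among random Latin hypercube designs, keep the one whose smallest
    pairwise distance is largest (brute-force stand-in for an optimized
    maximin design)."""
    rng = np.random.default_rng(seed)
    best, best_dist = None, -np.inf
    for _ in range(n_candidates):
        design = qmc.LatinHypercube(d=p, seed=rng).random(n)
        d_min = pdist(design).min()
        if d_min > best_dist:
            best, best_dist = design, d_min
    return best


p = 6          # number of uncertain model parameters
n = 10 * p     # rule of thumb n = 10p, giving 60 training points
U = maximin_lhs(n, p)   # design in the i.i.d. uniform input space

# Placeholder marginals (assumptions for illustration, NOT Table 1).
marginals = [
    stats.lognorm(s=0.3, scale=2e-3),
    stats.lognorm(s=0.2, scale=1.0),
    stats.beta(5, 2),
    stats.beta(2, 5),
    stats.norm(0.75, 0.05),
    stats.uniform(0.5, 1.0),
]

# Back-transformation to the physical input space: for independent
# parameters this is the quantile function applied column by column.
X_phys = np.column_stack([m.ppf(U[:, j]) for j, m in enumerate(marginals)])
```

Each row of `X_phys` would then correspond to the parameter set of one model run.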

2.2.2 Gaussian process regression

In this study, we aim at describing a relationship between six model parameters and selected QoIs. We construct a separate surrogate model for each QoI, which can later be used to carry out sensitivity studies with significantly reduced computational cost.

Among available surrogate modeling methods, Gaussian process regression offers wide flexibility and potential for extensions and is therefore used in this study. We apply the universal kriging method, a general form of Gaussian process regression, where explicit basis functions can be incorporated. We base our choice on Fischer and Proppe (2023), where suitable basis functions for transformed input spaces have been proposed and were shown to be very effective. This method is meaningful to apply in this work, since PDFs are assigned to model parameters of different orders of magnitude, and the input space is thus transformed to i.i.d. uniform random variables in order to avoid ill-conditioned problems and to apply space-filling sampling techniques.

Our aim is to build a surrogate model for a QoI $y$ based on function evaluations (here integrated quantities from ICON model simulations) $y = \{y_i,\ i = 1, \ldots, n\}$ at $n$ training points $X = \{x_i,\ i = 1, \ldots, n\}$. The prediction mean and prediction variance at a set of input points $X_* = \{x_{*i},\ i = 1, \ldots, l\}$ are to be determined.

For the purpose of universal kriging,

$$g(x) = f(x) + h(x)^\top \beta \tag{1}$$

is used, with a zero-mean Gaussian process $f(x) \sim \mathcal{GP}(0, k(x, x'))$ and vectors of known basis functions $h(x) = \{h_j(x),\ j = 1, \ldots, q\}$ and unknown coefficients $\beta = \{\beta_j,\ j = 1, \ldots, q\}$.

Here, the anisotropic form of the radial basis function

$$k(x, x') = \theta_0 \exp\left[ -\sum_{i=1}^{p} \left( \frac{|x_i - x_i'|}{\theta_i} \right)^{2} \right] \tag{2}$$

with respect to hyperparameters $\theta = \{\theta_i,\ i = 0, \ldots, p\}$ is used as kernel to allow for different levels of smoothness between input dimensions. This makes sense, as the relationships between input parameters and QoIs are not known in advance and may differ substantially between the parameters. In addition to the radial basis function, the Matérn kernel function has also proven to be effective in the literature, particularly for its ability to better capture sharp jumps (Rasmussen and Williams, 2005). In our work, the radial basis function has proven to be practical and sufficient.



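Here,

$$K = k(X, X), \quad K_* = k(X, X_*), \quad K_{**} = k(X_*, X_*)$$

are the vectors and matrices of kernel function evaluations, and

$$H = h(X), \quad H_* = h(X_*)$$

are the matrices of basis function evaluations at training points $X$ and prediction points $X_*$, respectively.

The prediction mean and prediction variance, as shown by Rasmussen and Williams (2005), can be derived as

$$\bar{g}(X_*) = H_*^\top \mu + K_*^\top K_y^{-1} \left( y - H^\top \mu \right),$$

$$\operatorname{cov}(g_*) = K_{**} - K_*^\top K_y^{-1} K_* + R^\top \left( H K_y^{-1} H^\top \right)^{-1} R,$$

with $\mu = (H K_y^{-1} H^\top)^{-1} H K_y^{-1} y$, $R = H_* - H K_y^{-1} K_*$ and $K_y = K + \sigma_n^2 \mathbb{1}$; the prediction variance is given by the diagonal of $\operatorname{cov}(g_*)$.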

Here, additive i.i.d. Gaussian noise with variance $\sigma_n^2$ is considered, where $\sigma_n^2$ is treated as another hyperparameter to allow for aleatoric uncertainties, i.e., uncertainties that are attributed to weather noise in the ICON simulations. In reference to the geostatistical origin of the method, this corresponds to the nugget effect (Matheron, 1969). The hyperparameters $\theta$ and $\sigma_n^2$ are determined by maximum likelihood estimation (Rasmussen and Williams, 2005). The noise level $\sigma_n^2$ can then provide insight into aleatoric uncertainties in the ICON model. To speed up optimization, the gradient of the log marginal likelihood with respect to the hyperparameters $\theta$ can be incorporated. For universal kriging, corresponding equations are given in Fischer and Proppe (2023).
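To make the prediction step concrete, the following sketch evaluates the universal kriging predictor in the standard form given by Rasmussen and Williams (2005) for fixed hyperparameters. It is an illustration under assumptions, not the implementation used in this study: the function names are hypothetical, the hyperparameters are set by hand instead of being estimated by maximum likelihood, and plain linear basis functions in the uniform input space are used for brevity.

```python
import numpy as np


def rbf_kernel(A, B, theta):
    """Anisotropic radial basis function kernel: theta[0] is the signal
    variance, theta[1:] are the per-dimension length scales."""
    diff = (A[:, None, :] - B[None, :, :]) / theta[1:]
    return theta[0] * np.exp(-np.sum(diff**2, axis=2))


def basis(X):
    """Linear basis functions stacked as rows: shape (q, n), q = p + 1."""
    return np.vstack([np.ones(len(X)), X.T])


def uk_predict(X, y, Xs, theta, sigma_n2):
    """Prediction mean and variance of universal kriging at points Xs."""
    K = rbf_kernel(X, X, theta)
    Ks = rbf_kernel(X, Xs, theta)              # shape (n, l)
    Kss = rbf_kernel(Xs, Xs, theta)            # shape (l, l)
    H, Hs = basis(X), basis(Xs)                # shapes (q, n), (q, l)
    Ky = K + sigma_n2 * np.eye(len(X))         # kernel matrix plus nugget
    Ky_inv_Ks = np.linalg.solve(Ky, Ks)
    A = H @ np.linalg.solve(Ky, H.T)           # H Ky^-1 H^T, shape (q, q)
    mu = np.linalg.solve(A, H @ np.linalg.solve(Ky, y))  # trend coefficients
    R = Hs - H @ Ky_inv_Ks
    mean = Hs.T @ mu + Ks.T @ np.linalg.solve(Ky, y - H.T @ mu)
    cov = Kss - Ks.T @ Ky_inv_Ks + R.T @ np.linalg.solve(A, R)
    return mean, np.diag(cov)


# Toy data with an exactly linear response, so the trend captures it fully.
rng = np.random.default_rng(1)
X = rng.random((20, 2))
y = 1.0 + 2.0 * X[:, 0] - X[:, 1]
Xs = rng.random((5, 2))
theta = np.array([1.0, 0.5, 0.5])   # hand-picked, not optimized
mean, var = uk_predict(X, y, Xs, theta, sigma_n2=1e-2)
```

Because the toy response is exactly linear, the estimated trend coefficients reproduce it and the Gaussian process part contributes nothing to the mean; for a nonlinear response the second term of the mean corrects the trend locally.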

The selection of basis functions for universal kriging is a crucial step because prediction accuracy of the surrogate model may strongly depend on it. Oakley (2004) emphasizes that the basis functions should be chosen to incorporate any beliefs regarding the problem, e.g., the physical evolution of the output variable depending on the input parameters. When applying universal kriging, low-order polynomials are usually used, which can often approximate physical relationships relatively well for small input parameter ranges. In addition, compared to higher-order polynomials, the risk of overfitting can be reduced.

In our work, we consider transformed input spaces. Fischer and Proppe (2023) suggested using transformed basis functions to account for the input space transformation. In a general case with correlated input parameters, the Rosenblatt transformation (Rosenblatt, 1952) can be used. However, in our case we consider uncorrelated input parameters. Thus, the Rosenblatt transformation can be expressed as quantile functions (inverse cumulative distribution functions) with respect to the physical basis function of the individual model parameters (see Fischer and Proppe, 2023, for more details).

In our study, we assume linear basis functions in the physical input space, which are transformed into the i.i.d. uniform space by the Rosenblatt transformation. Assuming linear basis functions in the physical input space is considered reasonable, since most parameter ranges are relatively small compared to their absolute values, and linear relationships may be sufficient in order to represent a global trend. Furthermore, quadratic basis functions would in a general case imply the inclusion of $n_p = p(p+1)/2 + p + 1$ basis functions, which would yield $n_p = 28$ basis functions for $p = 6$ input parameters. This number is relatively high compared to the number of training points of $n = 60$. Here, $n$ is desired to be at least 2 to 3 times higher than the number of polynomial basis functions $n_p$ to achieve sufficiently small generalization errors (see, e.g., the work by Sudret, 2014, within the context of polynomial chaos expansion). The number of basis functions does not change through the transformation. Thus, we presume the linear basis function
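The basis-function count can be checked with a few lines of arithmetic. The helper names below are hypothetical; the numbers follow directly from the formula and the values $p = 6$ and $n = 60$ given in the text.

```python
def n_quadratic_basis(p):
    """Constant + p linear terms + p(p+1)/2 second-order terms."""
    return p * (p + 1) // 2 + p + 1

def n_linear_basis(p):
    """Constant + p linear terms."""
    return p + 1

p, n_train = 6, 60
for name, n_p in [("linear", n_linear_basis(p)), ("quadratic", n_quadratic_basis(p))]:
    print(f"{name}: n_p = {n_p}, n / n_p = {n_train / n_p:.2f}")
```

With a linear basis the ratio of training points to basis functions is about 8.6, whereas a full quadratic basis would bring it down to about 2.1.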

(5) $\tilde{h}(\tilde{x}) = \left(1, \tilde{x}_1, \tilde{x}_2, \dots, \tilde{x}_p\right)$

with respect to the physical input parameters $\tilde{x}_i$. By applying the Rosenblatt transformation (Fischer and Proppe, 2023), we obtain a transformed basis function $h(x) = \tilde{h}(T_\mathrm{ros}^{-1}(x))$ with respect to the i.i.d. uniform input parameters $x_i$. For the independent physical parameters $\tilde{x}_i$, the vector of basis functions from Eq. (5) becomes

(6) $h(x) = \left(1, \mathrm{CDF}^{-1}(x_1), \mathrm{CDF}^{-1}(x_2), \dots, \mathrm{CDF}^{-1}(x_p)\right)$,

where $\mathrm{CDF}^{-1}$ is the inverse cumulative distribution function (or quantile function) of the respective model parameter. This expression is used in Eq. (1) and subsequent equations.
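The transformed basis of Eq. (6) can be sketched with the Python standard library. The two normal marginals below are hypothetical stand-ins chosen only because their quantile function is readily available; they are not the parameter distributions used in this study.

```python
from statistics import NormalDist

def transformed_basis(x, quantile_functions):
    """Eq. (6): a constant term followed by the quantile function of each
    marginal, evaluated at the uniform-space coordinate x_i in (0, 1)."""
    return [1.0] + [q(x_i) for q, x_i in zip(quantile_functions, x)]

# Hypothetical marginals for two physical parameters (assumption for illustration).
marginals = [NormalDist(mu=2.0, sigma=0.2), NormalDist(mu=1.25, sigma=0.1)]
quantiles = [m.inv_cdf for m in marginals]

h_center = transformed_basis([0.5, 0.5], quantiles)   # medians of both marginals
h_upper = transformed_basis([0.9, 0.5], quantiles)    # first parameter in its upper tail
print(h_center, h_upper)
```

A point at the center of the uniform space maps to the medians of the physical parameters, so a trend that is linear in the physical parameters becomes nonlinear in the uniform coordinates wherever the marginal is non-uniform.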

2.2.3 Model validation

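Obtained surrogate models need to be validated to assess their accuracy, which depends on various factors, e.g., the number of training points, the choice of basis functions and nonlinearities in the physical model. As validation criteria for a surrogate model $\hat{y}$, the root mean squared error (RMSE) and the normalized mean squared error (NMSE) are used as follows:

$\mathrm{RMSE} = \sqrt{\frac{1}{n_\mathrm{val}} \sum_{i=1}^{n_\mathrm{val}} \left(y_{\mathrm{val},i} - \hat{y}(x_{\mathrm{val},i})\right)^2}$,

$\mathrm{NMSE} = \frac{1}{n_\mathrm{val}\, \sigma_{y_\mathrm{val}}^2} \sum_{i=1}^{n_\mathrm{val}} \left(y_{\mathrm{val},i} - \hat{y}(x_{\mathrm{val},i})\right)^2$.

Here, $\{x_{\mathrm{val},i}, y_{\mathrm{val},i}\}$, $i = 1, \dots, n_\mathrm{val}$, is a set of $n_\mathrm{val}$ validation points obtained from evaluations of the numerical model, and $\sigma_{y_\mathrm{val}}^2$ is the variance of the evaluations $y_{\mathrm{val},i}$. While the RMSE offers insight into the error in absolute values, the NMSE is a dimensionless measure which allows for better comparison between the QoIs.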
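The two criteria can be written as short functions. This is a sketch using only the standard library; the population variance is used for the normalization, which is an assumption, and the numbers are made up for illustration.

```python
from math import sqrt
from statistics import pvariance

def rmse(y_val, y_pred):
    """Root mean squared error between validation data and predictions."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(y_val, y_pred)) / len(y_val))

def nmse(y_val, y_pred):
    """Mean squared error normalized by the variance of the validation data."""
    mse = sum((a - b) ** 2 for a, b in zip(y_val, y_pred)) / len(y_val)
    return mse / pvariance(y_val)

y_val = [1.0, 2.0, 3.0, 4.0]      # illustrative model evaluations
y_pred = [1.1, 1.9, 3.2, 3.8]     # illustrative surrogate predictions
print(rmse(y_val, y_pred), nmse(y_val, y_pred))

# A predictor that always returns the data mean has NMSE = 1 by construction.
mean_pred = [sum(y_val) / len(y_val)] * len(y_val)
print(nmse(y_val, mean_pred))
```

The second call shows why an NMSE near 1 indicates a poor surrogate: it is the error of a model that ignores the inputs entirely.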

Because of high computation cost, using a separate validation set that is not used for model training is not effective. Therefore, cross-validation techniques, such as leave-one-out validation or leave-k-out validation, can be applied. The validation errors for leave-one-out validation can be formulated as

$\mathrm{RMSE}_\mathrm{LOO} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_{-i}(x_i)\right)^2}$,

$\mathrm{NMSE}_\mathrm{LOO} = \frac{1}{n\, \sigma_y^2} \sum_{i=1}^{n} \left(y_i - \hat{y}_{-i}(x_i)\right)^2$,

where $\hat{y}_{-i}$ is the surrogate model obtained when using all $n$ training points except for the $i$th one, and $\sigma_y^2$ is the variance of evaluations $y_i$. We use leave-k-out cross-validation with $k = 2$ as a compromise between validation accuracy and computation time. We emphasize that it is important to use generalization errors instead of measures for goodness of fit such as the coefficient of determination $R^2$, since the latter do not account for the effect of overfitting.
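A leave-k-out loop can be sketched as follows. Two points are assumptions of this sketch rather than statements about the study: the held-out groups are taken as disjoint consecutive blocks of k points, and a one-dimensional least-squares line (the hypothetical `fit_line`) stands in for the kriging surrogate.

```python
from math import sqrt
from statistics import pvariance

def fit_line(xs, ys):
    """Least-squares line; a simple stand-in for the surrogate model."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def leave_k_out(xs, ys, fit, k=2):
    """Return (RMSE, NMSE) from refitting with disjoint blocks of k points held out."""
    n = len(xs)
    sq_err = 0.0
    for start in range(0, n, k):
        held = set(range(start, min(start + k, n)))
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        model = fit(train_x, train_y)            # trained without the held-out points
        sq_err += sum((ys[i] - model(xs[i])) ** 2 for i in held)
    mse = sq_err / n
    return sqrt(mse), mse / pvariance(ys)

xs = [i / 59 for i in range(60)]
ys = [1.0 + 3.0 * x + 0.1 * (-1) ** i for i, x in enumerate(xs)]   # line plus small perturbation
print(leave_k_out(xs, ys, fit_line, k=2))
```

With k = 2 and 60 training points this requires 30 refits instead of 60 for leave-one-out, which illustrates the trade-off between validation accuracy and computation time.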

Model accuracy is considered to be high if NMSE values are close to 0 and low if NMSE values are close to 1. By definition, values are non-negative and should not exceed 1, as the covariance between the surrogate model and data would in that case be higher than the variance of the data itself. Interpretation of the NMSE could become problematic if QoI values do not substantially change and the variance $\sigma_{y_\mathrm{val}}^2$ is very small. In such cases, the RMSE should be considered.

2.2.4 Global sensitivity analysis

In order to quantify the relative magnitude of the dependency of the QoIs on the uncertain model parameters, global sensitivity analysis is used. We apply FAST (Saltelli et al., 1999), a variance-based sensitivity analysis, where main effect and total effect sensitivity indices can be determined. This method has been used in many meteorological studies (e.g., Massoud, 2019; Fletcher et al., 2018). Main effect indices indicate the contribution to the output variance of varying one model parameter alone, averaged over variations in other model parameters. Total effect indices indicate the contribution to the output variance of one model parameter, including all variance caused by its interactions with other model parameters. Comparison between main and total effect indices provides information on the extent to which the interactions between the model parameters contribute to the variations in the QoIs.
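To make the difference between main and total effect indices concrete, the sketch below estimates both for a toy function with a pure interaction. Note that it does not implement FAST: it uses a Monte Carlo pick-and-freeze estimator of the same variance-based indices, with independent uniform inputs and a test function $f(x) = x_1 x_2$ chosen for illustration. For this function the exact values are 3/7 for each main effect and 4/7 for each total effect.

```python
import random

def sobol_indices(f, p, n_samples=50_000, seed=0):
    """Monte Carlo estimates of main and total effect indices for
    independent inputs that are uniform on [0, 1]."""
    rng = random.Random(seed)
    A = [[rng.random() for _ in range(p)] for _ in range(n_samples)]
    B = [[rng.random() for _ in range(p)] for _ in range(n_samples)]
    fA = [f(a) for a in A]
    fB = [f(b) for b in B]
    both = fA + fB
    mean = sum(both) / len(both)
    var = sum((v - mean) ** 2 for v in both) / len(both)
    main, total = [], []
    for i in range(p):
        # Rows of A with column i taken from B: only input i is resampled.
        fAB = [f(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        v_main = sum(fb * (fab - fa) for fa, fb, fab in zip(fA, fB, fAB)) / n_samples
        v_total = sum((fa - fab) ** 2 for fa, fab in zip(fA, fAB)) / (2 * n_samples)
        main.append(v_main / var)
        total.append(v_total / var)
    return main, total

main, total = sobol_indices(lambda x: x[0] * x[1], p=2)
print(main, total)
```

The gap between the total and the main effect of an input is the share of variance that arises only through its interactions, which is the quantity compared across parameters in the analysis.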

2.3 The ICON model

2.3.1 Model setup

The Icosahedral Nonhydrostatic (ICON) model (Zängl et al., 2015), the operational forecast system of the DWD, is used here as the full-physics numerical model to simulate the WAM. For this purpose, we employ the 2.5.0 model version in a limited-area nested configuration, where a 26 km grid spacing for the outer region and a 13 km grid spacing for the inner region are used. The outer area extends from 28° W to 34° E and from 10° S to 34° N with the nested domain 2° smaller in each direction as shown in Fig. 2. At the outer boundary, ERA5 reanalysis data (Hersbach et al., 2020) are used. ERA5 data are available hourly but are updated every 6 h in our simulations to limit the amount of data and computation cost. Apart from this, the model setup, including all namelist parameters, is based on the configuration used in the operational global setup by the DWD. Pante and Knippertz (2019) have already obtained reasonable simulation results for the West African region with a similar model setup, although the convection parameterization turned out to be problematic for precipitation forecasting. To separate sensitivities that are related to model parameters from weather noise and to reduce the influence of initial conditions, a sufficiently long simulation period is required. At the same time we aim to fully concentrate on the peak of the WAM in boreal summer. To account for both points, we concentrate on August data from the 4 years between 2016 and 2019. Each simulation starts on 21 July and is run for 41 d, but only the data from 1 to 31 August are analyzed in order to reduce initial condition influence. First tests showed that by using simulations from a single year, fluctuations in the considered QoIs were still relatively high, reflecting aleatoric uncertainties caused by small-scale chaotic behavior of the atmosphere. In order to obtain a more robust measure while keeping computational cost manageable, studying rainy seasons in 4 years turned out to be a good compromise.
All QoIs are thus averaged over these four August periods and used as training points for the surrogate models. Preliminary studies with only a single August month were still relatively volatile, with large uncertainties in the surrogate model, which could be strongly reduced by including 4 months. Validation of the surrogate model is essential to ensure that averaging over 4 years does not lead to non-realistic behavior in the average. The results of this validation should demonstrate a robust signal, indicating that the process results in a smoothing of individual signals rather than their cancellation. Using data from 4 different years also has the advantage of representing different states of SSTs, which are prescribed as boundary conditions and are based on the SST analysis at model initialization time. During the simulation the SST is updated incrementally based on its annual climatological cycle (Reinert et al., 2019).

Figure 2. ICON model setup: outer domain with 26 km grid spacing (green), inner domain with 13 km grid spacing (brown) and domain for which output data are stored (blue).

2.3.2 Selected model output

Simulation data are stored with a horizontal resolution of 0.1° within the region from 0 to 25° N and 15° W to 15° E (see Fig. 2) from 1 to 31 August of the years 2016, 2017, 2018 and 2019. This region and time range are hereafter denoted as study region and study time. The selected model outputs that represent key characteristics of the WAM are listed below. The temporal resolution of the recorded data is specified with the outputs. Notably, a finer resolution is applied to cloud cover data to accurately capture its anticipated higher variability:

  • cloud cover at high (>7 km), middle (2–7 km) and low (<2 km above ground) levels (%), 1-hourly;

  • column-integrated water vapor (kg m−2), 3-hourly;

  • precipitation (mm per month), 3-hourly;

  • 2 m temperature (K), 3-hourly;

  • 2 m dew point temperature (K), 3-hourly;

  • mean sea level pressure (Pa), 3-hourly;

  • u and v wind at pressure levels 200 and 600 hPa (m s−1), 3-hourly.

The data necessitate approximately 475 GB of storage space.

The output quantities from the ICON simulations are validated. In Sect. 2.2.3, validation of surrogate models was introduced to assess their accuracy based on given ICON model simulations, ignoring that the ICON simulations do not represent the true state of the atmosphere. For the validation in this section, the ICON model output is compared to our best estimate of the true state of the atmosphere averaged over the simulated time taken from ERA5 data. For precipitation, GPM IMERG data (Huffman et al., 2019) are additionally included as reference. For this purpose, the ICON model output (horizontal resolution of 0.1°), native ERA5 data (horizontal resolution of 0.25°) and native GPM IMERG data (horizontal resolution of 0.1°) are linearly remapped on a rectangular grid with a mesh size of 0.5°. Similar to Sect. 2.2.3, the measures

$\mathrm{RMSE} = \sqrt{\frac{1}{n_\mathrm{val}} \sum_{i=1}^{n_\mathrm{val}} \left(y_i - y_{\mathrm{val},i}\right)^2}$,

$\mathrm{NMSE} = \frac{1}{n_\mathrm{val}\, \sigma_y^2} \sum_{i=1}^{n_\mathrm{val}} \left(y_i - y_{\mathrm{val},i}\right)^2$

are used as validation criteria. Here, $y_i$ is the averaged output value over all ICON simulations, and $y_{\mathrm{val},i}$ is corresponding ERA5 (or GPM IMERG) data at locations $i = 1, \dots, n_\mathrm{val}$, where $n_\mathrm{val}$ is the number of grid points of the remapped grid. For the NMSE, the mean squared error is normalized with respect to the spatial variance of data $\sigma_y^2$ over the considered domain. We emphasize that the spatial variance strongly depends on the considered domain. However, it is still used to obtain dimensionless reference values instead of using the temporal variance because the latter may become 0 in certain regions (e.g., precipitation or cloud cover in the Saharan region), which causes the measure to fail.

2.4 Quantities of interest

In this section, we describe the QoIs we selected to characterize the WAM and explain how these quantities are determined from the ICON model output. The results for all QoIs are averaged over the study time (1 to 31 August of the years 2016, 2017, 2018 and 2019) using all data with the temporal resolution given in Sect. 2.3.2. A schematic illustration of the monsoon system is given in Fig. 1.

2.4.1 Accumulated precipitation (mm per month)

The accumulated precipitation fields are computed and averaged over the study region to obtain one scalar value representing the overall precipitation.

Figure 3. Illustrations of selected QoI computations: (a) accumulated precipitation used for the determination of the precipitation center latitude, (b) u wind at 600 hPa used for the determination of the AEJ latitude, (c) 2 m dew point temperature used for the determination of the ITD latitude based on a threshold value of 14 °C and (d) mean sea level pressure used for the determination of the southern boundary of the SHL based on a threshold value of 1009 hPa. All scales are linear.

2.4.2 Precipitation latitude (° N)

The latitude of the rain belt is determined to investigate the potential influence of model parameters on a north–south shift in the average precipitation. For this purpose, the latitudinal center of the accumulated precipitation is computed in the study region between 12° W and 2° E (Fig. 3a) as a weighted average where the accumulated precipitation at each grid point is taken as the weight for its latitude. The longitudinal range is chosen such that distinct topographical effects (Guinea highlands to the west, Cameroon line and wet Niger Delta to the east) are reduced, so the influence of model parameters on the precipitation distribution becomes more evident. The range is fixed for all simulations to ensure comparability of the results.
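The weighted average described above can be sketched in a few lines. The grid and precipitation values are made up for illustration, and area weighting of grid cells is ignored, which is an assumption of this sketch.

```python
def weighted_center_latitude(lats, field):
    """Latitude center of a field, using the field value at each grid point
    as the weight of that point's latitude.

    field[j][i] is the accumulated precipitation at latitude lats[j]
    and longitude index i."""
    weighted_sum = sum(lat * sum(row) for lat, row in zip(lats, field))
    total = sum(sum(row) for row in field)
    return weighted_sum / total

lats = [5.0, 10.0, 15.0]                       # degrees north (illustrative)
precip = [[100.0, 120.0],                      # mm per month (illustrative)
          [300.0, 280.0],
          [100.0, 80.0]]
print(weighted_center_latitude(lats, precip))
```

Shifting rainfall from the southern to the northern rows moves the result northward continuously, which is what makes the measure suitable for detecting small meridional displacements of the rain belt.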

2.4.3 Column-integrated water vapor (kg m−2)

The averaged column-integrated water vapor over the study region is computed.

2.4.4 Cloud cover (high, middle, low) (%)

The averaged cloud cover at high, middle and low levels over the study region is computed.

2.4.5 Jet latitude (° N)

TEJ and AEJ are the main zonal wind features of the WAM system. They are characterized by the same measures applied at different pressure levels (200 hPa for TEJ, 600 hPa for AEJ). The averaged latitudes of the jet streams are considered in order to investigate the potential influence of model parameters on the north–south shift of the jet streams. Computing only the latitude of the maximum zonal wind speeds turned out to be a non-robust measure, since it is very sensitive and likely to fluctuate for small parameter changes due to the chaotic nature of the atmosphere. Therefore, we introduce a more robust measure that takes the neighboring latitudes into account to compute an averaged latitude of maximum zonal wind speeds. First, we compute the averaged zonal wind speeds for each latitude on the grid. This step simplifies the data to a manageable form with one average wind speed value for each latitude on the grid. The distribution of these wind speeds is still relatively flat, making it difficult to robustly determine the latitude of maximum wind speed. We thus employ a strategy where we exponentiate the average wind values (here, an exponent of 3 yielded useful results) to assign higher weight to the highest values and to reduce the influence of values far from the jet center which are still relatively high (Figs. 6.5 and 6.6). Finally, we determine the weighted average of latitudes analogously to the precipitation center (Fig. 3b). Without the exponentiation strategy the chosen latitudinal range for the analysis would have a major effect on the result, which should be avoided.
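The exponentiation strategy can be illustrated with a small sketch. Taking the absolute value of the mean zonal wind (both jets are easterlies, so $u$ is negative) is an assumption of this sketch, and the wind profile is made up for illustration.

```python
def jet_latitude(lats, u, exponent=3):
    """Weighted mean latitude of the jet.

    u[j] holds the zonal wind values along latitude lats[j]. The magnitude of
    the zonal-mean wind per latitude is raised to `exponent` and used as weight."""
    mean_speed = [abs(sum(row) / len(row)) for row in u]
    weights = [s ** exponent for s in mean_speed]
    return sum(w * lat for w, lat in zip(weights, lats)) / sum(weights)

lats = [5.0, 10.0, 15.0, 20.0, 25.0]
# Illustrative easterly profile with a flat maximum near 15-20 degrees north (m/s).
u = [[-6.0, -6.0], [-8.0, -8.0], [-11.0, -11.0], [-9.0, -9.0], [-6.0, -6.0]]
print(jet_latitude(lats, u, exponent=1), jet_latitude(lats, u, exponent=3))
```

With an exponent of 1 the weak winds at the edges of the latitude range carry a large share of the weight, so the result depends on where the range is cut off. The higher exponent concentrates the weight near the jet core.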

2.4.6 Jet speed (m s−1)

The jet speed is determined by averaging the zonal wind along the obtained jet latitude. It should be noted that the maximum jet speed at instantaneous points in time is much higher, since we consider the average of wind speeds over time for the sake of a greater robustness.

2.4.7 ITD latitude (° N)

The ITD indicates the location where dry northeasterly winds from the Sahara and moist southwesterly winds from the tropical Atlantic Ocean meet. The ITD is characterized by a marked jump in moisture content near the surface. We use the 14 °C 2 m dew point temperature as a measure for the ITD latitude (Fink et al., 2017). The average over latitude values between 12° W and 8° E is used as shown in Fig. 3c. Compared to the longitudinal range for the precipitation latitude, a broader range can be employed here for the sake of a more robust analysis, as the ITD behaves relatively steadily.
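A threshold-based latitude of this kind can be sketched as below. The details are assumptions of this sketch: the profile is scanned from south to north for each longitude, the first downward crossing of the threshold is located by linear interpolation, and longitudes without a crossing are skipped. The dew point values are illustrative.

```python
def threshold_latitude(lats, column, threshold):
    """Latitude where a south-to-north profile first drops below the threshold,
    found by linear interpolation between neighboring grid points."""
    for j in range(len(lats) - 1):
        south, north = column[j], column[j + 1]
        if south >= threshold > north:
            frac = (south - threshold) / (south - north)
            return lats[j] + frac * (lats[j + 1] - lats[j])
    return None                                  # threshold not crossed

def itd_latitude(lats, dew_point, threshold=287.15):
    """Mean crossing latitude over all longitudes; dew_point[j][i] in K,
    threshold 287.15 K corresponds to 14 degrees Celsius."""
    crossings = []
    for i in range(len(dew_point[0])):
        column = [row[i] for row in dew_point]
        lat = threshold_latitude(lats, column, threshold)
        if lat is not None:
            crossings.append(lat)
    return sum(crossings) / len(crossings)

lats = [10.0, 15.0, 20.0, 25.0]
dew_point = [[295.0, 294.0],
             [290.0, 289.0],
             [285.0, 284.0],
             [280.0, 279.0]]
print(itd_latitude(lats, dew_point))
```

The same helper could in principle be applied to a pressure field with a different threshold, since it only depends on a monotonic crossing along each meridian.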

2.4.8 SHL strength (Pa)

The characteristic heat low in the region of the Sahara, one of the main drivers of the WAM, is characterized by its strength. For this purpose, the average pressure field is determined within the region from 15 to 25° N and 15° W to 5° E, where the heat low is expected on the basis of climatological results (Lavaysse et al., 2009), for each August month. The SHL strength is characterized by the average of the 10 % lowest mean sea level pressure (MSLP) values within this region.
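The 10 % criterion amounts to averaging the lower tail of the sorted pressure values. The sketch below assumes the grid-point values of the region have already been flattened into one list, and uses illustrative numbers.

```python
def shl_strength(mslp_values, fraction=0.10):
    """Mean of the lowest `fraction` of MSLP values (Pa) within the SHL region."""
    ordered = sorted(mslp_values)
    n_low = max(1, round(fraction * len(ordered)))   # at least one value
    return sum(ordered[:n_low]) / n_low

# Twenty illustrative grid-point values between 1006.0 and 1007.9 hPa, given in Pa.
mslp = [100600.0 + 10.0 * i for i in range(20)]
print(shl_strength(mslp))
```

Averaging the lowest 10 % rather than taking the single minimum makes the measure less sensitive to an isolated grid point.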

2.4.9 SHL latitude (° N)

The latitude of the SHL is characterized by the southern boundary of the SHL region based on a MSLP threshold of 1009 hPa between 9 and 1° W as shown in Fig. 3d. Simulations for the investigated years indicate that the given threshold value and longitudinal range are robust measures. A broader latitude range or a higher threshold value would potentially lead to a situation where the threshold value is not reached anymore for certain longitudes, and the characterization would not be meaningful anymore. The latitudes are computed for each monthly averaged pressure field, and resulting latitudes of the 4 months are averaged.

2.4.10 2 m temperature (K)

The average 2 m temperature over the whole study region is computed.

2.4.11 2 m dew point temperature (K)

The average 2 m dew point temperature over the whole study region is computed.

2.5 Local parameter studies

In Sect. 2.2, surrogate modeling methods for investigating the dependency between uncertain model parameters and QoIs are introduced. However, since QoIs are defined as single values according to Sect. 2.4, information on spatial variability is lost for the benefit of robust analysis. In this section, we therefore discuss an alternative approach to bring out the influence of uncertain model parameters on the geographical distribution of the chosen output quantities (Sect. 2.3.2). For this purpose, all training points from the experimental design (Sect. 2.2.1) are used, but for each model parameter $i$ the training points with the 25 % lowest and the 25 % highest $x_i$ values are selected. Let these sets for each model parameter $i$ be $X_{i,\mathrm{low}}$ and $X_{i,\mathrm{high}}$. For example, the set $X_{\mathrm{entrorg},\mathrm{low}}$ includes the 15 training points (25 % of 60 training points) with the lowest values of entrorg within the experimental design. For each training point, an ICON model simulation has been conducted, and output fields are available. These output fields are averaged over the whole evaluation period (4 August months). Furthermore, they are then averaged over the $X_{i,\mathrm{low}}$ and $X_{i,\mathrm{high}}$ sets. As a result, the average spatial output data are obtained for low and for high values of the considered uncertain model parameters. For example, using the $X_{\mathrm{entrorg},\mathrm{low}}$ set, all average output fields for low entrorg values are computed. Finally, the averaged field for low values ($X_{i,\mathrm{low}}$) is subtracted from the averaged field for high values ($X_{i,\mathrm{high}}$) to obtain a spatial variability field. The variability plot indicates in which regions the output value becomes higher or lower for an increase in model parameter $i$.
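The selection and differencing steps can be sketched as follows. The function name, the nested-list field layout and the synthetic fields are assumptions for illustration; the sign convention (high minus low) follows the interpretation that positive values indicate an increase of the output with the parameter.

```python
def variability_field(param_values, fields, fraction=0.25):
    """Difference between the mean output field of the training points with the
    highest and with the lowest values of one model parameter.

    param_values[k] is the parameter value of training point k and
    fields[k][j][i] its time-averaged output field."""
    n = len(param_values)
    n_sel = int(fraction * n)
    order = sorted(range(n), key=lambda k: param_values[k])
    low, high = order[:n_sel], order[-n_sel:]
    ny, nx = len(fields[0]), len(fields[0][0])

    def mean_field(indices):
        return [[sum(fields[k][j][i] for k in indices) / len(indices)
                 for i in range(nx)] for j in range(ny)]

    mean_low, mean_high = mean_field(low), mean_field(high)
    return [[h - l for h, l in zip(row_h, row_l)]
            for row_h, row_l in zip(mean_high, mean_low)]

# 60 synthetic training points on a 1 x 2 grid: the first grid point increases
# with the parameter, the second one decreases.
params = [k / 59 for k in range(60)]
fields = [[[2.0 * x, -x]] for x in params]
print(variability_field(params, fields))
```

With 60 training points each set contains 15 members, matching the example in the text.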

This procedure is applied to all combinations of model parameters and available output data. To quantify the significance of such investigations, a Kruskal–Wallis test (Kruskal and Wallis, 1952) is performed. Since we cannot assume normally distributed data due to selecting 25 % of training points from the tails of the distributions, a standard t test is not applicable. For the Kruskal–Wallis test the difference between the data of the $X_{i,\mathrm{low}}$ and $X_{i,\mathrm{high}}$ sets from a zero field is considered. Statistical tests are conducted for every grid point, and results are averaged over the whole region to include a sufficient amount of data. The test indicates whether there are significant signals in the variability fields other than random noise.
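For orientation, the rank-based statistic behind the Kruskal–Wallis test can be sketched for two groups at a single grid point. This is a simplified illustration, not the exact test setup of the study: it compares the low and high sets directly, omits the tie correction, and uses the fact that for two groups the statistic is referred to a chi-square distribution with one degree of freedom. In practice a library routine such as `scipy.stats.kruskal` would normally be used.

```python
from math import erfc, sqrt

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = shared
        pos = end + 1
    return ranks

def kruskal_wallis_two_groups(low, high):
    """Return (H, p) for two samples; no tie correction is applied."""
    ranks = average_ranks(list(low) + list(high))
    n_low, n_high = len(low), len(high)
    n = n_low + n_high
    r_low, r_high = sum(ranks[:n_low]), sum(ranks[n_low:])
    h = 12.0 / (n * (n + 1)) * (r_low**2 / n_low + r_high**2 / n_high) - 3 * (n + 1)
    p = erfc(sqrt(max(h, 0.0) / 2.0))    # chi-square survival function, 1 dof
    return h, p

low = [float(v) for v in range(1, 16)]     # illustrative values at one grid point
high = [float(v) for v in range(16, 31)]
print(kruskal_wallis_two_groups(low, high))
```

Repeating this for every grid point and averaging the resulting p values gives a single number per variability field.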

As a reference, the mean field plots can be obtained for each meteorological variable by averaging output data obtained from all available training points. These reference plots together with the variability plots can then serve as a basis for interpretation of regional influences of model parameters.

3 Results

3.1 Model validation

Validation is an essential step before discussing the results of the conducted studies. It offers insight into how informative and significant the analysis of this work is. We conducted validation for the outputs of the ICON model simulations (see Sect. 2.3.2) and for the obtained surrogate models (see Sect. 2.2.3).

The validation results for the averaged ICON model outputs with respect to ERA5 data – and additionally GPM IMERG data for precipitation – are shown in Table 2. For the purpose of validation, the average over the 4 August months on the 0.5° grid is used, which should represent the climatological spatial distribution. The RMSE for precipitation is 47.7 (62.3) mm per month for GPM IMERG (ERA5), which corresponds to 15.8 % (19.4 %) in NMSE. An inspection of the spatial distribution shows that the differences are mostly due to wetter conditions along the rainy southwestern coast of West Africa and the Niger Delta region in ERA5 (not shown). Differences in cloudiness are also substantial. While high clouds agree best with 7.6 % RMSE, low- and mid-level cloud cover is substantially higher in ICON with RMSEs of 15.5 % and 6 %, respectively. These correspond to NMSEs of 46.9 % and 24.1 %, indicating substantial disagreement. Low clouds over tropical West Africa are controlled by a subtle balance of advective, radiative and turbulent processes (Lohou et al., 2020), and differences between models tend to be large (Hannak et al., 2017). Cloud cover and precipitation are strongly influenced by model parameterizations, and therefore differences are to be expected. Moreover, ERA5 itself, although much improved compared to earlier products, may still have difficulties with these quantities (Gbode et al., 2023). The other moisture variables, i.e., column-integrated water vapor and 2 m dew point temperature, however, show only minor disagreement, as does MSLP. Differences in 2 m temperature in contrast are larger (1.7 K RMSE and 20.1 % NMSE). This is mostly due to a warmer Sahara in ICON (not shown). Modeling near-surface temperature in deserts is challenging due to the enormous solar heating and turbulent surface sensible heat fluxes, which can lead to superadiabatic lapse rates in the lowest meters of the atmosphere (e.g., Knippertz et al., 2009).
Finally, the four wind variables show good agreement, with the exception of v at 600 hPa (NMSE 26.4 %, which, however, corresponds to an RMSE of only 0.6 m s−1). This is mostly due to stronger northerlies over the Sahara in ERA5 (not shown). These validation results show that the model setup can generally be considered to be valid, even though there are considerable differences in certain quantities. Since in this work sensitivity studies are conducted based on the ICON model alone, perfect agreement of simulation output and ERA5 data is not a requirement. However, for the overall significance of this work, the obtained differences should be taken into account.

Table 2. Validation results for the ICON model outputs with ERA5 data.


For validation of the surrogate models, leave-k-out ($k = 2$) cross-validation is applied to all QoIs individually, since separate surrogate models have been obtained for each QoI. The RMSE and NMSE are shown in Table 3. Errors include both aleatoric uncertainties in weather simulations (which are inevitable due to the chaotic nature of the system) and surrogate model uncertainties. The prediction variance (Eq. 4) is a measure of the uncertainty of the surrogate model. Therefore, large errors do not necessarily mean that a surrogate model with low accuracy was obtained, but it could also mean that high aleatoric uncertainties in this QoI are present. The standard deviation $\sigma_n$ of the Gaussian noise in the regression model (see Sect. 2.2.2) provides an estimate of the aleatoric uncertainty in the ICON data. The results from the maximum likelihood estimation are also given in Table 3. Given that the values of $\sigma_n$ are generally lower but still of a similar magnitude compared to the RMSE, a significant proportion of the observed validation errors may be ascribed to aleatoric uncertainties inherent in the weather model. The error attributed to the uncertainty in the surrogate model is already relatively low but could be further reduced by including more training points or averaging over more data (i.e., more years). A small RMSE (or NMSE) indicates that surrogate model accuracy is high and aleatoric uncertainty is small. Small validation errors are therefore evidence that sensitivity analysis and parameter studies are meaningful. In this study, NMSEs are considered to be small for all QoIs except for the AEJ speed and precipitation latitude. However, the small RMSEs for these quantities indicate that the absolute errors are very small. Since changes in these QoIs are very small (see also Fig. 5), the variance $\sigma_y^2$ of data used for the normalization in Eq. (12) becomes very small, too, and NMSE values become larger.
Thus, large NMSE values in these cases do not affect the overall validity of this study.

Table 3. Validation results for the surrogate models of all QoIs with cross-validation.


Figure 4. Main and total effect sensitivity indices of the six selected uncertain model parameters for all QoIs, resulting from the global sensitivity analysis FAST.


3.2 Global sensitivity analysis

The results of the global sensitivity analysis are shown in Fig. 4. For each QoI, the bar plots indicate the main and total effect sensitivity indices of the six uncertain model parameters. The results should be considered to be relative contributions to the total variance of each QoI such that comparison of the magnitudes between the different QoIs is not meaningful. A comparison between the absolute uncertainty contributions on the different QoIs is difficult or impossible in any case, as they have different units. Overall, the main and total effect indices do not differ strongly, which indicates that interactions between the parameters are relatively weak. This justifies interpreting influences on QoIs of individual model parameters separately as done in this study. The interactions between the parameters are expected to be larger for broader parameter ranges as nonlinear effects may become more dominant.

Sensitivities of cloud cover (leftmost columns in Fig. 4) are generally dominated by the two deep-cloud parameters, the entrainment rate (entrorg) and the terminal fall velocity of ice (zvz0i). High-level clouds are strongly affected by entrainment, which can prevent convection reaching high levels, in contrast to mid-level clouds, where effects are minor. The fall velocity of ice controls the dissolution of high clouds but also has a dominant effect on mid-level clouds. Low-level clouds are affected by more parameters in a more complex way. As expected, below-cloud and boundary layer parameters have a substantial effect at these altitudes. Particularly, the relative humidity threshold for onset of evaporation below the cloud base (rhebc_land_trop) and the surface area density of the evaporative soil surface (c_soil) dominate the influence on low clouds, whereas deep-cloud parameters only play a minor role (20 % combined).

Column water vapor is mostly influenced by the deep-cloud parameters, similar to high clouds, but the boundary layer parameters also play a minor role. This suggests that this variable is in fact more sensitive to interactions with clouds at middle and high levels than to changes in evaporation and vertical mixing at low levels. Somewhat unexpectedly, 2 m temperature and 2 m dew point temperature are mostly influenced by the deep-cloud parameters, too, with the entrainment rate playing the biggest role. This suggests that these parameters must cause substantial indirect effects outside of the clouds. More obviously, c_soil significantly affects the thermodynamics at the surface level. Precipitation shows the overall most complex response being affected to various degrees by all model parameters except the convective area fraction used for computing evaporation below the cloud base (rcucov_trop). While the impact of deep-cloud parameters is no surprise, there is also a considerable impact from the boundary layer parameters, indicating the importance of low-level moisture for precipitation. The parameter rhebc_land_trop also shows a small influence due to the effect of evaporation below the cloud base on surface rainfall.

The eight rightmost double columns in Fig. 4 show corresponding sensitivities for the AEJ, TEJ and SHL strengths, as well as the latitude of various WAM features. The AEJ speed and position are most sensitive to entrorg followed by zvz0i. This suggests that deep clouds matter most, likely through their effects on baroclinicity and vertical momentum transport. It is interesting to note that the AEJ speed is the only QoI with a considerable contribution from rcucov_trop, possibly due to its location in the relatively dry Sahel, where evaporation below the cloud base is large. The latitudes of the ITD and the AEJ show similar sensitivities, suggesting a relatively tight coupling between the two. The TEJ speed is dominated by zvz0i, as this controls the divergent outflow from convective anvils, which feeds the jet (Lemburg et al., 2019). Interestingly, its position is also sensitive to entrainment and even boundary layer parameters and shows the largest difference between the total and main effect. The strength and latitude of the SHL are most sensitive to entrorg, which is surprising given the absence of deep clouds over most of the Sahara. A potential explanation is that entrainment affects free-tropospheric water vapor content, which is a strong control on longwave cooling in dry regions (Pante and Knippertz, 2019). Finally, the latitude of the precipitation maximum is most sensitive to rhebc_land_trop (65 % contribution) with minor contributions from all other parameters. This behavior is in stark contrast to precipitation amount and essentially all other QoIs shown in Fig. 4. Given the large gradient in absolute and relative humidity across the Sahel, it demonstrates that shifting the onset of subcloud evaporation in the model is a powerful mechanism to shift the entire rain belt meridionally. This result may help explain some of the variability in rain belt position seen in many model intercomparison projects (e.g., Fotso-Nguemo et al., 2017).

3.3 Parameter studies

Results of the parameter studies for the QoIs based on the surrogate models described in Sect. 2.2 and of the local parameter studies described in Sect. 2.5 are discussed here consecutively for the six uncertain model parameters.

As surrogate model predictions depend on all six parameters, the full relationship cannot be visualized graphically. Instead, it is possible to illustrate one-at-a-time changes. Since parameter interactions were shown to be relatively low in Sect. 3.2, such illustrations are meaningful. Figure 5 shows the individual relationships between each model parameter and each QoI, while all other model parameters are set to their mean values which correspond to the ICON default values. Due to the low parameter interactions, choosing different fixed values from the mean values would predominantly result in vertical displacements of the presented curves in our analysis. The prediction variance from the Gaussian process regression (Eq. 4) is indicated by the shaded areas around the curves.

Figure 5. Dependencies of all QoIs (ordinate) with respect to the six selected uncertain model parameters (abscissa). The shaded area around the curves illustrates prediction variance (see Eq. 4). In each plot, only one model parameter is varied, while all other model parameters are set to their mean value. Model parameter PDFs, including their mean value, are shown at the bottom.


Averages of all output variables over all available ICON simulations and the entire evaluation period (Augusts 2016–2019) are shown in Fig. 6. Spatial variability plots for all three groups of model parameters are shown in Figs. 7–9. The idea to compare the 25 % lowest and highest values of the model parameters to investigate the regional dependencies is supported by the fact that changes in QoIs are, if present, monotonic and in some cases even close to linear (see Fig. 5).

Figure 6. Average of selected output fields over the evaluation period (August 2016–2019), averaged over all available ICON simulations. Figure numbers are chosen in accordance with the variables and labels in Figs. 7–9. All scales are linear.

Figure 7. Spatial variability of selected output fields for the uncertain model parameters entrorg and zvz0i. The difference in the output quantity with respect to an increase in the model parameter value based on the $X_{i,\mathrm{low}}$ and $X_{i,\mathrm{high}}$ sets (see Sect. 2.5) is shown. Results of the statistical test are denoted with the following: (**) – statistical significance on a 5 % level, (*) – statistical significance on a 10 % level. All scales are linear.

Figure 8. Spatial variability of selected output fields for the uncertain model parameters rhebc_land_trop and rcucov_trop. The difference in the output quantity with respect to an increase in the model parameter value based on the Xi,low and Xi,high sets (see Sect. 2.5) is shown. Results of the statistical test are denoted with the following: (**) – statistical significance on a 5 % level, (*) – statistical significance on a 10 % level. All scales are linear.

Figure 9. Spatial variability of selected output fields for the uncertain model parameters tkhmin and c_soil. The difference in the output quantity with respect to an increase in the model parameter value based on the Xi,low and Xi,high sets (see Sect. 2.5) is shown. Results of the statistical test are denoted with the following: (**) – statistical significance on a 5 % level, (*) – statistical significance on a 10 % level. All scales are linear.

Results from the statistical Kruskal–Wallis test for the variability fields are denoted in Figs. 7–9. Variability fields are denoted with two asterisks (**) for average p values p<0.05 (statistical significance on a 5 % level) and with one asterisk (*) for average p values 0.05<p<0.10 (statistical significance on a 10 % level). While a 5 % significance level is a common choice, Quinn and Keough (2002) suggest that this threshold should not be rigid and should depend on specific circumstances. For instance, a larger sample size is more likely to yield statistically significant results. Given that our analysis includes only 60 training points, it is considered reasonable to include a less stringent significance level (10 %) as well. However, care must be taken to avoid overconfident statements. Overall, the validation results should be taken into account in the interpretation of local influences of the model parameters. Variability fields for entrorg and zvz0i are much more significant than for the other parameters. Significance is closely related to the sensitivities; i.e., the greater the influence of a model parameter on a QoI (see Fig. 4), the more significant the variability field of the corresponding output quantity is in general.
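The comparison of the lowest and highest 25 % of parameter values and the associated significance test can be sketched as follows. This is a minimal illustration, not the code used in the study: the 60 synthetic samples, the linear toy response and the function names are assumptions, and the Kruskal–Wallis statistic is written out by hand for two groups without tie correction (in practice a library routine such as `scipy.stats.kruskal` would normally be used). For two groups the statistic follows approximately a chi-square distribution with one degree of freedom, so the p value can be computed with the complementary error function.

```python
import math


def quartile_groups(param_values, outputs):
    """Return the outputs belonging to the lowest and highest 25 % of parameter values."""
    order = sorted(range(len(param_values)), key=lambda i: param_values[i])
    k = max(1, len(order) // 4)
    low = [outputs[i] for i in order[:k]]
    high = [outputs[i] for i in order[-k:]]
    return low, high


def kruskal_wallis_two_groups(a, b):
    """Kruskal-Wallis H and p value for two groups (midranks, no tie correction)."""
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    n = len(pooled)
    rank_sums = [0.0, 0.0]
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1                          # extend over a run of tied values
        mid_rank = (i + j) / 2.0 + 1.0      # average 1-based rank of the run
        for k in range(i, j + 1):
            rank_sums[pooled[k][1]] += mid_rank
        i = j + 1
    h = 12.0 / (n * (n + 1)) * (
        rank_sums[0] ** 2 / len(a) + rank_sums[1] ** 2 / len(b)
    ) - 3.0 * (n + 1)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(max(h, 0.0) / 2.0))
    return h, p


# Synthetic stand-in for 60 training points: one parameter, one output value each.
params = [i / 59.0 for i in range(60)]
outputs = [5.0 - 3.0 * x + 0.1 * math.sin(7.0 * i) for i, x in enumerate(params)]

low, high = quartile_groups(params, outputs)
difference = sum(high) / len(high) - sum(low) / len(low)  # change for a larger parameter
h_stat, p_value = kruskal_wallis_two_groups(low, high)
print(difference, h_stat, p_value)
```

Applied to a spatial field, the same two steps would be repeated per grid point, with the resulting p values averaged over the field as described above.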

3.3.1 Deep-cloud parameters

The effect of the investigated deep-cloud parameters, entrainment rate (entrorg) and the terminal fall velocity of ice (zvz0i), is considerably greater for most QoIs than that of other parameters, as evident from Fig. 4 (blue-colored bars) and Fig. 5. Both parameters directly affect cloudy regions only, and thus signals outside the rain belt will to some extent be due to indirect effects.

As shown in Fig. 5, the main effects of a larger entrorg are a decrease in 2 m dew point, column water vapor, high-level cloud cover and precipitation, suggesting an overall drying of the WAM system, which is also accompanied by an increase in 2 m temperature and lower pressure in the SHL. In addition, we see a consistent southward shift of the northern WAM features ITD, SHL boundary and AEJ, while the southern features, precipitation center and TEJ, remain at their latitudes. The strengths of the jets as well as low- and mid-level cloud cover are hardly affected.

Figure 7a (i.e., first and third columns from left) shows the corresponding results on a horizontal map, which are all statistically significant on a 5 % level according to the Kruskal–Wallis test. Increasing entrorg reduces precipitation to the north and south of the rain belt, as expected from Fig. 5, but surprisingly slightly increases precipitation within a narrow strip through the rain belt (Fig. 7a4). We interpret this increase as a concentration of rain in areas where ambient conditions are most suitable, while the higher entrainment suppresses rain in more marginal areas. It is also conceivable that the southward shift of the AEJ (Fig. 5) alters the distribution of low-level wind shear, which is important for convective organization (Fink and Reiner, 2003). Despite the local precipitation increase, high clouds decrease over the entire domain by up to 25 % (Fig. 7a1) but less so over the rain belt, where they maximize climatologically (Fig. 6.1). Nevertheless, this may indicate that weaker convective systems are suppressed and that rainfall is generated more effectively by fewer, more intense systems. A higher entrorg also yields an increase in mid-level clouds in the southeastern parts of the domain (Fig. 7a2), while ocean and western land areas show a slight decrease. It is generally plausible that entrainment reduces convective instability and thus retains clouds at the middle levels in marginally unstable regions, but the reasons for the details of the spatial distribution are not clear. With respect to low-level cloud cover (Fig. 7a3), more entrainment implies widespread reduction over the Sahel, indicating that the northern edge of the low-cloud zone over southern West Africa (see Fig. 6.3) retreats southward, while values over the ocean and coastal areas increase. This shift may be related to the overall southward shift of several WAM features, already discussed in the context of Fig. 5.
The large sensitivity of high-level clouds determines the signal in total cloud cover (not shown).

The column-integrated water vapor (Fig. 7a7) decreases in and around regions with less precipitation and increases (or remains the same) in wetter regions, in particular in the southeast, where we also found increased mid-level clouds (Fig. 7a2). Over the Sahara, the drying is also pronounced at the surface (2 m dew point, Fig. 7a10) but less so farther south. The decrease may be a combination of less rain and evaporation plus a southward-shifted monsoon circulation. The slight increase in the rain belt is probably a direct consequence of more rainfall. The overall reduced cloud cover, precipitation and thus evaporation cause a surface warming over almost the entire land area of the domain (Fig. 7a9), associated with a lower mean sea level pressure due to thermal expansion (Fig. 7a8), the maximum of which is to the south of the climatological SHL center (Fig. 6.8), creating a southward shift. In addition, altered temperature advection associated with the southward shift of the ITD (see Fig. 5) could play a role.

As already pointed out in the discussion of Fig. 5, the sensitivity of the zonal jets to entrorg is less pronounced. The most systematic signal is the clear southward shift in zonal wind at 600 hPa (Fig. 7a6) with a decrease of several meters per second to the south of the climatological axis (Fig. 6.6). In the meridional direction (Fig. 7a12) we see an overall strengthening of the climatological northerlies (Fig. 6.12), indicating a stronger shallow monsoon circulation consistent with the stronger SHL (Fig. 7a8). At the TEJ 200 hPa level the broad climatological easterlies (Fig. 6.5) are slightly weakened by larger entrainment, apart from the southeastern corner of the domain (Fig. 7a5). In the meridional direction (Fig. 7a11), the reduced rainfall over the Guinea Coast is associated with a weakening of the northerly divergent outflow towards the equatorial Atlantic (Fig. 6.11), which likely contributes to a weaker TEJ in the west (Lemburg et al., 2019). At the same time, the outflow into the Northern Hemisphere is slightly enhanced, shifting the relative importance of the two deep monsoonal overturning cells. Given the large decrease in high-level cloud cover (Fig. 7a1), it is also plausible that radiative cooling in the upper troposphere increases (Stubenrauch et al., 2021), which would contribute to a weaker monsoon cell, consistent with a Gill-type circulation response to a decreased off-equatorial heating (Gill, 1980).

Comparing the effect of enhanced entrainment with that of a faster terminal fall velocity of ice, we see many commonalities despite the fundamentally different microphysical processes at play. With respect to the overall effects displayed in Fig. 5, most signals are consistent in sign (and even in magnitude). The most notable differences are a northward shift of the TEJ with higher zvz0i, a weaker impact on the SHL strength, a decrease in mid-level clouds and a smaller impact on the 2 m dew point. Looking at the corresponding horizontal distributions (second and fourth columns in Fig. 7), there is a striking similarity in spatial patterns, too, albeit with some differences in magnitude, for example a stronger signal in high-level cloud cover (Fig. 7b1), which is directly impacted by ice particles, and weaker signals in surface temperature, dew point and mean sea level pressure as well as low-level cloud cover (Fig. 7b3 and b8–b10), where effects can only be indirect. The most striking difference is the absence of an anomalous behavior in the southeastern part of the study domain. Here effects of larger zvz0i are more consistent with other areas, i.e., implying fewer mid-level clouds (faster dissolution), decreased column water vapor and a weaker or unchanged TEJ (Fig. 7b2, b5 and b7). Other changes in the circulation variables are almost identical (compare Fig. 7a6, a11, a12 with 7b6, b11, b12). The impact of a larger zvz0i on precipitation also resembles that for entrorg but with a smaller amplitude.

3.3.2 Below-cloud parameters

The investigated below-cloud parameters, namely the relative humidity threshold for onset of evaporation (rhebc_land_trop) and the convective area fraction used for computing evaporation (rcucov_trop), also affect cloudy areas only, and thus effects outside of the rain belt will largely be indirect. Their impacts on almost all QoIs (Fig. 4, green-colored bars, and Fig. 5) and output fields (Fig. 8) are considerably smaller than for the deep-cloud parameters discussed in the previous subsection. Increased evaporation leads to cooler subcloud layers, resulting in greater negative buoyancy and enhanced lateral acceleration compared to adjacent grid cells, somewhat akin to intensified cold pools. However, the 13 km grid spacing in our experiments may not adequately resolve this process, including new storm triggering by cold pools. Our findings therefore provide only limited insights into the actual significance of cold pools in the monsoon system and the potential benefits of a cold-pool parameterization.

Signals that stand out in Fig. 4 are those for low-level cloud cover and precipitation latitude (both rhebc_land_trop) and to a lesser extent those for precipitation amount, TEJ speed, SHL latitude and intensity (all rhebc_land_trop) and AEJ speed (rcucov_trop). Looking at the dependencies of the QoIs in Fig. 5 reveals that allowing evaporation at higher relative humidity in the model (i.e., increasing rhebc_land_trop) suppresses precipitation and leads to a slight southward shift of the rain belt due to a decrease in precipitation in the Sahel (Fig. 8c4), where cloud bases are higher and where subcloud relative humidity is close to the threshold climatologically. At the same time, there is a widespread increase in MSLP over the northern and central parts of the domain (Fig. 8c8), associated with a weakening and slight northward shift of the SHL (Fig. 5). The increased subcloud evaporation is also associated with more low-level clouds over most inland areas south of the Sahara (Fig. 8c3), but 2 m temperature and dew point do not show significant changes (Fig. 8c9 and c10). The small signal in near-surface temperature could be the result of less surface evaporation due to reduced rainfall and soil moisture compensating for the reduced radiative heating due to more low-level clouds and the increased subcloud evaporative cooling. Changes in temperature advection due to the weaker SHL may play a role, too. Interestingly, increasing rhebc_land_trop also affects high- and mid-level clouds and column water vapor (Fig. 8c1, c2 and c7) but mostly in areas away from the rain belt (i.e., Gulf of Guinea, Mauritania) and with little or no statistical significance. The increases in these areas are consistent with weaker overturning circulations associated with the suppressed precipitation and possibly a redistribution of the moisture left in the atmosphere. 
There are some mild indications of this in the 200 hPa wind signals as well, showing a marginally significant decrease in the northerly outflow over Nigeria (Fig. 8c11) and the strength of the TEJ (Fig. 8c5), while changes at 600 hPa (Fig. 8c6 and c12) are insignificant. For rcucov_trop, the only field that shows significant changes on a 10 % level is MSLP (Fig. 8d8) with a pattern similar to the signal for rhebc_land_trop (Fig. 8c8). In this case, the 2 m temperature decrease over the Sahel (Fig. 8d9) and 2 m dew point increase over the Sahara (Fig. 8d10) are slightly more pronounced but statistically still not significant. All other fields in Fig. 8 show very weak signals, in particular the circulation and precipitation variables, consistent with Figs. 4 and 5.

3.3.3 Boundary layer parameters

Effects of the scaling factor for minimum vertical diffusion for heat and moisture (tkhmin) and the surface area density of the evaporative soil surface (c_soil) are also less prominent than for the deep-cloud parameters (Sect. 3.3.1). The largest sensitivities are found for near-surface QoIs such as low-level cloud cover, 2 m temperature and 2 m dew point but also for integrated quantities like column water vapor and precipitation (Fig. 4, brown-colored bars).

For a higher value of tkhmin, moisture is more effectively transported upwards, leading to an increase in column-integrated water vapor almost everywhere (Fig. 9e7, also evident in Fig. 5). Through the enhanced vertical transport of moisture, high clouds increase quite homogeneously across the domain (Fig. 9e1), while mid-level clouds increase over the rain belt but not by a statistically significant amount (Fig. 9e2). Low-level clouds are reduced consistently over the Gulf of Guinea (Fig. 9e3), where more mixing brings drier air from the mid-troposphere into the boundary layer, supporting cloud dissolution. The mixing of drier air is also evident in lower 2 m dew points in the Sahel, in contrast to higher values in parts of the Sahara, where mixing may bring moister air into the boundary layer (Fig. 9e10). The 2 m temperature (Fig. 9e9) is hardly affected, apart from an increase over the Sahara, where longwave warming due to the higher column moisture and/or mixing of warm air from above the top of the boundary layer inversion may play a role. The enhancement of the vertical exchange of moisture leads to a slight increase in accumulated precipitation (Fig. 5), which, however, is hardly visible in the spatial field (Fig. 9e4). A similar result is found for MSLP, with a slight strengthening of the SHL (Fig. 5) but little signal in the spatial field (Fig. 9e8). The fact that the small mean change in MSLP is statistically significant on a 5 % level suggests that this change is systematic, with little random fluctuation.

In contrast, c_soil directly increases surface evaporation, leading to a significantly higher 2 m dew point (Fig. 9f10) and lower 2 m temperature (Fig. 9f9) almost everywhere over land, which is also clearly visible in the overall dependencies shown in Fig. 5. Column-integrated water vapor (Fig. 9f7) is also enhanced but mostly to the north and south of the rain belt, where increased precipitation (Fig. 9f4; see also Fig. 5) likely removes some of the additional moisture but also slightly (and insignificantly) increases high- but not mid-level clouds (Fig. 9f1 and f2). As the increased surface latent heat fluxes over land areas moisten the boundary layer, low-level cloud cover increases over the southern parts of West Africa (Fig. 9f3), which may further enhance near-surface cooling. This cooling in turn leads to an increased pressure (Fig. 9f8), resulting in a weakening of the SHL and northward shift of the SHL and ITD (Fig. 5).

Both tkhmin and c_soil have a remarkably similar effect on the circulation. Since an increase in the parameters yields stronger convection and more precipitation (Fig. 5), i.e., an overall strengthening of the WAM system, the 200 hPa outflow from the rain belt to the south is enhanced (Fig. 9e11 and f11), and the TEJ is accelerated (Fig. 9e5 and f5). The AEJ, in contrast, is only weakly affected.

4 Conclusions

The aim of this study was to quantify uncertainty contributions of selected uncertain ICON model parameters for a set of QoIs that characterizes the WAM system. Findings should help to improve parameter specifications to make long-term simulations and forecasts more accurate. Due to computational cost, surrogate models are used as a resource-friendly alternative to describe the relationship between model parameters and QoIs. The study was based on a novel approach by Fischer and Proppe (2023) to include parameter PDFs in the construction of basis functions for universal kriging.

The dependency of QoIs on multiple model parameters and the influence of single parameters on multiple QoIs reflect the complex coupled relationships in the WAM system. Although the magnitude of the impact of individual model parameters varies quite strongly, most parameters show distinct effects on many facets of the system, which are illustrated schematically for the four most important parameters in Fig. 10. The results can be summarized as follows:

Figure 10. Illustration of the qualitative effects on the WAM system due to an increase in the investigated model parameters that have the strongest impacts: (a) entrorg, zvz0i, (b) rhebc_land_trop, (c) c_soil. See Sect. 3 for a more detailed discussion.


  • The entrainment rate (entrorg) and terminal fall velocity of ice (zvz0i) have the strongest effects on the WAM system (see Fig. 10a). An increase in these parameters decreases cloud cover and precipitation, mainly to the north and south of the rain belt across West Africa. Surprisingly, particularly for entrorg, precipitation even increases along a narrow strip through the rain belt, which may benefit from the suppressed rain elsewhere. Larger values of both parameters lead to a stronger SHL with warmer and drier conditions in the Sahara and a stronger shallow overturning as well as a southward shift of the ITD and AEJ, while the TEJ weakens.

  • The parameters rhebc_land_trop and rcucov_trop control the evaporation below the cloud base in the tropics with an overall weaker impact on the WAM. An increase in rhebc_land_trop (Fig. 10b) leads to less precipitation and increased low-level clouds. This appears to weaken the monsoon overturning, as reflected in a weaker SHL and moister columns in the subsidence regions over the northwestern Sahara and the Gulf of Guinea, however with little impact on AEJ and TEJ. An increase in rcucov_trop induces much weaker effects, particularly an increase in low-level and a decrease in mid-level cloud cover, with no substantial precipitation change.

  • The scaling factor for vertical diffusion of heat and moisture (tkhmin) affects the exchange of moisture between the boundary layer and the free troposphere. An increase in this parameter therefore increases column water vapor and leads to more high- and mid-level clouds, but precipitation is hardly affected. The evaporative soil surface (controlled by c_soil) also increases column water vapor and cloud cover but in this case mainly the low-level clouds, even leading to a small increase in precipitation at the southern side of the rain belt (see Fig. 10c). Near-surface temperature decreases through increased evaporation, while 2 m dew point temperature and MSLP increase, shifting the SHL northwards. Impacts on the AEJ and TEJ are rather small for both tkhmin and c_soil.

Concerning the selected uncertain model parameters (Sect. 2.1), given the limited information from the literature, the definitions are rough estimates, and obtained results should be interpreted with some caution. Furthermore, only six model parameters are included in the study, but other parameters may also have relevant uncertainty contributions. Moreover, the results based on the ICON model version used cannot directly be transferred to other model versions or even other models, where different parameters are used in parameterizations. Nevertheless, the outcome of this study highlights the usefulness of the applied methodology including training procedure and surrogate models. The methodology is not limited to a few model parameters but can be extended. The computational effort is expected to increase linearly with the number of model parameters (see Sect. 2.2.1). As this study has shown that the entrainment rate has a strong influence, other related parameters might be of interest, such as distinction between turbulent and organized entrainment as well as detrainment parameters. Another interesting parameter for future studies might be cloud inhomogeneity.

This study has shown that it is mainly the entrainment rate, the fall speed of ice and surface evaporation that should be specified more accurately. This can be done by including further investigations, measurements and expert knowledge, including a more complex representation in parameterizations. Moreover, these parameters could be optimized with respect to the WAM simulation through parameter identification studies by including reanalysis and satellite data as observational references. The surrogate models that were obtained in this study can serve as the basis to conduct such identification studies. However, the outcome would be limited to the West African region. Thus, it might be possible to specify parameters that should only be valid in regions for which they have been optimized, as is already the case for rhebc_land_trop and rcucov_trop, which have been tuned for tropical regions. The implementation of parameter identification studies based on the obtained surrogate models is currently ongoing.

Code availability

The computational framework used in this study primarily relies on publicly available software packages, along with some custom extensions. Gaussian process regression analyses were performed using the scikit-learn package for Python (Pedregosa et al., 2011), incorporating extensions based on Fischer and Proppe (2023). Global sensitivity analysis was conducted using the SALib package for Python (Iwanaga et al., 2022). Weather simulations were executed within the ICON modeling framework (Zängl et al., 2015).
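For readers unfamiliar with global sensitivity analysis on a cheap surrogate, the following sketch estimates first-order variance-based (Sobol) indices with a pick-and-freeze Monte Carlo estimator. It is an illustration under several assumptions and does not reproduce the SALib API or the study's configuration: the inputs are taken as independent and uniform on [0, 1] (the study uses non-uniform parameter PDFs), `toy_model` is a hypothetical additive function standing in for a surrogate, and the choice of estimator is ours.

```python
import random


def first_order_sobol(model, n_params, n_samples=20000, seed=0):
    """Estimate first-order Sobol indices with two independent sample matrices A and B."""
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(n_params)] for _ in range(n_samples)]
    f_a = [model(x) for x in a]
    f_b = [model(x) for x in b]

    pooled = f_a + f_b
    mean = sum(pooled) / len(pooled)
    variance = sum((y - mean) ** 2 for y in pooled) / (len(pooled) - 1)

    indices = []
    for i in range(n_params):
        total = 0.0
        for row_a, row_b, y_a, y_b in zip(a, b, f_a, f_b):
            mixed = list(row_a)
            mixed[i] = row_b[i]  # row of A with column i taken from B
            # Centering f(B) leaves the expectation unchanged and reduces noise.
            total += (y_b - mean) * (model(mixed) - y_a)
        indices.append(total / n_samples / variance)
    return indices


def toy_model(x):
    # Hypothetical additive response; analytic indices are 16/21, 4/21 and 1/21.
    return 4.0 * x[0] + 2.0 * x[1] + 1.0 * x[2]


sobol_indices = first_order_sobol(toy_model, n_params=3)
print(sobol_indices)
```

Because each index requires its own set of mixed evaluations, the cost grows with the number of parameters, which is why such estimators are typically run on a surrogate and not on the numerical model itself.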

Data availability

The data used for model validation in this study include the ERA5 reanalysis data (Hersbach et al., 2020) and the GPM IMERG precipitation data (Huffman et al., 2019). These datasets are publicly available and have been widely utilized in the meteorological research community.

Author contributions

PK conceived the overall concept of the study, including all necessary steps to quantify uncertainties in the selected model parameters. MF designed the study, including the experimental design, surrogate models, computation of QoIs, sensitivity analysis and local parameter studies, with input from all co-authors. GP set up the ICON model including ERA5 data. RVDL, GP and PK contributed to the meteorological aspects in the model setup. CP contributed to the methodical aspects of the study. PK, AL and JM contributed to the interpretation of the results and strategies for post-analysis. MF prepared the paper with input from all co-authors.

Competing interests

At least one of the (co-)authors is a member of the editorial board of Weather and Climate Dynamics. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.


Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.


Peter Knippertz acknowledges the C2 Prediction of wet and dry periods of the West African Monsoon project of the Transregional Collaborative Research Center SFB/TRR 165 Waves to Weather funded by the German Science Foundation (DFG).

Financial support

This research has been supported by the German Research Foundation (DFG) (grant no. SFB/TRR 165, Waves to Weather).

Review statement

This paper was edited by Stephan Pfahl and reviewed by two anonymous referees.


Agustí-Panareda, A., Beljaars, A., Cardinali, C., Genkova, I., and Thorncroft, C.: Impacts of Assimilating AMMA Soundings on ECMWF Analyses and Forecasts, Weather Forecast., 25, 1142–1160, 2010.

Brierley, C. M., Zhao, A., Harrison, S. P., Braconnot, P., Williams, C. J. R., Thornalley, D. J. R., Shi, X., Peterschmitt, J.-Y., Ohgaito, R., Kaufman, D. S., Kageyama, M., Hargreaves, J. C., Erb, M. P., Emile-Geay, J., D'Agostino, R., Chandan, D., Carré, M., Bartlein, P. J., Zheng, W., Zhang, Z., Zhang, Q., Yang, H., Volodin, E. M., Tomas, R. A., Routson, C., Peltier, W. R., Otto-Bliesner, B., Morozova, P. A., McKay, N. P., Lohmann, G., Legrande, A. N., Guo, C., Cao, J., Brady, E., Annan, J. D., and Abe-Ouchi, A.: Large-scale features and evaluation of the PMIP4-CMIP6 midHolocene simulations, Clim. Past, 16, 1847–1872, 2020.

Burpee, R. W.: The Origin and Structure of Easterly Waves in the Lower Troposphere of North Africa, J. Atmos. Sci., 29, 77–90, 1972.

Cheng, K., Lu, Z., Ling, C., and Zhou, S.: Surrogate-assisted global sensitivity analysis: an overview, Struct. Multidiscip. Optimiz., 61, 1187–1213, 2020.

Claussen, M., Dallmeyer, A., and Bader, J.: Theory and Modeling of the African Humid Period and the Green Sahara, in: Oxford Research Encyclopedia of Climate Science, Oxford University Press, 2017.

Cook, K. H. and Vizy, E. K.: Coupled Model Simulations of the West African Monsoon System: Twentieth- and Twenty-First-Century Simulations, J. Climate, 19, 3681–3703, 2006.

Diamond, M. S., Director, H. M., Eastman, R., Possner, A., and Wood, R.: Substantial Cloud Brightening From Shipping in Subtropical Low Clouds, AGU Adv., 1, e2019AV000111, 2020.

DWD – Deutscher Wetterdienst: ICON Namelist Overview, Tech. rep., 2019.

Fink, A. H. and Reiner, A.: Spatiotemporal variability of the relation between African easterly waves and West African squall lines in 1998 and 1999, J. Geophys. Res., 108, 4332, 2003.

Fink, A. H., Agustí-Panareda, A., Parker, D. J., Lafore, J.-P., Ngamini, J.-B., Afiesimama, E., Beljaars, A., Bock, O., Christoph, M., Didé, F., Faccani, C., Fourrié, N., Karbou, F., Polcher, J., Mumba, Z., Nuret, M., Pohle, S., Rabier, F., Tompkins, A. M., and Wilson, G.: Operational meteorology in West Africa: observational networks, weather analysis and forecasting, Atmos. Sci. Lett., 12, 135–141, 2011.

Fink, A. H., Engel, T., Ermert, V., van der Linden, R., Schneidewind, M., Redl, R., Afiesimama, E., Thiaw, W. M., Yorke, C., Evans, M., and Janicot, S.: Mean Climate and Seasonal Cycle, in: Meteorology of Tropical West Africa, John Wiley & Sons, Ltd, 1–39, 2017.

Fischer, M. and Proppe, C.: Enhanced universal kriging for transformed input parameter spaces, Probabil. Eng. Mech., 74, 103486, 2023.

Flaounas, E., Bastin, S., and Janicot, S.: Regional climate modelling of the 2006 West African monsoon: sensitivity to convection and planetary boundary layer parameterisation using WRF, Clim. Dynam., 36, 1083–1105, 2011.

Fletcher, C. G., Kravitz, B., and Badawy, B.: Quantifying uncertainty from aerosol and atmospheric parameters and their impact on climate sensitivity, Atmos. Chem. Phys., 18, 17529–17543, 2018.

Flohn, H.: Investigations on the Tropical Easterly Jet, Bonner meteorologische Abhandlungen, Dümmlers, 1964.

Fotso-Nguemo, T. C., Vondou, D. A., Pokam, W. M., Djomou, Z. Y., Diallo, I., Haensler, A., Tchotchou, L. A. D., Kamsu-Tamo, P. H., Gaye, A. T., and Tchawoua, C.: On the added value of the regional climate model REMO in the assessment of climate change signal over Central Africa, Clim. Dynam., 49, 3813–3838, 2017.

Gbode, I. E., Dudhia, J., Ogunjobi, K. O., and Ajayi, V. O.: Sensitivity of different physics schemes in the WRF model during a West African monsoon regime, Theor. Appl. Climatol., 136, 733–751, 2018.

Gbode, I. E., Babalola, T. E., Diro, G. T., and Intsiful, J. D.: Assessment of ERA5 and ERA-Interim in Reproducing Mean and Extreme Climates over West Africa, Adv. Atmos. Sci., 40, 570–586, 2023.

Gill, A. E.: Some simple solutions for heat-induced tropical circulation, Q. J. Roy. Meteorol. Soc., 106, 447–462, 1980.

Glassmeier, F., Hoffmann, F., Johnson, J. S., Yamaguchi, T., Carslaw, K. S., and Feingold, G.: An emulator approach to stratocumulus susceptibility, Atmos. Chem. Phys., 19, 10191–10203, 2019.

Grist, J. P. and Nicholson, S. E.: A Study of the Dynamic Factors Influencing the Rainfall Variability in the West African Sahel, J. Climate, 14, 1337–1359, 2001.

Haile, M.: Weather patterns, food security and humanitarian response in sub-Saharan Africa, Philos. T. Roy. Soc. B, 360, 2169–2182, 2005.

Hall, N. M. and Peyrillé, P.: Dynamics of the West African monsoon, Journal de Physique IV (Proceedings), 139, 81–99, 2006.

Hannak, L., Knippertz, P., Fink, A. H., Kniffka, A., and Pante, G.: Why Do Global Climate Models Struggle to Represent Low-Level Clouds in the West African Summer Monsoon?, J. Climate, 30, 1665–1687, 2017.

Hastenrath, S.: Climate Dynamics of the Tropics, Springer Netherlands, 1991.

Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., Muñoz-Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D., Simmons, A., Soci, C., Abdalla, S., Abellan, X., Balsamo, G., Bechtold, P., Biavati, G., Bidlot, J., Bonavita, M., Chiara, G., Dahlgren, P., Dee, D., Diamantakis, M., Dragani, R., Flemming, J., Forbes, R., Fuentes, M., Geer, A., Haimberger, L., Healy, S., Hogan, R. J., Hólm, E., Janisková, M., Keeley, S., Laloyaux, P., Lopez, P., Lupu, C., Radnoti, G., Rosnay, P., Rozum, I., Vamborg, F., Villaume, S., and Thépaut, J.-N.: The ERA5 global reanalysis, Q. J. Roy. Meteorol. Soc., 146, 1999–2049, 2020.

Holden, P. B., Edwards, N. R., Oliver, K. I. C., Lenton, T. M., and Wilkinson, R. D.: A probabilistic calibration of climate sensitivity and terrestrial carbon change in GENIE-1, Clim. Dynam., 35, 785–806, 2009.

Hopcroft, P. O., Valdes, P. J., Harper, A. B., and Beerling, D. J.: Multi vegetation model evaluation of the Green Sahara climate regime, Geophys. Res. Lett., 44, 6804–6813, 2017.

Huffman, G., Stocker, E., Bolvin, D., Nelkin, E., and Tan, J.: GPM IMERG final precipitation L3 half hourly 0.1 degree × 0.1 degree V06, GES DISC – Goddard Earth Sciences Data and Information Services Center [data set], 2019.

Iwanaga, T., Usher, W., and Herman, J.: Toward SALib 2.0: Advancing the accessibility and interpretability of global sensitivity analyses, Socio-Environ. Syst. Model., 4, 18155,, 2022. a

Janicot, S., Lafore, J.-P., and Thorncroft, C.: The West African Monsoon, in: The Global Monsoon System, World Scientific, 111–135,, 2011. a

Kendon, E. J., Stratton, R. A., Tucker, S., Marsham, J. H., Berthou, S., Rowell, D. P., and Senior, C. A.: Enhanced future changes in wet and dry extremes over Africa at convection-permitting scale, Nat. Commun., 10, 1794,, 2019. a

Kiladis, G. N., Thorncroft, C. D., and Hall, N. M. J.: Three-Dimensional Structure and Dynamics of African Easterly Waves. Part I: Observations, J. Atmos. Sci., 63, 2212–2230,, 2006. a

Klein, C., Heinzeller, D., Bliefernicht, J., and Kunstmann, H.: Variability of West African monsoon patterns generated by a WRF multi-physics ensemble, Clim. Dynam., 45, 2733–2755,, 2015. a

Kniffka, A., Knippertz, P., and Fink, A. H.: The role of low-level clouds in the West African monsoon system, Atmos. Chem. Phys., 19, 1623–1647,, 2019. a

Knippertz, P., Ansmann, A., Althausen, D., Müller, D., Tesche, M., Bierwirth, E., Dinter, T., Müller, T., Hoyningen-Huene, W. V., Schepanski, K., Wendisch, M., Heinold, B., Kandler, K., Petzold, A., Schütz, L., and Tegen, I.: Dust mobilization and transport in the northern Sahara during SAMUM 2006 – a meteorological overview, Tellus B, 61, 12–31, 2009.

Kruskal, W. H. and Wallis, W. A.: Use of Ranks in One-Criterion Variance Analysis, J. Am. Stat. Assoc., 47, 583–621, 1952.

Lang, S. T. K., Lock, S.-J., Leutbecher, M., Bechtold, P., and Forbes, R. M.: Revision of the Stochastically Perturbed Parametrisations model uncertainty scheme in the Integrated Forecasting System, Q. J. Roy. Meteorol. Soc., 147, 1364–1381, 2021.

Lavaysse, C., Flamant, C., Janicot, S., Parker, D. J., Lafore, J.-P., Sultan, B., and Pelon, J.: Seasonal evolution of the West African heat low: a climatological perspective, Clim. Dynam., 33, 313–330, 2009.

Lebel, T. and Ali, A.: Recent trends in the Central and Western Sahel rainfall regime (1990–2007), J. Hydrol., 375, 52–64, 2009.

Lebel, T., Diedhiou, A., and Laurent, H.: Seasonal cycle and interannual variability of the Sahelian rainfall at hydrological scales, J. Geophys. Res., 108, 8389, 2003.

Lee, L. A., Carslaw, K. S., Pringle, K. J., Mann, G. W., and Spracklen, D. V.: Emulation of a complex global aerosol model to quantify sensitivity to uncertain parameters, Atmos. Chem. Phys., 11, 12253–12273, 2011.

Lemburg, A., Bader, J., and Claussen, M.: Sahel Rainfall–Tropical Easterly Jet Relationship on Synoptic to Intraseasonal Time Scales, Mon. Weather Rev., 147, 1733–1752, 2019.

Loeppky, J. L., Sacks, J., and Welch, W. J.: Choosing the Sample Size of a Computer Experiment: A Practical Guide, Technometrics, 51, 366–376, 2009.

Lohou, F., Kalthoff, N., Adler, B., Babić, K., Dione, C., Lothon, M., Pedruzo-Bagazgoitia, X., and Zouzoua, M.: Conceptual model of diurnal cycle of low-level stratiform clouds over southern West Africa, Atmos. Chem. Phys., 20, 2263–2275, 2020.

Lu, D. and Ricciuto, D.: Efficient surrogate modeling methods for large-scale Earth system models based on machine-learning techniques, Geosci. Model Dev., 12, 1791–1807, 2019.

Marsham, J. H., Dixon, N. S., Garcia-Carreras, L., Lister, G. M. S., Parker, D. J., Knippertz, P., and Birch, C. E.: The role of moist convection in the West African monsoon system: Insights from continental-scale convection-permitting simulations, Geophys. Res. Lett., 40, 1843–1849, 2013.

Martin, G. M., Peyrillé, P., Roehrig, R., Rio, C., Caian, M., Bellon, G., Codron, F., Lafore, J.-P., Poan, D. E., and Idelkadi, A.: Understanding the West African Monsoon from the analysis of diabatic heating distributions as simulated by climate models, J. Adv. Model. Earth Syst., 9, 239–270, 2017.

Massoud, E. C.: Emulation of environmental models using polynomial chaos expansion, Environ. Model. Softw., 111, 421–431, 2019.

Matheron, G.: Le krigeage universel, vol. 1, École nationale supérieure des mines de Paris (last access: 12 April 2024), 1969.

Mathon, V., Laurent, H., and Lebel, T.: Mesoscale Convective System Rainfall in the Sahel, J. Appl. Meteorol., 41, 1081–1092, 2002.

Matsui, T., Zhang, S. Q., Lang, S. E., Tao, W.-K., Ichoku, C., and Peters-Lidard, C. D.: Impact of radiation frequency, precipitation radiative forcing, and radiation column aggregation on convection-permitting West African monsoon simulations, Clim. Dynam., 55, 193–213, 2018.

Messager, C., Gallée, H., and Brasseur, O.: Precipitation sensitivity to regional SST in a regional climate simulation during the West African monsoon for two dry years, Clim. Dynam., 22, 249–266, 2004.

Morris, M. D. and Mitchell, T. J.: Exploratory designs for computational experiments, J. Stat. Plan. Infer., 43, 381–402, 1995.

Müller, J., Paudel, R., Shoemaker, C. A., Woodbury, J., Wang, Y., and Mahowald, N.: CH4 parameter estimation in CLM4.5bgc using surrogate global optimization, Geosci. Model Dev., 8, 3285–3310, 2015.

Nicholson, S. E.: A revised picture of the structure of the “monsoon” and land ITCZ over West Africa, Clim. Dynam., 32, 1155–1171, 2009.

Oakley, J.: Estimating percentiles of uncertain computer code outputs, Appl. Stat.-J. Roy. C, 53, 83–93, 2004.

Ollinaho, P., Lock, S.-J., Leutbecher, M., Bechtold, P., Beljaars, A., Bozzo, A., Forbes, R. M., Haiden, T., Hogan, R. J., and Sandu, I.: Towards process-level representation of model uncertainties: stochastically perturbed parametrizations in the ECMWF ensemble, Q. J. Roy. Meteorol. Soc., 143, 408–422, 2017.

Paeth, H., Capo-Chichi, A., and Endlicher, W.: Climate change and food security in tropical West Africa – a dynamic-statistical modelling approach, Erdkunde, 62, 101–115, 2008.

Pante, G. and Knippertz, P.: Resolving Sahelian thunderstorms improves mid-latitude weather forecasts, Nat. Commun., 10, 3487, 2019.

Parker, D. J., Fink, A., Janicot, S., Ngamini, J.-B., Douglas, M., Afiesimama, E., Agusti-Panareda, A., Beljaars, A., Dide, F., Diedhiou, A., Lebel, T., Polcher, J., Redelsperger, J.-L., Thorncroft, C., and Wilson, G. A.: The AMMA Radiosonde Program and its Implications for the Future of Atmospheric Monitoring Over Africa, B. Am. Meteorol. Soc., 89, 1015–1028, 2008.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., and Duchesnay, E.: Scikit-learn: Machine Learning in Python, J. Mach. Learn. Res., 12, 2825–2830, 2011.

Quinn, G. P. and Keough, M. J.: Experimental Design and Data Analysis for Biologists, Cambridge University Press, ISBN 9780511806384, 2002.

Raschendorfer, M.: Operationelles NWV-System, hier: Verminderung der minimalen Diffusionskoeffizienten für COSMO-EU/DE/EPS, Tech. rep. (last access: 12 April 2024), 2012.

Rasmussen, C. E. and Williams, C. K. I.: Gaussian Processes for Machine Learning, The MIT Press, 2005.

Ray, J., Hou, Z., Huang, M., Sargsyan, K., and Swiler, L.: Bayesian Calibration of the Community Land Model Using Surrogates, SIAM/ASA J. Uncertain. Quantif., 3, 199–233, 2015.

Reed, R. J., Norquist, D. C., and Recker, E. E.: The Structure and Properties of African Wave Disturbances as Observed During Phase III of GATE, Mon. Weather Rev., 105, 317–333, 1977.

Reinert, D., Prill, F., Frank, H., Denhard, M., and Zängl, G.: Database Reference Manual for ICON and ICON-EPS, Version 1.2.11, Tech. rep., Deutscher Wetterdienst, Offenbach am Main, 2019.

Rosenblatt, M.: Remarks on a Multivariate Transformation, Ann. Math. Stat., 23, 470–472, 1952.

Saltelli, A., Tarantola, S., and Chan, K. P.-S.: A Quantitative Model-Independent Method for Global Sensitivity Analysis of Model Output, Technometrics, 41, 39–56, 1999.

Stubenrauch, C. J., Caria, G., Protopapadaki, S. E., and Hemmer, F.: 3D radiative heating of tropical upper tropospheric cloud systems derived from synergistic A-Train observations and machine learning, Atmos. Chem. Phys., 21, 1015–1034, 2021.

Sudret, B.: Polynomial chaos expansions and stochastic finite element methods, in: Risk and Reliability in Geotechnical Engineering, edited by: Phoon, K.-K. and Ching, J., CRC Press, London, 265–300, ISBN 978-1-4822-2721-5, 2014.

Tchotchou, L. A. D. and Kamga, F. M.: Sensitivity of the simulated African monsoon of summers 1993 and 1999 to convective parameterization schemes in RegCM3, Theor. Appl. Climatol., 100, 207–220, 2009.

Thorncroft, C. D., Nguyen, H., Zhang, C., and Peyrillé, P.: Annual cycle of the West African monsoon: regional circulations and associated water vapour transport, Q. J. Roy. Meteorol. Soc., 137, 129–147, 2011.

van der Linden, R., Knippertz, P., Fink, A. H., Ingleby, B., Maranan, M., and Benedetti, A.: The influence of DACCIWA radiosonde data on the quality of ECMWF analyses and forecasts over southern West Africa, Q. J. Roy. Meteorol. Soc., 146, 1719–1739, 2020.

Vellinga, M., Arribas, A., and Graham, R.: Seasonal forecasts for regional onset of the West African monsoon, Clim. Dynam., 40, 3047–3070, 2013.

Vogel, P., Knippertz, P., Fink, A. H., Schlueter, A., and Gneiting, T.: Skill of Global Raw and Postprocessed Ensemble Predictions of Rainfall over Northern Tropical Africa, Weather Forecast., 33, 369–388, 2018.

Vogel, P., Knippertz, P., Fink, A. H., Schlueter, A., and Gneiting, T.: Skill of Global Raw and Postprocessed Ensemble Predictions of Rainfall in the Tropics, Weather Forecast., 35, 2367–2385, 2020.

Walz, E., Maranan, M., van der Linden, R., Fink, A. H., and Knippertz, P.: An IMERG-Based Optimal Extended Probabilistic Climatology (EPC) as a Benchmark Ensemble Forecast for Precipitation in the Tropics and Subtropics, Weather Forecast., 36, 1561–1573, 2021.

Wan, H., Rasch, P. J., Zhang, K., Qian, Y., Yan, H., and Zhao, C.: Short ensembles: an efficient method for discerning climate-relevant sensitivities in atmospheric general circulation models, Geosci. Model Dev., 7, 1961–1977, 2014.

Wellmann, C., Barrett, A. I., Johnson, J. S., Kunz, M., Vogel, B., Carslaw, K. S., and Hoose, C.: Comparing the impact of environmental conditions and microphysics on the forecast uncertainty of deep convective clouds and hail, Atmos. Chem. Phys., 20, 2201–2219, 2020.

Williamson, D.: Exploratory ensemble designs for environmental models using k-extended Latin Hypercubes, Environmetrics, 26, 268–283, 2015.

Xue, Y., Sales, F. D., Lau, W. K.-M., Boone, A., Feng, J., Dirmeyer, P., Guo, Z., Kim, K.-M., Kitoh, A., Kumar, V., Poccard-Leclercq, I., Mahowald, N., Moufouma-Okia, W., Pegion, P., Rowell, D. P., Schemm, J., Schubert, S. D., Sealy, A., Thiaw, W. M., Vintzileos, A., Williams, S. F., and Wu, M.-L. C.: Intercomparison and analyses of the climatology of the West African Monsoon in the West African Monsoon Modeling and Evaluation project (WAMME) first model intercomparison experiment, Clim. Dynam., 35, 3–27, 2010.

Zängl, G., Reinert, D., Rípodas, P., and Baldauf, M.: The ICON (ICOsahedral Non-hydrostatic) modelling framework of DWD and MPI-M: Description of the non-hydrostatic dynamical core, Q. J. Roy. Meteorol. Soc., 141, 563–579, 2015.

Zheng, X. and Eltahir, E. A. B.: The Role of Vegetation in the Dynamics of West African Monsoons, J. Climate, 11, 2078–2096, 1998.

Short summary
Our research enhances the understanding of the complex dynamics within the West African monsoon system by analyzing the impact of specific model parameters on its characteristics. Employing surrogate models, we identified critical factors such as the entrainment rate and the fall velocity of ice. Constraining these parameters more precisely in weather models could improve forecast accuracy, thus enabling better strategies to manage and reduce the impact of weather events.