the Creative Commons Attribution 4.0 License.
Surrogate-based model parameter optimization in simulations of the West African monsoon
Matthias Fischer
Peter Knippertz
Carsten Proppe
Download
- Final revised paper (published on 21 Jan 2025)
- Preprint (discussion started on 19 Jul 2024)
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2024-1984', Anonymous Referee #1, 09 Aug 2024
The authors use the regional climate model ICON driven by ERA5 boundary conditions to simulate the West African Monsoon (WAM) system. A perturbed parameter ensemble was run, with each simulation covering 41 days during the summer monsoon in 4 different years, namely 2016 to 2019. The ensemble size was 60, taking into account 6 ICON model parameters. The output was averaged over the simulation period. Two surrogate models are constructed using Gaussian process regression and Principal Component Analysis (PCA), respectively, and particular objective functions are defined to perform multi-objective optimizations with regard to the ICON model parameters. Uncertainty intervals are determined for the weights in the objective functions using expert judgment to define their relative magnitudes, and the sensitivity of optimal parameters to the variations in the weights is explored.
According to the authors the overall conclusion is that: "To further enhance the accuracy of climate simulations and potentially improve weather predictions, it is crucial to prioritize the refinement of the overall physical models, including the reduction of inherent structural errors, rather than solely adjusting the uncertain parameters in existing model parametrizations. Nevertheless, our methodology demonstrates the potential of integrating statistical and expert-driven approaches to assess and improve the simulation accuracy of the WAM."
The main issue in my view is that there have been various similar proof-of-concept studies of objective calibration of weather, climate, ocean or cryospheric models in the past. I don't see any major conceptual or methodological innovation in the present work, except maybe the use of Principal Component Analysis in the dimension reduction process, although Gaussian process models have certainly been used extensively (An internet search suggests that PCA has been used previously, too, such as in: Chang, W., Applegate, P. J., Haran, M., and Keller, K.: Probabilistic calibration of a Greenland Ice Sheet model using spatially resolved synthetic observations: toward projections of ice mass loss with uncertainties, Geosci. Model Dev., 7, 1933–1943, https://doi.org/10.5194/gmd-7-1933-2014, 2014). It is good to be reminded of advantages and drawbacks of objective model calibration in the context of weather and climate modelling, but similar studies can be performed using a multitude of models, regions, parameters, and loss functions; there should be a particular added value in such an effort, either in terms of methodology or process understanding.
There is a long history of using surrogate models (more often called "emulators" in the weather and climate community) to calibrate weather and climate models; the relevant literature is not quite adequately reviewed in the present manuscript.
There are also various software packages in this context, e.g. Watson-Parris, D., Williams, A., Deaconu, L., and Stier, P.: Model calibration using ESEm v1.1.0 – an open, scalable Earth system emulator, Geosci. Model Dev., 14, 7659–7672, https://doi.org/10.5194/gmd-14-7659-2021, 2021; Couvreux, F., Hourdin, F., Williamson, D., Roehrig, R., Volodina, V., Villefranque, N., et al. (2021). Process-based climate model development harnessing machine learning: I. A calibration tool for parameterization improvement. Journal of Advances in Modeling Earth Systems, 13, e2020MS002217. https://doi.org/10.1029/2020MS002217.
Nowadays studies often use machine learning methods in the process of building emulators, which can be viewed as an innovation to some extent.
I like the honesty of the conclusions in the present manuscript, but there are various past proof-of-concept studies with very similar conclusions and I struggle to see any substantial innovation in the present work. Whether one can say that the incorporation of the approximation of 2D fields with Principal Component Analysis (PCA) in the optimization process is an innovation I'm not quite sure; certainly Gaussian process models have been used extensively (see e.g. Watson-Parris et al., 2021; Couvreux et al. 2021). PCA can be seen as a Gaussian process model, too, in a sense. In machine learning models dimension reduction is often performed using autoencoders, something I would consider more of an innovation in the model calibration context.
There are issues in particular with calibrating physical parameterisations in regional model simulations: it is usually not very difficult to improve physical parameterisations when studying a particular region (considering specific objective functions). The main difficulty in developing physical parameterisations is that they need to be valid across a wide range of climates and meteorological conditions, from high to low latitudes, from wet to dry conditions etc. This is particularly obvious in the discussion of the convective entrainment rate in the present manuscript. Entrainment is a key process in convection and has been studied extensively. It is well known that in weather and climate models entrainment rates in convection parameterisations can affect weather and climate in various ways. E.g. Zhu, H. and Hendon, H.H. (2015), Role of large-scale moisture advection for simulation of the MJO with increased entrainment. Q.J.R. Meteorol. Soc., 141: 2127-2136. https://doi.org/10.1002/qj.2510; Sherwood, S. C., D. Hernández-Deckers, M. Colin, and F. Robinson, 2013: Slippery Thermals and the Cumulus Entrainment Paradox. J. Atmos. Sci., 70, 2426–2442, https://doi.org/10.1175/JAS-D-12-0220.1.
But again, I feel that I am repeating points here that have been discussed for many years. I'm sympathetic towards using more objective methods in weather and climate model calibration, and I don't have any particular objection against the present manuscript being published. Indeed, some of those methodologies are used to some extent in model development nowadays at some institutions, which is to be encouraged. But the present study, explicitly and implicitly, also quite clearly shows the limitations, as have done previous studies.
Citation: https://doi.org/10.5194/egusphere-2024-1984-RC1
RC2: 'Comment on egusphere-2024-1984', Anonymous Referee #2, 15 Aug 2024
Work summary: The authors present research aimed at enhancing numerical weather (and climate) forecasts of the WAM by applying surrogate-based optimization in the space defined by 3 parameters of the ICON regional model. Multiple objective functions are based on the squared distance of ICON outputs from several reference values. Even though this can be considered a multi-objective problem, the authors have transformed it into a single weighted-objective optimization problem, also in order to take the model parameter uncertainties into account in the optimization results.
Detailed considerations and Comments:
The paper gives a well-described and well-documented account of the physical features of the monsoon and of how they can be resolved/modelled by an NWP model like ICON. A large part then follows that describes and applies methodologies aimed at solving an optimization problem for parameters controlling unresolved microphysics in a regional NWP model.
This paper follows previous research by the authors, such as universal kriging (Fischer, M. and Proppe, C.: Enhanced universal kriging for transformed input parameter spaces, Probabilistic Engineering Mechanics, 74, 103486, https://doi.org/10.1016/j.probengmech.2023.103486, 2023) and SBO uncertainty quantification applied to NWP prediction of the WAM (Fischer, M., Knippertz, P., van der Linden, R., Lemburg, A., Pante, G., Proppe, C., and Marsham, J. H.: Quantifying uncertainty in simulations of the West African monsoon with the use of surrogate models, Weather and Climate Dynamics, 5, 511–536, https://doi.org/10.5194/wcd-5-511-2024, 2024), in which six parameters were initially investigated.
In the current work, the tuning parameters have been reduced to 3 (entrainment rate, ice crystal fall speed and soil moisture evaporation fraction) based on the sensitivity results of the previous work.
A first comment is that, even if the work is well described, probably suffers from the fact that some contents are based also on the results of research previously published by the same authors, and a lack of smooth reading could be envisaged. This is not necessarily an issue, since in this way it is preserved the briefness of the work, but in some points some recalls could helps readers.
Coming back to technical aspects, the authors present a multi-objective optimization (MOO) of these 3 parameters in order to minimize the distance to reference/observed values in two different problems:
A) 15 QOI (such as accumulated monthly precipitation)
B) 12 2D output fields
Furthermore, since the MOO is transformed into a single weighted-objective optimization for both problems A and B, the influence of parameter uncertainties is also taken into account by means of weight uncertainties and variations.
In detail, for A, the surrogates are based on the 15 relations between the input parameters and the 15 QOI, i.e., the squared errors of the QOI between ICON and reference values.
For B, a PCA based surrogate model was employed to recompute 12 2D averaged fields to be compared to reference fields.
Here comes a question about the PCA scalar terms (Cm in formula 7 and 8): To my understand, Authors find PCA scalar terms by minimizing the distance between training 2D fields and reconstructed fields, considering 3 terms in the PCA expansion series. To my knowledge, that scalar terms are related to the inner product of PCA modes with the scalar fields, and the former are sorted according to their energetic content. Here the authors are trying to reduce the truncation error: maybe this requires more discussion and argumentation also in the text.
Trying to explain if this approach have could have an impact on the physical meaning of that scalar coefficient, e.g, considering that the energy content of the first 3 modes is related to the energy content expressed by the 3 first eigenvalues: are the new resulting scalar coefficients modifying this, since it can be demonstrated that the PCA scalar vector is linked to the relative eigenvalue, and hence to the energy content.
Coming back again to the paper: Both Surrogate models trained by means of 60 runs ICON outputs by sampling the design space defined by the 3 parameters (other, old, 3 parameters was kept constant on their default values).
A clarification was due referring to the training stage of SBO/PCA-sbo: in the paper there is a brief description about the design of sampling 60 simulations. Previous author's research states the maximum LHS was adopted: the space (reduced) of 3 parameters is still well represented by the sampling of a 60 samples in a space of 6 variables? Since the training strategies is crucial in SBO and PCA technique, some clarification should be given in the current paper.
Finally, the authors state that the results of the optimization indicate that the ICON default values of the investigated parameters are already well tuned, since the optimal values obtained are quite close to the defaults. Further enhancements found with certain configurations for some variables come at the cost of a deterioration in others. In any case, lower values of the entrainment rate enhance accumulated precipitation and temperatures.
Question is: have the authors verified that the set of optimal parameters found effectively give rise to better ICON results? for example by running icon with optimal set found.
Final Considerations:
The article can be placed in the context of automatic tuning of model parameters in a NWP model, and is worthy of publication. It would be worthwhile to include some more description of the PCA methodology, both from a bibliographical and methodological point of view, as discussed before.
Furthermore, since the optimization is conducted in the space of 3 parameters, although the objective function is a function characterized by a highly irregular landscape, maybe less than 60 training runs could be sufficient. With the remaining 'computational budget', an attempt in updating Surrogate models could be verified, adopting an approach that mitigate the risk of low reliability at first stages, that could affect also the optimal set of parameters.
Citation: https://doi.org/10.5194/egusphere-2024-1984-RC2
AC1: 'Comment on egusphere-2024-1984', Matthias Fischer, 20 Sep 2024
We would like to thank the reviewers for their constructive and helpful comments on the manuscript. Overall, we agree with the given remarks and provide a response to the main points of criticism below.
1 – NOVELTY (RC1)
We appreciate the reviewer’s comments and fully agree that the use of surrogate models in parameter tuning is not novel. Many previous studies have indeed explored the opportunities and limitations of such methods. However, we have advanced these approaches in three specific and meaningful ways:
· Multi-Objective Studies: It is common for multiple objectives to require optimization simultaneously, and this often necessitates a robust framework for handling competing goals. While some previous works have employed fixed weights, our study emphasizes the importance of a more nuanced approach that accounts for both weight uncertainties and systematic weight variations. We believe this provides modelers with deeper insights into the relationships and trade-offs between objectives, an area that we found to be less explored in the meteorological literature.
· Application to West African Monsoon (WAM): To our knowledge, there is limited literature on surrogate models over tropical Africa and none with the ICON model. Since the ICON model has not been specifically optimized for the African continent, our results offer valuable insights into both the model’s current capabilities and potential areas for improvement in relation to the WAM system. The meteorology of West Africa is known to be a particular challenge for models due to the high degree of convective organization and high sensitivity to surface fluxes, low clouds and rain evaporation, making it an ideal target for the approach we present in the paper.
· Application of Principal Component Regression: In the context of parameter tuning, we demonstrate that principal component regression is a promising tool. The interactive tool developed in this study provides a computationally efficient and insightful option for model developers, which can be applied in other contexts as well.
We also acknowledge the reviewer’s point regarding the relatively limited scope of our literature review on surrogate models for parameter tuning. We will expand this section by incorporating additional relevant and useful literature to provide a more comprehensive overview. While we do not claim to have reinvented surrogate-based parameter tuning, we believe our study introduces innovative elements that extend the conventional methodologies in meaningful ways. In the manuscript, we will modify the relevant sections to more explicitly highlight these innovative components.
2 – REPETITION (RC2)
RC2: „A first comment is that, even if the work is well described, probably suffers from the fact that some contents are based also on the results of research previously published by the same authors, and a lack of smooth reading could be envisaged. This is not necessarily an issue, since in this way it is preserved the briefness of the work, but in some points some recalls could helps readers.”
We acknowledge that certain sections of the manuscript build upon our previously published research, which may have affected the overall flow for readers unfamiliar with the prior work. To improve clarity, we will ensure that key elements from the original paper are briefly recalled at relevant points, enhancing the reader's understanding while maintaining the manuscript's conciseness.
3 – PCA (RC2)
RC2: “Cm in formula 7 and 8. To my knowledge, that scalar terms are related to the inner product of PCA modes with the scalar fields, and the former are sorted according to their energetic content. Here the authors are trying to reduce the truncation error: maybe this requires more discussion and argumentation also in the text. […]
Trying to explain if this approach have could have an impact on the physical meaning of that scalar coefficient, e.g, considering that the energy content of the first 3 modes is related to the energy content expressed by the 3 first eigenvalues: are the new resulting scalar coefficients modifying this, since it can be demonstrated that the PCA scalar vector is linked to the relative eigenvalue, and hence to the energy content.”
The coefficients Cm are obtained by solving a general minimization problem aimed at reducing the overall prediction error. This approach, commonly employed in regression models, ensures that the coefficients are optimized to fit the model to the data, even though they may not directly correspond to PCA eigenvalues or be associated with energy content. Nevertheless, the principal components are ordered in descending sequence based on their associated eigenvalues, with each eigenvalue reflecting the amount of variance explained by the corresponding component. The input space transformation T_ros is used to improve model performance by addressing distortions in the input space. While PCA coefficients can be analytically computed when using a linear ansatz for the coefficients, in our case, solving the minimization problem numerically was necessary and computationally feasible. As discussed in our original paper, the input space transformation enhances the surrogate modeling process. Although it may affect the traditional interpretation of the coefficients with respect to energy content, it does not compromise the validity or efficacy of the approach.
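The distinction drawn in this reply can be illustrated with a minimal sketch on synthetic data. It is not the authors' implementation: it omits the input space transformation T_ros, uses a plain linear ansatz instead of a numerical minimization, and all array sizes and names (`fields`, `theta`, `predict_field`) are hypothetical. It shows that, for orthonormal PCA modes, the per-run coefficients that minimize the reconstruction error coincide with the inner-product projections, while the regression step that links coefficients to input parameters is a separate fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for 60 training runs of a flattened 2D field (hypothetical sizes).
n_runs, n_grid, n_modes = 60, 200, 3
theta = rng.uniform(0.0, 1.0, size=(n_runs, 3))        # 3 normalized input parameters
patterns = rng.normal(size=(3, n_grid))                # made-up spatial patterns
fields = theta @ patterns + 0.05 * rng.normal(size=(n_runs, n_grid))

# PCA via SVD of the centered training fields.
mean_field = fields.mean(axis=0)
anomalies = fields - mean_field
_, sing_vals, vt = np.linalg.svd(anomalies, full_matrices=False)
modes = vt[:n_modes]                                   # orthonormal rows, sorted by variance
explained = sing_vals**2 / np.sum(sing_vals**2)        # explained variance fractions

# (a) Classical PCA coefficients: inner products of the fields with the modes.
coeff_proj = anomalies @ modes.T                       # shape (n_runs, n_modes)

# (b) Coefficients from minimizing the reconstruction error of each run.
coeff_lsq = np.linalg.lstsq(modes.T, anomalies.T, rcond=None)[0].T

# (c) Regression step: coefficients as a linear function of the parameters.
design = np.column_stack([np.ones(n_runs), theta])
beta = np.linalg.lstsq(design, coeff_proj, rcond=None)[0]

def predict_field(theta_new):
    """Predict a field for a new parameter vector from the regressed coefficients."""
    x = np.concatenate(([1.0], theta_new))
    return mean_field + (x @ beta) @ modes
```

In this simplified setting (a) and (b) agree to rounding error, so the coefficients keep their usual link to the variance explained by each mode. Once the coefficients are modelled as functions of (transformed) input parameters, as in step (c) or in a nonlinear variant, they are fitted to the data and that direct interpretation no longer has to hold.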
In the manuscript, we will include additional information to provide a clearer understanding of our methodology in the broader context of PCA. Furthermore, we will expand on the bibliographical background and offer more detailed explanations regarding the methodological aspects.
4 – SAMPLING SPACE and VERIFICATION (RC2)
RC2: “the space (reduced) of 3 parameters is still well represented by the sampling of a 60 samples in a space of 6 variables? Since the training strategies is crucial in SBO and PCA technique, some clarification should be given in the current paper. […] maybe less than 60 training runs could be sufficient. With the remaining 'computational budget', an attempt in updating Surrogate models could be verified, adopting an approach that mitigate the risk of low reliability at first stages, that could affect also the optimal set of parameters […] have the authors verified that the set of optimal parameters found effectively give rise to better ICON results? for example by running icon with optimal set found.”
In our original work, we conducted computationally intensive ICON model simulations, which required several hundred thousand CPU hours for all years and parameter configurations. Given these high computational costs and the already limited experimental design (60 training points for 6 input dimensions), it was not feasible to exclude certain parameters via “cheap” simulations beforehand. Consequently, we employed a space-filling design with all 6 parameters.
Based on the results from sensitivity and parameter studies, it is indeed apparent that excluding the three parameters with minimal effect and reallocating the computational budget (or even less) to create a more accurate surrogate model for the remaining three parameters would be a promising approach. This could potentially involve the use of sequential training algorithms. However, the computational budget has already been expended for the original studies and the identification of parameter effects. Any alternative strategy would thus entail significant additional computational effort. Therefore, our focus remains on utilizing the models we have already developed without requiring further model simulations.
In our original work, we validated the Gaussian Process Regression models and demonstrated very good model accuracy. Although accuracy may vary throughout the input domain, the accuracy across the full 6-dimensional input space also covers the 3-dimensional subspace used in the optimization process. Hence, we can rely on the original validation results and the surrogate models employed.
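One property relevant to the question of the 3-dimensional subspace can be sketched as follows. The example assumes a plain random Latin hypercube, which may differ from the space-filling design actually used in the original study, and the helper names `latin_hypercube` and `is_latin` are hypothetical. It checks that dropping three columns of a 60-point, 6-dimensional Latin hypercube still leaves every retained parameter stratified into 60 distinct intervals. It says nothing about surrogate accuracy on the slice where the other three parameters are fixed at their defaults, which rests on the validation results cited above.

```python
import numpy as np

def latin_hypercube(n_samples, n_dims, rng):
    """Random Latin hypercube on [0, 1)^n_dims: one point per stratum in each dimension."""
    strata = np.array([rng.permutation(n_samples) for _ in range(n_dims)]).T
    return (strata + rng.uniform(size=(n_samples, n_dims))) / n_samples

def is_latin(design):
    """True if every column has exactly one point in each of the n equal-width bins."""
    n = design.shape[0]
    bins = np.floor(design * n).astype(int)
    return all(
        np.array_equal(np.sort(bins[:, j]), np.arange(n))
        for j in range(design.shape[1])
    )

rng = np.random.default_rng(42)
design_6d = latin_hypercube(60, 6, rng)   # 60 training points, 6 parameters
design_3d = design_6d[:, :3]              # keep 3 parameters, drop the other 3

print(is_latin(design_6d), is_latin(design_3d))
```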
As discussed in our manuscript, the overall improvement of the ICON model is relatively limited, and depends on the weights of the objectives. With the high accuracy of the surrogate models based on validation results, Figure 5 illustrates possible changes in relation to the applied weights and their associated uncertainties.
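The dependence of the optimum on the objective weights can be made concrete with a small sketch. It uses a made-up analytic surrogate with three quantities of interest, made-up reference values, and an assumed perturbation of ±50 % around equal nominal weights; none of these come from the manuscript. The weighted sum of squared errors is minimized on a parameter grid for each sampled weight vector, and the spread of the resulting optima indicates how sensitive the tuned parameters are to the weighting.

```python
import numpy as np

rng = np.random.default_rng(1)

def surrogate_qois(theta):
    """Hypothetical cheap surrogate: parameters (n, 3) -> three QoIs (n, 3)."""
    a, b, c = theta[:, 0], theta[:, 1], theta[:, 2]
    return np.column_stack([a, a**2 + 0.1 * c, b])

reference = np.array([0.8, 0.2, 0.5])     # made-up reference values

# Candidate parameter sets on a regular grid in the normalized space [0, 1]^3.
axis = np.linspace(0.0, 1.0, 11)
candidates = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1).reshape(-1, 3)
sq_err = (surrogate_qois(candidates) - reference) ** 2   # squared error per objective

def optimum(weights):
    """Grid minimizer of the weighted sum of squared errors for the given weights."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return candidates[np.argmin(sq_err @ w)]

# Weight uncertainty: perturb equal nominal weights by up to +/-50 % (an assumption).
nominal = np.full(3, 1.0 / 3.0)
weight_samples = nominal * rng.uniform(0.5, 1.5, size=(200, 3))
optima = np.array([optimum(w) for w in weight_samples])

print("optimum for nominal weights:", optimum(nominal))
print("spread of optima (std per parameter):", optima.std(axis=0))
```

In this toy problem the first two objectives pull the first parameter in opposite directions, so its optimum moves with the weights, whereas the second parameter is tied to a single objective and stays fixed.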
In the manuscript, we will provide additional details on the step where the 6-dimensional surrogate models are applied to the 3-dimensional subspace, including further explanations of the reasons behind this approach and its limitations.
Citation: https://doi.org/10.5194/egusphere-2024-1984-AC1
EC1: 'Comment on egusphere-2024-1984', Tim Woollings, 23 Sep 2024
I'd like to thank the reviewers for their constructive comments, and also the authors for their reply. I look forward to seeing a revised version of the paper.
Citation: https://doi.org/10.5194/egusphere-2024-1984-EC1