Moment-based metrics for global sensitivity analysis of hydrological systems

We propose new metrics to assist global sensitivity analysis, GSA, of hydrological and Earth systems. Our approach allows assessing the impact of uncertain parameters on main features of the probability density function, pdf, of a target model output, y. These include the expected value of y, the spread around the mean and the degree of symmetry and tailedness of the pdf of y. Since reliable assessment of higher-order statistical moments can be computationally demanding, we couple our GSA approach with a surrogate model, approximating the full model response at a reduced computational cost. Here, we consider the generalized polynomial chaos expansion (gPCE), other model reduction techniques being fully compatible with our theoretical framework. We demonstrate our approach through three test cases, including an analytical benchmark, a simplified scenario mimicking pumping in a coastal aquifer and a laboratory-scale conservative transport experiment. Our results allow ascertaining which parameters can impact some moments of the model output pdf while being uninfluential to others. We also investigate the error associated with the evaluation of our sensitivity metrics by replacing the original system model through a gPCE. Our results indicate that the construction of a surrogate model with increasing level of accuracy might be required depending on the statistical moment considered in the GSA. The approach is fully compatible with (and can assist the development of) analysis techniques employed in the context of reduction of model complexity, model calibration, design of experiment, uncertainty quantification and risk assessment.


Introduction
Our improved understanding of physical-chemical mechanisms governing hydrological processes on multiple scales of space and time and the ever increasing power of modern computational resources are at the heart of the formulation of conceptual models which are frequently characterized by marked levels of sophistication and complexity.This is evident when one considers the spectrum of mathematical formulations and ensuing level of model parametrization rendering our conceptual understanding of given environmental scenarios (Willmann et al., 2006;Grauso et al., 2008;Koutsoyiannis, 2010;Wagener et al., 2010;Elshorbagy et al., 2010a, b;Wagener and Montanari, 2011;Hartmann et al., 2013;Herman et al., 2013;Förster et al., 2014;Paniconi and Putti, 2015).Model complexity can in turn exacerbate challenges associated with the need to quantify the way uncertainties associated with parameters of a given model propagate to target state variables.
In this context, approaches based on rigorous sensitivity analysis are valuable tools to improve our ability to (i) quantify uncertainty, (ii) enhance our understanding of the relationships between model input and outputs and (iii) tackle the challenges of model-and data-driven design of experiments.These also offer insights to guide model simplification, for example, by identifying model input parameters that have negligible effects on a target output.The variety of available sensitivity methodologies can be roughly subdivided into two broad categories, i.e., local and global approaches.Local sensitivity analyses consider the variation of a model output against variations of model input solely in the neighborhood of a given set of parameter values.Otherwise, global sensitivity analysis (GSA) quantifies model sen-sitivity across the complete support within which model parameters can vary.Error measurements and/or lack of knowledge about parameters can be naturally accommodated in a GSA by specifying appropriate parameter intervals and evaluating sensitivity over the complete parameter space.Recent studies and reviews on available sensitivity analysis and approaches are offered by, e.g., Pianosi et al. (2016), Sarrazin et al. (2016) and Razavi and Gupta (2015).
Our study is framed in the context of GSA methods.A broadly recognized strategy to quantify global sensitivity of uncertain model parameters to model outputs relies on the evaluation of the Sobol' indices (Sobol, 1993).These are typically referred to as variance-based sensitivity measures because the output variance is taken as the metric upon which sensitivity is quantified.A key limitation of a variance-based GSA is that the uncertainty of the output is implicitly considered to be fully characterized by its variance.Relying solely on this criterion can provide an incomplete picture of a system response to model parameters, also considering that probability densities of typical hydrological quantities can be characterized by highly skewed and tailed distributions (e.g., Borgonovo et al., 2011).Recent studies (e.g., Krykacz-Hausmann, 2001;Borgonovo, 2007;Borgonovo et al., 2011) introduced a sensitivity metric grounded on the complete probability density function, pdf, of the model output.These so-called moment-independent analyses may suffer from operational constraints because a robust evaluation of the complete pdf may require a number of model runs which is computationally unaffordable.The PAWN method developed by Pianosi and Wagener (2015) attempts to overcome this limitation introducing a sensitivity metric based on the cumulative density function, which can potentially be estimated more robustly than its associated pdf for a given sample size.
It is clear that while a variance-based GSA can be favored for its conceptual simplicity and ease of implementation, and variance can be considered in some cases as an adequate proxy of the spread around the mean, it does not yield a forthright quantification of the way variations of a parameter can affect the structure of the pdf of a target model output.Otherwise, moment-independent methodologies condense sensitivity of the entire pdf in only one index, somehow shadowing our understanding of how the structure of the pdf is affected by variations of each uncertain model parameter.Here, our distinctive objective is to contribute to bridge the gap between these two types of GSA.We do so by introducing theoretical elements and an implementation strategy which enable us to appraise parameter sensitivity through the joint use of sensitivity indices based on four (statistical) moments of the pdf of the model output: expected value, variance, skewness and kurtosis.The key idea at the basis of this strategy is that linking parameter sensitivity to multiple statistical moments leads to improved understanding of the way a given uncertain parameter can govern key features of the shape of the pdf of desired model outputs, which is of interest in modern applications of hydrological and Earth sciences.
Variance-based GSA has also been applied (a) to guide reduction of model complexity, e.g., by setting the value of a parameter which is deemed as uninfluential to the variance of a target model output (e.g., Fu et al., 2012;Chu et al., 2015;Punzo et al., 2015) and (b) in the context of uncertainty quantification (Saltelli et al., 2008;Pianosi et al., 2016;Colombo et al., 2016).Only limited attention has been devoted to assessing the relative effects of uncertain model parameters to the first four statistical moments of the target model output.This information would complement a model complexity analysis by introducing a quantification of the impact that conditioning the process on prescribed parameter values would have on the first four statistical moments of the output.Our approach is based on the joint use of multiple (statistical) moments for GSA.It enables us to address the following critical questions: when can the variance be considered as a reliable proxy for characterizing model output uncertainty?Which model parameter most affects asymmetry and/or the tailing behavior of a model output pdf?Does a given model parameter have a marked role in controlling some of the first four statistical moments of the model output, while being uninfluential to others?Addressing these questions would contribute to prioritizing our efforts to characterize model parameters that are most relevant in affecting important aspects of model prediction uncertainty.
Even as the richness of information content that a GSA grounded on the first four statistical moments might carry can be a significant added value to our system understanding, it may sometimes be challenging to obtain robust and stable evaluation of the proposed metrics for complex and computationally demanding models.This can be especially true when considering higher-order moments such as skewness and kurtosis.To overcome this difficulty, we cast the problem within a computationally tractable framework and rely on the use of surrogate models that mimic the full model response with a reduced computational burden.Amongst the diverse available techniques to construct a surrogate model (see, e.g., Razavi et al., 2012a, b), we exemplify our approach by considering the generalized polynomial chaos expansion (gPCE) that has been successfully applied to a variety of complex environmental problems (Sudret, 2008;Ciriello et al., 2013;Formaggia et al., 2013;Riva et al., 2015;Gläser et al., 2016), other model reduction techniques being fully compatible with our GSA framework.In this context, we also investigate the error associated with the evaluation of the sensitivity metrics we propose by replacing the original (full) system model with the selected surrogate model.We consider three test cases in our analysis.These include a widely employed analytical benchmark, a pumping scenario in a coastal aquifers and a laboratory-scale transport setting.The remainder of the work is organized as follows.Section 2 presents our theoretical framework and developments.Section 3 illustrates our results for the three test cases indicated above, and conclusions are drawn in Sect. 4.

Theoretical framework
We start by recalling the widely used variance-based GSA metrics in Sect.2.1.These allow quantifying the contribution of each uncertain parameter to the total variance of a state variable of interest.We also provide a brief overview of the gPCE technique, which we use to construct a surrogate of the full system model.We then illustrate in Sect.2.2 the theoretical developments underlying our approach and introduce novel GSA indices.

Sobol' indices for variance-based GSA and generalized polynomial chaos expansion
We consider a target system state variable, y, which depends on N random parameters.These are collected in vector x = (x 1 , x 2 , . .., x N ) and defined in the parameter space = x 1 × x 2 ×. . .x N , x i = [x i,min , x i,max ] being the support of the ith random variable x i .Variance-based GSA approaches consider variance as the sole metric to quantify the contribution of each uncertain parameter to the uncertainty of y.Iman and Hora (1990) introduce the following index where E[-] and V [-], respectively, denote expectation and variance operators.Index HI x i quantifies the expected reduction of variance due to knowledge of x i (the notation |x i in Eq. (1) indicates conditioning on x i ).A similar measure is offered by the widely used Sobol' indices (Sobol, 1993).These have been defined starting from the Hoeffding-Sobol' decomposition (see, e.g., Sobol, 1993;Le Maître and Knio, 2010) of y(x) when x is a collection of independent random variables such as x i <x j y x i ,x j (x i , x j ) where and so on, ρ x being the pdf of x.The integral ∼x i y(x)ρ ∼x i dx ∼x i in Eq. (3) represents integration of y(x) over the space of all entries of vector x excluding x i , ρ ∼x i being the corresponding pdf.The Sobol' index S x i 1 ,x i 2 ,...,x is is associated with the mixed effect of x i 1 , . .., x i s on the variance of y(x), V [y], and can be computed as The principal and total Sobol' indices are, respectively, defined as Note that S x i describes the relative contribution to V y due to variability of only x i .Otherwise, S T x i quantifies the total contribution of x i to V y , including all terms where x i appears.In other words, S T x i also includes interactions between x i and the remaining uncertain parameters, collected in vector x ∼x i .Note that according to Eqs. (1), ( 2) and ( 5) i.e., the principal Sobol' index represents the relative expected reduction of process variance due to knowledge of (or conditioning on) a parameter.Sobol' indices are commonly evaluated via Monte Carlo quadrature schemes that can be markedly demanding in terms of computational time, especially for complex and highly nonlinear settings.Relying on a gPCE as a surrogate of the full mathematical model of the system (Ghanem and Spanos, 1991;Xiu and Karniadakis, 2002;Le Maitre and Knio, 2010;Formaggia et al., 2013;Ciriello et al., 2013;Riva et al., 2015) allows for reducing the computational burden associated with GSA techniques.The process y(x) is represented as a linear combination of multivariate polynomials, ψ p (x), i.e., where p = {p 1 , . the gPCE coefficients; i contains all indices such that only the ith component does not vanish; i,j contains all indices such that only the ith and j th components are not zero, and so on.Note that β 0 ≡ y 0 ; i.e., β 0 is the unconditional mean of y(x).Finally, the Sobol' indices Eqs. ( 4)-( 5) and the variance of y(x) can be computed from Eq. (8) as 2.2 New metrics for multiple-moment GSA We introduce new metrics to quantify the expected relative change of main features of the pdf of y due to variability of model input parameters.In contrast with traditional variancebased GSA techniques of the kind described in Sect.2.1, we quantify changes in the pdf of y through its first four statistical moments, i.e., mean, and kurtosis, k[y].The latter is an indicator of the behavior of the tails of the pdf of y and is particularly useful in the context of risk analysis, while γ [y] quantifies the asymmetry of the pdf of y.
The effect of changes of x on the mean of y cannot be systematically analyzed by the metrics currently available in the literature.We therefore introduce the following quantity y 0 being defined in Eq. ( 3).Extension of Eq. ( 10) to consider the joint effect of x i 1 , x i 2 , . .., x i s on the mean of y is straightforward, leading to the following index Note that index AMAE x i quantifies the expected relative variation of the mean of y due to variations of only x i , while AMAE x i 1 , . .., x is also includes all interactions amongst parameters x i 1 , x i 2 , . .., x i s .
Along the same lines, we introduce the following index quantifying the relative expected discrepancy between unconditional and conditional (on x i ) process variance.Note that Eq. ( 12) does not generally coincide with the principal Sobol' index S x i in Eq. ( 2) that quantifies the expected relative reduction of the variance due to knowledge of x i (or, in other words, the relative contribution to the variance arising from uncertainty in x i ).Index AMAV x i reduces to S x i only if the conditional variance, V [y|x i ], is always (i.e., for each value of x i ) smaller than (or equal to) its unconditional counterpart V y .The difference between AMAV x i and S x i , as well as advantages of using AMAV x i , will be elucidated through the numerical examples illustrated in Sect.3. Extension of Eq. ( 12) to consider the joint effect of x i 1 , x i 2 , . .., x i s reads (13) Index AMAV x i 1 , . .., x is quantifies the expected relative discrepancy between V y and the variance of the process conditional to joint knowledge of x i 1 , x i 2 , . .., x i s .
We then quantify the relative expected discrepancy between unconditional, γ [y], and conditional, γ [y|x i ], skewness through the index Extension of Eq. ( 14) to consider the joint effect of x i 1 , x i 2 , . .., x i s gives Hydrol.Earth Syst.Sci., 21, 6219-6234, 2017 www.hydrol-earth-syst-sci.net/21/6219/2017 The relative variation of the kurtosis of y due to variations of a parameter x i or of the parameter set x i 1 , x i 2 , . .., x i s can be, respectively, quantified through Relying jointly on Eqs. ( 10)-( 17) enables one to perform a comprehensive GSA of the target process y(x) quantifying the impact of x on the first four (statistical) moments of the pdf of y(x).This strategy yields information about the way important elements of the distribution of y(x), such as mean, spread around the mean, symmetry and tailedness, are affected by uncertain model parameters collected in the parameter vector x.This analysis is not feasible through a classical variance-based GSA.
Calculation of the indices we propose entails evaluation of conditional moments of y(x).This step can be computationally very demanding.Along the lines of our discussion about Sobol' indices in Sect.2.1, the new metrics Eqs. ( 10)-( 17) can be evaluated via a surrogate model, as we illustrate through our examples in Sect.3.

Illustrative examples
The theoretical framework introduced in Sect. 2 is here applied to three diverse test beds: (a) the Ishigami function, which constitutes an analytical benchmark typically employed in GSA studies; (b) a pumping scenario in a coastal aquifer, where the state variable of interest is the critical pumping rate, i.e., the largest admissible pumping rate to ensure that the extraction well is still not contaminated by seawater; and (c) a laboratory-scale setting associated with nonreactive transport in porous media.In the first two examples the relatively low computational costs associated with the complete mathematical description of the target outputs enables us to assess the error associated with the evaluation of indices Eqs. ( 10), ( 12), ( 14) and ( 16) through a gPCE representation of the output.In the third case, due to the complexity of the problem and the associated computational costs, we rely on the gPCE representation for the target quantity of interest.We emphasize that the use of a gPCE as a surrogate model is here considered only as an example, as our GSA approach is fully compatible with any full model and/or model order reduction technique.A critical limiting factor to our and any GSA approach could be the associated computational burden.This is expected to increase according to the following two features, which are mainly associated with the conceptual and mathematical model used to describe the target variables of interest: (a) the complexity of the hydrological system (in terms of, e.g., hydrogeological heterogeneity, nonlinearity and/or transient effects) and/or (b) the number of uncertain model input parameters considered.According to the relative weight of these features, some computational constraints might arise limiting our ability to (i) perform the analysis by relying exclusively on the full system model or (ii) construct a sufficiently accurate surrogate model through a number of full model runs that can be affordable in terms of available computational resources.Application of our GSA methodology to scenarios of increased level of complexity will be the subject of a future study.
In all of the above scenarios, uncertain parameters x i collected in x are considered as independent and identically distributed, i.i.d., random variables, each characterized by a uniform distribution within the interval x i = x i,min , x i,max .Note that varying the pdf of the uncertain model input parameters does not impact the definition of the GSA indices proposed in Sect. 2. Otherwise, it may affect the actual results, depending on the test case considered.All results are grounded on 5×10 5 Monte Carlo realizations, enabling convergence of all statistical moments analyzed.Series appearing in the gPCE Eq. ( 8) are evaluated up to a given order of truncation in all three examples.Here, we apply the totaldegree rule and construct a polynomial of order w through a sparse grid technique (see, e.g., Formaggia et al., 2013, and references therein).We then analyze the way the selected order w influences the results.Note that the optimal choice of the polynomial ψ p (x) in Eq. ( 8) depends on the pdf of the random variables collected in x (Xiu and Karniadakis, 2002).In our exemplary settings we use the multidimensional Legendre polynomials which are orthonormal with respect to the uniform pdf.

Ishigami function
The nonlinear and non-monotonic Ishigami function is widely used in the literature (e.g., Homma and Saltelli, 1996;Chun et al., 2000;Borgonovo, 2007;Sudret, 2008; www.18) can be evaluated analytically as Equation ( 19) reveals that the unconditional pdf of ISH is symmetric with tails that increase with |b| and decrease with |a|, as quantified by k ] can be evaluated analytically as For the sole purpose of illustrating our approach, here and in the following we set a  20), it is seen that E [ISH|x 3 ] coincides with its unconditional counterpart E [ISH], indicating that conditioning on any value of x 3 does not impact the mean of ISH.Otherwise, setting x 1 or x 2 to a given value clearly affects the mean of ISH in a way which is governed by Eq. ( 20) and shown in Fig. 1a.It is clear from Eq. ( 20) that E [ISH|x 2 ] has a higher frequency of oscillation within x 2 than E [ISH|x 1 ] has within x 1 .The global index in Eq. ( 10) can be evaluated analytically as Note that AMAE x 2 does not depend on specific values of a and b.Equation ( 21) shows that all random model parameters influence the variance of ISH, albeit to different extents, as also illustrated in Fig. 1b.Note that V [ISH|x 2 ] is always smaller than V [ISH] (compare Eqs. 19a and 21) and does not depend on x 2 ; i.e., conditioning ISH on x 2 reduces the process variance regardless the conditioning value.Otherwise, V [ISH|x 3 ] can be significantly larger or smaller than its unconditional counterpart.Table 1 lists values of AMAV x i (x i = x 1 , x 2 , x 3 ) computed via Eq.( 12) with the a and b values selected for our demonstration.The principal Sobol' indices (Sudret, 2008) are also listed for completeness.As expected, values of AMAV x i listed in Table 1 suggest that conditioning on x 3 has the strongest impact on the variance of ISH, followed by x 1 and x 2 .Note that S x 3 = 0, a result which might be interpreted as a symptom that ISH is insensitive to x 3 .The apparent inconsistency between the conclusions which could be drawn by analyzing AMAV x 3 and S x 3 is reconciled by the observation that the function V [ISH] − V [ISH|x 3 ] can be positive and negative in a way that its integration over x 3 vanishes (see also Fig. 1b).Therefore, the mean reduction of the variance of ISH due to knowledge of (or conditioning on) x 3 is Table 1.Global sensitivity index AMAE x i Eq. ( 10), AMAV x i Eq. ( 12), AMAγ x i Eq. ( 14) and AMAk x i Eq. ( 16) associated with the Ishigami function Eq. ( 18).Principal Sobol' indices, S x i Eq. ( 7), are also listed; zero.We note that this observation does not imply that the variance of ISH does not vary with x 3 , as clearly highlighted by Fig. 1b and quantified by AMAV x 3 .The symmetry of the pdf of ISH is not affected by conditioning on x 2 or x 3 , as demonstrated by Eq. ( 22).Otherwise, γ [ISH|x 1 ] is left (or right) skewed when x 1 is smaller (or larger) than 0.5, as dictated by Eq. ( 22) and shown in Fig. 1c.
The conditional kurtosis k [ISH|x 2 ] does not depend on the conditioning value x 2 (see Eq. 23).We then note that this conditional moment is always larger than (or equal to) its unconditional counterpart k [ISH], regardless of the particular values assigned to a and b, as we verified through extensive numerical tests.This result implies that the pdf of ISH conditional on x 2 is characterized by tails which are heavier than those of its unconditional counterpart.Figure 1d reveals that k [ISH|x 1 ] and k [ISH|x 3 ] are smaller than k [ISH] for the values of a and b implemented in this example.Table 1 lists the resulting values of AMAk x i (x i = x 1 , x 2 , x 3 ) for the selected a and b values.
We close this part of the study by investigating the error which would arise when one evaluates our GSA indices by replacing ISH through a gPCE surrogate model.We do so on the basis of the absolute relative error where j = AMAE x i , AMAV x i , AMAγ x i or AMAk x i (x i = x 1 , x 2 , x 3 ); the subscripts full model and gPCE, respectively, indicate that quantity j is evaluated via Eq.( 18) or through a gPCE surrogate model, constructed as outlined in Sect.2.1.Figure 2 depicts Eq. ( 26) versus the total degree w of the gPCE.Note that the lower limit of the vertical axis of Fig. 2 is set to 0.001 % for convenience of graphical representation.Approximation errors associated with GSA indices related to the mean, AMAE x i , rapidly approach zero as w increases.
Note that e AMAE x 3 is smaller than 0.001 % for all tested values of w and it is therefore not included in Fig. 2a.Values of e j linked to AMAV x i , AMAγ x i and AMAk x i do not show a consistently decreasing trend until w ≥ 5. Values of e j associated with the variance, skewness and kurtosis decrease with approximately the same average linear rate (in log-log scale) for the largest w considered (Fig. 2b, c and d).This example reinforces the need for reliably testing the accuracy of a gPCE-based model approximation as a function of the x w  total degree desired, depending on the statistical moment of interest.Note that a generalization of our findings about the error (Eq.26) is outside the scope of the current study.This would require the derivation of (a) the analytical format of the pdf of a target model output through its gPCE-based approximation at a given order w (see, e.g., Riva et al., 2015) and (b) the corresponding pdf resulting from the full system model (e.g., by formulating and solving exact equations for the target pdf, or its moments, typically invoking problemspecific assumptions).

Critical pumping rate in coastal aquifers
The example we consider here is taken from the study of Pool and Carrera (2011) related to the analysis of salt water contamination of a pumping well operating in a homogenous confined coastal aquifer of uniform thickness b .The setting is sketched in Fig. 3.
is pumped from a fully penetrating well located at a distance x w [L] from the coastline, and a constant freshwater flux, q f [L T −1 ], flowing from the inland to the coastline, is set.Pool and Carrera (2011) introduced a dimensionless well discharge Q w = Q w /(b x w q f ) and defined the critical pumping rate Q c as the value of Q w at which a normalized solute concentration monitored at the well exceeds 0.1 %.A key result of the study of Pool and Carrera (2011) is that Q c can be approximated through the following implicit equation: Hydrol.Earth Syst.Sci., 21, 6219-6234, 2017 www.hydrol-earth-syst-sci.net/21/6219/2017/ Here, is the uniform hydraulic conductivity; α T [L] is transverse dispersivity; ρ = ρ s − ρ f ; ρ f and ρ s are fresh and saltwater densities, respectively.The quantity P e T is a measure of the intensity of dispersive effects, J is the natural head gradient of the incoming freshwater and x w is the dimensionless distance of the well from the coastline.Pool and Carrera (2011) demonstrated the accuracy of Eq. ( 27) in predicting the critical pumping rate when λ D ∈ [0 − 10].Additional details about the problem setting, boundary and initial conditions, as well as geometrical configuration of the system, can be found in Pool and Carrera (2011).Here, we focus on the main result of Eq. ( 27) which represents the complete mathematical description of the problem we analyze.We perform a sensitivity analysis of Q c with respect to P e T , J and x w .While the first two quantities are difficult to assess experimentally in practical applications, the well location can be considered as an operational/design variable.Table 2 lists the intervals of variation we consider for P e T , J and x w .10), AMAV x i Eq. ( 12), AMAγ x i Eq. ( 14) and AMAk x i Eq. ( 16) associated with the critical pumping rate Q c Eq. ( 27).Principal Sobol' indices, S x i Eq. ( 7), are also listed; x i = P e T , J , x w .and x w (red curves) within the parameter space.The corresponding unconditional moments (black curves) are also depicted for completeness.Note that each parameter interval of variation has been normalized to span the range [0, 1] for graphical representation purposes.Table 3 lists the values of indices AMAE x i AMAV x i , S x i , AMAγ x i and AMAk x i (x i = P e T , J , x w ) associated with Q c .As in our first example, it is clear that sensitivity of Q c with respect to P e T , J , x w depends on the statistical moment of interest.Inspection of Fig. 4a reveals that the mean of Q c is more sensitive to conditioning on J or x w than to conditioning on P e T .Note that increasing P e T , i.e., considering advectiondominated scenarios, leads to an increase of the mean value of Q c .This is so because the dispersion of the intruding saltwater wedge is diminished and the travel time of solutes to the well tends to increase.High values of the natural head gradient of the incoming freshwater, J , are associated with high mean values of Q c .This is consistent with the observation that the inland penetration of the wedge is contrasted by the effect of freshwater which flows in the opposite direction.As expected, decreasing x w (moving the pumping well towards the coast) leads to a reduction of the mean value of Q c . Figure 4a shows that mean Q c varies with x w and J in a similar way.This outcome is consistent with Eq. ( 27) where Q c depends on the product x w J , i.e., increasing x w or J has the same effect on Q c .

AMAE
It can be noted (see Table 3) that AMAE P e T is smaller than AMAE J and AMAE x w , consistent with Fig. 4a. Figure 4b shows that the variance of Q c decreases as P e T , J or x w increase.This trend suggests that the uncertainty of Q c , as quantified by the variance, decreases as (i) the intruding wedge sharpens or is pushed toward the seaside boundary by the incoming freshwater or (ii) the well is placed at increasing distance from the coastline.Inspection of Fig. 4c-d shows that conditioning on P e T , J or x w causes the pdf of Q c to become less asymmetric and less tailed than its unconditional counterpart.This behavior suggests that the relative frequency of occurrence of (high or low) extreme values of Q c tends to decrease as additional information about the model parameters become available.
Figure 5 depicts error, e j (Eq.( 26)), versus total degree, w, of the gPCE representation of Q c , for j = (a) AMAE x i , (b) AMAV x i , (c) AMAγ x i and (d) AMAk x i (x i = P e T , blue curves; J , red curves; x w , green curves).These results indicate that (i) e j associated with AMAE x i is negligible (≈ 1 %) even for low w; (ii) e AMAV P e T ≈ 10 % for w = 2 and rapidly decreases to values below 1 % for increasing w; (iii) e AMAV J and e AMAV xw are always smaller than 1 %; and (iv) the trend of e AMAγ x i is similar to that of e AMAk x i for all x i , with values of the order of 10 % or higher for w = 2 and displaying a decrease with increasing w which then stabilizes around values smaller than 1 % when w ≈ 4 or 5.We note that the absolute relative error (Eq.26) for AMAE x i with a given value of w is always lower than errors associated with higher-order moments.Similar to our results in Sect.3.1, it is clear from Fig. 5 that attaining a given level of accuracy for the gPCEbased indices for Q c requires considering a diverse total order w of the gPCE depending on the order of the statistical moment considered.As such, following the typical practice of assessing the reliability of a gPCE surrogate model solely on the basis of the variance or of a few random model realizations does not guarantee satisfactory accuracy of the uncertainty analysis of a target model output, which requires considering higher-order statistical moments.

Solute transport in a laboratory-scale porous medium with zoned heterogeneity
As a last exemplary showcase, we consider the laboratoryscale experimental analysis of nonreactive chemical transport illustrated by Esfandiar et al. (2015).These authors consider tracer transport within a rectangular flow cell filled with two types of uniform sands.These were characterized by diverse porosity and permeability values, which were measured through separate, standard laboratory tests.A sketch of the experimental setup displaying the geometry of the two uniform zones, respectively, formed by coarse and fine sand is illustrated in Fig. 6.After establishing fully saturated steady-state flow, a solution containing a constant tracer concentration is injected as a step input at the cell inlet.The tracer breakthrough curve is then defined in terms of the temporal variation of the spatial mean of the concentration detected along the flow cell outlet.Esfandiar et al. (2015) modeled the temporal evolution of normalized (with respect to the solute concentration of the injected fluid) concentration at the outlet, C(t) (t denoting time), by numerically solving within the flow domain the classical advection-dispersion equation implementing an original and accurate space-time grid adaptation technique.Unknown longitudinal dispersivities of the two sands (a L,i , i = 1, 2, respectively, denoting the coarse and fine sand) were considered as uncertain system parameters to be estimated against the available experimental solute breakthrough data.To minimize the computational costs in the model calibration process, Esfandiar et al. (2015) relied on a gPCE approximation of C(t).The authors constructed a gPCE of total degree w = 3 by considering log 10 a L,i to be two i.i.d.random variables uniformly distributed within log 10 (aL,i) = [−6, −2], a L,i being expressed in [m].Further details about the problem setup, numerical discretization and grid adaptation technique as well of the construction of the gPCE representation can be found in Esfandiar et al. (2015).Here, we ground the application of our new GSA metrics on the gPCE surrogate model already constructed by Esfandiar et al. (2015) to approximate C(t).
Figure 7 depicts the temporal evolution of the unconditional expected value, E C (t) , variance, V C (t) , skewness, γ C (t) , and kurtosis, k C(t) , of normalized C(t).Time steps t 0.02 , t 0.4 , and t 0.96 , i.e., the times at which E C (t) = 0.02, 0.4 and 0.96, respectively, are highlighted in Fig. 7a. Figure 7a reveals a pronounced tailing of E C (t) at late times, the short time mean breakthrough being associated with a rapid temporal increase of E C (t) .A local minimum at t 0.4 and two local peaks are recognized in V C (t) (Fig. 7b).The variance peaks at times approximately corresponding to the largest values of ∂ 2 E C (t) /∂t 2 .This outcome is consistent with the results of numerical Monte Carlo (MC) simulations depicted in Fig. 8 of Esfandiar et al. (2015) where the largest spread of the MC results is ob-served around these locations.The local minimum displayed by V C (t) suggests that C(t) at observation times close to t 0.4 is mainly driven by advection, consistent with the observation that advective transport components are the main driver of the displacement of the center of mass of a solute plume.The late time variance V C (t) tends to vanish because the normalized breakthrough curve is always very close to unity irrespective of the values of a L,1 and a L,2 .Joint inspection of Fig. 7c and d reveals that the pdf of C(t) tends to be symmetric around the mean (Fig. 7c) and characterized by light tails (Fig. 7d) at about t 0.4 .Otherwise, the pdfs of C(t) tend to display heavy right or left tails, respectively, for observation times shorter or longer than t 0.4 .These observations suggest that the relative frequency of rare events (i.e., very low or high solute concentrations, which can be of some concern in the context of risk assessment) is lowest at intermediate observation times across the duration of the experiment.
Figure 8 depicts the temporal evolution of (a) AMAE x i , (b) AMAV x i , (c) AMAγ x i and (d) AMAk x i (x i = log 10 (a L,1 ), log 10 (a L,2 )) of C(t).Results shown in Fig. 8 demonstrate that statistical moments of C(t) are more sensitive to log 10 (a L,1 ) than to log 10 (a L,2 ) at early times.The opposite occurs when t > t 0.4 .Our set of results suggests that the overall early time pattern of solute breakthrough is mainly dictated by the value of a L,1 ; the late time behavior is chiefly influenced by a L,2 .These conclusions are supported by the results of Figs.9-11, where we depict the expected value, variance, skewness and kurtosis of C(t) conditional to log 10 (a L,1 ) (blue curves) and log 10 (a L,2 ) (red curves), at times t = t 0.02 (Fig. 9), t 0.4 (Fig. 10) and t 0.96 (Fig. 11).The corresponding unconditional moments are also depicted (black curves) for ease of comparison.Figure 9 shows that the first four statistical moments of C (t 0.02 ) are practically insensitive to the value of the fine sand dispersivity, a L,2 .As one could expect by considering the relative size and geometrical pattern of the two sand zones, Fig. 9a shows that the average amount of solute reaching the cell outlet at early times increases with a L,1 , because dispersion of solute increases through the coarse sand which resides in the largest portion of the domain.Figure 9b shows that V C(t 0.02 ) is negligible when a L,1 is known.Consistent with this result, Fig. 9c and d, respectively, show a reduction in the asymmetry and in the tailing behavior of the pdf of C (t 0.02 ) when a L,1 is fixed.These results are a symptom of a reduced process uncertainty, which is in line with the observation that the bulk of the domain is filled with the coarse sand whose dispersive properties become deterministic when a L,1 is known.
Inspection of the first four unconditional statistical moments of C (t 0.4 ) (black curves in Fig. 10) indicates that the unconditional pdf of C at this intermediate time closely resembles a Gaussian distribution.Conditioning C (t 0.4 ) on dispersivity causes a variance reduction, an increase of the tailing and the appearance of a negative (left) or positive (right) skewness, respectively, when conditioning is per-  formed on a L,1 or a L,2 .This behavior suggests that in the type of experimental setting analyzed the variability of a L,1 promotes the appearance of values of C (t 0.4 ) larger than the mean, the opposite of what occurs when solely a L,2 is considered as uncertain.
Figure 11 shows that all four statistical moment of C (t 0.96 ) are chiefly sensitive to the dispersivity of the fine sand box, which is placed near the cell outlet.One can note that knowledge of a L,2 yields a diminished variance of C (t 0.96 ), which drops almost to zero, an increased degree of symmetry and a reduce tailing of the pdf of C (t 0.96 ), all symptoms of uncertainty reduction.
Results depicted in Figs.9-11 and our earlier observations about Fig. 7 are consistent with the expected behavior of  transport in the system and the relative role of the dispersivities of the two sand regions.The high level of sensitivity of C(t) to a L,1 at the early times of solute breakthrough is in line with the observation that solute particles are mainly advected and dispersed through the coarse sand.Both dispersivities affect the behavior of C(t) at intermediate times, when solute is traveling through both sands.The dispersivity of the coarse sand plays a minor role at late times because virtually no concentration gradients arise in this portion of the domain.Otherwise, concentration gradients persist in the fine sand zone close to the outlet, and the solute breakthrough is clearly controlled by the dispersive properties of the fine sand.

Conclusions
We introduce a set of new indices to be employed in the context of global sensitivity analysis, GSA, of hydrological and Earth systems.These indices consider the first four (statistical) moments of the probability density function, pdf, of a desired model output, y.As such, they quantify the expected relative variation due to the variability in one (or more) model input parameter(s), of the expected value, variance, skewness and kurtosis of y.When viewed in the current research trend, our work is intended to bridge the gap between variance-based and pdf-based GSA approaches since it embeds the simplicity of the former while allowing for a higher-order description of how the structure of the pdf of y is affected by variations of uncertain model parameters.We cope with computational costs, which might be high when evaluating higher-order moments, by coupling our GSA approach with techniques approximating the full model response through a surrogate model.For the sake of our study, we consider the generalized polynomial chaos expansion (gPCE), other model reduction techniques being fully compatible with our approach.Our new indices can be of interest for application in the context of current practices and evolving trends in factor fixing procedures (i.e., assessment of the possibility of fixing a parameter value on the basis of the associated output sensitivity), experiment design, uncertainty quantification and environmental risk assessment due to the role of the key features of a model output pdf in such analyses.
We exemplify our methodology on three test beds: (a) the Ishigami function, which is widely employed to test sensitivity analysis techniques, (b) the evaluation of the critical pumping rate to avoid salinization of a pumping well in a coastal aquifer and (c) a laboratory-scale nonreactive transport experiment.Our theoretical analyses and application examples lead to the following major conclusions.
The calculated sensitivity of a model output, y, with respect to a parameter depends on the selected global sensitivity index, i.e., variability of a model parameter affects statistical moments of y in different ways and with different relative importance depending on the statistical moment considered.Relying on the indices we propose allows us to enhance our ability to quantify how model parameters affect features of the model output pdf, such as mean, degree of spread, symmetry and tailedness, in a straightforward and easily transferrable way.
Joint inspection of our moment-based global sensitivity indices and of the first four statistical conditional and unconditional moments of y increases our ability to understand the way the structure of the model output pdf is controlled by model parameters.As demonstrated in our examples, classical variance-based GSA methods cannot be used for this purpose, leading, in some cases, to the unwarranted conclusion that a given parameter has a limited impact on a target output.
Analysis of the errors associated with the use of a surrogate model for the evaluation of our moment-based sensitivity indices suggests that (a) attaining a given level of accuracy for the gPCE-based indices associated with a target variable, y, might require considering a diverse total order w of the gPCE, depending on the target statistical moment considered in the GSA of y; and (b) in our examples, the absolute relative error (Eq.26) for AMAE x i based on a given total degree w of the gPCE approximation is always lower than its counterpart associated with higher-order moments (see Figs. 2 and 5).
= 5 and b = 0.1, which corresponds to E [ISH] = 2.50, V [ISH] = 10.84 and k [ISH] = 4.18.Figure 1 depicts the first four moments of ISH conditional to values of x 1 (blue curves), x 2 (red curves) and x 3 (green curves) within the parameter space.The corresponding unconditional moments (black curves) are also depicted for completeness.Comparing Eqs.(19a) and (

Figure 2 .
Figure2.Error e j Eq. (26) versus the total degree w of the gPCE representation of ISH for j = (a) AMAE x i , (b) AMAV x i , (c) AMAγ x i and (d) AMAk x i , with x i = x 1 (blue curves), x 2 (red curves), x 3 (green curves).Note that AMAE x 3 is always smaller than 0.001 %.Average slopes of the rate of decrease of e j for the largest w values considered are indicated as a reference in (b-d).

Figure 3 .
Figure3.Sketch of the critical pumping scenario taking place within a coastal aquifer of thickness b .A constant freshwater (in blue) flux, q f , flows from the inland to the coastline (saltwater in red).A constant discharge, Q w , is pumped from a fully penetrating well located at a distance x w from the coastline.Color scale indicating variable concentration of salt is only qualitative for illustration purposes.

Figure 4 .
Figure 4. First four moments of Q c Eq. (27) conditional to values of P e T (blue curves), J (green curves) and x w (red curves) within the parameter space:(a) expected value, E Q c |x i , (b) variance, V Q c |x i , (c) skewness, γ Q c |x i , and (d) kurtosis, k Q c |x i , (x i = P e T , J , x w ).The corresponding unconditional moments (black curves) are also depicted.Intervals of variation of P e T , J and x w have been rescaled between 0 and 1 for graphical representation purposes.
These are designed to (a) resemble realistic field values and (b) obey the above-mentioned constraint about λ D .Numerical evaluation of the first four unconditional statistical moment of Q c yields a mean value E [Q c ] = 1.65, variance V [Q c ] = 0.17, skewness γ [Q c ] = −0.30(which indicates a light asymmetry in the pdf) and kurtosis k [Q c ] = 2.51 (i.e., pdf tails decrease faster than those of a Gaussian distribution). Figure 4 depicts the first four moments of Q c conditional to values of P e T (blue curves), J (green curves)

Figure 5 .
Figure5.Error e j Eq. (26) versus total degree w of the gPCE representation of Q c , for j = (a) AMAE x i , (b) AMAV x i , (c) AMAγ x i and (d) AMAk x i ; x i = P e T (blue curves), J (red curves), x w (green curves).

Figure 8 .
Figure 8.Time evolution of the global sensitivity index (a) AMAE x i , (b) AMAV x i and S x i (dashed curves), (c) AMAγ x i and (d) AMAk x i of C(t) (x i = log 10 (a L,1 ) (blue) or log 10 (a L,2 ) (red)).

Table 3 .
Global sensitivity index AMAE x i Eq. ( x i AMAV x i S x i AMAγ x i AMAk x i