Toward seamless hydrologic predictions across scales

Land surface and hydrologic models (LSM/HM) are used at diverse spatial resolutions ranging from 1-10 km in catchment-scale applications to over 50 km in global-scale applications. Application of the same model structure at different spatial scales requires that the model estimates similar fluxes independent of the model resolution and fulfills a flux-matching condition across scales. An analysis of state-of-the-art LSMs and HMs reveals that most do not have consistent and realistic parameter fields for land surface geophysical properties. Multiple experiments with the mHM, Noah-MP, PCR-GLOBWB and 5 WaterGAP models are conducted to demonstrate the pitfalls of poor parameterization practices currently used in most operational models, which are insufficient to satisfy the flux-matching condition. These examples demonstrate that J. Dooge’s 1982 statement on the unsolved problem of parameterization in these models remains true. We provide a short review of existing parameter regionalization techniques and discuss a method for obtaining seamless hydrological predictions of water fluxes and states across multiple spatial resolutions. The multiscale parameter regionalization (MPR) technique is a practical and robust 10 method that provides consistent (seamless) parameter and flux fields across scales. A general model protocol is presented to describe how MPR can be applied to a specific model, with an example of this application using the PCR-GLOBWB model. Applying MPR to PCR-GLOBWB substantially improves the flux-matching condition. Estimation of evapotranspiration without MPR at 5 arcmin and 30 arcmin spatial resolutions for the Rhine river basin results in a difference of approximately 29%. Applying MPR reduce this difference to 9%. For total soil water, the differences without and with MPR are 25% and 7%, 15

1 Introduction ... "If it disagrees with experiment, it's wrong".Richard P. Feynman upscale evaporation and transpiration fluxes (e.g., Kozak et al., 2005) at the field scale and have not been used at larger scales (Vereecken et al., 2007) until very recently.Montzka et al. (2017) applied the scaling proposed by Miller and Miller (1956) to generate a global database of soil hydraulic properties.This data set, however, has not been used by any LSM/HMs up to now.The inverse modeling approach is frequently used to estimate soil-related parameters at regional scales (Kitanidis and Vomvoris, 2010).Stochastic and geostatistical theories (Dagan, 1989;Gelhar, 1993;Neuman, 2010) have also been applied in saturated porous media for upscaling measured point-scale geophysical properties to the aquifer scale, including volume averaging theories and pre-asymptotic and asymptotic expansion theories.More recently, coarse graining methods (Attinger, 2003) were introduced to reduce the complexity of complex groundwater models and use effective aquifer parameters.Many of these seminal methodological and mathematical developments in scaling issues, developed for sub-surface hydrological problems, have not been applied or incorporated into regional-scale LSM/HMs.
A third reason is related to the lack of transparency in most of the existing LSM/HM source code of with respect to the meaning, origin and uncertainty associated with the hard-coded numerical values (i.e., parameters) in the code or in the lookup tables.It has been shown that these hidden parameters constrain the agility of the numerical model because they hinder the possibility of exploring their sensitivity on model outputs and the possibility of inferring them using observations (Mendoza et al., 2015).Model source code is often mixed, with no clear distinctions between physical or numerical constants (e.g., the acceleration of gravity g or π 2 ) and empirical effective parameters such as the soil porosity of a given soil type (Oleson et al., 2013, see p. 160).Cuntz et al. (2016) noted that model output fluxes in the NOAH-MP model are sensitive to two-thirds of its applicable standard parameters, but most are hidden in the source code.From a statistical point of view, parameters and numerical constants are categorically very different.The numerical constants can be specified with a great level of precision, but the physical constants and parameters cannot be because they must be treated as random variables (Nearing et al., 2016).
In this study, we provide a short overview of the challenges and limitations of existing HM/LSM parameterizations in providing seamless predictions of water fluxes and states across multiple spatial resolutions.Through several control experiments, we demonstrate that a large portion of the predictive uncertainty in existing LSM/HMs originates from the poor estimation of effective parameters, which leads to a lack of scale invariance and thus to their poor transferability across scales and locations.
We argue that the multiscale parameter parameterization (MPR) technique (Samaniego et al., 2010b) offers a method to address the challenges of linking the field scale (observations) with the catchment scale (Dooge, 1982) and to account for the effect of the spatial variability of geophysical characteristics in the parameterization of hydrologic processes that operate at a range of spatial resolutions (Dooge, 1982;Wood et al., 1988).Because MPR relies on empirical transfer functions to link geophysical properties with model parameters, it provides a very effective procedure to transfer "global parameters" to scales and locations other than those used in model calibration (Samaniego et al., 2010a, b;Kumar et al., 2013b).This dependency on several transferable coefficients also contributes to a serious drawback of spatially explicit models called "overparametrization" (Beven, 1995).Finally, we provide a modeling protocol that can be used as a guide to address the key question stated by Wood (1990): The core of the Freeze and Harlan (1969) blueprint is embracing the spatial distribution of geophysical land-surface attributes in an HM/LSM.A large part of the literature in hydrology (Blöschl et al., 2013) is devoted to streamflow prediction, which, according to Jakeman and Hornberger (1993), is a hydrological variable that exhibits low dimensionality and represents the integral signal of a basin.As a result, most of the "precipitation-runoff" models were conceived as lumped models (e.g., Crawford and Linsley, 1966;Burnash et al., 1973;Lindstrom et al., 1997;Edijatno et al., 1999;Fenicia et al., 2011;Martina et al., 2011;Andréassian et al., 2014;Singh et al., 2014) or as semi-distributed models (e.g., Leavesley et al., 1983;Kavetski et al., 2003;Lindström et al., 2010;Hundecha and Bárdossy, 2004;Merz and Blöschl, 2004;Hundecha et al., 2016) and thus lead to poor spatial representation of geophysical parameters, state variables and fluxes.
Lumped and semi-distributed models often exhibit reasonably good efficiency in predicting streamflow observations at determined locations, but their performance is largely dependent on the parameter calibration.Due to their excessive reliance on parameter calibration (Brynjarsdottir and O'Hagan, 2014), there is a trade-off: the performance at interior points of the basin or at other locations becomes worse than that for calibrated outlets (Pokhrel and Gupta, 2010;Lerat et al., 2012).A second drawback of this type of model is that the parameter fields, which are obtained by combining lumped model parameters from sub-basins, are unrealistic because they exhibit sharp discontinuities along the sub-basin boundaries (Li et al., 2012;Blöschl et al., 2013;Merz and Blöschl, 2004).In addition, these "patch quilt" parameter fields exhibit significant sensitivity to the calibration conditions (Merz and Blöschl, 2004).Thus, these models may exhibit (1) poor predictability of state variables and fluxes at locations and periods not considered in calibration and (2) sharp discontinuities along sub-basin boundaries in state, flux and parameter fields (e.g., Merz and Blöschl, 2004;Lindström et al., 2010).Parameter fields derived using basinwise "calibrated" lumped models lack spatial seamlessness, and thus are "inadequate representations of real-world systems" (Savenije and Hrachowitz, 2017).
There have been many attempts to improve the realism and performance of lumped and semi-distributed models by further discretizing the sub-basins into a given number of regions that exhibit nearly similar hydrologic responses, i.e., the so-called hydrologic response units (HRU) concept initially proposed by Leavesley et al. (1983) and further developed by others (e.g., Flügel, 1995;Beldring et al., 2003;Blöschl et al., 2008;Viviroli et al., 2009;Zehe et al., 2014).Unfortunately, the results obtained in these modeling attempts have not been very successful in realistically representing the spatial variability of model parameters, states and fluxes because of the lack of regionalized parameters and the unabridged reliance on parameter calibration to improve model performance.Many attempts have been tested to enforce the continuity and monotony of the parameter fields (Gotzinger and Bárdossy, 2007;Singh et al., 2012), but with somewhat limited success in the transferability of param-Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2017-89, 2017 Manuscript under review for journal Hydrol.Earth Syst.Sci. Discussion started: 6 March 2017 c Author(s) 2017.CC-BY 3.0 License.eters across scales and locations.In addition, models parameterized using the HRU concept do not lead to mass conservation of water fluxes (i.e., flux-matching) when applied to scales other than those used in calibration (Kumar et al., 2010(Kumar et al., , 2013b)).
Recent attempts have been made to improve the HRU concept to increase the seamless representation of parameters, states and fluxes (Chaney et al., 2016a).However, this concept has not been tested for scalability and seamlessness of the estimated fields at coarse resolution.Representative elementary watersheds (Reggiani et al., 1998) are an interesting theoretical concept, which scales mass and momentum balance equations that, to the best of our knowledge, have not been used to estimate effective parameters at meso-and regional scales.Recently, a thermodynamic reinterpretation of the HRU concept was proposed by Zehe et al. (2014), but to date, the implementation of this approach has not found its way into meso-to macro-scale LSMs/HMs.
A priori regularization functions (e.g., pedo-transfer functions) were introduced by Koren et al. (2013) to correct the "inappropriate randomness in the spatial patterns of model parameters", i.e., the lack of seamlessness.Unfortunately, in this case, the parameters (or coefficients) of regularization functions were not subject to parameter estimation or to verification of their ability to predict fluxes and states across various scales.The use of empirical point-scale-based relationships to link geophysical characteristics with LSM/HM parameters and the generalized assumption that their coefficients are universally applicable with certainty (e.g., the coefficients in the Clapp and Hornberger (1978) pedo-transfer functions) is a major reason for the proliferation of hidden parameters in LSM/HM code (Mendoza et al., 2015;Cuntz et al., 2016).
Many types of regionalization (or regularization) approaches have been tested for semi-distributed and distributed models.
None of these procedures consider the sub-grid variability of the model parameters or geophysical characteristics.Livneh and Lettenmaier (2013) noted that most of these regionalization procedures exhibit limited transferability because of the use of discrete soil texture classes as predictors.
Recently, a dissimilarity-based regionalization technique was used by Beck et al. (2016) to generate an ensemble of global parameters of the HBV model at a 0.5 • resolution for global-scale hydrological modeling.A shortcoming of this approach is the use of ad hoc nearest-neighbor interpolation of parameter fields to fill gaps where no donor basins are available in (geographically) surrounding regions (e.g., over the majority of Eurasia, Africa, South America, and South Europe).Following a similar concept of that of Beck et al. (2016), the HRU parameterization method proposed by Bock et al. (2016) for the CONUS will likely lead to discontinuous parameter fields because the calibration regions are fixed to one hundred.The authors did not report the parameter fields obtained using this method.Consequently, it is very likely that LSM/HMs using this type of regionalized parameters would not exhibit realistic spatial patterns.Unfortunately, this hypothesis cannot be properly tested because of the lack of studies revealing the spatial patterns of regionalized parameters obtained using these regionalization techniques.
Many attempts have been made in the land surface modeling community to address Dooge's challenges, especially with respect to the transferability of model parameters across locations and scales.One of the earliest prominent experiments was conducted in the Project for Intercomparison of Land-surface Parameterizations (PILPS) (Wood et al., 1998).In this project, calibrated LSM parameters were transferred from small catchments to their nearest computational grid cells.The results indicated that LSMs exhibited poor transferability across space, leading to significant differences in the partitioning of water and energy fluxes.For instance, Troy et al. (2008) used calibrated VIC model parameters from small basins to generate parameter fields for continental-scale land surface modeling by "linearly interpolating to fill in those grid cell not calibrated" on a sparse grid.As noted by Samaniego et al. (2010b), this type of regionalization is not adequate because of the nonlinearity of soil and geological formations.The spatial patterns of model parameters that would be obtained by ad-hoc extrapolations based on calibrated parameters at small basins or grid cells would most likely lead to unrealistic parameter fields with significant discontinuities, as in those shown by the VIC parameters obtained in a CONUS climate change projection assessment (Wood and Mizukami, 2014;Mizukami et al., 2017).
Recent community-driven efforts, such as the Protocol for the Analysis of Land Surface Models (PALS) Land Surface Model Benchmarking Evaluation Project (PLUMBER) (Haughton et al., 2016), indicate that the hurdles noted in PILPS have not been overcome.Thus, it is possible to postulate that the poor predictability of many LSMs evaluated with empirical benchmarks in the PLUMBER project (e.g., CABLE, CHTESSEL, JULES, Noah) may be the result of poor parameterizations, among other factors.
Soil porosity is one of the most common parameters in many LSM/HMs.This parameter controls the dynamic of several state variables and fluxes such as soil moisture, latent heat, and soil temperature, and its sensitivity has been demonstrated in various studies (Goehler et al., 2013;Cuntz et al., 2015;Mendoza et al., 2015;Cuntz et al., 2016).A representation of the porosity of the top 2 m soil column in these models over the Pan-EU is shown in Figure 1.The Pan-EU domain was selected for depiction, but we note that the problem is general and persistent across other domains (Mizukami et al., 2017).For cases in which an HM does not use this parameter, the "available water capacity" (WaterGAP) or the "field capacity"(HBV) were selected as a surrogate due to their similarity with porosity.Both surrogate fields are normalized (in space) to ease their comparison with the porosity fields.Soil porosity is expressed in m 3 m −3 and can be compared among different models.
The following lessons can be learned from Figure 1: (1) There is a serious deficiency in the parameterization of this key physical parameter because none of the analyzed models have comparable spatial patterns or comparable estimates at a given location.Thus, its uncertainty is very large.It should be noted that the definition of this parameter is rather simple: it represents the ratio of the volume of voids to the total volume in the soil column.We can now wonder how large the uncertainty of Hydrol.Earth Syst.Sci. Discuss., doi:10.5194/hess-2017-89, 2017 Manuscript under review for journal Hydrol.Earth Syst.Sci. Discussion started: 6 March 2017 c Author(s) 2017.CC-BY 3.0 License.other complex parameters whose relationship with soil properties is very nonlinear would be (e.g., hydraulic conductivity).( 2) The degree of seamlessness strongly depends on the level of aggregation and the upscaling of underlying soil texture fields.
(3) A parameter field becomes highly discontinuous and patchy when, for a given model, the parameter is calibrated in a limited domain (or basins) and then extrapolated to other regions (e.g., as shown in panel (k) of Figure 1, corresponding to the HBV model).( 4) These experimental results confirm the postulation of Dooge (1982) that the parameterization of the existing state-of-the-art LSM/HMs at large and continental scales is an unsolved problem.
These facts allow us to postulate the following questions: (1) Why there are such large differences between models in estimating a physically interpretable parameter?(2) What are the consequences of model calibration on parameter fields?(3) Are current model parameterizations scale invariant?(4) Do the fluxes estimated with these models at various scales satisfy the fundamental mass conservation criterion (hereafter denoted the flux-matching test)?( 5) What are the consequences of poor parameterizations on the spatio-temporal dynamics of state variables and fluxes?
3 The MPR framework

The flux-matching postulation
The key postulation aiming at obtaining scalable (global) parameters that are transferable across locations and scales was proposed by Samaniego et al. (2010b) and further tested in Kumar et al. (2013b, a) and Rakovec et al. (2016b).We hypothesize that Flux matching across scales leads to quasi scale-invariant global parameters γ, thus: Here, k denotes the sub-grid elements constituting a given modeling cell i with area a k .i denotes a modeling grid cell i with area a i .W i and w k denote fluxes at two modeling scales 1 and 1 , respectively, with 1 1 = ai a k .Ω denotes the modeling domain, e.g., a river basin, and t a point in time.It should be noted that the topology of the cells at either level is not specified.
Normally, rectangular grid cells are used for convenience, but this is not a necessary condition.This stronger flux-matching condition can be used as a penalty function or as an additional test to discriminate parameter sets obtained with conventional parameter estimation approaches.

Regionalization and upscaling
Multiscale parameter regionalization (MPR), proposed by Samaniego et al. (2010b), aims to estimate model parameters that are seamless across scales, satisfy the flux-matching conditions, and enable the transferability of global or transfer-function parameters across scales and locations (Samaniego et al., 2010a, b;Kumar et al., 2013a;Wöhling et al., 2013;Livneh et al., 2015;Rakovec et al., 2016a).The development of MPR is ongoing work.Regionalization functions used in MPR for the mHM Currently, MPR is the only method that consistently and simultaneously addresses the scale, nonlinearity and over-parameterization issues (Beven, 1995).The MPR approach also addresses the principle of scale-dependent subgrid parameterization (i.e., "net fluxes must satisfy the conservation of mass") proposed by Beven (1995) but does not adhere to Beven's other principles, such as that sub-grid parameterizations may be data-and scale-dependent (principle 3 and 4), because exhaustive tests reported in the above mentioned references carried out over hundreds of river basins do not appear to support them.We find MPR to be a robust technique that has the ability to provide "effective parameters" and is capable of addressing the scaling problem; in this sense, it diverges from the Beven's view (Beven, 1995, p.507) that these "effective parameters" are an "inadequate approach to the scale problem".Furthermore, MPR differs on the regionalization and aggregation scheme (i.e., patch model areal weighting) proposed by Beven (1995, p.520).
The scaling problem in MPR is approached by addressing the existence of process specific representative elementary areas (REA) that determine the minimum computational grid size 1 at which the continuum assumptions can be used without explicit knowledge of the actual patterns of the topography, soil, or rainfall fields (Wood et al., 1988).The REA of a specific process, such as streamflow, can be determined by conducting a careful sensitivity analysis as shown by Samaniego et al. (2010b).To estimate an "effective" model parameter (e.g., total soil porosity) at the selected modeling scale, it is first necessary to estimate its variability at a much finer scale 0 1 so that the effects of its spatial heterogeneity can be adequately represented.In other words, the parameter at the fine scale 0 represents the minimum support at which the proposed equations are still valid.Barrios and Francés (2011) indicated that a suitable estimate of 0 for a given parameter could be near its correlation length.
The sub-grid variability of a parameter β 0 depends, in turn, on the spatial heterogeneity of geo-and bio-physical characteristics (u 0 ), such as terrain elevation, slope and aspect, soil texture, geological formation, and land cover, which are now available at hyper-resolution for the entire globe.The mathematical relationships that link model parameters with these characteristics at the finer resolution are called pedo-transfer, regionalization or regularization functions f (Clapp and Hornberger, 1978;Cosby et al., 1984;Wösten et al., 2001).The constants required in these functions are usually denoted as global parameters γ, thus β 0 = f (u 0 , γ).Note that the fields β 0 and u 0 are dependent on space and time, but the vector γ is not.
Regularization functions are commonly used in mathematics and statistics to solve ill-posed problems (which is the case when the parameters of a distributed LSM/HM are determined by calibration) and/or to prevent over-fitting.The direct consequence of the regularization is the substantial decrease in degrees of freedom of the optimization problem because the cardinality of the gridded parameter fields #β 0 is orders of magnitude larger than that of the vector of the global parameters #γ.Hence, MPR is a parsimonious (not over-parameterized) parameterization technique that offers spatially continuous model parameter fields and removes spatial discontinuities, as observed by Gotzinger and Bárdossy (2007) and discussed by Mizukami et al. (2017).From the Bayesian point of view, the regularization functions impose a prior distribution on the model parameters.Consequently, greater care should be taken in their selection.
The second step of the MPR approach consists of upscaling the sub-grid distribution of every regionalized parameter to the modeling scale.In other words, β 1 = β 0 .Here, the symbol • represents an averaging or scaling operator that is parameter- A schematic representation of the MPR procedure can be seen in Figure 2. In short, the motto of MPR is "estimate first, then average" whereas other existing regionalization methods follow the opposite approach of "average first, then estimate." Because the processes in LSM/HMs are highly nonlinear, this sequence of operations does not commute.The consequences can be dramatic (see Figure 7).The latter, which is the standard approach, does not preserve fluxes/states across scales, whereas MPR does to a great extent.The key question here is in finding the right scaling rule for the model parameters such that the fluxes/states are preserved.
Model parameters at the 1 scale (i.e., 1 km to 100 km) are called "effective" parameters because they cannot be measured by physical means at this resolution and can only be inferred by heuristic relationships f (•).Thus, it is essential that the inequality 0 1 is fulfilled so that the law of large numbers leads to stable estimates of the effective parameter β 1 having low uncertainty.Since every LSM/HM (e.g., those mentioned in Section 2.2) contains "effective" model parameters, depending on heuristic relationships (that are hidden in the source code in many cases (Mendoza et al., 2015;Cuntz et al., 2016)), it is logical that existing LSM/HMs are subject to parameter uncertainty.These models can be treated as stochastic models, even though their governing equations are deterministic in nature and based on physical principles such as the conservation of mass and energy (Clark et al., 2015;Nearing et al., 2016).Effective parameters should not be the pure result of a blind calibration algorithm.MPR varies from other regionalization approaches in that the introduced relationships lead to realistic parameter fields.
The selection of regionalization functions and scaling operators is fundamental to ensure the transferability of global parameters across scales and to guarantee the seamlessness of parameter fields across scales, e.g., from 1 to 2 1 ... Samaniego et al. (2010b) proposed that the key to determining them is the flux-matching condition mentioned above.A seamless parameter field β 1 can be interpreted as the corollary of the flux-matching condition.Moreover, MPR employs geophysical properties at 0 that allow for a representative sample at the hyper-resolution promoted by Wood et al. (2011) and Bierkens et al. (2014).

Protocol for evaluation of model parameterization
The development of LSM/HMs should be guided by a strict hypothesis driven framework (Nearing et al., 2016) that aims at finding parsimonious and robust parameter sets that fulfill the flux-matching condition and a number of efficiency metrics that are not used during the parameter estimation phase.A multivariate, multiscale evaluation assessing the reliability of model simulations should follow the scheme presented in Rakovec et al. (2016a).Based on our experience, we suggest the following procedure to obtain a robust and seamless parameterization.
1. Retrofit the source code of an LSM/HM so that all model parameters are exposed to analysis algorithms.Parameters are the values of a model that can be considered random variables, i.e., those that are subject to various outcomes and can be fully defined by a probability density function.Parameters should not be confused with numerical or physical constants.(Morris, 1991), its enhanced version such as that proposed by Cuntz et al. (2015), or the distributed evaluation of local sensitivity analysis (DELSA, Rakovec et al., 2014;Mendoza et al., 2015) are of particular interest because use of the popular standard Sobol' method (Sobol', 2001) can be computationally expensive although still possible (Cuntz et al., 2016).
3. Regionalize sensitive model parameters the exhibit marked spatial variabilities.The selection of the regionalization function f (•) can be guided by existing literature or by step-wise methods (e.g., Samaniego and Bárdossy, 2005).This regularization step should be conducted at the highest available spatial resolution for all predictor fields.This resolution is denoted as level 0 .The output of the regularization is the parameter field β 0 .
4. Estimate effective parameter fields β 1 using upscaling operators based on the underlying sub-grid variability β 0 .The scale 1 is determined by synthetic experiments aimed at finding the optimal REA for processes related to the parameter in question (Samaniego et al., 2010b;Kumar et al., 2013b).
5. Estimate the global-parameters γ using standard optimization algorithms (simulated annealing, shuffled complex evolution (SCE), dynamically dimensioned search (DDS)) by minimizing a compromise metric that includes observations at multiple scales and locations (Duckstein and Opricovic, 1980;Rakovec et al., 2016a).The compromise metric could also include hydrologic signatures to extract as much information from a time series as possible (Nijzink et al., 2016).
6. Perform multi-basin, multi-scale, multi-variate cross-validation tests to evaluate the robustness of the regionalization functions, scaling operators, and global parameters (Rakovec et al., 2016a).
7. If the cross-validation tests provide satisfactory results (e.g., Kling-Gupta efficiency (KGE) of the compromise solution > 0.6), then evaluate the flux-matching condition given by eq. 1.If the total error is too large to be tolerated, repeat steps 3 to 8. 8. Evaluate the parameter seamlessness and the preservation of the statistical moments of fluxes and states across scales.
It should be noted that any of the steps above can be tested within a sequential hypothesis testing framework (Clark et al., 2016).A substantial difference from a standard model optimization exercise is that the transfer function f (•) (step 3) and the upscaling operator (step 4) can also be modified in the modeling protocol.
Failure to satisfy the imposed condition, such as the flux-matching test, after exhaustively testing the options in steps 3 to 6 may indicate deficits in process understanding and/or poor data.Consequently, the evaluation step should also provide guidance on detecting and separating the errors stemming from process conceptualization (modeling) and input data.A graphical depiction of the estimation procedure at multiple scales is shown in Figure 2.

Seamless parameter fields across multiple scales using MPR
In Section 3.2, it was postulated that the MPR technique aims at estimating seamless parameter fields across scales which minimize the occurrence of discontinuities and ease the transferability of model parameters across scales and locations.The latter has been tested and reported in many studies in Europe, USA, and other basins worldwide (Samaniego et al., 2011;Kumar et al., 2013b, a;Rakovec et al., 2016b, a).In this study, we provide evidence in favor of the former postulation.
To achieve this goal, the mHM model is parameterized using the multiscale parameter regionalization technique (MPR) (Samaniego et al., 2010b) with hyper-resolution fields of geophysical characteristics at 0 = 500 m resolution as input.Among them, the land cover data was obtained from the Corine data sets (land.copernicus.eu/pan-european/corine-land-cover),and the soil texture information was derived from SoilGrids (soilgrids.org).These very detailed and homogenized soil texture fields provide the fractions of clay and sand, mineral bulk density, and fraction of organic matter for six soil horizons up to 2 m deep.A hyper-resolution digital elevation model (DEM) over Europe (approximately 30 m) from the GMES RDA project (EU-DEM; www.eea.europa.eu/data-and-maps/data/eu-dem) was used to derive terrain characteristics such as slope, aspect, flow direction.The underlying hydro-geological characteristics are based on the International Hydrogeological Map of Europe (IHME; www.bgr.bund.de/ihme1500),available at a 1:1 500 000 scale.Details on the pedo-transfer function used for these simulations can be found in Livneh et al. (2015).mHM global parameters were obtained by closing the water balance over selected river basins in Europe (Rakovec et al., 2016a).
Based on these settings, which constitute the basis for the EDgE project, we estimated porosity fields at three modeling resolutions of 1 = 5, 10, and 25 km, based on the same 0 support information.Following the MPR procedure depicted in Figure 2, the parameter fields for the mHM model at these three resolutions can be estimated.Results are shown in Figure 3.
The results illustrate that the MPR approach can preserve the spatial pattern of the porosity fields (see panels (a), (b) and (c) in Figure 3) and the first and second moments of its probability density function shown in panels (e)-(g).Two-sample Kolmogorov-Smirnov tests indicate that there is insufficient evidence to reject the null hypothesis that any of the three possible pairs of empirical distributions were drawn from the same unknown distribution.Due to the similarity of the empirical distribution functions (EDF) of the porosity fields, the mean value estimated at resolutions of 5, 10, and 25 km is approximately 0.42 m 3 m −3 in all three cases.
The recently derived soil porosity field at 15 arcmin (≈25 km) over the Pan-EU (Montzka et al., 2017), hereafter denoted as Miller-Miller Scaling (MMS), is included in Figure 3d to visualize the effects of the pedo-transfer functions and the scaling operators.It should be noted, however, that both approaches (i.e., MPR and MMS) use the same input SoilGrids data set (Hengl et al., 2017) at 0 = 1 km.The main differences between both approaches stem from the type of regionalization functions and upscaling operators used, and the procedure to estimate the PTF parameters.It should be noted that the flux-matching test was only verified in MPR.To generate the MMS dataset, Montzka et al. (2017) used the ROSETTA-based pedo-transfer-function (PTF) proposed by Schaap et al. (2001) and the upscaling method described by Miller and Miller (1956).MPR, on the other hand, employs the PTF proposed by Zacharias and Wessolek (2007) and the harmonic upscaling operator proposed by Zhu and Mohanty (2002) (see Livneh et al. (2015) for more details).Both methods lead to seamless fields because they follow the principle "first estimate then average" which considers the sub-grid variability of the soil porosity.The differences between the MPR and MMS derived porosity fields (panels (c) and (d) of Figure 3), however, are striking.In general, MMS tends to have lower porosity values than MPR.The spatial mean of MMS is 5% lower than MPR but its standard deviation is in the same range as that of MPR at 10 km resolution.In particular, the porosity obtained by MMS over the northeastern part of Germany, eastern Europe, and Scandinavia are underestimated compared with MPR.
The main advantage of MPR over MMS is that the former has the ability, if used in a LSM/HM, to preserve the mass balance of water fluxes (e.g., evapotranspiration) and states across scales, such that the resulting fields have similar spatial patterns and moments across scales as shown in Figure 3(a-c,e-g).The method of Montzka et al. (2017), has yet to be tested for these fundamental scaling properties when applied to LSMs/HMs.Nevertheless, this experiment allows to conclude that the PTFs and the upscaling operators should not be applied independently from a LSM/HM because the resulting effective parameters would have large biases that would likely lead to large predictive uncertainties.
4 Experiments that reveal inadequate parameterizations

Visualization of artifacts in state and flux fields
As noted in the introduction, on-site (basin-specific) parameter estimation based on HRU or similar techniques (based on clustering grid cells or sub-basins into regions that exhibit quasi-similar hydrological behavior) leads to non-seamless parameter fields such as those reported in Merz and Blöschl (2004).Here, we go one step further to show the consequences of this common practice on state variables such as soil moisture.Our postulation is that an on-site calibration of global-parameters γ leads to biased state variables even with sophisticated regularization techniques such as MPR.To falsify this postulation, we performed two model simulations denoted "on-site" and "multi-site" calibration schemes.In both cases, we used the mHM setup described in Rakovec et al. (2016b) over the Pan-EU domain at a 0.25 • resolution.
In the first experiment, we performed on-site calibrations at 400 river basins in the Pan-European domain.Subsequently, the respective optimized parameter sets are used in each corresponding basin to generate the target variable, in this case, the daily soil moisture of the top 1 m soil column.Lastly, daily soil moisture fields were assembled using the independent basin The first experiment shows clear evidence of strong spatial discontinuity in the soil moisture fields that is easily identifiable because the shapes of the constituent river basins (Figure 4a) are apparent.Another interesting feature is a strong wet bias in a basin located in center of the Iberian Peninsula compared to its neighboring regions.Wet soils during this period are very unlikely because the entire region was enduring a prolonged and extreme drought.Moderate dry-bias is apparent in basins in southwest Germany, and a strong dry-bias was detected in basins in west Croatia, south Lithuania, south Hungary and north Bosnia and Herzegovina.Conversely, the soil moisture field obtained with the multi-basin parameter estimation does not Hydrol.Earth Syst.Sci. Discuss., doi:10.5194/hess-2017-89, 2017 Manuscript under review for journal Hydrol.Earth Syst.Sci. Discussion started: 6 March 2017 c Author(s) 2017.CC-BY 3.0 License.exhibit these nuisances and thus can be regarded as a spatially seamless field.In this case, parameter estimation with a large sample of geophysical characteristics and many streamflow times series to estimate efficiency measures leads to a well-posed parameter estimation problem.
Based on these results, it can be concluded that parameter sets obtained using the on-site parameter estimation technique does not lead to seamless parameter fields or state variables.Moreover, automatic optimization algorithms, such as SCE or DDS, tend to over-learn from time series with large observational errors, which in turn leads to poor identifiability of parameters (Brynjarsdottir and O'Hagan, 2014) and biased simulations, as demonstrated above.Consequently, parameter estimation should be performed with a representative sample of basins that adequately cover the variability of hydrological regimes and geophysical properties (e.g., soil types) (Kumar et al., 2015).It is worth noting that if the parameters of a model are estimated in a small basin with very few soil types, a single geological formation, or very flat terrain, then it is very likely that the estimated parameters are biased to a specific basin setting and are not useful for seamless and continental scale simulations.

Detecting effects of ad hoc regionalization techniques
The effects of the commonly used parameterization techniques used to generate the porosity fields of LSMs (such as CHTES-SEL and Noah-MP depicted in Figure 1) are important to investigate.These fields were obtained by combining the majority (or dominant) upscaling operator and a look-up table containing categorical values of model parameters tabulated for a limited set of dominant soil types ( e.g., Niu (2011, p.20.), ECMWF (2016, p.137)).The majority upscaling is represented as β 1,i = β 0,k , where β 1,i denotes the upscaled parameter at grid cell i and β 0,k is the parameter value corresponding to the most common soil type class k in level-0 within i.The soil types are given in a deterministic look-up table with m classes, and k ∈ m.The majority-based operator is used for estimating grid-specific vegetation classes in LSMs (Li et al., 2013).
The porosity field, based on a majority upscaling for the Noah-MP model used in EURO-CORDEX (www.euro-cordex.net)at an approximately 12-km resolution, is depicted in Figure 1g.Compared with the other model derived porosity fields, the Noah-MP field appears to be most homogeneously distributed in space, and it is very likely that the spatial heterogeneity is under-represented in this case.It should be noted that a model such as CABLE that uses a porosity field with an approximately 100-km resolution has a larger variability than that of Noah-MP at 12 km.However, one can argue that the underlying soil texture map, and not the upscaling method, might cause the lack of variability To test this hypothesis, the highest resolution soil map available for Europe was used and applied in the same manner to derive porosity fields as described above.The texture field is provided by the SoilGrids dataset (soilgrids.org) at 1000-m resolution (level-0).The upscaled porosity field was generated at 5 km for the EDgE project (edge.climate.copernicus.eu).The soil characteristics for Noah-MP were estimated using the same look-up table as in the EURO-CORDEX-Noah-MP case.The Other upscaling operators, such as the weighted arithmetic mean, are commonly used in LSMs in combination with the mosaic approach.For example, in CLM (Oleson et al., 2013, see p. 160) the texture class of the sub-units of the cell, called tiles, are provided in a look-up table.The upscaled porosity field obtained using this approach is shown in Figure 1b at a 1 • (100 km) resolution.Methods based on the majority and weighted arithmetic mean operators exhibit some similarity and lack spatial variability.In both cases, the spatial mean is approximately 0.43 m 3 m −3 .Hydrologic models that do not use soil porosity tend to use a similar conceptualization and values denoted as the total available water capacity (TAWC, WaterGAP versions 2 and 3) and field capacity (FC, HBV).For these type of conceptual models, normalized values of these parameters are used as surrogates for soil porosity.The consistency of the spatial patterns of TAWC and FC are compared here instead of their actual values.A distinctive difference in the patterns can be observed.For example, WaterGAP3 exhibits lower values than WaterGAP2, whereas the pattern of the normalized FC in HBV is the opposite in many locations (e.g., Spain, Germany, Scandinavia).Details of the parameterization schemes used to estimate TAWC and FC are beyond the scope of this study.Interested readers may refer to Müller Schmied et al. (2014) or Beck et al. (2016), respectively.However, the TAWC in WaterGAP is obtained by linking the soil type provided by the FAO soil map with available water capacity values estimated by Batjes (1996).Thus, no scaling rule or form of regularization is used in this case.The field capacity parameters used in HBV were determined using an ad hoc nearest-neighbor interpolation technique that relies on calibrated parameters from nearby similar donor basins.The parameter fields obtained for two versions of WaterGAP (30 arcmin, 5 arcmin) and HBV are depicted in Figure 1, panels i, j, and k, respectively.It can be concluded that the procedure employed is not scale invariant by comparing the parameter sets from both WaterGAP model versions, which were operated at different resolutions.The regionalization proposed by Beck et al. (2016) leads to a patch-quilt field that does not resemble to any other field presented.The HBV field lacks seamlessness and becomes conclusive evidence that calibrated model parameters cannot be transferred to other locations or interpolated in space by heuristic algorithms.

Changes in the dynamics of water fluxes and states
There is a complex interplay between soil moisture (SM) and latent heat (LH) in LSM/HMs.Improving our understanding of soil-land-atmosphere feedback is fundamental to making reliable predictions of water and energy fluxes on land systems.
In this context, it can be hypothesized that the parameterization of soil related parameters (e.g., soil porosity) has significant effects on related state variables and fluxes.To falsify this hypothesis, two contrasting modeling paradigms (Noah-MP and mHM) were employed.
The WRF/Noah-MP system is forced with ERA-interim at the boundaries of the rotated CORDEX-Grid (www.meteo.unican.es/wiki/cordexwrf) at a spatial resolution of 0.11 • covering Europe from 1989 to 2009.To ease comparison, the process-based hydrological model mHM (www.ufz.de/mhm) was driven with daily precipitation and temperature fields generated by the WRF/Noah-MP system during the same period.The spatial resolution of mHM was fixed at 5×5 km 2 .The main geophysical characteristics in WRF/Noah-MP of land cover and soil texture are represented with a 1×1 km 2 MODIS and a single-horizon, The settings of the mHM model used in this experiment are described in Section 3.4.In contrast with those of Noah-MP, the global parameters of mHM estimated using the MPR technique are obtained by closing the water balance over selected river basins in Europe (Rakovec et al., 2016a).The porosity fields obtained for mHM over the Pan-EU is depicted in Figure 1f.
Notably, although the overall mean of the porosity estimated using MPR is only 2.3% lower than that calculated using the majority-based approach in Noah-MP (Figure 5), the spatial patterns obtained by both models are very different.The evidence of this remarkable dissimilarity can also be visualized by comparing the empirical density functions shown in Figures 3d and   5c, both corresponding to a field at 1 = 5 km and with the same input data.
In general, it can be concluded that the porosity field estimated by Noah-MP tends to have lower water holding capacity values than that of the mHM.Deviations of up to -6.8% were detected in Germany, where a detailed evaluation was conducted by Samaniego et al. (2012).Within the same domain, the porosity values in Noah-MP could be up to 15% less than those estimated by mHM.However, at very few locations the opposite can happen, reaching values of 40% over-estimation.
The phase diagrams of the monthly fraction of soil water saturation fSM = θ θs (i.e., plots of monthly fSM(t) vs. fSM(t + 1)) are subsequently estimated to investigate how the differences in porosity estimates of the top 2 m soil column affect the soil moisture dynamics (Figure 6).Two locations in Germany are selected in which Noah-MP significantly over-or underestimated the latent heat fluxes with respect to mHM (the latitude and longitude coordinates of the center of the selected Noah-MP grids are (54 • N ,10 • E) and ( 51• N ,7 • E), respectively).This experiment unambiguously shows that, at locations where Noah-MP over-estimates latent heat with respect to mHM, the dynamic of the monthly SM time series is enhanced, leading to much lower soil moisture values (extreme droughts) (Figure 6a).Conversely, underestimation of latent heat greatly constrains the soil moisture dynamics (Figure 6b).

Flux matching test
In Section 2.1, it was postulated that ad hoc parameterization schemes do not fulfill the flux-matching test performed with a flux simulated by a given model at two modeling resolutions (e.g., 1 = 5, 30 arcmin).A detailed description of how to perform this test is provided in Samaniego et al. (2010b).The following experiment was conducted with three models: mHM, PCR-GLOBWB, and WaterGAP at the resolutions above to try to falsify this strong postulation.All models use the same forcings and geophysical information.The experiment was conducted in the Rhine River upstream of the Lobith gauging station.All three models are driven by daily forcing with a spatial resolution of 5 km, which was kindly provided by the EFAS team at JRC (www.eea.europa.eu).Additional details of the modeling settings of this experiment are provided in Sutanudjaja et al. (2015) and www.hyperhydro.org/.The KGE and bias of these three models obtained for both scales at the Lobith station during 2003 are reported in Table 2.The streamflow during this year was selected for evaluation because it exhibited strong temporal dynamics, with wet conditions in the beginning of the year followed by a drought during the summer.These efficiency metrics indicate that mHM is the only model that can have higher KGE efficiencies regardless of the scale.7.These results reveal that mHM is the only model able to fulfill the flux-matching test, as apparently in this figure.This experiment also shows that the MPR technique implemented in mHM leads to ET fields that satisfy mass conservation across scales and exhibit seamlessness that becomes apparent at higher resolutions (see Figure 7a).
The PCR-GLOBWB and WaterGAP models reveal large inconsistencies in annual ET, although the streamflow performance at the outlet is good (greater than 0.83 in both cases).These results also confirm the postulation that "streamflow-related metrics are a necessary but not sufficient condition to warrant the proper partitioning of incoming precipitation P into various spatially distributed water storage components (e.g., SM) and fluxes (e.g., ET)" (Rakovec et al., 2016b).Because all models are forced with exactly the same forcings, share the same geophysical information and have almost similar hydrological process descriptions, it can be safely concluded that the parameterization method used in the models should cause the ET mismatch.
To falsify this postulation, the MPR parameterization protocol proposed in Section 3.3 was applied to PCR-GLOBWB.

Implementation of the parameterization protocol in PCR-GLOBWB
To evaluate the consistency of land surface fluxes before and after MPR implementation, we performed a sensitivity analysis to study the impact of MPR on evaporative fluxes and soil moisture content in PCR-GLOBWB (van Beek et al., 2011;Wada and Bierkens, 2014;Sutanudjaja et al., 2016) over the Rhine River basin during 2003.The model was used to simulate the hydrological states at two different spatial resolutions ( 1 =5, 30 arcmin), and the sensitivity to MPR implementation was evaluated using a field difference method (in line with eq.1): where W c and w f are the coarse (c) and fine (f ) resolution simulations of variable W , respectively, and T is the total time series length.
The original PCR-GLOBWB parameterization does not include consistency in upscaling as enforced by MPR, leading to a larger difference in soil properties.These differences in soil hydraulic properties influence the derived hydrological properties, leading to changes in saturated conductivity and storage capacity in the unsaturated zone.The considerable differences in ET fluxes are shown in panels (c) and (d) of Figure 7 and are the result of these changes.
When MPR is employed, we see that the actual average Rhine basin evapotranspiration ∆ drops from 29% to 9.4% (Figure8).
For the total column soil moisture, we find a stronger decrease in ∆ from 25% to 6.9%, clearly indicating the benefits of MPR implementation.We also observe an increase in the discharge KGE performance compared with observations at Lobith.The original KGEs were 0.86 ( 1 =5 arcmin) and 0.93 ( 1 =30 arcmin), whereas the KGEs with MPR implementation are 0.91 and 0.93, respectively.Another advantage is that PCR-GLOBWB is calibrated at a coarser resolution, whereas this model was calibrated for each spatial resolution individually in the original set-up and with lower consistency in the discharge simulation.From these evaluations, we conclude that MPR implementation leads to significant improvement in the flux matching and discharge simulations across scales, allowing for more consistency across scales for hydrological model simulations.Notably, additional parameters in PCR-GLOBWB still need to be regionalized within the MPR framework, which could potentially lead to better performance and transferability.

Conclusions
Hyper-resolution modeling initiatives (Wood et al., 2011;Bierkens et al., 2014) challenge the hydrological community to intensify efforts to make water (quantity and quality) and energy flux predictions "everywhere" and for these predictions to be "locally relevant."The predictions should have small uncertainties to be useful for the end-users.These grand challenges also imply that the next generation of land surface and hydrologic models must incorporate probabilistic descriptions of the sub-grid variability of geophysical land surface properties -such as POLARIS (Chaney et al., 2016b) and SoilGrids (Hengl et al., 2017) -to cope with the large uncertainties that characterize the related process below the REA scale.Consequently, great efforts should be made in hyper-resolution monitoring at the global scale, in improving the computational efficiency of LSM/HMs, and in the development of scale-invariant parameterizations for these models.In this study, we have shown that the state-of-the-art is totally inadequate to address this grand challenge, especially scale-invariant parameterization.
We presented and tested a technique called multiscale parameter regionalization (Samaniego et al., 2010b), currently available only in mHM but recently implemented in PCR-GLOBWB and VIC (Mizukami et al., 2017).Moreover, we proposed a Parameterization Protocol as a guideline to apply MPR and to retrofit existing LSM/HMs to make them sufficient for the task of addressing these grand challenges.
This study has shown that two models that use ad-hoc parameterizations can have reasonable efficiency with respect to simulated streamflow but poor performance with respect to distributed fluxes such as evapotranspiration.The implementation of this protocol in PCR-GLOBWB in this study increased the model efficiency by almost 6% and improved the consistency of simulated ET fields across scales.We have shown that the PCR-GLOBWB global parameters can be transferred across scales with a consistent ET patterns and model efficiency.
In general, it can be concluded that the estimation of global parameters is feasible with MPR and that these scalars are transferable across scales and locations.The successful application of MPR implies that the averaging procedure of geophysical properties matters and that having the right physics with incorrect "effective" parameters leads to incorrect fluxes, states and feedbacks in the soil-vegetation-atmosphere continuum.Consequently, MPR is a step forward to quasi scale-invariant parameterizations and is feasible to implement in existing LSM/HMs whose goal should be seamless parameter fields across scales.
We consider that this feature is the key for the next generation of LSM and NWP models such as the "model for prediction across scales" (MPAS) (www.mmm.ucar.edu)and the "nested-domain ICOS" (www.earthsystemcog.org/projects/dcmip-2012/icon-mpi-dwd).Furthermore, a proper implementation of MPR in process based (conceptual) models may contribute to recent efforts towards identifying their "effective" parameters through observational datasets at the scale of interest (Savenije and Hrachowitz, 2017).
Hydrol.Earth Syst.Sci.Discuss., doi:10.5194/hess-2017-89,2017   Manuscript under review for journal Hydrol.Earth Syst.Sci. Discussion started: 6 March 2017 c Author(s) 2017.CC-BY 3.0 License."What modeling experiments need to be performed to resolve the scale question and what is the trade-off among model complexity, the physical basis for land parameterizations and observational data for estimating model parameters?" 2 Parameterization of hydrologic and land surface models 2.1 Representing spatial heterogeneity in HMs/LSMs: a brief review Hydrol.Earth Syst.Sci.Discuss., doi:10.5194/hess-2017-89,2017   Manuscript under review for journal Hydrol.Earth Syst.Sci. Discussion started: 6 March 2017 c Author(s) 2017.CC-BY 3.0 License.model (www.ufz.de/mhm)bySamaniego et al. (2010a) were further improved byKumar et al. (2013b) and more recently, a model-agnostic implementation of MPR has been proposed byMizukami et al. (2017).
Hydrol.Earth Syst.Sci.Discuss., doi:10.5194/hess-2017-89,2017   Manuscript under review for journal Hydrol.Earth Syst.Sci. Discussion started: 6 March 2017 c Author(s) 2017.CC-BY 3.0 License.specific, and thus β 1 denotes the upscaled effective parameter field.It is important to note that this scaling operator is not necessarily the arithmetic mean.
simulations for the entire Pan-EU domain.The results of this experiment are shown in panel (a) of Figure 4 for a day in August 2005.In the second experiment, the global-parameters γ are estimated simultaneously for a set of 13 basins covering various hydro-climatic regimes in the Pan-EU domain.The corresponding soil moisture field for the same point in time is depicted in panel (b) of Figure 4.
comparison of both parameter fields (i.e., EDgE-Noah-MP and EURO-CORDEX-Noah-MP) and the main statistical moments describing the spatial variability of the porosity fields are shown in Figure 5.The results clearly indicate the inappropriateness of the majority-based upscaling operator for this parameter in both cases.It led to reduction of the variance of the porosity field and to the least sensitive operator at the resolution of the level-0 data.This means that the informational content of the hyper-resolution soil maps, commonly available globally, is almost lost.Hydrol.Earth Syst.Sci.Discuss., doi:10.5194/hess-2017-89,2017 Manuscript under review for journal Hydrol.Earth Syst.Sci. Discussion started: 6 March 2017 c Author(s) 2017.CC-BY 3.0 License.
Hydrol.Earth Syst.Sci.Discuss., doi:10.5194/hess-2017-89,2017   Manuscript under review for journal Hydrol.Earth Syst.Sci. Discussion started: 6 March 2017 c Author(s) 2017.CC-BY 3.0 License.coarse-resolution FAO soil map with 16 soil texture classes, respectively.The porosity field of Noah-MP is estimated by applying a majority-based operator to values for different soil classes, as shown in Figure 1g.
Hydrol.Earth Syst.Sci.Discuss., doi:10.5194/hess-2017-89,2017   Manuscript under review for journal Hydrol.Earth Syst.Sci. Discussion started: 6 March 2017 c Author(s) 2017.CC-BY 3.0 License.The flux-matching test presented in Section 3.1 was performed with simulated evapotranspiration (ET) because it is the largest flux in the water cycle besides precipitation and is prone to the largest predictive uncertainties.The results of this test are shown in Figure Hydrol.EarthSyst.Sci.Discuss., doi:10.5194/hess-2017-89,2017   Manuscript under review for journal Hydrol.Earth Syst.Sci. Discussion started: 6 March 2017 c Author(s) 2017.CC-BY 3.0 License.

Figure 6 .Figure 7 .Figure 8 .
Figure 6.Phase diagrams of monthly soil moisture fraction for two locations in Germany, (a) 54 • N ,10 • E and (b) 51 • N ,7 • E, in which the latent heat estimated by Noah-MP is over-or under-estimated with respect to corresponding estimates of mHM.The models have identical forcings.