Estimating catchment scale groundwater dynamics from 1 recession analysis – enhanced constraining of hydrological 2 models

Response to Referee #2 on the review of “Estimating catchment scale groundwater 1 dynamics from recession analysisenhanced constraining of hydrological models”, by T. 2 Skaugen and Z. Mengistu. 3 4 General comments: 5 The paper is much improved (compared to the previous version which I reviewed) in terms of 6 readability, and justification of the parameter estimation method. My remaining concern is the 7 justification for the chain of manipulations made in section 2.3. The assumptions made are plausible and 8 pragmatic, e.g., 9 “the temporal distribution of catchment scale storage can be considered as a scaled version to that of 10 the recession characteristic” 11 “the mean value of the sampled ΛΛ′s, Λ, represents the slope of recession in a state of mean storage in 12 the catchment,” 13 “The water left in the soils, SSSSSS, at steady state (after JJ time intervals) and hence assumed to represent 14 the mean storage” 15 However, a different assumption could have been made in every case, the alternatives are not 16 considered, and only the first of the three listed here is later discussed. As a result it is difficult to know 17 whether all of the choices made were critical to the success of the new model. At the very least, I think 18 the authors should list in the discussion any assumptions which have not been tested. 19 20 Response: We agree to that and have included a paragraph, which lists and discusses the 21 remaining two assumptions (see p20, l 3-19 in marked MS). 22 23 Specific points: 24 25 P2 “A single linear reservoir is, however, too simple for describing the variability and non-linearity of 26 hydrological response.” Please cite some references for this. 27 Response: done, see p4, l 23 in marked MS. 28 29 P6 18 “In the current version of DDD, MMMM is a calibrated parameter and is divided into storage levels, 30 ii,” Are these equal-sized intervals? 31 Response: Yes, see p8, l 18 in marked MS. 32 33 P6 19 “which are all assigned different wave velocities, or celerities”. Are these the linear reservoirs 34 arranged in parallel referred to above? If yes, it would help to use the same terms to describe the same 35 thing. 36 Response: see p8, l 19 in marked MS. 37 38 P7 15 “The distance distributions (Figure 2)” Which catchments are these? 39 Response: we have included the information in the figure caption (see p34, l 7-8 in marked MS). 40 41 42 P11 8 “If we assume that the mean value of the sampled ΛΛ′s, Λ, represents the slope of recession in a 43 state of mean storage in the catchment” Why is it reasonable to assume the mean recession rate occurs 44 at the mean storage? 45 Response: We assume a very strong association between the recession characteristic Λ, and storage (see 46 discussion at p20, l 3-19 in marked MS). 47 48 Eq 17: Si appears on both sides of this equation – is that correct? Is it really an integral equation which 49 must be solved? 50 Formatert: Skriftfarge: Tekst 2

Abstract.In this study, we propose a new formulation of subsurface water storage dynamics for use in rainfallrunoff models.Under the assumption of a strong relationship between storage and runoff, the temporal distribution of catchment-scale storage is considered to have the same shape as the distribution of observed recessions (measured as the difference between the log of runoff values).The mean subsurface storage is estimated as the storage at steady state, where moisture input equals the mean annual runoff.An important contribution of the new formulation is that its parameters are derived directly from observed recession data and the mean annual runoff.The parameters are hence estimated prior to model calibration against runoff.The new storage routine is implemented in the parameter parsimonious distance distribution dynamics (DDD) model and has been tested for 73 catchments in Norway of varying size, mean elevation and landscape type.Runoff simulations for the 73 catchments from two model structures (DDD with calibrated subsurface storage and DDD with the new estimated subsurface storage) were compared.Little loss in precision of runoff simulations was found using the new estimated storage routine.For the 73 catchments, an average of the Nash-Sutcliffe efficiency criterion of 0.73 was obtained using the new estimated storage routine compared with 0.75 using calibrated storage routine.The average Kling-Gupta efficiency criterion was 0.80 and 0.81 for the new and old storage routine, respectively.Runoff recessions are more realistically modelled using the new approach since the root mean square error between the mean of observed and simulated recession characteristics was reduced by almost 50 % using the new storage routine.The parameters of the proposed storage routine are found to be significantly correlated to catchment characteristics, which is potentially useful for predictions in ungauged basins.

Introduction
The movement of groundwater to streams is an important component of catchment hydrology and simulating its movement is key to accurately reproducing the hydrograph.Unfortunately, at the spatial scale of interest for studying the dynamics of hydrological systems (the catchment scale), we are not able to actually see and learn how water is transported in the subsurface.Hence, for many decades the (subsurface) storage-runoff relationship has been the basis for countless hydrological model concepts.The subsurface water storage, hereafter denoted subsurface storage or storage, is to be understood as the dynamics storage, i.e. the variation in storage between the wet and dry periods (Kirchner, 2009).In this paper, we will develop and test a new formulation for storage dynamics.The proposed subsurface storage model is based on linear reservoir theory and its parameters are derived directly from recession analysis, digitised maps and the mean annual runoff.
The linear reservoir, often visualised as a straight-sided bucket with a hole in the bottom (Beven, 2001;Dingman, 2002), has an exponentially declining outflow and is the basis for the exponential unit hydrograph (UH).It has served as the most commonly used storage-runoff relationship and plays a fundamental role in conceptual rainfall runoff models.A single linear reservoir is, however, too simple for describing the variability and non-linearity of hydrological response (Brutsaert and Nieber, 1977;Lindström et al., 1997).Some groundwater models conceptualise the stream-aquifer inter-Published by Copernicus Publications on behalf of the European Geosciences Union.
T. Skaugen and Z. Mengistu: Estimating catchment-scale groundwater dynamics from recession analysis actions as the drainage of an infinite number of independent linear reservoirs (Sloan, 2000;Pulido-Velasquez et al., 2005;Bidwell et al., 2008;Rupp et al., 2009).This comes as a result of solving the linearised Dupuit-Boussinesq equation for saturated flow as an eigenvalue and eigenfunction problem.In order to capture the variability in hydrological response, most conceptual rainfall-runoff models also use a system of several, often modified, linear reservoirs to describe the soil moisture accounting and runoff dynamics.The system may vary in complexity (and hence in the inclusion of free calibration parameters), but the linear reservoir remains the basic building block.Examples of such models are the UH models of Nash (1957) and Dooge (1959) and the explicit soil moisture accounting (ESMA) models, of which the workhorse of operational Nordic hydrology, the Hydrologiska Byråns Vattenbalans (HBV) model (Bergström, 1992) serves as an example (see Beven, 2001 for a discussion on the evolution of rainfall-runoff models).In Lindström et al. (1997), the upper zone (the reservoir responsible for quick response) of the HBV model was formulated as a non-linear reservoir, Q = ϑS 1+δ , where Q is runoff, S is storage and ϑ and δ are calibrated constants.For δ = 0, this is, of course, an ordinary linear reservoir.
Recession behaviour should be characteristic for a specific catchment (Tallaksen, 1995;Kirchner, 2009;Stoelzle et al., 2013;Berghuijs et al., 2016) since it provides hydrological information integrated over the catchment.Such a scaledup hydrological signal contrasts that of information derived from the extrapolation of point measurements.Recession data have often been used to derive the storage-runoff relationship, and Brutsaert and Nieber (1977) discuss several theoretical models from the soil sciences as a basis for describing the non-linearity of storage-runoff relationships and investigate this relationship using recession events.Lamb and Beven (1997) developed a tool that used recession data to parameterise non-linear storage-runoff relationships but were not always able to fit single analytical functions.In Kirchner (2009), runoff is assumed to depend solely on the amount of water stored in the catchment and very carefully selected recession events are used to parameterise the storage-runoff relationship.The recession events were selected such that the possible contaminating effect of precipitation and evapotranspiration on the recession data was minimised.For two rivers in the UK, highly non-linear relationships between storage and runoff were found using this approach.
Recession characteristics are, in this paper, used to estimate parameters characterising the storage dynamics.The parameters associated with storage are hence estimated directly from observed data and a priori model calibration to runoff.Such an approach has many attractive features.First, when we use the precipitation-runoff relationship in model calibration, the estimated parameters will be conditioned on both inputs (precipitation and temperature) and the output (runoff).The calibrated parameters will therefore be sensitive to biases and errors in the inputs.Consequently, the more un-certain and biased the precipitation input, the more uncertain and biased parameter estimates (e.g.Dawdy and Bergman, 1969;Kuczera and Williams, 1992;Andréassian et al., 2001;Engeland et al., 2016).Second, when a single parameter is estimated directly from data, one removes the possibility that its value is conditioned on the value of the other parameters, i.e. that the calibrated parameter values compensate for structural or data errors (Beven, 1989;Kirchner, 2006Kirchner, , 2009)).Third, when a single parameter is estimated directly from observed data and not through the optimisation of a model, one does not have to take into account the possible (and probable) errors associated with the model structure (Beven, 2001, p. 21;Kirchner, 2009).In such a way, the errors associated with the modelling of, for example, snow and groundwater, do not influence the parameter estimate.In this paper, we distinguish between calibrated and estimated parameters.The term "calibrated parameters" refers to parameters being part of a set that is simultaneously optimised when minimising the difference between observed and simulated runoff.The term "estimated parameters" refers to parameters estimated independently and directly from observed data.These values are not tuned to minimising the difference between simulated and observed runoff as would be the case if they were calibrated.
The new formulation of storage dynamics proposed in this paper is implemented in the distance distribution dynamics (DDD) model (Skaugen and Onof, 2014;Skaugen et al., 2015), which is briefly reviewed in the next section.In this model, the dynamics of runoff are modelled using linear reservoirs (unit hydrographs -UHs) arranged in parallel, a principle which resembles the stream-aquifer interaction model described by, for example, Bidwell et al. (2008).The UHs are turned on and off according to the level of saturation in the catchment.The UHs are parameterised from recession data and digitised maps, so the DDD model incorporates many of the modelling approaches presented above.
The main objective of this study is to assess how the new formulation of storage, with its parameters estimated directly from recession characteristics and the mean annual runoff, compares with the current formulation of the storage, where its parameter is calibrated against runoff.The comparison will be carried out for a large number of catchments and for runoff and recession behaviour.In the discussion, some implications with respect to predictions in ungauged basins and spatially variable groundwater modelling are discussed.

Hydrological model
The DDD model (Skaugen and Onof, 2014;Skaugen et al., 2015) is a rainfall-runoff model written in the programming language R (http://www.r-project.org) and currently runs operationally at daily and 3-hourly time steps at the opera-Hydrol.Earth Syst.Sci., 20, 4963-4981, 2016 www.hydrol-earth-syst-sci.net/20/4963/2016/ tional flood forecasting service of the Norwegian Water Resources and Energy Directorate (NVE).The DDD model introduces new concepts in its description of the subsurface and of runoff dynamics.Input to the model is precipitation and temperature.In the subsurface module (see Fig. 1), the capacity of the subsurface water reservoir M (mm) is shared between a saturated zone, S (mm), called the groundwater zone and an unsaturated zone with capacity D (mm), called the soil water zone.The actual water present in the unsaturated zone, D, is called Z (mm).
The subsurface state variables are updated after evaluating whether the current soil moisture, Z(t), together with the input of rain and snowmelt, G(t), represent an excess of water over the field capacity, R, which is fixed at 30 % (R = 0.3) of D(t) (Grip and Rohde, 1985, p. 26;Colleuille et al., 2007).If so, excess water X(t) is added to S(t).To summarise, Excess water : Soil water content : Soil water zone : where Q(t) is runoff.Actual evapotranspiration, Ea(t), is estimated as a function of potential evapotranspiration and the level of storage.Potential evapotranspiration is estimated as Ep = θ cea • T (mm day −1 ), where θ cea (mm • C −1 day −1 ) is the degree-day factor which is positive for positive temperatures and zero for negative temperatures.Actual evapotranspiration thus becomes Ea = Ep × (S + Z)/M and is drawn from Z.
In the current version of DDD, M is a calibrated parameter and is divided into equal-sized storage levels, i, for which their associated UHs are all assigned different wave velocities, or celerities, v i (m s −1 ).The celerities increase for increasing i (see next section).Experience using the DDD model shows that the subsurface water capacity parameter M largely controls the variability of the hydrograph.Low values of M increase the amplitude of the hydrograph since the entire range of celerities is engaged, and vice versa.
Calibrated model parameters are hereafter denoted by θ with subscripts (e.g.θ M ), in order to clearly distinguish between estimated and calibrated parameters.

Runoff dynamics
The runoff dynamics are completely parameterised from observed catchment features derived using the Geographical Information System (GIS) and runoff recession analysis.Central for the formulation of runoff dynamics for a catchment is the distance distribution derived using GIS.The distances, d (m), from points in the catchments to the nearest river reach are calculated for each catchment.For more than 120 studied catchments in Norway, the exponential distribution describes the distribution of distances well.Figure 2 shows the empirical and exponential distributions for two Norwegian catchments and although the mean distance d is different, the exponential distribution is a good fit for both catchments.The parameter, γ , of the exponential distribution equals 1/d.The distance distributions (Fig. 2) express the areal fraction of the catchment as a function of distance from the river network.In Appendix A, analytical relations between exponential distance distributions and linear reservoirs are described.
In the DDD model, water is conveyed through the soils to the river network by waves with celerities determined by the actual storage, S(t), in the catchment.The celerities associated with the different storages are estimated by assuming exponential recessions with parameter in , where Q 0 is the peak discharge immediately before the recession starts (Nash, 1957).We can determine the parameter (t) from the difference at any time t during the recession due to the lack-of-memory property of the exponential distribution (Feller, 1971, p. 8).The parameter is thus the slope per t of the recession (of log Q(t)).From Eqs. (A2) and (A7) in Appendix A, we find the celerity v (m s −1 ) as a function of : If we sample values from all recession events (the only condition is that Q(t) > Q(t + t)) according to Eq. ( 3), we find that they can be fitted to a gamma distribution.This is a development from the exponential model used in Skaugen and Onof (2014) and is based on more detailed analysis of a much larger number of runoff records.For the 73 catchments used in this study, the gamma distribution was a good fit for all catchments.In Fig. 3, we have plotted the empirical and the gamma distribution of for six catchments with estimated shape, α, and scale, β, parameters of the gamma distribution, and it is clearly seen that the flexibility of the gamma distribution is needed in order to model the observed quantiles (see, for example, Fig. 3d and f).
The capacity of the subsurface reservoir, θ M , is divided into storage levels of equal capacity.The storage levels i correspond to the quantile of the distribution of under the assumption that the higher the storage, the higher the values of .Each level is further assigned a celerity v i = λ i d t (see Eq. 4), where λ i is the parameter of the individual unit hydrograph for storage level i, and estimated such that the runoff from several storage levels will give a UH equal to the exponential UH with parameter i , i.e.
where are the weights associated with the discharge from each level estimated by i = i / i k=1 k .From Eq. ( 5), λ i are solved successively for increasing i under the assumption that λ 1 = 1 (see Skaugen and Onof, 2014).
The quantiles of are mapped to a uniform distribution of S, F ( ) = S θ M , which implies that all storage levels are equally probable and that the equally spaced storage levels have equal capacity of water, i.e. if θ M = 50 mm and we use five storage levels (i = 1 . . .5), each level has a capacity of 10 mm.In Skaugen and Onof (2014), no increase in the precision of daily runoff simulations was found using more than five storage levels.

Reformulation of the subsurface of DDD
An obvious problem of the approach described above is that we attempt to estimate an extreme value, the maximum catchment-scale storage θ M , a task which is obviously associated with more uncertainty than estimating the mean catchment-scale storage, m s .Another problem is the assumption of a uniform distribution of storage levels.A quick investigation of observed groundwater level fluctuations suggests that this is not the case.Figure 4 shows histograms of observed groundwater levels from three observation boreholes located in a small catchment (the Groset catchment, 6.33 km 2 ) in southern Norway.The figure clearly illustrates that fluctuations in storage and groundwater levels are spatially variable and should ideally be treated as such Hydrol.Earth Syst.Sci., 20, 4963-4981, 2016 www.hydrol-earth-syst-sci.net/20/4963/2016/ in rainfall-runoff models (Rupp et al., 2009;Sloan, 2000).This is a consequence of the differences in water level fluctuations depending on the location of the borehole relative to the river (i.e. at the top of a hillslope versus adjacent to a river) and also on the catchment variability of topography and soil porosity (Refsgaard et al., 2012).It is therefore very difficult to parameterise the distribution of the catchmentscale groundwater fluctuations from such single observation points (Kirchner, 2009).In addition, the distribution is unlikely to be uniform as none of the individual histograms exhibits such a behaviour.
To overcome the problems identified above, we attempt to develop a storage model that differs from the previous model in that the groundwater reservoir is parameterised by its mean storage, m s , as opposed to the maximum storage, θ M .In addition, regarding the practical problems associated with the observation of catchment-scale fluctuations of storage, we make the assumption that recession and its distribution carries information on the distribution of catchment-scale storage.More precisely, we assume that the temporal distribution of catchment-scale storage can be considered as a scaled version to that of the recession characteristic, .Consequently, the subsurface reservoir no longer increases linearly with the quantiles (which is the case with storage levels of equal capacity), but rather increases non-linearly according to the shape of the distribution of .
Since the distribution of is modelled as a two-parameter gamma distribution, we can write where α and β are the shape and scale parameters, respectively, and estimated from observed values (using Eq. 3).
The distribution of S is hence also modelled as a twoparameter gamma distribution: where the scale parameter, η, is and c is a constant and equal to where is the mean value of , estimated from the parameters of the fitted gamma distribution and represents the mean recession characteristic.Note that since the distribution of S is a scaled version of , the shape parameter α is equal for the two distributions.
In order to model the storage as a two-parameter gamma distribution, we need to estimate the mean storage, m S .We can then determine the constant c from Eq. ( 9), and finally, the scale parameter η using Eq. ( 8).
If we assume that the mean value of the sampled values, , represents the slope of recession in a state of mean storage in the catchment, then the associated UH is The temporal scale of the UH in Eq. ( 10) is t h,max = d max /v h , where d max is the observed maximum distance of the hillslope distance distribution and v h is the celerity associated with through v h = d t (see Eq. 4).Let t h,max be divided into suitable time intervals, t, then the number of time intervals it takes to drain the hillslope is J = t h,max / t.When Eq. ( 10) is integrated over successive time intervals, we obtain weights, µ j , which, if multiplied by the excess moisture input, X( t), give the response (the water entering the river network) for the different time intervals.The weights are calculated as and scaled so that the sum of weights equals 1.The runoff at time interval j is calculated as For estimating the mean storage, m S , we first calculate the mean annual runoff, MAR, which corresponds to a daily excess moisture input X of where A is the catchment area.
After J successive days of input X, routed with the UH of Eq. ( 10), we reach a steady state where the volume of the input equals the output (MAR).The total sum of moisture input after J days is where total runoff, Q SS , after J days is Hydrol.Earth Syst.Sci., 20, 4963-4981, 2016 www.hydrol-earth-syst-sci.net/20/4963/2016/ and k is the number of days and the subscript denotes "steady state".The water left in the soils, S SS , at steady state (after J time intervals) and hence assumed to represent the mean storage m S , is S SS = J • X − Q SS , which can also be calculated as With an estimate of the mean storage, m S , we can use Eqs.( 8) and ( 9) to estimate the scale parameter, η, of the distribution of S. The shape parameter, α, is already determined and equal to that of the distribution of .The gamma-distributed storage levels S i are calculated as quantiles of the gammadistributed storage: where M is now estimated as the 99 % quantile of the distribution of S.

Test of new storage routine
We will test the performance of the new formulation of storage by replacing the formulation of the storage where θ M is a calibrated parameter and storage is uniformly distributed with a formulation where storage is gamma distributed with parameters, η and α, derived from recession data and MAR.The model with the current storage routine is denoted DDD_θ M and the model with the new storage routine is denoted DDD_m S .DDD_m S and DDD_θ M are tested for 73 catchments distributed across Norway (see Fig. 5).The catchments vary in latitude, size, elevation and landscape type (see histograms of selected catchment characteristics in Fig. 6) and thus constitute a varied, representative sample of Norwegian catchments.
The time series for precipitation and temperature are mean areal catchment values extracted from an operational meteorological grid (1 km 2 × 1 km 2 ) produced by MET Norway, which provides daily values of precipitation and temperature for Norway from 1957 to the present day (see http://www.senorge.no).The runoff data are provided by the NVE.The time series of precipitation, temperature and runoff were split into a calibration data set (1 September 1995-31 December 2011) and a validation data set (1 January 1980-31 August 1995).
DDD_θ M is calibrated using an R-based Monte Carlo Markov chain method (Soetart and Petzhold, 2010).Altogether, 11 parameters (including θ M ) are calibrated (see parameters denoted by θ with subscripts in Table 1).The calibrated parameters, except for θ M , are also used when running DDD_m S .The variability error is given by the ratio of the coefficients of variation of simulated and observed runoff as suggested in Kling et al. (2012).The mean values of the skill scores for DDD_θ M and DDD_m S are shown as straight lines in the plots.We have also added a moving average of the results for enhanced readability.We see from Fig. 7 that little precision is lost in the results for DDD_m S .The mean values of NSE and KGE are slightly better for DDD_θ M .The result for bias is better for DDD_m S (Fig. 7d), whereas the results for the correlation and variability errors favour DDD_θ M .Overall, the differences in skill between DDD_m S and DDD_θ M are very small.Mean values of the skill scores for DDD_m S and DDD_θ M are shown in Table 2.The observed distribution of the recession characteristic , is crucial for both the estimation of the subsurface celerities and the estimation of m S .If the distribution of simulated , denoted ˙ , is similar to that of the observed, this suggests that recessions are well simulated and hence that the dynamics of the model are realistic.Figure 8 shows scatter plots of the mean and standard deviation of observed and simulated ˙ for DDD_m S (blue circles) and DDD_θ M (red crosses).The root mean square error (RMSE) of the mean ˙ is clearly less for DDD_m S , whereas the RMSEs of standard deviation of ˙ for DDD_m S and DDD_θ M are similar (see Table 3).
Figure 9 shows histograms of simulated storage from DDD_θ M (Fig. 9a) and DDD_m S (Fig. 9b) with empirical cumulative distribution functions (CDFs) (Fig. 9c) of the observed (black line) and simulated ˙ (DDD_θ M with a red line and DDD_m S with a blue line) for a specific catchment.The CDF of ˙ simulated with DDD_m S is clearly in better agreement with that of the observed.The shape of the histograms of storage fluctuations are very different, and as we have no data to estimate the true empirical distribution of storage at the catchment scale, we cannot claim that the fluctuations simulated with DDD_m S are closer to the truth than those simulated by DDD_θ M .However, since the parameters of the subsurface and the dynamic module of DDD_m S are estimated prior to model calibration and the recessions are demonstrably better simulated, it is reasonable to suggest that the catchment-scale storage fluctuations simulated with DDD_m S are closer to the truth.

Discussion
The new formulation for the subsurface storage gives good results, and it is promising that the replacement of a routine with calibrated parameters with a routine with estimated parameters produces runoff simulations which are equally precise and robust.In addition, the simulated recessions, ˙ , are much closer to those observed, suggesting a more realistically modelled storage-runoff relationship (i.e. the nonlinearly increasing storage capacity).Comparing simulated runoff in such a manner constitutes a rather strict test for Hydrol.Earth Syst. Sci., 20, 4963-4981, 2016 www.hydrol-earth-syst-sci.net/20/4963/2016/The reduction of calibrated parameters in the storage and dynamic module of the DDD model has attractive implications for the problem of predictions in ungauged basins (PUB) (see, e.g.Sivapalan, 2003;Parajka et al., 2013;Hrachowitz et al., 2013;Blöschl et al., 2013;Skaugen et al., 2015).In Skaugen et al. (2015), seven model parameters of the DDD model (including θ M and the parameters for the distribution of λ) were estimated from catchment characteristics (CCs) using multiple regression analysis.All model parameters were found to correlate significantly with the CCs.The median NSE for 17 catchments was found to be 0.66 and 0.72 for two time series when DDD was run with model parameters estimated from CCs.The change in the model structure of DDD presented in this paper with respect to predictions in ungauged basins implies that we need to estimate the parameters for the distribution of from CCs.The estimation of θ M through multiple regression with CCs, however, is no longer needed.Although it is not within the scope of this study to conduct a full PUB analysis, we investigated how the parameters of the distribution of can be regionalised.Since λ is a function of (see Eq. 5), the parameters of the distribution of λ and are obviously highly correlated (from a sample of 84 Norwegian catchments, we found correlations between the shape, α, and the scale, β, parameters of and λ of (α) ,λ = 0.97 and (β) ,λ = 0.98).In Skaugen et al. (2015), the parameters for the distribution of λ could be expressed as functions of the mean of the distance distribution, d, percentage of lake, percentage of bare rock Degree-day factor for melting snow Calibrated Degree-day factor for melting glacier ice 1.5 × θ CX CFR (mm and catchment length with significant coefficients of determination of R 2 λ (α) = 0.45 and R 2 λ (β) = 0.35, respectively.A similar analysis using the new model structure (DDD_m S ) with an added new subroutine for the spatial distribution of snow water equivalent (SWE) (Skaugen and Weltzien, 2016) showed that the parameters of the distribution of were significantly correlated (p value < 0.01) to the mean of the distance distribution, d, areal percentage of lake and the catch-   4).From Table 4, we note that the shape parameter is positively correlated to the areal percentage of lake (L %).In Fig. 3f, this catchment has L % of 9.5 %, whereas in Fig. 3c, L % is only 4.4 %.The significant correlations yield significant multiple regression equations with coefficients of determination of R 2 (α) = 0.59 and R 2 (β) = 0.54.Hence, the potential for improved predictions in ungauged basins is promising.
The assumption of equal shape for the distributions of and S is, of course, difficult to verify as no direct observa- However, if we use the equation for the linear reservoir in Appendix A (Eq. A4) and express the rate constant as a function of (Eqs. 4 and A6), we can, for observed recession values of Q and , calculate the corresponding values of S and compare the distributions of and (the scaled) S.
Figure 10 shows such a comparison for two catchments, and, except for the highest quantiles, the distributions of and (scaled) S are almost identical and hence supporting our as-Hydrol.Earth Syst. Sci., 20, 4963-4981, 2016 www.hydrol-earth-syst-sci.net/20/4963/2016/ sumption.The high frequency of high S values present in Fig. 10, also seen for several other catchments (not shown), is the result of the combination of high Q values and low values of , i.e. very modest recession for situations with high runoff values.Such events are probably not representative for describing recession characteristics of the catchment.By sampling (t) and estimating S(t) under the condition that precipitation at the time (t + t) could not exceed a low threshold of 0, 2 and 5 mm, we found that the frequency of very high values of S estimated by Eq. ( 18) were reduced.Hence, the very high values of S did not represent storage for true recession events.Moreover, the distribution of was insensitive to such conditioning, implying that Eq. ( 3) is a robust estimate of recession characteristics, whereas the distribution of S is highly sensitive.This way of conducting recession analysis differs, mainly in the manner of sampling the recession events from those described in recession analysis reviews such as Tallaksen (1995) and Stoelzle et al. (2013).Common for many of the recession selecting algorithms reported in the literature is the censoring of the recession events with exclusion of events with rainfall or periods of high evapotranspiration (e.g.Kirchner, 2009) and exclusion of the early stages of the recession to avoid the influence of preceding storm and surface flow (Stoelzle et al., 2013).In this study, all recession events that satisfy Q(t) > Q(t + t) are used to estimate the parameters of the distribution of .We have found that the distribution of remains quite insensitive to precipitation (see above), and it is equally important that the parameters of the distribution of are correlated to (and can be estimated from) catchment characteristics.
There are other assumptions presented in this paper that remain difficult to test because S is not observed at the catchment scale.We assume that (i) represents the slope of recession in a state of m S and (ii) that water left in the soils at steady state for MAR represents m S .These assumptions are needed so that we can estimate m S , and hence have an analytical expression for the distribution of S through Eqs. ( 8) and ( 9).Since we are interested in a mean behaviour, linking the two known means, MAR and , with the unknown mean, m S , seems reasonable and the most straightforward.The actual spatially distributed subsurface state for mean storage (see below for more detailed discussion on subsurface states) can, of course, vary and be associated with different recessions (different values), but this does not rule out that and m S are good representatives for describing a mean behaviour.A catchment under steady state for MAR, where water is routed through the hillslopes with a unit hydrograph parameterised using , is a very simplistic way of describing runoff dynamics.However, it serves the purpose of establishing a scenario of mean behaviour where input (precipitation) balances the storage and output (runoff), and hence gives a plausible estimate of m S .
In Kirchner (2009), the storage-runoff relationship is assumed to be a single-valued function, i.e. S is a single-valued function of Q.This leads to a very simple model with regards to the number of states in the subsurface, namely one.The number of states in DDD, however, can be very high.If we consider Eq. ( 16), the number of summations (time steps) constituting S SS can be viewed as a number of subsurface states since each summation represents a volume water that will sooner or later propagate into the river network.Equation ( 16) describes the subsurface using only one (mean) UH.In the DDD model, the number of storage levels is fixed to five, and the UHs constituting the storage levels all have the same shape (exponential) but have different temporal scales.The temporal scale (level of discretisations) of the UHs varies according to their associated celerity, and the slowest (lowest) storage level may be discretised such that hundreds of time steps are necessary for the complete attenuation of the UH.Such a system actually provides a 2-D representation of the subsurface (Rupp et al., 2009;Sloan, 2000) and gives numerous subsurface states (Harman, 2015).It is hence entirely possible to have different configurations of states associated with the same runoff.Figure 11 shows a snapshot of how DDD models the storage S. The catchment is represented as one hillslope where the x axis shows the distance (in metres) from the river reach (at the right-hand side) to the top of the hillslope (at the left-hand side).The y axis shows the different storage levels.We see the outline of boxes (especially for the higher storage levels), which represents the temporal discretisation of the UHs.Each box represents an area according to the distance distribution and the associated celerity that will drain per time interval.The higher the celerity, the more parts of the catchment area are represented by each box.The darker the blue colour, the more water is present in the box.Figure 11 can be seen together with Fig. A1 in Appendix A, which illustrates how the distance distribution (and the travel-time distribution) determines the fractional areas that drain per time interval for a given celerity (see also Harman, 2015 for distribution of storage and water age).In Fig. 11, we can also note that it is more or less dry at the top of the hillslope and saturated near the river.This is consistent with the wetting up of a catchment from the riparian zone outwards and up the hillslope (Dunne and Black, 1970;Kirkby, 1978, p. 275;Myrabø, 1997).
Figure 12 shows simulated storage, S, plotted against simulated runoff, Q, for two catchments of different size (49 and 1833 km 3 ).It is quite clear that the relationship between Q and S is not single valued.The variability of Q for the same S (and vice versa) is to be expected given the multitude of possible configurations of the subsurface states (i.e. the discretisations of the UHs).The shape of the clouds of points resembles those found for observations of groundwater versus runoff (Rupp et al., 2009;Laudon et al., 2004;Myrabø, 1997).The points in Fig. 12, however, do not level off to the same degree as seen for groundwater observations.This can probably be explained by the fact that storage in DDD is simulated for an entire catchment, and it is more unlikely that an entire catchment will reach full saturation than individual groundwater boreholes located relatively close to the river (Myrabø, 1997;Laudon et al., 2004).
The parameters of the subsurface and the dynamical modules of the DDD model are all estimated prior to calibration against streamflow and we see this as a necessary development if we are to effectively test new algorithms for snow distribution, snowmelt, evapotranspiration etc. at the scale that matters for most practical applications, the catchment scale (Clarke et al., 2011).Multivariable parameter estimation (Bergström et al., 2002) has been put forward as a means to increase confidence in hydrological modelling and models.Although we agree that such procedures indeed narrow the parameter space (although not its number of dimensions), the interaction and compensating nature of the calibration parameters makes it almost impossible to reject flawed model structures so that we can concentrate on building models that work well for the right reasons.In this paper, and in previous ones (Skaugen and Onof, 2014;Skaugen et al., 2015), information ready at hand, such as GIS-derived distance distribution functions and runoff records, have proved to be useful for parameterising algorithms describing basic hydrological processes.

Conclusions
In this paper, a new formulation of the subsurface in the DDD model is presented.In the new formulation, the storage capacity increases non-linearly with saturation, following a two-parameter gamma distribution.The parameters of the gamma distribution are estimated directly from observed runoff recession data and the mean annual runoff and not through model calibration against runoff.The new storage formulation has been tested for 73 catchments in Norway Hydrol.Earth Syst. Sci., 20, 4963-4981, 2016 www.hydrol-earth-syst-sci.net/20/4963/2016/ of varying size, mean elevation and landscape type, with little loss in precision.In addition, more realistic runoff recessions are found using the new subsurface routine, suggesting a more realistic storage-runoff relationship.
A preliminary analysis shows that the parameters of the new storage routine can be estimated from catchment characteristics, which is promising for continued advances in prediction in ungauged basins.
The DDD model exhibits a spatially variable representation of the subsurface and allows for different subsurface states associated with the same value of runoff.This constitutes a more realistic representation of the subsurface and is more in line with more dedicated groundwater models.
Future work includes implementing a more physically based energy balance approach for snowmelt in DDD and testing the new model structure for predictions in ungauged basins in a similar analysis to that of Skaugen et al. (2015).

Data availability
The precipitation, temperature and runoff data used in this study can be obtained by contacting the corresponding author.

Figure 1 .
Figure 1.Schematic of the subsurface water reservoir M of DDD.G(t) represents moisture input, rain and snowmelt.The dotted horizontal line is the actual level, Z, of soil moisture in D. The ratio (G(t) + Z(t)) / D(t) controls the release of excess water to S and hence to runoff.Note that D, S and Z are functions of time, whereas M is fixed.

Figure 2 .
Figure 2. Empirical and fitted (exponential, red line) cumulative distribution functions (CDFs) of distances from a point in the catchment to the nearest river reach for two Norwegian catchments, Møska (top panel) and Narsjø (bottom panel).The catchments are located in the south and north-east, respectively, in southern Norway.The mean distance (denoted d_mean in the figure) and catchment size differ, but the shape of the distribution is similar.

Figure 3 .
Figure 3. Empirical and fitted (gamma, blue line) CDFs of for six Norwegian catchments. is sampled using Eq.(3) for all observed recession events.

Figure 4 .
Figure 4. Histograms (in black, green and red) of groundwater levels at three different locations in the Groset catchment (6.33 km 2 ) located in southern Norway.

Figure 5 .
Figure 5. Location of the 73 catchments used to evaluate the new storage routine.

Figure 6 .
Figure 6.Histograms of catchment characteristics for the 73 catchments.(a) Mean of the hillslope distance distribution, d, (b) areal percentage of lakes, (c) areal percentage of bogs, (d) catchment area, (e) mean elevation, (f) areal percentage of glaciers, (g) areal percentage of forests and (h) areal percentage of bare rock.

Figure 8 .
Figure 8. Scatterplot of mean (a) and standard deviation (b) of observed and simulated with DDD_m S (blue circles) and DDD_θ M (red crosses) ˙ for 73 catchments.

Figure 9 .
Figure 9. Histograms of storage simulations with DDD_θ M (a) and DDD_m S (b).Empirical CDFs of observed (black line) and simulated ˙ with DDD_θ M (red line) and DDD_m S (blue line) are shown in (c).
Figure 10.Empirical CDFs of (circles) and scaled S(t) (blue line) for two Norwegian catchments.

Figure 11 .
Figure 11.Snapshot of the saturated zone S of the DDD model.The catchment is represented as one hillslope.The x axis shows the distance from the river (right-hand side) to the top of the hillslope (left-hand side).The y axis show the storage levels.The darker the blue colour, the more water is present in the storage level.

Figure 12 .
Figure 12.Simulated storage S plotted against simulated runoff Q for a catchment of 49 km 2 (a) and a catchment of 1833 km 2 (b).

Table 1 .
(Saelthun, 1996)e DDD model with description and method of estimation.Some parameters have fixed values obtained through experience in calibrating DDD for gauged catchments in Norway.These values are within the recommended range for the HBV model(Saelthun, 1996).The GIS analyses are carried out using the national 25 × 25 m DEM (http://www.statkart.no).

Table 2 .
Mean values of skill scores obtained with simulating with DDD_m S and DDD_θ M for 73 catchments.KGE_r measures correlation, KGE_b the bias error and KGE_g the variability error.All skill scores have an ideal value of 1.

Table 3 .
Root mean square error (RMSE) values for the mean and standard deviation of simulated ˙ for the 73 catchments.

Table 4 .
Significant Spearman correlation (p value < 0.01) between catchment characteristics and the shape, α, and scale, β, parameters of the distribution of .The correlations are based on estimated model parameters for 83 Norwegian catchments.