Conditioning rainfall-runoff model parameters for ungauged catchments and land management impacts analysis

Data scarcity and model over-parameterisation, leading to model equifinality and large prediction uncertainty, are common barriers to effective hydrological modelling. The problem can be alleviated by constraining the prior parameter space using parameter regionalisation. A common basis for regionalisation in the UK is the HOST database which provides estimates of hydrological indices for different soil classifications. In our study, Base Flow Index is estimated from the HOST database and the power of this index for constraining the parameter space is explored. The method is applied to a highly discretised distributed model of a 12.5 km2 upland catchment in Wales. To assess probabilistic predictions against flow observations, a probabilistic version of the Nash-Sutcliffe efficiency is derived. For six flow gauges with reliable data, this efficiency ranged between 0.70 and 0.81, and inspection of the results shows that the model explains the data well. Knowledge of how Base Flow Index and interception losses may change under future land use management interventions was then used to further condition the model. Two interventions are considered: afforestation of grazed areas, and soil degradation associated with increased grazing intensity. Afforestation leads to median reduction in modelled runoff volume of 24% over the simulated 3 month period; and a median peak flow reduction ranging from 12 to 15% over the six gauges for the largest simulated event. Uncertainty in all results is low compared to prior uncertainty and it is concluded that using Base Flow Index estimated from HOST is a simple and potentially powerful method of conditioning the parameter space under current and future land management. Correspondence to: N. Bulygina (n.bulygina@imperial.ac.uk)

sence of information with which to constrain the prior parameter space (Beven, 2001(Beven, , 2003, which may result in unacceptable uncertainty in modelled response. Where the modelling objective requires a detailed catchment discretisation with many element types, each with its own model parameter set, the issue of uncertainty due to lack of flow observations quickly becomes over-riding. 10 To alleviate this problem, parameter regionalization may be used to constrain the parameter space prior to, or instead of, calibration (Bardossy, 2007;McIntyre et al., 2005;Yadav et al., 2007). The core of most regionalization approaches is an estimation of either the dependence of the model parameter on physical catchment characteristics (e.g. catchment area, steepness, soil permeability) (Lamb and Kay, 2004;Lee et al., 15 2006;McIntyre et al., 2005;Young, 2006); or the dependence of response indices (e.g. mean annual discharge, daily discharge standard deviation) using physical characteristics (Bardossy, 2007;Yadav et al., 2007), so that the indices can then be used to restrict the parameter space. The first approach is weakened by uncertainty in the model parameters and hence the regional relationship, and the regional relationship neglects or 20 simplifies model parameter interdependencies (McIntyre et al., 2005). The potential benefit of the second approach is that the response indices may be selected so that they are relatively independent and well-defined for the relevant set of catchments, so that relationships with the chosen catchment characteristics may be found with minimal ambiguity. The underlying challenge is to find indices which are adequately informative, 25 independent and lead to well-defined regional relationships. Yadav (2007) explored different physical characteristics and response indices in a pilot study for 30 UK catchments. The most influencial properties were found to be Base Flow Index (proportion of flow as base flow) and a Wetness Index equal to precip-Introduction

Conclusions
References Tables  Figures   Back  Close Full Screen / Esc

Printer-friendly Version
Interactive Discussion itation divided by potential evaporation. In their work the Base Flow Index (BFI) is estimated using the HOST system of Boorman et al. (1995). The HOST system provides estimates of Base Flow Index (BFI HOST ) for each of 29 soil classes as functions of various physical soil properties: depth to gleyed layer, depth to slowly permeable layer, integrated air capacity, presence of peaty surface layer, and soil parent mate-5 rial. Therefore, BFI HOST is non-linearly related to soil properties, with considerable uncertainty. Various other researchers have found that BFI HOST is the catchment characteristic of principal importance in the UK and alone contains significant information about rainfall-runoff model parameters (Lamb and Kay, 2004;Lee et al., 2006;Young, 2006). In some cases, other catchment properties or HOST outputs have been found 10 to be more important. These are generally, although not always, highly correlated with BFI HOST (Wagener et al., 2004). In all applications of the HOST data to rainfall-runoff modelling, the significant uncertainty in HOST outputs, among other sources of model uncertainties, means than formal uncertainty analysis is recommended. As well as application to ungauged catchments, regionalization of response indices 15 may be applied to land management impacts analysis (Wagener, 2007). For example, the impact of land management on soil properties can be speculated and translated, via the regional model, into changes in response indices. This has been done most widely using indices derived from the USDA's Curve Number system (e.g. Gassman et al., 2007) and recently in the UK using the HOST system (DEFRA, 2007). Although the 20 effect of some land management interventions has been well studied in some catchments (e.g. the study of the impacts of trees by Robinson et al., 2003), in general the link between land management and runoff indices is complex and not well identified from the literature (O'Connell et al., 2007). Representation of land management using conceptual models therefore involves significant uncertainty.

25
The objective of this study is to develop a regionalization scheme which may be applied throughout the UK, and which may provide adequate information about rainfallrunoff responses for a range of applications. In particular the method is developed to allow us to predict flows in ungauged catchments and to explore impacts of local land management changes to catchment properties using highly discretised catchment models. The response index approach to regionalization is adopted, using the indices to condition prior model parameter uncertainty into posterior distributions so that probabilistic predictions of land management impacts can be made. While the methodology allows all available runoff response indices to be introduced, we use only 5 BFI HOST with the hypothesis that this on its own is usefully informative. In the rest of this paper, the methodology is described and assessed for a highly discretised catchment, and its applicability to evaluating land use change impacts is demonstrated.

Method
2.1 Parameter space restriction using BFI 10 Suppose a catchment is discretised into a large number of runoff generating elements, and catchment response is viewed as the integration of all the individual elemental responses (we consider network routing of generated runoff as a separate issue, below). Potentially, the catchment model needs a separate set of parameters for each element.
Here, it is assumed that all elements with the same BFI HOST have the same set of 15 parameter values. Therefore, the number of different parameter sets needed for runoff generation modelling is far less than the number of elements and cannot exceed 29the number of soil types in the HOST classification. For each model element, a hydrological model can be run, and then the BFI can then be estimated from the simulated runoff. In this study, BFI is estimated using base 20 flow hydrograph separation procedure from Gustard et al. (1992). The procedure calculates flow minima of five-day non-overlapping consecutive periods and subsequently searches for turning points in this sequence of minima. The turning points are then connected to obtain the base flow hydrograph which is constrained to equal the observed hydrograph ordinate on any day when the separated hydrograph exceeds the 25 observed. The simulated BFI is compared to the expected BFI HOST value for the as- Interactive Discussion sociated soil class. The closer the comparison, the more consistent the model with the prior information about the soil. This comparison should account for the standard deviation of BFI HOST due to natural variablity within a soil class σ (Table 1). Therefore, for each HOST soil type, the posterior distribution of model parameter θ is expressed as where p 0 is a prior parameter distribution, L(BFI HOST |BFI θ ) is the likelihood of BFI=BFI HOST given the model estimate BFI=BFI θ . The likelihood function is assumed proportional to a normal probability density function with expected value BFI θ and variance σ 2 . In the case study, a single model structure is assumed, although the method 10 may be generalized to multiple model structures.
As introduced previously, the underlying hypothesis behind the method is that BFI HOST alone contains an adequate amount of information with which to condition the parameter space. The method is therefore not applicable for conditioning the parameters of model components unrelated to soil type. For example, this may include 15 the routing effects of lakes, reservoirs and the channel network; and effects of vegetation types on evapotranspiration and routing. The parameters of these components must be constrained using some other source of knowledge or observed data.

Modelling aspects of land use change
As well as testing the applicability of the method to ungauged catchments, our motiva-20 tion is to allow element-scale (100 m×100 m) land management changes to be easily represented within catchment scale models. This involves identifying suitable changes to p(θ) for the affected elements. Two examples are discussed here -afforestation and grazing intensity.
In most situations trees are known to increase interception losses and hence reduce Introduction

Tables Figures
Back Close

Full Screen / Esc
Printer-friendly Version Interactive Discussion (Jewitt, 2005). There is also evidence that base flow proportion increases under forest (Wheater et al., 2008), but currently only limited quantitative information about this change is available. Therefore, afforestation is assumed to lead to higher BFI, while keeping the same HOST soil type. Hence, given a parameter set θ 0 (and corresponding BFI value BFI 0 ) before afforestation, the posterior parameter probability distribution 5 for new conditions is defined as, Here, θ f is a parameter set for afforested conditions, the prior p(θ f )=p(θ 0 ) is the parameter distribution for the current HOST class, and the likelihood L(θ f |θ 0 ) takes one of two values: 1 if the simulated BFI for θ f is higher than BFI 0 and 0 otherwise. In 10 other words, the posterior includes only those parameter sets that lead to a base flow increase. We note that, in principle, physically-based models could be used as an additional source of information for parameter conditioning, e.g. following Jackson et al. (2008). Changes in interception losses associated with afforestation may be estimated using 15 a standard model using parameter distributions from the literature. For the simple hard threshold bucket model shown in Fig. 1, the value of canopy storage capacity, C int can range from 0.2 to 3.8 mm depending on species, leaf area index, canopy cover, vegetation structure, and density. Canopy storage is particularly low in deciduous forests during leafless (winter) periods (David et al., 2005). 20 The second land management change considered is increasing stocking density, leading to soil structural degradation. Following the approach of Hollis (Packman et al., 2004), degraded soil is assigned an appropriate analogue HOST class to represent the change ( Table 1). The rationale for the proposed changes is that soil structural degradation, in the form of topsoil and upper subsoil compaction and seasonal "capping" and 25 sealing of soil surfaces, causes a reduction in the effective soil storage, which in turn results in increased surface runoff. Therefore, the general principle is that soil structural degradation affects the soil storage/wetness component of the HOST classification, but does not alter the hydro-geological component.
1913 Given a parameter set θ 0 (and corresponding BFI value BFI 0 ) before soil degradation, the posterior parameter probability distribution for the degraded conditions is defined as where θ d is a parameter set for the degraded conditions, the prior p(θ d ) equals the 5 parameter distribution for the analogue HOST class, and the likelihood L(θ d |θ 0 ) takes one of two values: 1 if the simulated BFI for θ d does not exceed BF I 0 and 0 otherwise. Hence, the posterior accepts only those parameter sets that lead to a base flow reduction and therefore flashier response.

Performance statistics 10
The proposed parameter restriction method will result in probabilistic discharge prediction. Often in hydrology, prediction quality is characterized using the Nash-Sutcliffe efficiency coefficient (NS). To define its analogue for probabilistic predictions, a "distance" between a data point at time t(q 0 t ) and a random variable that represents the modelled value (ξ t ) can be introduced as follows, HESSD 6,2009 Conditioning rainfall-runoff models for ungauged catchments This can be subdivided into two parts where the first part corresponds to the traditional NS coefficient in which expected val-5 ues are considered as predictors; and latter represents the variance whereby the higher predictor variance around the mean is, the less "effective" the prediction. Although the proposed method may be used for ungauged and poorly gauged catchments, for performance assessment purposes the method is applied here to a densely monitored experimental catchment, as if it were poorly gauged; and the predictions are evaluated 10 using the NS def and the NS statistics.

Case study -catchment description
The method is demonstrated using data from the Pontbren catchment in Powys, Wales, UK (gauged area 12.5 km 2 ) (Marshall et al., 2009 (Hough and Jones, 1997) for grassland is 774 mm, so that P/PE ratio is about 2.16. Soil types are dominated by low permeability silty clay loams (Table 2 and Fig. 2). Typically, 20 cm depth of topsoil overlies a deep layer of relatively impermeable subsoil. On moorland, the topsoil has a significant peat content. The time-series data used for this study are 10 min resolution rainfall, daily MORECS 5 potential evapotranspiration data, and 15 min resolution streamflow data from five bedmounted acoustic Doppler velocity meters (gauges 2, 5, 6, 7, and 9 in Fig. 3) and from a pressure transducer at a rated section (gauge 10). Gauge 10 data are used for assessing low flow (<1.5 m 3 /s) performance only because the rating curve for higher flows is poorly defined. The contributing areas at each of the six gauges are given in Table 3. Although another four flow gauges exist (Fig. 3)

Case study -model description
The catchment is discretised into 100 m×100 m runoff generating elements. A semidistributed modelling toolkit (Orellana et al., 2008;Wagener et al., 2004) is used that requires specification of a conceptual runoff generating model and a routing model for each element. A probability distributed soil moisture model together with two parallel 20 linear routing stores (Fig. 4) is selected, as this is perceived to be widely applicable in the UK (Calver et al., 2005;Lamb and Kay, 2004;Lee et al., 2006). This model has five parameters (units are listed in Table 4): c max is the maximum soil water storage capacity within the element, b is a shape parameter defining the stoage capacity distribution, α is a parameter defining the proportion of quick runoff, and k f and k s are routing store 25 residence times. There is a lake upstream of gauge 8 (Figure 4). Its response is simulated using the weir equation Q=kH m , where H is the water level in the lake above the outlet's HESSD Introduction

Tables Figures
Back Close

Full Screen / Esc
Printer-friendly Version Interactive Discussion lowest point, and coefficients k and m are related to the outlet geometry (Montes, 1998) that, following channel measurements , is assumed to be parabolic. Any subsurface flow from the lake is neglected. Other lakes and reservoirs in the catchment are not accounted for, due to their small sizes. Element runoff is routed down the stream network using a constant celerity ap-5 proach, i.e. the water moves with constant velocity (Beven , 1979). Rainfall is assumed uniform across the catchment, while PE is associated with vegetation cover via the MORECS data.
3.3 Case study -conditioning the parameter space for land management impacts analysis 10 Prior ranges for the runoff generating model parameters are given in Table 4. These are liberal ranges for UK catchments (Wagener et al., 2004). To select acceptable parameter sets for each HOST soil class, 10 000 parameter sets are randomly sampled from the five-dimensional prior parameter space using the Latin hypercube method, and the model is run to estimate corresponding BFI for each parameter set. Following

15
Eq. (1), each parameter set is prescribed a weight based on the closeness of the simulated BFI to the corresponding BFI HOST value, producing a posterior parameter distribution for each soil type. Then 1000 parameter sets for each HOST class are independently sampled from the posterior distributions. Thus 1000 possible parameter sets for each of the 29 soil classes define the conditioned parameter space, and no 20 observed flow data has yet been used. The celerity parameter is estimated by fitting the modelled peak flow arrival times to observations. To do so, a celerity value is sampled from the range 0.1 to 1 m/s, and the semi-distributed model is run to generate modelled time of arrival for each flood event. Several samples of peak flow arrival time observed at gauge 6 are used to estimate the 25 best of the sampled celerity values (visual fit). Note that the flow data requirement for parameter estimation is thereby limited to a few samples of peak flow arrival time (and no actual discharge values are used). In catchments which are completely ungauged, 1917 Introduction

Tables Figures
Back Close

Full Screen / Esc
Printer-friendly Version Interactive Discussion the celerity parameter could instead be estimated from channel properties. At the beginning of the simulations the soil moisture stores are assumed to be full, and the fast and slow routing stores are assumed to be empty. To diminish the influence of the initial conditions on predictions, the model is run for a two month "warming up" period before the 1st January 2007. 5 To investigate the effect of afforestation and grazing effects, the conditioning method described by Eqs. (2) and (3) is implemented. Additionally, for afforestation, an interception capacity is added to the model using C int =1 mm, and wet canopy evaporation is assumed to be three times higher than the dry canopy evapotranspiration rate, representative of pine/coniferous forest (David et al., 2005;Stewart and Thom, 1973). The following section shows the restrictions in the parameter space achieved, probabilistic flow predictions, and system response changes due to the land management change (Koren et al., 2004).

Results
The posterior parameter distributions restrict the slow flow residence time k s and runoff 15 partitioning coefficient α (Fig. 5). The low k s values have low posterior probability, and the runoff partitioning coefficient distribution is concentrated around a value of (1 − BFI HOST ). The celerity of 0.7 m/s provided satisfactory peak flow arrival timing, and is used for stream network routing.
These results illustrate that BFI HOST contains information about baseflow response 20 time as well as baseflow proportion, however lacks information about soil moisture capacity and fast flow response. Nevertheless, the results below show that despite this missing information, the method performs well in terms of explaining the observed flow. Other investigations have indicated that soil parameters and fast flow residence times may be usefully estimated from other prior information (e.g. Koren et al., 2004; 25 Lee et al., 2006;Jackson et al., 2009) and/or may be well identified from suitable calibration data sets. Such information would be necessary if uncertainty needed to be HESSD 6,2009 Conditioning rainfall-runoff models for ungauged catchments constrained further. Figure 6 shows discharge predictions for January 2007, which is representative of the 3 month evaluation period. Here, the medium grey and dark grey areas represent the 90 and 50% confidence intervals on the discharge prediction; and the black dots are observed data points. The light grey area represents 90% intervals of prior uncertainty, 5 when there is no distinction due to soil type and land use, and parameters are assigned uniformly across the catchment. The dashed lines show the range of flows within which the streamflow gauge was calibrated and considered accurate , so that the data points lying in the range could be considered as being more reliable than the points lying outside. Note that gauge 10 rating curve is 10 estimated using flows up to 1.5 m 3 /s only. The Nash-Sutcliffe statistics for probabilistic predictions and expected values, as defined in Sect. 2, are summarized in Table 5. The performances achieved together with Fig. 6 support the view that BFI HOST is an effective response index, and therefore changing the distribution of BFI HOST values (Eqs. 2 and 3) is a viable method of introducing information about land management 15 impacts.
The afforestation and soil degradation scenarios introduce the shifts in the distributions of parameters shown in Fig. 7. Figure 8 shows the predicted impacts of these land management changes on runoff at gauge 10. Here, the solid lines represent the 90% intervals for current conditions and the dashed lines are the corresponding results 20 for full afforestation and soil degradation. The peak flow distributions provide quantitative information which reflects our basic prior knowledge. The uncertainty in the peak flow is high compared to the expected changes, indicating that there would be benefits in seeking some more information about the model parameter values particularly, perhaps, k f . The relative differences between the median and mean results associated 25 with the scenarios are given in Table 6. This includes changes in total runoff within the period 1st January 2007-31st March 2007 and changes in the highest observed peak flow during the period (on the 18th January). The percentage reductions in total runoff, when rounded to the closest integer, were the same for all gauges (only this single HESSD Introduction

Conclusions
References Tables  Figures   Back  Close Full Screen / Esc

Printer-friendly Version
Interactive Discussion number for each land use change is shown in the corresponding rows in Table 6). The afforestation delayed the highest peak arrival by 15 min (one simulation time step), and the soil degradation scenario did not show any difference in peak flow arrival time.

Conclusions
This study illustrated a simple method of conditioning hydrological model parameters 5 on prior information in order to simulate runoff under current conditions and future land management scenarios. The prior information about current conditions, in this case, came almost entirely from the BFI HOST index from a national database of soil types. Under the case study catchment (Pontbren, a 12.5 km 2 upland catchment in Wales), the conditioned model was shown to simulate observed flows to an impres-10 sive level of accuracy. Under land management scenarios, new posteriors for BFI HOST and interception losses were introduced based on best available knowledge, providing probabilistic predictions of land management effects on flood runoff. Although the case study looked at catchment-wide land management changes, the model was discretised into a 100 m square grid so that spatially discriminate interven-15 tions may also be explored. In landscapes such as Pontbren, however, effective land management interventions often occur at even smaller scales, for example tree shelter belts of a few meters width (Jackson et al., 2008;Marshall et al., 2009). Other interventions may be closer to the 100 m scale, but do not directly affect BFI HOST , for example tile drainage installation or drain blocking. The methodology presented here does not 20 by itself address these types of issues. To do so, small-scale physics-based models or observations are needed to generate additional information to condition responses (Jackson et al., 2008).
There are also more general limitations of the approach associated with the limited information in the BHI HOST index, which may warrant the inclusion of additional  Lee et al., 2006), although it is notable that they do not achieve the performance against observed data which we achieved in Pontbren using only BFI HOST and time to peak. For gauged catchments, observed data would also normally be used. Using BFI HOST , the lack of information about the soil parameters c max and b is a potential concern. This uncertainty was considered unimportant in the winter period used in the case study, but may become important when considering non-winter flood events where the soil storage capacity will probably have a more significant effect. Clearly, some other prior information about these parameters may be required (e.g. Koren et al., 2004;Jackson et al., 2009). The fast flow residence time k f was also not significantly constrained within our procedure, although in general this parameter is found to be identifiable within regionalisation studies (Lamb and Kay, 2004;Lee et al., 2006). This indicates again that there is some scope for improvement, although care would be needed to properly handle the dependencies between the new information and BFI HOST within Eq. (1). In the case study, there is also a potential concern that ob-15 served data were needed for estimating the channel celerity parameter, although only peak flow arrival times rather than flow time-series. If necessary, the celerity parameter (or more sophisticated hydraulic model) and its uncertainty may be estimated from channel properties (e.g. Koren et al., 2004). The conditioning method has been presented in the context of a UK soil database 20 and UK case study. Extension of the principle of the method to other countries would be straightforward, so long as there is a national regional model which provides a national map of expected values of base flow index (or some other powerful index) plus associated uncertainty. For example, the USA's Soil Conservation Service numbers may be equally as useful as the HOST classification.

25
Despite its limitations, the approach described in this paper has allowed impacts of land use management interventions to be predicted, within a framework which incorporates the key prior knowledge and which has been (partially) validated on observed flow data. Two speculative scenarios were investigated: 1) returning the catchment to HESSD Introduction

Conclusions
References Tables  Figures   Back  Close Full Screen / Esc

Printer-friendly Version
Interactive Discussion a pristine woodland landscape through afforestation and; 2) further degradation of the soil associated with over-grazing. Median values show significant impacts: for example, the changes in the largest observed flood peak at the catchment outlet were 12% reduction with afforestation and 8% increase due to over-grazing.