Subgrid spatial variability of soil hydraulic functions for hydrological modelling

State-of-the-art hydrological applications require a process-based, spatially distributed hydrological model. Runoff characteristics are demanded to be well reproduced by the model. Despite that, the model should be able to describe the processes at a subcatchment scale in a physically credible way. The objective of this study is to present a robust procedure to generate various sets of parameterisations of soil hydraulic functions for the description of soil heterogeneity on a subgrid scale. Relations between Rosettagenerated values of saturated hydraulic conductivity (Ks) and van Genuchten’s parameters of soil hydraulic functions were statistically analysed. An universal function that is valid for the complete bandwidth of Ks values could not be found. After concentrating on natural texture classes, strong correlations were identified for all parameters. The obtained regression results were used to parameterise sets of hydraulic functions for each soil class. The methodology presented in this study is applicable on a wide range of spatial scales and does not need input data from field studies. The developments were implemented into a hydrological modelling system.


Introduction
One of the major challenges in hydrological process modelling is to minimise the discrepancy between model and data scale as described e.g. by Blöschl and Sivapalan (1995) or Hopmans et al. (2002).
State-of-the-art hydrological applications require a process-based, spatially distributed hydrological model.As the first objective, runoff characteristics are demanded to be well reproduced by the model.Despite that, and even for large-scale applications, the model should be able to describe the processes at a subcatchment scale in a physically credible way.Following Blöschl and Sivapalan (1995), hydrological processes that are dominant at spatial scales larger than the smallest calculation unit (hydrological response unit respective elementary grid size) of the model are assumed to be described directly by the model.Small-scale processes below the smallest spatial calculation unit are assumed to be described indirectly by the model, e.g. by calibration.
The simulation of soil water movements and storages can be particularly sensitive with respect to many model outputs (total runoff, infiltration, groundwater recharge, actual evapotranspiration, etc.).Especially the water content of the soil near the surface is a decisive factor for the runoff generation (e.g.Binley et al., 1989a;Beven, 1995;de Roo et al., 1996;Coles et al., 1997;Bronstert and Bárdossy, 1999;Entin et al., 2000;Hasenauer et al., 2009).Further, the parameterisation of field-saturated hydraulic conductivities (K s values, e.g.cm day −1 ) with proxy data is an essential factor for many physically based hydrological models (Gupta et al., 2006).
Hydrological models that rely on one effective (specific) parameterised set of soil hydraulic functions for each soil type may not be able to describe subgrid variation in an adequate way.Therefore, it can lead to a high calibration effort and possibly to an inadequate process description.Bronstert and Bárdossy (1999), for instance, do not recommend averaged (effective) input data.Instead they suggest to use additional stochastic components to consider small-scale heterogeneities.Further, Kirchner (2006) points out that "the key question is not whether models of hydrologic systems should P. Kreye and G. Meon: Subgrid spatial variability of soil hydraulic functions for hydrological modelling be physically based; instead, the question is how they should be based on physics".
Area-wide measured data of basic soil properties or even of soil hydraulic properties are not available for most hydrological model applications at the meso-and macroscale.However, in many cases rough information about the soil (e.g.soil maps) is available on a very coarse spatial resolution (1 : 50 000 at best).Using such rough input data does not allow direct parameterisation of any subgrid variability.In addition, soil maps are already products of regionalised input data.Consequentially, all soil hydraulic parameters based on soil maps can be interpreted (only) as effective parameters.
In this study, the subgrid spatial variability for the parameterisation of soil hydraulic functions will be derived indirectly from soil map information.To achieve this, three statements are formulated and will be discussed below: 1.The spatial variability of saturated hydraulic conductivity of soils on a subgrid scale can be expressed by a lognormal distribution.
2. There are relationships between the saturated hydraulic conductivity and the parameters of soil hydraulic functions.
3. These relationships are mirrored in the parameters generated by the software Rosetta (Schaap et al., 2001).
They can be used to simulate a subgrid spatial variability in a straightforward procedure, which does not require measured samples of soil properties.
The second statement was investigated in several studies as well.However, compared to the first statement, the available studies are less clear.Carsel and Parrish (1988) used approx.3000 measurements of soil textures and bulk densities, which were summarised into 12 major texture classes.They approximated van Genuchten (1980) parameters (VGP) S , R , α and n as well as K s values utilising the empirical regression functions of Rawls et al. (1993) to describe soil hydraulic functions.In a following step, Gaussian distributions for the VGP were approximated by using the Johnson system of transformations.This was done for every VGP independently.After the transformation, high correlations were found between VGP and K s values.In a following study, de Rooij et al. (2004) used approx.140 samples from two layers of an agricultural soil to fit VGP and K s values each.Relationships between the VGP and the K s values were found by means of regression analyses.However, these relationships were considered to be too weak for using the K s values as a direct predictor for the VGP.In the next step, they used these relationships as additional information for estimating probability distribution functions for each VGP.The assumption of K s being lognormal distributed was considered as well.In a study of Li et al. (2007), data were measured experimentally to describe 63 pF curves as well as corresponding K s values, texture information, bulk densities and fractions of organic matter.pF is defined as log 10 values of the absolute soil pressure heads.The model of van Genuchten (1980) was adapted to fit the measured data in order to obtain pF curves.This research found high correlations between VGP, measured texture classes and bulk density as well as weak correlations between measured K s values and bulk densities.No significant correlations were found between K s and the texture of the soil.Regression analyses were not conducted for K s and VGP.However, the other regressions of Li et al. (2007) indicate that there seems to be no significant relationship.Botros et al. (2009) carried out measurements to obtain pF curves for nearly 100 sediment cores.They analysed the dependence among measured K s values and VGP, which were fitted to the measured pairs of the soil water content ( ) and the soil pressure head (h).Significant correlations were found between K s and α, n and S ; K s and R were also correlated, but did not yield significance.A factor all these studies have in common is that any analysis is always based on measured input data of soil properties.Aside from that, rather elaborate numerical simulations were necessary in many cases.As a general note, relationships between the VGP and K s values were found in many studies.
Besides the lack of measured soil samples, the effort of parameterisation by means of sophisticated procedures that often require Monte Carlo applications is very high even for models operating on the hill slope scale.This effort is much higher for large areas and huge timescales as it is usual in e.g.climate change hydrological modelling.Consequently, the use of effective parameter sets and powerful calibration procedures is widespread.On the other hand, some kind of calibration parameters are always needed in hydrological modelling.Based on this, the third (innovative) statement was formulated.Premised on profound analyses of the relationship between Rosetta-generated K s values and VGP for several texture classes, the objective of this study is to consider the subgrid spatial variability of soil hydraulic functions for hydrological modelling by using these relationships.It is worth mentioning that the methodology presented in this study is applicable for a wide range of spatial scales and does not need measured input data from field studies.

Methodology
In this section, we shortly give the required theoretical background in soil physics and statistics.Further, the creation of a database is presented by means of the software Rosetta.The database contains the parameters and K s values for the description of pF curves based on the equations of van Genuchten (1980).In the next step, correlations between the K s values and the parameters of the soil hydraulic functions of the generated databases are analysed.

Soil hydraulic functions
Since the objective of this paper is the consideration of subgrid variability of the parameterisation of soil hydraulic functions at the meso-and macroscale, the model for the description of the soil hydraulic functions has to be determined in the first place.The use of proxy information is one of very few possibilities to parameterise soil hydraulic functions extensively for large hydrological model areas.As the software Rosetta will be used for this application (see Sect. 2.2), the obtained parameters are limited to the model of van Genuchten (1980).However, this model is widely used in hydrological and soil physical disciplines for describing the relation between water content and pressure head in soils: (1) There are synonymic designations for the relationship between water content and pressure head; see Durner and Flühler (2006) for details.In this study, the designation pF curve is used.In Eq. ( 1), (h) denotes the volumetric water content (cm 3 cm −3 ), h (cm) marks the pressure head of the soil, R and S (cm 3 cm −3 ) are defined as the residual and saturated water contents of the soil, whereas α (cm −1 ) n (-) and m (-) are shape parameters of the model.Both shape parameters have a weak physical interpretation.The inverse of α (and also n) is slightly related to the air entry pressure head (however, Eq. (1) has no defined air entry value).The parameter n is connected to the width of the pore size distribution of the soil between S and air entry pressure head.The product mn is related to the width of the pore size distribution of the soil between air entry pressure head and R (Durner and Flühler, 2006;Peters and Durner, 2006).Studies of Wösten and van Genuchten (1988) and van Genuchten and Nielsen (1985) analysed the influence of these parameters on the shape of the modelled pF curve in detail.The parameter m is in most cases approximated as 1 − 1 n , which reduces the flexibility of the model, but enables a closed form expression for the unsaturated hydraulic conductivity by combining Eq. ( 1) with the pore size model of Mualem (1976): with the effective saturation S e (cm 3 cm −3 ) as (3) In general, the absolute values of Eq. ( 2) are scaled by the saturated hydraulic conductivity K s (cm day −1 ).The parameter l (-) can be approximated as 0.5 (Kutílek and Nielsen, 1994;Hillel, 1998).The unsaturated hydraulic conductivity (K( ) respectively K(h)) can either be formulated in dependency of the soil water content as shown in Eq. ( 2), or of the pressure head h.
Table 1.Definitions of the used texture classes.The fractions of sand, silt and clay is processed out of the soil map for Lower Saxony (Boess et al., 2004) and the German soil classification system (Sponagel, 2005).

Parameters for soil hydraulic functions
One objective is to investigate for correlations between Rosetta-generated VGP and K s values.To formulate statistically significant statements, a representative population for the statistical analyses has to be considered.Therefore, a short algorithm was developed to create trios of numbers within a range of 0-100.These trios were randomly generated with the precondition that the sum of each trio has to be 100.The numbers of each trio are assigned to be a percentage fraction of sand, silt and clay.A total of 10 6 fictitious samples of possible compositions of texture fractions were obtained in this manner.All three texture fractions are characterised by the same distribution with an expected value of 33.3 % sand/silt/clay.The large number of generated samples was empirically determined in order to get a representative population for the statistical analyses.The regression results were stable for populations ≥ 10 5 .The number was increased to 10 6 to safeguard validity.The free-of-charge software Rosetta (Schaap et al., 2001) was utilised to estimate the VGP R , S , α and n as well as K s values per sample.It is based on neural network analyses and was calibrated by means of a large database comprised of 2134 soil samples that consists of more than 20 000 pairs of and h in total.For the saturated hydraulic conductivity, 1306 soil samples were available.A total of 235 samples also contained data for the unsaturated hydraulic conductivity function K( ) respectively K(h) including more than 4000 data points (Schaap et al., 1998(Schaap et al., , 2001)).The database UNSODA (Leij et al., 1996;Nemes et al., 2001) contributes significantly to these data points.Additional information about early neural network applications for parameterisation of soil hydraulic functions can also be found in Schaap and Bouten (1996).
The VGP sets (including K s values), obtained with Rosetta using the randomly generated texture compositions as input, are hereafter called database 0. In addition to this database, gradual reductions of database 0 were carried out.These reductions were a result of the evaluation of the regression analyses.Further reasons of the reduction are given in Sect.3. At total four different databases were generated (database 0 and three derivatives of database 0): 1.The complete database 0, which consists of the total of 10 6 VGP sets including K s values.
2. A reduced database 1 based on the condition that K s < 150 cm day −1 .Approx.95 % of the parameter sets of database 0 are still included.
3. A reduced database 2 based on the condition that K s < 150 cm day −1 and R < 9 %.Approx.70.5 % of the parameter sets of database 0 are still included.
4. Several selected databases 3 x : Variant A: subdivision based on natural texture classes according to the soil map of Lower Saxony, Germany.Variant B: subdivision based on soil hydraulic properties.

Generation of databases 3 x , variant A: classification by soil map
The final reductions to databases 3 x were conducted for two reasons: firstly, it is suspected that many grain size compositions in database 0 are unrealistic (e.g. 100 % clay or 50 % clay and 50 % sand) causing the neural network of Rosetta to extrapolate the parameters for these compositions.This may have noisy effects on possible correlations between K s and the VGP.Secondly, the presented approach is tailored to hydrological modelling at the meso-and macroscale without employing measured data.In most cases only rough information about the soil (e.g.soil maps) is available for the model area.For that reason, the database was further reduced to obtain natural texture classes, which can be found in many soil maps.Suitable soil maps (or similar products) are widely available around the world.We used the German soil map of Lower Saxony (Boess et al., 2004); see Fig. 1.Out of this, common natural compositions of grain sizes were isolated from the data sets of database 0 in order to generate databases 3 x (variant A).Abbreviations of the texture classes are defined in Table 1 and were assigned according to the German soil classification system (Sponagel, 2005).A predefined texture class for boggy soils (Hn) is not available.Silty clay (Tu) has similar properties as clayey loam (Lt), therefore these two texture classes (Hn, Tu) are not included in the following analyses.Instead, the texture classes for silty loam (Lu) and pure sand (Ss) were added.These texture classes are not shown in the soil map (Fig. 1).However, both are contained in other soil maps of Germany.Around each texture fraction, a ±5 % boundary in each direction was considered in order to get a representative number of van Genuchten data sets for the regression analyses.Note that in total more than 10 5 parameter sets of database 0 are still  (Boess et al., 2004).
Ls is sandy loam, Lt is clayey loam, S is sand, Sl is loamy sand, Su is silty sand, Tu is silty clay, Ul is loamy silt, Us is sandy silt, Ut is clayey silt; see Table 1.In addition, Hn stands for boggy soils and Wa stands for water bodies (lakes, rivers).
included in the databases 3 x (variant A).The procedure to obtain the VGP and K s values is graphically shown in Fig. 3.

Generation of databases 3 x , variant B: classification by cluster analyses
Twarakavi et al. ( 2010) introduced a procedure to classify soils based on their hydraulic properties.To achieve this, they used the k-means clustering algorithm.The same algorithm was used in this study to subdivide database 0 by means of hydraulic properties.This algorithm is available in MATLAB.We standardised the VGP to avoid scale effects that influence the weightings in a negative way.Minimisation of euclidean distance was applied as objective function.
The number of resulting subdivisions (classes) is freely adjustable.We used 255 different target classes, starting with 2 and going up to 5680 classes.

Regression analyses for soil hydraulic parameters
A flexible exponential regression model is used, since the modalities of the relations between the K s values and the VGP are unknown: where a, b, c and d (-) are fitting parameters and e (-) is Euler's number.The model is adjusted by means of the Levenberg-Marquardt algorithm (Marquardt, 1963).
In addition to the univariate regression model shown above, a multivariate regression will be performed by using a general multivariate model, which can be denoted as where the matrix Y denotes the dependent variables, which are assumed to be correlated among themselves.The matrix X includes the independent variables, the matrix B comprises the fitting coefficients and E gives the matrix of residuals.
The index n denotes the number of samples, d the number of subjects and p the number of predictor variables.
To evaluate the quality of the regressions, the coefficient of determination R 2 is calculated as follows (Sachs, 2004): SSY is the total, RSS is the residual and MSS is the regression sum of squares.By standardisation of MSS with SSY, the coefficient of determination R 2 is obtained.y i denotes a data value and y describes the average of all data values, whereas ŷi symbolises a computed value of the regression model.R 2 ranges from 0 (no relation) to 1 (perfect fit).
For consideration of nonlinearities, Spearman's rank correlation coefficient r spear can be calculated in addition to the coefficient of determination (Sachs, 2004): The values rg(x i ) and rg(y i ), which are sorted into ranks (rg), are the values of the data set and the fitted model with the total number of k. r spear has a range from −1 to 1, whereby 0 denotes no correlation and 1/−1 describe a perfect positive/negative correlation, respectively.

Results and discussion
3.1 Regression analyses 3.1.1Complete database 0 and reduced databases 1 and 2 Regression analyses based on Eq. ( 4) were performed for the database 0 and for each of the reduced databases 1 and 2.
The K s values in relation to the R values resulted in low correlations with R 2 of 0.43.A more structured K s − R relation seems to arise for K s values smaller than 150 cm day −1 and R smaller than 9 %.Consequently, database 0 was reduced to database 2 and R 2 of the regression function, which was computed out of the complete database 0, increased to 0.72.However, to obtain a function on the basis of database 2, new regression analyses were conducted leading to R 2 of 0.74.This function is shown in the first plot of Fig. 5.A similar approach was applied to evaluate K s and S ; no significant correlations were obtained.Because of the high correlations found for K s − R in database 2, the reduction of the database 0 was also applied for S .However, only the range of the K s values was reduced, leading to database 1.In contrast to K s − R , no significant correlations were found between K s and S based on the reduced database; see the second plot of Fig. 5. Low correlations (R 2 = 0.41) were found for the parameter n when using database 0. An even lower fit (R 2 = 0.25) was obtained when reducing database 0 to database 1 as seen in the third plot of Fig. 5.The analysis of K s vs. α shows neither correlations for database 0 nor for database 1 (fourth plot of Fig. 5).
Generally, in some sections of the scatter diagrams there seem to be more connections between the K s values and parameters of the soil hydraulic functions than in other sections.However, these connections are very low and too uncertain for hydrological modelling purposes.A reduction of database 0 to database 1 respectively database 2 had a positive effect on the regression of R only.Apparently, it is not possible to obtain four single regression functions, one for each parameter.

Univariate regression analyses
Regression analyses based on Eq. (4) were performed for each of the natural texture classes.Concerning R , very high R 2 between 0.88 and 0.99 were found for 7 out of the 10 texture classes with an average R 2 of 0.96.The other three classes reached correlations with R 2 lower than 0.5; therefore, these classes were not included in following analyses and applications.Generally, curves with a R 2 lower than 0.5 are not illustrated in the figures and tables.The regression curves of R are exponentially decreasing proportional to decreasing K s values, which physically makes sense.However, we have to keep in mind that van Genuchten's R has no clear physical interpretation and other fitting models for the pF curve actually have no residual water content (see e.g.Rossi and Nimmo, 1994).The high correlations between R and K s may have to be considered as a kind of black box correlation that is valid for the Rosetta-fed van Genuchten model only.
Concerning S , high R 2 between 0.68 and 0.93 were found for five texture classes with an average R 2 of 0.82.The behaviour of these classes can be divided into two groups.Group one includes Lu and Ls, group two includes Us, Sl and Su.The main textural difference of these two groups is the fractional higher clay and lower sand content in group one compared to group two, as seen in Table 1.This has an ef-P.Kreye and G. Meon: Subgrid spatial variability of soil hydraulic functions for hydrological modelling fect on the slopes of the fitted regression models.Group one shows decreasing values of S with increasing K s values, group two behaves the other way round.Assuming higher sand fractions causing higher K s values, the grain size compositions of group one are shifted in the direction to the centre of the texture triangle.This may cause smaller values of S .On the other hand, moving away from the centre of the texture triangle with higher fractions of sand (as for group two) may have the opposite effect of increasing porosity.Both effects are imaginable, however, we do not want to overinterpret the physical impact of van Genuchten's S .
Concerning α, high R 2 values between 0.67 and 0.96 were found for four texture classes with an average R 2 of 0.75.As given in Sect.2.1, the parameter α is weakly related to the inverse of the air entry suction (not to forget that van Genuchten curves have no defined air entry value).In general, without focusing on van Genuchten's model, the entry suction should be higher for fine-grained than for coarsegrained soils.This means that the entry suction should rather decrease with increasing K s than increase.This connection cannot be found for the texture class Lu.That is why this regression (Lu) is not considered in the subsequent analysis.
Concerning n, very high R 2 between 0.63 and 1.00 were found for seven texture classes with an average R 2 of 0.85.Especially for the two sandy texture classes, highly accurate fits were obtained.Under the assumption of n being related to the pore size distribution, many different pore sizes lead to low values of n, whereas many pores with a similar size lead to high values of n.In general, soils that are located near the borders of the texture triangle tend to have a more narrow pore size distribution than soils located in the middle of the triangle.Taking into account that these soils (pure sand, pure silt) may have higher K s values compared to loamy soils, increasing K s may be related to increasing values of van Genuchten's n.Again, we have to be careful not to overstretch connections of Rosetta-generated VGP to measurable physical properties of soils.
All statistical quality values from the univariate regression analyses are listed in Table 2. Additionally, p values are included.Low p values indicate a correlation between K s and the parameters of the soil hydraulic functions.All p values of Table 2 are nearly 0, yielding that all shown correlations are significant.Further, the square of r spear yields approximately R 2 for most cases.This seems to validate R 2 as a quality criterion for the regression analyses.

Multivariate regression analyses
Regression analyses based on Eq. ( 5) were performed for each of the natural texture classes.We used log 10 (K s ) to fill the matrix X.The matrix Y comprises R , S , n and α.These more elaborate procedures, which consider the correlations among the dependent variables, serve as references for the previous results.Both the shape of the obtained fits of the multivariate method and the R 2 turned out to be very similar to those of the univariate method.The average R 2 both for the univariate and multivariate method was ∼ 0.835.The shapes of the functions differ just slightly or are even identical.curves behave very similarly with small differences at high K s values.However, R 2 are equal to each other and a better fit cannot be pointed out.All other comparisons between the regression results of the two methods act similar to Fig. 7.The high accordance of both methods' results speaks for the robustness of the less elaborate univariate method.Based on this, the results of the univariate regression analyses will be used for further applications.

Results of the subdivision
Figure 2 shows subdivisions of the soil texture based on soil hydraulic properties by means of cluster analyses for a number of 31 classes.Results of Twarakavi et al. (2010) showed that the subdivisions based on soil hydraulic properties are similar to the US texture-based classification, especially for coarse-textured soils (sands).These similarities were not found for fine-textured soils.The results of our subdivision based on soil hydraulic properties are unlike the texturebased classification.However, this is not directly a contradiction to Twarakavi et al. (2010).They used the US texture triangle for comparison and we used the German classification.In addition to that, the rules and conditions for the al- gorithm of the cluster analyses have a high influence on the result.

Univariate regression analyses
In variant B, we concentrate on univariate regression analyses only.In Fig. 4 the average R 2 are shown in dependency of the number of classes used for the subdivisions.As previously, regression results with R 2 lower than 0.5 are not considered.The abscissa is limited to a maximum of 200 classes.If more classes are used, the average R 2 does not increase significantly.The average R 2 ranges therefore mainly between 0.7 and 0.8.If we use 31 classes, which is the same number of subdivisions as the texture-based classification of the German soil classification system, the average R 2 is 0.74 and 40 % of the regression results have coefficients of determination higher than 0.5.The maximum can be found for the number of 2128 classes (R 2 = 0.82 with 49 % of the regression results with > 0.5).The results of the regression analyses based on databases 3 x (variant A) yielded a total R 2 of 0.88 by using nine natural texture classes and 67 % of the regression results had an R 2 > 0.5.In addition, the application of the univariate method is faster and less elaborative.For those reasons, we will use the results of the regression analyses based on databases 3 x (variant A) for further applications.

Applications on soil hydraulic functions
Figure 8 illustrates the impact of the regression results that were obtained by the univariate method of databases 3 x (variant A) on van Genuchten's soil hydraulic functions for the texture classes S, Su and Lu.These three texture classes are assigned to be representative for all classes that were investigated.In addition, a wide range of K s values is covered.K s values were selected ranging from the minimum to the maximum values that were obtained out of database 3 x (variant A).The pF curves of the texture class S are shown in Fig. 8a.Van Genuchten's n was computed out of the regression function.The pF curve of the regression with the smallest K s value has a clearly smoother slope compared to the pF curve that was obtained for the largest K s value.The lower the K s the more moves the shape of the pF curves in the direction of typical pF curves for sandy soils with a fraction of silt.The curves for low K s values tend to have a higher usable field capacity possibly leading to higher rates of transpiration in hydrological modelling applications.The curves for the unsaturated hydraulic conductivity K(h) of the texture class S are given in Fig. 8d.The same parameters as for the pF curves were used.Near saturation the curves of large K s values are above the curves of low K s values.This relation changes after an intersection point at pF of ∼ 2, caused by the variation of van Genuchten's n that is directly connected to the parameter m.From the physical point of view, the shapes of the curves can be described as reasonable.The curves with lower K s values have a higher fraction of small pores.These fractions of small pores are able to transport water for a wider range of pF in contrast to the curve parameterisations with high K s values.This leads to the intersection point that changes the dominating impact factor on the conductivity curves: for pF < 2, the K s value, which simply scales the curve, is the dominating factor.For pF > 2, van Genuchten's m is the dominating impact factor.However, after the intersection point K(h) is already at very low values.Therefore, the variation of m for sandy soils may have a small impact compared to the impact of variations of the K s values.
Figure 8b shows the impact of the regression results on the pF curves of the texture class Su.Similar to Fig. 8a, the curves for low K s values have a smoother slope.In addition to that, the modifications of van Genuchten's α causes the water content to drop at higher pF values for the curves of low K s values compared to the curves of high K s values.This behaviour is typical for texture classes that have a slightly larger fraction of fine pores than the standard Su.The usable field capacity is more or less the same for all pF curves.The impact on hydrological model applications might nevertheless be immense depending on the method that reduces the potential evapotranspiration to the actual one: methods based on the actual water content of the soil within the root zone probably calculate higher rates of actual evapotranspiration using the parameterisation based on low K s values than using the ones of higher K s values.On the other hand, methods based on pF values of the soil are expected to be less affected.The impacts on the conductivity curves for the texture class Su are plotted in Fig. 8e.Here again, an intersection point can be located (at a pF of ∼ 1.8).Above this pressure head, the curves of high K s values drop below the curves of small K s values.In contrast to the conductivity curves of the texture class S, the values of K(h) at the intersection point (and close below) are still high enough to enable a water movement that is not negligible.For that reason, soil water simulations are influenced, especially during dry seasons.
The pF curves for the texture class Lu are visualised in Fig. 8c.Here, a shift on the ordinate can be observed, whereas the curves for low K s values induce higher water contents than the curves for high K s values for the same pressure head.This is due to the relation that was found for Lu of R and S being inversely proportional to K s .However, the variations of n cause different slopes of the curves.The impact on the reduction of the potential evapotranspiration is comparable to the impact described for the texture class Su.The impact on K(h) is primarily driven by the variations of the K s values, as seen in Fig. 8f.The intersection point is approximately at pF 4. At this high pF , K(h) has dropped magnitudes below the saturated value.
It can be summarised that the modifications of the VGP caused by the regression results of the databases 3 x (variant A) lead to plausible pF curves.Further, the impact on the conductivity functions near saturation is primarily driven by the value of K s .As the K s value works as a scaling factor for the conductivity curves, this result is no surprise and not induced by the regression functions.For medium and low saturations, however, the impact is dominated by the variations of the parameterisations of the soil hydraulic functions that were produced by the regression functions.Especially for the texture Su (and similar ones), the impact of the regression functions will have an impact on long-term hydrological model applications.Taking the soil map of Lower Saxony for instance, texture classes with compositions, like Su, Sl or similar classes, occupy more than one-third of the total area.For many of the texture classes, all four VGP could be fitted in dependency of K s .However, this did not always work as seen in Table 2. Following this, the correlation matrices of the VGP, generated within the regression analyses of databases 3 x (variant A), were taken into account more deeply.It turned out that correlations were very low between VGP that are related to K s and VGP that are not related to K s .These findings indicate the admissibility of fitting less than four VGP in dependency of K s .

Generating subgrid spatial variability
Spatial resolutions of hydrological models mainly depend on the resolutions of the input data of soil properties and land use, respectively.These input data are often not equally resolved in space and time (e.g. the German ATKIS database).If the model area is subdivided into polygons by the hydrological model, the spatial resolution is unequally distributed and given automatically by the input layers.If the model area is subdivided into raster cells, the spatial resolution is equally distributed and depends both on input layers and on the user's interests.For latter types of models, the spatial resolution may often induce a pseudo-accuracy, because the chosen grid size can be much smaller than most of the subdivisions of the input layers.In any case, the real spatial resolution of a hydrological model that has to be considered for the process description is given by the spatial resolutions of the input data.In most cases these spatial resolutions are rather coarse, causing many processes that are not directly resolved by the model.
To consider the spatial variability of soil water processes that are not directly resolved by the hydrological model, the following procedure is elaborated in order to generate parameterisations of soil hydraulic functions: 1. Acquisition of a soil map for the model area (or similar information).In this study, a German soil map of Lower Saxony is used; see Fig. 1.If not already included in the soil map, soil information has to be transformed into texture information.This study included usage of the German soil classification system; see Sponagel (2005).
2. Obtaining texture classes out of the soil map.For example, Sl with 65 % sand, 25 % silt and 10 % clay (see Table 1).
3. Random generation of trios of numbers within a range of 0-100 with the precondition that the sum of each trio has to be 100.The numbers of each trio are assigned to be a percentage fraction of sand, silt and clay.
5. Generation of VGP sets with the software Rosetta for the obtained texture classes (categories).
6. Regression analyses between K s values and all other VGP for each texture class.
The total number of needed randomly generated numbers (point 3) may differ in dependency of the texture classes that are going to be analysed.The Rosetta underlying databases have more samples of sandy soils than of clayey soils (Leij et al., 1996;Nemes et al., 2001).Furthermore, some combinations in the texture triangle are very seldom in nature.To ensure that these disagreements do not bias the regression results, only close-range (± boundary) near-natural occurring texture classes that are obtained from soil maps should be considered for the regression analyses (here: generation of database 3 x (variant A); see Sect.2.2).The boundary was assigned to be ±5 % in order to get a representative number of VGP sets for each texture class.Other values for the boundary were tested, whereby much lower values (e.g.±1 %) lead to a very close range of the K s values.Much higher values for the boundary (e.g.±10 %) blurred the VGP sets of the texture classes (there was no difference left between certain texture classes).Therefore, we recommend a value of ±5 % for the boundary.At the next step, the obtained regression functions have to be applied in a hydrological model.The following procedure is recommended: 1. Assumption of a lognormal distributions for the K s values of each texture class.The mean values are given by the K s values that were obtained with Rosetta at the centre of each texture class.The standard deviations are given by the user.
2. Calculation of variations of the other VGP by using the regression functions and the K s distribution functions.
The number of VGP sets is up to the user.At least three sets should be used.We recommend five sets by using the 10th, 30th, 50th, 70th and 90th percentiles of the K s distribution function.More sets are possible.
3. Run the model by parallelly using the VGP sets that were obtained at the previous point 2.
Due to the fact that standard deviations of the K s values are in most cases unknown for meso-and macroscale hydrological model applications, this parameter should be assumed by the user.Note that this is the only tuning parameter needed for the procedure presented in this study.The standard deviations of K s values at field scale may vary between less than 50 % and several hundred percent and there seem to be no clear correlations to the texture classes of the analysed soils; see e.g.Ciollaro and Romano (1995), Reynolds and Zebchuk (1996), Bosch and West (1998), Mohanty and Mousli (2000), Gupta et al. (2006) or Sobieraj et al. (2002).
The range of the standard deviation that should be used is indirectly given by the minimum and the maximum K s values that were obtained out of database 3 x (variant A).Assuming a specific standard deviation, the 10th and 90th percentiles of the resulting K s distribution still have to be within the range of K s values given in database 3 x (variant A).If so, the hydrological model is ready to start the simulation.If not, the regression function should either be restricted to the range of K s (this is recommended) or the standard deviation should be forced to a maximum value by the model.After fulfilling this condition, the hydrological model is ready to start.A possibility to effectively process the VGP sets within the hydrological model is given in point 2 of the above list.We recommend to use at least three different VGP sets per soil to describe the spatially variability.However, more sets can be used likewise.It is possible to simulate the soil water movement for all VGP sets parallel in one simulation run of the hydrological model.Note that vertical information about soil profiles, if available by the soil map, can be handled with the same procedure as described so far.Hence, the spatial variability of soil hydraulic functions can either be described as horizontal (if just texture classes without any vertical profile information is available) or horizontal and vertical (if soil profile information is also available).
These presented developments were implemented into the hydrological modelling system PANTA RHEI (Förster et al., 2012;Förster, 2013;Kreye et al., 2010Kreye et al., , 2012;;Kreye, 2015) and were used successfully in many practical applications and projects (e.g.Hölscher et al., 2014;Wurpts et al., 2014;Kreye, 2015).PANTA RHEI has been developed by the Department of Hydrology, Water Management and Water protection, Leichtweiss Institute for Hydraulic Engineering and Water Resources, University of Braunschweig in cooperation with the Institut für Wassermanagement IfW GmbH, Braunschweig (LWI-HYWAG and IFW, 2012).It is a deterministic, semi-distributed, physically based hydrological model for single events or long-term simulations.The temporal discretisation is adaptive; for many applications an hourly time step is used.The spatial discretisation is divided into three levels: HRUs (hydrologic response units), subcatchments and gauged catchments.Watersheds are the basis for the subcatchments, which contain the HRUs.This spatial discretisation makes the model very flexible to account for differences in scale of the input data, similar to the mHm model of Samaniego et al. (2010).A difference between our hydrological model PANTA RHEI compared to many other models is the low number of model parameters that are used for calibration.We work with catchment-based model parameters, which have different effects on the subcatchment scale controlled by physiographic characteristics.This leads to (only) 6-8 model parameters in total to calibrate the model for an area of a many hundred square kilometres.
The structure of the soil model of PANTA RHEI is shown in Fig. 9. Different parameterisations of VGP (e.g. 5) are established by means of lognormal distributions of K s .After the sets of VGP are derived, we use all of them to parameterise the soil model.As mentioned, we assume that one effective set of VGP cannot express subgrid variability.Secondly, we assume that many different sets of VPG are able to do so.That is why the soil model is parameterised many times, whereby the structure and equations were not changed.These different models (domains) operate individually.However, they are connected to each other.Summarised, it can be argued that we do not have multiple model scenarios -it is one model with multiple parameterisations solved simultaneously.The impact of the subgrid parameterisation of the soil hydraulic functions are dominated by the variation of K s in wet periods and by the variation of VGP in dry periods.Furthermore, the parameterisations have a feedback on the reduction of evapotranspiration that can be related to the pressure head of the soil (Feddes et al., 1976).The developed soil model is innovative regarding concept, interfaces and parameterisation.The model structure provides the required interfaces for calibrations made at runoff, soil moisture and/or groundwater level.Therefore, the demand for an automated optimisation procedure arises through the multivariable examination of the system and its new complexity.A pioneering lexicographical strategy of optimisation was developed, using the model interfaces connected to modern data types (Gelleszun et al., 2015;Kreye, 2015).To account for the impact of the subgrid parameterisation, we compared breakthrough curves (1-D) with different numbers of VGP sets and with different standard deviations of the K s distribution functions.We also compared spatially distributed simulation results of the hydrological model for soil moisture with remotely sensed satellite data (ERS1/2-ESCAT, MetOp-ASCAT, ENVISAT-ASAR).The simulated soil water contents turned out to have high accordance with the satellitebased soil moisture.In addition to that, the model was able to approximate the dynamics of groundwater level with very high quality compared to measured data (Kreye, 2015).Another possibility to account for subgrid variability is to analyse the standard deviation of soil moisture as a function of the number of applied VGP sets.Further, the spatial soil moisture patterns could be compared in dependence of the number of applied VGP sets, similar to Samaniego et al. (2010).We are working on a following paper focusing on the hydrological model and its calibration.

Conclusions
The objective of this study was to present a robust procedure to generate various sets of parameterisations of soil hydraulic functions for the description of soil heterogeneity on a subgrid scale.To achieve this, relations between K s values and van Genuchten's parameters of soil hydraulic functions were investigated.The VGP were obtained with the software Rosetta.An universal function that is valid for the complete bandwidth of K s values could not be found.After concentrating on natural texture classes, strong correlations were identified for all parameters.The results of the numerical study presented here confirm the findings of field studies (Li et al., 2007;Botros et al., 2009).The methodology presented in this study is applicable on a wide range of spatial scales and does not need input data from field studies.Zhu and Mohanty (2002) tried to find effective parameters for van Genuchten's soil hydraulic functions within a numerical study.They conclude that it is very difficult to define a single set of effective parameters that lead to suitable simulation results.In order to avoid effective parameters, the assumption of a parameterisation of soil hydraulic functions in dependence of K s , as presented in this study, is a promising alternative.Therefore, regression functions have to be set up a priori to the hydrological modelling.This is done in a much shorter time than the time needed for acquisition and preparation of other input data for a large-scale hydrological model.Further, the procedure is robust in application and additional data (and costs) are not required.When using Rosetta, a soil map of the modelling area is sufficient.
Our methodology can be connected to the work of Wösten et al. (1985Wösten et al. ( , 1986)), Carsel and Parrish (1988) and de Rooij et al. (2004).Wösten et al. (1985Wösten et al. ( , 1986) ) successfully elaborated a procedure to regionalise soil hydraulic properties on the total model area by using measurement point data (for different soil profiles) and soil maps.However, in contrast to our work, they needed measurement data and their modelling area was very small (a few hundred hectares) compared to meso-and macroscale hydrological model areas with several thousand square kilometres.Besides texture data, they used additional soil properties like bulk density or organic matter.The sophisticated methods for the consideration of subgrid variability presented by Carsel and Parrish (1988) and de Rooij et al. ( 2004) may be difficult to implement for hydrological modelling, because of needed measurement data (again).However, for future work, it might be interesting to feed their methods with Rosetta-generated input data.
It is worth discussing the applicability of transferring Rosetta results to a distributed hydrological model.An interchange of parameters between different models can be cumbersome.This was e.g.found by Koch et al. (2016) by using the model HYDRUS 1-D to fit VGPs, which were passed to several hydrological models (MIKE SHE, HydroGeoSphere and ParFlow-CLM).The fitting in HYDRUS 1-D was done by means of continuously measured time series of soil moisture at different locations and depths.HYDRUS 1-D also incorporates a Rosetta interface, but here inverse modelling was used to fit VGP.To parameterise the hydrological models, Koch et al. (2016) homogeneously used the same VGP at every spatial location.In a second (heterogeneous) scenario they used spatially differentiated porosity (saturated water content), but all other VGP were still homogenously distributed.Hence, they nicely concluded that "future work must focus on other possibilities to further distribute the remaining VGP parameters".One possibility to achieve this on the mesoscale is what we introduced in our study.However, we do not use another model (like HYDRUS 1-D) to estimate VGP by means of inverse modelling.Besides the need of measured input data, it is a challenge to regionalise the obtained (1-D) results of a model like HYDRUS 1-D to a spatial fully distributed hydrological model.As Rosetta is based on neural network analyses, it serves as a pedotransfer function for the estimation of VGP and K s .Data with different levels of detail can be used as input, starting with texture classes and going up to more detailed (experimentally determined) information.However, Rosetta does not fit VGPs and K s by means of measured time series of e.g.soil moisture or pressure head.Hence, we again want to point out that Rosetta has to be defined as "pedotransfer function" rather than using the term "model".Compared to point measurements of VGP, Rosetta is not always capable to perform a perfect fit; see e.g.Pandey et al. (2005), Li et al. (2007) or Ghorbani Dashtaki et al. (2010).However, considering the huge sizes of model areas that are common for hydrological model applications, Rosetta is a good choice to generate parameters covering the complete area.

Figure 1 .
Figure1.Soil map of Lower Saxony, Germany(Boess et al., 2004).Ls is sandy loam, Lt is clayey loam, S is sand, Sl is loamy sand, Su is silty sand, Tu is silty clay, Ul is loamy silt, Us is sandy silt, Ut is clayey silt; see Table1.In addition, Hn stands for boggy soils and Wa stands for water bodies (lakes, rivers).

Figure 2 .
Figure 2. Subdivision of the soil texture by means of cluster analyses based on 31 classes (blue coloured polygons).The classes were divided by similarity of their soil hydraulic parameters (cf.Twarakavi et al., 2010).The subdivisions of the German soil classification system (cf.Sponagel, 2005) are overlayed with white lines.

Figure 3 .
Figure 3. Procedure to obtain van Genuchten (VG) parameters and the saturated hydraulic conductivity (K s ) values based on soil map information.The software Rosetta is based on neural network analyses and generates van Genuchten parameters and K s values out of soil texture information.

Figure 4 .
Figure 4. Average coefficient of determination (R 2 ) in dependency of the number of classes used for the subdivisions based on soil hydraulic properties by means of cluster analyses.The average R 2is calculated out of the R 2 of all classes for each case.For this calculation, only classes with R 2 > 0.5 were considered.In addition to that, the range of R 2 is shown.The range was calculated out of the maximum and minimum R 2 of the individual classes.

Figure 5 .Figure 6 .
Figure5.Scatterplots of the van Genuchten parameters ( R , S , n, α) in dependency of the saturated hydraulic conductivity (K s ).Visualised is database 1 ( R − K s ) and database 2 ( S -K s , n − K s and α − K s ).A regression function with a coefficient of determination (R 2 ) of 0.74 was fitted between R and K s .Furthermore, a regression function with an R 2 of 0.25 was fitted between n and K s .S − K s as well as α − K s showed no correlation.

Figure 7 .
Figure7.Scatterplot of the van Genuchten parameter n in dependency of the saturated hydraulic conductivity (K s ) for the texture class Su (silty sand) out of database 3 x (variant A).To compare the univariate and multivariate regression, both functions are shown in the graph.R 2 is the coefficient of determination.

Figure 8 .
Figure8.Impact on the pF and K(h) curves due to the univariate regression functions out of database 3 x (variant A). pF is log 10 of the absolute pressure head h.K(h) is the hydraulic conductivity in dependency of pressure head.= volumetric water content.The minimum and maximum saturated hydraulic conductivities (K s ) were given by Rosetta.The van Genuchten parameters were changed in dependency of K s by means of the regression functions.(a) pF curves for the texture class S (sand).(b, c) The same as shown in (a), but for the texture classes Su (silty sand) and Lu (silty loam).(d) Hydraulic conductivity curves for the texture class S. (e, f) The same as shown in (d), but for the texture classes Su and Lu.

Figure 9 .
Figure 9. Application of different van Genuchten parameter sets on the soil model of the hydrological modelling system PANTA RHEI.The different parameterisations (domains) are parallel used at all spatial locations.The domains are solved simultaneously and with interaction to each other.The main input is given by the spatial precipitation (P ), which was reduced in advance by vegetational interception.Results of the soil model are the direct runoff (P eff,D ), the groundwater recharge (P eff,GW ), which leads to base flow in a long-term view and actual evapotranspiration (ET).

Table 2 .
Obtained coefficients of determination (R 2 ), Spearman correlation (r spear ) and belonging p value (p) as well as the sample size (Samples) for the regressions between the K s values and the soil hydraulic parameters for each texture class.Lu is silty loam, Ls is sandy loam, Ut is clayey silt, Ul is loamy silt, Us is sandy silt, Sl is loamy sand, Su is silty sand, S is sand, Ss is pure sand.