Picturing and modeling catchments by representative hillslopes

This study explores the suitability of a single hillslope as a parsimonious representation of a catchment in a physically based model. We test this hypothesis by picturing two distinctly different catchments in perceptual models and translating these pictures into parametric setups of 2-D physically based hillslope models. The model parametrizations are based on a comprehensive field data set, expert knowledge and process-based reasoning. Evaluation against streamflow data highlights that both models predicted the annual pattern of streamflow generation as well as the hydrographs acceptably. However, a look beyond performance measures revealed deficiencies in streamflow simulations during the summer season and during individual rainfall–runoff events as well as a mismatch between observed and simulated soil water dynamics. Some of these shortcomings can be related to our perception of the systems and to the chosen hydrological model, while others point to limitations of the representative hillslope concept itself. Nevertheless, our results confirm that representative hillslope models are a suitable tool to assess the importance of different data sources as well as to challenge our perception of the dominant hydrological processes we want to represent therein. Consequently, these models are a promising step forward in the search for the optimal representation of catchments in physically based models.


Introduction
The value of physically based hydrological models has been doubted (e.g., Beven, 1989; Savenije and Hrachowitz, 2016) since their idea was introduced by Freeze and Harlan (1969). Physically based models like MikeShe (Refsgaard and Storm, 1995) or CATHY (Camporese et al., 2010) typically rely on the Darcy-Richards concept for soil water dynamics, the Penman-Monteith equation for soil-vegetation-atmosphere exchange processes and hydraulic approaches for overland flow and streamflow. Each of these concepts is subject to limitations arising from our imperfect understanding of the related processes and is afflicted by the restricted transferability of process descriptions from idealized laboratory conditions to heterogeneous natural systems (Grayson et al., 1992; Gupta et al., 2012).
Nevertheless, the usefulness of physically based models as a learning tool to explore how internal patterns and processes control the integral behavior of hydrological systems has been corroborated in several studies. For example, Pérez et al. (2011) used Hydrogeosphere (Brunner and Simmons, 2012), together with a regularization scheme for its calibration, to infer how changes in agricultural practices affect the streamflow generation in a catchment. Hopp and McDonnell (2009) explored the role of bedrock topography in the runoff generation using HYDRUS 3-D (Simunek et al., 2006) at the Panola hillslope. Coenders-Gerrits et al. (2013) used the same model structure to examine the role of interception and slope in the subsurface runoff generation. Bishop et al. (2015), Wienhöfer and Zehe (2014) and Klaus and Zehe (2011) used physically based models to investigate the influence of vertical and lateral preferential flow networks on subsurface water flow and solute transport, including the issue of equifinality and its reduction. These and other studies (e.g., Ebel et al., 2008; Scudeler et al., 2016) show that physically based models can be set up using a mix of expert knowledge and observed parameters and may be tested against a variety of observations beyond streamflow, such as soil moisture observations, groundwater tables or tracer breakthrough curves. Such studies are, on the one hand, an option to increase our limited understanding of the processes underlying physically based models (Loague and VanderKwaak, 2004), and on the other hand reveal whether a model allows consistent predictions of dynamics within the catchment and of its integral response behavior (Ebel and Loague, 2006).
(R. Loritz et al.: Picturing and modeling catchments by representative hillslopes. Published by Copernicus Publications on behalf of the European Geosciences Union.)
Setting up a classical physically based model in a heterogeneous environmental system is, however, a challenge as it requires an enormous amount of highly resolved spatial data, particularly on subsurface characteristics. Such data sets are rare and only available in rather homogeneous systems or in environmental system simulators such as Biosphere 2 LEO (Hopp et al., 2009). Therefore, it has been a long-standing vision to replace fully distributed physically based models with aggregated yet physically based model concepts, for instance the Hillslope Storage Boussinesq approach (HSB, Troch et al., 2003; Berne et al., 2005), the REW approach (Representative Elementary Watershed, e.g., Reggiani and Rientjes, 2005; Zhang and Savenije, 2005) or different dual-continuum approaches (Dusek et al., 2012). The key challenge in applying these concepts to real catchments is the assessment of a closure relationship, which parametrizes (a) hydrological fluxes (Beven, 2006a) and (b) soil water characteristics in an aggregated effective manner (Lee et al., 2007; Zehe et al., 2006). Furthermore, it is not completely clear whether the entire range of variability in subsurface characteristics is relevant for hydrological simulations (Dooge, 1986; Zehe et al., 2014). There are, however, promising concepts emerging, for example the work of Hazenberg et al. (2016), who recently developed a hybrid model consisting of the HSB model in combination with a 1-D representation of the Richards equation for the unsaturated zone.
Regardless of whether one favors physically based, hybrid or more statistical model approaches, a perfect representation of a hydrological system should balance the necessary complexity with the greatest possible simplicity (Zehe et al., 2014). The former is necessary to avoid oversimplification. The latter attempts to avoid the drawbacks of overparametrization (Schoups et al., 2008). In principle there are two ways one can try to reach this optimum model structure: either by starting with a complex system representation, for instance a full 3-D catchment model, and simplifying the model structure as much as possible, or by starting at the other end of the spectrum, with the most parsimonious model structure, and proceeding towards higher complexity. In conceptual rainfall-runoff models which follow the HBV concept (Bergström and Forsman, 1973) the most parsimonious model structure for simulating the behavior of a catchment is a single reservoir. In the case of physically based models there is more than one starting point. In flatland catchments without dominant lateral flow processes in the soil one might choose a single soil column. This "null model" could be refined into multiple parallel acting columns, to capture variability in vegetation and soil properties. This represents the first generation of land surface components in meteorological models (e.g., Niu et al., 2011) and the first generation of models for the catchment-scale dynamics of nitrate (Refsgaard et al., 1999).
However, in hilly or mountainous terrain the smallest meaningful unit is a hillslope including the riparian zone, because rainfall and radiation input depend on slope and aspect, as well as on downslope gradients which cause lateral fluxes in the unsaturated zone (e.g., Bachmair and Weiler, 2011; Zehe and Sivapalan, 2007). This is the reason why hillslopes are often regarded as the key landscape elements controlling transformation of precipitation and radiation inputs into fluxes and stocks of water (e.g., Bronstert and Plate, 1997), energy (Zehe et al., 2010, 2013) and sediments (Mueller et al., 2010).
The most parsimonious representation of a small catchment in a physically based model could thus be a single representative hillslope. However, the challenge of how to identify such a hillslope has rarely been addressed. This reflects the fact that the identifiability of a representative hillslope has been strongly questioned since the idea was born. For example, Beven (2006b) argues that the hillslope form is not uniquely defined, nor is it clear whether it is the form that matters, the pattern of saturated areas (Dunne and Black, 1970) or the subsurface architecture. The enormous spatial variability of soil hydraulic properties and preferential flow paths in conjunction with process non-linearity are additional arguments against the identifiability of representative hillslope models (Beven and Germann, 2013). Nevertheless, hillslopes act as miniature catchments (Bachmair and Weiler, 2011), which made Zehe et al. (2014) postulate that structurally similar hillslopes act as functional units for the runoff generation and might thereby be a key unit for understanding catchments of organized complexity (Dooge, 1986). Complementarily, Robinson et al. (1995) showed that the behavior of catchments up to the lower mesoscale (5-50 km²) is strongly dominated by the hillslope behavior, and Kirkby (1976) highlighted that in catchments extending up to 50 km² random river networks had the same explanatory power for runoff generation as the real river network. He concluded that as long as river networks are not dominant, the characteristic areas of the catchment hold the key to understanding its functioning.
In this context it is of interest to which extent the parameters of a representative hillslope model can be derived by averaging various structural properties of several hillslopes or plots in a catchment. A promising avenue is to set up the representative hillslope based on a perceptual model which is in turn a generalized and simplified picture of the catchment structure and functioning. This is because perceptual models provide a useful means of facilitating communication between field researchers and modelers (Seibert and McDonnell, 2002) and additionally often represent catchments as hillslope-like cross sections. The general idea to translate a perceptual model into a model structure is not new and has already been applied within a conceptual rainfall-runoff model framework even within the same area (Wrede et al., 2015). The scientific asset of using a physically based model is that the perceptual model provides important information on typical ordinal differences in the hydraulic conductivity of different subsurface strata and the nature and qualitative locations of the dominating preferential flow paths. This information can be implemented in hillslope models in a straightforward manner. The transformation of a qualitative model structure into a quantitative parametrization of the model depends, however, strongly on the chosen hydrological model and the quality and amount of available data.
(Hydrol. Earth Syst. Sci., 21, 1225–1249, 2017, www.hydrol-earth-syst-sci.net/21/1225/2017/)

Objectives and approach
We hypothesize that a single hillslope in a physically based model is a parsimonious representation of a small hilly catchment. The objective of this study is to test this hypothesis in a two-step approach.
- First we derive a qualitative model structure of a representative hillslope from our perception of the dominant processes and the related dominant surface and subsurface characteristics in the catchment.
- In the second step we transform this qualitative model structure into a quantitative model structure without the use of an automatic parameter allocation.
The challenge in deriving a qualitative model structure lies in the separation of the important details from the idiosyncratic ones. This process is to a large extent independent of the chosen hydrological model and is strongly related to the available expert knowledge and quality of the data. The transformation of a qualitative to a quantitative model structure, on the other hand, depends on the chosen model, for example on whether it is based on a 2-D or 3-D hillslope module and on how rapid flow paths are represented. For this reason the objective of our study is not to "sell" our particular model, but to share the way we distilled the quantitative model setups in our target catchments from available data and to evaluate the ability of this parsimonious physically based model to accurately simulate multiple state and flux variables. During the model setup we intentionally avoided using an optimization algorithm to fit the model to the data. Instead, we relied on various available observations, process-based reasoning, and appropriate literature data to conceive our perceptual models and parametrize the representative hillslope models as their quantitative analogs. More specifically, we used geophysical images to constrain subsurface strata and bedrock topography and derived representative soil water retention curves from a large data set of undisturbed soil samples. Furthermore, we used observations from soil pits, dye staining experiments and observed leaf area indices (LAI) for our model parametrization. Finally, we benchmark the hillslope models against normalized double mass curves, the hydrograph and distributed soil moisture and sap flow observations.

Study area, database and selected model
We focus our model efforts on two different catchments, the Colpach and the Wollefsbach, located in the Attert experimental basins in Luxembourg (Fig. 1, Pfister et al., 2000). These sites offer comprehensive laboratory and field data collected by the CAOS (Catchments As Organized Systems) research unit (Zehe et al., 2014). Besides standard hydrometeorological data the model setup is based on (a) observed soil hydraulic properties of a large number of undisturbed soil cores, (b) 2-D electric resistivity profiles in combination with soil pits and augering to infer bedrock topography, and (c) flow patterns from dye staining experiments and soil ecological mapping of earthworm burrows, to infer the nature and density of vertical preferential flow paths. The representative hillslopes for the two catchments were each set up as a single 2-D hillslope in the CATFLOW model (Zehe et al., 2001). The following subsections provide detailed information on the perceptual models and on the water balance of both catchments. We briefly refer to the key data and those parts of the model which are relevant for the quantitative model setup, while the Appendix provides additional details on both.

The Attert experimental basin
The Attert basin is located in the mid-western part of the Grand Duchy of Luxembourg and has a total area of 288 km². Mean monthly temperatures range from 18 °C in July to a minimum of 0 °C in January; mean annual precipitation in the catchment varies around 850 mm (1971-2000) (Pfister et al., 2000). The catchment covers three geological formations, the Devonian schists of the Ardennes massif in the northwest, Triassic sandy marls in the center and a small area of sandstone (Jurassic) in the southern part of the catchment (Martínez-Carreras et al., 2012). Our study areas are headwaters named Colpach in the schist area and Wollefsbach in the marl area. As both catchments are located in distinctly different geologies and land use settings, they differ considerably with respect to runoff generation and the dominant controls (e.g., van den Bos et al., 1996; Martínez-Carreras et al., 2012; Fenicia et al., 2014; Wrede et al., 2015; Jackisch, 2015).

Colpach catchment: perceptual model of structure and functioning
The Colpach catchment has a total area of 19.4 km² and elevation ranges from 265 to 512 m a.s.l. It is situated in the northern part of the Attert basin in the Devonian schists of the Ardennes massif (Fig. 2a). Around 65 % of the catchment is forested, mainly the steep hillslopes (Fig. 2). In contrast, the plateaus at the hilltops are predominantly used for agriculture and pasture. Several geophysical experiments and drillings showed that bedrock and surface topography are distinctly different. The bedrock is undulating and rough with ridges, depressions and cracks (compare the perceptual model in Fig. 3a and the ERT image in Fig. 6b). Depressions in the bedrock interface are filled with weathered, silty materials which may form local reservoirs with a high water holding capacity. These reservoirs are connected by a saprolite layer of weathered schist which forms a rapid lateral flow path on top of the consolidated bedrock. Rapid flow in this "bedrock interface" is the dominant runoff process (Wrede et al., 2015), and the specific bedrock topography is deemed to cause typical threshold-like runoff behavior similar to the fill-and-spill mechanism proposed by Tromp-Van Meerveld and McDonnell (2006). Further indication that fill-and-spill is a dominant process is given by the fact that the parent rock is reported as impermeable, which makes deep percolation through unweathered schist layers into a large groundwater body unlikely (Juilleret et al., 2011). Furthermore, surface runoff has rarely been observed in the catchment, except along forest roads, which suggests a high infiltrability of the prevailing soils (van den Bos et al., 1996). This is in line with distributed permeameter measurements and soil sampling performed by Jackisch (2015). Moreover, numerous irrigation and dye staining experiments highlight the important role of vertical structures in rapid infiltration and subsequent subsurface runoff formation (Jackisch, 2015, Fig. 2b). These vertical preferential flow paths, the saprolite layer on top of the impermeable bedrock, the bedrock topography as well as the absence of a major groundwater body are regarded as the dominant structures for the representative hillslope model (Fig. 3a and c).

Wollefsbach catchment: perceptual model of structure and functioning
The Wollefsbach catchment is located in the Triassic sandy marls formation of the Attert basin. It has a size of 4.5 km² and low topographic gradients, with elevation ranging from 245 to 306 m a.s.l. The catchment is intensively used for agriculture and pasture (Fig. 2c); only around 7 % is forested. Hillslopes are often tile-drained (compare the perceptual model sketch in Fig. 3b). The heterogeneous marly soils range from sandy loams to thick clay lenses and are generally very silty with high water holding capacities. Similar to the Colpach catchment, vertical preferential flow paths play a major role in the runoff generation; their origin, however, is distinctly different between the seasons. Biogenic macropores are dominant in spring and autumn due to the high abundance of earthworms. Because earthworms are dormant during midsummer and winter, their burrows are partly disconnected by ploughing, shrinking and swelling of the soils (Fig. 2d; see also Fig. 4). Soil cracks emerge during long dry spells in midsummer due to the considerable amount of smectite clay minerals in these soils, which drastically increase soil infiltrability in summer (Fig. 4). The seasonally varying interaction of both types of preferential flow paths with a dense man-made subsurface drainage network is considered the reason for the flashy runoff regime of this catchment, where discharge rapidly drops to baseflow level when precipitation events end. This is the key feature that needs to be captured by the representative hillslope model. However, as the exact position of the subsurface drainage network and the worm burrows as well as the threshold for soil crack emergence are unknown, the specific influence of each structure on runoff generation in a hydrological model is difficult to estimate.

Water balance and seasonality
The water balance of the Colpach and Wollefsbach catchments for several hydrological years is presented in Fig. 5 as normalized double mass curves. Normalized double mass curves relate cumulated runoff to cumulated precipitation, both divided by the sum of the annual precipitation (Pfister et al., 2002; Seibert et al., 2016). Annual runoff coefficients in the Colpach catchment vary around 0.51 ± 0.06 among the 4 hydrological years (Fig. 5a). Annual runoff coefficients are smaller in the Wollefsbach catchment than in the Colpach catchment, and vary across a wider range, from 0.26 to 0.46 (Fig. 5b). In both catchments the winter period is characterized by step-like changes which reflect fast water release during rainfall events partly due to rapid subsurface flow. In contrast, the summer regime is characterized by a smooth and almost flat line when vegetation is active. Accumulated rainfall input is not transformed into additional runoff, but is either stored in the system or released as evapotranspiration (Jackisch, 2015). As suggested by Seibert et al. (2016), we used a temperature index model from Menzel et al. (2003) to detect the bud break of the vegetation and to separate the vegetation-controlled summer regime from the winter period in these curves.
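The construction of a normalized double mass curve, as defined above, can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name and the toy series are hypothetical, and the input is assumed to cover exactly one hydrological year at a regular time step.

```python
import numpy as np

def normalized_double_mass_curve(precip, runoff):
    """Cumulative precipitation and runoff, both divided by the total
    precipitation of the period (assumed: one hydrological year)."""
    precip = np.asarray(precip, dtype=float)
    runoff = np.asarray(runoff, dtype=float)
    total_precip = precip.sum()
    if total_precip <= 0:
        raise ValueError("total precipitation must be positive")
    cum_p = np.cumsum(precip) / total_precip
    cum_q = np.cumsum(runoff) / total_precip
    return cum_p, cum_q

# Hypothetical toy series in mm per time step (not observed data).
precip = [10.0, 0.0, 20.0, 5.0, 15.0]
runoff = [4.0, 2.0, 9.0, 3.0, 2.0]
cum_p, cum_q = normalized_double_mass_curve(precip, runoff)

# The end point of the curve is the runoff coefficient of the period.
runoff_coefficient = cum_q[-1]
```

Plotting `cum_q` against `cum_p` gives the curve; steep segments correspond to the step-like winter response and flat segments to the vegetation-controlled summer regime.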

Subsurface structure and bedrock topography
We used hillslope-scale 2-D electrical resistivity tomography (ERT) in combination with augerings and soil pits to estimate bedrock topography in the schist area. Our auger profiles revealed, in line with Juilleret et al. (2011) and Wrede et al. (2015), that the vertical soil setup comprises a weathered silty soil layer with a downwards increasing fraction of rock fragments, which is underlain by a transition zone of weathered bedrock fragments and by non-weathered and impermeable bedrock. Based on a robust inversion scheme as implemented in Res2Dinv (Loke, 2003) and additional expert knowledge, the subsurface was subdivided into two main layers of unconsolidated material and solid bedrock.
The bedrock interface was picked by the 1500 Ω m isoline, as explained in detail in the Appendix. For our study we used seven ERT profiles from the Colpach area (for an example, see Fig. 6b). Due to the very different geological setting in the marl region (high clay content and alternating sedimentary layering), we could not establish a relation between bedrock depth and the electrical conductivity data for this region. Therefore, the available ERT data do not provide information on depth to bedrock for this geological setting and we had to rely on auger profiles to estimate the average soil depth.

Soil hydraulic properties
We determined soil texture, saturated hydraulic conductivity and the soil water retention curve for 62 soil samples in the schist area and 25 in the marl area. Particularly for the soil hydraulic functions, Jackisch (2015) and Jackisch et al. (2016) found large spatial variability, which was neither explained by slope position nor by the soil depth at which the sample was taken (Fig. 7). As our objective was to assess the most parsimonious representative hillslope model, we neglected this variability but used effective soil water characteristics for both catchments instead. These were not obtained by averaging the parameters of the individual curves, but by grouping the observation points of all soil samples for each geological unit and averaging them in steps of 0.05 pF. We then fitted a van Genuchten-Mualem model using a maximum likelihood method to these averaged values (Table 1 and Fig. 7). The Appendix provides additional details on measurement devices and on the dye staining experiments.
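The pooling and averaging step can be illustrated as follows. This is a sketch under stated assumptions rather than the procedure actually used: the function names and toy values are hypothetical, bins are assumed to be centered on multiples of 0.05 pF, and suction is assumed to be in cm (pF = log10 of suction in cm) with the van Genuchten parameter alpha in 1/cm. The fitting step itself is left out; under Gaussian errors a least-squares fit of `van_genuchten` to the averaged points would correspond to a maximum likelihood estimate.

```python
import numpy as np

def average_retention_points(pf, theta, step=0.05):
    """Pool (pF, theta) points of all samples and average theta per pF bin.

    Assumption: each point is assigned to the nearest multiple of `step`.
    """
    pf = np.asarray(pf, dtype=float)
    theta = np.asarray(theta, dtype=float)
    idx = np.rint(pf / step).astype(int)  # bin index of every point
    centers, means = [], []
    for i in np.unique(idx):
        centers.append(i * step)
        means.append(theta[idx == i].mean())
    return np.array(centers), np.array(means)

def van_genuchten(pf, theta_r, theta_s, alpha, n):
    """van Genuchten retention curve with the Mualem constraint m = 1 - 1/n.

    Assumption: alpha in 1/cm, suction h = 10**pF in cm.
    """
    h = 10.0 ** np.asarray(pf, dtype=float)
    m = 1.0 - 1.0 / n
    return theta_r + (theta_s - theta_r) / (1.0 + (alpha * h) ** n) ** m

# Hypothetical pooled observations from several samples (not measured data).
pf_obs = [1.00, 1.01, 1.02, 2.00, 2.01]
theta_obs = [0.40, 0.42, 0.44, 0.30, 0.32]
centers, means = average_retention_points(pf_obs, theta_obs)
```

Averaging the observation points rather than the fitted parameters avoids combining strongly nonlinear parameters such as alpha and n across samples.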

Meteorological forcing and discharge
Meteorological data are based on observations from two official meteorological stations (Useldange and Roodt) provided by the Administration des services techniques de l'agriculture Luxembourg. Air temperature, relative humidity, wind speed and global radiation are provided with a temporal resolution of 1 h, while precipitation data are recorded at an interval of 5 min. Precipitation was extensively quality checked against six disdrometers which are stationed within the Attert basin and by comparing several randomly selected rainfall events against rain radar observations, both using visual inspection. Discharge observations are provided by the Luxembourg Institute of Science and Technology (LIST).

Sap flow and soil moisture data
The Attert basin is instrumented with 45 automated sensor clusters. A single sensor cluster measures inter alia rainfall and soil moisture in three profiles with sensors at various depths. In this study we use 38 soil moisture sensors located in the schist area and 28 sensors located in the marl area, at depths of 10 and 50 cm. Furthermore, we use sap flow measurements from 28 trees at 11 of the sensor cluster sites. The measurement technique is based on the heat ratio method (Burgess et al., 2001); sensors are East 30 Sensors three-needle sap flow sensors. As a proxy for sap flow we use the maximum sap velocity of the measurements from three xylem depths (5, 18 and 30 mm) as recorded by each sensor.

Physically based model CATFLOW
Model simulations were performed using the physically based hydrological model CATFLOW (Maurer, 1997; Zehe et al., 2001). CATFLOW consists of a 2-D hillslope module which can optionally be combined with a river network to represent a catchment (with several hillslopes). The model employs the standard physically based approaches to simulate soil water dynamics, optional solute transport, overland and river flow and evapotranspiration, which were already mentioned in the introduction and are described in more detail in the Appendix. In the following we will only explain the implementation of rapid flow paths in the model, as this aspect differs greatly from model to model.

Generation of rapid vertical and lateral flow paths
Vertical and lateral preferential flow paths are represented as a porous medium with high hydraulic conductivity and very low retention. This approach has already been followed by others (Nieber and Warner, 1991; Castiglione et al., 2003; Lamy et al., 2009; Nieber and Sidle, 2010), and is one of many ways to account for rapid flow paths in physically based models. However, it is important to note that such a macropore representation is obviously not an image of the real macropore configuration given the typical grid size of a few centimeters, but a conceptualization to explicitly represent parts of the subsurface with prominent flow paths and the adjacent soil matrix in an effective way. The approach includes the assumption that preserving the connectedness of the rapid flow network (Fig. 3) is more important than separating rapid flow and matrix flow into different domains.
Implementations of this approach with CATFLOW were successfully used to predict hillslope-scale preferential flow and tracer transport in the Weiherbach catchment, a tile-drained agricultural site in Germany (Klaus and Zehe, 2011), and at the Heumöser hillslope, a forested site with fine textured marly soils in Austria (Wienhöfer and Zehe, 2014). The locations of vertical macropores may either be selected based on a fixed distance or via a Poisson process based on the surface density of macropores. From these starting points the generator stepwise extends the vertical preferential pathways downwards to a selected depth, while allowing for a lateral step with a predefined probability of typically 0.05 to 0.1 to establish tortuosity. Lateral preferential flow paths to represent either pipes at the bedrock interface or the tile drains are generated in the same manner: starting at the interface to the stream and stepwise extending them upslope, again with a small probability of a vertical upwards or downwards step to allow for tortuosity (Fig. 3c and d).
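The generation of vertical flow paths described above can be sketched as a simple random walk on the model grid. The code below is an illustration only and is not the CATFLOW generator: the function name, its arguments and the grid indexing are hypothetical, the macropore density is assumed to be given per metre of hillslope length, and a lateral step is assumed to replace (rather than accompany) a downward step so that each path stays connected cell by cell.

```python
import random

def generate_vertical_macropores(length, depth, dx, dz, density,
                                 p_lateral=0.05, seed=0):
    """Return macropore paths as lists of (ix, iz) grid cells.

    Start points follow a Poisson process along the surface (exponential
    spacing with rate `density` per metre). Each path is extended downwards
    cell by cell; with probability `p_lateral` a step goes sideways instead
    of down, which makes the path slightly tortuous.
    """
    if not 0.0 <= p_lateral < 1.0:
        raise ValueError("p_lateral must be in [0, 1)")
    rng = random.Random(seed)
    nx = int(round(length / dx))
    nz = int(round(depth / dz))

    # Poisson process: distances between start points are exponential.
    starts = []
    x = rng.expovariate(density)
    while x < length:
        starts.append(min(int(x / dx), nx - 1))
        x += rng.expovariate(density)

    paths = []
    for ix in starts:
        iz = 0
        path = [(ix, iz)]
        while iz < nz - 1:
            if rng.random() < p_lateral:
                # Lateral step by one cell, clamped to the grid.
                ix = min(max(ix + rng.choice((-1, 1)), 0), nx - 1)
            else:
                iz += 1
            if path[-1] != (ix, iz):
                path.append((ix, iz))
        paths.append(path)
    return paths

# Hypothetical setup: 20 m long, 1 m deep, 0.1 m cells, 0.5 macropores per m.
paths = generate_vertical_macropores(20.0, 1.0, 0.1, 0.1, density=0.5, seed=1)
```

Lateral paths along the bedrock interface or tile drains could be generated analogously by swapping the roles of the two grid directions and starting at the stream boundary.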

Parametrization of the representative hillslope models

Colpach catchment

Surface topography and spatial discretization
We extracted 241 hillslope profiles based on the available DEM in the Colpach catchment using Whitebox GIS (Lindsay, 2014) following the LUMP approach (Landscape Unit Mapping Program, Francke et al., 2008). Based on these profiles (Fig. 6a) we derived a representative hillslope with a length of 350 m, a maximum elevation of 54 m above the stream, and a total area of 42 600 m². The hillslope has a mean slope angle of 11.6° and faces south (186°), similar to the average aspect of the Colpach catchment. The first step in generating the representative hillslope profile was to calculate the average distance to the river of all 241 extracted hillslope profiles, which equals 380 m. In the next step all elevation and width values of the profiles were binned into 1 m "distance classes" from the river ranging up to the average distance of 380 m. For each class the median values of (a) the elevation above the stream and (b) the hillslope width were derived and used for the representative hillslope profile (Fig. 6a). For numerical simulation the hillslope was discretized into 766 horizontal and 24 vertical elements with an overall hillslope thickness of 3 m. The vertical grid size was set to 0.128 m, with a reduced vertical grid size of the top node of 0.05 m. Grid size in the downslope direction varied between 0.1 m within and close to the rapid flow path and 1 m within reaches without macropores (Fig. 3c). The hillslope thickness of 3 m was chosen to reflect the average of the deepest points of the available bedrock topographies extracted from ERT profiles, which was 2.7 m.
Boundary conditions were set to the atmospheric boundary at the top and the no flow boundary at the right margin. At the left boundary of the hillslope we selected the seepage boundary condition, where outflow only occurs under saturated and no flow under unsaturated conditions. A gravitational flow boundary condition was established for the lower boundary. We used spin-up runs with initial states of 70 % saturation for the entire hydrological year of interest and used the resulting soil moisture pattern for model initialization. This initialization approach was also used for the Wollefsbach catchment.
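The binning of the extracted profiles into distance classes and the per-class medians can be sketched as follows. This is a minimal illustration with hypothetical names and toy values; it assumes that all profile points are pooled into flat arrays of distance to the river, elevation above the stream and hillslope width.

```python
import numpy as np

def representative_profile(distance, elevation, width, max_distance,
                           bin_size=1.0):
    """Median elevation and width per distance class from the river.

    Points beyond `max_distance` are ignored; empty classes are skipped.
    """
    distance = np.asarray(distance, dtype=float)
    elevation = np.asarray(elevation, dtype=float)
    width = np.asarray(width, dtype=float)
    n_bins = int(np.ceil(max_distance / bin_size))
    idx = np.floor(distance / bin_size).astype(int)
    keep = (distance >= 0) & (idx < n_bins)

    centers, med_elev, med_width = [], [], []
    for b in range(n_bins):
        sel = keep & (idx == b)
        if not sel.any():
            continue
        centers.append((b + 0.5) * bin_size)
        med_elev.append(np.median(elevation[sel]))
        med_width.append(np.median(width[sel]))
    return np.array(centers), np.array(med_elev), np.array(med_width)

# Hypothetical pooled profile points (not DEM data).
distance = [0.2, 0.7, 0.9, 1.5, 1.6, 5.0]
elevation = [0.1, 0.3, 0.2, 0.5, 0.7, 3.0]
width = [10.0, 12.0, 14.0, 20.0, 22.0, 50.0]
centers, med_elev, med_width = representative_profile(
    distance, elevation, width, max_distance=2.0)
```

Using the median rather than the mean keeps single unusually steep or wide profiles from dominating the representative shape.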

Land use and vegetation parametrization
According to the land use maps, the hillslopes are mostly forested. As the hilltop plateaus account for only a very small part of the representative hillslope, the land use type for the entire hillslope is set to forest (Fig. 2a). The start and end of the vegetation period were defined using the temperature-degree model of Menzel et al. (2003), which allowed successful identification of the tipping point between the winter and vegetation season in the double mass curves of the Colpach and of the Wollefsbach (compare Fig. 5a and b). We further used observed LAI to parametrize the evapotranspiration routine. However, since only 14 single measurements at different positions are available for the entire schist area and vegetation period, we use the median of all LAI observations from August as a constant value of 6.3 for the vegetation period. To account for the annual pattern of the vegetation phenology we interpolate the LAI for the first and last 30 days of the vegetation period linearly between zero and 6.3, respectively. The other evapotranspiration parameters are displayed in Table 2 and were taken from Breuer et al. (2003) or Schierholz et al. (2000).
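The seasonal LAI course described above amounts to a trapezoid over the vegetation period. A minimal sketch follows; the function name and the start and end days used in the example are hypothetical, since the actual dates result from the temperature-based model and are not given here.

```python
def lai_on_day(day, start, end, lai_max=6.3, ramp_days=30):
    """LAI for a given day of year.

    Zero outside the vegetation period, linear ramps over the first and
    last `ramp_days` days, and the constant `lai_max` in between.
    """
    if day < start or day > end:
        return 0.0
    rise = (day - start) / ramp_days   # 0 at the start, 1 after the ramp
    fall = (end - day) / ramp_days     # 1 before the ramp, 0 at the end
    return lai_max * min(1.0, rise, fall)

# Hypothetical vegetation period from day 110 to day 300 of the year.
lai_mid_ramp = lai_on_day(125, start=110, end=300)
lai_summer = lai_on_day(200, start=110, end=300)
```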

Bedrock topography, permeability and soil hydraulic functions
We used the shape of the bedrock contour line of the ERT image (Fig. 6) to constrain the relative topography of the bedrock interface in the hillslope model as follows. We scaled the 100 m of bedrock topography to the hillslope length of 380 m. We then used the average depth to bedrock from all seven available ERT measurements (2.7 m) to scale the maximum depth to bedrock in our model. We divided the average depth of 2.7 m by the deepest point of the bedrock in Fig. 6b (3.3 m) and used the resulting factor of 0.88 to reduce the bedrock depth of Fig. 6b proportionally at all positions. As a result, the soil depths to the bedrock interface vary between 1 and 2.7 m, with local depressions that form water holding pools. Since no major groundwater body is suspected and no quantitative data on the rather impermeable schist bedrock in the Colpach are available, we use a relatively impermeable bedrock parametrization suggested by Wienhöfer and Zehe (2014, Table 1). It is important to note that due to this bedrock parametrization water flow through the hillslope lower boundary tends to zero. The silty soil above the bedrock was modeled with the representative hydraulic parameters obtained from field samples listed in Table 1. Since there was no systematic variation of hydraulic parameters of the individual soil samples with depth, soil hydraulic parameters were set constant over depth, except for porosity, which was reduced to a value of 0.35 m³ m⁻³ at 50 cm depth to account for the increasing skeleton fraction of around 40 % in deeper soil layers.

Rapid subsurface flow paths
Macropore depths were drawn from a normal distribution with a mean of 1 m and a standard deviation of 0.3 m. These values are in agreement with the mean soil depth and correspond well to the results of dye staining experiments performed by Jackisch (2015) and Jackisch et al. (2016). Additionally, macropores were slightly tortuous, with a probability of a lateral step of 5 %. Since no observations of the macropore density were available, we use a fixed macropore distance of 2 m. The macropore distance was chosen rather arbitrarily to reflect their relative density in the perceptual model and to establish a partly connected network of vertical and lateral rapid flow paths. The vertical flow paths were parametrized using an artificial porous medium with high hydraulic conductivity and low retention properties proposed by Wienhöfer and Zehe (2014, Table 1). The weathered periglacial saprolite layer, which is represented by a 0.2 m thick layer above the bedrock, was likewise parametrized as a porous medium following Wienhöfer and Zehe (2014). The estimated saturated hydraulic conductivity of 1 × 10⁻³ m s⁻¹ corresponds well to the velocities described by Angermann et al. (2016). At this conductivity the Reynolds number remains smaller than 10, implying that flow can be considered laminar and that the application of Darcy's law is still appropriate (Bear, 1972).
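How such macropore depths might be drawn is sketched below in Python. The sketch is illustrative and not the routine used in CATFLOW: the function name, the fixed random seed and the clipping of negative draws to zero are assumptions. The count of 190 macropores follows from placing one macropore every 2 m along the 380 m hillslope.

```python
import random


def sample_macropore_depths(n, mean_m=1.0, sd_m=0.3, max_depth_m=None, seed=0):
    """Draw n macropore depths (m) from a normal distribution.

    Negative draws are clipped to zero and, if max_depth_m is given,
    depths are limited to the local soil depth (both are assumptions).
    """
    rng = random.Random(seed)  # fixed seed keeps the example reproducible
    depths = []
    for _ in range(n):
        d = max(rng.gauss(mean_m, sd_m), 0.0)
        if max_depth_m is not None:
            d = min(d, max_depth_m)
        depths.append(d)
    return depths


# One macropore every 2 m along a 380 m hillslope gives 190 macropores.
depths = sample_macropore_depths(n=380 // 2)
```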

Surface topography and spatial discretization
Since only eight relatively similar hillslope profiles were derived from the DEM in the Wollefsbach, we randomly chose one of those with a length of 653 m, a maximal elevation above the river of 53 m and an area of 373 600 m². The hillslope has a mean slope angle of 8.1° and faces south (172°). The hillslope was discretized into 553 horizontal and 21 vertical elements with an overall hillslope thickness of 2 m (Fig. 3d). The vertical grid size was set to 0.1 m, with a reduced top and bottom node spacing of 0.05 m. Grid size in the lateral direction varied between 0.2 m within and close to the rapid flow paths and 2 m within reaches without macropores (Fig. 3b and d).
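The vertical discretization can be checked with a small Python sketch. It assumes that the 21 vertical elements are layers, with one thinner layer of 0.05 m at the top and one at the bottom; this interpretation and the function name are assumptions, not statements from the model documentation.

```python
def vertical_layer_thicknesses(n_layers=21, inner_dz=0.1, edge_dz=0.05):
    """Return layer thicknesses (m) from top to bottom.

    The first and last layer use the reduced spacing edge_dz,
    all layers in between use inner_dz.
    """
    return [edge_dz] + [inner_dz] * (n_layers - 2) + [edge_dz]


layers = vertical_layer_thicknesses()
total_thickness = sum(layers)  # 2 x 0.05 m + 19 x 0.1 m
```

Under this reading the layers add up to the stated hillslope thickness of 2 m, up to floating-point rounding.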

Land use and vegetation parametrization
Land use was set to grassland within the steeper and lower part of the hillslope, and set to corn for larger distances to the creek (> 325 m).Due to the absence of local vegetation data we used tabulated data characterizing grassland and corn from Breuer et al. (2003).The start and end points of the vegetation period for the grassland and the start point for the corn cultivation were again identified by the temperature index model of Menzel et al. (2003).The vegetation period for the corn cultivation ends at the beginning of October since this is the typical period for harvesting.The intra-annual vegetation dynamics were taken from Schierholz et al. (2000).

Bedrock topography, permeability and soil hydraulic functions
In contrast to the Colpach, geophysical measurements and augerings revealed bedrock and surface to be more or less parallel. Soil depth was set to a constant 1 m and the soil was parametrized using the representative soil retention curves shown in Fig. 7. The bedrock was again parametrized according to the values that Wienhöfer and Zehe (2014) proposed for the impermeable bedrock at the Heumöser hillslope in Austria (Table 1), which is also in a marl geology.

Rapid subsurface flow paths
Based on the perceptual model (Fig. 3b and d) and the reported vertical and lateral drainage structures in the catchment, we generated a network of fast flow paths.The depths of the vertical flow paths were drawn from a normal distribution with a mean of 0.8 m and a standard deviation of 0.1 m.The tile drain was generated at the standard depth of 0.8 m extending 400 m upslope from the hillslope-creek interface.Due to the apparent changes in soil structure either by earthworm burrows or emergent soil cracks (Fig. 4), we used different macropore setups for the winter and vegetation seasons.For the winter setup we implemented vertical drainage structures every 4 m.In the summer setup we added fast flow paths every 2 m to account for additional cracks and earthworm burrows.The positions of the conceptual macropores were selected again arbitrarily to create an image of the perceptual model and to ensure that the soil surface and the tile drain were well connected.Vertical flow paths and the tile drain were parametrized similarly to the Colpach with the same artificial porous medium (Table 1).Boundary conditions of the hillslope, initialization and the spin-up phase were the same as described for the Colpach model.

Model scenarios
Both hillslope models were set up within a few test simulations to reproduce the normalized double mass curves of both catchments for the hydrological year 2014. Within those trials we compared, for instance, setups with and without an arbitrarily selected density of macropores, but we did not perform an automated parameter allocation, as stated above. We chose the normalized double mass curves as a fingerprint of the annual pattern of runoff generation since they are particularly suitable for detecting differences in the inter-annual and seasonal runoff dynamics of a catchment (Jackisch, 2015). Model performance was judged by visual inspection as well as by using the Kling-Gupta efficiency (KGE; Gupta et al., 2009).
In a second step we compared the simulated overland flow and subsurface storm flow across the left hillslope boundary to observed discharge.Water leaving the hillslope through the lower boundary was neglected from the analysis because in both setups the total amount was smaller than 1 % of the overall hillslope outflow.We compared the specific discharge of the hillslopes to the observed specific discharge of the two catchments in mm h −1 by dividing measured and simulated discharge by the area of the catchments and the hillslopes.Our goal was to test whether our hillslope models represented the typical subsurface filter properties which are relevant for the runoff generation in both selected hydrological landscapes (schist and marl areas in the Attert basin).We measured the model performance with respect to discharge, again based on the KGE.Since it is advisable to calculate and display various measures of model performance (Schaefli and Gupta, 2007), we calculated the Nash-Sutcliffe efficiency (NSE; a measure of model performance with emphasis on high flows) and the logarithmic NSE (log NSE; a performance measure suited for low flows).As both catchments are characterized by a strong seasonality, we further separated the simulation period into winter and vegetation periods and calculated the KGE, NSE as well as the logNSE separately for each of the seasons.In addition, we followed Klemeš (1986) and performed a proxy-basin test to check whether the runoff simulation is transposable within the same hydrological landscape and conducted a split sampling to examine whether the models also work in the hydrological year of 2013.Finally, we judged the model goodness visually for selected rainfall-runoff events.
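The conversion to specific discharge and the three performance measures named above can be sketched in Python as follows. The formulas are the standard definitions of NSE and of the KGE of Gupta et al. (2009); the function names, the input unit of m³ s⁻¹ and the small offset eps added before taking logarithms are assumptions of this sketch, not details taken from the study.

```python
import math
from statistics import mean, pstdev


def specific_discharge_mm_per_h(q_m3_per_s, area_m2):
    """Convert discharge (m3/s) to specific discharge (mm/h)."""
    return q_m3_per_s * 3600.0 / area_m2 * 1000.0


def nse(obs, sim):
    """Nash-Sutcliffe efficiency; 1 is a perfect fit."""
    m = mean(obs)
    num = sum((s - o) ** 2 for o, s in zip(obs, sim))
    den = sum((o - m) ** 2 for o in obs)
    return 1.0 - num / den


def log_nse(obs, sim, eps=1e-6):
    """NSE of log-transformed flows; eps avoids log(0) (assumption)."""
    return nse([math.log(o + eps) for o in obs],
               [math.log(s + eps) for s in sim])


def kge(obs, sim):
    """Kling-Gupta efficiency from correlation, variability and bias ratios."""
    mo, ms = mean(obs), mean(sim)
    so, ss = pstdev(obs), pstdev(sim)
    cov = mean([(o - mo) * (s - ms) for o, s in zip(obs, sim)])
    r = cov / (so * ss)   # linear correlation
    alpha = ss / so       # variability ratio
    beta = ms / mo        # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

Applying the same functions to the winter and vegetation subsets of the time series gives the seasonal scores; a simulation that doubles every observed value, for example, has perfect correlation but a KGE of 1 − √2 because both the variability and the bias ratio equal 2.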
In a third step we evaluated the model setups against available soil moisture observations. A natural starting point for a modeling study would be to classify the available soil moisture observations, for instance by their landscape position. However, similar to the case of the soil water retention properties, the small-scale variability of the soil properties seems to be too dominant, as grouping according to hillslope position was not conclusive (Jackisch, 2015; Appendix A4). We therefore extracted simulated soil moisture at 20 virtual observation points at different downslope positions at the respective depths of the soil moisture observations (10 and 50 cm), and compared the median of the simulated virtual observations against the 12 h rolling median of the observed soil moisture using the KGE and the Spearman rank correlation. Finally, we analyzed simulated transpiration of the Colpach model by plotting it against the 3-day rolling median of the daily sap flow velocities observed in the schist area of the Attert basin. As sap flow is a velocity and transpiration is a normalized flow, they are not directly comparable. This is why we normalized both observed sap flow and simulated transpiration by dividing their values by their range and only discuss the correlation among the normalized values. The visual inspection additionally shows to what extent maximum and minimum values of both normalized time series coincide. This cannot be inferred from the correlation coefficient.

Results

Normalized double mass curves and discharge
The hillslope models reproduce the typical shape of the normalized double mass curves, namely the steep, almost linear increase in the winter period and the transition to the much flatter summer regime, very well in both catchments (Fig. 8a and b). In both catchments subsurface flow is the dominant form of simulated runoff, at 99 % in the Colpach and 94 % in the Wollefsbach. The KGEs of 0.92 and 0.9 obtained for the Colpach and the Wollefsbach, respectively, confirm that within the error ranges both double mass curves are explained well by the models. As a major groundwater body is unlikely in both landscapes, a large inter-annual change in storage is not suspected and we hence state that the hillslope models closely portray the seasonal patterns of the water balance of the catchments. This is further confirmed by the close accordance of simulated and observed annual runoff coefficients. We obtain 0.52 compared to the observed value of 0.55 in the Colpach and 0.39 compared to an observed value of 0.42 in the Wollefsbach.
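A normalized double mass curve and the associated runoff coefficient can be computed as in the following Python sketch. It assumes that cumulative runoff and cumulative precipitation are both divided by the total precipitation of the period, so that the last point of the curve equals the runoff coefficient; this normalization, the function name and the example numbers are assumptions made for illustration.

```python
from itertools import accumulate


def normalized_double_mass_curve(precip_mm, runoff_mm):
    """Return (x, y): cumulative precipitation and cumulative runoff,
    both normalized by the total precipitation of the period."""
    total_p = sum(precip_mm)
    x = [p / total_p for p in accumulate(precip_mm)]
    y = [q / total_p for q in accumulate(runoff_mm)]
    return x, y


# Hypothetical seasonal sums in mm, for illustration only.
precip = [10.0, 20.0, 30.0, 40.0]
runoff = [5.0, 10.0, 15.0, 25.0]
x, y = normalized_double_mass_curve(precip, runoff)
runoff_coefficient = y[-1]  # total runoff divided by total precipitation
```

A steep segment of the curve marks a period in which a large share of the rainfall leaves the catchment as runoff, while a flat segment marks a period dominated by storage change and evapotranspiration.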
In addition to the seasonal water balances, both models also match observed discharge time series in an acceptable manner (KGE 0.88 and 0.71; Table 3).A closer look at the simulated and observed runoff time series (Figs. 9 and 10) reveals that the model performance differs in both catchments between the winter and summer seasons.Generally we observe a better model accordance during the wet winter season, when around 80 % of the overall annual runoff is generated in both catchments.In contrast, there are clear deficiencies during dry summer conditions.This is also highlighted by the different performance measures which are in both catchments higher during the winter period than during the vegetation period (Table 3).
The Colpach model misses especially the steep and flashy runoff events in June, July and August, and underestimates discharge in summer. It also misses the characteristic double peaks of the catchment, as highlighted by runoff events 2 and 3 (Fig. 9). Although the model simulates a second peak, this peak either arrives too early (event 2) or its simulated runoff is too small (event 3). This finding suggests that our perceptual model of the Colpach catchment needs to be revised, as further elaborated in the discussion. Another shortcoming is the missing snow routine of CATFLOW, which can be inferred from event 1 (Fig. 9, top left panel). While snow is normally not a major control of runoff generation in the rather maritime climate of the Colpach catchment, runoff event 1 occurred at temperatures below zero and was most likely influenced by snowfall and subsequent snowmelt, which might explain the delay in the observed rainfall-runoff response.
In the Wollefsbach model the ability to match the hydrograph also differed strongly between the seasons (Table 3; Fig. 10). The flashy runoff response in summer is not always well captured by the model, for example for a convective rainfall event with rainfall intensities of up to 18 mm per 10 min in August (Fig. 10, event 2).
In contrast, runoff generation during winter is generally simulated acceptably (KGE = 0.74). Yet the model also strongly underestimates several runoff events in winter (Fig. 10, event 1). As temperatures during these events were close to zero, this might again be a result of snow accumulation, which cannot be simulated with CATFLOW because it lacks a snow and frozen soil routine. It is of key importance to stress that we only achieve acceptable simulations of runoff production in the Wollefsbach when using two different macropore setups for the winter and the summer periods, with a denser 2 m spacing of macropores in summer to account for the emergence of cracks (Fig. 4). When using a single macropore distance of either 2 m (summer setup) or 4 m (winter setup) in the entire simulation period, the model shows clear deficits with a KGE of 0.61 and 0.53, respectively. Furthermore, we are able to improve the performance of the Wollefsbach model if we use values of saturated hydraulic conductivity higher than 1 × 10⁻³ m s⁻¹ for the drainage structures. However, this violates the laminar flow assumption and the application of Darcy's law becomes inappropriate.

Model sensitivities, split sampling and spatial proxy test
Sensitivity tests for the Colpach reveal that the model performance in matching the double mass curves is strongly influenced by the presence of connected rapid flow paths. A complete removal of either the vertical macropores or the bedrock interface from the model domain decreases the model performance considerably (KGE 0.71 or 0.72, respectively). In contrast, reducing the density of vertical macropores by increasing their spacing from 2 m to 3 or 4 m only leads to a slight decrease in model performance (KGE 0.85 and 0.82, respectively). In an additional sensitivity test we changed the bedrock topography from the one inferred from the ERT data to a surface-parallel one, which reduces model performance with respect to discharge (KGE < 0.6).
The temporal split sampling reveals that the representative hillslope model of the Colpach also performs well in matching the hydrograph of the previous hydrological year 2012-2013 (KGE = 0.82).Furthermore, the parameter setup was tested within uncalibrated simulations for the Weierbach catchment (0.45 km 2 ), a headwater of the Colpach in the same geological setting.This again leads to acceptable results (KGE = 0.81, NSE = 0.68).The same applies to the representative hillslope model of the Wollefsbach, which also performs well in matching the hydrograph of the previous year (KGE = 0.7).Furthermore, the parameter setup was tested within an uncalibrated simulation for the Schwebich catchment (30 km 2 ), a headwater of the Attert basin in the same geological setting as the Wollefsbach, and again with acceptable results (KGE = 0.81, NSE = 0.7).

Simulated and observed soil moisture dynamics
We compare the ensemble of soil moisture time series from the virtual observation points to the ensemble of available observations (Fig. 11).In the Colpach, soil moisture dynamics are matched well (Spearman rank correlation r s = 0.83).This is further confirmed when comparing this value to the median Spearman rank correlation coefficient of all sensor pairs (r s = 0.66).However, simulated soil moisture at 10 cm depth was systematically higher than the average of the observations.The predictive power in matching the observed average soil moisture dynamics was small (KGE = 0.43; Fig. 11a).

In contrast to the positive bias, the total range of the simulated ensemble appears, at 0.1 m 3 m −3 , much smaller than the huge spread in the observed time series (0.25 m 3 m −3 ).In line with the model performance in simulating discharge, the model has deficiencies in capturing the strong declines in soil moisture in June and July.Simulated soil moisture at 50 cm depth exhibits a strong positive bias and again underestimates the spread in the observed time series.The predictive power is slightly better (KGE = 0.51), while simulated and observed average dynamics are in good accordance (r s = 0.89).
In contrast to what we found for the Colpach, the ensemble of simulated soil moisture at 10 cm for the Wollefsbach falls into the state space spanned by the observations; it only slightly underestimates the rolling median of the observed soil moisture (Fig. 11c).The predictive power is higher (KGE = 0.67) than in the Colpach, while the match of the temporal dynamics is slightly lower (r s = 0.81).Again the model fails to reproduce the strong decline in soil moisture between May and July.It is, however, interesting to note that the model is nearly unbiased during August and September.This is especially interesting since the Wollefsbach model does not perform too well in simulating discharge during this time period.Simulated soil moisture at 50 cm depth shows similar deficiencies as found for the Colpach, while the predictive power was slightly smaller (KGE = 0.44), and the dynamics is also matched slightly worse (r s = 0.79).
When recalling the soil water retention curves (Fig. 7), one can infer that a soil water content of 0.2 m³ m⁻³ corresponds to a pF of around 3.8 in the Colpach and of around 4.1 in the Wollefsbach. With that in mind, it is interesting to note that some observed soil moisture values are below this threshold throughout the entire year. This is particularly the case for the soil moisture observations at 50 cm depth in the Colpach, where almost 50 % of the sensors measure water contents close to the permanent wilting point throughout the wet winter period. This also holds true for eight sensors at 10 cm depth.

Normalized simulated transpiration versus normalized sap flow velocities
As sap flow provides a proxy for transpiration, we compared normalized, averaged sap flow velocities of beech and oak trees to the normalized simulated transpiration of the reference hillslope model of the Colpach. The 3-day rolling median of the sap flow data stays close to zero until the end of April and starts to rise after the bud break of the observed trees. The Colpach model is able to match the bud break of the vegetation well. Furthermore, simulated transpiration and observed sap flow are in good accordance during midsummer. In the period between August and October the simulations underestimate the observations, while in April and May the simulations are too high (Fig. 12). Nevertheless, the model has some predictive power (KGE = 0.65) and is able to mimic the dynamics well (r_s = 0.75).
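The comparison rests on three simple operations: a rolling median, a normalization by the range of each series, and a Spearman rank correlation. A minimal Python sketch of these steps is given below. The function names, the trailing (rather than centered) window and the handling of ties by average ranks are assumptions of the sketch and are not specified in the text.

```python
from statistics import mean, median


def rolling_median(values, window):
    """Trailing rolling median; the window is shorter at the start."""
    return [median(values[max(0, i - window + 1):i + 1])
            for i in range(len(values))]


def normalize_by_range(values):
    """Divide every value by the range (max - min) of the series."""
    span = max(values) - min(values)
    return [v / span for v in values]


def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def pearson(a, b):
    """Linear correlation coefficient of two equally long series."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))
```

Because the rank correlation is invariant under any monotonic rescaling, the normalization matters only for the visual comparison of the two curves and for range-sensitive measures such as the KGE, not for r_s itself.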

Discussion
The results partly corroborate our hypothesis that single representative hillslopes might serve as parsimonious and yet structurally adequate representations of two distinctly different lower mesoscale catchments in a physically based model. The setups of the representative hillslopes were derived as close images of the available perceptual models and by drawing from a variety of field observations, literature data and expert knowledge. The hillslope models were afterwards tested against streamflow data, including a split sampling and a proxy basin test, and against soil moisture and sap flow observations. From the fact that streamflow simulations were acceptable in both catchments when judged solely on model efficiency criteria, one could conclude that the hillslopes portray the dominant structures and processes which control the runoff generation in both catchments well. A look beyond streamflow-based performance measures revealed, however, clear deficiencies in streamflow simulations during the summer season and during individual rainfall-runoff events as well as a mismatch in simulated soil water dynamics. In the next sections we will hence discuss the strengths and weaknesses of the representative hillslope model approach. More specifically, in Sect. 5.1 we will focus on the role of soil heterogeneity, preferential flow paths and the added value of geophysical images. In Sect. 5.2 we will discuss the consistency of both models with respect to their ability to reproduce soil moisture and transpiration dynamics. Finally, in Sect. 5.3 we discuss whether the general idea to picture and model a catchment by a single 2-D representative hillslope is indeed appropriate to simulate the functioning of a lower-mesoscale catchment.

The role of soil heterogeneity in discharge simulations
By using an effective soil water retention curve, instead of accounting for the strong variability of soil hydraulic properties among different soil cores (Sect.2.2.3), we neglect the stochastic heterogeneity of the soil properties controlling storage and matrix flow.This simplification is a likely reason why the model underestimates the spatial variability in soil moisture time series (compare Sect. 5.2.1).However, our approach does not perform too badly in simulating the normalized double mass curves as well as the runoff generation, at least to some extent, in both catchments.Especially during the winter, when around 80 % of the runoff is generated, runoff is reproduced acceptably well.As our models do not represent the full heterogeneity of the soil water characteristics but are still able to reproduce the runoff dynamics in winter, we reason in line with Ebel and Loague (2006) that heterogeneity of soil water retention properties is not too important for reproducing the streamflow generation in catchments.In this context it is helpful to recall the fact that hydrological models with three to four parameters are often sufficient to reproduce the streamflow of a catchment.This confirms that the dimensionality of streamflow is much smaller than one could expect given the huge heterogeneity of the retention properties.This finding has further implications for hydrological modeling approaches as it once more opens the question on the amount of information that is stored in discharge data and how much can be learned when we do hydrology backwards (Jakeman and Hornberger, 1993).Our conclusion should, however, not be misinterpreted that we claim the spatial variability of retention properties to be generally unimportant.The variability of the soil properties of course plays a key role as soon as the focus shifts from catchment-scale runoff generation to, e.g., solute transport processes, infiltration patterns or water availability for evapotranspiration.

The role of drainage structures and macropores in discharge simulations
By representing preferential flow paths as connected networks containing an artificial porous medium in the Richards domain, we assume that preserving the connectedness of the network is more important than the separation of rapid flow and matrix flow into different domains. The selected approach was successful in reproducing runoff generation and the water balance for the winter period in the Wollefsbach and Colpach catchments. Simulations with a disconnected network, where either the saprolite layer at the bedrock interface or the vertical macropores were removed, reduced the model performance in the Colpach model from KGE = 0.88 to KGE = 0.6 and KGE = 0.71, respectively. We hence argue that capturing the topology and connectedness of rapid flow paths is crucial for the simulation of streamflow release with representative hillslopes. We furthermore showed that a reduction in the spatial density of macropores from a 2 to 4 m spacing did not strongly alter the quality of the discharge simulations. This insensitivity can partly be explained by the fact that several configurations of the rapid flow network may lead to a similar model performance. From this insensitivity and the equifinality of the network architecture (Klaus and Zehe, 2010; Wienhöfer and Zehe, 2014) we conclude that it is not the exact position or the exact extent of the macropores which is important for the runoff response, but the mere existence of a connected rapid flow path (Jakeman and Hornberger, 1993). However, our results also reveal limitations of the representation of rapid flow paths in CATFLOW. For instance, model setups with higher saturated hydraulic conductivities (> 10⁻³ m s⁻¹) of the macropore medium clearly improved the model performance in the Wollefsbach but violated the fundamental assumption of Darcy's law of purely laminar flow. This was likely one reason why capturing rapid flow was much more difficult with the selected approach for the Wollefsbach. Another reason was the emergence of cracks, implying that the relative importance of rapid flow paths for runoff generation is not constant over the year, as highlighted by the findings of dye staining experiments (Fig. 4). Given this non-stationary configuration of the macropore network it was indispensable to use a summer and winter configuration to achieve acceptable simulations. This indicates that besides the widely discussed limitations of the different approaches to simulating macropore flow, another challenge is how to deal with emergent behavior and related non-stationary hydrological model parameters. This is in line with the work of Mendoza et al. (2015), who showed that the agility of hydrological models is often unnecessarily constrained by using static parametrizations. We are aware that the use of a separate model structure in the summer period is clearly only a quick fix, but it highlights the need for more dynamic approaches to account for varying morphological states of the soil structure during long-term simulations.

The role of bedrock topography and water flow through the bedrock
The Colpach model was able to simulate the double peak runoff events which are deemed typical for this hydrological landscape.However, the model did not perform satisfactorily with regard to peak volume and timing.A major issue that hampers the simulation of these runoff events is that the underlying hydrological processes are still under debate.While Martínez-Carreras et al. (2015) attribute the first peak to water from the riparian zone and the second to subsurface storm flow, other researchers (Angermann et al., 2016;Graeff et al., 2009)  The representative hillslope model in its present form only allows simulation of overland flow and subsurface storm flow and not the release of groundwater because of the low permeability of the bedrock medium of 10 −9 m s −1 .The deficiency of this model in reproducing double peak runoff events shows that neglecting water flow through the bedrock is possibly not appropriate (Angermann et al., 2016) and that both the perceptual model and the setup of the representative hillslope for the Colpach need to be refined.We hence suggest that the representative hillslope approach provides an option for a hypothesis-driven refinement of perceptual models, within an iterative learning cycle, until the representative hillslope reproduces the key characteristics one regards as important.
The importance of bedrock topography for the interplay of water flow and storage close to the bedrock was further highlighted by the available 2-D electric resistivity profiles.A model with surface-parallel bedrock topographies performed considerably worse in matching streamflow in terms of the selected performance measures and particularly did not produce the double peak events.This underlines the value of subsurface imaging for process understanding, and is a hint that the Colpach is indeed a fill-and-spill system (Tromp-Van Meerveld and McDonnell, 2006).It also shows that 2-D electric resistivity profiles can be used to constrain bedrock topography in physically based models (Graeff et al., 2009), which can be of key importance for simulating subsurface storm flow (Hopp and McDonnell, 2009;Lehmann et al., 2007).Although we used constrained bedrock topography only in a straightforward, relative manner in this study, our results corroborated the added value of ERT profiles for hydrological modeling in this kind of hydrological landscape.Nevertheless, we are aware of the fact that a much more comprehensive study is needed to further detail this finding.

Storage behavior and soil moisture observations
Both hillslope models reveal much clearer deficiencies with respect to soil moisture observations. While average simulated and observed soil moisture dynamics are partly in good accordance, both models are biased, except for the Wollefsbach model at 10 cm depth. In the Wollefsbach catchment this might be explained by the fact that we use a uniform soil porosity for the entire soil profile, although porosity is most likely lower at larger depths, for instance due to a higher skeleton fraction. This cannot explain the bias in the Colpach catchment, as porosity was reduced in deeper layers to account for the skeleton fraction. In this context it is interesting to note that quite a few of the soil moisture observations are suspiciously low, with average values of around 0.2 m³ m⁻³. The resulting pF values of around 3.8 and 4.1 in the Colpach and Wollefsbach, respectively, indicate dry soils even in the wet winter period. This fact has two implications. The first is that the chosen model is hardly capable of simulating such small values, because root water uptake stops at the permanent wilting point and is small at these pF values. The second is that these sensors may have systematic measurement errors, possibly due to entrapped air between the probe and the soil. This entrapped air decreases the dielectric permittivity close to the sensor (Graeff et al., 2010), which implies that measured values will be systematically too low. From this we may conclude that the average soil moisture in both catchments might be higher, and the spatial variability of the soil moisture time series in turn lower, than it appears from the measurements. The obvious mismatch between the observed moisture maxima and the laboratory measurements could justify a reduction of the porosity parameter in the models, which would lead to even better fits.
In addition to the mismatch of the soil moisture simulations, the model fails to reproduce the strong decline in observed soil moisture between May and July 2014. A likely reason for this is that plant roots in the model extract water uniformly within the root zone, while this process is in fact much more variable (Hildebrandt et al., 2016).

Simulated transpiration and sap velocities
It is no surprise that evapotranspiration in our two research catchments, with a share of around 50 % of the annual water balance, is as important as streamflow. It is also no surprise that evapotranspiration is dominated by transpiration, as both catchments are almost entirely covered by vegetation. However, measuring transpiration remains a difficult task, and a lack of reliable transpiration data often hinders the evaluation of hydrological models with respect to this important flux. While it is possible to calculate annual or monthly evapotranspiration sums based on the water balance, more precise information about the temporal dynamics of transpiration is difficult to obtain. Therefore we decided to evaluate our transpiration routine with available sap flow velocity data, because although the absolute values are somewhat error-prone, the dynamics are quite reliable. We tried to account for the uncertainties of the measurements by deriving a 3-day rolling median of 28 observations instead of using single sap flow velocity measurements. As we are comparing sap flow velocity to the simulated transpiration as a normalized flow, we only compare the dynamics of both variables. It is remarkable that despite the uncertainties in the sap flow velocity measurements and our ad hoc parametrization of the vegetation properties, the comparison of sap flow velocity and simulated transpiration provides additional information, which cannot be extracted from the double mass curve or discharge data. For example, based on the comparison with sap flow velocities we were able to evaluate whether the bud break of the dormant trees was specified correctly by the temperature index model of Menzel et al. (2003); this was not the case when using the default and pre-defined vegetation table of CATFLOW (not shown). Additionally, we could identify that the spring and autumn dynamics of transpiration, in April as well as in August and September, are matched poorly by the model, while the pattern corresponds well in May, June and July. We attribute this discrepancy to the lack of measured LAI values in spring and autumn and to our simple vegetation parametrization, which includes several parameters like root depth or plant albedo that are held constant throughout the entire vegetation period. We are aware that this comparison of modeled transpiration with sap flow velocity is only a first, rather simple test; however, it encourages the use of sap flow measurements for hydrological modeling. It shows furthermore that the concept of a representative hillslope offers various opportunities for integrating diverse field observations and testing the model's hydrological consistency, for example by evaluating it against soil water retention data and sap flow velocities.

The concept of representative hillslope models
The attempt to model catchment behavior using a 2-D representative hillslope implies a symmetry assumption in the sense that the water balance is dominated by the interplay of hillslope-parallel and vertical fluxes and the related driving gradients (Zehe et al., 2014). This assumption is corroborated by the acceptable yet seasonally dependent performance of both hillslope models with respect to matching the water balance and the hydrographs. In particular, we learn that the timing of runoff events in these two catchments is predominantly controlled by the structural properties of the hillslopes. This is remarkable for the Colpach catchment, which has a size of 19.4 km², but in line with Robinson et al. (1995), who showed that catchments of up to 20 km² can still be hillslope-dominated.
An example of the limitations of our single hillslope approach is the deficiency of both models in capturing flashy rainfall-runoff events in the vegetation period. Besides the existence of emergent structures, these events are likely caused by localized convective storms, probably with a strong contribution of the riparian zones (Martínez-Carreras et al., 2015) and forest roads in the Colpach catchment, and by localized overland flow in the Wollefsbach catchment (Martínez-Carreras et al., 2012). Such fingerprints of a non-uniform rainfall forcing are difficult to capture with a spatially aggregated model and might require an increase in model complexity. Nevertheless, we suggest that a representative hillslope model provides a suitable starting point for parametrizing a functional unit when setting up a fully distributed catchment model consisting of several hillslopes and an interconnecting river network. Simulations with distributed rainfall that use the same functional unit parametrization for all hillslopes would show how much of the variability in response and storage behavior can be explained relative to the single hillslope. If different functional units are necessary to reproduce the variability of distributed fluxes and storage dynamics, these can, for example, be generated by stochastic perturbation. We further conclude that the idea of hillslope-scale functional units, which act similarly with respect to runoff generation and might hence serve as building blocks for catchment models, has been corroborated. This is particularly underpinned by the fact that the parametrization of both models was, without tuning, successfully transferred to headwaters in the same geological setting and also worked well for other hydrological years.

Conclusions
The exercise of picturing and modeling the functioning of an entire catchment with a single representative hillslope proved to be successful and instructive. The picturing approach allowed us to consider both quantitative and qualitative information in the physically based modeling process. This concept made an automated parameter calibration unnecessary and led to overall acceptable streamflow simulations in two lower-mesoscale catchments. A closer look, however, revealed limitations arising from the drawn perceptual models, the chosen hydrological model or the applicability of the concept itself.
Distilling a catchment into a representative hillslope model obviously cannot reflect the entire range of the spatially distributed catchment characteristics. But as the streamflow dynamics of the catchments were simulated reasonably well and the models were even transferable to different catchments, it seems that the large heterogeneities in subsurface characteristics need not prevent meaningful simulations with physically based models. Additionally, our results highlight the importance of considering non-stationarity of catchment properties in hydrological models on seasonal timescales and emphasize once more the value of multiresponse model evaluation. A representative hillslope model for a catchment is, hence, perhaps less accurate than a fully distributed model, but in turn requires considerably less data and less effort for setup and computation. Therefore, this approach provides a convenient means to test different perceptual models, and it can serve as a starting point for increasing model complexity through a combination of different hillslopes and a river network to model a catchment in a more distributed manner.

Data availability
Data and codes used in this study are available on request from the corresponding author, Ralf Loritz (ralf.loritz@kit.edu).

Figure 1. Map of the Attert basin with the two selected headwater catchments of this study (Colpach and Wollefsbach). In addition, the cluster sites of the CAOS research unit are displayed.

Figure 2. (a) Typical steep forested hillslope in the Colpach catchment; (b) soil profile in the Colpach catchment after a Brilliant Blue sprinkling experiment was conducted. The patchy appearance of blue color illustrates the influence of vertical structures on soil water movement in this schist area. (c) Plain pasture site of the Wollefsbach catchment; (d) soil profile in the Wollefsbach catchment after a Brilliant Blue experiment showing the influence of soil cracks and vertical structures on the soil water movement.

Figure 3. Perceptual models of the (a) Colpach and (b) Wollefsbach and their translation into a representative hillslope model for CATFLOW. It is important to note that only small sections of the model hillslope are displayed (C Colpach; D Wollefsbach) and not the entire hillslope.

Figure 5. Normalized double mass curves for each hydrological year from 2010 to 2014 in the Colpach catchment (a) and from 2011 to 2014 in the Wollefsbach catchment (b). The transition period marks the time of the year when the catchment shifts from the winter period to the vegetation period. The separation of the seasons is based on a temperature index model from Menzel et al. (2003). Since the season shift varies between the hydrological years, the transition period is displayed as an area.
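As a rough illustration of how a normalized double mass curve such as those in Fig. 5 could be constructed, the sketch below accumulates precipitation and runoff over a hydrological year and divides both by the annual precipitation sum. This normalization is an assumption of the sketch, as are the function name and the sample values.

```python
from itertools import accumulate


def normalized_double_mass_curve(precip, runoff):
    """Return cumulative precipitation and runoff, both divided by total precipitation.

    precip and runoff are equally long series in the same units (e.g. mm per day)
    covering one hydrological year.
    """
    if len(precip) != len(runoff):
        raise ValueError("precip and runoff must have the same length")
    total_precip = sum(precip)
    if total_precip <= 0:
        raise ValueError("total precipitation must be positive")
    cum_precip = [p / total_precip for p in accumulate(precip)]
    cum_runoff = [q / total_precip for q in accumulate(runoff)]
    return cum_precip, cum_runoff


# Hypothetical values for four time steps.
cum_p, cum_q = normalized_double_mass_curve([10, 0, 20, 10], [2, 1, 5, 4])
for p, q in zip(cum_p, cum_q):
    print(f"{p:.3f} {q:.3f}")
```

With this normalization the last value of the runoff axis equals the annual runoff coefficient, and a change in slope along the curve indicates a shift in how much of the incoming precipitation leaves the catchment as streamflow, for instance between the winter and the vegetation period.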

2.2 Database

2.2.1 Surface topography and land use

Topographic analyses are based on a 5 m LIDAR digital elevation model, which was aggregated and smoothed to 10 m resolution. Land use data from the Occupation Biophysique du Sol are based on CORINE land use classes analyzed by color infrared aerial images published in 1999 by the Luxembourgian surveying administration, Administration du Cadastre et de la Topographie, at a scale of 1:15 000.

Figure 6. (a) Profile of all hillslopes extracted from a DEM in the Colpach catchment. The hillslope profile used in this study is highlighted in blue. (b) Bedrock topography of a hillslope in the schist area measured using ERT. The contour line displays the 1500 Ω m isoline, which is interpreted as the soil-bedrock interface.

Figure 7. Fitted soil water retention curves and measured soil water retention relationships for the Colpach (a) and Wollefsbach (b) catchments.

Figure 8. Simulated and observed normalized double mass curves of (a) the Colpach catchment and (b) the Wollefsbach catchment. The double mass curves are separated into a winter period and a vegetation period following Menzel et al. (2003).

Figure 9. Observed and simulated runoff of the Colpach catchment. Moreover, three rainfall-runoff events are highlighted and displayed separately.

Figure 10. Observed and simulated runoff of the Wollefsbach catchment. Two rainfall-runoff events are highlighted and displayed separately.

Figure 11. Observed soil moisture at 10 and 50 cm depths in the schist (a, b) and marl (c, d) areas of the Attert catchment. Additionally, the 12 h rolling median (black) derived from the soil moisture observations and the simulated soil moisture dynamics at the respective depths (red Colpach; orange Wollefsbach) are displayed.

Figure 12. Normalized observed average sap velocities of 28 trees in the Colpach catchment (green) and normalized simulated transpiration from the Colpach model smoothed with a 3-day rolling mean (dashed blue). Additionally, the ensemble of all 28 sap flow measurements is displayed in grey.

Table 1. Hydraulic and transport parameter values used for different materials in the model setups.
…ime flux, we use 12 h daily means between 08:00 and 20:00 LT. For further technical details on the sap flow measurements, see Hassler et al. (2017).
R. Loritz et al.: Picturing and modeling catchments by representative hillslopes

Table 3. Benchmarks for simulated double mass curves and simulated discharge for all model setups used in this study.