Interactive comment on “Mapping dominant runoff processes: an evaluation of different approaches using similarity measures and synthetic runoff simulations”

Abstract. The identification of landscapes with similar hydrological behaviour is useful for runoff and flood predictions in small ungauged catchments. An established method for landscape classification is based on the concept of dominant runoff process (DRP). The various DRP-mapping approaches differ with respect to the time and data required for mapping. Manual approaches based on expert knowledge are reliable but time-consuming, whereas automatic GIS-based approaches are easier to implement but rely on simplifications which restrict their application range. To what extent these simplifications are applicable in other catchments is unclear. More information is also needed on how the different complexities of automatic DRP-mapping approaches affect hydrological simulations. In this paper, three automatic approaches were used to map two catchments on the Swiss Plateau. The resulting maps were compared to reference maps obtained with manual mapping. Measures of agreement and association, a class comparison, and a deviation map were derived. The automatically derived DRP maps were used in synthetic runoff simulations with an adapted version of the PREVAH hydrological model, and simulation results compared with those from simulations using the reference maps. The DRP maps derived with the automatic approach with highest complexity and data requirement were the most similar to the reference maps, while those derived with simplified approaches without original soil information differed significantly in terms of both extent and distribution of the DRPs. The runoff simulations derived from the simpler DRP maps were more uncertain due to inaccuracies in the input data and their coarse resolution, but problems were also linked with the use of topography as a proxy for the storage capacity of soils. The perception of the intensity of the DRP classes also seems to vary among the different authors, and a standardised definition of DRPs is still lacking. 
Furthermore, we argue that expert knowledge should be used not only for model building and constraining, but also in the phase of landscape classification.


Introduction
Conceptual rainfall-runoff models perform well on gauged basins but appear to be limited in reproducing the hydrological behaviour of ungauged catchments (Hrachowitz et al., 2013). Expert knowledge about the different runoff processes that can occur on a catchment can improve the hydrological simulations for such ungauged basins. For example, it can be used to design process-tailored model structures aiming to be right for the right reason (Klemeš, 1986). Furthermore, it can help to reduce the need for calibration by constraining the parameter values or modelled output to guarantee consistency with the reality (Franks et al., 1998; Seibert and McDonnell, 2002; Gharari et al., 2014; Hrachowitz et al., 2014). Hydrological classifications based on landscapes with similar hydrological behaviour can be useful regionalisation tools for predictions in ungauged basins. In this case, once a model structure and its parameters have been identified for each landscape in a gauged catchment, they are transferred to an ungauged catchment where the landscapes have similar hydrological behaviour (e.g. Beran, 1990; Mosley, 1981; Viviroli et al., 2009a).
Published by Copernicus Publications on behalf of the European Geosciences Union.
In recent decades, several methods have been developed to quantify the spatial extent and to identify the distribution of areas where a specific runoff process occurs. The topographic wetness index (Beven and Kirkby, 1979), as an example of index-based methods, allows areas prone to saturation overland flow (SOF) to be identified using only topographical information. Similarly, Woods et al. (1997) developed a topographic index for areas where subsurface flow (SSF) occurs. Another well-established methodology involves the explicit definition of hydrological response units (HRUs), which can be identified according to geological, ecological, pedological, and/or topographical criteria (e.g. Ross et al., 1979; Flügel, 1995). For example, Markart et al. (2011) developed a method for assessing surface runoff coefficients and surface roughness in the case of extreme precipitation events. Similarly, Dobmann (2010) introduced a way to map runoff disposition, defined as "the tendency of water to become displaced downstream due to gravity in such a way as to cause damage" (Kienholz, 1998).
Although these methods represent an important basis for the determination of runoff peaks and return periods of flood events, they cannot reproduce the full range of runoff responses that can be observed on a site. To improve the HRU approach, several hydrological classifications have been developed based on the concept of dominant runoff process (DRP), i.e. the runoff generation mechanism that contributes most to runoff (Blöschl, 2001).
DRP classifications may be manual or automatic (Table 1). Manual approaches are based on extensive field investigations, with the results interpreted and upscaled using expert knowledge (e.g. Scherrer and Naef, 2003). In contrast, automatic methods generally rely on GIS and on algorithms based on simplifications of expert knowledge (e.g. Peschke et al., 1999).
Automatic approaches differ in which data they require. Some rely on topographical information only (e.g. Gharari et al., 2011), while others use all the available information for an area (e.g. Schmocker-Fackel et al., 2007). The data requirement is closely linked to the time it takes to map the DRPs, ranging from a few hours with simple data input to months if the data are derived from extensive field investigations (e.g. Tetzlaff et al., 2007).
The output classes of the classifications also differ. All methods distinguish at least between infiltration excess (Hortonian) overland flow (HOF) and SOF, and between SSF and deep percolation (DP) (e.g. Gharari et al., 2011; Gao et al., 2014). Several approaches also provide information on the intensity of the SOF and SSF processes, where the numbers from 1 to 3 represent the delay in their reaction to rainfall, with 1 representing an almost immediate reaction, 2 a slightly delayed one, and 3 a strongly delayed one (e.g. Scherrer and Naef, 2003; Schmocker-Fackel et al., 2007; Müller et al., 2009; Hümann and Müller, 2013). Boorman et al. (1995), however, classified expected hydrological behaviour according to 29 classes in the Hydrology Of Soil Types classification of Great Britain.
Several algorithms have been developed exclusively for specific catchments, and are therefore not suitable for regionalisation purposes. For instance, Tilch et al.'s (2002) classification is based on the genesis of the hillslope and its covering material. Similarly, Waldenmeyer (2003) determined DRPs from a forestry site map, and Gao et al. (2014) linked the presence of forest to the hillslope exposition in the barely inhabited upper Heihe catchment in China. These simplifications limit the applicability of the methods to other catchments.
All these methods aim to map the spatial distribution of DRPs in a realistic way, but only a few have investigated the transferability of the algorithms to other catchments. Furthermore, it remains unclear how the different time and data requirements of the mapping approaches affect hydrological simulations. The objective of this paper is therefore to (i) test the suitability of different automatic DRP-mapping approaches for mapping ungauged catchments, and to (ii) quantify the uncertainty of hydrological simulations due to different spatial representations of DRPs.
DRP maps were produced for two catchments on the Swiss Plateau using the automatic approaches of Schmocker-Fackel et al. (2007), Müller et al. (2009), and Gharari et al. (2011). These were then compared with reference maps produced using manual mapping according to Scherrer and Naef (2003). To assess how similar the automatically derived DRP maps are to the reference maps, a measurement of agreement, Fuzzy Kappa (Hagen-Zanker, 2009), a measurement of association, Mapcurves (Hargrove et al., 2006), and a class comparison were carried out. Furthermore, the effects of the differences between the DRP maps on synthetic runoff simulations were investigated with an adapted version of the well-established PREVAH model (Viviroli et al., 2009b).

Study sites
Our analyses are performed on two small catchments on the Swiss Plateau. Dorfbach Meilen is a creek which drains a 4.6 km² catchment and flows into Lake Zurich (Fig. 1). The elevation of the catchment ranges from 409 to 850 m a.s.l. It is mainly covered by grassland (49.4 %) and forest (39 %) and, to a lesser extent, arable land (3.6 %) and settlements (8 %). The basin is characterised by the Upper Freshwater Molasse, with conglomerate in the shallow subsurface (Hantke et al., 1967). A large part of the catchment is covered by brown-earth soils with normal permeability and storage capability. Less permeable soils and wetlands are less widespread but play an important role in runoff generation.
The Reppisch catchment up to Birmensdorf is situated in the south-west of the Canton of Zurich, Switzerland (Fig. 2). It has an area of 22 km², of which 48 % is covered by forest, 42 % by grassland, and 7 % by settlements. The elevation of the catchment ranges from 467 to 894 m a.s.l. The geological substructure of the catchment consists of the Upper Freshwater Molasse, composed of sandstone and marl, and is covered in most cases by glacial sediments (Hantke et al., 1967; Pavoni et al., 1992; Bolliger et al., 1999). Gravel deposits can be found along the Reppisch River, while a number of smaller alluvial fans were accumulated by its many tributaries. Brown-earth soils with normal permeability and storage capability cover most of the catchment, while soils with low permeability are less widespread.

Data and methods

DRP-mapping approaches
Manually derived DRP maps based on the decision scheme of Scherrer and Naef (2003), referred to here as SN03 maps, are available as shape files for both study sites and were used as reference maps (Figs. 3a and 4a). These DRP maps are developed in different steps as follows. (1) Information about the land use, vegetation, soil, geology, hydrogeology, and topography of the catchment is collected. (2) Based on these data, the DRPs are initially estimated using expert knowledge, and locations where estimations are not straightforward are identified. (3) On these sites, soil profiles are investigated and the DRP at the plot sites identified according to the decision schemes for long-lasting events, i.e. with precipitation intensity less than ca. 20 mm h⁻¹, of Scherrer (2006). (4) After the analysis of the field investigations, the DRPs can be determined for the hillslopes and finally for the whole catchment. (5) The DRPs are reclassified into five different runoff types (RTs) with respect to the runoff intensity (Table 2).

Schmocker-Fackel et al. (2007) developed a strategy to simplify the decision schemes of Scherrer and Naef (2003) and determine the DRPs automatically within a GIS environment. Basically, the method relies on a soil map with high resolution (1 : 5000) of the Canton of Zurich and information about the soil water regime, soil depth, and the soil's physical and chemical properties. Where information on soil is lacking, an expert-based soil prediction model was used to derive DRPs from information about forest communities, the slope and shape of hillslopes, the surface water network, and the geology (Margreth et al., 2010). This step is relatively time-consuming, since the soil prediction model has to be adapted to each catchment according to the information available. Therefore, several days of fieldwork are necessary. The DRP maps derived with this approach for this study are available as shape files, referred to hereafter as SF07 maps (Figs. 3b and 4b).

Table 2. Reclassification of DRPs according to runoff types (HOF: Hortonian overland flow; SOF: saturation overland flow; SSF: subsurface flow; DP: deep percolation; 1 represents an almost immediate reaction, 2 a slightly delayed one, and 3 a strongly delayed one). Adapted from Naef et al. (2000).
Columns: Runoff type (RT); DRP; Runoff intensity. Recoverable rows: RT4, strongly delayed; RT5, DP, not contributing.

Müller et al. (2009) proposed a further simplification of Schmocker-Fackel et al.'s (2007) approach based on GIS and valid for prolonged rainfall events. The method combines information on the permeability of the geological substratum, land use, and slope, but excludes soil information. It results in the same DRP classes as those proposed by Scherrer and Naef (2003), and involves, first, using a DTM analysis to identify classes of slopes; then, classifying the geological substrata of the catchments as either permeable or impermeable; and finally, combining the pre-processed digital data to obtain the DRP (Table 3). Hümann and Müller (2013) extended the approach proposed by Müller et al. (2009) to forested areas and to different event types. Since the reference maps refer to long-lasting events, Müller et al.'s (2009) approach was used in this study.
DRP maps based on Müller et al. (2009), referred to here as MU09 (Figs. 3c and 4c), were derived for the two study sites with a spatial resolution of 25 m based on the following assumptions. (i) Riparian zones, i.e. the spots around the river network, were classified as SOF1. The extent of these areas was defined as the cells whose height above the nearest drainage (HAND), i.e. the height of a DTM cell less the elevation of the river network where the cell drains (Rennó et al., 2008), is lower than 1.2 m. (ii) Settlement areas were not considered in the current study as the resolution of the land-use map used (100 m) was not high enough to obtain a realistic representation of their spatial distribution.
Table 3. Dependency of the DRP on the slope and permeability of the substratum for grassland, arable land and forest, according to Müller et al. (2009).
Columns: Slope (%); Impermeable substratum (Grassland and arable land; Forest); Permeable substratum (Grassland, arable land, and forest).

As a further simplification, topography-based classifications were developed with the assumption that the topography can be seen as a proxy for the geology, soil, land use, climate and, consequently, DRPs (Savenije, 2010). In addition to traditional topographical descriptors (e.g. elevation, slope, and exposition), these methods are based on the HAND value, which represents, in turn, a rearrangement of the "elevation-above-stream" proposed by Seibert and McGlynn (2006). HAND-based classifications have been used to define classes of soil water environments, where a single runoff generation mechanism dominates (Nobre et al., 2011; Gao et al., 2014). Gharari et al. (2011) found that the combination between HAND and slope provided the most suitable descriptors for a topography-based classification of DRPs. The mapping approach distinguishes between three landscape classes. Areas below a certain HAND threshold value are called "wetland" (subject to SOF). The remaining regions are further divided into two classes: "hillslope", subject to SSF, and "plateau", subject to DP, depending on whether the slope is above or below a certain threshold value. Since these threshold values are not unconditionally transferable to other catchments, a sensitivity analysis was carried out on both study sites. Different combinations of threshold values were tested, and the resulting maps were compared with SN03 at a spatial resolution of 25 m. We selected the maps with the best Mapcurve score (cf. Sect. 3.2) for this study, and refer to them as GH11 (Figs. 3d and 4d). The threshold values obtained are in agreement with those of Gharari et al. (2011) in a central European catchment (Fig. A1).
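The three-class rule described above can be sketched in a few lines. This is only an illustration: the threshold values below are hypothetical placeholders (the catchment-specific values from the sensitivity analysis are not given in this text), and the function name `classify_cell` is an assumption, not part of any published tool.

```python
# Hypothetical thresholds for illustration only; the study selected
# catchment-specific values through a sensitivity analysis.
HAND_THRESHOLD_M = 5.0
SLOPE_THRESHOLD_PCT = 10.0


def classify_cell(hand_m, slope_pct,
                  hand_threshold=HAND_THRESHOLD_M,
                  slope_threshold=SLOPE_THRESHOLD_PCT):
    """Return the landscape class of one grid cell from HAND and slope."""
    if hand_m < hand_threshold:
        return "wetland"      # close to the drainage level: subject to SOF
    if slope_pct > slope_threshold:
        return "hillslope"    # above the HAND threshold and steep: SSF
    return "plateau"          # above the HAND threshold and flat: DP


# Three example cells given as (HAND in m, slope in %).
cells = [(0.8, 2.0), (12.0, 25.0), (30.0, 3.0)]
classes = [classify_cell(hand, slope) for hand, slope in cells]
print(classes)
```

Repeating the classification over a grid of threshold pairs and scoring each resulting map against the reference would mirror the sensitivity analysis mentioned in the text.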

Map comparison
To test the suitability of different approaches for automatically mapping the DRPs on ungauged catchments, a class comparison between automatically derived DRP maps and the reference maps was carried out for the two study sites. The percentage of total catchment area assigned to each RT, and the percentages of discrepancy between the RTs in the automatic DRP maps and those in the reference maps were calculated. To deal with the difference in the number of classes between the GH11 maps and reference maps, an expedient step was introduced. Since none of the three classes of GH11 maps (wetland, hillslope, and plateau) is necessarily comparable to a specific class of the reference maps, the five RTs of the SN03 maps were reclassified into three classes covering every possible combination (Table A1), resulting in six new reference maps. These were compared one by one with the GH11 maps. In addition, the discrepancies between the MU09 maps and the reference maps were highlighted in a deviation map to identify the spots where the difference in the RTs is greater than 2 and to help identify the possible causes of incorrect mapping.
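The count of six reference maps is what one obtains if the five ordered RTs are split into three contiguous, non-empty groups. The sketch below enumerates these groupings under that assumption; Table A1 is not reproduced in this text, so the exact groupings used in the study may differ, and the function name is hypothetical.

```python
from itertools import combinations

RUNOFF_TYPES = [1, 2, 3, 4, 5]
LANDSCAPE_CLASSES = ["wetland", "hillslope", "plateau"]


def contiguous_reclassifications(runoff_types, classes):
    """Yield every mapping that splits the ordered RTs into
    len(classes) contiguous, non-empty groups."""
    n = len(runoff_types)
    # Choose the positions where one group ends and the next begins.
    for cuts in combinations(range(1, n), len(classes) - 1):
        bounds = (0,) + cuts + (n,)
        mapping = {}
        for k, name in enumerate(classes):
            for rt in runoff_types[bounds[k]:bounds[k + 1]]:
                mapping[rt] = name
        yield mapping


reclassifications = list(
    contiguous_reclassifications(RUNOFF_TYPES, LANDSCAPE_CLASSES)
)
for mapping in reclassifications:
    print(mapping)
```

Each mapping can be applied to the SN03 map to produce one three-class reference map for comparison with GH11.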
To account for fuzziness in the definition of the RTs, a measure of agreement, fuzzy kappa (K_Fuzzy), was used. The method was proposed by Hagen-Zanker (2009) to extend the well-established Cohen kappa (Cohen, 1960) and to take into account the fuzziness of categories, allowing some pairs of classes to be more similar than others, as well as the fuzziness of location, given that cells tend to be at least slightly spatially correlated. To take the fuzziness of categories into account, a similarity matrix was defined, where each pair of classes was assigned a number between 0 (totally distinct) and 1 (completely identical). The extent to which neighbouring cells influence the cell in question is defined by a distance decay function. An overall measure of similarity between two maps can be obtained by using the following equation:

$$K_{\mathrm{Fuzzy}} = \frac{P - E}{1 - E}, \qquad (1)$$

where P represents the mean agreement of the two compared maps weighted by the expected agreement E. K_Fuzzy ranges from 0 (fully distinct maps) to 1 (fully identical maps). For this study, the fuzzy kappa algorithm implemented in the Map Comparison Kit 3 software (Visser and de Nijs, 2006) was used. We assumed that contiguous RTs are similar to some extent, and the corresponding degree of similarity was set to 0.25. An exponential decay function with a halving distance of one cell was adopted.
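The settings named above (similarity of 0.25 between contiguous RTs, exponential decay with a halving distance of one cell) can be written down explicitly. The sketch below only builds these two ingredients and the final kappa-type ratio, which takes the usual form (P − E) / (1 − E). It does not reproduce how the Map Comparison Kit derives P and E from the maps, and all function names are illustrative assumptions.

```python
def similarity_matrix(n_classes=5, neighbour_similarity=0.25):
    """Category similarity: 1 for identical RTs, 0.25 for contiguous
    RTs (e.g. RT2 and RT3), and 0 for all other pairs."""
    matrix = []
    for i in range(n_classes):
        row = []
        for j in range(n_classes):
            if i == j:
                row.append(1.0)
            elif abs(i - j) == 1:
                row.append(neighbour_similarity)
            else:
                row.append(0.0)
        matrix.append(row)
    return matrix


def distance_weight(distance_cells, halving_distance=1.0):
    """Exponential decay that halves every `halving_distance` cells."""
    return 0.5 ** (distance_cells / halving_distance)


def fuzzy_kappa(mean_agreement, expected_agreement):
    """Kappa-type statistic: agreement corrected for expected agreement."""
    return (mean_agreement - expected_agreement) / (1.0 - expected_agreement)


sim = similarity_matrix()
print(sim[1])                      # similarities of RT2 to RT1..RT5
print(distance_weight(2))          # influence of a cell two cells away
```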
Given that the number of classes in the GH11 map is different from that in the reference maps, the goodness-of-fit (GOF) measure called Mapcurves (Hargrove et al., 2006) was used to quantify the degree of spatial concordance between the automatic DRP maps and the reference maps. For each of the existing classes in two maps, a GOF score (unitless) was calculated according to Eq. (2), where A is the total area (m²) of a given class X on the map being compared, B is the total area (m²) of a class Y on the reference map, C is the intersecting area (m²) between X and Y when the maps are overlaid, and n is the total number of classes on the reference map. The sum of this product gives a GOF value for a particular class. The overall Mapcurves (MC) score is given by the area under the curve obtained by plotting the GOF scores on the abscissa and the percentage of map classes with a GOF score larger than a particular value on the ordinate. An MC score of 1 represents a perfect fit, while an MC score of 0 means that there is no spatial overlap between the classes of two maps. Both the shapes of the Mapcurves and the MC scores change when the compared map is instead used as the reference map. This is because the MC score depends on the average size and number of the patches in each class of the maps being compared. Hargrove et al. (2006) argue that the combination of compared map and reference map that has the highest MC score must be chosen. However, by doing so, the coarser maps would be advantaged. Therefore, for this study, SN03 maps were always set as reference maps. A detailed description of the two similarity measures is reported in Hagen-Zanker (2009) and Hargrove et al. (2006), while applications in hydrology are described in Speich et al. (2015) and Jörg-Hess et al. (2015).
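As a rough illustration of an overlap-based GOF, the sketch below treats cell counts as areas and assumes that each term of the sum is the product (C / A) · (C / B), with A and B taken as total class areas. This form is an assumption chosen so that a perfect fit scores 1; it should be checked against Hargrove et al. (2006) before use. The MC score is computed as the mean GOF, which equals the area under a step curve of "share of classes with GOF above g" plotted against g on [0, 1].

```python
from collections import Counter


def gof_scores(compared, reference):
    """GOF per class of the compared map, with cell counts as areas.

    Assumption: each term is (C / A) * (C / B), where A and B are the
    total areas of class X and class Y and C is their intersection.
    """
    area_x = Counter(compared)
    area_y = Counter(reference)
    overlap = Counter(zip(compared, reference))
    scores = {}
    for x, a in area_x.items():
        scores[x] = sum(
            (overlap[(x, y)] / a) * (overlap[(x, y)] / b)
            for y, b in area_y.items()
        )
    return scores


def mc_score(scores):
    """Area under the step curve of class share versus GOF (mean GOF)."""
    return sum(scores.values()) / len(scores)


# Toy maps flattened to lists of cells: three classes against four RTs.
compared = ["wetland", "wetland", "hillslope", "hillslope", "plateau", "plateau"]
reference = [1, 1, 3, 3, 4, 5]
nested = gof_scores(compared, reference)

# A second toy pair in which the classes only partly overlap.
partial = gof_scores(["a", "a", "b", "b"], [1, 2, 2, 2])
print(nested, mc_score(nested))
print(partial, mc_score(partial))
```

In the first toy pair every reference class lies entirely inside one compared class, so the fit is perfect even though the number of classes differs.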
To identify those landscapes where automatic approaches perform better, the comparison measures were applied to the single sub-catchments, at a high spatial resolution, to take into account the added value of the finest maps. For this reason, the shape files were rasterised and the coarser maps were resampled to a grid resolution of 2 m.

Synthetic runoff simulations
To assess how the differences between the automatic DRP maps affect a hydrograph, synthetic runoff simulations were carried out. This approach was inspired by Weiler and McDonnell (2004), who suggested using numerical experiments to isolate hypotheses and investigate their influence on the model output. In a recent review paper, Fatichi et al. (2016) acknowledge these studies to be different from the ones aiming at comparing performances of different models or validating model results. The word "synthetic" implies therefore that the focus is exclusively on how the different DRP maps influence the simulated runoff, and not on how well the model reproduces a measured discharge. The model used for this study is an adapted version of the runoff generation module of the PREVAH model (Viviroli et al., 2009a). It is distributed (500 m grid resolution) to take into account the spatial variability of the input data, which consists of a combination of radar and traditionally measured rainfall data (Sideris et al., 2014). For each cell, the percentage of each RT is taken into account to avoid losing information because of the grid resolution.
The model does not take interception, evapotranspiration, and soil moisture into consideration (Fig. 5). The rainfall directly recharges the upper zone (unsaturated) runoff storage (SUZ), where the storage times for the surface runoff (K0H) and subsurface runoff (K1H) regulate the generation of the runoff. The threshold for quick runoff formation (SGRLUZ) determines the separation between surface runoff (R0) and subsurface runoff (R1). A maximum percolation rate (CPERC) controls the percolation to the groundwater storage, which is divided into a quick-leaking storage (SLZ1) and two slow-leaking storages (SLZ2 and SLZ3; Schwarze et al., 1999). The storage capacity of SLZ1 is limited by a maximal storage charge (SLZ1MAX), while its contribution to the slow runoff (R2) is regulated by the storage time for quick baseflow (CG1H). SLZ2, which only receives the fraction of percolation not absorbed by SLZ1, is controlled by the storage time for slow baseflow (K2H). With this model configuration, it is possible to detect the effects of differences between the different maps in terms of both extent and distribution of RTs. The difference in extent of RTs gives more weight to one or another of the parameter sets. If the RT extent is the same, the location of the RTs on the catchment plays a role, since the rainfall input can vary from cell to cell.
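To make the role of SGRLUZ, K0H, K1H, and CPERC concrete, the following is a minimal sketch of one time step of an upper-zone storage. It is not the PREVAH source code: the linear-reservoir formulation, the order of the fluxes, the hourly time step, and the units (mm, h, mm h⁻¹) are assumptions made for illustration, and the function name is hypothetical.

```python
def upper_zone_step(suz, rain, sgrluz, k0h, k1h, cperc, dt_h=1.0):
    """Advance the upper-zone storage SUZ (mm) by one time step.

    Returns the new storage and the fluxes R0, R1 and percolation (mm).
    """
    suz += rain                                  # rainfall recharges SUZ directly
    r0 = max(0.0, suz - sgrluz) / k0h * dt_h     # surface runoff above the threshold
    r1 = suz / k1h * dt_h                        # subsurface runoff, linear reservoir
    perc = min(cperc * dt_h, suz)                # percolation capped at CPERC
    outflow = r0 + r1 + perc
    if outflow > suz:                            # never drain more than is stored
        scale = suz / outflow
        r0, r1, perc = r0 * scale, r1 * scale, perc * scale
        outflow = suz
    return suz - outflow, r0, r1, perc


# One step with 20 mm of rain on an empty storage (illustrative values).
new_suz, r0, r1, perc = upper_zone_step(
    0.0, 20.0, sgrluz=10.0, k0h=5.0, k1h=50.0, cperc=1.0
)
print(new_suz, r0, r1, perc)
```

In this sketch a low SGRLUZ and a low K0H make surface runoff start earlier and drain faster, while a very large K1H effectively switches subsurface flow off.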
We assume that the properties of the different RTs can be represented by varying the parameter values of the model employed. For example, the tendency for RT1 and RT2 to generate overland flow was represented by assigning low values of SGRLUZ and CPERC. Furthermore, the K0H values assigned to RT1 and RT2 were set as low since the fast contributing areas were assumed to be close to the river network. In areas where either HOF or DP dominates, the subsurface flow was neglected and K1H was set to higher values (e.g. 1000 h). As the baseflow generation does not necessarily depend on the RTs, the parameters of the SLZ1, SLZ2, and SLZ3 were defined a priori as averaged values for both catchments and kept constant for the simulations. The values selected were based on the results of Viviroli et al. (2009a), who identified a range of suitable values for each parameter of PREVAH for flood estimation in ungauged mesoscale catchments in Switzerland.
To investigate the sensitivity of the model output with respect to the definition of parameter values based on the RTs, the parameters were defined in a stepwise process, resulting in 16 different parameter combinations (Table A2). First, the five RTs were assigned the same set of parameter values and no information about the RTs was thus included. In the second step, the value of each parameter controlling the SUZ was defined with respect to the RT one at a time, and the value of the other parameters was left unchanged. The same procedure was then repeated by defining the values based on the RTs of two, three, and finally all the parameters at the same time. As in the class comparison (see Sect. 3.2), an expedient step was introduced to take into account the fact that there were fewer classes of GH11 maps. Every possible combination of the five predefined values for each parameter was covered, provided that the parameters fulfilled the condition given in Eq. (3). This resulted in 10 different runs for each parameter combination (Table A3), with one exception: the storage time for the subsurface flow K1H. This was set at 1000 h for wetland (SOF) and plateau (DP), since no subsurface flow was expected there.
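Both counts in this paragraph can be checked by enumeration. The sketch below assumes that the four SUZ parameters are SGRLUZ, CPERC, K0H, and K1H (so that 16 is the number of subsets of four parameters), and that the condition on the GH11 values is a strict ordering from wetland to hillslope to plateau (so that 10 is the number of ways of picking three of five ordered values). Neither Table A2, Table A3, nor the condition itself is reproduced in this text, so both readings are assumptions.

```python
from itertools import combinations

SUZ_PARAMETERS = ["SGRLUZ", "CPERC", "K0H", "K1H"]

# Every subset of parameters whose values are defined per runoff type;
# the empty subset is the run without any RT information.
parameter_combinations = [
    subset
    for size in range(len(SUZ_PARAMETERS) + 1)
    for subset in combinations(SUZ_PARAMETERS, size)
]

# Placeholder values of one parameter, ordered from RT1 to RT5.
five_values = [1, 2, 3, 4, 5]

# Assumed condition: value(wetland) < value(hillslope) < value(plateau).
gh11_runs = [
    {"wetland": w, "hillslope": h, "plateau": p}
    for w, h, p in combinations(five_values, 3)
]

print(len(parameter_combinations), len(gh11_runs))
```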
Synthetic simulations were carried out on the two study sites over the period from 16 June 2014 to 15 August 2014. A modified version of the Nash-Sutcliffe efficiency (NSE; Nash and Sutcliffe, 1970), in which the observed runoff is replaced by the runoff simulated with the reference maps, was therefore used as an objective function (Eq. 4).
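The modified NSE follows directly from the standard definition once the observed series is replaced by the reference simulation. A minimal sketch, with hypothetical function and variable names:

```python
def modified_nse(simulated, reference):
    """Nash-Sutcliffe efficiency with the reference-map simulation
    taking the place of the observed runoff.

    1 means identical hydrographs; 0 means the simulation is no closer
    to the reference than the mean of the reference itself.
    """
    if len(simulated) != len(reference):
        raise ValueError("series must have the same length")
    mean_ref = sum(reference) / len(reference)
    numerator = sum((s - r) ** 2 for s, r in zip(simulated, reference))
    denominator = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - numerator / denominator


# Toy hydrographs (arbitrary units).
q_reference = [1.0, 2.0, 3.0]
print(modified_nse([1.0, 2.0, 3.0], q_reference))
print(modified_nse([2.0, 2.0, 2.0], q_reference))
```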

Results
According to the reference (SN03) maps, the two study sites differ slightly in their RT distributions (Fig. 6). In the Reppisch catchment, areas with a delayed runoff contribution (RT3) prevail (45 % of the catchment area), while, in the Meilen catchment, areas with a strongly delayed runoff contribution (RT4) cover 55.3 % of the catchment. SF07 maps reproduce the RT distribution fairly well, although they slightly overestimate the fast contributing areas (RT1) and underestimate the areas with a strongly delayed contribution (RT4) in the Meilen catchment. The RT distributions of the MU09 maps deviate from those of the reference maps. They considerably overestimate the delayed contributing areas (RT3) and, to a lesser extent, the fast ones (RT1), at the expense of the remaining RTs. The runoff contribution is consistently overestimated, especially in the Meilen catchment, where the RT is faster than in the SN03 map in 64 % of the whole catchment (Fig. 7).
The distribution of landscape classes of GH11 maps in the Meilen catchment (Fig. 6b) agrees well with the reference map, if the landscape class "hillslope" is assumed to correspond to RT3, "wetland" to the union of RT1 and RT2, and "plateau" to both RT4 and RT5. However, this consideration no longer holds true in the Reppisch catchment, where the percentage of the total catchment mapped as "hillslope" (68 %) markedly exceeds that mapped as RT3 in the reference map (45 %). Considering each possible reclassification into three classes of the five RTs of the SN03 maps (Table A1), the GH11 maps, on average, estimate the runoff contribution to be lower than the SN03-map estimate (Fig. 7).
Figure 8 shows a map of the Reppisch catchment highlighting areas where the discrepancy between the RTs in the MU09 map and the SN03 map is higher than 2 (Table 4). The RT assigned to area 1 is too fast as the glacial sediments were assumed to be always impermeable. Similarly, area 3 was mapped as a non-contributing area as the alluvium was assumed to be always permeable. However, previous investigations showed the local permeability of the glacial sediments was high, and that of the alluvium was low due to clayish sediments (Scherrer AG, 2006). Area 2 is located on a steep hillslope and is therefore mapped as contributing with a slight delay. In contrast, area 4 is on a flat plateau, so that its contribution to the runoff was assumed to be strongly delayed. However, field investigations found the soil was very thick, indicating a high storage capacity in area 2. In contrast, the mixture of brown earth, stagnosol, and gleysol resulted in a low storage capacity in area 4 (Scherrer AG, 2006). In area 5, the river network derived with the DTM analysis differs considerably from the actual river path. The runoff contribution there was therefore overestimated by MU09. Similarly, the runoff contribution of area 6 was overestimated because the depiction of the lake was wrong due to the coarse resolution of the land-use map.
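The deviation map amounts to flagging cells whose RT differs from the reference by more than two classes. A minimal sketch on maps flattened to lists of RT numbers (the function name and the toy values are hypothetical):

```python
def deviation_cells(auto_rt, reference_rt, min_difference=2):
    """Return (cell index, signed RT difference) for every cell whose RT
    differs from the reference by more than `min_difference` classes."""
    flagged = []
    for index, (auto, ref) in enumerate(zip(auto_rt, reference_rt)):
        difference = auto - ref      # negative: automatic map is faster
        if abs(difference) > min_difference:
            flagged.append((index, difference))
    return flagged


# Toy example: RTs of four cells on the automatic and the reference map.
mu09 = [1, 3, 5, 2]
sn03 = [4, 3, 1, 4]
flagged = deviation_cells(mu09, sn03)
print(flagged)
```

A negative difference marks a cell mapped as faster than in the reference, a positive one a cell mapped as slower.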
The measures of association and agreement obtained by comparing the automatically derived DRP maps with the reference maps for the sub-catchments of the two study areas differ (Fig. 9). The scores of the SF07 maps are higher than those obtained by the comparison of MU09 maps and GH11 maps with the reference maps. The highest scores in the Reppisch catchment were in sub-catchment 1 due to the presence of a lake, which is mapped as RT1 in every mapping approach. As the values of the MC score obtained with MU09 maps and GH11 maps are nearly equal, these two mapping approaches seem to be interchangeable for both of the study areas.

Table 4. List of areas identified in Fig. 8 with the automatically and manually derived DRPs (RTs), and a possible explanation for their deviation.
Columns: Area; DRP (RT) on MU09 map; DRP (RT) on SN03 map; Explanation.
Comparing the MC scores for each RT reveals which RTs can be clearly identified by the automatic mapping approaches (Fig. 10). The higher MC scores for classifications with the same number of classes should ideally be located along the main diagonal of the output matrices, meaning that each RT of an automatically derived DRP map is spatially best associated with its equivalent in the reference map. This is mainly the case for the SF07 maps, with the exception of the fast RT1 and RT2. These are identified as more similar to the next slower RTs of the reference maps. The MU09 maps' overestimation of the general runoff intensity of the whole catchment can be attributed to RT2 and RT4 in the Reppisch catchment and RT1 and RT3 in the Meilen catchment. These were spatially associated with the next slower RTs of the reference map. On both study sites, landscape classes "wetland", "hillslope", and "plateau" of the GH11 maps fit best with RT2, RT3, and RT4 of the reference maps, respectively.
Since the extent and distribution of areas with the same RT differ, using automatically derived DRP maps in runoff simulations affects the results of the simulations themselves (Fig. 11). Simulations driven by the SF07 maps showed the smallest deviation in comparison with simulations driven by the SN03 maps. The tendency of the MU09 maps to overestimate the runoff contribution (Fig. 7) led to higher peaks in the Meilen catchment since overland flow was activated in areas with delayed runoff contribution during the two heavy rainfall events on 21 July 2014 and 10 August 2014 (Fig. 12). This did not happen in the Reppisch catchment as the precipitation intensity in the catchment was lower. The GH11 maps were very sensitive to the storage time for subsurface flow K1H due to the consistency assumption; that is, no interflow is expected in wetland and plateau areas, which are prone to SOF and DP, respectively. As a result, too much water remained in the storage and runoff peaks were mostly underestimated.
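The effect of a long storage time on peak runoff can be illustrated with a single linear reservoir, in which outflow is proportional to storage. The Python sketch below is not the PREVAH runoff generation module. The function `linear_reservoir`, the explicit time stepping, the rainfall pulse, and the two storage times are assumptions chosen only to show that a larger storage time retains more water and damps the peak.

```python
def linear_reservoir(inflow, k, dt=1.0, storage=0.0):
    """Route inflow through one linear reservoir with storage time k (same unit as dt).

    Explicit scheme: the outflow rate in each step is storage / k.
    Returns the outflow series and the storage left at the end.
    """
    if k < dt:
        raise ValueError("k must be at least dt for this explicit scheme")
    outflow = []
    for p in inflow:
        storage += p * dt    # add the inflow volume of this step
        q = storage / k      # outflow rate proportional to storage
        storage -= q * dt    # remove the released volume
        outflow.append(q)
    return outflow, storage


# Invented rainfall pulse (e.g. mm per time step) and two hypothetical storage times.
rain = [0.0, 10.0, 20.0, 5.0, 0.0, 0.0, 0.0, 0.0]
q_fast, s_fast = linear_reservoir(rain, k=2.0)
q_slow, s_slow = linear_reservoir(rain, k=20.0)
```

With the longer storage time, the peak outflow is lower and more water is still stored at the end of the series, which is qualitatively the behaviour described for the simulations driven by the GH11 maps.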

Discussion
One of the main purposes of this study was to test how well automatic approaches can map small catchments. The most complex automatic DRP map, i.e. the one derived according to Schmocker-Fackel et al. (2007), proved to be most similar to the reference maps derived manually with Scherrer and Naef (2003), according to both the class comparison and the similarity measures. This result is not surprising, considering that the method of Schmocker-Fackel et al. (2007) was developed for the Canton of Zurich, where the two catchments of the present study are located. However, the method has also been successfully tested outside the Canton of Zurich (e.g. in the Swiss Prealps, Scherrer et al., 2013).
The DRP maps derived with simplified mapping approaches that included no soil information differed significantly in terms of both the extent and distribution of the DRPs from the reference maps. These differences are clearly linked to the quality of the input data. Geological maps are often not fine enough to depict geological formations and possible variations in permeability within the same formation. Furthermore, if the resolutions of the DTM and the land-use map are too coarse, significant biases may result. However, using input data with high resolution would not necessarily improve the results if the classification concept itself is too coarse and generic. Since topography does not seem to be a good proxy for the storage and infiltration capacity of the soils on the study sites, the approaches developed by Müller et al. (2009) and Gharari et al. (2011) often overestimated the runoff intensity on steep sites and underestimated it on flat sites. These approaches were developed for basins located in Rhineland-Palatinate (Germany) and in the Grand Duchy of Luxembourg, with different soil properties and event characteristics than those investigated for this study. However, the adaptation of these classifications to the characteristics of our study sites (e.g. by adding or removing input data and modifying the classification criteria accordingly) was beyond the scope of this study.

Figure 11. Modified NSE obtained by comparing the runoff simulated with the automatic DRP maps with that simulated with the reference maps, in the two study sites (simulation period 16 June 2014-15 August 2014). The boxplots show the medians and the interquartile ranges of the simulations driven by GH11 maps, while the labels on the abscissa show the model parameters whose values were defined based on the RTs.
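Figure 11 summarises the agreement between simulations as a modified NSE. The exact modification is not restated in this section, so the Python sketch below shows only the standard Nash-Sutcliffe efficiency as a point of reference, with the runoff simulated from the reference map playing the role usually taken by observations. The function name `nse` and the toy runoff series are assumptions.

```python
def nse(simulated, reference):
    """Standard Nash-Sutcliffe efficiency of `simulated` against `reference`."""
    if len(simulated) != len(reference) or not reference:
        raise ValueError("series must be non-empty and of equal length")
    mean_ref = sum(reference) / len(reference)
    # Sum of squared differences between the two simulations.
    residual = sum((s - r) ** 2 for s, r in zip(simulated, reference))
    # Sum of squared deviations of the reference series from its mean.
    variance = sum((r - mean_ref) ** 2 for r in reference)
    if variance == 0:
        raise ValueError("reference series has zero variance")
    return 1.0 - residual / variance


# Invented runoff series: reference-map simulation and automatic-map simulation.
q_ref = [1.0, 2.0, 4.0, 3.0, 2.0]
q_auto = [1.0, 2.5, 5.0, 3.0, 1.5]

score = nse(q_auto, q_ref)
```

A value of 1 indicates that the simulation driven by the automatic map reproduces the reference simulation exactly. Lower values indicate larger deviations relative to the variability of the reference series.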
The high MC scores obtained by certain pairs of different RTs (Fig. 9), as well as the visual inspection of the DRP maps, suggest that the perception of the intensity of DRPs varies among different authors. For example, the riparian zones on the reference maps were mostly mapped as RT2, but, where they were completely saturated and at least slightly sloped, they were mapped as RT1. In contrast, on MU09 maps and on SF07 maps, the riparian zone was mostly mapped as RT1. Similarly, areas prone to DP on GH11 maps fitted best with RT4 areas of the reference maps, which represent areas where strongly delayed SOF or SSF, but not DP, occur. Since a straightforward, standardised definition of DRPs is missing, not only do the classification criteria vary, but also the classes. This can be misleading, especially if different classes have the same DRP names. The MC-score ranking of the automatic mapping approaches is similar to the fuzzy kappa ranking, but the differences between the MC scores were not as significant as those between the fuzzy kappa values (Fig. 9). This is because the degree of association of the maps we compared is moderate. In this case, significant increases in the degree of overlap entail only small increases in the MC score (see Fig. 1 in Hargrove et al., 2006). This problem was also encountered by Speich et al. (2015). There is therefore a need for a goodness-of-fit score that can compare maps with different numbers of classes and still detect improvements when the degree of spatial overlap between the maps being compared is moderate.
To keep the rainfall-runoff model as simple as possible, strong assumptions had to be made. These included no interception, no evapotranspiration, and completely saturated catchments. A calibration against measured runoff would thus have been meaningless. However, recent studies suggest that using expert knowledge in selecting parameter values and introducing constraints can increase the performance of conceptual models even without traditional calibration (Bahremand, 2016; Gharari et al., 2014; Hrachowitz et al., 2014). Therefore, the choice of realistic parameter values according to Viviroli et al. (2009a) and the introduction of parameter constraints allow the simulation results to be regarded as plausible. The complexity of the model structure is usually linked to the complexity of the DRP-mapping approaches. Two research directions have recently received attention, one using expert knowledge mainly in the phase of DRP identification and the other using this knowledge in the modelling phase. Hellebrand et al. (2011) used expert knowledge to determine the spatial distribution of DRPs as realistically as possible, as they assumed that with a more realistic DRP classification the modules representing each DRP in the model could be simplified. Gharari et al. (2014), in contrast, adopted a relatively complex combination of modules and fluxes to compensate for the rather simple classification they used. They then used expert knowledge to constrain both the model fluxes and parameters, to force the model to work well for the right reason by neglecting the actual spatial localisation of the DRPs.
In this study, the same model structure and model constraints were applied to different DRP-mapping approaches. By doing so, it was possible to investigate the effects of a specific uncertainty source (i.e. the DRP maps) on the system output (i.e. the simulated runoff) while keeping the other uncertainty sources fixed.
As the results indicate, the simplified classification approaches mostly fail to represent the spatial localisation of the DRPs, which has a large impact on the simulated runoff. This finding suggests that investing more effort in the landscape classification could enhance runoff predictions in ungauged catchments by improving the model realism. This topic will be investigated further in future research by addressing the uncertainties linked to different input data, model structures, model parameters, and model constraints, as well as their interaction.

Conclusions
Mapping DRPs manually produces robust results, but is time-consuming. Several ways of mapping DRPs automatically have been developed. They differ in terms of how much input data they require for mapping, their classification criteria, and the number of output classes.
In this study, three approaches to mapping DRPs automatically were compared in two catchments on the Swiss Plateau to determine which one produces the most realistic results. The DRP maps derived automatically with the most complex and most data-demanding approach (Schmocker-Fackel et al., 2007) were most similar to the reference maps derived according to the manual approach based on Scherrer and Naef (2003), and resulted in the lowest deviations from them when used as input data for synthetic runoff simulations. The DRP maps produced using Müller et al.'s (2009) simplified mapping approach, which requires no soil information, and those produced using Gharari et al.'s (2011) topography-based approach, differed considerably and similarly from the reference maps in terms of DRP extent and distribution. The differences arose from the inaccuracy and the coarse resolution of the input data. The simplifying assumptions these two approaches require also limit their usefulness in automatically mapping small catchments.

Figure 5. Runoff generation module of PREVAH, adapted from Viviroli et al. (2009b). Parameters in blue are averaged for the whole catchment, while parameters in red are adapted stepwise to the RTs.

Figure 6. Percentage of total catchment area assigned to each runoff type in the Reppisch and Meilen catchments with the four different mapping approaches.

Figure 7. Distribution of the class deviations of the different automatic mapping approaches from the reference maps (circles refer to the Reppisch catchment and crosses to the Meilen catchment). The boxplots show median and interquartile ranges from the comparison between GH11 maps and the reclassified reference maps.

Figure 8. Deviation map between the MU09 map and the reference map. In the numbered areas the runoff contribution was either overestimated (red) or underestimated (blue).

Figure 9. Fuzzy kappa agreement scores (K_Fuzzy) and MC scores obtained by comparing the maps derived with automatic mapping approaches SF07, MU09, and GH11 with the reference (SN03) maps for the sub-catchments of the two study areas.

Figure 10. MC scores related to each RT obtained by comparing the maps derived with automatic mapping approaches SF07, MU09, and GH11 with the reference (SN03) maps for the two study sites.

Figure 12. Simulated runoff during the two heaviest rainfall events of the simulation period, obtained from the different DRP maps for the two study sites by varying the parameter values for each RT (simulation 4.1 of Table A2). The bands represent the minimum and maximum runoff values obtained with the different parameter combinations for the simulations driven by GH11 maps.

Figure A1. Sensitivity analysis of the threshold values for the HAND-based landscape classification on the whole Reppisch catchment. The level plot shows the percentage of deviation from the maximal MC score (0.2023) obtained by comparing GH11 maps with the reference maps.

Table 1. List of hydrological classifications based on DRPs, the data they require, and the number of output classes (A: automatic; M: manual).

Table A1. Reclassification of the reference maps for the class comparison with the GH11 maps.

Table A2. Parameter values used for the 16 runs of the synthetic runoff simulations. The simulation names are of the form "i.j", where i refers to the number of parameters defined based on the RTs and j refers to the different combinations.