Which spatial discretization for which distributed hydrological model

Distributed hydrological models are valuable tools to derive distributed estimation of water balance components or to study the impact of land-use or climate change on water resources and water quality. In these models, the choice of an appropriate spatial scale for the modelling units is a crucial issue. It is obviously linked to the available data and their scale, but not only. For a given catchment and a given data set, the "optimal" spatial discretization should be different according to the problem to be solved and the objectives of the modelling. Thus a flexible methodology is needed, especially for large catchments, to derive modelling units by performing suitable trade-off between available data, the dominant hydrological processes, their representation scale and the modelling objectives. In order to represent catchment heterogeneity efficiently according to the modelling goals, and the availability of the input data, we propose to use nested discretization, starting from a hierarchy of sub-catchments, linked by the river network topology. If consistent with the modelling objectives, the active hydrological processes and data availability, sub-catchment variability can be described using a finer nested discretization. The latter takes into account different geophysical factors such as topography, land-use, pedology, but also suitable hydrological discontinuities such as ditches, hedges, dams, etc. For small catchments, the landscape features such as agricultural fields, buildings, hedges, river reaches can be represented explicitly, as well as the water pathways between them. For larger catchments, such a representation is not feasible and simplification is necessary. For the sub-catchments discretization in these large catchments, we propose a flexible methodology based on the principles of landscape classification, using reference zones. These principles are independent from the catchment size. They allow to keep suitable features which are required in the catchment description in order to fulfil a specific modelling objective. The method leads to unstructured and homogeneous areas within the sub-catchments, which can be used as modelling units. It avoids map smoothing by suppressing the smallest units, the role of which can be very important in hydrology, and provides a confidence map (the distance map) for the classification. The confidence map can be used for further uncertainty analysis of modelling results. The final discretization remains consistent with the scale of input data and that of the source maps. We present an illustration of the method using available data from the upper Saone catchment (11 700 km2) in France. We compare the results with more traditional mapping approach, according to the landscape representation and input data scale.

As said above, the definition of an appropriate spatial scale for hydrological models will result from a trade-off between various considerations. Let's review the various items listed before.

What is the objective of the distributed hydrological modelling?
This question might seem trivial but is often not always well defined by the modellers. Refsgaard et al. (2005) retained it as one of the first item in the list of tasks they identified for performing a modelling study while respecting some quality insurance criteria. Examples of possible modelling objectives are: determination of the components of the water balance of a catchment, quantification of flood or draught risk, evaluation of mitigation solutions for the limitation of the river pollution at a given location, test 15 of functioning hypotheses, search for dominant hydrological processes. . . For each objective and a given catchment, the required spatial discretization should be different.

Which output variables are required and at which spatial and temporal scales?
The definition of the objective of the study leads to the definition of the output variables of interest, and of the spatial and temporal scales at which they are required. For water Introduction EGU outputs can be, integrated fluxes, maximum concentrations over a given time step, durations over which legal thresholds are exceeded. These outputs can be required at the annual, monthly, daily, hourly time scale; at distributed locations within sub-catchments or only at the outlet of the whole catchment. It is generally admitted that coarser time and space scales require coarser modelling units, but there is no clear rule to define 5 that appropriate scale.
2.3 What are the measured data and at which resolution are they available?
The measured data includes input forcing variables (rainfall and climate forcing), output variables for verification such as streamflow, soil moisture, groundwater levels, surface fluxes, and landscape descriptors such as land use, geology, elevation, which are used 10 for model parameter estimation and definition of the modelling units. When speaking about observations, Blöschl and Sivapalan (1995) distinguish between the extent of the data, i.e. the zone other which the data set is collected; the support of the observation, i.e. the spatial scale at which data are integrated and the spacing, i.e. the distance (in time and space) between different observations points. 15 In this discussion, we will focus on the spatial scale, but the temporal scale is obviously linked (e.g. Skoïen and Blöschl, 2006). For modelling purposes, all input data, parameters and the verification data are required over the modelling units. Unfortunately, there is often a mismatch between the observations scales and the modelling scales. The support of in-situ measurements is often punctual. Therefore spatial in- 20 terpolation using techniques such as kriging is required to derive the values over the modelling units. Streamflow data are directly integrated over catchment areas and are more consistent with the modelling scale, but the number of gauging stations is often much smaller than the number of modelled sub-catchments. New measurements techniques such as scintillometers provide a certain space averaging for sensible and/or 25 latent heat flux along transects (Green et al., 2001), but do not yet provide values over sub-catchments.
With the availability of remote sensing data (satellite, planes, drone etc), there was a 782 fundamental question "which resolution is needed for landscape description to represent hydrological processes?" remains an open question (Puech, 2002). Furthermore, remote sensing measurements are often not directly related to the hydrologic quantity of interest. The sensors often only sample the first cm of the continental surfaces, whereas information on deeper soil layers would be required. Multi-disciplinary re-10 search amongst hydrologists (and more generally with researchers in environment) and remote sensing specialists is still needed to progress on these questions. There is also a paradox in this progress of remote sensing: whereas the continental surface can be described with more and more accuracy, even at the scale of a building, the knowledge of the sub-soil is not progressing so rapidly. For example, there is a 15 lack of knowledge on soil properties. The data support is local, spacing and extend are limited due to the difficulty and cost of soil sampling. To derive reliable maps at larger scales, hypotheses about soil organisation, forming factors are required. In pedology sophisticated classification techniques using geostatistics or fuzzy rules are developed for mapping soil units, but the result remain uncertain (e.g. Burrough et al., 1997;La-20 gacherie et al., 1997La-20 gacherie et al., , 2001. Lots of information is available in soil databases. But they often provide rather descriptive information on the soils. Their content is often found disappointing, if not useless by hydrologists who are looking for transfer coefficients. The initiative of Lin et al. (2006) to promote hydropedology as a synergetic discipline between hydrologists and pedologists is promising but will require some years to be 25 fruitful. Progress are also expected on soil characterization with the use of geophysical techniques but their use is still limited and they cannot be deployed routinely yet. Therefore, for some times on, we will have to cope with the paucity of information related to the soil and sub-soil, when performing hydrological modelling. It should not EGU be forgotten when combining these data with the detailed data derived from remote sensing of continental surfaces. From this analysis of data in hydrology, it is obvious that the information is often not available with the desired accuracy. Therefore, hydrologists should perhaps change their way of thinking: instead of looking for data for a given model, they should perhaps 5 try to find the relevant formulation of hydrological processes for given available data in order to avoid overparameterization and uncertainty.
2.4 What are the active/dominant hydrological processes on this catchment and what are their functional scales?
A lot of former distributed hydrological models were based on Hortonian scheme for 10 the runoff. In such models, the catchment was subdivided into so-called isochrones surfaces. The hydrograph separation with isotopic techniques showed later that this mechanism was uncommon at a flood event scale (Crouzet, Hubert et al., 1970, cited by Gineste, 1997. Frequently, the hydrograph is mainly composed with the waters present in the soil before the rainy event (Grésillon, 1994). This convinces hydrologists 15 of the complexity across scale of flow generation processes inside a catchment. The question of process scale in hydrology has been review by Blöschl and Sivapalan (1995) who proposed a diagram of characteristic spatial and temporal scales for the various processes in a log-log plot (Blöschl and Sivapalan, 1995, Fig. 2). Lots of research has been dedicated to the determination of these characteristic scales. One of 20 the recent examples is provide by Skoïen et al. (2003) and Skoïen and Blöschl (2006) who analysed rainfall, streamflow, groundwater level and soil moisture records from Austria and Australia, using geostatistical tools. They were able to determine characteristic time and scales for rainfall and runoff of respectively one day and one month, while showing that groundwater levels were not stationary. In space, they found that 25 rainfall was almost fractal without characteristic scales whereas runoff appeared nonstationary but not fractal. This data analysis provided data evidence of the consistency of the Blöschl and Sivapalan (1995) scale diagram. The latter provides guidance for the 784 with increasing catchment scale for various hydrologic and climatic contexts (e.g., Bergkamp, 1998;Braud et al., 2001;Cerdan et al., 2004). A downward approach of model complexity (Klemes, 1983), based on data analysis can help in the formulation of a conceptual model of the rainfall-runoff relationship, leading to a parsimonious model using parameters derived from available data. This concept has been recently 10 applied by Jothityangkoon et al. (2001) and Eder et al. (2003) for a semi-arid and an alpine catchment respectively. They propose models of the rainfall-runoff relationships at the annual, monthly and daily time scale, by progressively increasing the model complexity until a good reproduction of the data behaviour was obtained. The two cases studies show that, according to climate and catchment characteristics, very different 15 models can emerge, with different dominant hydrological processes in both cases. Macropores, preferential flow, re-infiltration, variability of land cover, the influence of micro-topography leading to the concentration of runoff into small channels, are some of the factors being able to explain threshold effects and the observed differences in dominant hydrological processes across scales (e.g. Bergkamp, 1998;Lehmann et al., 20 2006;Sidle, 2006;Zehe et al., 2005). Several concepts have been proposed to describe and explain the variability of landscape characteristics such as organization, hierarchy or fractal behaviour, leading to the definition of various characteristic scales (e.g. Vogel and Roth, 2003;Lin et al., 2006). However, there is still a long way to go before a clear picture can be drawn.
Nevertheless, when defining the hydrological modelling units, this multiplicity of scales must be acknowledged and, as much as possible, taken into account. Introduction The representation of a process within a model implies the choice of two complementary elements. The first one is the resolution of the model (the spatial scale we are discussing in this paper) and the process conceptualisation. If we borrow the vocabulary from the atmospherical community, the choice of the resolution of the model will 5 allow the separation between the processes which are represented explicitly (i.e. for which a prognostic variable with an evolution equation is defined) and the sub-grid scale processes which will be parameterised 1 (i.e. for which no evolution equation is solved and for which simplified representations are adopted and added to the prognostic equations). According to the resolution of the model, some processes can be 10 represented explicitly or parameterised. By adopting the same vocabulary in hydrology we would avoid the quarrel on the nature of conceptual or physically based modelling. A process would be conceptual (or parameterised) at one resolution and physically based (explicitly resolved) at another resolution. Plot scale studies allow the derivation of physical equations, extensively used in 15 hydrology such as the Richards equation for saturated/non-saturated water flow, the Boussinesq equation for 2-D groundwater flow, the Saint-Venant equation for river or overland flow. At this scale, several parameters in these equations such as retention curve, hydraulic conductivity or surface roughness can be estimated from measurements. When moving to larger scales, it is often assumed that the form of the equa-20 tion remains valid. Then it is necessary to derive so-called effective parameters at those scales (e.g. Blöschl and Sivapalan, 1995). For this purpose the aggregationdisaggregation modelling (Blöschl and Sivapalan, 1995) approach to identify the functional relationship at a larger scale from results at smaller scales can be used (see an example in Viney and Sivapalan, 2004). 25 1 In this context, the word "parameterization" should not be confounded with the estimation of parameters for which it is often used in hydrology. It would be equivalent to conceptualisation in the hydrology jargon.

786
EGU Several distributed hydrological models are based on a regular mesh over which point scale laws are extended and where effective values of the parameters must be determined. Examples are MIKE-SHE (Abbott et al., 1986a, b), ECOMAG (Motovilov et al., 1999), TOPKAPI (Ciarapica and Todini, 2002). Some authors contest this approach, referred to as a "reductionist" approach (e.g. Gottschalk et al., 2001), arguing 5 that the equation becomes a parameterisation of the process, since parameters cannot be estimated from field measurements (e.g. Beven, 2002b). The choice of the grid size is not always rationalised taking into account the processes that are represented, but seems rather the result of commodity and data resolution. One exception can be found in Beldring et al. (1999) and Motovilov et al. (1999) using the ECOMAG model in the 10 framework of the NOPEX project. They determined the size of the mesh from analysis of averaging properties of point groundwater and soil moisture measurements obtained using a dedicated sampling strategy with nested spacing. Of course, data required for such a study are seldom available.
One of the main criticisms about square elements is their poor handling of hetero-15 geneity because continental surface is not organised in pixels. The task of parameter estimation is therefore more difficult. To overcome the problem, some authors proposed different approaches such as meshes based on iso-contours of elevation in THALES (Grayson et al., 1992a) or TOPOG (Vertessy et al., 1994); and more recently meshes based on Triangular Irregular Networks (TINs) (Ivanov et al., 2004a;Vivoni 20 et al., 2004). The latter offer a good compromise between efficiency and accuracy as shown by the performances of the tRIBS model, developed on this irregular geometry (Ivanov et al., 2004a, b). Other authors tried to define more "hydrological" modelling units. Contributive zones are based on the concept of hydrologic similarity, and can be defined using for instance the topographic index of Beven and Kirkby (1979). Within 25 these areas, it is assumed that the catchment response is similar. The concept is used in TOPMODEL (Beven and Kirby, 1992). In order to represent land-use heterogeneity, some authors have introduced the concept of Hydrological Response Units (HRUs) (e.g., Flügel, 1995) al., 2005). HRUs represent a sub-catchment scale discretization composed of a unique combination of land cover, soil and land management. One of the drawbacks is that the HRUs mapping induces merging of smaller units into larger ones by applying smoothing filters. From a hydrological point of view, it may result in a loss of information, as some major hydrological processes can be localised on very small units. Illustrations 5 are re-infiltration of runoff at the bottom of hillslopes in the Sahel (Seguis et al., 2002), runoff decrease due to hedge networks (Viaud et al., 2005). Another example of hydrological spatial discretization is the concept of Representative Elementary Area (REA) proposed by Wood et al. (1988), looking for characteristic spatial scales, beyond which the geographical locations of features could be neglected and the distribution taken into account using statistical distributions. Fan and Bras (1995) questioned the universality of the concept, especially because flow routines and the hierarchical structure of the river network were not taken into account in the analysis. This drawback is overcome with the concept of Representative Elementary Watershed (REWs) proposed by Reggiani et al. (1998Reggiani et al. ( , 1999Reggiani et al. ( , 2000 and extended by 15 Tian et al. (2006). REWs form the elementary modelling units divided into several zones corresponding to the various hydrological processes. Global mass, momentum and energy balance laws are formulated at the sub-catchment scale. The corresponding equations remain unchanged whatever the scale (e.g. for REWs defined at various Strahler order). On the other hand fluxes between REWs and their zones (saturated, 20 non-saturated, overland, concentrated and river flow) must be defined for each scale. Sub-catchment scale variability can be parameterised in the derivation of these fluxes. The strength of the approach is therefore to translate the general problem of model formulation into the problem of derivation of closure relationships at various scales (Reggiani and Schellekens, 2003). Lee et al. (2005), Reggiani and Rientjes (2005)  The picture drawn from the review of the various items above might result quite confusing. Contrarily to atmospheric sciences, the definition of a unique scale, separating processes being represented explicitly from those that must be parameterised, is dif-5 ficult due to the hierarchical nature of the river network and the landscape complexity across scale. Furthermore characteristic scales are different for the various processes. Therefore we propose to adopt an approach, as quoted by Leavesley et al. (2002), requiring that, for a given catchment, the question should not be "which model is most appropriate for a specific set of criteria?" but "what combination of process conceptualisations is most appropriate?". This approach is consistent with the downward approach mentioned previously and the recognition of the "uniqueness of place" as stated by Beven (2003). It also allows building a specific model for a specific objective, taking into account the availability of data. This pleads for the use of multi-scale hydrological framework, where the processes are develop as independent components, using the 15 facilities provided by Object Oriented Modelling and, if possible with their characteristic time and space scales. They are then coupled through adequate tools provided by the modelling environment (for recent reviews about environmental computing frameworks, see for instance Argent, 2004;Krause et al., 2005). In order to represent landscape heterogeneity efficiently according to the mod-20 elling goals, we propose a flexible methodology for catchment meshing based on the REW discretization. It uses nested discretizations, starting from a hierarchy of subcatchments, linked by the river network topology. If consistent with the modelling objectives, the active hydrological processes and data availability, sub-catchment variability can be refined using finer discretizations ( EGU hedges, etc., in order to represent sub-catchment variability. These RECs can also be discretized to take into account the vertical structure of soil profiles (Haverkamp et al., 2004). Conservation laws can be solved on the obtained elementary volumes, with various degrees of complexity. At a fine resolution, we can end up with standard partial differential equations. The conservation laws can be solved using finite volumes 5 methods, which are consistent with the exchange fluxes approach. At larger scales, simplified representations can be derived.

EGU
Practically, for small catchments, the landscape objects such as agricultural fields, buildings, hedges, river reaches can be represented explicitly, as well as the water pathways between them (e.g., Moussa et al., 2002;Carluer andDe Marsily, 2004, Branger, 2007). For larger catchments, such detailed representation is not feasible and simplifications are necessary. The discretization methodology should remain flexible enough to fit into the modular concepts exposed above and as objective as possible concerning the consistency of scales and the simplifications of the patterns. As a practical solution, we suggest to extend the principle of landscape classification, used is soil mapping for 15 the definition of "hydro-landscapes" proposed by Winter (2001). It allows defining different levels of complexity in landscape representation that can be associated with different levels of complexity in hydrological processes description. The factors used in the classification can be adapted to the dominant hydrological processes and the size of the final units remains consistent with the resolution of the available data. The 20 methodology for hydro-landscape classification is described in Sect. 3 and an illustration on the upper Saône catchment (11 700

Discretization of a catchment using landscape classification technique
We borrowed the principles of our landscape classification from those used in soil mapping (Burrough et al., 1997) and more specifically from the method of Robbez-Masson (1994, 1995; Robbez-Masson et al. (1999); Lagacherie et al. (2001). The latter is based on the definition of reference zones and an analysis of the neighbourhood com-5 position at each location.

Hydro-landscape definition and classification requirements
Hydro-landscapes can be defined as areas where hydrological processes can be considered as homogeneous. Their delineation can take into account what will be referred to as factors below, influencing hydrological processes, e.g. slope, land use, geology, 10 pedology, etc. Hydro-landscapes can be considered as an extension of the HRU concept, as we added the following requirements for the delineation methodology. First, the method should avoid the quite arbitrarily procedures consisting in removing small areas from the map resulting from the overlay of the factors maps. Second, the method should al- 15 low better control on the errors arising from the overlay of maps at different resolutions, in an objective and quantifiable way.

Principles of hydro-landscape delineation
The different steps of the method are the following: 4. Definition of the p factors expected to be influential on these hydrological processes, according to the chosen representation, 5. Simplification of each p factor map into a number n p of classes. Their combination 5 using GIS leads to a multivariate map of the Πn p combined factors. One class is therefore a unique combination of the p factors (Fig. 1); 6. Definition of the reference zones on the study catchment. They are areas representing a unique combination of the retained factors. It may be a specific area of interest. These reference zones are characterized using the combining factor 10 map and the neighbourhood descriptor defined above.
7. Choice of a neighbouring window (size and shape) and a descriptor of the composition of the combined factor within this neighbourhood; 8. All points in the landscape are characterized by their neighbourhood window composition. 15 9. Mapping of the whole catchment using a pixel-by-pixel analysis, i.e. an allocation of each pixel to one of the reference zones according to a distance criterion between the descriptor of its neighboring window composition and that of the reference zones; 10. Estimation the distance map; 20 11. Iteration from steps 7 to 10 until a resolution ensuring a given accuracy and consistency with the input data is obtained.
We will present an illustration of the whole methodology in section 4 and will now detail steps 4 to 10.

792
of several interacting factors (climate, geology, slope land use parameters) which are used to define soil-landscapes. In ecology, eco-regions are defined as land and water extends including distinct natural community. For hydrology, the factors that will be retained in the analysis will depend on the modelled hydrological processes. To model infiltration, factors such as soil surface characteristics, soil types, management practices will be influential. To simulate runoff, we can consider topography, topographical index, and soil surface characteristics. For evapotranspiration modelling, land-use, orientation, groundwater levels (geology), snow melt (topography, orientation) can be taken into account. Once all needed factors are identify, their corresponding layers/maps superposition using GIS gives a composed picture 15 of the landscape. This composed picture defines various combinations of the landscape factors characterizing the spatial organization of water dynamic in the catchment.

Definition of references zones (step 6)
The notion of reference zone has been borrowed from soil classification where Favrot 20 (1989) proposed the use of reference sectors for small natural region characterisation. Reference zones are extensively surveyed areas that are supposed to contain all the soil classes of the region to be mapped. For the final cartography, one unique soil type characterized each reference sector. The derivation of accurate classifications and the quantification of uncertainties has led to lots of research using techniques such 25 as geostatistics, fuzzy sets, conditional probabilities (e.g. Burrough et al., 1997), the discussion of which is beyond the scope of the paper. We propose to extend the notion EGU of reference area for hydrological purposes. We define a reference zone for hydrology as an area with a certain degree of homogeneity relevant in the characterisation of hydrological processes. It may correspond to a unique combination of factors or to a dominant combination of factors. For a modelling of evapotranspiration and runoff such combination can be "deciduous forests over steep slopes" or "cultures over moderate 5 slopes". For modelling saturated zones and their influence on runoff, they can be "high topographic index" etc. Man influence can be included through management practices: "drained area", "urban area", etc. References zone can also be a particular natural region where water dynamics are specific (e.g. karstic areas, vineyard). There is no limitation in the number of reference zones considered in the analysis. Therefore, the 10 modeller has a lot of flexibility in the mapping of the landscape heterogeneity (complex/simple description), according to the data available, their scale and its objectives.
If there is a good knowledge of the catchment, the reference zones can be quite easily defined and delineated using the a priori knowledge of field hydrologists. The latter can know the location of areas prone to saturation, or with high slopes, etc. For larger 15 catchments or catchments where it is not possible to perform intensive field surveys, the task is more difficult. In this case, only the factors maps can provide the available information. These maps can be used to define possible reference zones, according to traditional/standard knowledge about hydrological processes. In this case, a simplified classification of factors can be used and the reference zones can be defined using 20 a statistical and spatial analysis of the multivariate map (see details in Sect. 4). The relevance of such delineation can then be confirmed by a specific field survey.

Mapping of modelling units (steps 7 to 10)
Neighbourhood definition and characterization of all point in the catchment (step 6 and 7) 25 "Landscape is what is around you" (Robbez-Masson, Foltête et al., 1999). With this sentence these authors argue that the integration of spatial neighbourhood is neces-794 EGU sary to describe the landscape. With this mapping approach, each point in the landscape is characterized by the composition of a neighbourhood window around the point (contextual analysis). The modeller must define the size and the shape of a neighbourhood window (e.g. ellipsis neighbourhood window on Fig. 1). The latter determines the resolution of the final units. All points in the catchment are characterized using the 5 composition of their neighbourhood window.
The characterization of each point is performed (inside the neighbourhood window) using a descriptor on the multivariate image (in pixels). The descriptor may be a histogram, a mean or a standard deviation. This descriptor is calculated for each of the points of the multivariate image. In the example provided in the left of Fig. 2, each 10 colour of the pixels characterizes a specific combination of factors and we chose the histogram as a descriptor. To derive it we consider all the pixels inside the neighbourhood window (here a square) and count the number of each combination class (i.e. each colour) in this window. This histogram is constructed for each point of the map. 15 The choice of the shape of the window allows taking into account some anisotropy in the catchment. For instance ellipsis neighbourhood window with the major axe oriented north south can be chosen if specific factor (e.g. topography) has such an orientation.
References zones characterization and mapping procedure (steps 8 to 10) Similarly to each point of the map, the references zones are characterized by their 20 composition, using the same descriptor, according to the combination factor map (right of the Fig. 2).
The mapping consists in the assignment of all points in the landscape to the most similar reference zone in a statistical sense. Similarity is defined as the minimization of the distance between the descriptor of each characterized point and those of the 25 references zones. The distance may be the modal distance, Kolmogorov distance or Manhattan distance (Robbez-Masson, 1994). See the illustration of the principles in Fig. 2

EGU
The result of the mapping is a first map representing a segmentation of the initial multivariate image into elementary landscape units or polygons. The resulting map is thus composed of homogeneous and unstructured areas. A second image is obtained and can be considered as a confidence map for the classification. It quantifies the reliability of the modelling units by providing a criterion of the statistical quality of the classifica-5 tion. If the confidence map is not satisfactory, the classification can be improved by adding more reference zones to get a better representation of the landscape.
Scale and accuracy assessment (step 11) It is possible to get different units size for the final landscape map, by using different size for the neighbourhood windows. This size must be chosen in consistency with the 10 scale of the input data. The size of the smallest units on the classified map cannot be lower than the finest units on the input maps. An iterative procedure is therefore needed to define a neighbourhood window size, consistent with this first constraint. This iterative procedure will consist in testing several sizes for the neighbourhood window until the constraint is fulfilled (iteration of steps 6 to 10). This ensures consistency 15 of the modelling units with the input data scales.
Once the "optimal" size of the neighbourhood windows is chosen, the classification can be improved using the distance map. The latter provides an idea of the accuracy of the classification. If the distance is large, it means that the similarity of the neighbourhood with the available reference zones is poor. Therefore, new references zones can 20 be added in the areas with larger distances and used to improve the mapping (iteration of steps 6, 8 to 11). This will reduce uncertainties on the landscape heterogeneity representation and handling.

Illustration of the methodology using data from the upper-Saône catchment in France
In this section, we illustrate the methodology outlined above using data from the upper-Saône catchment upstream the gauging station Lechatelet (11 700 km 2 ). It is located in north-eastern France (Fig. 3). The river rises in the Vosges mountains in Lorraine, 5 and flows south through Burgundy. The elevation ranges from 177 a.m.s.l. at the outlet to more than 1215 a.m.s.l. in the Vosges. The mean annual precipitation varies from 790 mm close to the catchment outlet to 2440 mm in the Vosges. Snowfall constitutes a minor part of the annual precipitation. The coldest month is January (-0.4-2.6 • C) whereas the warmest is July (16.1-21.0 • C). The average annual potential Evaporation 10 calculated by the Pennman equation varies from 859 mm in the lowest part to 717 mm in the highest part. The highest values appear in July (125-157 mm) and the lowest in January (19-26 mm). This climate results in a runoff regime called Pluvial C regime in the Pardé (1955) classification (Sauquet et al., 2000). This is a regime where the runoff is rain-fed with winter high water and low water in summer due to high evapo-15 transpiration losses. The mean annual runoff varies from 323 mm in the lowest parts, to more 1451 mm in the Vosges. The main land-use classes are broad-leaved forests, arable land, and pastures. The arable land is mainly located in the south-western parts, whereas the pastures are located in the north-eastern parts. The forests are spread all over the catchment, and in the Vosges the coniferous forest is important. Urban sur-20 faces constitute about 2% of the catchment area with the largest concentration around the town of Dijon. The geology is characterised by limestones in the southern parts and sandstones and granites in the northern parts. We first present the available data and their resolution. Then, we illustrate the nested discretization procedure and all the steps outlined in Sect. 3. We assume that the mod- 25 elling objective is to derive over the whole catchment, the water balance components at different temporal scale (e.g. annual, monthly and daily scales). The results of the classification are compared with traditional procedures in a fourth part. areas cover a vide spectrum ranging from 52 to 11 700 km 2 .

HESSD
-A digital elevation model (DEM) with resolution 200 m, 100 m and 1000 m from the IGN (Institut Géographique National) in France and provided by Water Agency (Agence de l'eau).
-The Corine Land Cover database provided by Institut Français de 15 l'ENvironnement (IFEN) with a 500 m resolution. The database contains 44 land cover classes organized in three levels (Bossard et al., 2000).
-A soil map from the Soil information system of France from National Institute of Agronomic Research (INRA) with a 1/100 000 resolution (Jamagne et al., 1995). For about one-third of the catchment, another soil database at 1/250 000 resolu- EGU -A referential of groundwater systems from SANDRE specification (Secrétariat d'Administration Nationale des Données Relativesà l'Eau) and available at the web site http://sandre.eaufrance.fr/.
In reference with the discussion about data resolution in Sect. 1, we can underline the large heterogeneity in the input maps. Furthermore, meteorological data provide the 5 coarser information, both in space and in time. These data are not used in the definition of the hydro-landscapes units, but the information should be taken into account in the choice of the processes representation, then in the final discretization. The situation for soil data illustrates one of the difficulties when modelling a catchment in France: the sources of data, especially for soils, are not homogeneous.

Determination of the modelling units for a distributed water balance components derivation
In the steps presented in Sect. 3, once the modelling objectives have been defined (step 1) and the data source identified (step 2), the next step is the choice of the processes which will be considered and their representation within the model. Given the 15 time steps and resolutions of the input data, a first step in the modelling is obviously to consider a first discretization at the sub-catchment scale (Sect. 4.2.1) using simple representations such as those presented by Eder et al. (2003). Within a sub-catchment, a large degree of heterogeneity, especially in land-use might remain. Let's suppose that it must be taken into account for an accurate simulation at the chosen scale. It 20 has of course to be proven through simulation, but it is beyond the scope of the paper, where we only wish to illustrate our methodology for landscape discretization, using the available data of the upper-Saône catchment.

A first discretization into REWs
The first discretization level was that of the sub-basin (REWs EGU to extract hydrological information from DEM (Peuker and Douglas, 1975;Martz and Garbrecht, 1992). We used the Tardem algorithm (Tarboton, 1997) to derive subbasin from DEM and a predefined digital river network. This algorithm first performs a detection and treatment of depression zones in the DEM. Then the local direction of out-flows on every cell is calculated using the D8 algorithm. The contributive surfaces 5 are determined for each cell in terms of surface drained through each cell. The subbasins drained by every link of the network are delimited using a threshold for Strahler order of the river links. Figures 4a and b show a discretization of the Saone catchment in sub-catchment using the first and second Strahler orders. The number of REWs is 341 and 81 for the first and second order respectively, with average surface of 35 and 10 147 km 2 respectively.

Sub-catchment scale discretization using the landscape classification Factors definition
We assumed that the modelling objective was the distributed determination of the components of the water balance at various time scales. One of these components is 15 evapotranspiration and we considered land-use as one of its controlling factors. We assumed the partition of incoming rainfall between runoff and infiltration was controlled by lithology and slope. The use of the topographic index of Beven and Kirkby (1979) was also a possible choice, as well as the direct use of soil-landscape units. In the example described below, we therefore considered three factors: lithology, slope and 20 land-use. The next step was to simplify the information through the definition of classes for continuous data such as slope. In the case of the Saône, we reclassified the original 26 classes of the Corine Land cover map (resolution: 500 m), present in the catchment, into nine classes. This classification was lead by the differentiation in processes representation induced by those land use classes. In the same way we simplified the is composed of 221 classes of combined factors (amongst the 9×5×7=315 possible classes) and formed the basis for the hydro-landscape mapping procedure. Note in Fig. 5d that within a sub-catchment, the variability of these factors is large, which is one argument for considering this second level of discretization.
Definition of the references zones 10 The definition of reference zones may be the most difficult step of this approach. In the ideal case, reference zones should result from a good knowledge on the catchment. Usually, as for the upper-Saône catchment, this is not the case. Therefore, we chose to define them using a statistical and spatial analysis of a simplified combination of factors map. This map was derived using the 9 classes for land-use but only 2 classes 15 for slopes (low and high) and 4 classes for lithology (see Table 1). They were used to define 46 types of reference zones (Table 2). Their location is shown in Fig. 6 and was defined by insuring the representativeness of each reference zone in all areas other the catchment. As much as possible, we chose areas with a certain degree of homogeneity. However, they still keep a histogram of composition for the original 221 20 combinations of factors.

Mapping procedure
For the mapping procedure, we used an available software named CLAPAS developed by Robbez-Masson (1994)  EGU histogram, matrix of co-occurrence. The software proposes several choices of the neighbourhood shape: square, ellipse, circle, ring etc. The squared neighbourhood window is usually used. The choice of the other shapes depends on specific thematic purposes. We used a square, as particular anisotropy handling was not needed. We chose to use the histogram of composition as a descriptor of the neighbourhood and 5 the Manhattan distance to compute the similarity with the composition of the reference zones. The choice of the Manhattan distance is justified since it is more robust than the other distances (Kolmogorov, Cramer, etc.) used to perform similarity between vectors (Robbez-Masson, 1994). Each point (X) within the catchment was characterized by a specific histogram of the 10 possible combined factors (p=221) within the neighbourhood window, denoted by M X = (d1, d2,. . . , dp). The histogram calculation was also performed for each k=46 references zones. For a reference zone j , the histogram is noted M j = (m1j, m2j,. . . ,mpj). Affectation of a point X to a reference zone j consists in minimizing the Manhattan distance d(M x , M j ) calaculated by Eq. (1); see Fig. 2: For the modelling units scale assessment, we varied the neighbourhood windows size, iteratively from 3 km down to one kilometer.
Resulting hydro-landscapes and distance maps Figure 7 shows the maps of the hydro-landscapes and their corresponding distance 20 maps for three sizes of the neighbourhood window: 3, 2.2 and 1.4 km. Figure 8 shows the distribution of the surface of the mapped units and Table 3 provides the corresponding statistics.
In the first iteration, we used a neighbourhood windows resolution of 3 km. The resulting map includes homogenous areas with an average surface of 16 km 2 (standard 25

802
EGU deviation of 99 km 2 ). Fifty percents of the total area was covered with units having a surface lower than 218 km 2 (top of the Fig. 8). Less than one percent of the total area was covered with modelling unit having a surface lower than 1 km 2 . The corresponding confidence map for this iteration showed larger areas with an unsatisfactory classification (areas in blue on Fig. 7 top right, situated mainly in the Vosges mountainous area).

5
In a second iteration, we tried to refine the mapping by using a smaller neighbourhood window of 2.2 km. The average surface of map units dropped to about 9 km 2 (standard deviation of 70 km 2 ). Fifty percents of the total area was covered with units having a surface lower than 122 km 2 (middle of the Fig. 8). Less than one percent of the total area was covered with modelling unit having an area lower than 1 km 2 . The confidence 10 map showed some improvement in the classification (Less areas in blue on the Fig. 7 middle right). In a third step, we tested a window's size of 1.4 km. The map units had an average surface of about 5 km 2 (standard deviation of 34 km 2 ). Fifty percents of the total area was covered with units having a surface lower than 67 km 2 (bottom of the Fig. 8). In the latter case, the map units had smaller size than above, three percents of 15 the total area was covered with modelling unit having a surface lower than 1 km 2 . The confidence map showed improvements as compared to the previous iterations.
Choice of the mapping that fits well with the input data The use of different sizes for the neighbourhood windows provided modelling units at different scales. The statistical analysis of the surface distribution provided a way 20 to assess the consistency of the results with the scale of the input data. The finest scale of the input data was 1/250 000 and the coarser scale was 1/1 000 000. The final modelling unit could not be more accurate than the finest scale. For a scale of the classified map of 1/250 000, the rule of "quart" states that the modelling units must have a surface larger than 1/4th of 2.5×2.5 km 2 , i.e. 1.6 km 2 . For a scale of the classified 25 map of 1/1 000 000, the units should be larger than 1/4th of 10×10 km 2 , i.e. 6.4 km 2 .
The percentage of the cumulated surfaces occupied by the units with surface lower EGU than these two thresholds are provided in Table 3. A trade off is needed to choose the most appropriate map. The first one (top of Fig. 7) with an average surface of 16 km 2 seemed too coarse as compared to the scale of the input data. In the third map (bottom of Fig. 7) the average surface of 5 km 2 is consistent with a scale of 1/250 000 for the classified map. However this feeling of 5 accuracy may be misleading because 18% of the total surface was composed of units with a surface smaller than 6.4 km 2 and about 6% of the total surface was composed of units with surface smaller than 1.6 km 2 . The better compromise may be to choose the second map (middle, Fig. 7) with an average surface around 9 km 2 . More than 90% of the total surface was covered with landscape units having a surface larger 10 than 6.4 km 2 and more than 97% of the total surface was covered with landscape units having a surface larger than 1.6 km 2 . Thus, the iterative procedure of this classification approach provided an efficient tool to assess the mapping scale according to the input data scale. Furthermore, the distance map provided an idea of the uncertainties on the modelling units representation, and could be used for uncertainty analysis. However, 15 the input rainfall was available on square grids of 64 km 2 . Therefore, if we consider this data, not taken into account in the classification procedure, the first map resolution could be more consistent with that of the rainfall input data for hydrological modelling.
Other modelling objectives and/or other factors map resolution would produce different maps. The presented technique allows a great flexibility in landscape heterogeneity 20 representation. It may insure a relevant representation of heterogeneity according to the available data and modelling objectives.

Comparison with the usual mapping techniques for catchment heterogeneity representation
In this section, we present the results of two traditional mapping techniques to repre-25 sent the catchment heterogeneity and compare them with those of the classification methodology presented before.

804
Printer-friendly Version

EGU
Mapping results with the basic smoothing techniques Smoothing techniques are used to simplify the multivariate factors maps, based on the area of the map units. Usually, the basic technique (we used the ArcView smoothing function) consists in removing areas smaller than those representing the scale of the input data. We used this approach by using areas threshold criteria of 6.4 km 2 and 5 1.6 km 2 corresponding respectively to the two scales chosen for the final landscape map, as defined before. We also performed the mapping by removing the areas lower than 1 km 2 . The corresponding maps are shown in Fig. 8 and the statistics of the surfaces of the mapped units can be found in Table 4. When areas with surfaces smaller than 6.4 km 2 were removed, the average surface 10 of the map units was of about 255 km 2 . Less than 0.04% of the total surface was mapped with units having a surface lower than 6.4 km 2 . The mapped units were very coarse and the picture was not very satisfactory. When removing areas lower than 1.6 km 2 , the average surface of the units was of about 32 km 2 and about 1% of the total surface was covered with units having an area lower than 6.4 km 2 . When the 15 areas lower than 1 km 2 were removed, the result seemed visually better than the two preceding maps. The average surface was about 14.6 km 2 and less than 6% of the total surface was covered with map units having an area lower than 6.4 km 2 . The statistics of this map were very close to that of the first map derived from the landscape classification (first line in Table 3).

20
Mapping by re-sampling and smoothing technique The final map of hydro-landscapes can be obtained by re-sampling the original multivariate map of the combination of factor in order to get a smoother image. For this re-sampling (we used the resampling function of ArcView), only the desired final factors were considered (in our case the 46 factors retained for the reference zones). Then  Table 4. When removing areas smaller than 6.4 km 2 , the average surface of the map units was about 64 km 2 and less than 8% of the total surface was mapped with units having a surface lower than 6.4 km 2 . When removing areas smaller than 1.6 km 2 , the average surface of the map units was about 15 km 2 and less than 1% of the total surface was 5 mapped with units having a surface lower than 1.6 km 2 .

Comments
In Table 4, we represent the percentages of the 46 initial reference zone encountered in the final map. We observe a large difference between the three mapping technique.
The first basic smoothing technique does not allow keeping in the final map a reliable 10 representation of the landscape details. More precisely, the urban zone, wine, agricultural and alluvial areas do not correspond to those present in the input data. When we decreased the minimum area, it was difficult to control the features in the landscape relevant for hydrological modelling, because small units were removed automatically.
In any case, with usual mapping techniques, only five to twenty percents of the 46 15 reference landscapes of Table 2 were represented in the final map (Table 4). When using re-sampling techniques, the result seemed visually better. But only 28 to 50% of the 46 reference landscapes were represented in the final map.
In the case of landscape classification, between 63 and 90% of the 46 reference landscapes types were represented in the final map (and more when new references 20 were added). This technique provides then better result than the other ones in term of keeping suitable features in the representation of the input data.

Discussion
The usual mapping techniques, because they use a threshold on area, are not efficient in keeping in the landscape suitable features consistent with the available input data 25 and represented hydrological processes. They do not offer any mechanism for map 806 EGU unit scale handling. In any case, there is no quantification of uncertainties of the map units when these methods are used.
The methodology presented in this paper for the derivation of "homogeneous" hydrological modelling units is a first attempt to derive something applicable in practice and allows keeping suitable details in the landscape. It also allows assessing uncertainties 5 on each map units by providing a confidence map. Finally, it allows a large degree of flexibility in the landscape heterogeneity representation, while constraining the hydrologist to formalise the processes he wants to retain and represent within the modelling units.
Nevertheless, we recognise that the method can be improved in many ways for the 10 definition of the classes used in the definition of the combination of factors maps, the reference zones, the choice of the descriptor, the distance used for the classification.
A complete sensitivity analysis of the consequences of these choices on the final map should be performed. But this is beyond the scope of the paper where our purpose was to illustrate the methodology outlined in Sect. 3 and initiate discussion about it. 15 This paper does not demonstrate either that the proposed approach is "better" than the traditional use of grid squares or smoothing approach when computing the hydrology. An answer by yes or not is probably not possible because verification data will often be insufficient to discriminate and assess the relevance of the various maps. Nevertheless we think that it can be useful in many ways. By performing sensitivity analysis, it 20 should be possible to get an idea of the refinement below which results -for a given objective and data availability-are insensitive. It can also allow testing hypotheses about catchment functioning or preparing an experimental campaign by highlighting more sensitive zones or defining sampling strategies able to lower model uncertainties. An example of the use of the REC discretization is provided by De Condappa (2005).

25
A catchment of about 50 km 2 was divided into columns according to the superposition of a land-use map and a soil map. Then 1-D-simulation of water transfer within the various RECS was performed for the derivation of daily water flux below the root zone.
Using this discretization approach we are experimenting new modelling framework  (Viallet et al., 2006), based on modularity of hydrological processes and appropriate spatio-temporal integration of processes. Once the modelling framework will be built, different tests will provide more precise advice on the suitability of a given catchment discretization for a given objective. 5 We hope that the ideas expressed in this paper will stimulate research in various domains connected to hydrological modelling. Paraphrasing Beven (2006) about uncertainties, we could consider surface heterogeneity as an opportunity to develop new fields of research instead of considering it a "problem". As shown in this paper, if we take surface heterogeneity into account, we end up with unstructured meshes, which 10 do not even hold the convexity properties, necessary to use existing classical numerical schemes. There is therefore an opportunity for applied mathematicians to work with hydrologists in order to develop methods applicable on these meshes. One promising example is that of the tRiBS model (Ivanov et al., 2004a) developed using triangular irregular networks, customized to handle hydrological specificities such as catchment 15 boundaries, river network or the requirement to a better discretization along river flat areas where saturated zones can be expected. The inclusion of linear discontinuities (river reaches, hedges, ditches, dikes) is also a promising area of research to derive methods able to i) describe properly these networks, ii) take them into account into hydrological models explicitly at small scales iii) derive simplified parameterisations for 20 use in larger catchment scale models iv) describe the organisation of these networks within the landscape, to be able to take them into account as sub-grid scale parameterisations. For larger catchments, as the Saône catchment presented in this paper, the methodology we proposed is only a first step. We hope that specialists in landscape typology or laws of organization would be interested in taking possession of the sub- and microtopography in semiarids shrublands, Catena, 33, 201-220, 1998    Open spaces on sedimentary land with law slope 3

Concluding remarks
Open spaces on chalky land with law slope 4 Open spaces on alluvium with low slope 5 Open spaces on marl with low slope 6 Open spaces on chalky land with high slope 7 Agricultural areas on sedimentary land with low slope 8 Agricultural areas on chalky land with low slope 9 Agricultural areas on alluvium with low slope 10 Agricultural areas on marl with low slope 11 Agricultural areas on sedimentary land with high slope 12 Agricultural areas on chalky land with high slope 13 Agricultural areas on alluvium with high slope 14 Vineyard and fruit tree on sedimentary land with low slope 15 Vineyard and fruit tree on chalky land with low slope 16 Vineyard and fruit tree on marl with low slope 17 Vineyard and fruit tree on sedimentary land with high slope 18 Vineyard and fruit tree on chalky land with high slope 19 Prairies on sedimentary land with low slope 20 Prairies on chalky land with low slope 21 Prairies on alluvium with low slope 22 Prairies on marl with low slope 23 Prairies on sedimentary landà high slope 24 Prairies on chalky land with high slope 25 Prairies on alluvium with high slope 26 Sparsely vegetated areas on sedimentary land with low slope 27 Sparsely vegetated areas on chalky land with low slope 28 Sparsely vegetated areas on alluvium with low slope 29 Sparsely vegetated areas on marl with low slope 30 Sparsely vegetated areas on sedimentary land with high slope 31 Sparsely vegetated areas on chalky land with high slope 32 Broad leaved-forest on sedimentary land with high slope 33 Broad leaved-forest on chalky land with high slope 34 Broad leaved-forest on alluvium with high slope 35 Broad leaved-forest on marl with high slope 36 Broad leaved-forest on sedimentary land with high slope 37 Broad leaved-forest on chalky land with high slope 38 Broad leaved-forest on alluvium with high slope 39 Coniferous