Holistic versus monomeric strategies for hydrological modelling of human-modified hydrosystems

The modelling of human-modified basins that are inadequately measured constitutes a challenge for hydrological science. Often, models for such systems are detailed and hydraulics-based for only one part of the system while for other parts oversimplified models or rough assumptions are used. This is typically a bottom-up approach, which seeks to exploit knowledge of hydrological processes at the micro-scale at some components of the system. Also, it is a monomeric approach in two ways: first, essential interactions among system components may be poorly represented or even omitted; second, differences in the level of detail of process representation can lead to uncontrolled errors. Additionally, the calibration procedure merely accounts for the reproduction of the observed responses using typical fitting criteria. The paper aims to raise some critical issues, regarding the entire modelling approach for such hydrosystems. For this, two alternative modelling strategies are examined that reflect two modelling approaches or philosophies: a dominant bottom-up approach, which is also monomeric and, very often, based on output information, and a top-down and holistic approach based on generalized information. Critical options are examined, which codify the differences between the two strategies: the representation of surface, groundwater and water management processes, the schematization and parameterization concepts and the parameter estimation methodology. The first strategy is based on stand-alone models for surface and groundwater processes and for water management, which are employed sequentially. For each model, a different (detailed or coarse) parameterization is used, which is dictated by the hydrosystem schematization. The second strategy involves model integration for all processes, Correspondence to: I. Nalbantis (nalbant@central.ntua.gr) parsimonious parameterization and hybrid manual-automatic parameter optimization based on multiple objectives. A test case is examined in a hydrosystem in Greece with high complexities, such as extended surface-groundwater interactions, ill-defined boundaries, sinks to the sea and anthropogenic intervention with unmeasured abstractions both from surface water and aquifers. Criteria for comparison are the physical consistency of parameters, the reproduction of runoff hydrographs at multiple sites within the studied basin, the likelihood of uncontrolled model outputs, the required amount of computational effort and the performance within a stochastic simulation setting. Our work allows for investigating the deterioration of model performance in cases where no balanced attention is paid to all components of human-modified hydrosystems and the related information. Also, sources of errors are identified and their combined effect are evaluated.


Introduction and motivation
Two different general approaches or philosophies are applied in modelling of natural processes at large spatial scales, e.g., a whole catchment (in the order of at least a few km 2 ): The first approach, called bottom-up or upward (BU), seeks to exploit knowledge (typically physical laws) at the micro-scale (in the order of a few m 2 ) and then proceeds to larger scales through spatial aggregation.The second approach, called top-down or downward (TD), examines processes directly at the large scale and then eventually proceeds to making inferences about processes at smaller scales (Klemeš, 1983;Sivapalan et al., 2003b).Apart from this categorization of modelling approaches whose criterion is the initial spatial scale of process representation, another categorization arises when the criterion is the level of modelling detail: Very often, Published by Copernicus Publications on behalf of the European Geosciences Union.some parts of the studied system are modelled in detail (in space-time) while for other parts simplified models are employed; in that way essential interactions among system components may be poorly represented or even omitted; we will call this approach monomeric (M), which originates from the Greek words "µ óνoς" and "µέρoς" respectively denoting "solely" and "part".Specifically, we call a modelling approach "monomeric" when some, one or a few, components, processes or information regarding the studied system, are examined in detail, while at the same time other important components are roughly accounted for or even omitted.Conversely, when all parts of the studied system are modelled in similar detail and are linked via feedback mechanisms, the approach will be called holistic (H); this derives from the Greek word "óλoν", which means "whole".The distinction between the monomeric and the holistic character of a specific approach can be extended to the type of information about the system that is exploited in modelling; in a monomeric approach often limited information is used, which encompasses a small number of measured system outputs or responses, whereas in an holistic approach often one seeks to exploit information that is more general.Thus, apart from observations, qualitative information of any kind about the system responses is also taken into account together with the empirical interpretation of the unmeasured variables (whether output or internal ones).
Focusing on hydrological modelling, Savenije (2009) pointed out that ". . . the dominant paradigm of hydraulics is reductionism, or a bottom-up approach, whereas in hydrology it is (or should be) empiricism and a top-down approach looking for links with fundamental laws of physics."The implementation of the BU approach into hydrology has led to modelling of hydrological processes at the small scale (e.g., local, plot or hillslope), which has been an active research area in recent years (Zhang and Savenije, 2005;Zehe et al., 2006;Bárdossy, 2007).The practical usefulness of such models lies in that these allow hydrological predictions at the catchment scale, supposedly without using any information on hydrological responses (Kilsby et al., 1999).Essentially this was the initial central focus of the "Predictions in Ungauged Basins (PUB)" initiative (Sivapalan et al., 2003a).
Critiques on the fundamental limitations of this approach, promising substantial reduction of uncertainty through reduction (i.e., theoretical explanation of small-scale processes) rather than deduction (i.e., explanation based on "lumped" response data) have appeared recently (Koutsoyiannis et al., 2009), but the underlying idea has been criticised from its early steps (Beven, 1989).Savenije (2009) reports examples where the BU approach fails, while taking a broader perspective of the system under study through a top-down (TD) approach manages to better explain reality.Applications of the latter approach, which is rather macroscopic and, in this sense, holistic, are few (e.g., Tekleab et al., 2010).
The problems of the bottom-up approach become apparent when hydrological models are called to support engineering and management decisions, i.e., meet their major role (Efstratiadis and Mamassis, 2009).Supporting of decisions often requires modelling hydrosystems that involve extended surface-groundwater interactions and extended anthropogenic interventions in the hydrological cycle, such as abstractions from surface water bodies, pumping, and returns through artificial drainage systems.Theoretically, applying the BU approach for such human-modified hydrosystems would necessitate putting together all physical processes and process interactions.Obviously, this would require a tremendous amount of information, which is absent in every realworld application.Although there are some "integrated" physically-based tools, e.g., the SHE model, that have been effectively used for solving complex water management issues (see the recent discussion by Refsgaard et al., 2010), what is very frequently encountered within the BU approach is the monomeric character of modelling as this is defined earlier in this section.For example, a very detailed model is often formulated for one part of the system (or sub-system), while using oversimplified models for other parts or even ignoring dynamic links between sub-systems.More often than not, the focus is on the detailed hydraulic model of a specific sub-system, such as an aquifer.According to the two categorization criteria, this approach will certainly be characterized as bottom-up/monomeric (BU-M).Although one may say in advance that this is simply a bad modelling practice, which merits no further attention, the use of such approach is still so widespread that the analysis of its implications is, to our view, justified.
In this paper, we will concentrate our effort to humanmodified hydrosystems, since this is a practical problem of high interest.Nowadays, especially in western countries, natural (i.e., pristine) river basins no longer exist (e.g., Sanderson et al., 2002;Wagener et al., 2010).Thus, in our time, dealing with hydrological issues often implies dealing with human-modified systems.To represent the BU-M approach we will consider a particular modelling strategy, called here strategy A. This focuses on hydraulic modelling of one natural sub-system only, which is the basin aquifer.To cope with the system complexities, a multi-stage modelling process is used that involves four stages: (1) splitting the hydrosystem into a number of natural sub-systems (sub-basins and the aquifer) and one man-made sub-system; (2) modelling natural sub-systems individually; (3) transferring predictions from the natural sub-systems to the man-made subsystem; and (4) optimizing the operation of the latter subsystem so as to represent as close as possible the observed conditions of the past (calibration).This is typically the strategy followed in engineering studies with the aid of popular commercial computer packages for water resources management.It presupposes that: (a) pure natural sub-systems can be effectively found, and (b) sufficient information is available for each sub-system modelled.
Strategy A may lead to erroneous predictions in complex basins where no simple natural sub-systems can be identified due to complex water exchanges.Moreover, inadequate information on some sub-systems may prohibit successful modelling.Data inadequacy involves, among others, illdefined system boundaries due to unknown leakages, sinks to the sea and anthropogenic intervention with unmeasured abstractions both from surface and groundwater.In the last years some approaches have appeared that cope with some of the above problems, although they do not cover the case of human-modified basins (e.g., Singh and Bhallamudi, 1998;Bari and Smettem, 2004;Panday and Huyacorn, 2004;Gauthier et al., 2009) neither do they treat the case of unknown abstractions (e.g., Schoups et al., 2005).Few exceptions are found in literature, which follow a TD approach (e.g., Ivkovic et al., 2009;Payan et al., 2008); noteworthy is the effort by Hingray et al. (2010) to simultaneously cope with human-modified basins (with strong influence of hydraulic works) and data scale incompatibility.
An alternative modelling strategy, called here strategy B, will be used to represent a TD-H approach.Here the hydrosystem under consideration is viewed as a whole, having the input and the required information as guides to formulate spatial modelling units and process models.Ultimately, this strategy leads to model integration, parsimonious parameterization and simultaneous optimization of all model parameters.All these features provide flexibility to strategy B, which may be critical in cases with human-modified but poorly measured hydrosystems.
The motivation for this work is to test the applicability of modelling strategy A when the latter is employed for systems such as those described above.Our target is to examine the every-day modelling strategies in a critical spirit.It is the effects of these strategies that are investigated and not the value of the models used therein.In this respect, our work differs from a number of large-scale comparative studies reported in the literature (e.g., WMO, 1975WMO, , 1992;;Bell et al., 2001;Velázquez et al., 2010), including the Distributed Model Intercomparison Project (Smith et al., 2004).The potential benefit when the problems of strategy A are faced is evaluated through applying strategy B. In this context, we followed a two-step procedure; initially, we estimated the model parameters using historical data through a hybrid multi-objective calibration framework, and then we further evaluated the model performance via stochastic simulation.
To implement the two strategies, two corresponding modelling frameworks were chosen (Fig. 1) -the choice merely reflects the authors' experience on specific models.Modelling framework A implements the corresponding strategy A and is based on the well-known groundwater modelling package MODFLOW, coupled with a simple infiltration scheme.Modelling framework B, which is chosen to implement strategy B, uses the recently proposed framework HYDROGEIOS that integrates a semi-distributed rainfall-  runoff model, a coarse groundwater model and a networktype water allocation model (Efstratiadis et al., 2008).We note that in this work we limited our scope to illustrating the effects that the BU-M approach can have on modelling human-modified hydrosystems by choosing one specific modelling strategy (strategy A) within this category of approaches.We further narrowed down the scope through implementing strategy A with the aid of a specific modelling framework (framework A).For comparison purposes, strategy B, together with its implementation through framework B, were chosen to represent the TD-H approach.
A challenging operational case study was chosen involving the Boeoticos Kephisos river basin, Greece.This comprises all complexities described above and has been studied by the authors in the past (Nalbantis and Rozos, 2000;Nalbantis et al., 2002;Rozos et al., 2004;Efstratiadis et al., 2008).All the above works present sequentially improved modelling strategies, from relatively simple to more detailed ones, which are consistent with the TD-H type of approach.Taking advantage of this effort, we detected and investigated six key modelling options within the selected modelling strategies, which are discussed hereinafter.A major criterion for selecting this study area was also our will to address some of the intrinsic restrictions of the available data (cf.Kirchner, 2006).
Apart from the Introduction (Sect. 1) the paper comprises four sections.The key modelling options are described in Sect.2, with regard to the two hydrological modelling strategies.Sect. 3 is devoted to presenting details on the study area and the specific modelling frameworks that were used while Sect. 4 refers to our research results.Last, conclusions are drawn in Sect. 5.

STOCH-SIM
The calculations are exceptionally time-consuming against synthetic data of long time horizons.Synthetic time series of "future" decisions (abstractions) should be externally provided.
Due to its computational efficiency, the model can run for very long time horizons, to provide probabilistic estimates.
Synthetic time series of "future" decisions are model outputs.

Key modelling options in hydrological modelling strategies
When formulating a modelling strategy, critical decisions are made in regard to selecting, formulating and fitting hydrological models.These decisions lead to defining key modelling options that constitute the "ingredients" of the formulated strategy, as described next.Table 1 summarizes six such options, linking them with each one of the modelling frameworks (and, hence, the related strategy) that is employed in our tests.All these options are related to specific stages of the modelling process.According to the classification by Refsgaard and Henriksen (2004) and Scholten et al. (2007), these stages are: (1) model study plan; (2) data and conceptualisation; (3) model setup; (4) calibration and validation; (5) simulation and evaluation.So, the first two options will refer to model conceptualisation (stage 2) since they deal with selecting hydrological processes and the interactions thereof, while the next three options will have to do with selecting calibration parameters, thus referring to stage 4. Finally, the last option, i.e., the application of models in stochastic simulation mode, is linked to stage 5.

Key modelling option SW-GW: link between models for surface and groundwater processes
In strategy A, different stand-alone models are used for surface and groundwater hydrology, which precludes interaction between the corresponding processes.Very often, the two models differ in their spatial and temporal scale and in their modelling approach.The surface processes are usually represented via conceptual (hydrological) models or fullydistributed physically-based schemes, which are also conceptual at the grid scale (Beven, 1989).Yet, groundwater modelling typically follows a hydraulic rationale.Obviously, these options also affect the parameter estimation procedure.
In strategy B, the main hydrological interactions are explicitly represented, and thus model parameters can be simultaneously optimized, taking advantage of the available measurements across all components.

Key modelling option SW-GW-WM: link between models for hydrological processes and water management
In the staged modelling procedure of strategy A, hydrological models are constructed exclusively for the unmodified parts of the system (e.g., sub-basins) and the outputs thereof (e.g., river flows) are, then, transferred as inputs to the water management model of the man-made sub-system, usually implemented within a Decision Support System (DSS).This serial operation, apart from being computationally inefficient, suffers from a number of drawbacks: (a) it requires iterations when decision-related interactions between the hydrological and the man-made sub-systems are significant; (b) it is infeasible when real abstractions are not measured, since these serve as the boundary conditions for hydrological and hydrogeological models; (c) it may require making simplified yet unrealistic assumptions regarding the water allocations (e.g., a "first-come, first-served" management policy); (d) it requires data transfer from the hydrological model to the water management model and hence the space-time scale compatibility of the related variables; (e) the above problems introduce high uncertainty in the parameters obtained through automatic calibration or even make the calibration impossible, since models with unknown yet interrelated parameters have to run individually (Efstratiadis et al., 2010).Attempts to cope with the above problem are rare in literature (Fredericks et al., 1998;Dai and Labadie, 2001).Coping with these problems requires a fully integrated computational scheme, as employed in strategy B.

Key modelling option SCHEM-SCALE: link of spatial scale and model schematization
The large heterogeneity of mechanisms and properties makes it difficult to achieve compatibility between the measurements made at the local scale and the model predictions.Quite often, in strategy A, very detailed models are chosen in the hope to achieve scale compatibility between the data and the predicted variables.Yet, the resulting high dimensionality leads to extremely time-consuming schemes, which is a major restricting factor affecting not only the calibration but also the operational applicability of models; the latter arises when models have to co-operate with DSS that run in forecast mode, using synthetic forcing data for long time horizons (Nalbantis et al., 2002).

Key modelling option SCHEM-PARAM: link between hydrosystem schematization and parameterization
Inevitably, in strategy A, the hydrosystem schematization (i.e., the simplification of the process representation in space) dictates the parameterization.Parameters are assigned to individual spatial elements (e.g., sub-basins, grid cells), thus having limited physical meaning.Moreover, due to the detailed spatial scale adopted, the principle of parsimony is broken, which results to poorly identified parameters (Kuczera and Mroczkowski, 1998) and increased predictive uncertainty (Efstratiadis and Koutsoyiannis, 2010;Savenije, 2010).Attempts to reduce the number of control variables of the optimization problem require hybridized strategies, such as detecting only the most important parameters while estimating the rest of them on the basis of field data (Refsgaard, 1997;Eckhardt and Arnold, 2001) or using zonation approaches (i.e., spatial grouping of parameters).Contrary to the above, in strategy B the schematization and the parameterization are disconnected, thus allowing models to be by construction parsimonious.In this respect, the schematization is adapted to the engineering objectives (i.e., which processes should be simulated and where), while the parameterization is only linked to the available information (cf.Dehotin and Braud, 2008).

Key modelling option OPT: appropriate use of optimization in calibration
In theory, physically-based approaches enable their free variables to be derived from field measurements.Yet, in practice, their applicability is significantly restrained not only by the heterogeneity of processes and the unknown scale dependencies of parameters (Beven, 2001;Wagener et al., 2001;Rosberg and Madsen, 2005), but also by the high computational effort and the subsequent inability to co-operate with DSSs (Nalbantis et al., 2002).Hence, optimization is always required for at least some of the model parameters (Refsgaard, 1997;Eckhardt and Arnold, 2001).However, within strategy A, a "black-box" approach is frequently adopted, seeking for a "optimal" parameter set (typically against a single, quantitative performance criterion), through an automatic algorithmic procedure.Very often, this leads to: (a) parameter values that are inconsistent with their physical interpretation; (b) poor predictive capacity against independent control data (model validation); (c) unreasonable values of the uncontrolled responses (e.g., evapotranspiration, underground losses) and the internal model variables (e.g., soil and groundwater storage) (Rozos et al., 2004;Efstratiadis et al., 2008).On the other hand, strategy B emphasizes both on the optimization task itself and the comprehensive understanding of the problem components (real system, model and data), to ensure reliable results.Particularly in models of complex parameterization, it aims to increase the information that is exploited in calibration as much as possible, by means of both multi-response measurements and empirical metrics ("soft" data), accounting for the hydrological expertise (Efstratiadis and Koutsoyiannis, 2010).

Key modelling option STOCH-SIM: model evaluation through stochastic simulation
The typical "split-sample" procedure for model building (i.e., calibration/validation based on historical data; Klemeš, 1986) may hide possible structural deficiencies of the model, given that it is too much depended on the historical data (and the errors embedded in them).A more holistic approach aims to extend model validation, by also examining the model responses under hypothetical conditions on the basis of synthetically-generated forcing, i.e., through stochastic simulation.This option offers multiple advantages over the typical model validation procedure: (a) it helps testing a modelling strategy within a framework that is similar to an operational one, i.e., by representing situations that are consistent to those in which it is supposed to be employed in practice (Klemeš, 1986); (b) it allows for examining future water management scenarios that are different from the historical ones, thus also offering a "crash-test" for evaluating the model transposability in time, which is a necessary condition for its operational adequacy (Andréassian et al., 2009); (c) it provides an opportunity to check for model inconsistencies, e.g., unreasonable long-term statistical trends, jumps, or abnormal patterns (often impossible to spot in the short period encountered in typical calibration and validation), which constitutes a supplementary verification of the model credibility and helps highlight hidden artefacts; (d) it provides estimates of the uncertainty of predictions and the long-term risk in hydrological studies and water resources management, through a proper representation of the varying character of climate and the related processes in stochastic terms (Koutsoyiannis et al., 2007;Koutsoyiannis, 2011); (e) it allows for including the evaluation of extremes, which are not represented in the (usually limited) calibration data (Seibert, 2003).
3 Case study

The study area
The Boeoticos Kephisos river basin lies on the Eastern Sterea Hellas, to the north of Athens, and drains a closed area of 1930 km 2 (Fig. 2).The catchment is formed on heavily karstified limestone covered with alluvial deposits in plain areas.Considerable groundwater amount (more than half of the annual catchment runoff) is discharged through large springs in the upper and the middle part of the basin, whereas an unknown amount is leaking to the sea.The surface runoff of the basin is conducted to the neighbouring Lake Hylike, which is one of the major reservoirs of the Athens water supply system.Groundwater of the middle part is also considered as an emergency resource.A significant part of the surface and groundwater resources of the basin is used in agriculture.For more details the reader may refer to previous publications (Rozos et al., 2004;Efstratiadis et al., 2008).
For the study area, a major question is about the impacts of the abstractions through the Vasilika-Parori boreholes (middle course) to the overall hydrological regime of the basin.These boreholes were drilled during a severe drought in the period from 1989 to 1994, at the end of which almost all surface resources dried out.Yet, due to the considerable reduction of precipitation and the intense pumping for providing drinking water to Athens, the discharge of the neighbouring Mavroneri springs (which account for 15% of the total basin runoff) was interrupted twice during 1990 and 1993, thus creating various social and environmental problems.In the last 15 years, several engineering studies were carried out to investigate the complex dynamics of the hydrosystem and to provide reliable estimations regarding the consequences of water supply abstractions to the system responses.Modelling framework A originates from earlier engineering studies (Perleros, 1998;Nalbantis and Rozos, 2000) whereas framework B capitalizes the experience gained after continuous research attempts (Nalbantis et al., 2002;Rozos et al., 2004;Efstratiadis et al., 2008).

Modelling framework A
Within this framework an off-line coupling of MODFLOW is employed with a simple rainfall-runoff model, as illustrated in Fig. 1, left.MODFLOW is a classical tool for 3-D simulation of groundwater flow, where the flow field is discretized into a number of rectangular cells and all quantities are referred to cell centres.The 3-D continuity equation and the Darcy's law written in finite differences form provide one final flow equation for each cell as a function of the unknown hydraulic head h and other known variables.The latter are either parameters of the aquifer (conductances for the three axes directions and the specific yield) or external stresses of Hydrol.Earth Syst.Sci., 15, 743-758, 2011 www.hydrol-earth-syst-sci.net/15/743/2011/ the cell (percolation from the soil or the river bed, pumped water, water outflow to the sea).After defining the initial and the boundary conditions, the final system of equations on h is solved via a variety of methods such as the Preconditioned Conjugate Gradient method (PCG) implemented in the PCG2 solver (Hill, 1990).
For model implementation, we used the computer package by Waterloo Hydrogeologic Inc. (1999).This includes the optimization module PEST (Doherty et al., 1994) for the automatic calibration of model parameters.In this work manual calibration was performed.
The rainfall-runoff process is represented through two modelling components; the first calculates the effective rainfall on the basis of precipitation and evapotranspiration, whereas the second divides the effective rainfall into runoff and percolation, assuming a constant ratio between these two variables, which stands as the single model parameter.The above scheme runs independently, to provide external stresses to MODFLOW cells, due to percolation.

Modelling framework B
In modelling framework B the computer package HYDRO-GEIOS 2.0 is used.This is a GIS-based model suite, which allows for flexible representation of hydrological processes, (Efstratiadis et al., 2008(Efstratiadis et al., , 2009)).A synoptic despicttion of the modelling framework is illustrated in Fig. 1, right.More precisely, surface flow is considered within the hydrographic network, which is extracted from a digital terrain model through adding control points that correspond to flow measurement stations or diversion nodes.The network conveys flow of the sub-basins, which are subject to different hydrological stresses (precipitation and potential evapotranspiration -PET).Surface processes are considered homogeneous within partitioned areas or patches of the basin (not necessarily contiguous) termed as the Hydrological Response Units (HRUs).This idea has found a limited number of applications to distributed modelling also (Flügel, 1995;Srinivasan et al., 2000;Gurtz et al., 2003).The HRUs represent soil and land use types and are defined on the basis of classification of different properties, such as soil permeability, land cover and terrain slope.The surface hydrological model transforms precipitation into real evapotranspiration, percolation and surface runoff (direct, overland, lateral) and comprises six parameters per HRU; through selecting the number of classes one can adjust the number of HRUs and, consequently, the total number of the surface hydrology model parameters.
Groundwater flow is modelled through a multi-cell approach involving the discretization of the aquifer into nonrectangular cells (Rozos andKoutsoyiannis, 2006, 2010).This allows the description of complex geometries on the basis of the physical characteristics of the aquifer (e.g., geology), through parsimonious structures.Two parameters are assigned to each cell (conductivity and specific yield).
Springs and underground losses are implemented as virtual cells of very large base, which allows keeping almost constant hydraulic head.Model stresses are: (a) distributed inflows due to percolation; (b) inflows due to infiltration underneath each river segment; (c) outflows due to pumping from each borehole.
Regulated flow through man-made structures and portions of the hydrographic network is modelled with the aid of a water management network, which is an extension of the scheme described by Efstratiadis et al. (2004).This has as nodal inflows the surface and the groundwater runoff, as nodal outflows the withdrawals for water uses, and as distributed fluxes the water losses due to infiltration and river discharge.Major hydraulic works are also represented as well as their interactions with the natural system.Model properties are discharge and pumping capacities, target priorities, demand time series and unit transportation costs of water.The allocation of flows is based on a linear programming approach where virtual unit costs, positive or negative, are assigned either to prohibit undesirable fluxes or to force the model fulfil the hydrosystem targets (Efstratiadis et al., 2010).
HYDROGEIOS embeds an advanced calibration module that provides a number of statistical and empirical criteria for model fitting on multiple responses (river and spring discharge, hydraulic heads) and various options regarding the delineation of the feasible search space.Optimization is carried out through the evolutionary annealing-simplex method (Efstratiadis and Koutsoyiannis, 2002;Rozos et al., 2004)

Model schematization and parameterization
Within strategy A, the karst aquifer is considered as a single layer with free surface unsteady flow.The flow field was divided into zones (Fig. 3), which resulted to totally 18 parameters to calibrate (7 parameters for conductivity, 7 for specific yield and 4 for conductances).The alluvial aquifer beneath the downstream part of the basin was ignored since its water yield is low.In the whole perimeter of the karst aquifer no-flow boundaries were assigned, apart from two areas: a small area in the N boundary where small-scale outflows to the sea were simulated using dummy pumping wells and a small area in the SW where lateral recharges were simulated using dummy recharge wells.The discharge of the dummy wells was estimated using a multiple regression formula with rainfall.External stresses due to percolation were modelled in a simple way as explained in Sect.3.2.Precipitation was considered homogenous in the interior of three sub-basins, which represent different hydrogeologic units (low, middle and upper course).The infiltration depth was taken as a constant fraction of the effective precipitation depth, which varied from one type of surface geological formation to another; three types of such formations were assumed, corresponding to three permeability levels.The cell sizes were varying from 800 × 800 m 2 near the boundaries to 150 × 150 m 2 in central areas (near the springs); this resulted in a total number of cells equal to 3631.
Stream-aquifer interactions were represented using the River module of MODFLOW assuming zero infiltration during the dry period (i.e., June-September).The river stage for the wet period (October-May) was estimated using a multiple regression formula with rainfall, while for the remaining months it was considered equal to the river bed elevation.
No water management model was explicitly considered, for estimating the unknown historical withdrawals from groundwater.Alternatively, we used the irrigated area that is supplied by each well component and the related water demand per unit area, and assumed (wrongly, since abstractions are made also from the river) that the entire demand is fulfilled via pumping (Nalbantis and Rozos, 2000;Nalbantis et al., 2002).
On the other hand, strategy B followed a semi-distributed schematization to 15 sub-basins, as shown in Fig. 2. The spatial average of precipitation of each sub-basin was calculated through the Thiessen method, whereas PET time series were estimated via the Penman-Monteith method (in both frameworks we used the same point rainfall records, the same meteorological data and models for estimating PET, and the same spatial integration techniques).For the definition of the HRUs, three categories of conductivity (low, medium, high) and two categories of terrain slope were taken.The Cartesian product of these layers resulted into the definition of six HRUs.The groundwater flow field was discretized into 40 non-rectangular cells, four of which represent underground leakages to neighbouring basins and the sea.In addition, six dummy cells were used to model surface outflows through the major karstic springs (Fig. 4).This spatial discretization is two orders of magnitude coarser than the one implemented within strategy A, thus making the computational effort for groundwater simulation almost negligible, in comparison to that of strategy A. For the parameterization of the groundwa- ter flow field, we used three categories of conductivity and porosity, which reflect topography and geology.Moreover, we used particular conductivity values for the rest of cells representing springs and underground losses (ten parameters in total).In combination with the parameterization of the surface processes via the six HRUs, the total number of model parameters was 52 (36 for the surface model and 16 for the groundwater model).We note that both the schematization and the parameterization differ from those reported by Efstratiadis et al. (2008); for the surface sub-system, a more detailed schematization to 15 instead of 4 sub-basins was made, to better describe the variability of the rainfall field and obtain flow predictions at multiple sites of the hydrographic network, while the delineation of the groundwater sub-system was radically changed, in an attempt to significantly reduce the number of groundwater parameters, thus ensuring model parsimony (Kopsiafti, 2009).
In the water management model, a conceptual network is considered, which includes various types of nodes (e.g., consumption nodes for agricultural use) and aqueducts.Water supply targets and virtual costs are assigned to consumption nodes and aqueducts, respectively, with the purpose to represent a realistic abstraction policy (e.g., in case of combined abstractions, priority was given to river abstractions instead of pumping, which is the historical practice).This is a critical difference between the two strategies, since strategy A requires known abstractions whereas strategy B is much more flexible by allowing for choosing among different abstraction policies to fulfil demands.For instance, pumped water through the Vasilika-Parori boreholes is either conducted downstream, for the irrigation of Kopais plain, or diverted for the water supply of Athens.Withdrawals from other resources, both surface and groundwater, are also implemented to serve local agricultural demands (Fig. 5).
The number of parameters used in the two strategies is of the same order; yet, in strategy B, the parameters refer to the Hydrol.Earth Syst.Sci., 15, 743-758, 2011 www.hydrol-earth-syst-sci.net/15/743/2011/ whole system components (river network, HRUs, groundwater cells, springs).This naturally qualifies strategy B as being more parsimonious, since it achieves a much more extended (and thus holistic) representation of the system processes.
We would like to note that both strategies and the related frameworks use HRU-type (or zonation-type) parameterization; their difference lies in the fact that in strategy B this is true for both the surface and the groundwater processes while in strategy A only groundwater modelling is essentially benefited from such type of parameterization.

Calibration strategies and data for parameter estimation
Both strategies were tested against the observed hydrological responses for a 10-year control horizon (October 1984-September 1994), employing a monthly time step.For this period, discharge series at seven locations are available, precisely at the basin outlet (Karditsa tunnel) and downstream of the six karst springs, as illustrated in Fig. 2. With regard to groundwater, several level gauges were available, mostly located in the vicinity of the main river branch.
Regarding the surface hydrology component of framework A, its single parameter (i.e., infiltration fraction) was estimated empirically, on the basis of the main geological formations of the basin.The MODFLOW parameters were manually optimized on the basis of 18 observed level series and through visual inspection of the closeness of the observed and the simulated spring hydrographs.
In strategy B, the parameters of the surface and the groundwater models were simultaneously optimized.To cope with parameter uncertainty, which is directly linked to equifinality (Freer et al., 1996), multiple criteria were taken into account, including "soft" data (Seibert and McDonnell, 2002;Efstratiadis and Koutsoyiannis, 2010).Thus, a weighted objective function was formulated comprising the following statistical and empirical measures: (a) efficiency and bias of the monthly hydrographs at the seven locations mentioned above; (b) penalties for not reproducing flow intermittency; and (c) penalties for generation of unrealistic trends regarding the groundwater levels.The first group of criteria accounts for "hard data" (i.e., from measurements) which is essential for reproducing the global water balance and the spring mechanisms, but not sufficient for representing the groundwater regime; these criteria will be hereinafter referred to as the performance indices.The second group of criteria accounts for the information "zero or non-zero discharge", which is easily observable, reliable and is of major interest in water management.Finally, the third group of criteria is a kind of "soft" data, ensuring reasonable fluctuation of the non-observable internal variables of the model.The optimization of parameters was carried out through a hybrid strategy, which combines human experience and automatic tools (Boyle et al., 2000;Rozos et al., 2004).In that manner, search was guided towards a realistic, best-compromise parameter set, ensuring satisfactory predictive capacity for all model responses.Initial conditions were empirically assigned.In particular, for the surface model, initial soil moisture depth was set to zero for the all basin partitions (i.e., combinations of subbasins and HRUs), given that the simulation starts at the end of the dry period (October).For the groundwater model, the cell levels at the beginning of simulation were estimated on the basis of topography, spring elevations and average piezometric information for the plain (karst) aquifer and using arbitrary yet realistic values for the peripheral cells, which are fed by the percolation of mountain areas.Preliminary trials were necessary to establish steady-state conditions during the entire control period.

Operational use of models through stochastic simulation
As mentioned in Sect.2, calibration is an intermediate step in the modelling procedure, which allows for optimizing the predictive capacity of the model on the basis of observed data.Yet, further analysis with "projected" inputs is required, which will provide support to future decisions.In this respect, the two strategies are evaluated within a stochastic simulation framework, aiming to examine the system response under different stress conditions, comprising both natural (precipitation, PET) and anthropogenic (abstractions from surface and groundwater resources) forcing.
www.hydrol-earth-syst-sci.net/15/743/2011/ Hydrol.Earth Syst.Sci., 15, 743-758, 2011 For the representation of rainfall, a multivariate stochastic scheme (Castalia model) was used to generate point series of 1000-year length, which preserve the essential statistical characteristics of the observed samples of 12 rain gauges across the basin, at the annual and monthly time scales (Koutsoyiannis et al., 2003).Next, the point series were aggregated to the appropriate spatial scale, thus providing areal rainfall series for the 3 and 15 sub-basins, which correspond to modelling frameworks A and B, respectively.The synthetic rainfall records were divided into clusters of ten-year length, to formulate a hundred of statistically equivalent forcing scenarios.Each framework ran 100 times, each one under different stochastic forcing, whereas for all runs the same initial conditions were applied.For instance, for modelling framework B we used zero soil moisture depths (since simulation begins at the end of the dry period) and the average groundwater levels of the calibration.In this context, the model outputs represent statistically consistent trajectories of the system responses, for a ten-year horizon.Such type of Monte Carlo simulation is typically used in operational applications (e.g., forecasts), and is also known as terminating simulation (Koutsoyiannis et al., 2003;Koutsoyiannis, 2005).
Regarding the PET throughout each sub-basin, we assigned the monthly averages of the areal values of the control period 1984-1994, since this is a forcing variable of very low interannual variability.For the anthropogenic forcing, we examined two alternative water management scenarios.Specifically, we assigned the actual irrigation water demands across the basin (223 hm 3 yr −1 ) and assumed either zero or extensive (46 hm 3 y −1 , equal to that of the dry water year 1993-1994) water demand from the Vasilika-Parori boreholes, for providing drinking water to Athens (Fig. 5).We remind that in modelling strategy A, the water requirements are by definition fulfilled via pumping (which is an erroneous yet obligatory assumption) whereas in strategy B the demands can be satisfied through multiple sources.

Experiment design
Ideally, testing the effect of adopting modelling strategy A for human-modified hydrosystems would require a complex computer experiment based on a series of alternative frameworks implementing strategies A and B in the sense that was given in Sect. 2 and in Table 1.The practical difficulty of such task led us simplify our experiment by considering the two modelling frameworks presented in Sect.3.2 and 3.3 above.Table 1 summarises the a priori knowledge about the relation of each key modelling option to each modelling framework and hence to the related strategy.Application to a common data set of hydrological variables of the aforementioned test basin enabled an objective evaluation of the effect of adopting strategy A. For this, both numerical and empirical criteria are used for the comparison of the two strategies, in calibration and simulation mode.In this study, we emphasize on the monthly flows at common control points (i.e., the main karstic springs), whose hydrographs are also depicted for visual comparison.Comparisons on the river flow at the basin outlet, which is very well represented by modelling framework B, were not feasible, since framework A did not provide such output.Regarding the groundwater levels, we only examined their long-term behaviour rather than their actual values, since a direct comparison of the aquifer water levels was meaningless, due to enormous differences of scale between the two alternative groundwater models used.

Performance in calibration and validation
The model performance during the calibration and validation periods is evaluated on the basis of two criteria, efficiency values and bias in the mean.Due to the different assumptions regarding the system delineation (e.g., in strategy A, the river network and the alluvial areas of the aquifer were not simulated), comparisons were possible only at three observation sites, namely downstream of Mavroneri, Melas and Polygyra springs.In Tables 2 and 3, values of the corresponding performance indices are provided, which show a clear improvement of model performance in spring flow predictions when passing from strategy A to strategy B for both the calibration and the validation periods.This is confirmed by hydrograph comparisons on Fig. 6 for the two most important springs of the basin (Mavroneri and Melas).We note that the first strategy, although concentrated on the detailed representation of the aquifer dynamics, fails to reproduce key characteristics Hydrol.Earth Syst.Sci., 15, 743-758, 2011 www.  of the observed flows, namely the monthly variability, which is overestimated for Melas and underestimated for Mavroneri springs; it also fails to reproduce the interruption of the discharge of the latter, during 1990 and 1993.
On the other hand, the advantages of model integration as offered by strategy B extend beyond predicting flow.Other improvements are equally significant, which are commented in Table 4 and provide explanation of the superior performance indices regarding observed groundwater variables, despite the fact that framework B follows a much simpler modelling approach, which focuses on the surface processes and the water management practices.

Comparison of model performance in simulation
Obviously, when comparing two modelling strategies in a stochastic simulation setting, it is impossible to use quantitative criteria (e.g., goodness-of-fit measures), as in calibration.Therefore, the evaluations are based on the grounds of common sense, i.e., testing whether the model provides the right answers for the right reasons (cf.Kirchner, 2006), taking advantage of the hydrological experience.
The implementation of the two strategies under synthetically generated inputs further reveals the drawbacks of strategy A. In Fig. 7 we plot the projected discharge at the Mavroneri springs for a ten-year horizon (mean value of 100 flow scenarios and 50% prediction limits), under zero and intensive pumping through the neighbouring boreholes at Vasilika-Parori (respectively referred to as the "actual abstraction policy" and the "intensive abstraction policy").For both management policies, there is a sharp decrease of the discharge, which is inconsistent with the experience so far.Although one could expect that under extensive pumping a systematic negative trend could be possible, it is unlikely that such trend is encountered under the actual abstraction policy, since, in the absence of trend in the input data and the hydrosystem itself, a stationary output should be produced.In fact, the observed water levels, although very sparse in time and space, confirm this.Yet, manual calibration through visual inspection used in strategy A yielded model parameters that erroneously introduced a mild trend in levels; this is hardly noticeable in the six years of the calibration period.Using synthetic data helped identifying this trend.The differences with the respective results obtained through strategy B (Fig. 8) are substantial; here, the mean projection for the spring outflows (which corresponds to average rainfall conditions) follows a stationary pattern under the actual abstraction policy, whereas there is a progressive decrease of the spring resources under the intensive abstraction policy (Fig. 8).This indicates that, in a long-term perspective, the intensive use of the Vasilika-Parori boreholes for the water supply of Athens is not a sustainable option.
Regarding the response of the Melas springs, both strategies exhibit a systematic negative trend under intensive pumping (Figs. 9 and 10).This is reasonable, since the entire karst aquifer should be affected by the upstream water supply abstractions.However, the statistical analysis of the stochastic flows of strategy A revealed an unrealistic behaviour.Specifically, a prominent peak of the spring discharge appears at the end of each wet period, which is indistinguishable in the calibrated flows (see Fig. 6) as well as in most of the individual synthetic flows.Only by getting the statistical characteristics of the simulated flows, i.e., the average and the prediction limits, it was feasible to detect this pattern.The latter originates from the rough description of the boundary conditions with regard to the stream-aquifer interactions Integrating surface and groundwater models allowed for simultaneous calibration against basin and spring hydrographs within a single computer program.

SW-GW-WM
The absence of a water management model and the use of rough estimates of withdrawals produced errors due to drastic assumptions (satisfaction of water demand, time averaged values).
Model integration allowed for optimizing dynamic withdrawals and allocating targets fulfilled via different sources, which helped to improve the overall model performance.

SCHEM-SCALE
The coarse scale of the infiltration model decreased the value of the detailed information provided by MODFLOW.
Scale compatibility was guaranteed between surface and groundwater processes whereas respecting the principle of parsimony.The delineation of the aquifer to 40 cells (in contrast to the 3631 cells of strategy A) dramatically decreased the time of simulations.
SCHEM-PARAM Surface processes were parameterized per sub-basin as homogeneous areal units, i.e., system schematization dictated parameterization.Zonation was applied in groundwater flow modelling.
The use of HRUs helped decoupling the schematization and the parameterization of the surface hydrology model.For the groundwater model, decoupling proved possible through parameter grouping, on the basis of both topographical and geological criteria (zonation).

OPT
The manual calibration was a tedious procedure.The model performance was low in calibration.The deterministic optimization of the local-search type implemented in MODFLOW (but not used here) certainly lies behind modern optimization methods.
Calibration was effectively guided towards a best compromise solution through proper formulation of the optimization problem, as explained in Table 1.The observed hydrographs were satisfactorily reproduced, at multiple sites of the basin.

STOCH-SIM
The simulation time for 1000 years was almost two orders of magnitude larger than of framework B (i.e., ∼2 ḣours vs. ∼2 minutes).Unrealistic patterns were revealed, regarding the simulated flows and water levels.
Accounting for the fluctuation of the groundwater levels ensured realistic responses against the two water management scenarios.Fig. 9. Simulated discharge (m 3 s −1 ) at Melas springs (mean and 50% prediction limits) under the actual abstraction policy (left) and the intensive abstraction policy (right) for the water supply of Athens, according to modelling framework A.   (Sect.3.4).The stochastic simulation highlighted this inconsistency, which turns out to be another weakness of the monomeric character of the specific strategy.
Attempting to investigate the reasons for the unrealistic performance of modelling framework A, we concluded that we should revisit the calibration procedure.Actually, we avoided recalibration but we simply evaluated the implications of the initial calibration attempt.In contrast to strategy B, where we accounted for the internal variables of the model by assigning trend penalties on groundwater levels, in strategy A we just focused on fitting the model responses (spring flows and water levels for some points) to the observations.In that manner, we allowed systematic drainage and filling of the cells lying near the boundaries of the aquifer where spring flow or groundwater level observations are not available.Actually, inconsistent conductivity values were assigned in order to maximize the model performance against observations via the efficiency indices.On the other hand, within strategy B the soft criteria contributed to a more consistent calibration, although this required a time-consuming hybrid procedure.In Fig. 11 we compare the synthetically generated levels obtained through the two strategies, which correspond to the most upstream part of the karst system, assuming the actual abstraction policy.We observe a questionable behaviour (i.e., negative trend) of the projected level when applying strategy A, whereas for strategy B the groundwater system exhibits stationary behaviour, which is more reasonable.

Conclusions
Our investigations have shown that in modified watersheds due to human interventions, the classical modelling strategy based on the monomeric bottom-up approach may prove inefficient.This makes use of a detailed hydraulic model for only a part of the studied system and of separate models for surface hydrological processes, groundwater flow processes and water management actions.Such serial use of models prohibits the modelling of the process interactions and suf-fers from increased computational burden.The monomeric character of this approach also involves other modelling aspects, including the assignment of parameters (coincidence of spatial scales of schematization and parameterization).In addition, calibration is based only on "hard information" i.e., measured system variables, while "soft information" i.e., likelihood of relations of system variables is ignored.All the above misuse practices are reflected in the predictive capacity of the model, which proved disappointedly poor for such an exhaustive effort (in calibration).These weaknesses of this approach became more evident by employing the model in stochastic simulation mode for operational use; the obtained projections are far from being realistic.
Conversely, a holistic top-down approach allows for model schematization and parameterization that respects the principle of parsimony and ensures computational efficiency by means of both simulation and optimization/calibration.This precludes taking advantage of the power of fully distributed models while, at the same time, favouring the use of the semidistributed approach.An effective way to reduce the size of the parameter set is to decouple parameterization and model schematization through using the HRU concept.For, the model structure depends on a limited number of landscape classes, whose parameters retain some physical consistency thus allowing for a better identification of their prior uncertainty (cf.Savenije, 2010).Last, model integration allows simultaneous calibration of all models through exploiting many kinds of information and not only information about some basic output variables.A hybrid process of manual and automatic optimization proved very effective in finding a best compromise solution while, at the same time, respecting physical consistency of parameters.In this approach all available pieces of information, including hydrological experience, are exploited in model calibration within a multiobjective framework.
Our tests also proved that running models in stochastic simulation mode can be a useful technique for their testing and validation since this augments information supplied by typical calibration and validation procedures, while, at the same time, addressing some data restrictions and revealing possible model deficiencies.However, this option is a matter of expert judgment.In addition, it may be difficult to implement in complex hydrosystems and is hardly applicable for evaluating the credibility of the model in situations involving non-stationary inputs.
In this work we quantified the deterioration of model performance in cases that no attention is paid to all components of a human modified hydrosystem -an issue that is unfortunately encountered in many instances in hydrological practice.Moreover, various sources of error were identified, although their individual contribution to the overall prediction error was not quantified.We believe that the results obtained suggest that from a qualitative viewpoint, in poorly gauged human-modified hydrosystems the top-down/holistic approach (TD-H) is expected to perform better than the bottom-up/monomeric (BU-M) approach.Differences are however expected to depend on the kind and the degree of the human intervention, the degree of data scarcity, the kind and degree of the monomeric character of the approach that is followed, and the type of models used.
We believe that the research presented in this paper can contribute towards (1) formulating specifications for model packages applicable to human-modified basins, and (2) opening new research paths regarding different types of approach followed in hydrological modelling.For example we think that distributed information can well be incorporated to describe specific aspects of heterogeneity (e.g., elevation, land cover, rainfall), as far as this does not contrast the principle of parsimony by introducing additional model parameters.High level of integration of such information into the proposed modelling framework certainly allows for a TD-H approach to be implemented.

Fig. 2 .
Fig. 2. The Boeoticos Kephisos river basin and the main hydrosystem components, i.e., sub-basins, river network, springs (circles) and rain gauges (rectangles), according to the schematization of modelling framework B. The basin location is also shown (upper right).

Fig. 3 .
Fig. 3. Discretization of the karst aquifer and zonation employed in framework A (with the springs, part of the basin boundaries, and the dummy wells: x = recharge, o = pumping).

Fig. 4 .
Fig. 4. Discretization of the entire groundwater system (also indicating the springs and the four dummy cells, accounting for underground losses) and zonation approach (three zones in different colours), according to modelling framework B.

Fig. 5 .
Fig. 5. Detailed depiction of the water management network in the middle part of the basin, according to modelling framework B.

Fig. 7 .Fig. 8 .
Figure 7Fig.7.Simulated discharge (m 3 s −1 ) at Mavroneri springs (mean and 50% prediction limits) under the actual abstraction policy (left) and the intensive abstraction policy (right) for the water supply of Athens, according to modelling framework A.

Fig. 10 .
Fig. 10.Simulated discharge (m 3 s −1 ) at Melas springs (mean and 50% prediction limits) under the actual abstraction policy (left) and the intensive abstraction policy (right) for the water supply of Athens, according to modelling framework B.

Figure 11 Fig. 11 .
Figure 11Fig.11.Simulated level (m) at the upstream part of the aquifer (mean and 50% prediction limits) under the actual abstraction policy (zero pumping for the water supply of Athens), according to modelling frameworks A (left) and B (right).

Table 1 .
Features of test modelling frameworks in regard to key modelling options.

Table 2 .
Coefficients of efficiency between computed and observed monthly flows for the period of calibration(October 1984- September 1990)and validation(October 1990-September 1994).

Table 4 .
Key points related to the effectiveness of the alternative modelling frameworks as reflected in the research results of this work.