www.hydrol-earth-syst-sci.net/12/669/2008/ © Author(s) 2008. This work is distributed under the Creative Commons Attribution 3.0 License.

Numerical simulation models are frequently applied to assess the impact of climate change on hydrology and agriculture. A common hypothesis is that unavoidable model errors are reflected in the reference situation as well as in the climate change situation, so that by comparing reference to scenario, model errors will level out. For a polder in The Netherlands an innovative procedure has been introduced, referred to as the Model-Scenario-Ratio (MSR), to express model inaccuracy on climate change impact assessment studies based on simulation models comparing a reference situation to a climate change situation. The SWAP (Soil Water Atmosphere Plant) model was used for the case study and the reference situation was compared to two climate change scenarios. MSR values close to 1, indicating that impact assessment is mainly a function of the scenario itself rather than of the quality of the model, were found for most indicators evaluated. A climate change scenario with enhanced drought conditions and indicators based on threshold values showed lower MSR values, indicating that model accuracy is an important component of the climate change impact assessment. It was concluded that the MSR approach can be applied easily and will lead to more robust impact assessment analyses.


Introduction
Numerical simulation models have been used extensively in climate change research over the last decades. In general these models have been applied for two types of research: climate change projections, and climate change impact assessment and adaptation. Typical examples of the first are the so-called General Circulation Models (GCM), which are defined as "a numerical representation of the climate system based on the physical, chemical and biological properties of its components, their interactions and feedback processes, and accounting for all or some of its known properties" (Baede, 2001). There is an evolution towards more complex models including oceanography, chemistry and biology (Coupled Atmosphere Ocean General Circulation Models, AOGCMs). Extensive literature regarding these AOGCMs can be found elsewhere (e.g. IPCC, 2007).
Correspondence to: P. Droogers (p.droogers@futurewater.nl)
The other group of models used in climate change research is applied to assess the impact of climate change, as projected by AOGCMs, on mankind and nature. These models, sometimes referred to as impact assessment models, are the classical distributed hydrological models, crop growth models or a mixture of these. The assessment component of those models is obtained by running the model for a calibrated/validated reference case, and then running the same model for an altered climate obtained from a climate scenario. In this paper the term model always refers to these impact assessment models. The total number of existing assessment models is unknown, but a rough estimate indicates that it is in the order of thousands (Droogers and Immerzeel, 2006).
One of the most important issues is the impact of climate change on the hydrological cycle, including the impact on crop production. The generic procedure to undertake such an impact assessment study involves the following steps: (i) select an appropriate numerical model, (ii) calibrate and validate the model for the current situation, (iii) obtain and downscale climate change projections, (iv) run the calibrated model with the downscaled climate change projections, and (v) evaluate the impact of climate change (= difference between current situation and expected future). These five steps might be followed by an evaluation of potential adaptation strategies.
A typical example of such a study is given by Brouyère et al. (2004), where a detailed physical hydrological model was extensively calibrated to mimic reality. With several climate change scenarios it was concluded that groundwater levels would decline under climate change. On a smaller scale Roberto et al. (2006) started with calibrating the detailed crop growth model DSSAT for two crops. Based on the calibrated model they evaluated the impact of precipitation, radiation and temperature on crop production. Along the same lines, a detailed agro-hydrological model was applied to study the impact of climate change on crop production in a basin in Sri Lanka (Droogers, 2004). This research expanded the impact assessment to an analysis of potential adaptation strategies to overcome negative impacts of climate change. This study was part of a seven-country study, where simulation models were used to assess the impact of climate change on water, food and nature (Aerts and Droogers, 2004). On a very large scale, Immerzeel (2007) evaluated the impact of climate change, based on a large-scale hydrological model, on downstream water flows in the Brahmaputra Basin. To assess the impact of climate change on low flows in the UK, Romanowicz (2007) developed a new modelling approach, using evaporation as information on soil moisture conditions in addition to the commonly used observed streamflow.
These studies, amongst many others, assume that numerical models can be used to assess the impact of climate change without assessing the impact of unavoidable model inaccuracy. Despite a wide range of literature on model inaccuracy (some more recent examples: O'Connell et al., 2007; Choi and Beven, 2007; Mantovan and Todini, 2006; Feyen et al., 2007), there is a common hypothesis that model errors are reflected in the reference situation as well as in the climate change situation, so that relative accuracy (difference between reference and scenario) is higher than absolute accuracy of the model. So far, no attempts have been made to develop a common framework to assess the error due to model inaccuracy during climate change impact assessment studies. Related to this is the question of the level of detail to which model calibration and validation should be undertaken to ensure a reliable impact assessment.
In summary, the objective of this research is to develop an approach to evaluate and quantify the consequence of model inaccuracy on climate change impact assessment studies.

Study area
A polder in The Netherlands managed by the Waterboard of Rivierenland is selected to evaluate the impact of model inaccuracy on climate change impact assessment (Fig. 1). The area is located between the rivers Meuse and Rhine, and is characterised by low-lying meadows and many drainage canals. Soils in the study area are loamy clays and are described by the Mualem-Van Genuchten (MVG) parameter set (Van Genuchten, 1980). Meteorological data are taken from the meteorological station Megen, about 7 kilometers south-west of the study area. Potential economic returns on pastures in the area are €1350 per hectare (LEI, 2006). Details about the study area can be found elsewhere (Immerzeel et al., 2007).

SWAP model
The Soil-Water-Atmosphere-Plant (SWAP) model was applied to simulate all the terms of the water balance and to estimate yields for a reference situation and two climate change scenarios. SWAP is an integrated physically based simulation model for water, solute and heat transport in the saturated-unsaturated zone in relation to crop growth. A first version of the SWAP model was developed in 1978 (Feddes et al., 1978) and the model has been developed continuously since then. The version used for this study is SWAP 3.03 and is described by Kroes and Van Dam (2003).
The core part of the model is the vertical flow of water in the unsaturated-saturated zone, which can be described by the well-known Richards' equation:

$$\frac{\partial \theta}{\partial t} = \frac{\partial}{\partial z}\left[ K(\theta) \left( \frac{\partial h}{\partial z} + 1 \right) \right] - S(h)$$

where θ denotes the soil water content (cm³ cm⁻³), t is time (d), h (cm) the soil matric head, z (cm) the vertical coordinate, taken positive upwards, and K the hydraulic conductivity as a function of water content (cm d⁻¹). S (d⁻¹) represents the water uptake by plant roots (Feddes et al., 1978), defined in case of a uniform root distribution as:

$$S(h) = \alpha(h) \frac{T_{pot}}{|z_r|}$$

where T_pot is potential transpiration (cm d⁻¹), z_r is rooting depth (cm), and α (-) is a reduction factor as function of h and accounts for water and oxygen deficit. Total actual transpiration, T_act, is calculated as the depth integral of the water uptake function S.
Crop yields can be computed using a simple crop-growth algorithm based on Doorenbos and Kassam (1979) or by using a detailed crop-growth simulation module that partitions the carbohydrates produced between the different parts of the plant, as a function of the different phenological stages of the plant (Van Diepen et al., 1989). For this specific case, the first method was used as detailed crop parameters were lacking.
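The root water uptake term described above can be sketched in a few lines of Python. This is a minimal illustration, not SWAP's implementation: it assumes the uptake term takes the form S = α(h) T_pot / |z_r| and that α(h) has the piecewise-linear (trapezoidal) shape commonly associated with Feddes et al. (1978). The function names and the four threshold heads are hypothetical values chosen for illustration only.

```python
def feddes_alpha(h, h1, h2, h3, h4):
    """Reduction factor alpha(h) in [0, 1].

    Heads are in cm (negative in unsaturated soil), with h1 > h2 > h3 > h4.
    Assumed shape: zero when too wet (h >= h1) or too dry (h <= h4),
    one between h2 and h3, linear in between.
    """
    if h >= h1 or h <= h4:
        return 0.0
    if h > h2:
        return (h1 - h) / (h1 - h2)   # oxygen deficit branch
    if h >= h3:
        return 1.0                    # optimal uptake
    return (h - h4) / (h3 - h4)       # water deficit branch


def root_water_uptake(h, t_pot, z_r, thresholds):
    """Uptake S (d^-1) for a uniform root distribution: alpha(h) * T_pot / |z_r|."""
    return feddes_alpha(h, *thresholds) * t_pot / abs(z_r)


# Hypothetical threshold heads (cm); not taken from the study.
THRESHOLDS = (-10.0, -25.0, -400.0, -8000.0)

s_moist = root_water_uptake(-100.0, t_pot=0.4, z_r=-30.0, thresholds=THRESHOLDS)
s_dry = root_water_uptake(-4200.0, t_pot=0.4, z_r=-30.0, thresholds=THRESHOLDS)
print(s_moist, s_dry)
```

In this sketch the dry-soil head of -4200 cm lies halfway between the third and fourth thresholds, so uptake is reduced to half of its potential rate.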
The SWAP model has been applied and tested for many different conditions and locations and has been shown to produce reliable and accurate results (e.g. Bastiaanssen et al., 2007; Heinen, 2006; Varado et al., 2006; Droogers et al., 2000). A more detailed description of the model and all its components is beyond the scope of this paper, but can be found in Kroes and Van Dam (2003).
The SWAP model was applied to the study area in the Dutch polder using the best data available. An automatic calibration procedure followed using the PEST software (Doherty, 2000), with observed groundwater levels for a six-year period (1997 to 2003) as references. Details of the entire procedure for this study area are presented elsewhere (Van Loon et al., 2007).

Climate change scenarios
The Fourth Assessment Report (AR4) of the Intergovernmental Panel on Climate Change (IPCC) was published recently (IPCC, 2007) and condenses thousands of scientific publications into a general assessment of the current knowledge about the climate system and the man-induced changes to it. Despite this wealth of information, regional and local climate change predictions are still hard to make due to the complexity of the climate system. A regional manifestation of climate change is subject to many interacting processes affecting atmospheric circulation and region-specific responses of physical processes. Based on the fourth IPCC assessment report and additional studies covering Western Europe, the Dutch Meteorological Institute (KNMI) derived climate change scenarios to be used by impact and adaptation studies in The Netherlands.
A total of four scenarios have been developed based on two sets of two assumptions: global increase of temperatures by 1 or 2 °C in 2050, and whether or not dominant wind directions will change to more eastern directions during summer. For this study two scenarios with a global temperature increase of 2 °C in 2050 have been selected: W (= warm) and W+ (= warm and changes in wind directions). A summary of these scenarios is provided in Table 1. Details of how these scenarios were developed can be found in Van den Hurk et al. (2006). The W and W+ scenarios were statistically downscaled using the observed climate data and the KNMI transformation tool (KNMI, 2006). The basic principle of the downscaling procedure is the construction of a time series of precipitation that is consistent with the KNMI'06 scenarios, based on a historical time series of a specific meteorological station, in this case Megen. The following parameters are used in the transformation of precipitation: (i) relative change of the wet day frequency (day with more than 0.05 mm rainfall), (ii) relative change of the mean precipitation on wet days, and (iii) relative change of the Q99 of the precipitation on wet days (Q99 = precipitation value that is exceeded on only 1% of the wet days). The method used for the transformation of potential evaporation is more straightforward. Historical time series of daily potential evaporation of meteorological station De Bilt were multiplied by a transformation factor, which is dependent on scenario and season (Table 1).
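The quantities that drive the downscaling can be illustrated with a short Python sketch. It only computes the three wet-day statistics named above from a daily series and applies a seasonal multiplication factor to potential evaporation; it is not the KNMI transformation tool. The function names, the nearest-rank definition of Q99, the season grouping and the factor values are assumptions made for illustration.

```python
import math

WET_DAY_THRESHOLD_MM = 0.05  # a wet day has more than 0.05 mm rainfall (from the text)

# Assumed grouping of months into seasons; the scenario factors are given per season.
SEASON_OF_MONTH = {12: "DJF", 1: "DJF", 2: "DJF", 3: "MAM", 4: "MAM", 5: "MAM",
                   6: "JJA", 7: "JJA", 8: "JJA", 9: "SON", 10: "SON", 11: "SON"}


def wet_day_statistics(precip_mm):
    """Return (wet-day frequency, mean wet-day precipitation, Q99 on wet days)."""
    wet = sorted(p for p in precip_mm if p > WET_DAY_THRESHOLD_MM)
    if not wet:
        raise ValueError("series contains no wet days")
    frequency = len(wet) / len(precip_mm)
    mean_wet = sum(wet) / len(wet)
    # Nearest-rank 99th percentile; other percentile definitions exist (assumption).
    q99 = wet[math.ceil(0.99 * len(wet)) - 1]
    return frequency, mean_wet, q99


def transform_evaporation(et_pot_mm, months, factor_by_season):
    """Multiply daily potential evaporation by the factor of the day's season."""
    return [et * factor_by_season[SEASON_OF_MONTH[m]]
            for et, m in zip(et_pot_mm, months)]


# Hypothetical data and factors, for illustration only.
stats = wet_day_statistics([0.0, 0.02, 1.0, 2.0, 3.0])
et_scenario = transform_evaporation(
    [2.0, 4.0], [1, 7], {"DJF": 1.0, "MAM": 1.0, "JJA": 1.5, "SON": 1.0})
print(stats, et_scenario)
```

A full precipitation transformation would adjust the series until the relative changes in these three statistics match the scenario; that step is deliberately left out here.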

Model inaccuracy
To evaluate the impact of model inaccuracy on impact assessment of climate change, the calibrated model has been altered to reflect the most common parameter uncertainties. This altered model will be referred to as the perturbed model. It should be emphasized that this approach is different from the more general parameter uncertainty analysis (e.g. Beven and Binley, 1992; Beven et al., 2005), as the objective of our perturbation is to adjust parameters within their expected range of potential accuracy. Meyer and Gee (1999) argued that there are in principle two sources of uncertainty: (i) natural variability (e.g. in space and time), and (ii) a general lack of knowledge (e.g. presence of inaccurate, unrepresentative, or limited data). A detailed analysis of the expected parameter uncertainty is beyond the scope of this paper and therefore one value of 10% was used, which can be considered as a realistic value in detailed studies after model calibration (Mertens et al., 2005; Islam et al., 2006). For the SWAP model the most common parameter uncertainties are: (i) soil characteristics, (ii) bottom boundary condition and (iii) crop characteristics (Van Dam, 2000). For each of these three cases one sub-case with 10% lower than calibrated and one with 10% higher than calibrated values have been used, resulting in a total of six cases of perturbed models (see Table 2 for explanation of the parameters):
- ErrorSoils10% less clayey: all soil parameters of the MVG set have been altered by 10% from the optimal value. This was implemented by changing values for top and sub soils by the following percentages: n +10%; alfa −10%; log(Ksat) +10%.
- ErrorSoils10% more clayey: all soil parameters of the MVG set have been altered by 10% from the optimal value. This was implemented by changing values for top and sub soils by the following percentages: n −10%; alfa +10%; log(Ksat) −10%.
- ErrorBottom10% more dynamics in seepage: bottom boundary condition altered by 10% from the calibrated value. In SWAP this was implemented by increasing two values determining the bottom boundary condition by 10%: SINAVE = average value of bottom flux, and SINAMP = amplitude of bottom flux sine function.
- ErrorBottom10% less dynamics in seepage: bottom boundary condition altered by 10% from the calibrated value. In SWAP this was implemented by decreasing two values determining the bottom boundary condition by 10%: SINAVE = average value of bottom flux, and SINAMP = amplitude of bottom flux sine function.
- ErrorCrop10% more drought resistance. This was implemented by increasing by 10% the threshold value where reduction in root water uptake occurs. In SWAP this is defined by the two parameters: HLIM3 and HLIM4.
- ErrorCrop10% less drought resistance. This was implemented by decreasing by 10% the threshold value where reduction in root water uptake occurs. In SWAP this is defined by the two parameters: HLIM3 and HLIM4.
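The six perturbed cases listed above can be generated mechanically from a calibrated parameter set, as the following Python sketch shows. The parameter names follow the text (n, alfa, log(Ksat), SINAVE, SINAMP, HLIM3, HLIM4), but the calibrated values are hypothetical placeholders, and applying each change as a multiplication by 1 ± 0.10 (also for negative-valued heads) is an assumption about how the percentages were applied.

```python
# Hypothetical calibrated values, for illustration only (not from the study).
CALIBRATED = {
    "n": 1.2, "alfa": 0.02, "logKsat": 1.0,   # MVG soil parameters
    "SINAVE": 0.05, "SINAMP": 0.10,           # bottom flux: average and amplitude
    "HLIM3": -400.0, "HLIM4": -8000.0,        # root water uptake thresholds
}

# Relative change per parameter for each of the six perturbed cases.
PERTURBATIONS = {
    "ErrorSoils less clayey": {"n": +0.10, "alfa": -0.10, "logKsat": +0.10},
    "ErrorSoils more clayey": {"n": -0.10, "alfa": +0.10, "logKsat": -0.10},
    "ErrorBottom more dynamics": {"SINAVE": +0.10, "SINAMP": +0.10},
    "ErrorBottom less dynamics": {"SINAVE": -0.10, "SINAMP": -0.10},
    "ErrorCrop more drought resistance": {"HLIM3": +0.10, "HLIM4": +0.10},
    "ErrorCrop less drought resistance": {"HLIM3": -0.10, "HLIM4": -0.10},
}


def perturb(calibrated, changes):
    """Return a copy of the calibrated set with the given relative changes applied."""
    perturbed = dict(calibrated)
    for name, relative_change in changes.items():
        perturbed[name] = calibrated[name] * (1.0 + relative_change)
    return perturbed


perturbed_models = {case: perturb(CALIBRATED, changes)
                    for case, changes in PERTURBATIONS.items()}

for case, params in perturbed_models.items():
    changed = {k: round(v, 4) for k, v in params.items() if v != CALIBRATED[k]}
    print(case, changed)
```

Each perturbed set changes only the parameters of its own group and leaves the others at their calibrated values, which keeps the six cases directly comparable to the reference model.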
Changes were all set at a fixed 10% to mimic uncertainty in parameters. This 10% reflects a kind of average error, although substantial differences might occur depending on the parameter considered, the location and the applied model (Bastiaanssen et al., 2007). A more rigorous evaluation of expected parameter uncertainty, or an approach based on Monte-Carlo runs, is beyond the scope of this paper.
To compare the reference situation to these six cases, a set of indicators was defined, each describing a key characteristic of the entire system in one number. Values for the following seven indicators have been extracted from the daily SWAP runs over 30 years:
- Yield: average yield over 30 years (€ ha⁻¹), calculated as the difference between actual and potential crop transpiration multiplied by the potential economic returns of €1350 per hectare.
- Crop Fail: number of years, out of 30 years, with complete crop failure, defined as yields lower than 80% of potential. This 80% is an average value at which farmers' profits fall below zero.
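The threshold character of the Crop Fail indicator can be made concrete with a small Python sketch. The 80% threshold comes from the text; the function name and the annual relative yields (yield as a fraction of potential) are hypothetical and serve only to show how a small, uniform model error can change the count.

```python
FAILURE_FRACTION = 0.80  # crop failure: yield lower than 80% of potential (from the text)


def count_crop_failures(relative_yields, threshold=FAILURE_FRACTION):
    """Count the years in which relative yield falls below the failure threshold."""
    return sum(1 for y in relative_yields if y < threshold)


# Hypothetical relative yields for four years, for illustration only.
calibrated = [0.95, 0.81, 0.79, 0.90]
# Same years with a uniform error of two percentage points.
perturbed = [y - 0.02 for y in calibrated]

print(count_crop_failures(calibrated), count_crop_failures(perturbed))
```

In this made-up series a shift of two percentage points doubles the number of failure years, because one year sits just above the threshold.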
In summary, the entire approach is based on applying a well-calibrated model that was altered for six cases to reflect model inaccuracy. These six models have been run for a period of 30 years (1976 to 2005) and daily model output was summarized by seven indicators.

Model-Scenario-Ratio
Concepts of parameter uncertainty in hydrological modeling were first published by Cuen (1973). However, it took another 20 years before the topic received substantial attention and peer-reviewed publications and international research projects (e.g. PUB, Prediction in Ungauged Basins, and MOPEX, Model Parameter Estimation Experiment) started to emerge (Droogers and Immerzeel, 2006).
A benchmark study was published by Beven and Binley (1992) on parameter uncertainty prediction in hydrological modelling. The paper described a methodology for calibration and uncertainty estimation of distributed models based on generalized likelihood measures (GLUE). This procedure works with multiple sets of parameter values and allows that, within the limitations of a given model structure and errors in boundary conditions and field observations, different sets of values may be equally likely as simulators of a catchment. Fifteen years later Beven et al. (2005) concluded that ". . . for many environmental prediction problems there are many competing models, which we know are not correct (there is model structural error), but many of which are declared successful by their parent modellers after some form of calibration process". They therefore strongly advocated that there is a clear need to evaluate the impact of model uncertainty in assessment studies.
A study that went beyond the classical parameter uncertainty analysis was published recently (Winsemius et al., 2006). Two quite distinct models were built for the Zambezi, both able to simulate observed streamflow quite reliably. However, further statistical analysis indicated that for one model parameter identifiability was much greater than for the other one. Bormann (2005) recognized that many scenario impact studies ignored model errors. He therefore presented the signal-to-noise ratio (SNR_ref), defined in terms of the following quantities: SNR is the signal-to-noise ratio, X_reference is the value of the reference scenario, X_scenario is the value of the scenario, X_observed is the observed value, X_i,uncertain is the value of the n realisations of the uncertainty analysis, and i is the control variable. This approach was applied to a basin in Benin, from which it was concluded that 14 out of 15 scenario cases had a high SNR_ref value, indicating that most case studies are reliable. One of the assumptions of the SNR_ref approach is that the reference situation is still considered as the "true" value. It is however questionable whether this assumption is valid, referring to the arguments made by Beven et al. (2005).
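To overcome some of the problems mentioned above, the Model-Scenario-Ratio (MSR) is introduced here to express the impact of model inaccuracy versus the impact of the scenario itself. Consistent with the worked example in the Results (1-ABS(0.27-0.33)=0.94), it can be written as:

$$\mathrm{MSR} = 1 - \left| \Delta_{cal} - \Delta_{pert} \right|, \qquad \Delta = \frac{X_{scenario} - X_{reference}}{X_{reference}}$$

where Δ_cal and Δ_pert are the relative changes of an indicator between reference and scenario, as simulated by the calibrated and the perturbed model, respectively.
The value of MSR indicates to what extent the impact of a scenario contributes to the final findings compared to model inaccuracy. An MSR value of 1 indicates that the model inaccuracy does not play a role and results are a function of the scenario only. An MSR value of 0 indicates that the response of the impact assessment originates for 50% from the changing climate scenario, and 50% from an inherent model uncertainty. MSR values lower than zero indicate that responses are dominated by model inaccuracy rather than by the scenario evaluated.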
To demonstrate the features of MSR some hypothetical calculations are provided in Table 3. The six cases demonstrate the following:
The MSR is comparable to the well-known Nash-Sutcliffe (Nash and Sutcliffe, 1970) criterion to evaluate model performance. Nash-Sutcliffe can range from −∞ to 1. An N-S of 1 corresponds to a perfect match of modeled output to the observed data. An N-S of 0 indicates that the model predictions are as accurate as the mean of the observed data, whereas an N-S less than zero occurs when the observed mean is a better predictor than the model.
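For readers less familiar with the Nash-Sutcliffe criterion, the sketch below computes it in its standard form, one minus the ratio of the sum of squared model errors to the sum of squared deviations of the observations from their mean. The function name and the sample series are illustrative only and are not data from this study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o - s)^2) / sum((o - mean(o))^2)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed series has no variance")
    return 1.0 - residual / variance


obs = [1.0, 2.0, 3.0]
print(nash_sutcliffe(obs, [1.0, 2.0, 3.0]))  # perfect match
print(nash_sutcliffe(obs, [2.0, 2.0, 2.0]))  # model equals the observed mean
print(nash_sutcliffe(obs, [3.0, 2.0, 1.0]))  # worse than the observed mean
```

The three calls reproduce the three regimes described above: a value of 1, a value of 0, and a negative value.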

Results
Simulated and observed groundwater depths have been compared, indicating that the model as developed is able to mimic reality (Fig. 2). Statistical analysis also reveals that simulated groundwater depths are close to observed ones: mean error is 0.2 cm, average mean error is 8.2 cm, root mean square error is 11.5 cm, and R² is 0.86. More details of the complete calibration and validation procedure can be found in Van Loon et al. (2007). A straightforward climate change impact assessment, based on indicator comparison as defined before, is shown in Table 4. The overall trend is that climate change will increase water shortage, both more wet days and more dry days, a small reduction in economic returns and a higher chance of crop failure. Especially the W+ scenario will have a substantial impact on water and agriculture, with water shortages increasing by 74 mm per year (+143%), an increase in the number of dry days by 263% and an increase of years with complete crop failure from 2 to 8 out of 30 years. These results are based on the assumption that the model mimics reality. It is however impossible by definition that models are entirely correct and the question to be asked is what the impact would be if errors exist in models. Table 5 shows an impact assessment similar to that presented in Table 4, using a model where soils data are less accurate (a less clayey soil, referred to as ErrorSoils in Sect. 2.4). The overall impact of this model inaccuracy is that water shortages are somewhat reduced (reference in Table 4 and Table 5), while at the same time the number of dry days increased substantially. This apparent contradiction can be explained by soil water dynamics, where the less clayey soil increases capillary rise, which reduces water shortages but lowers groundwater tables at the same time.
Comparing impact assessment based on the calibrated and the perturbed model, comparable trends can be observed for the W scenario (column 4 in Table 4 and Table 5). In other words, impact assessment based on this perturbed model will result in similar conclusions as based on the calibrated model. For the W+ scenario, however, impact assessment based on the perturbed model deviates somewhat from the one based on the calibrated model (column 5 in Table 4 and Table 5). So evaluating this W+ climate change scenario using a perturbed model will yield less reliable conclusions. Table 6 shows MSR values for the case assuming model inaccuracy in soil properties (less clayey). For the W scenario MSR values are all above 0.90, indicating that even with these model inaccuracies most of the scenario impact assessment is caused by the scenario itself rather than by model inaccuracies. For example, the impact of the W climate change scenario on ETshort is 27% for the calibrated model (Table 4). For the perturbed model this value is 33% (Table 5). In terms of the defined MSR this means a value of 1 − ABS(0.27 − 0.33) = 0.94 (Table 6).
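The worked example above can be reproduced with a few lines of Python. The sketch assumes, in line with that example, that the MSR is one minus the absolute difference between the relative scenario impacts of the calibrated and the perturbed model; the function names are ours and the reference and scenario values passed to relative_change are hypothetical.

```python
def relative_change(reference, scenario):
    """Relative impact of a scenario on an indicator, e.g. 0.27 for +27%."""
    return (scenario - reference) / reference


def model_scenario_ratio(change_calibrated, change_perturbed):
    """MSR = 1 - |relative change (calibrated) - relative change (perturbed)|."""
    return 1.0 - abs(change_calibrated - change_perturbed)


# ETshort under the W scenario: +27% (calibrated, Table 4), +33% (perturbed, Table 5).
msr_etshort_w = model_scenario_ratio(0.27, 0.33)
print(round(msr_etshort_w, 2))

# The same calculation from hypothetical indicator values.
msr_from_values = model_scenario_ratio(relative_change(100.0, 127.0),
                                       relative_change(90.0, 119.7))
print(round(msr_from_values, 2))
```

Both calls return 0.94 after rounding, matching the value quoted from Table 6.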
The W+ scenario shows a different picture, where the impact of the scenario is for most indicators still dominated by the scenario itself, except for the number of dry days (GWLdry) and the years with complete crop failure.
These indicators have been combined as well to evaluate the overall impact of model inaccuracy. Most indicators are relatively insensitive to model inaccuracy. In general more than 90% of the impact assessment analysis is a result of the scenario considered (Table 7). However, two indicators are very sensitive to model inaccuracy, and if these incorrect models are used for impact assessment erroneous conclusions might be drawn. These two indicators are number of dry days (GWLdry) and number of years with crop failure (Crop Fail). The latter is based on a threshold value where, for a small number of years, a large change in results (like from 2 to 3 times in 30 years) can be triggered by relatively small model errors. The GWLdry indicator is also based on a threshold value and is also very sensitive to small model errors.
Finally, the average MSR for the indicators for each model perturbation experiment can be seen in Table 8. This has been done for the full set of seven indicators as well as for a reduced set of five by leaving out the most erroneous ones (GWLdry and Crop Fail). The overall result, for the five indicators, is that more than 90% of the evaluated climate change impact can be attributed to the scenario itself rather than to model inaccuracy. In cases where an indicator is required that is sensitive to model inaccuracy, more emphasis should be put on model calibration and validation.

Conclusions and recommendations
A common assumption in climate change impact assessment studies is that unavoidable model errors are reflected in the reference situation as well as in the climate change situation. The SWAP model was applied for an area in The Netherlands to assess the impact of model inaccuracy on this climate change impact assessment. Based on this research the main conclusions are:
- The calibrated model can be considered as state-of-the-art and has been applied successfully over a wide range of applications. In this case the model is performing very well in simulating groundwater levels.
- In terms of climate change impact assessment for the two scenarios (W and W+), an increase in water shortage, more extremes in wet and dry periods, and a small reduction in agricultural production can be expected.
- The derived Model-Scenario-Ratio shows that for the W scenario model inaccuracy is for most indicators not relevant.
- The derived Model-Scenario-Ratio shows that for the W+ scenario (enhanced drought conditions) model inaccuracy plays a role for some indicators. However, more than 90% of the assessed impact can still be attributed to the scenario itself and not to model inaccuracy.
The overall conclusion based on the results is that model uncertainty becomes more important in areas where the model is generating a greater impact by being more sensitive to changes in environmental conditions. In this particular case where the system is sensitive to drought, the model representation of drought conditions stands out as an important contributor to impact uncertainty.
The overall recommendation from this study is that the climate change impact assessment study as presented is quite robust and model inaccuracy is for most cases a less relevant issue. However, for some indicators and for the scenario with enhanced drought conditions model inaccuracy can play an important role. It would therefore be recommended for future climate change impact assessment studies to explore the relevance of model inaccuracy using an impact model accuracy assessment, such as the Model-Scenario-Ratio as defined in this research. MSR values close to 1 indicate that the model used for the impact assessment is robust and results are mainly a function of the scenario itself rather than of the quality of the model. Low MSR values indicate that the quality of the model is an important factor in determining the indicator value and it is therefore essential that the model should be calibrated and validated at a very detailed level.
Finally, this research should be considered as a starting point to pay more attention to the importance of model accuracy in climate change impact assessment studies. More research is required to answer a range of questions such as: (i) how much physics should be included in the model, (ii) what other criteria than the MSR introduced here could be developed, (iii) what is the effect of a scenario with even more enhanced drought conditions, and (iv) does this approach support policy makers.