Climate change impact on groundwater levels: ensemble modelling of extreme values

This paper presents a first attempt to estimate future groundwater levels by applying extreme value statistics on predictions from a hydrological model. Climate scenarios for the future period, 2081–2100, are represented by projections from nine combinations of three global climate models and six regional climate models, and downscaled (including bias correction) with two different methods. An integrated surface water/groundwater model is forced with precipitation, temperature, and potential evapotranspiration from the 18 models and downscaling combinations. Extreme value analyses are performed on the hydraulic head changes from a control period (1991–2010) to the future period for the 18 combinations. Hydraulic heads for return periods of 21, 50 and 100 yr ( T21−100) are estimated. Three uncertainty sources are evaluated: climate models, downscaling and extreme value statistics. Of these sources, extreme value statistics dominates for return periods higher than 50 yr, whereas uncertainty from climate models and extreme value statistics are similar for lower return periods. Uncertainty from downscaling only contributes to around 10 % of the uncertainty from the three sources.


Introduction
Climate change adaptation is an increasingly recognized component in planning of infrastructure development. Infrastructures, such as roads, are designed to be able to withstand extreme hydrological events. Opposite to water resources assessment, analyses of groundwater head extremes are highly relevant for roads in contact or close to groundwater tables since groundwater flooding and drainage issues can compro-mise the use of the road. Hydrological extreme events have commonly been estimated from historical data, but the evidence of a changing climate implies that estimates of future climatic conditions should be used instead. Estimates of future temperature and precipitation can be generated by global climate models (GCMs) with grid resolutions of typically 200 km. This resolution is too coarse for further application in hydrological models (Fowler et al., 2007), thus downscaling to a more local scale is necessary either by dynamical downscaling to regional climate models (RCMs) or by statistical downscaling. The inherent uncertainty in the climate models (CMs) should carefully be considered because this is possibly the largest source of uncertainty in hydrological climate change studies (Allen et al., 2010). Hawkins and Sutton (2011) analysed the uncertainty cascade for projections of precipitation from a GCM ensemble. For precipitation, they concluded that relative to emission scenario uncertainty, natural climate variability and climate model uncertainty dominated, even at the end of the 21st century. For hydrological models, precipitation and temperature are driving parameters and therefore the response of the uncertainty for these parameters should be shown in the hydrological model predictions. One way to do this is via a probabilistic modelling approach with multiple climate models (e.g. Tebaldi et al., 2005;Smith et al., 2009;Deque and Somot, 2010;Sunyer et al., 2011). The impact of climate change related to subsurface water has been considered in about 200 studies according to a recent review by Green et al. (2011). Only a few of these simulate groundwater conditions with a physically based groundwater flow model (e.g. Yosoff et al., 2002;Scibek and Allen, 2006; van Roosmalen et al., 2007;Candela et al., 2009;Toews and Allen, 2009). The general interest The motorway stretch where construction will be below present ground level with a road surface around 6 m below surface. Zones are used for groundwater head analyses (Motorway stations). The grey shaded polygons indicate paved ares. A geological cross section along the motorway from "a-a " is shown in Fig. 2. (b) Denmark, and location of Silkeborg and the Gudenå River. "a" indicated focus area shown to the left. of these studies is water resources, where quantifications of groundwater recharge and responding groundwater levels at seasonal timescales are adequate. To the knowledge of the authors, no reported studies have focused on extreme values of groundwater heads under future climatic conditions. The estimates of future groundwater head extremes would inherit the key sources of uncertainty from the climate model projections, which are (i) climate models and (ii) downscaling methods.
The use and concept of extreme value analysis (EVA) is well known within the hydrological sciences. The design of urban drainage systems are often planned to withstand or handle an extreme rain event, which means that the capacity for routing drainage water is sufficient for a given rain event, e.g. a 5 or 10 yr event. For example, at an urban runoff system in Toronto, Canada, Guo and Adams (1998a) compared volumes of runoff return periods for an analytical expression, based on exponential probability density functions of rainfall event characteristics, with return periods from the Storm Water Management Model (SWMM). In Guo and Adams (1998b), return periods for peak discharge rates from the analytical model and SWMM were compared. Bordi et al. (2007) used the generalised Pareto distribution to analyse return periods of extreme values for wet and dry periods in Sicily, Italy, using precipitation observations and a standardized precipitation index for wetness and dryness. A peak over-threshold methodology was used and spatial contour maps for return periods for the wet and dry thresholds produced, based on data from 36 rain gauges. Palynchuk and Guo (2008) used EVA statistics to develop design storms, standardized distribution of rainfall intensity with time, which conventionally are developed from depth duration frequencies of rainfall, or storm event analy-sis, where actual rainstorms are fitted to appropriate probability density functions. EVA has also been used in climate change impact studies. Burke et al. (2010) applied EVA to calculate drought indices for the UK, based on projections of future precipitation and an observed baseline period. Return periods for different drought indices were estimated with an above-threshold concept using a generalised Pareto distribution. Sunyer et al. (2011) compared the distribution of extreme precipitation events (> 25 mm day −1 ) from four projections of future climate at a location just north of Copenhagen, Denmark, with distributions derived from observed precipitation, 1979-2007. The lack of EVA for climate change studies of groundwater systems is concordant with the relatively few groundwater studies describing unusually high or groundwater flooding events. The area of groundwater flooding received increasing attention after flooding events in the winter-spring 2000-2001 from chalk aquifers in the UK and northern Europe (e.g. Tinch et al., 2004;Pinault et al., 2005;Morris et al., 2007;Jackson et al., 2011;Upton and Jackson, 2011;Hughes, 2011) but very few studies have dealt with groundwater flooding in a frequency analysis context. One study (Najib et al., 2008) developed a groundwater flood frequency analysis method to estimate T-year hydraulic heads for a given return period (T ). The tool was developed for a building construction project over a karstic aquifer in Southern France where heavy rainfall induced a groundwater table rise and thereby flooding. EVA is not only relevant when considering high groundwater levels causing flooding, but is very relevant to estimate drought conditions in terms of return periods for low groundwater conditions. The objectives of this study are to: (1) investigate the impacts of climate change on extreme groundwater levels in relation to infrastructure design; and (2) assess the uncertainty of extreme groundwater level estimates considering the key sources of uncertainties on the future climate.

Study area
The study area is located at the city of Silkeborg in the central part of Jutland, Denmark (Fig. 1). The area is dominated by deeply incised valleys formed during melt off from the glacial retreat of the North-east and the Baltic ice sheets 16 000-18 000 yr ago. The subsequent Gudenå River system flows through the city with a topography ranging from 20 to 95 m a.m.s.l. (meters above mean sea level). The area in focus is just north of the Gudenå River, in a part of Silkeborg, where a new motorway is planned.
Toward the north-west, north and east, smaller Gudenå River tributaries form natural hydrogeological boundaries. Toward the west the land surface topography forms a groundwater divide for the upper hydrogeological units and toward the south the Gudenå River valley delineates the hydrological model referred to as the Silkeborg model. The motorway crosses the river valley at the location of the city, and therefore the road level is constructed 6 m below the land surface topography, with a concrete bottom and vertical sheet piling walls. The groundwater level of the shallow terrace aquifer in the river valley is critically near to the road surface of the motorway.

Hydrogeology
The near surface geology at Silkeborg is dominated by glacial clayey tills in the upland areas. Thicknesses of these are up to 35 m and mostly formed as lodgement tills below Weichselian glaciers, when the main advance was located west of Silkeborg before 18 000 yr ago. Below this, coarser glacial sediments of sand and gravel form an upper unconfined aquifer with thicknesses up to 50 m. This sand unit was deposited during the retreat of former ice sheets, although it is perturbed by clayey sediment, mostly in the lower part, evidencing a more complex depositional history. The glacial till and sand are not observed in the Gudenå River valley at Silkeborg. In the valley, at least 3 erosional levels and fluviatile sandy sediments are observed (terrace sediments). The terrace sand was deposited when the glacial front had withdrawn to east of Silkeborg and the Gudenå River system was used as drainage for the melting ice to the Limfjord and later on to Kattegat with connection to the North Atlantic (Fig. 1). A geological cross section is shown in Fig. 2.
Below the Quaternary sediments, Oligocene and Miocene mica clay, mica sand and quartz sand are found. These sediments are observed down to 80-100 m below mean sea level where Eocene marls are found. In the eastern part of the modelled area, buried valleys are included in the geological model. The buried valleys are 6-8 km long, up to 1 km wide, and eroded about 75 m into pre-Quaternary sediments (Jørgensen and Sandersen, 2009). The valleys are possibly backfilled with re-deposited Miocene sediments.

Hydrology
The humid climate in Denmark is dominated by the weather systems of the North Atlantic and the European continent. At the Jutland Peninsula, precipitation varies from coastal zones to inland areas with around 200 mm yr −1 . The highest precipitation is found at the north-south trending topographical ridge just west of Silkeborg. The average precipitation at Silkeborg during the period of 1961-1990 was 903 mm yr −1 with max. monthly values in November of 101 mm month −1 and min. amount in April of 50 mm month −1 (Scharling, 2000, with correction factors from B type shelter from Allerup et al. (1998)). Average potential evapotranspiration for the same period was 546 mm yr −1 , with max. and min. in July and December of 100 and 4 mm month −1 , respectively (Scharling, 2000). Average monthly temperature peaked in July and August with 15.2 • C and had a low in January and February of −0.3 • C (Scharling, 2000). These conditions result in recharge of the groundwater aquifer during late autumn, winter and early spring.

Methodology
The study applies EVA on model predictions of future groundwater levels representing the period of 2081 to 2100. The levels are extracted at a groundwater-sensitive part of the planned motorway from an integrated groundwater-surface water model. Results representing a historic baseline period (1991)(1992)(1993)(1994)(1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010) and the future period (2081-2100) are compared. Estimates of groundwater levels are produced with a nested modelling approach, where a large regional model is used to calculate boundary conditions (BC) for a local model at Silkeborg. Although this approach doubles the number of model runs and data processing, it supplies the primary local model with more realistic BCs for the simulations representing the future (Toews and Allan, 2009). In recent studies the MIKE SHE code (Abbott et al., 1986;Refsgaard and Storm, 1995) has been used to evaluate the effect on surface and sub-surface hydrology by climate change (van Roosmalen et al., 2007(van Roosmalen et al., , 2009Stoll et al., 2011). The Danish National Water Resources Model, also called the DK-model (Henriksen et al., 2003) was used to produce daily updated BCs for the local Silkeborg model.

Regional model
The DK-model consists of 7 subareas with Area 5 covering the middle part of Jutland. Figure 3 shows the DK-model Area 5, further referred to as the DK-model. The model covers 12 501 km 2 with a 500 m × 500 m numerical grid discretisation. The model is set up with the MIKE SHE code coupled with the MIKE11 code and describes overland flow, evapotranspiration, flow in the unsaturated zone, the saturated zone with drainage routing, and river flow. Numerical layering follows a geological model with 11 layers. Geology was initially interpreted in a voxel (volume pixel) framework with cell size of 1000 m × 1000 m × 10 m (xyz). During the latest model update (2005)(2006)(2007)(2008)(2009)), the voxel model was superimposed by local geological models based on the hydrostratigraphic model (Højberg et al., 2010(Højberg et al., , 2013. The model is bounded by the North Sea and Kattegat towards west and east, respectively. Toward the north and south the model is bounded by topographical catchment boundaries. The model was calibrated in non-steady state toward data for the period 2000-2003 with 2592 groundwater head (h) observations and 66 time series of river discharge (Q) with the automated parameter optimiser PEST ver. 11.8 (Doherty, 2010). Besides these observations, observations of mean h, 1990-1999 were also used to design an objective function with 8 weighted criteria representing, water balance, transient error on h and Q, and mean error on h and Q. Further detail on the DK-model and the calibration of the lat-

Local model
The geology illustrated in Fig. 2 was used for the local groundwater and surface water model at Silkeborg. As with the regional model, the model was developed with the coupled MIKE SHE-MIKE11 framework. The model was set up with a 100 m × 100 m numerical grid with 3 vertical layers and a model domain of 103 km 2 , Figs. 3 and 4. The topmost layer, layer 1, follows the terrace sand in the river valleys and glacial clay in the higher elevated areas. This is possible because the MIKE SHE code allows for separate geological and numerical models, with the parameterization following the geological model. Layer 2, follows the glacial sand and layer 3 the pre-Quaternary sediments. Boundary conditions for the three numerical layers are different. The southern boundary at layer 1 is a lake, Silkeborg Langsø (A-B, Fig. 4) and was simulated as a time-variant specified head with daily time steps. In order to estimate lake water stage (h) beyond periods with observations (1990-1995) a Q/h relation was established. Lake stage observations were received from the Silkeborg municipality and flow (Q) from the river discharge station 21.109 (Resenbro, Danish monitoring programme, location A, Fig. 4) just downstream of the lake. Besides section C-D (Fig. 4), boundaries for layer 1 are smaller streams toward the west and north (B-C, D-G) and at the Gudenå River toward the east (G-A). The specified head elevations used to simulate these boundaries are adopted from a detailed digital elevation model. Section C-D follows a topographical low with small ponds but without any connecting stream. The section probably drains toward the southern or northern stream sections and is therefore simulated as a no-flow BC. The glacial sand in layer 2 terminates toward the river valleys surrounding the model (e.g. at the valley slope illustrated in Fig. 2) and therefore a no-flow BC is used for this layer. The pre-Quaternary sediments defining layer 3 crosses the model boundary and interact with regional groundwater systems in areas with coarse sediments, mica and quartz sand. At the southern and eastern model boundaries, only the fine-grained pre-Quaternary sediments are observed and section F-B is therefore defined as a no-flow BC. The remaining boundary for layer 3 (B-F) is open for exchange via a transient specified head BC. Daily head levels are simulated by the regional model for which one of the layers is vertically aligned with layer 3 in the local model. The different horizontal cell discretisations between the models, 500 and 100 m, causes that several boundary cells in the local model receive head levels from the same 500 m cell in the regional model.
The area in focus is located in the city of Silkeborg and therefore a paved area coefficient is used to describe direct runoff in urbanized areas to streams. Paved areas are illustrated in Fig. 1. The chosen coefficient of 0.33 for the town area is derived from an estimate that one third of the town area is covered by pavement or buildings whereas the rest is covered by recreational areas (grass/forest). In the model the paved area coefficient implies that one third of the precipitation for each time step is routed directly to the closest stream, whereas the rest will be available for infiltration.

Silkeborg model calibration
Calibration of the model focused on the critical zone for the motorway regarding groundwater flooding (Fig. 2). Optimization of model parameters was done inversely with PEST (Parameter Estimation) ver. 11.8 (Doherty, 2010). PEST op-timization is not available within the MIKE SHE graphical user interface; therefore, the setup of PEST and optimization were performed outside of this interface. The model was run for the period 1990-2012 with 1990-2009 as the warm up and 2010-2012 as the calibration period. The warm up period is relatively long because of large groundwater extractions in the early 1990s in the terrace aquifer and because the initial conditions affect model predictions for several years. Groundwater extraction in the terrace aquifer has been steady the last 15 yr. Observation data consist of a number of measurements from three categories: (i) historical head measurements from the Danish national borehole archive (Jupiter), often with a single or a few measurements dating from 1990-2011. (ii) Time series of daily head measurements for the period 2010-2012. (iii) Stream discharge observations from 4 stations ( Fig. 4) during 2011 and 2012. The objective function (Eq. 1) was defined with 7 weighted groups (Table 1). Hobs mean is the error on average h from group (i) for the period 1990-2011 compared with average h for the calibration period. HTS ME describe mean error for daily hydraulic head measurements from group (ii). HTS ErrAmpl is the maximum annual amplitude (fluctuation) error of h from group (ii). The last four terms in the objective function are the winter, spring, summer and autumn water balance errors of stream discharge from group (iii).
The weights are uniform within observation groups in the objective function, but the weighing is initially adjusted in such way that the starting footprint (sum of weight times observation error) of different h groups in the objective function reflected the modelling focus on predicting hydraulic heads along the highway transect in the terrace aquifer. Thus, observations in the HTS ME and HTS ErrAmpl groups receive the highest weights. The Qbal groups were given a lower weight than the h groups. The Qbal groups are weighted according to discharge volumes, highest during winter, less during spring and autumn, and lowest during summer. Initial weights (group footprints) are shown in Table 1. The selection of calibration parameters was based on a sensitivity analysis on parameters for geological units (horizontal and vertical hydraulic conductivity, specific storage and specific yield). Furthermore, sensitivity of drain constant, detention storage, Manning number (overland flow) and conductance for general head boundaries were tested. Parameters included in the inverse calibration were horizontal hydraulic conductivity (fixed K h : K v ratio of 1 : 10) of 5 geological units, horizontal and vertical hydraulic conductivity and specific yield of the upper aquifer (terrace sand with most h observations), one specific storage parameter for clay units and one specific storage parameter for sand units, Manning number for overland flow, detention storage, and conductance for the lake general head boundary condition.

Climatic baseline data
Daily climatic data for the hydrological models, i.e. precipitation (P ), temperature (T ), and reference evapotranspiration (E r ), were obtained for the period 1991-2010 (baseline period) in a grid format from the Danish Meteorological Institute. Calculation of areal grid-values, 20 km × 20 km size for T and E r , and 10 km × 10 km size for P , relies on a nationwide network of climate stations. The methodology used for making the grid interpolations can be found in Scharling et al. (2000). Grid values of P were catch corrected with a dy-namic correction model originally developed by Allerup et al. (1998) but applied on grid values by Stisen et al. (2012). The catch-correction model is a spatially distributed model which mainly uses wind speed to bias-correct measured daily rainfall.

Ensemble of climate models representing future weather
In the ENSEMBLES project , future climate projections have been made for Europe with many combinations of GCMs and RCMs for the A1B emission scenario. In the present study we have used data from nine of these GCM-RCM combinations (Table 2) for the period 1991-2010 (control period) and 2081-2100. Output from the RCMs have been transferred to a 10 km × 10 km grid discretisation for precipitation, temperature and reference evaporation and two different methods for bias correction have been applied: -Delta Change (DC). DC is the simplest and the most common downscaling method. The key principle is that the future climate is described by the historical climate data corrected by monthly change factors derived from the climate model projections, e.g. daily precipitation values for January 2081 consist of observed precipitation for January 1991 multiplied by the ratio between average January precipitation projected for the future period 2081-2100 and average January values projected for the control period 1991-2010. This implies that results from the climate models are not used directly, only the change in projected average monthly precipitation is used. DC is well proven and well suited for studies focussing on effects of average climate factors such as groundwater recharge and average groundwater heads (van Roosmalen et al. 2007(van Roosmalen et al. , 2011. -Distribution Based Scaling (DBS). DBS is a so-called direct method that corrects the outputs from the climate model and only uses observed data to estimate correction parameters (Piani et al., 2010). In the DBS method the climate model data and the observed data in the control period are fitted to two different double gamma distributions. The difference between these two gamma distributions represents the correction made by the DBS and the climate model simulations for the future period are then corrected by using this correction. While the DC method can preserve the projected changes in mean values, the DBS method can also preserve the projected changes in other statistical properties and is therefore theoretically better suited for extreme values.
Both methods include downscaling as well as bias correction. In this paper we refer to both processes with the single term downscaling.
More information on the DC and DBS methodologies and their implementation is provided by Seaby et al. (2013), who documented that the DBS is able to correct direct data from all RCMs so that they reproduce extreme precipitation in the control period. For the present study we have extracted climate model results from the 10 km grid covering the local model area in Silkeborg.

Climate change simulation with groundwater models
Applying a hydrological model, developed for present conditions, to simulate future conditions involves a number of assumptions. Calibration parameters and model structure are assumed constant throughout the 21st century. Land use and agricultural practice will most likely change but how and to which degree is uncertain. Future groundwater extraction is assumed to be the same as the average for the period 2003-2010. The baseline model run, applying climatic observations from 1991-2010, is also run with the constant pumping value from 2003-2010.

Extreme value analysis
Extreme value analysis (EVA) was applied to projections of future extreme high groundwater levels from the hydrological model. EVA focuses on the tail of the distribution, e.g. the lowest or highest percentile of values in a dataset. A suitable probability distribution is fitted to the selected extreme values and from this distribution, hydraulic heads corresponding to given return periods can be estimated. Within hydrology the double exponential, or Gumbel distribution, often approximates events (x) in the upper tail of distribution (Eq. 2, Gumbel, 1958).
Parameters α and β are found by a maximum likelihood method and the standard error of the estimate of the extreme value with a T-year return interval is calculated with 95 % confidence limit as: Where n is the number of annual maximum values (n = 20), s x the standard deviation of the 20 annual maximum values, and K T is the frequency factor only dependent on the return period. The calculated 95 % confidence limits will be referred to as the error bound for the Gumbel distribution. Model projections of (5 day-average) hydraulic head in the upper aquifer from 20 zones along the motorway were extracted from the baseline and ensemble runs. Annual maxima for the 20 yr were then found and sorted according to value. Two approaches were used to analyse climate change impacts on simulated hydraulic heads at each zone. (i) The DC dataset are 9 series of 20 annual maxima-future simulated hydraulic heads (at each of the 20 zones). The 20 yr average for the baseline simulation was subsequently subtracted from each of the DC members, e.g. relative climate change impact to average conditions of the baseline simulation. From these mean values and values of the upper 95 %, confidence limits of the dataset were calculated. Gumbel distributions were then fitted to both the mean dataset and the upper 95 % dataset. (ii) The procedure for the DBS and DC methods were the same with the following exception: instead of subtracting the 20 yr average from one baseline model using the observed climate data, the average was subtracted from each of the 9 DBS baseline models, e.g. the 20 yr average for one DBS simulation of the baseline period was subtracted from the equivalent annual maxima DBS simulation of the future period.

Model calibration
The calibrated model shows a distribution of mean error with best fit in the terrace sand (Fig. 5). This is not surprising because a majority of observations are located in this part of the model, and HTS ME with all its observations in the terrace sand has more than half of the total initial weight in the objective function (Table 1). Areas of the model with high topographical gradients (e.g. where the motorway leaves the river valley toward the north-west and south-east) produce high mean errors on head. Besides the obvious difficulties with having a high topography gradient and uniform model discretisation, a likely reason for the head error is the model simplification of the heterogeneous geology in the transition between the river valley deposits and the upland glacial sand and clay.
The model produces small errors of less than 0.1 m, on average, on the h-amplitude (HTS ErrAmpl). Optimized hydraulic conductivities for the glacial units (terrace sand, glacial sand, glacial clay) are within expected values and the 95 % confidence limits are relatively narrow except for the glacial clay, Fig. 6. This is likely because of a small number of observations in this clay unit and the K h is in the upper end of the expected range.
The difference between K h for the mica sand and clay appears to be a bit narrow. Boreholes penetrating the pre-Quaternary deposits seem to suggest that only a small lithological difference is present between the two units, e.g. sand layers dominated by fine sand, and clay layers by silt. This is exemplified by K h for the lowest of the two pre-Quaternary sand units, which is close to the value of the pre-Quaternary clay.

Climate change parameters
Results from the DC and DBS climate ensembles are compared with observations from the baseline period (1991-2010) for precipitation, temperature, and reference evapotranspiration (Fig. 7). The DC method (ensemble average) projects future precipitation similar to baseline observations in February, April and May, a decrease from June to October, and an increase in November, December, January and March. The DBS method (ensemble average) projects a decrease in precipitation in June and from August to October, and an increase in January, April, May, November and December. The climate models disagree by up to 2 mm day −1 of precipitation during summer and in September. Except for January, June and November, ranges of projections in the DBS ensemble are wider than in the DC ensemble. Temperatures are projected to increase throughout the year with the highest relative increase during winter. The ensemble representing the future seems well separated from the baseline period, indicating a clear, positive trend in temperature changes. Projected future-reference evapotranspiration shows the same trend as temperature, which is not surprising because of its direct correlation.

Analysis of extreme groundwater levels
EVA was performed for the 20 zones at the motorway with estimation of Gumbel parameters for each zone.   Figure 9 shows the Gumbel distribution for the modelled hy-draulic heads based on the observed period 1991-2010 with associated error bounds together with the 2081-2100 DBS and DC distributions (depicted with the same red and blue lines as in Fig. 8). At zone 34, the upper error bound for the Gumbel distribution (1991Gumbel distribution ( -2010 exceeds the mean future ensemble estimate for return periods longer than 10 and 30 yr, DC and DBS, respectively.

Uncertainties of future extreme groundwater levels
Several sources of uncertainty affect the estimation of future extreme groundwater levels. Firstly, the estimation of the future climate is challenging. In this study it is handled by applying an ensemble of climate projections from 9 combinations of global and regional climate models. Secondly, climate model results are downscaled using two different methods. The two methods provide alternative results illustrating the uncertainty in choice of downscaling. Thirdly, estimation of extreme values from the simulated groundwater levels  involves uncertainty related to fitting the Gumbel distribution, and this uncertainty is described by error bounds on the estimated extreme values. We will characterise the uncertainty from these three sources as the interval between the upper and lower 95 % confidence values, equivalent to four times the standard deviation of an estimated value. Based on the results shown in Table 3 we find: -Extreme value analysis. The ± in Table 3 represents half of the 95 % confidence interval. Hence, the uncertainty related to the Gumbel distribution is quantified as the average of the error from each of the two downscaling methods and the two climate values (mean and upper 95 % ensemble) multiplied by 2. For instance, at zone 30 the EVA uncertainty for T 21 would be ((0.29 + 0.29 + 0.31 + 0.29)/4)*2 = 0.59 m (Table 4).
-Climate models. The difference between mean ensemble and upper 95 % ensemble represents half of the 95 % confidence interval. Hence the climate model uncertainty is estimated as this difference multiplied by 2, averaged over the two downscaling methods. For example, at zone 30 the climate model uncertainty for T 21 would be (((1.21-0.94) + (1.40-1.10))/2)*2 = 0.57 m (Table 4).
-Downscaling. The two downscaling methods are assumed equally likely and therefore the uncertainty is considered as four standard deviations, where the standard deviation is calculated from the two known random variables (the DC and DBS estimate). For example, at zone 30 the downscaling uncertainty for T 21 would be 4*(standard deviation{0.94; 1.10}) = 0.34 m ( Table 4).
Assuming that the three sources of uncertainty are independent the total uncertainty, σ total , can be assessed by σ total = σ climatemodel 2 + σ downscaling 2 + σ EVA 2 (4) where σ climatemodel , σ downscaling and σ EVA are the uncertainties related to climate models, downscaling and extreme value analysis. Figure 10 shows the three uncertainty components. The results in the figure are calculated as the average of the three uncertainty components calculated for each of the 20 zones (for each T event). In Fig. 10 it is observed that the uncertainty from climate models and the extreme value analysis are the two dominating sources of uncertainty. Climate model uncertainty is almost constant for different return periods, while EVA uncertainty increases with higher return periods (also see Figs. 9, 10 and Table 4).

Climate change impacts on extreme groundwater levels in relation to infrastructure design
The projected change of extreme groundwater levels between today's climate and future climate is modest. The extreme value analysis shows changes of only tens of centimetres for T 100 events (zone 34 and 50, Fig. 9). The estimate based on the upper 95 % confidence limit of the projection with the 9 climate models gives, naturally, higher values, Table 3. The modest climate change impact at the investigated aquifer is a result of site specific conditions. Two interacting groundwater conditions, drainage, and the hydraulic conductivity of the aquifer affect extreme groundwater levels. The high conductivity of the aquifer will remove groundwater towards hydraulic boundaries as drains, streams, and lakes with a relatively low response time, implying that higher groundwater levels quickly will be reduced. With a good connectivity between the aquifer and the drainage system, the elevation of the drains will confine groundwater levels. In contrast to drainage of the aquifer, which reduces the extreme events, increased recharge from connecting aquifers and the unsaturated zone will tend to amplify extreme events. At the study site in Silkeborg the potential rate of drainage seems high compared to the potential rate of recharge. This relation between aquifer recharge/discharge is obviously very site specific, therefore, the potential impact of climate change for extreme groundwater levels is also very site specific. One aspect not considered in this study is anthropogenic influence on the hydrological system in the future. Changing land use and development of the drainage system could affect the aquifer recharge/discharge relation and thereby extreme groundwater levels. Drainage systems and land use will be part of future adaptation measures and include feedback to the groundwater system (Holman et al., 2012). This is not taken into account in the present study.
Extreme value analysis for groundwater systems in a future climate has, to the knowledge of the authors, not been presented in the literature before. As noticed, attempts have been made to use an EVA methodology within the area of groundwater flooding. The study by Najib et al. (2008) introduced a methodology to perform flood frequency analysis and estimate hydraulic heads for 100 yr events (T 100 ). The underlying objective, to implement flood hazard assessment at a groundwater-dominated hydrological regime by estimating the 100 yr event at a given site, is the same as in the present study. Three major differences between the studies are observed. Firstly, Najib et al. (2008) investigated a dual or triple porosity carbonate aquifer with hydraulic head variations of up to 90 m, whereas the terrace sand aquifer in Silkeborg only had a tens of centimetres of observed variation and is relatively homogeneous compared to the aquifer in Southern France. Secondly, Najib et al. (2008) reconstructed hydraulic heads used for the EVA by a global reservoir model with a non-physically based parameterization. Calibration of parameters was done for individual sites with observed hydraulic head and precipitation data. This is fundamentally different from the three dimensional, physically based groundwater-surface water model used in the present study. The non-physical description in Najib et al. (2008) fits observed data very well because parameterization is done locally toward local observations, whereas calibration of a 3-D groundwater model, through the objective function, attempts to make the best overall parameterization toward widely distributed observations. The general discussion for and against models as global reservoir models versus more physically based models as MIKE SHE or MODFLOW models is beyond the scope of this paper. Nevertheless, in respect to simulation of future conditions one could argue that a physically based parameterization is perhaps more robust for simulations with changing climatic input because at least the physical system is described with some confidence. Thirdly, the present study includes climate change impacts in the EVA for hydraulic heads. This leads to an estimation of T-years representing the last 20 yr of the 21st century and not a representation of the next 100 yr with today's climate as shown by Najib et al. (2008).

Uncertainty of extreme groundwater level estimates
The largest uncertainty for the extreme groundwater levels is the extreme value analysis. Results clearly show that especially at the upper return periods of the distribution for groundwater head predictions, the extreme value analysis dominates the uncertainty. Uncertainties for climate models are also substantial for the predictions and are in average 0.46 m for all T estimations (Fig. 10) but vary from 0.11 to 0.59 m between zones (Table 4). In other words, the uncertainty from climate models is the same for a T 21 and a T 100 estimate of hydraulic head for a given zone. This could be expected as the changes applied through the DC and DBS methods are uniform throughout the simulated future period of 2081-2100.
Uncertainty on the Gumbel prediction ranges between 0.29 and 0.92 m from T 10 to T 100 (Table 4). This uncertainty is a result of uncertainty in the estimation of parameter values in the Gumbel distribution because of limited data, 20 yr, and hence it could be reduced by selecting a longer period than 20 yr. Another uncertainty related to extreme value analysis that we have not addressed in the present study is related to parameter estimation methods, selection of extreme values, etc. Najib et al. (2008) compared six different T 100 estimates using an annual maximum series methodology, and a peak over-threshold methodology, both with parameter estimations using the method of moments, the maximum likelihood method and the probability weighted moment method. These six combinations of methods gave very similar T 100 estimates and standard deviations for the estimate of hydraulic head and thus justify the current use of only one method. In climate change studies it is critical not to select periods that are too long because the climate conditions do not honour the stationarity condition, which is an underlying assumption used in extreme value analysis. Our results suggest that when choosing a 20 yr period the uncertainty due to the extreme value analysis is significant compared to uncertainties due to climate models and downscaling methods.
The lack of studies investigating extreme groundwater conditions under future climate makes it difficult to compare the relative size of uncertainty sources found in this study. A general comparison can nevertheless be made to impact studies within other areas of hydrology. One study investigating the uncertainty distribution was Graham et al. (2007), where future river runoff is estimated with a combination of GCMs, RCMs, and two downscaling methods equivalent to the DC and DBS methods used here. Graham et al. (2007) concluded that large uncertainty is associated with the choice of climate model and, more important in relation to the present study, the choice of downscaling method affects prediction of extreme runoff events and seasonal dynamics, whereas the prediction of runoff volumes is not sensitive to the downscaling method. In this context, testing of different downscaling methods is very relevant when dealing with extreme hydrological events. A groundwater recharge study by Allen et al. (2010) also concluded that downscaling can cause high uncertainty of extreme values. The reason for the relatively modest downscaling uncertainty in the presented study is the local downscaling approach where bias correction of precipitation is done for the 10 km × 10 km grid values.
The findings from the Silkeborg case are, in principle, sitespecific. The estimated changes for future extreme groundwater levels are a result of the hydrogeological set-up for the aquifer at Silkeborg, the climatological changes projected for this region, and the hydrological model's ability to sim-ulate the natural and highly urbanized area in a trustworthy manner. Findings regarding the three analysed sources of uncertainty are limited by not including all possible sources of uncertainty, e.g. hydrological model structure uncertainty and model parameter uncertainty. Concerning projections of future climate, uncertainty from CO 2 emissions are not included in the study, which constitutes another limitation. However, findings by Hawkins and Sutton (2011) show that even at the end of the 21st century, climate variability and climate models constitute larger uncertainty sources than CO 2 emission scenarios. Another limitation, and probably the most critical, is the future change of land use, urbanization, drainage system development, and other anthropogenically introduced changes on the hydrological system.

Conclusions
Extreme groundwater levels found in this study in terms of T 21−100 events are modest. For a 100 yr event, groundwater levels are 0.97 to 1.57 m higher than today's average groundwater level (mean ensemble with distribution based scaling as downscaling method and the same estimate with the delta change method gives estimates of 1.11 to 1.52 m). Groundwater levels for a 21 yr event are 0.77 to 1.18 m higher than today's average groundwater level (mean ensemble, DBS, and 0.85 to 1.12 m, mean ensemble, DC). Results show higher extreme groundwater levels with the distribution based scaling than with the delta change method. Furthermore, groundwater levels for extreme events differ more from zone to zone using the distribution based scaling.
Three sources of uncertainty were investigated in this study: uncertainty due to climate models (an ensemble of nine combinations of global and regional climate models were used), uncertainty due to downscaling (two methods were used), and uncertainty due to the applied extreme values analysis. The uncertainty contribution from the three sources, especially for the higher return periods, is dominated by the extreme values analysis. While uncertainty from the choice of downscaling and climate model is around 20 cm, and 46 cm, respectively, uncertainty from the extreme value analysis increases from 45 (T 10 ) to about 82 cm (T 100 ). The total uncertainty contribution from the three sources is around 67 cm for the estimation of a 10 yr event and around 96 cm for a 100 yr event (with the definition of uncertainty as four times the standard deviation). Compared to the estimates of groundwater levels during a 100 yr event (0.23-1.22 m), the uncertainties from the three sources are very high. If uncertainty is considered in this simplistic way, downscaling accounts for 4 % of uncertainty from the three sources, climate models for 23 %, and extreme value statistics for 73 % (for estimation of a 100 yr event).