A simple groundwater scheme in the TRIP river routing model: global off-line evaluation against GRACE terrestrial water storage estimates and observed river discharges

Groundwater is a non-negligible component of the global hydrological cycle, and its interaction with overlying unsaturated zones can influence water and energy fluxes between the land surface and the atmosphere. Despite its importance, groundwater is not yet represented in most climate models. In this paper, the simple groundwater scheme implemented in the Total Runoff Integrating Pathways (TRIP) river routing model is applied in off-line mode at global scale using a 0.5° model resolution. The simulated river discharges are evaluated against a large dataset of about 3500 gauging stations compiled from the Global Data Runoff Center (GRDC) and other sources, while the terrestrial water storage (TWS) variations derived from the Gravity Recovery and Climate Experiment (GRACE) satellite mission help to evaluate the simulated TWS. The forcing fields (surface runoff and deep drainage) come from an independent simulation of the Interactions between Soil-Biosphere-Atmosphere (ISBA) land surface model covering the period from 1950 to 2008. Results show that groundwater improves the efficiency scores for about 70% of the gauging stations and deteriorates them for 15%. The simulated TWS are also in better agreement with the GRACE estimates. These results are mainly explained by the lag introduced by the low-frequency variations of groundwater, which tend to shift and smooth the simulated river discharges and TWS. A sensitivity study on the global precipitation forcing used in ISBA to produce the forcing fields is also proposed. It shows that the groundwater scheme is not influenced by the uncertainties in precipitation data.


Introduction
Land surface processes considerably influence the global climate system (Dirmeyer, 2001;Dirmeyer et al., 2000;Douville, , 2004Koster et al., 2000Koster et al., , 2002. They can affect the water and energy exchanges between land surface and atmosphere, the ocean temperature and salinity at the outlet of the largest rivers (Durand et al., 2011), and the climate, at least at regional scales (Alkama et al., 2007;Douville et al., 2000;Gedney et al., 2000;Lawrence and Slater, 2008;Molod et al., 2004). In climate models, these land surface processes are parameterized in the continental hydrological systems (CHSs), which are composed of land surface models (LSMs) generally coupled with river routing models (RRMs). LSMs provide the lower boundary conditions of temperature and moisture for atmospheric processes in atmospheric general circulation models (AGCMs), while RRMs convert the total runoff provided by LSMs into river discharges in order to evaluate the simulated water budget and transfer the continental fresh water to oceans, thereby closing the global hydrological cycle.
Despite its long response time, groundwater is an important component of the continental part of the global hydrological cycle. It represents about 30 % of the continental fresh water reservoir, and its interaction with the soil surface is likely to influence the soil moisture in unsaturated zones and thus the water and energy exchanges with the lower atmosphere (Anyah et al., 2008;Shiklomanov and Rodda, 2003). Moreover, it helps to sustain river base flows during the dry season in temperate zones, whereas it receives seepage from rivers in arid regions (Brunke and Gonser, 1997).
Published by Copernicus Publications on behalf of the European Geosciences Union.

3890
J.-P. Vergnes and B. Decharme: A simple groundwater scheme in the TRIP river routing model However, these groundwater processes are not yet included in most of the land surface parameterizations used in climate models. Considering their importance, the need to introduce them in CHSs has received increasing attention during recent years Decharme et al., 2010;van den Hurk et al., 2005;Maxwell and Miller, 2005;Yeh and Eltahir, 2005). The slow component of groundwater is thought to play an important role in climate models since they suffer from a lack of persistence in their land surface parameterizations Lam et al., 2011;Weisheimer et al., 2011). For example, Fan et al. (2007) analyzed a large dataset of water table observations over the United States and found that the groundwater reservoir had the potential to increase soil moisture memory. Alkama et al. (2010) compared global hydrological outputs from the Interactions between Soil-Biosphere-Atmosphere-Total Runoff Integrating Pathway (ISBA-TRIP) CHS to observed river discharges and terrestrial water storage (TWS) variations estimated from the Gravity Recovery and Climate Experiment (GRACE) satellite mission. They concluded that the underestimation of the simulated continental evaporation and the overestimation of the simulated annual discharges could be due to the lack of a groundwater reservoir. More recently, Lam et al. (2011) demonstrated that groundwater was a source of dry season evaporation and river base flow because it introduced a memory effect in land surface processing.
In this context, some attempts have been made to incorporate groundwater processes in CHSs. Two-dimensional diffusive groundwater models have been employed, but at smaller scales and mostly for regional applications Habets et al., 2008;Miguez-Macho et al., 2007;York et al., 2002). Such models are generally made for fine resolution grids using many parameters calibrated against insitu measurements, and so are not yet suitable for large scale application. Several studies have proposed adding a simple pseudo-groundwater reservoir into RRMs for global applications using a time delay factor only to delay the flow to the river, but without explicit groundwater dynamics (Alcamo et al., 2003;Arora and Boer, 1999;Decharme et al., 2010;Ngo-Duc et al., 2007). Other approaches have proposed introducing a very deep soil layer in one-dimensional LSMs to act as a groundwater component but neglect lateral diffusive flows (Chen and Hu, 2004;Gedney and Cox, 2003;Liang et al., 2003;Lo et al., 2010;Maxwell and Miller, 2005;Niu et al., 2007;Yeh and Eltahir, 2005).
However, validating a groundwater model is not always feasible because in-situ observations are lacking at global scale. Moreover, the observed water table depth presents great spatial variability due to heterogeneities in geological structures and the use of groundwater for human activities. Today, the GRACE satellite mission provides timevariable gravity field solutions which allow direct evaluation of the simulated TWS variations, i.e. the evolution of the sum of snow, ice, surface water, soil moisture, and groundwater reservoirs. Previous studies pointed out the possibility of us-ing the GRACE data to estimate TWS from basin (Crowley et al., 2006;Seo et al., 2006) to continent scale (Schmidt et al., 2006;Tapley et al., 2004), as well as groundwater storage variations (Rodell et al., 2007(Rodell et al., , 2009Yeh et al., 2006) or hydrological fluxes Ramillien et al., 2006;Swenson and Wahr, 2006;Syed et al., 2005). GRACE can also be used to evaluate simulated water storage Decharme et al., 2010;Güntner, 2008;Ngo-Duc et al., 2007;Ramillien et al., 2008;Swenson and Milly, 2006) or water table depth (Lo et al., 2010;Niu et al., 2007) in LSMs.
At the Centre National de Recherches Météorologiques (CNRM), the ISBA-TRIP CHS is used in the CNRM-CM Earth system model (Voldoire et al., 2012). Recently, a simple representation of groundwater has been developed in the TRIP RRM and tested with success over France in off-line mode (Vergnes et al., 2012). A simple methodology has been constructed to estimate the groundwater parameters and delineate the aquifer limits. The main advantage of this methodology is that it uses lithological and hydrogeological information available at global scale, which allows global applications of the groundwater scheme. This study underlines the impact of groundwater processes on the simulated river discharges, and demonstrates the feasibility of using this scheme at the resolution and time scales of climate models.
The main goals of the present paper are to present the global evaluation of TRIP including explicit groundwater processes. This evaluation is carried out at 0.5 • resolution using in-situ river discharges provided by the Global Runoff Data Centre (GRDC) completed with other national or regional datasets, and GRACE TWS variation estimates over the largest river basins of the world. The model is forced by surface runoff and deep drainage derived from a preexisting ISBA simulation performed on the 1950-2008 period. The precipitation dataset fed into ISBA to produce these forcing fields comes from the Global Precipitation Climatology Center (GPCC). Supplementary experiments using precipitation data from the Climate Research Unit (CRU) are also presented in order to explore the model's sensitivity to precipitation.

The TRIP model
The TRIP RRM was originally developed at Tokyo University by Oki and Sud (1998). It is a simple RRM used to convert the daily simulated runoff from ISBA into river discharges on a global river channel network at 0.5 • resolution. The original TRIP model used at Météo-France is described in detail in Decharme et al. (2010).
TRIP is based on a prognostic mass balance equation for the stream water mass, solved at a 60-min time step on each cell of the river network: S (kg) is the water mass, Q S in (kg s −1 ) is the sum of the surface runoff from ISBA within the grid cell with the water inflow from the upstream neighboring grid cells, Q S riv (kg s −1 ) is the groundwater-river exchanges and Q S out (kg s −1 ) the river discharge into the downstream cell, computed as follows: where v (m s −1 ) is the streamflow velocity computed via Manning's formula (Arora and Boer, 1999;Decharme et al., 2010), and L (m) the length of the river inside the cell. In this study, the new version of TRIP including a simple groundwater scheme is used (Vergnes et al., 2012). The simple groundwater scheme is based on the two-dimensional groundwater flow equation for the piezometric head H . This equation is solved using an implicit finite-difference numerical method based on the MODCOU hydrogeological model (Ledoux et al., 1989) with a time step of one day. It was rewritten in spherical coordinates in order to take the spherical form of the Earth into account and to be solvable on the regular longitude/latitude grids generally used in most CHSs: Only the uppermost unconfined aquifer representing one layer is solved. ω (m 3 m −3 ) is the specific yield corresponding here to the effective porosity; θ and φ are the longitude and latitude coordinates, respectively; r (m) is the mean radius of the Earth; T θ and Tφ (m 2 s −1 ) are the transmissivities along the longitude and latitude axes, respectively; and q sb (m s −1 ) is the deep drainage from ISBA per unit area of aquifer and q riv (m s −1 ) the groundwater-river flux per unit area. This equation is then discretized and solved in m 3 s −1 in TRIP. The conceptual approach to compute the groundwaterriver exchanges in m 3 s −1 is based on the following parameterization: Q riv (m 3 s −1 ) represents the groundwater-river exchange; Z bed (m) is the river bed elevation; H riv (m) is the river stage elevation; Z (m) is the elevation in the grid cell; W (m) is the river width within the grid cell; h c (m) and h s (m) are the river bankfull height and river water height, respectively, as defined in Decharme et al. (2012); and τ (s) is the coefficient of the transfer time of water through the river bed sediment (Vergnes et al., 2012). The geometry of the river is summarized in Fig. 1. As previously noted, Q riv is expressed in m 3 s −1 when Eq. (3) is solved and therefore needs to be converted into kg s −1 before being used in Eq.
(1). In other words, Q S riv = Q riv ρ with ρ (kg m −3 ) the water density. Equation (4a) corresponds to the case where the water table is connected to the river and Eq. (4b) to the case where they are disconnected. Each grid cell is considered as a river cell in TRIP and can therefore exchange water as a gaining or losing stream. If the flow is from river to groundwater and h s falls below 10 cm, Q S riv is set to zero to avoid a completely empty river and/or negative discharges. More details on the concept and numerical method of the groundwater scheme can be found in Vergnes et al. (2012).

Parameters
The elevation (Z in Eq. 5b) is derived from the Global Multiresolution Terrain Elevation Data 2010 (GMTED2010) provided at 30 arcseconds resolution (Danielson and Gesch, 2011). A first step consists in constructing this elevation at the intermediate resolution of 1/12 • : The elevation of each grid cell is computed as the mean value of the first decile of the actual 30 arcseconds resolution topographic values within the grid cell, ranked in ascending order. This global elevation is then calculated at 0.5 • by taking the average of all the 1/12 • topographic values within each 0.5 • grid cell. It helps us to compute the river bed elevation Z bed and reflect the altitude of the river in the grid cell. Previous results over France show that using such intermediate resolution allows us to construct low-resolution elevation that gives more realistic simulated river discharges (Vergnes et al., 2012).
The river width W is a parameter of primary importance because it is used in the river flow velocity computation and in the calculation of the river conductance (RC) in Eq. (5a). It is estimated over each basin via a geomorphological relationship using the mean annual discharges at each river cross section. More details about the W calculation can be found in Decharme et al. (2012).
In Vergnes et al. (2012), a method for constructing the geometry of the aquifers and estimating the groundwater parameters was tested with success at coarse resolution over France. Here, a similar method is used to define these parameters. The main advantage of this method is that it uses lithological and hydrogeological information available at global scale. Only major regional groundwater basins concerned by diffusive groundwater movements are taken into account because of the coarse resolution of the model at this scale. The global map of the groundwater resources of the world from the World-wide Hydrogeological Mapping and Assessment Programme (WHYMAP; http://www.whymap.org) is used as the primary information to delineate such domains. This map is divided into three main hydrogeological units. The "major groundwater basins" concern the sedimentary basins of permeable porous and fractured rocks, and also the alluvial plains with high permeability materials such as gravel or sand, and are therefore to be simulated. The "local and shallow aquifers" correspond to the old geological platforms or shields characterized by crystalline rocks with scattered, superficial aquifers, and are not considered. Finally, the "complex hydrogeological structures" group together complex aquifer systems. For example, karstic areas or orogens belong to this category, but are generally not concerned by regional groundwater flow and are assumed not to be simulated. Conversely, alluvial aquifers formed by the deposition of weathered materials can be found in this category. Such formations contain regional and continuous aquifer that must be taken into account, as it is the case, for example, over the Rhone River basin in France (Vergnes et al., 2012).
As a consequence, to deal with this "complex hydrogeological structures" category, two supplementary digital maps are used. First, a slope criterion is applied to remove the mountainous cells. More details on the computation of this criterion can be found in Vergnes et al. (2012). Secondly, the global map of lithology from Dürr et al. (2005) helps us to refine the limits of the aquifers by keeping or removing some of these complex areas. The final aquifer map at 0.5 • resolution is shown in Fig. 2a (gray-shaded areas). Note that the aquifer mask constructed over France from Vergnes et al. (2012) was incorporated into the final global map. In addition, a more precise hydrogeological map over the United States (USGS; http://www.nationalatlas.gov/index.html) was used to refine the geometry of the aquifers because of the lower accuracy of WHYMAP in this region. In particular, we chose to keep only the sandstone aquifers in the region located under the Great Lakes, embracing a part of the Mississippi watershed. Moreover, the carbonate-rock aquifers that are located in this region were removed since they tend to create karstic topography with rapid flow of water. Finally, the fraction of whole continents covered by the final aquifer map is about 43 % after removing Antarctica and Greenland.
The coefficient τ varies arbitrarily from 30 days in major river streams to 5 days in the upstream grid cells, through a linear relationship with the river stream order SO given by the TRIP river network in each grid cell of a given basin (Vergnes et al., 2012). This parameterization is introduced in order to take account of the supposed smaller thickness of riverbed sediments in upstream grid cells, which tends to make the groundwater-river exchanges quicker than for downstream, large rivers. Finally, transmissivity and effective porosity are estimated by taking mean values from the literature and chosen to be physically consistent for each unit of lithology encountered over the aquifers (Fig. 2b). These values are summarized in Table 1. Note that the values defined for the "Other rocks" type are given for information only since these rock types hardly appear in Fig. 2b.

Experiments
An off-line hydrological simulation with the groundwater scheme (GW) was compared to a control experiment without groundwater (NOGW). TRIP was integrated at 0.5 • resolution using a 60-min time step over the 1950-2008 period. The total runoff, i.e. surface runoff and deep drainage, came from a long-term ISBA simulation evaluated in Alkama et al. (2011). This simulation was forced by the global meteorological forcing from Princeton University (http://hydrology. princeton.edu) (Sheffield et al., 2006), where the precipitation is hybridized with the Global Precipitation Climatology Center (GPCC) datasets since the GPCC climatology certainly appears to be the best dataset for global hydrological applications (Decharme and Douville, 2006). A second ISBA simulation was used in this study. It was performed over the same period using the same meteorological forcing except for precipitation, which was hybridized with the Climate Research Unit's (CRU) precipitation dataset. The resulting additional total runoff forcing enabled a supplementary TRIP simulation with groundwater (GWCRU) to be produced in order to explore the sensitivity of the groundwater scheme to precipitation.
TRIP computes water table heads and river discharges for every day. In order to start the model at equilibrium, a simplified version of the groundwater scheme resolving Eq. (3) at steady state was used to compute an equilibrium state of the water table. This equilibrium state was reached using the annual average for 1950-1959 of the deep drainage from ISBA and the river water height h s (Eq. 5c) from NOGW. An additional spin-up was performed by TRIP over the same period  Dürr et al. (2005) over these aquifers. The green contours delineate the major river basins of the world. and then the model was evaluated over the period from 1960 to 2008. The monthly TWS variations simulated by ISBA-TRIP were calculated in terms of anomalies ( TWS in cm) as the sum of total soil moisture W, snow water equivalent W s , vegetation interception W r , stream water content S and groundwater reservoir H, if necessary: The groundwater head variations H needed to be multiplied by the specific yield ω to be converted into groundwater storage variations. The TWS evaluation was then carried out from August 2002 to August 2008.

Evaluation datasets
A list of about 3500 gauging stations distributed over the globe was drawn up to evaluate the monthly simulated river discharges, with 1900 of them potentially impacted by the groundwater scheme (Fig. 2a). The majority of these in-situ measurements were provided by the Global Runoff Data Centre (GRDC) ("The Global Runoff Data Centre", 56068 Koblenz, Germany) and completed with other sources of data: the US Geological Survey (USGS) stream flow data (http://waterdata.usgs.gov/nwis/sw) for the US river basins, the R-ArcticNet database (University Table 1. Transmissivity (m 2 s −1 ) and effective porosity values by type of lithology. The percentage of modeled aquifers at global scale covered by the different lithologies are also given. More information about the definition of each unit of lithology can be found in Dürr et al., (2005). of New Hampshire; http://www.r-arcticnet.sr.unh.edu/v4.0/ index.html) for the high latitude basins, the HYBAM observations for the Amazon basin (http://www.ore-hybam.org) and the French Hydro database (http://www.eaufrance.fr). Only the stations with observed periods of at least 10 yr were selected. Moreover, when several gauging stations were located in one grid cell, the one with the largest observed drainage area was kept.
The simulated TWS estimates were compared to the GRACE estimates using a similar method to that in Alkama et al. (2010). TWS estimates are provided by GRACE in terms of monthly anomalies ( TWS) based on highly accurate maps of the Earth's gravity fields over spatial scales of about 300 km Wahr et al., 2004). The present study used 74 months (from August 2002 to August 2008, excluding the June 2003 product, which was not available) of the Release 04 data produced by the Center for Space Research (CSR at the University of Texas at Austin) (Landerer and Swenson, 2012) The GRACE TWS estimates provided here were first filtered in order to remove noise and errors in the gravity field measurements. Several studies have shown that this filtering may modify the signal by reducing the seasonal amplitude of the final TWS signal. Such modification could lead to erroneous interpretation of the GRACE TWS estimates when compared to simulated TWS. In order to correct for this bias, the GW and NOGW TWS were smoothed using the same 300 km-width Gaussian filter as in , which is similar to the one used for the GRACE data products.

River discharges
The popular skill scores widely used in hydrology are used to evaluate the simulated river discharges against measurements of gauging stations. The annual discharge ratio (Ratio = Q sim /Q obs ) and the efficiency (Eff) criterion (Nash and Sutcliffe, 1970), which measures the ability of the model to capture the daily discharges dynamics, are used. Eff can be negative if the simulated discharge is very poor, and is above 0.5 for a reasonable simulation. The root mean square error (RMSE) score is also given at each gauging stations. This score helps to see how effectively the model is able to predict the river discharges. The nearer RMSE is to zero, the better the simulation is. Figure 3 shows the global distribution of the differences between the GW and NOGW river monthly discharges in terms of annual ratio, efficiency, and monthly anomaly RMSE. All theses scores are computed in term of monthly values. The simulated NOGW discharges are globally overestimated at 36 % of the gauging stations, with NOGW ratios higher than 1.3 mainly located in the western part of North America, in Africa, in Australia, and in South America (Tocantins and São Francisco basins; Fig. 3a). In Fig. 3b, a negative value of the annual ratio difference |GW − 1| − |NOGW − 1| means that the GW ratio is better than NOGW. The annual ratios were generally not significantly impacted by groundwater, except in some regions, such as the western part of North America, where the NOGW overestimated annual ratios tended to be improved by GW (22.6 % of the scores lower than −0.05). Figure 3c points out some weaknesses in TRIP, with about 60 % of negative efficiency scores for NOGW. Not surprisingly, these scores are located approximately in the places where the ratios are also overestimated. Conversely, 18 % of the scores are above 0.5, mostly in the eastern part of North America (Mississippi River basin), in the Paraná River basin, and in some other places in Europe such as the Danube River basin or the East European Plain.
These scores were improved for 73 % of the 1900 stations potentially impacted by groundwater in terms of efficiency (efficiency difference greater than 0.05; Fig. 3d) and monthly anomaly RMSE (RMSE difference less than −0.01; Fig. 3f). Conversely, about 10 % of these scores were deteriorated, mainly in the upstream parts of the largest rivers, such as the Mississippi River basin or the Paraná basin. Nevertheless, more than 50 % of the efficiency scores still remained negative. This was mainly the case over basins where the annual ratios were overestimated. For example, Fig. 3a shows ambivalent results for North America, with the eastern part associated with good ratios and efficiencies as opposed to the western part with overestimated ratios and negative efficiencies. These poor scores persisted despite the positive impact of groundwater. These problems are probably due to deficiencies in TRIP such as the absence of hydrological pro-cesses or uncertainties in parameter estimations that will be discussed later.
The unequal distribution of stations over the globe introduced some uncertainties in the conclusion of Fig. 3. Indeed, about 3 % of the 1900 gauging stations were located in Australia, 6 % in Africa, 8 % in South America, 18 % in Asia, 19 % in Europe, and this percentage increased to 45 % in North America (Fig. 2a). The corresponding number of gauging stations for each continent can be found on Fig. 4. In order to have a better representation of the impact of groundwater, Fig. 4 gives the distribution of the score difference between GW and NOGW in terms of efficiency (Fig. 4a) and monthly anomaly RMSE (Fig. 4b) over each continent. It confirms the relevance of using groundwater processes to simulate river discharges at continental scale. The continent where the simulated discharges were the least impacted by groundwater processes was Asia, where 30 % of the efficiency differences were between −0.05 and 0.05, and where 34 % of the monthly anomaly RMSE differences were between −0.01 and 0.01. Conversely, in Africa the scores were almost all improved. Some precautions must be exercised when interpreting these results. First, the stations are not always equally spatially distributed over each continent, and the percentage presented in Fig. 4 can be underestimated or overestimated relative to the actual situation. Secondly, the efficiency scores of a large number of stations remain negative since the improvements in terms of efficiency are small compared to the negative scores of Fig. 3. Nevertheless, for each continent, Fig. 4 confirms the global improvement previously shown in Fig. 3. Figure 5 compares the monthly anomalies and the annual cycles of the simulated and observed river discharges at the stations closest to the mouths of the largest river basins delineated in green in Fig. 2 and presented in Table 2. Monthly anomalies are computed by removing the monthly mean annual cycles from the time series of river discharges. Statistics are summarized in Table 3. The comparison between the GW and NOGW curves for each basin shows that groundwater globally tends to smooth the signal in terms of both annual cycle and monthly anomaly. Thus, the scores are improved over the tropical basins (Amazon, Paraná, Niger and Ganges) except the Mekong River basin. Groundwater shifts the signal by about one month over the Amazon basin, but this effect seems to be too strong over the Ganges, where the base flow is overestimated after (and before) the monsoon season. Over temperate basins, the scores are improved for the Danube River, while no significant changes affect the Mississippi River. Nevertheless, base flow is slightly overestimated with groundwater for the Mississippi case, what is related to the deterioration already observed over North America in Fig. 3. Over the Arctic Rivers, the scores are also improved for each station presented here. In cases with or without groundwater, the peak due to the spring snow melt is one month late. Moreover, this peak is also overestimated for the Ob and Mackenzie Rivers, and underestimated for the Yenisei and Lena Rivers. This is partly due to the absence of flooding in this TRIP version and deficiencies in the meteorological forcing . Figure 6 shows the spatial distribution of the climatological TWS simulated by ISBA-TRIP and estimated by GRACE from August 2002 to August 2008. TWS without groundwater is shown in column (a), TWS with groundwater in column (b), and the GRACE estimates in column (c). The zonal averages are also given in column (d). Generally speaking, the TWS zonal average amplitudes are overestimated by TRIP in December-February (DJF) and June-August (JJA) compared to the GRACE estimates, and underestimated in March-May (MAM) and September-November (SON). These biases are partially corrected by the use of the groundwater scheme, particularly in MAM and SON. The most important changes occur principally over tropical regions. In MAM and SON, the pattern of the spatial seasonal mean TWS is better reproduced by GW than NOGW compared to GRACE in the Amazon basin, in Africa and over the coast of the Bay of Bengal. This is confirmed by the zonal average (column d) with a better agreement in the amplitudes of the GW and GRACE peaks between latitudes −30 • and 30 • . In addition, the scores given in columns (a) and (b) show that the spatial correlation and RMSE are better for GW in DJF and JJA, while they are similar in MAM and SON. Figure 7 summarizes the comparison between simulated and GRACE TWS using the time correlation and RMSE. NOGW is well correlated with GRACE in tropical regions (Amazon, Congo or Ganges-Brahmaputra basins), in the Siberian Plains, in Europe and over some places in North America, while correlation is weak over desert regions such as the Sahel or the Gobi Desert (Fig. 7a). Conversely, the RMSE scores are poor over tropical regions, while arid regions present relatively good scores (Fig. 7b). Figure 7c shows the correlation difference between the GW and NOGW simulated TWS. GW is globally better, especially over the Paraná, the Amazon and the Congo basins, and also in Europe and over the downstream part of the Mississippi basin. Deteriorations are, however, found in Arctic regions along the Ob, Lena and Mackenzie basins, and also in a small region in the western part of the United States. These conclusions also apply to the correlation differences of the TWS monthly anomalies (Fig. 7e). These results suggest some defect in the groundwater parameterization, which will be discussed later. In Fig. 7d, groundwater improves the RMSE over the downstream Amazon, in Central Asia and also in Europe. Conversely, the RMSE scores are deteriorated over the Congo, Ganges and upstream Amazon basins. Elsewhere, no significant changes appear. The same conclusions can be drawn for the monthly anomaly RMSE differences shown in Fig. 7f, even though the improvements are globally less pronounced. Figure 8 compares the monthly anomalies and the annual cycles of the simulated TWS with the GRACE estimates over the same 12 river basins as in Fig. 5. In addition, the temporal correlation scores and RMSEs calculated over the whole GRACE period are given in Table 4. In general, groundwater increases the memory of the system by shifting the TWS signal. Thus, annual cycles are better reproduced over tropical basins, in particular for the Amazon and Ganges basins and, to a lesser extent, over temperate basins. However, groundwater deteriorates the annual cycles slightly over northern river basins (Ob, Yenisei, Lena and Mackenzie) since the TWS amplitudes are more underestimated for GW than for NOGW compared to GRACE. These results agree with Fig. 5. Statistics in Table 4 show that correlations are improved in terms of both TWS and TWS monthly anomalies for almost all tropical and temperate river basins, while no significant improvements appear for the Mekong and Arctic basins. Finally, the RMSE is improved for the Amazon, Ganges, Mississippi and Danube, while no obvious conclusions emerge for the remaining watersheds. Fig. 4. Histogram of the differences between GW and NOGW in terms of (a) efficiency and (b) monthly anomaly RMSE for six continental regions. In the left six panels, the corresponding number of stations is added in brackets beside the name of each continental region.

Sensitivity to precipitation
In order to explore the sensitivity of TRIP to the precipitation forcing used in ISBA to produce deep drainage and surface runoff, two supplementary experiments using TRIP with (GWCRU) and without (NOGWCRU) groundwater were performed with the CRU precipitation dataset and com-pared with the GW and NOGW simulations forced by the GPCC precipitation dataset. Such comparison is of interest in groundwater modeling because the precipitation determines, as well as topography and geology, the temporal and areal distribution of inputs to the groundwater system (Dingman, 1994). Figure 9 shows the global distribution of differences between the discharges simulated with the CRU and GPCC  Table 3. datasets in terms of annual ratio and efficiency. The main differences between the CRU and GPCC experiments appear in the western part of North America, in South America, in South Africa, and in the Ob River basin. The ratios are impacted by the choice of the precipitation dataset ( Fig. 9a and  b). To some extent, this reflects the differences in the spatial distribution of precipitation between the two meteorological products (Fig. 10). The histograms shown in Fig. 9a and b are symmetric and give no advantage to one or the other simulation. However, the simulated river discharges seem to be better reproduced with the GPCC dataset when the efficiency difference scores are considered (52 % of the efficiency differences lower than −0.05 with or without groundwater in the histograms shown in Fig. 9c and d). Moreover, the spatial distribution of the score differences is similar with (Fig. 9c and d) or without ( Fig. 9a and b) groundwater. On one hand, this shows that the groundwater scheme does not seem to be affected by the precipitation forcing, and on the other hand that precipitation seems to have a larger impact on the signal than the deep water transfer simulation. Figure 11 compares the simulated TWS of the CRU and GPCC experiments. As for the river discharges, the GPCC product gives some better scores in terms of correlation than the CRU product even though the score differences are relatively weak, as it is shown in the histograms of Fig. 11a and b. Moreover, the spatial distribution of the correlation differences is similar with or without groundwater ( Fig. 11a  and b). Conversely, the monthly anomaly RMSE is more impacted by meteorological forcing when groundwater is taken into account. Thus, the anomaly RMSE differences are more pronounced in Fig. 11d than in Fig. 11c. In particular, the changes introduced by groundwater using the GPCC dataset (see the GW results in Fig. 7) are amplified with CRU over the Amazon and Ganges River basins. Conversely, GWCRU gives better results than GW in terms of anomaly RMSE over the Congo River. Apart from these differences, the sensitivity to the meteorological fields is generally the same whatever the TRIP version used. Moreover, the comparison between GWCRU and GW shows that precipitation forcing can impact the simulated river discharges and TWS. It constitutes a non-negligible source of uncertainties in simulated hydrological outputs. Table 3. Statistics of NOGW and GW calculated over the observation period of each station shown in Fig. 5. Efficiency, ratio, correlation and RMSE (mm day −1 ) are given, as are correlation and RMSE (mm day −1 ) of the monthly anomalies.

Discussion
The results presented in this study confirm the relevance of taking groundwater into account in CHS for simulating river discharges and TWS at the global scale Liang et al., 2003;Lo and Famiglietti, 2011;Maxwell and Miller, 2005;Yeh and Eltahir, 2005). The groundwater scheme introduces a new reservoir which delays and smooths the hydrological and TWS response. It impacts surface storage variability and thus simulated river discharges, and improves the skill scores. About 73 % of the 1900 stream flow measurements potentially impacted by groundwater are improved by the groundwater scheme over the 1960-2008 evaluation period. In temperate and tropical river basins, water surplus is transferred from winter to summer, which results globally in better simulated base flows during dry periods, in particular over the Amazon and Danube basins. Another consequence of groundwater is to smooth the simulated river discharges, which results in better agreement between the am-plitudes of the simulated and observed discharges in terms of annual cycle and monthly anomalies. These results show that the low-frequency variability of groundwater increases the memory of the simulated surface storage to the benefit of the river discharges Lam et al., 2011;Maxwell and Miller, 2005). Groundwater has a positive impact on the simulated TWS when compared to the GRACE estimates over the 2002-2008 period. These positive impacts come with changes in surface storage and groundwater reservoirs. The groundwater component appears to be as important as the surface storage component in the total TWS signal and helps to improve the seasonal mean variability of TWS. Thus, the time response introduced by groundwater is particularly beneficial over tropical basins, such as those of the Amazon and Ganges Rivers. This good comparison between the GW and GRACE TWS shows the relevance of using the GRACE estimates to evaluate groundwater models (Niu et al., 2007). Even though GRACE constitutes only an indirect way to evaluate the water table variability, it suggests that the proposed groundwater scheme can provide a reasonable estimation of the spatio-temporal variability of water table head. The good results in terms of both simulated river discharges and TWS also confirm the suitability of the proposed methodology for simulating groundwater dynamics at a global scale with a coarse resolution suited to climate modeling as already suggested in Vergnes et al. (2012).
Nevertheless, some deficiencies appear throughout this evaluation. Groundwater can deteriorate the river discharge results and TWS over a few regions where aquifers are normally defined, even if the NOGW scores were initially acceptable. For example, the efficiency scores of Fig. 3d are deteriorated in the eastern part of the Mississippi River and in the upstream part of the Amazon and Paraná basins. These deficiencies point out some limitations in our simple method for defining the geometry of the aquifer and the geological parameters. Although WHYMAP is useful for determining the major aquifers, its low accuracy does not take account of the complex structures encountered locally (karstified areas, confined aquifers, etc.) and sometime leads to an overestimate of the size of the aquifer. Moreover, the coarse estimation of the geological parameters (transmissivity and porosity) and the basic classification of the lithological map (Dürr et al., 2005) used here are other potential sources of error. These uncertainties could explain the problems encountered in the upstream part of the Amazon or Mississippi Rivers, as was the case for the Seine River basin in Vergnes et al. (2012). In particular, the deteriorations over North America are mainly located in the sandstone Pennsylvanian aquifers. Our tests (not presented here) show that without simulating aquifers over these regions, the simulated river discharges in the upstream part of Mississippi River are better reproduced with groundwater. Nevertheless, the USGS aquifer map used to refine the mask of WHYMAP (see previous section) does not show any significant reasons to remove these sandstone aquifers, as also justified by the description of this rock type from Miller (1999). This is the reason why we choose to keep them.
Other causes can be related to some important processes not represented in this version of TRIP. First, the overestimated annual ratios over the Niger and Paraná basins are partly due to the absence of flooding, which introduces a supplementary reservoir to store water and increase evaporation . Over Arctic River basins, Decharme et al. (2012) demonstrates that the delay between the peaks of the simulated and observed river discharges during spring is reduced when flood storage is present in TRIP. Moreover, the underestimation of simulated discharges over the West Siberian basins could be attributable to the presence Fig. 8. Basin-scale comparison between the (right) mean annual cycle and (left) monthly anomalies of simulated NOGW (red) and GW (blue) TWS, and the mean GRACE product (black) with its associated error bars. Annual cycles of each TWS component, except W r , are also shown in dashed lines: W (brown), W s (magenta) and S NOGW (green). The GW specific components S GW (green) and H (cyan) are plotted in solid lines. Statistics for each basin are shown in Table 4. of permafrost not represented in ISBA, which prevents deep drainage and favors the formation of surface water bodies . The groundwater modeling in this region is also questionable since permafrost induces weak interactions between river and groundwater (Kane, 1997;Yang et al., 2002). It could explain the TWS deteriorations observed with GW in Fig. 7 over the Lena and Ob basins. These results point out that groundwater processes could be neglected over these basins, at least for low-resolution and large-scale studies. Secondly, only one layer is modeled in the groundwater representation of TRIP while, in reality, multi-layer aquifers can be present. Combined with the hypothesis of TRIP to consider each grid cell as a river cell, this could explain some deteriorations of the simulated discharge scores. For example, the well-known Guarani aquifer over the Paraná basin forms a complex multi-layer system comprising confined and unconfined aquifers (Wendland et al., 2004). Such a complex system is poorly represented by the one-layer simple groundwater scheme presented in this study. This may explain the errors observed for this watershed in terms of both TWS and river discharges. Finally, the capillary rise of the water table in the surface soil column of ISBA has not yet been implemented, although several studies have pointed out that it can affect soil moisture, evaporation or even precipitation (Anyah et al., 2008;Lo and Famiglietti, 2011;Maxwell and Miller, 2005;York et al., 2002). This could have a certain influence on the partition of precipitation between deep evaporation, surface runoff and deep drainage, and so could affect the simulated river discharges and water tables.
Anthropogenic influences are also not considered in TRIP. For example, the intensive use of water for human activities explains the overestimation of simulated river discharges over the Colorado River basins in the southwest of the United States (Milliman et al., 2008). Man-made irrigation can alter the river flow and increase the continental evapotranspiration, especially over South Asia or the Mississippi River Sacks et al., 2009;Thenkabail et al., Fig. 9. Statistical comparison of the simulated discharges inferred from the CRU and GPCC precipitation datasets. The annual ratio differences (a) without and (b) with groundwater are given, together with the efficiency differences (c) without and (d) with groundwater. 2009). Moreover, groundwater pumping can induce significant changes in the TWS monthly anomalies. For example, Rodell et al. (2009) show that groundwater depletion over the Ganges-Brahmaputra basin is probably due to human activities. Since such human groundwater pumping is not represented in TRIP, the decreasing trend of the TWS monthly anomalies for the Ganges River basin in Fig. 8 is not captured by the model.
Some shortcomings of the model can also be explained by the uncertainties of the meteorological forcing fields, especially precipitation, used to produce the deep drainage and surface runoff fed into TRIP. In order to explore the sensi-tivity of TRIP to precipitation inputs, supplementary simulations with the CRU precipitation dataset were performed. The results show that the GPCC products give better overall results than CRU either with or without groundwater. This is in agreement with Decharme and Douville (2006), who show that the GPCC climatology appears to be a better product than CRU for global hydrological applications even though some deficiencies in the GW experiment are corrected with GWCRU. For example, the deterioration of TWS obtained with GW over the Congo basin in Fig. 7f is partially corrected with the CRU dataset in Fig. 11d. The score comparison between GPCC and CRU experiments also shows significant differences. It confirms that the simulated TWS and river discharges, and thus the quality of global hydrological simulations, can be drastically affected by the uncertainties of the prescribed precipitation datasets (Decharme and Douville, 2006;Fekete et al., 2004;Szczypta et al., 2011). This could lead to a misinterpretation of results and the attribution of errors to the model rather than to the forcing. It is important to clarify that the results were obtained without calibration, as porosity and transmissivity are set according to the rock type. Even though potential tuning of these groundwater parameters and the other TRIP parameters is possible, it could to some extent compensate for the uncertainties introduced by the prescribed precipitation. Finally, the groundwater scheme seems not to be sensitive to the precipitation forcing since the score differences between the CRU and GPCC experiments are similar with or without groundwater (Figs. 9 and 11). As a consequence, it shows that precipitation seems to dominate the TWS and river discharge signals rather than lateral transfer of groundwater flow.

Conclusions
In this study, a methodology based on Vergnes et al. (2012) has been used to construct a global groundwater model to investigate the effects of groundwater processes on river discharges and TWS variations at global scale. This groundwater model is implemented in the TRIP river routing model used for global hydrological and climate applications. The simulations are performed in off-line mode at 0.5 • by 0.5 • resolution by using deep drainage and surface runoff coming from an independent ISBA simulation. The simulated river discharges are computed by TRIP and evaluated over the 1960-2008 period against a dense network of about 3500 in-situ river discharge gauges distributed all over the globe. The TWS simulated by ISBA-TRIP are computed from snow mass, soil moisture, vegetation interception, river water content, and groundwater, if necessary. The TWS variations are then compared to the GRACE satellite-derived TWS estimates for 2002-2008.
The results presented in this study confirm the relevance of introducing groundwater in CHS for simulating river discharges and TWS in a climate model at global scale. The groundwater scheme of TRIP improves river discharges by introducing more memory into the system through its buffering effect. Thus, it contributes in some extent to simulating more realistic base flows. Nevertheless, this buffering effect is too strong over some large river basins (Ganges, Mississippi). It reveals some deficiencies of our approach that will be discussed later. In the regions where the ratios are improved, groundwater contributes to storage for some of the surplus of water and improves the simulated mean annual river discharges, even though they are still overestimated.
The comparison between the simulated and GRACE TWS are also improved with the new groundwater scheme, especially over tropical basins (Amazon, Ganges, Niger). These results are mainly explained by the lag introduced by the lowfrequency variations of groundwater, which tend to shift and smooth the simulated river discharges and TWS. These results suggest that the groundwater scheme could be able to provide a reasonable estimation of the spatio-temporal variability of water table head, at least for a large-scale, simple model. Such affirmation needs nevertheless to be tempered, since the water table evaluation is only made indirectly by comparison between satellite-based and simulated TWS.
As previously suggested, this global evaluation points out some shortcomings in the proposed groundwater scheme. First, the lack of important hydrological processes could be partly responsible for the deteriorations of the simulated hydrological outputs. The most important of them is certainly the absence of a flooding scheme which results in overestimated river discharges and unrealistic peak flows, especially over Arctic River basins. Other processes related to groundwater and not taken into account by the model can be invoked: capillary rise of groundwater or presence of multi-layer aquifer. Moreover, the impact of human activities can be strong in certain regions with the presence of irrigation, water pumping, or reservoirs and dams for hydroelectric power. Secondly, the methodology for defining the groundwater model at global scale is also questionable with regard to the coarse definition of the groundwater parameters (transmissivity, porosity and river-groundwater coefficient) and the uncertainties in the delineations of aquifers. Another source of error could be the precipitation forcing used to produce the total runoff. The present study compares the results obtained with two precipitation datasets coming from GPCC and CRU. It shows that the uncertainties in the precipitation datasets can significantly change the resulting hydrological responses and lead to misinterpretation of the results. Thus, even though the standard TRIP parameters (river geometry and slope) and/or groundwater parameters could be tuned to improve the results, some precautions must be taken since it could compensate the errors introduced by the prescribed precipitation.
The next step of this work will be to couple the groundwater scheme with the ISBA LSM in order to simulate the interactions between the deep water table dynamics and the overlying unsaturated soil. The goal will be to represent the impact of water capillary rise on the land surface energy and water budgets. The ultimate objective will be to introduce this new land surface component into the CNRM global climate model (Voldoire et al., 2012) in order to assess the relevance of groundwater processes for the simulation of both recent and future climates.