Data assimilation of GRACE terrestrial water storage estimates into a regional hydrological model of the Rhine River basin

Introduction Conclusions References


Introduction
Terrestrial water storage (TWS) is the integrated sum of all surface water, soil moisture, snow water, and groundwater (GW) availability, and is a metric critical for monitoring the water supply for domestic, industrial, and agricultural sectors.The ability to estimate TWS is useful for understanding past events and predicting future changes in the hydrological cycle, streamflow and water availability, as well as their impact on the occurrence of droughts, heat waves, and floods (Hirschi et al., 2007).The individual components of TWS influence the climate system in different ways.Soil moisture is a major source of water for the atmosphere in the terrestrial Published by Copernicus Publications on behalf of the European Geosciences Union.

N. Tangdamrongsub et al.: Data assimilation of GRACE terrestrial water storage estimates
water cycle (Jung et al., 2010) and plays a particularly important role in the climate system (Seneviratne et al., 2010).Soil moisture estimates are also useful for seasonal predictions, and have been shown to improve predictions of air temperature in North America (Koster et al., 2010) and Europe (van den Hurk et al., 2012).Similarly, realistic estimation of the snowpack can improve the prediction of near-surface temperatures at high latitude regions at 15-30-day scales (Orsolini et al., 2013).Finally, GW variability influences soil moisture and evapotranspiration, and is related to long-term water availability and climate changes (Bierkens and van den Hurk, 2007;Green et al., 2011).
Despite the importance of having reliable estimates of TWS, knowledge about the spatial and temporal variations of TWS and its components is generally lacking.This is particularly true at large scales, due to the absence of global monitoring systems.Ground-based measurements, while very accurate, only provide point-wise estimates (Dorigo et al., 2011;Lettenmaier and Famiglietti, 2006).Large spatial coverage can be achieved using satellite remote-sensing observations, but these often measure only one component of the total storage and suffer from additional limitations.For example, in the case of soil moisture, satellite observations are limited to the top few centimetres of the soil column and to areas free from dense vegetation cover (e.g. de Jeu et al., 2008;Entekhabi et al., 2010;Kerr et al., 2012).Variations in surface water can be observed with satellite altimetry, but this technique is currently limited to large target areas ( > 10 km) (Phan et al., 2012;Schwatke et al., 2013;Kleinherenbrink et al., 2014).
Since measurements alone are not sufficient to comprehensively monitor all components of TWS, hydrological models are often employed.A strong point of hydrological models is their ability to obtain spatially distributed estimates, differentiate TWS components, and simulate changing boundary conditions.Many hydrological models are available, which vary in terms of process description, temporal resolution, spatial resolution, and the detail in process representation (Koster et al., 2000;Rodell et al., 2004).Models vary in terms of which TWS components are included in the model and how they are represented.The performance of hydrological models is also influenced by the accuracy of the input forcing data and the quality of the model calibration.The existence of model uncertainties motivates the need to combine the model with independent observations to obtain a better representation of the system's behaviour.
Changes in TWS can also be estimated by observing variations of the regional gravity field over time.The idea is that changes in water storage, including those deep underground, induce a gravitational signature proportional to the amount of (water) mass redistribution.Since 2002, these variations have been measured by the Gravity Recovery and Climate Experiment (GRACE) satellite mission (Tapley et al., 2004).GRACE allows temporal variations of Earth's gravity field to be observed at spatial scales ranging in the hundreds of kilometres, and at timescales as short as 1 month.As part of the GRACE data processing, atmospheric and ocean related time-variable gravity effects are removed from the data, leaving the remaining gravity signal over the continents mostly representing changes in TWS (in some areas, additional removal of other nuisance signals is needed, such as those due to glacier melting, glacial isostatic adjustment, and megathrust earthquakes).The GRACE mission has enabled the first direct observations of large-scale TWS, and studies to date have shown high correlation with modelled TWS in terms of seasonal dynamics and regional spatial patterns (Syed et al., 2008;Becker et al., 2011;Longuevergne et al., 2013).A unique feature of satellite gravimetry is that it observes the total column of mass variations (including GW), while other remote-sensing techniques can only penetrate to a very limited depth, often just a few centimetres.In contrast to hydrological modelling, it is not possible to identify which layer the inferred mass variations can be attributed to (Rodell et al., 2009).
Several earlier studies have employed data assimilation to combine the strengths of hydrological modelling and GRACE observations and to mitigate their respective weaknesses (Zaitchik et al., 2008;Su et al., 2010;Houborg et al., 2012;Li et al., 2012;Forman et al., 2012).In data assimilation, the model states are constrained by observations, taking into account the estimated uncertainties for both the model states and the observations (Evensen, 2003;Reichle, 2008).
Employing data assimilation provides a mechanism to downscale the coarse GRACE TWS variations to the temporal and spatial resolution of the model as well as providing insight from the hydrological model into the distribution of TWS between the individual storage terms.Zaitchik et al. (2008) assimilated GRACE into the Catchment Land Surface Model to estimate the TWS over the Mississippi River basin.Houborg et al. (2012) and Li et al. (2012) applied a similar strategy to improve the drought indicator over North America and Europe, respectively.Su et al. (2010) and Forman et al. (2012) extended the work of Zaitchik et al. (2008) to improve the estimated snow water equivalent over North America and northwestern Canada, respectively.All results from earlier studies reported that assimilating GRACE improved, or at least did not degrade, the hydrology model's performance.In particular, good agreements between estimated state variables, e.g.GW and streamflow, and the in situ measurements were observed.This study adds to these prior works by examining how GRACE assimilation performs when the hydrological model is not well calibrated or when unreliable meteorological data are used to force the model.This focus of the study is on the Rhine River basin (Fig. 1), which is significantly smaller than the large basin or continent-scale studies of these prior works, so the analysis presented here provides new insight into the performance of GRACE assimilation over smaller basins.And, while previous data assimilation studies have been performed in the Rhine and neighbouring basins (e.g.Weerts and Serafy, 2006;Rakovec et al., A1. 2012), this study is the first to incorporate GRACE observations in the assimilation scheme for this region.
The primary goal of this study was to understand the impact of GRACE assimilation on the estimated TWS, GW variations and streamflow in the Rhine basin.The second goal was to investigate the potential value of assimilating GRACE observations in data-sparse regions.Four scenarios were considered in which the model parameters used were either calibrated (high quality) or basin-averaged (poor quality) values, and the forcing data were obtained from either local (high quality) or global (poorer quality) data sets.In this context, comparison of the four scenarios provides insight into how GRACE can be used to constrain hydrological models when limited data are available.

Hydrological modelling
The hydrological model employed in this study is the Open-Streams wflow_hbv model (Schellekens, 2014).This is a distributed version of the HBV-96 model, named after the Hydrologiska Byråns Vattenbalansavdelning (Hydrological Bureau Water Balance section).The HBV model was originally developed at this former section of the Swedish Meteorological and Hydrological Institute (SMHI) in the early 1970s.Since then, the HBV model has been used in over 40 countries.In 1996, a comprehensive re-evaluation of the HBV model routines was carried out (Lindström et al., 1997), which resulted in the HBV-96 version.The OpenStreams wflow_hbv model is a variant of this model, programmed in the PCRaster-Python environment (Karssenberg et al., 2009), but using a kinematic wave for hydrological routing.It is publicly available through the OpenStreams project (https: //code.google.com/p/wflow/;last access: 18 January 2015).The defined grid resolution used in this study was 1 km.A schematic representation of OpenStreams wflow_hbv is given in Fig. 2a and the key parameters of the soil moisture and runoff response routines are listed and described in Table 1.
The water from either precipitation or snow first enters the interception storage and snow routine.The remaining liquid water (from rainfall and snowmelt) after the snow routine infiltrates into the soil.The soil moisture storage term (SM in millimetres), which includes both surface and root zone soil moisture is controlled by three main parameters: fc, lp, and β (see also Table 1).When the amount of water exceeds the maximum capacity (fc), the excess water becomes available for direct runoff.Within the soil layer, seepage is generated and controlled by an empirical parameter β.The volume of water available for runoff (direct runoff and seepage) is transferred to the runoff response routine.Additionally, some percentage of the soil moisture evaporates, which is controlled by a defined threshold (fc × lp).
Two linear reservoirs are defined in the runoff routine, namely the upper and lower zones (UZ and LZ).The excess water from SM recharges the UZ, and some of the water in UZ percolates to LZ, as determined by the perc parameter.At the same time, capillary flow from UZ to SM also occurs, controlled by cflux (maximum value of capillary rise from upper zone storage to soil moisture storage).The runoff generation in UZ is controlled mainly by two main parameters, the recession constant (khq) and the non-linearity parameter (α).LZ contributes the water to the base flow through the recession constant (k4).The amount of base flow is simply the multiplication between k4 and the amount of LZ.Runoff from UZ and LZ then enters the routing model to determine the streamflow.
For reference, TWS is defined here as the sum of SM, UZ and LZ.Groundwater storage is defined as the sum of UZ and LZ.These storage terms are calculated in the soil moisture and runoff response routines.Figure 2b shows the simulated SM, UZ, and LZ from a nominal model run (i.e. using the calibrated parameters and local forcing data).The main source of TWS variation in this model is SM, with the variations in LZ and UZ an order of magnitude smaller.Extraction of GW for irrigation is considered to be small over our study region.It accounts for less than 1 km 3 yr −1 .Industry is the largest user (Wada et al., 2014).However, the net removal is small as only 10 % of the total water withdrawal over the Rhine is from GW and the water is re-introduced to the system after being used for industry.This is markedly different to the extraction of GW for irrigated agriculture observed in India (Ferrant et al., 2014).Therefore, this impact on TWS is not considered in this study.
The OpenStreams wflow_hbv model was calibrated for the Rhine River basin using observations from in situ streamflow gauges (Mülders et al., 1999;Eberle et al., 2002Eberle et al., , 2005;;Photiadou et al., 2011).The spatial distribution of the calibrated model parameters is shown in Fig. 3.
In data-sparse regions, a lack of in situ (meteorological and streamflow) data makes it difficult to calibrate hydrological models (Sivapalan et al., 2003;Hrachowitz et al., 2013).Therefore, we decided to add non-calibrated cases to our simulations.In those cases, we defined the non-calibrated param- eters as the areally averaged values of the calibrated parameters in the entire basin, and used these for every grid cell in the basin.
3 Data sets

GRACE observation
The most recent release (RL05) of the GRACE gravity model product, generated by the University of Texas at Austin's Center of Space Research (CSR; Bettadpur, 2012), was used in the analysis.The CSR RL05 models represent a time series of Stokes coefficients up to a maximum spherical harmonic degree and order of 60, and are provided monthly.Following the GRACE conventional processing steps, degree-1 coefficients provided by Swenson et al. (2008) were added, and the degree-2 coefficients were replaced by the values estimated from satellite laser ranging (Cheng and Tapley, 2004).Variations in the gravity field were computed by removing the long-term mean (computed over the entire study period; see Sect. 5) from each monthly solution.The TWS variations over the Rhine basin were then produced using the approach described by Wahr et al. (1998).Because of strong noise artefacts present in the high degree coefficients, a de-striping filter similar to that described in Swenson and  Wahr (2006) was applied to each monthly solution.The filter used a 5th degree polynomial (Savitsky-Golay) over a 5point window to remove the correlations, and orders below 8 remained unchanged.Further, an additional 250 km radius Gaussian smoothing (Jekeli, 1981) was applied.While this process helps to mitigate noise in the solution, it also attenuates a genuine signal, so a scale factor is often applied in an effort to restore some of the signal that gets leaked out of the basin due to the spatial filtering.To that end, scale factors using the Global Land Data Assimilation System (GLDAS) hydrological model (Rodell et al., 2004) were computed following the method described by Landerer and Swenson (2012).The sum of four soil moisture layers (0 to 2 m) and a snow water equivalent layer from a monthly GLDAS Noah Land Surface Model version 1 was defined as the TWS.We (leastsquares) fitted the time series between the original and filter GLDAS at every grid node over the Rhine using only 1 scale factor.The estimated filtering-scale factors varied between 0.98 and 1.02 over the Rhine River basin.The correc-tion for glacial isotactic adjustment, which has been shown in other regions to affect the interpretation of long-term trends (Peltier, 2004), was determined to be small in our study, so the corresponding correction was not applied.

Forcing data
The forcing data required to drive the OpenStreams wflow_hbv model are precipitation, temperature, and potential evapotranspiration (PET).Two types of forcing data were used in this study.Local forcing data indicate the best available data, and global forcing data indicate a lower quality data set but one which is available globally or nearly globally.
In this study of the Rhine basin, local forcing data refer to meteorological data from the network of local weather stations, providing higher spatial and temporal resolution.Local precipitation and temperature data were retrieved from the European Climate Assessment & Data set (ECA & D) and ENSEMBLES project, known as E-OBS data (Haylock et al., 2008).Data collected from several hundred ground stations were combined to produce a daily grid of precipitation and mean surface temperature at a 0.25 • spatial resolution.Local PET data were derived from climatological data obtained from the Commission for the Hydrology of the Rhine basin (CHR) and the German Meteorological Service (DWD) (Weerts et al., 2008).The daily local PET was interpolated from a monthly mean value with a fixed annual cycle and was available at a 1 km spatial resolution (Weerts et al., 2008;Photiadou et al., 2011).
Global precipitation and temperature data were obtained from Sheffield et al. (2005).These data are constructed based on the long-term near-surface meteorological variables from the National Centers for Environmental Prediction -National Center for Atmospheric Research (NCEP/NCAR) reanalysis product.The daily global precipitation and temperature data were provided at a spatial resolution of 0.5 • .For global PET, the 1 • daily product generated by Senay et al. (2008) was used.
Figure 4 shows a comparison between mean daily precipitation, temperature, and PET in 2006 from the local and global forcing data sets.For the mean temperature, aside from the resolution difference, the spatial distribution and magnitude is very similar between the two data sets.On the other hand, significant differences can be seen between the local and global precipitation data, especially over the High Rhine.Differences are also observed in the PET products, with the global data set having generally higher values than the local one, in addition to the much coarser spatial resolution of the global product.

Validation data
Groundwater and streamflow measurements from various networks are used to validate our estimated results.

Groundwater data
In situ GW measurements were obtained from three different networks: 1. Ministerium für Klimaschutz, Umwelt, Landwirtschaft, Natur-und Verbraucherschutz des Landes Nordrhein-Westfalen (http://www.elwasweb.nrw.de; last access: 5 March 2014).(e.g.ADES) were interpolated to daily intervals.A total of 18 wells were used for validation.Their locations are shown in Fig. 1, and their names are provided in Table A1.

Bayerisches
The in situ GW measurements were provided in the form of piezometric head.The variations in piezometric head can be related to variations in GW storage if the specific yield is known (Rodell et al., 2007).As the latter data were unavailable, the piezometric head was scaled to the units of GW storage based on other GW data.Previous studies have demonstrated that subtracting SM derived from GLDAS from GRACE was able to extract the GW component from GRACE in several regions, e.g.North America (Rodell et al., 2007), Australia (Tregoning et al., 2012), or the Middle East (Longuevergne et al., 2013).We adopt a similar idea by using the relationship between TWS − SM (TWS  Dorigo et al., 2011) does not have data covering the GRACE observation period.The soil moisture estimated from remote sensing was also not appropriate because the penetration depth depends on frequency and would not be the same as that in OpenStreams wflow_hbv.Therefore, we decided to use GLDAS-derived SM in this study.The SM variation from GLDAS ( SM GLDAS ) was computed by removing its long-term mean value.The long-term mean value was produced from all GLDAS SM data over the same period as the GRACE observations (see Sect. 5).The GW variations from GRACE ( GW GRACE ) were obtained by removing SM GLDAS from the GRACE observations every month.GW GRACE was interpolated to daily values in order to compare it to the daily head variations h.The comparison was done using the following relationship: where e indicates the observation error.The two parameters a and b were estimated by least-squares regression.The scaled in situ GW variation ( GW in situ ) were then obtained from the observed variations in piezometric head using where â and b are the parameters estimated from Eq. (1).

Streamflow data
Streamflow was validated using observations from the 13 in situ gauges indicated in Fig. 1.Time series were provided by the Hydrological Modelling Basis in the Rhine basin (HY-MOG; Bader et al., 2013).The hourly data were aggregated to daily data for this study.
4 Data assimilation

Ensemble Kalman filter
The ensemble Kalman filter (EnKF) is used here to assimilate GRACE TWS into the OpenStreams wflow_hbv model.The EnKF uses a Monte Carlo approach: an ensemble of model states is integrated forward in time using the forward model.The update equation from the classical Kalman filter is used to update the model estimate, where the Kalman gain is determined using the error covariances calculated from the ensemble (Evensen, 1994).The EnKF and its variants are widely used because they are efficient, easy to implement and allow great flexibility in terms of model uncertainty (Evensen, 2003).In this study, we implement a so-called 1D-EnKF (De Lannoy et al., 2009) in which each grid cell is updated individually.The state equation in a discrete form is given as where f is the model operator, ψ is the state variables, u is the forcings, α is the model parameters, and w is the model error.In this paper, the state variables (ψ) are an n × 1 vector of TWS from OpenStreams wflow_hbv.The observations available at a measurement time t are gathered in a vector of observations d (TWS from GRACE): where d is an m × 1 vector containing the observations, H is measurement operator which relates the state ψ(t) to the measured variables d(t).In this study, the observation and the state vector are TWS, so n = m = 1 and H is the unit matrix.The uncertainties in the observations are given in the random error , which is assumed to have zero mean and covariance matrix R. In the initialization phase, the EnKF is initialized by generating an ensemble (i) of N realizations of the state ψ i (t), i = 1, . . ., N around a nominal ψ(t).This reflects the prior knowledge of the state at the initial time.The EnKF moves sequentially from one observation time to the next and works in two steps, a forecast step and an update step.At the updated time t (when the observation is available), an ensemble of perturbed observations, d i (t) is generated as where i denotes the perturbation of the error of each ensemble member i.If the ensembles of the variables are stored in a matrix A = (ψ 1 , ψ 2 , ψ 3 , . . ., ψ N ), the ensemble perturbation matrix can be defined as A = A − A, where A is the mean computed from all ensemble members.Similarly, the ensemble members of the observation and perturbations are gathered into the matrices D = (d 1 , d 2 , d 3 , . . ., d N ) and γ = ( 1 , 2 , 3 , . . ., N ).The analysis equation can be expressed as (Evensen, 2003) where A a is the analyzed model state.

Assimilating GRACE observations
Several steps must be taken before GRACE TWS can be assimilated into OpenStreams wflow_hbv.GRACE observations represent average TWS variations over 1 month, while the OpenStreams wflow_hbv model has a daily time step.In this study, it is assumed that the average TWS corresponds to the middle of the month.Then, spline interpolation between consecutive months is used to generate a time series of GRACE TWS variations at 5-day intervals.The 5-day interval was chosen through trial-and-error to be a good compromise between allowing the ensemble to grow between updates and avoiding implausible jumps.As in any land surface assimilation application, the update results in discontinuities as mass is added or removed from the state but these are not large enough to be obvious when a 5-day interval is used (see Sect. 5.1).If the update took place at larger time interval (e.g.once a month) and the entire increment was applied on 1 day, more significant artefacts or temporal discontinuities would occur (Widiastuti, 2009).In order to convert GRACE variations to absolute values the mean TWS in the study period was calculated from the nominal OpenStreams wflow_hbv run and added to the GRACE time series.GRACE observes total TWS, some components of which can be neglected (e.g.nominal OpenStreams wflow_hbv simulations indicate that surface water and interception storage contributed by less than 1 % to the estimated TWS).Snow is also small averaged over the study area (approximately 2 % to the estimated TWS in winter).Only over the Alps (see Fig. 1) is the snow contribution greater (approximately 7 %).Therefore, we decided to exclude the snow from the state vector.To reconcile GRACE to OpenStreams wflow_hbv TWS, we then removed the snow component estimated from the nominal run from the GRACE prior to assimilation.Note that in catchments where the snow component is more significant, it should not be excluded from the state vector.
In the EnKF, the GRACE TWS are calculated and assimilated at each 1 km model grid cell every 5 days.Because the analyzed model state A a (t) was an integrated value of TWS, the increment ( A(t) = A a (t) − A(t)) for every ensemble member needed to be disseminated among the three stores, SM, UZ, and LZ.The information about the distribution of the increment among the different model compartments could be obtained directly from the Kalman filter.However, we chose to carry out the vertical distribution in the way consistent with the OpenStreams wflow_hbv model (Fig. 2).While the SM and LZ stores have upper bounds determined by model parameters, UZ does not.As a result, allowing it to update freely in the EnKF runs the risk that it becomes excessively large, which would also have a detrimental effect on runoff.Therefore, the increment is used to adjust the SM first, subject to the upper and lower limits of zero and fc.Any remaining increment is applied in turn to LZ, up to its upper limit, and only then to UZ.
The GRACE observation error is assumed to be 20 mm and horizontal observation error correlations are not considered.The 20 mm value is considered realistic as it was suggested by several independent assessments, e.g.Klees et al. (2008), Wahr et al. (2006), and it also had been applied in previous GRACE assimilation studies (Zaitchik et al., 2008;Houborg et al., 2012).Our philosophy was to set the GRACE errors to realistic values determined from independent studies, so that the solutions were not guided towards any particular outcome.

Uncertainty in model forcing data and parameters
In the EnKF, stochastic noise can be included in model forcing data and parameters to account for model uncertainty.An earlier sensitivity study (Widiastuti, 2009) was conducted to identify the parameters of the OpenStreams wflow_hbv model that had a significant impact on TWS.Six such parameters, which include fc, lp, β, cflux, khq, and perc were found.Therefore, the soil moisture routine parameters, fc, lp, and β, as well as the runoff routine parameters, cflux, khq, and perc, were perturbed.For the calibrated case, the calibrated model parameters in each grid cell were perturbed using additive Gaussian noise, with a mean of zero and a standard deviation equal to 10 % of the range of values that occurred over the whole Rhine basin.In the non-calibrated case, the mean parameter value in each grid cell was set to the average calibrated value across the whole basin, and the standard deviation was set to that of the calibrated parameter across the whole basin.This was considered as a proxy for assigning approximate values based on the land cover type, topography, and climatology from the globally available databases.Averaging each parameter across the entire Rhine basin is intended merely to reflect this kind of first-order assumption.Though not all OpenStreams wflow_hbv parameters can be gleaned from such global databases, the averaged values could be compared to those in the Food and Agriculture Organization (FAO) of the United Nations database (http://www.fao.org/geonetwork/srv/en/main.home; last access: 5 December 2014).The areally averaged parameter values over the Rhine were found to be within the range provided by FAO.For example, the areally averaged soil moisture field capacity over the Rhine FAO provided is mostly between 150 and 200 mm, while the areally averaged value of approximately 180 mm is used as a mean in this study with a standard deviation of 33 cm.The meteorological forcing data were also varied, with the temperature data being perturbed with additive Gaussian noise, and the precipitation and PET being perturbed with additive log-normal noise.In the local forcing data case, noise with standard deviation based on 10 % of the nominal value was added to precipitation while 15 % noise was added to temperature and PET.For the global forcing data case, we assumed that the local forcing data were accurate and reliable, and the differences between the local and global forcing data represent the errors of global forcing data.The errors were assumed to be spatially correlated, so an exponential correlation function was applied to the covariance matrix for each variable.The correlation lengths for precipitation, temperature, and PET were determined using variogram analysis (Widiastuti, 2009) and found to be 21, 21, and 59 km, respectively.
Recall from Sects. 1 and 3.2 that four cases are considered in this study: (1) calibrated parameters with local forcing data (CL), (2) calibrated parameters with global forcing data (CG), (3) non-calibrated parameters with local forcing data (NCL), and (4) non-calibrated parameters with global forcing data (NCG).Comparison of the four scenarios provides insight into the benefit of GRACE assimilation under different degrees of uncertainty.The lowest and highest levels of uncertainty are associated with the CL and the NCG cases.

Results and discussion
Using the EnKF approach described above, GRACE observations were assimilated into the OpenStreams wflow_hbv model.An ensemble of 100 model states was propagated forward from 1 January 2001 to 30 November 2003 to spin up the model.The ensemble state at the end of the spin-up period provided the initial state for the assimilation.The study period is from 1 December 2003 to 31 October 2007 because the observed streamflow was only available until autumn 2007.

Impact of GRACE assimilation on TWS estimates
First, the impact of assimilating GRACE on the temporal and spatial patterns of the estimated TWS is considered.For the temporal pattern, the areal mean of the estimated TWS over the entire Rhine River basin was computed.The time series of TWS variations from the ensemble open loop (EnOL; ensemble run without GRACE assimilation), EnKF, and GRACE observations are shown in Fig. 5.
As expected, there is a seasonal cycle in the TWS estimates, which varies between ±75 mm.The high frequency variations in TWS in the CL and NCL that are not apparent in CG and NCG are due to the coarser spatial resolution of the global precipitation product.Lower spatial variability of the global data causes smoother averaged TWS presented in the CG and NCG time series.During the summer of 2006 (June-July-August: JJA), the areal mean global and local precipitation and temperature products agree.However, the global PET product estimates an areal mean PET of 4.10 mm day −1 while the local PET data suggest it was 2.89 mm day −1 .As the result, the minimum TWS in the CL and NCL cases in the EnOL is −69 mm while CG and NCG are close to −90 mm.In this period, GRACE assimilation has little impact on CL and NCL, but results in a significant (25 mm) update in TWS in the CG and NCG cases.The largest difference between the EnOL and EnKF occurs when TWS is increasing (e.g.October 2005).This is apparent in all cases, but is greatest in the two non-calibrated cases.In all cases, Fig. 5 shows that assimilation draws the TWS estimate toward the GRACE observation.
The impact of GRACE assimilation also varies within the basin.Figure 6 shows the spatial distribution of the average increment (posterior minus prior) in TWS during winter (December-January-February: DJF, 2005DJF, -2006) ) and summer (JJA) of 2006.During the winter (left panels), the EnKF estimated wetter conditions over entire Rhine River basin when the local forcing data were used.In the Alps, the global precipitation product is approximately 35 % higher than the local precipitation product.Therefore, GRACE assimilation reduced the TWS estimate over the Alps in the CG and NCG cases.During the summer (right panels), GRACE assimilation reduced the TWS estimate over the Alps and Neckar basin when local forcing data were applied, but adds moisture in the global data case.In this period, the local PET product is 66 % lower than the global product over the Alps and 44 % lower over the Lahn basin.This is consistent with the increase in areal-averaged TWS observed in the CG and NCG cases in Fig. 5. Since the local precipitation data are generally considered to be more accurate, the adjustment of the TWS estimates towards those produced by the local product is an excellent example of the benefit of GRACE assimilation, particularly in data-sparse areas.
In the Regnitz basin (east of domain), GRACE assimilation leads to a significant increase in TWS in both calibrated cases during the winter months.In this basin, the UZ recession coefficient (khq) is 0.52 in the calibrated case, compared to 0.3 in the non-calibrated case.This results in almost twice as much fast runoff in the calibrated case, which depletes the TWS in the winter months.GRACE assimilation adds moisture to the UZ and LZ stores, drawing the TWS closer to the GRACE observations.
In the summer, an average of 0.7 and 1.07 mm was removed in each update from the southern part of Moselle basin in the CL and CG cases, respectively (Fig. 6b and d), compared to 0.74 and 1.25 mm added per update in the NCL and NCG cases.In the two calibrated cases, the evaporation threshold value (the product of fc and lp) is approximately 11 % less than that in the non-calibrated cases.This leads to less soil evaporation and higher soil moisture in the calibrated cases.GRACE assimilation reduces the SM in the calibrated cases, and increases it in the non-calibrated cases to draw the TWS closer to the GRACE observations in all cases.

Impact of GRACE assimilation on GW estimates
The TWS and GW variations from OpenStreams wflow_hbv were computed at every grid cell.The estimates at the Sundern and A319C wells are shown in Figs.7 and 8.The two stations represent the behaviour of the other 16 stations (detailed below).For example, stations 2, 3, 4, 6, 9, 10, 11, 13, and 18 have similar behaviour to Sundern, while the rest have similar behaviour to the A319C station.Recall that GW is defined as the sum of UZ and LZ, so the difference between the left and right columns is the SM term.GRACE measures monthly variations, so the monthly mean of TWS, GW estimates, and the in situ data are shown.Similar to the areal mean values, the TWS from the EnKF in the individual grid cells (left column) is generally between the values from the EnOL and those observed by GRACE.
At Sundern (Fig. 7) in the CG and NCG cases, the impact of the forcing data was seen in the summer of every year.Table 2 shows that the precipitation, temperature, and PET at Sundern were higher in the global forcing data than in the local data.Figure 7c and g suggest that this leads to a more negative estimate of TWS in the EnOL for the CG and NCG cases.In the EnKF results, these TWS estimates are drawn towards the GRACE observations.The corresponding updates in terms of GW are larger in the global forcing data case than in the local forcing data cases -assimilation added approximately 5-10 mm of water to GW in the global data cases.Similar behaviour was also seen in CL and NCL cases in summer 2005.
At Sundern, the estimated GW in the CL case agrees quite well with the in situ values, suggesting that the distribution between the SM and GW components is reasonable in the calibrated cases.The fact that a good estimate of TWS does not result in an improved GW estimate indicates that the noncalibrated parameters are leading to an incorrect distribution of the TWS between the different stores.In the NCL and NCG cases, fc is just 179 mm compared to the calibrated value of 239 mm.So, for the same TWS value, the noncalibrated cases have more water in GW than the calibrated cases.As a result, despite the agreement in TWS in the winter months, the GW variation is considerably overestimated.
In every case at the A319C well location (Fig. 8), the EnOL estimated lower TWS in the first half of 2004 and 2006, and higher in the second half of the same years.Assimilation updated the TWS toward GRACE observation in these periods and resulted in better agreement between the assimilated and observed GW.In late 2005, the estimated TWS from the EnOL and EnKF are very close to the GRACE observations.However, the estimated GW in both cases is much lower than that observed in situ.As discussed, the difference between the two is soil moisture.The model is predicting a significant increase in soil moisture in all four cases.However, given there is little to improve in terms of TWS, the GW estimate from the EnKF is as bad as that from the EnOL.
The impact of the forcing data used is also presented.In the CG and NCG cases, on 3 October and 23 October 2006, underestimated global precipitation caused the underestimated GW.GRACE could not correct such a high frequency event due to the limitation of its temporal resolution.
The choice of the parameters plays a role in the estimated GW magnitude (as seen in Fig. 7), but now the non-calibrated parameters (compared to the calibrated ones) provided closer values to the in situ data (Fig. 8f and h).A higher noncalibrated fc parameter (see Table 3 for the values) was responsible for smaller GW estimates.
Tables 4 and 5 show the correlation coefficient and root mean square error (RMSE) between the estimated and in situ GW for all 18 well locations indicated on Fig. 1.These were calculated based on the monthly mean, but similar results were obtained using the daily values.In most cases, assimilation leads to an increase in correlation coefficient and a reduction in RMSE.
The results varied across the wells.The highest correlation coefficients in the EnOL simulations were typically found in the CL case, followed by the NCL.Clearly, using the local forcing data has a significant impact in resolving features at a single grid cell.An exception is the Main basin (wells 5, 7-10) where the global forcing data produce TWS more consistently with the GRACE observations and hence result in a better agreement with the GW.The highest correlation coefficients in the EnKF cases are also found in the two local data cases.The improvements in correlation coefficient are seen in all four cases.The CL and NCL cases also yield the lowest RMSE values in the EnOL case, and the results with the EnKF are very mixed.
It is important to note that at many wells, the NCL and NCG cases yield higher correlation coefficients than the CL and CG cases, respectively.Recall that the model is calibrated using streamflow, not GW data.So, while assimilation draws the modelled TWS towards the GRACE observations, the model parameters have a significant impact on whether or not this translates to an improvement in the GW estimate.
One of the objectives was to examine the potential value of GRACE assimilation in data-sparse regions.In the NCG case, it is encouraging that GRACE assimilation consistently leads to an increase in the correlation coefficient (from 0.46 to 0.61 in best case) and reduction in RMSE (up to 35 %).In other scenarios, assimilation of GRACE observations also leads into an increase in the correlation coefficient (from 0.31 to 0.53 in best case, at station 11 in the CG case) and a decrease in RMSE (up to 35 %, at station 1 in the NCG case).For the average improvement of GW estimates (for all four cases), the correlation coefficient increases from 0.6 to 0.7 and the RMSE was reduced by 15 %.

Impact of GRACE assimilation on streamflow estimates
The estimated and observed streamflows at Maxau (upstream) and Wesel (downstream) gauge stations are shown in Figs. 9 and 10.Accurate forcing data, particularly precipitation, are essential for reproducing the observed streamflow.
The high frequency variations in streamflow associated with fast response to local precipitation are often reproduced reasonably well in the CL case, but not in the CG case .
Use of the global data frequently underestimates the streamflow.This is clear on 5 June 2004, 24 August 2005, 6 October 2006, and 10 August 2007 in Fig. 9b and d. Comparing Fig. 9a to Fig. 9b, it is clear that the larger peaks in streamflow are poorly estimated when the global data are used.Because GRACE observations describe monthly variations over a larger area, they can do little to capture these individual streamflow events.By correcting TWS, GRACE assimilation mainly influences the longer term variations.The difference between EnOL and EnKF is very small in the CL case.The largest differences are observed in the CG and NCG cases, where TWS is updated to correct for errors in forcing data (e.g.summer 2004 and 2006 in Fig. 9).
Figure 11 shows the impact of GRACE assimilation on the correlation coefficient, Nash-Sutcliffe coefficient (NS) (Nash and Sutcliffe, 1970), and RMSE in streamflow.Results are shown for four gauge stations along the main channel, as well as the average value across all 13 stations.These results underscore the importance of forcing data and calibration for estimating streamflow.By far, the highest correlation coefficients and Nash-Sutcliffe coefficients and lowest RMSEs   are obtained when local forcing data are used.Use of global forcing data leads to a significant loss in performance.For example, using global rather than local forcing data with the calibrated model results in a decrease in the correlation coefficient from 0.89 to 0.65, a decrease in the Nash-Sutcliffe coefficient from 0.76 to 0.35, and an increase in RMSE of 71 % in the EnKF results.Using the non-calibrated model rather than the calibrated model also leads to poorer performance, though to a lesser degree.For example, using the noncalibrated rather than calibrated model with the local forcing data results in a decrease in the correlation coefficient from 0.89 to 0.88, a decrease in the Nash-Sutcliffe coefficient from 0.76 to 0.65, and an increase in RMSE of 23 % in the EnKF results.
Compared to the differences due to forcing data and calibration, GRACE assimilation leads to a relatively modest improvement in streamflow estimates.In terms of correlation coefficient, the largest improvements on average (Avg Similarly, GRACE assimilation leads to a modest improvement in terms of NS coefficient.The largest average improvement was from 0.62 to 0.65 in the NCL case.GRACE assimilation slightly reduced the RMSE in all four cases.The greatest reduction is 4 % in the NCL case.
Though it is encouraging that GRACE assimilation improved the estimated streamflow, these results demonstrate that it clearly cannot replace high-quality forcing data or good model calibration.

Conclusions
The first goal of this study was to investigate the impact of assimilating GRACE into the OpenStreams wflow_hbv model on the estimated TWS, GW storage and streamflow in the Rhine River basin.GRACE observations were assimilated into each grid cell of the model with an EnKF to update the soil moisture and UZ and LZ storage terms of the model.In general, assimilation drew the EnOL estimated TWS closer to the GRACE observations.In the absence of independent TWS observations, a qualitative analysis of the increments in TWS indicated that GRACE assimilation could partially correct the TWS estimate for the influence of errors in the meteorological forcing data and model parameters.As a result, an improvement in the GW estimate after assimilating GRACE data was noticeable.In the best case, correlation coefficient increased from 0.31 to 0.53 and RMSE decreased by 35 % with respect to the EnOL case.However, it is found that the improvement in TWS estimates did not always translate to an improved agreement between the estimated and observed GW storage variation at certain well locations.The differences may be due to the OpenStreams wflow_hbv parameters: if the upper limit on soil moisture storage is too high (low), then the GW variations could be under (over)estimated.This is particularly relevant in the type of model where the calibration is per sub-basin.This does not allow for local differences on the order of single or a few grid cells.The issue of scale is also significant because GRACE observes monthly variations on the order of hundreds of kilometres.Groundwater variations can be influenced by local features at finer scales.When the basin average is considered, validation against a denser network of well data or an independent GW model could be used to determine if an improvement occurs at the scale of the entire basin.
Furthermore, the considered model was used to simulate runoff.The GW terms, UZ and LZ, primarily serve as reservoirs for quick and base runoff generation.Due to the coarse resolution of the observations, GRACE assim-ilation resulted in only a modest improvement in streamflow estimates.In the best case, correlation coefficients increased from 0.65 to 0.66, Nash-Sutcliffe coefficients increased from 0.62 to 0.65, and RMSE was reduced by 4 %.
The second goal of this study was to investigate the potential value of assimilating GRACE observations in data-sparse regions.Results from four scenarios were compared in which the ensemble mean model parameters were either calibrated values or basin average values, and the meteorological forcing data were either local (high quality) data or global (poorer quality) data.By comparing the four cases, it was shown that GRACE assimilation could correct for errors in model forcing data and parameter calibration by drawing the estimated TWS toward that observed by GRACE.This also resulted in drawing the estimated GW storage closer to the in situ measurement.Given that the most significant improvements were observed in the NCG case, this suggests that GRACE observations are most valuable in data-sparse regions.In these regions any additional observations, even those at coarse spatial and/or temporal resolution, are welcome.GRACE can provide essential independent observations for validation, and serves as a constraint for TWS within the assimilation process.In terms of streamflow, a comparison of the four scenarios demonstrates that the ability to capture high flow events is determined largely by the quality of the forcing data and the model parameters.The improvements in streamflow estimates after assimilation are modest.Nevertheless, we consider the obtained results as promising, particularly in data-sparse scenarios, e.g. the NCG case.They indicate that GRACE contains information that can be useful for streamflow estimation.Whether updating TWS is the best way to use this information is an open question.An alternative strategy could be to use GRACE assimilation for parameter estimation at a sub-basin or basin scale and constrain the rainfall-runoff model through assimilation of soil moisture observations.
In conclusion, GRACE assimilation is beneficial, and the largest improvements are generally observed in the NCG (i.e.data-sparse) cases.In addition to providing a modest improvement to the estimated streamflow, it may result in a noticeable improvement in TWS estimates, yielding an extra insight into the behaviour of the hydrological model, its forcing data and parameters.Further research will combine assimilation of GRACE and a soil moisture remote-sensing product to constrain the SM estimate storage term, and ensure that improved TWS would lead to more consistently improved estimates of GW storage variations.Further research will also explore the value of assimilating GRACE into a GW model in which the primary processes of interest vary on temporal and spatial scales similar to those of GRACE.In addition, recent studies have explored the effect of spatial aggregation of GRACE TWS prior to assimilation (Forman and Reichle, 2013) as well as inclusion of the full GRACE error structure (Eicker et al., 2014).Combining the advances made in those studies with the assimilation framework presented here is expected to yield even more realistic estimates.As shown by De Lannoy et al. (2009), working with a spatially distributed state vector (3D-EnKF) can lead to an improved estimate.Given the coarse resolution of GRACE, we expect that implementing a 3D-EnKF within the assimilation framework would lead to an improved performance.This could be particularly important in small basins like the Rhine, and can be used to account for the fact that the GRACE overpasses are infrequent and may not be sensitive to TWS variations in response to specific events.

percFigure 1 .
Figure 1.River gauge (circle) and well (triangle) locations over the Rhine River basin used in this paper.Red triangles indicate Sundern (1) and A319C locations (17).Names of all well locations are given in TableA1.

Figure 3 .
Figure 3. Calibrated parameters of the soil moisture and runoff response routines of the OpenStreams wflow_hbv model.

Figure 4 .
Figure 4. Mean daily precipitation, temperature, and potential evapotranspiration in 2006 from the local (left panels) and global (right panels) forcing data sets.

Figure 5 .
Figure 5. Area-averaged mean terrestrial water storage (TWS) over the Rhine River basin from the EnOL, EnKF, and GRACE observations in four different scenarios (CL: calibrated parameters with local forcing data; CG: calibrated parameters with global forcing data; NCL: non-calibrated parameters with local forcing data; NCG: noncalibrated parameters with global forcing data).

Figure 6 .
Figure 6.Averaged increment (posterior minus prior) of TWS in mm during the winter 2005-2006 (left panels) and summer of 2006 (right panels) in four different scenarios (CL: calibrated parameters with local forcing data; CG: calibrated parameters with global forcing data; NCL: non-calibrated parameters with local forcing data; NCG: non-calibrated parameters with global forcing data).The polygons in the right column define the southern part of Moselle basin.

Figure 7 .
Figure 7. TWS variation (left panels) and GW variation (right panels) at the Sundern well location in four different scenarios (CL: calibrated parameters with local forcing data; CG: calibrated parameters with global forcing data; NCL: non-calibrated parameters with local forcing data; NCG: non-calibrated parameters with global forcing data).

Figure 8 .
Figure 8. TWS variation (left panels) and GW variation (right panels) at the A319C well location in four different scenarios (CL: calibrated parameters with local forcing data; CG: calibrated parameters with global forcing data; NCL: non-calibrated parameters with local forcing data; NCG: non-calibrated parameters with global forcing data).

Figure 9 .
Figure 9.Estimated and observed streamflow at the Maxau gauge station in four different scenarios (CL: calibrated parameters with local forcing data; CG: calibrated parameters with global forcing data; NCL: non-calibrated parameters with local forcing data; NCG: noncalibrated parameters with global forcing data).

Figure 10 .
Figure 10.Estimated and observed streamflow at the Wesel gauge station in four different scenarios (CL: calibrated parameters with local forcing data; CG: calibrated parameters with global forcing data; NCL: non-calibrated parameters with local forcing data; NCG: noncalibrated parameters with global forcing data).

Figure 11 .
Figure 11.The correlation coefficient (left panels), the Nash-Sutcliffe coefficient (middle panels) and RMSE (right panels) computed between estimated streamflows and gauge measurements in four different scenarios (CL: calibrated parameters with local forcing data; CG: calibrated parameters with global forcing data; NCL: non-calibrated parameters with local forcing data; NCG: non-calibrated parameters with global forcing data).Results are shown for the Maxau (Max), Mainz (Mai), Andernach (And), and Wesel (Wes) gauge stations.Average values (Avg) calculated across all 13 gauge locations are shown in the rightmost bar of each histogram, with the standard deviations indicated by error bars.

Table 1 .
Parameters of the soil moisture and runoff routines in the OpenStreams wflow_hbv model.

Tangdamrongsub et al.: Data assimilation of GRACE terrestrial water storage estimates 2085 variation
from GRACE minus SM variation) and the observed head to scale the observed head.Ideally, we would prefer to use in situ soil moisture data to represent the SM term, but they are not available at the well locations, and the nearest station from the International Soil Moisture Network (ISMN; Hydrol.Earth Syst.Sci., 19, 2079-2100, 2015 www.hydrol-earth-syst-sci.net/19/2079/2015/ N.

Table 2 .
Daily mean values of the forcing data at Sundern during summer (JJA) months.

Table 3 .
Ensemble mean parameter values at the Sundern and A319C well locations for the calibrated and non-calibrated simulations.

Table 4 .
Correlation coefficient computed between monthly mean estimated GW variation and monthly mean in situ variation.Names of the stations (first column) are provided in Appendix A.

Table 5 .
RMSE (mm)computed between monthly mean estimated GW variation and monthly mean in situ variation.Names of the stations (first column) are provided in Appendix A. ) are found when the global forcing data are used.The correlation coefficient increased from 0.64 to 0.65 in the CG case, and 0.65 to 0.66 in the NCG case.The largest improvement at an individual station was found at Maxau where assimilation resulted in an increase in correlation coefficient from 0.54 to 0.59 in the NCG case. column