On the use of the GRACE normal equation of inter-satellite tracking data for estimation of soil moisture and groundwater in Australia

An accurate estimation of soil moisture and groundwater is essential for monitoring the availability of water supply in domestic and agricultural sectors. In order to improve the water storage estimates, previous studies assimilated terrestrial water storage variation (1TWS) derived from the Gravity Recovery and Climate Experiment (GRACE) into land surface models (LSMs). However, the GRACE-derived 1TWS was generally computed from the high-level products (e.g. time-variable gravity fields, i.e. level 2, and land grid from the level 3 product). The gridded data products are subjected to several drawbacks such as signal attenuation and/or distortion caused by a posteriori filters and a lack of error covariance information. The postprocessing of GRACE data might lead to the undesired alteration of the signal and its statistical property. This study uses the GRACE least-squares normal equation data to exploit the GRACE information rigorously and negate these limitations. Our approach combines GRACE’s least-squares normal equation (obtained from ITSG-Grace2016 product) with the results from the Community Atmosphere Biosphere Land Exchange (CABLE) model to improve soil moisture and groundwater estimates. This study demonstrates, for the first time, an importance of using the GRACE raw data. The GRACE-combined (GC) approach is developed for optimal least-squares combination and the approach is applied to estimate the soil moisture and groundwater over 10 Australian river basins. The results are validated against the satellite soil moisture observation and the in situ groundwater data. Comparing to CABLE, we demonstrate the GC approach delivers evident improvement of water storage estimates, consistently from all basins, yielding better agreement on seasonal and inter-annual timescales. Significant improvement is found in groundwater storage while marginal improvement is observed in surface soil moisture estimates.


Introduction
The changes in terrestrial water storage ( TWS) derived from the Gravity Recovery And Climate Experiment (GRACE) data products have been used in the last decade to study global water resources, including groundwater depletion in India and the Middle East (Rodell et al., 2009;Voss et al., 2013), water storage accumulation in Canada (Lambert et al., 2013), and flood-influenced water storage fluctuation in Cambodia (Tangdamrongsub et al., 2016). The gravity data obtained from GRACE satellites are commonly processed and released in three different product levels (L) that increase in the amount of processing, L1Bsatellite tracking data (e.g. Wu et al., 2006), L2 -global gravitational Stokes coefficients (e.g. Bettadpur, 2012), and L3global grids (e.g. Landerer and Swenson, 2012). The original (L1B) GRACE information is inevitably altered due to data processing and successive post-processing filterings, because the error covariance information is not propagated through each post-processing step.
The GRACE-derived TWS has been computed widely from the higher-level products (e.g. L2 and L3) on which various ad hoc post-processing filters were applied (e.g. Gaussian smoothing filter, Jekeli, 1981;destripe filter, Swenson N. Tangdamrongsub et al.: On the use of the GRACE normal equation of inter-satellite tracking data and . TWS obtained from these filters lacks proper error covariance information and is attenuated and distorted. To overcome the signal attenuation in GRACE high-level products, empirical approaches have been developed, including the application of scale factors computed from land surface models (LSMs; Landerer and Swenson, 2012) to the GRACE L3 products. GRACE uncertainty in the high-level product is usually unknown or assumed. For example, Zaitchik et al. (2008) derived empirically a global average uncertainty that is variable depending on choices of post-processing filters (Sakumura et al., 2014). Furthermore, GRACE error and sensitivity is dependent on latitudes due to the orbit convergence toward poles  and any post-processing filters will alter the GRACE data and their error information. Rigorous statistical error information is of equal importance to derivation of TWS for data assimilation and model calibration (Tangdamrongsub et al., 2017;Schumacher et al., 2016Schumacher et al., , 2018. TWS and its uncertainty estimates should be formulated directly from L1B data considering the complete statistical information. The GRACE information is not fully exploited in many studies. For example, groundwater storage variation ( GWS) is often computed by subtracting the soil moisture variation ( SM) component simulated by the land surface model from GRACE-derived TWS data (Rodell et al., 2009;Famiglietti et al., 2011), assuming the model SM is error-free. This may result in the inaccurate GWS and the associated error estimate as the uncertainties in observation and of the land surface model outputs are neglected in the combination (or regression) of two noisy data (e.g. Long et al., 2016). In data assimilation, the GRACE uncertainty is often derived empirically, not necessarily reflecting the actual GRACE error characteristics (e.g. Zaitchik et al., 2008;Tangdamrongsub et al., 2015;Tian et al., 2017). For example, Girotto et al. (2016) used the L3 product and showed that it was necessary to adjust GRACE observation and its uncertainty in order to make their water storage estimates more accurate. Similarly, Tian et al. (2017) reported the need for the application of a scale factor to GRACE uncertainty (from mascon, mass concentration, product) in their GRACE assimilation process. It is apparent that the use of post-processed GRACE products often requires data tuning, leading possibly to an integration of the altered gravity information into the data assimilation system. Some recent studies began to employ the full variance-covariance information in the data assimilation scheme to enhance the quality of the estimates (Eicker et al., 2014;Schumacher et al., 2016;Tangdamrongsub et al., 2017;Khaki et al., 2017a, b).
This study aims to use the GRACE information of TWS measurement directly from the least-squares normal equation data. The approach optimally combines GRACE's normal equations with the model simulation results from the Community Atmosphere Biosphere Land Exchange (CABLE, Decker, 2015) to improve SM and GWS estimates. The proposed approach presents three main advantages. Firstly, one can exploit the full GRACE signal and error information by using the normal equation data sets. Secondly, the approach is developed for optimal least-squares combination (e.g. Ramillien et al., 2004), which maximizes the model and observation strength while simultaneously suppressing their weaknesses. Finally, the method bypasses empirical, multiple-step post-processing filters.
The main objective of this study is to present the GRACEcombined (GC) approach to improve the model-estimated SM and GWS on regional scales. We demonstrate our approach applied to 10 Australian river basins (Fig. 1a). One advantage of the study area is that the state vector can be simply defined by SM and GWS as other hydrological components (e.g. snow, glacier) are negligible. We validate the top layer of SM estimates against the satellite soil moisture observation (the Advanced Microwave Scanning Radiometer aboard the Earth Observing System (EOS) -AMSR-E, Njoku et al., 2003) over all 10 basins and the GWS estimates against the in situ groundwater data available over Queensland and Victoria ( Fig. 1b and c).
This paper is outlined as follows: firstly, the derivation of the GC approach is presented in Sect. 2 while the description of GRACE data processing, including the use of the GRACE normal equation, is given in Sect. 3. Secondly, the CABLE modelling is outlined in Sect. 4. This includes the derivation of model uncertainty based on the quality of precipitation data and the model parameter inputs. The processing of validation data is also described in Sect. 4. Thirdly, Sect. 5 presents the result of SM and GWS estimates and comparison to in situ data. The long-term trends in the Australian mass variation over the last 13 years are also investigated in this section.

A method of combining GRACE L1B data with land surface model outputs
The statistical information of TWS computed from a land surface model can be written as where h is the "truth" (unknown) model state vector while h is the calculated state vector characterized with the model error . The model error is assumed to have zero mean and covariance C.
The term h is used to represent a vector including the global TWS grid, and terms with a subscript "R" (e.g. h R , C R ) are used to represent only a regional set of TWS (for example, in Australia). As such, the observation equation over a region can be rewritten as As soil moisture and groundwater are the major components of TWS in Australia (surface water storage being insignificant), the vector h R can be defined as where SM top , SM rz , and GW S represent the vectors of top (surface) soil moisture, root zone soil moisture, and groundwater storage variations, respectively. A least-squares normal equation of GRACE can be written as where N is a normal matrix, x contains the spherical harmonic coefficients (SHC) of the geopotential, and c is the normal vector. In this study, N and c can be obtained from the ITSG-Grace2016 products (Mayer-Gürr et al., 2016; https://www.tugraz.at/institute/ifg/downloads/ gravity-field-models/itsg-grace2016, see more details in Sect. 3.1). Equation (4) can be written in terms of h as follows (see Appendix A for the derivation): where Y converts TWS to geopotential coefficients considering the load Love numbers (e.g. Wahr et al., 1998) and H is the operational matrix converting SM top , SM rz , and GW S to TWS. Equation (5) is based on the assumption that the GRACE orbital perturbation is a result of TWS variation on the surface. If M is the number of model grid cells, N max is the maximum degree of the geopotential coefficients, and L = (N max + 1) 2 − 4 is the number of geopotential coefficients from GRACE; the dimension of Y, H, and h are L × M, M × 3M, and 3M × 1, respectively. Note that Eq. (5) is defined with the global grid of h. For a regional application, Eq. (5) can be modified as where the subscript "R" indicates the grid TWS only in a region of interest and "o" for the rest of the globe. The number of the model grid cells associated with R is J , and the number of grid cells outside cells is respectively. The dimension of N and c remain unchanged, since they are essentially from the normal equations of the original GRACE L1B data (to be discussed in the following section). From Eq. (6), the normal equations associated with TWS in the region of interest can then be written as where As seen, Eq. (7) is the regional representation of Eq. (5) where only the grid cells inside the study region are used, while the contribution from the grid cells outside the region needs to be removed or corrected. Combining the normal equation of Eqs. (2) and (8), the optimal combined solution ofĥ R can be resolved as follows: The computation of model covariance matrix C R will be discussed in Sect. 4.2. The a posteriori covariance ofĥ R can be estimated as follows: and the uncertainty estimate ofĥ R is simply calculated as where diag( ) represents the diagonal element of the given matrix.

GRACE least-squares normal equations
In this study, the least-squares normal equations are obtained from the ITSG-Grace2016 products between January 2003 and March 2016. All L1B data including K-band range (KBR) inter-satellite tracking data, attitude, accelerometer, GPS-based kinematic orbit data, and AOD1B corrections are reduced in terms of the normal equations. These data products are usually used to compute the Earth's geopotential field to the maximum harmonic degree and order of 90, or at a spatial resolution of ∼ 220 km. The products contain the information of the normal matrix N and the vector c (as shown in Eq. 4) as well as the a priori time-varying gravity field coefficients predicted with the GOCO05s solution (Mayer-Gürr et al., 2015). Note that the solution of the ITSG-Grace2016 normal equation is the anomalous geopotential coefficient vector ( x), which is referenced to the a priori time-varying gravity field (x 0 ), through: where d and x 0 are given. To obtain a complete gravity field variation between the study period (x term in Eq. 4), the a priori time-varying gravity field, x 0 , is firstly restored to Eq. (12), and the mean gravity field (x 0 ) computed from all x 0 between January 2003 and March 2016 is then removed as follows: Therefore, in Sect. 2 (e.g. Eq. 5), the matrix N remains unchanged while the vector c can be simply replaced by

GRACE-derived TWS products
Three monthly GRACE-derived TWS products are also used: the ITSG-Grace2016 DDK5 solution (ITSG-DDK5 for short, http://icgem.gfz-potsdam.de/series/ 99_non-iso/ITSG-Grace2016), the CNES/GRGS Release 3 (RL3) (GRGS for short, Lemoine et al., 2015; http://grgs.obs-mip.fr/grace/variable-models-grace-lageos/ grace-solutions-release-03), and the JPL RL05M mascon-CRI version 2 product (mascon for short, Watkins et al., 2015;Wiese et al., 2016; http: //grace.jpl.nasa.gov/data/get-data/jpl_global_mascons). The ITSG-DDK5 product is the post-processed version of the ITSG L2 solution where the non-isotropic filter DDK5 (Kusche et al., 2009) is applied. The DDK5 solution is empirically selected here to be a good balance between the over-smoothed (e.g. DDK1) and noisy (e.g. DDK8) solutions. The GRGS solution provides TWS at 1 • × 1 • globally, derived from the Earth's geopotential coefficients up to the maximum degree and order 80, and no filter nor scale factor is applied (L2 data product). Mascon provides TWS at equal-area 3 • spherical cap grid globally. In contrast to the ITSG-DDK5 and GRGS solutions, the mascon uses a gain factor derived from the land surface model to restore mitigated signals and reduce leakage errors (L3 data products) (Watkins et al., 2015;Wiese et al., 2016). Additionally, mascon provides the TWS uncertainty together with the solution. The uncertainty is computed based on several geophysical models (see Watkins et al., 2015, andWiese et al., 2016, for more details). The uncertainty information is not available in the ITSG-DDK5 or GRGS product.
The GRACE data are obtained between January 2003 and March 2016. After retrieval, the long-term mean value between January 2003 and March 2016 is computed and subtracted from the monthly products. To be consistent with CA-BLE grid spacing (see Sect. 4), the TWS is computed using 0.5 • spatial resolution. The coarse-scale datasets (e.g. mascon, GRGS) are resampled to 0.5 • × 0.5 • using the nearest grid values.
In this study, the independent GRACE solutions are used for two main reasons: 1. To obtain the TWS values outside Australia. As shown in Eq. (7), theĥ o vector needs to be known, which can be from the GRACE-derived TWS solution. We use the GRGS solutions as the GRGS solution is not subject to the filter choice and it provides TWS at a spatial resolution comparable to the normal equation data.
2. To compare with the TWS estimates from our approaches. All solutions are used to compare and validate our TWS estimates.

Model setup
The extensive description of the CABLE model is given in Decker (2015) and Ukkola et al. (2016). This section describes the model setup and specific changes applied to this study. CABLE is a public available land surface model and can be used to estimate soil moisture and groundwater in terms of volumetric water content every 3 h at a 0.5 • × 0.5 • spatial resolution. The soil moisture and groundwater storage can be simply computed by multiplying the estimates with thicknesses of various layers. For soil moisture, the thickness of six soil layers is 0.022, 0.058, 0.154, 0.409, 1.085, and 2.872 m, from top to bottom, respectively. The thickness of the groundwater layer is modelled to be 20 m uniformly. Recalling Eq. (3), SM top is defined as the soil moisture storage variation at the top 0.022 m thick layer, while SM rz is the variation accumulated over the second to the bottom soil layers (depth between 0.022 and 4.6 m).
CABLE is initially forced with the data from the Global Soil Wetness Project Phase 3 (GSWP3), which is currently available until December 2010 (http://hydro.iis.u-tokyo.ac. jp/GSWP3, https://doi.org/10.20783/dias.501). We replace GSWP3 forcing data with GLDAS data (Rodell et al., 2004) Table 2. Model parameters that are sensitive to SM and GWS estimates. The following parameters were perturbed using the additive noise with the boundary conditions given in the last column. The further parameter description can be found in Decker (2015) and Ukkola et al. (2016). to compute the water storage changes to 2016. The forcing data used in CABLE are precipitation, air temperature, snowfall rate, wind speed, humidity, surface pressure, and shortwave and long-wave downward radiations. To investigate the impact of different forcing data, the offline sensitivity study is conducted by comparing the water storage estimates computed using 1. all eight forcing data components of GSWP3 and 2. GSWP3 data with the replacement of one component obtained from GLDAS forcing data.
It is found that the water storage estimate is most sensitive to the replacement of precipitation data, as expected, and relatively less sensitive to the change in other forcing components. We use the GLDAS forcing data in this study and also further test seven different precipitation data products (see more details in Sect. 4.2). The forcing data are up-or downsampled to a 0.5 • × 0.5 • spatial grid to reconcile with the CABLE spatial resolution.

Model uncertainty
In this study, the CABLE uncertainty is derived from 210 ensemble estimates associated with different forcing data and model parameters. The seven different precipitation products (see Table 1) are used to run the model independently. Most products are available to present day while GSWP3, Princeton, and MERRA are only available until December 2010, December 2012, and February 2016, respectively. For each precipitation forcing, 30 ensembles are generated by perturbing the model parameters within ±10 % of the nominal values. The perturbed size of 10 % is similar to Dumedah and Walker (2014). Based on the CABLE structure, the SM and GW S estimates are most sensitive to the model parameters listed in Table 2. For example, the fractions of clay, sand, and silt (f clay , f sand , f silt ) are used to compute soil parameters including field capacity, hydraulic conductivity, and soil saturation which mainly affect soil moisture storage. Similarly, the drainage parameters (e.g. q sub , f p ) control the amount of subsurface runoff, which has a direct impact on root zone soil moisture and groundwater storages.
From ensemble generations, total K = 210 sets of the ensemble water storage estimates (h e ) are obtained: and the mean value of H R is computed as follows: Note that due to the absence of GSWP3, Princeton, and MERRA data, the number of ensembles reduces to K = 180 after December 2010, K = 150 after December 2012, and K = 120 after February 2016, respectively. The GC approach assumes that model errors are normally distributed with zero mean. Any violation of this assumption will yield a bias in the combined solutions. Therefore, the mean value is removed from each ensemble member, H R = H R − h R , and the error covariance matrix of the model is empirically computed as The h R (Eq. 16) and C R (Eq. 17) terms can be directly used in Eq. (1). Furthermore, in practice, the sampling error caused by finite sample size might lead to spurious correlations in the model covariance matrix (Hamill et al., 2001). The effect can be reduced by applying an exponential decay with a particular spatial correlation length to C R . In this study, the correlation length is determined based on the empirical covariance of model-estimated TWS. The covariance function of TWS is firstly assumed isotropic, and it is computed empirically based on the method given in Tscherning and Rapp (1974). The distance where the maximum value of the variance decreases to half is defined as the correlation length. The obtained values vary month to month, and the mean value of 250 km is used in this study.
It is emphasized that the model omission error caused by imperfect modelling of hydrological processes within the LSM is not taken into account in the above description. The omission error may increase the model covariance and introduce a bias as well. We account for the omission error by increasing 20 % of the model covariance. (i.e. multiplying C R by 1.2). We determine such omission error based on trial and error such that it increases the model error (due to the omission error) but does not exceed the model error value reported by Dumedah and Walker (2014). We acknowledge that this is only a simple practical way of accounting for the omission error into the total model error.

Validation data 4.3.1 Satellite soil moisture observation
The satellite-observed surface soil moisture data are obtained from the Advanced Microwave Scanning Radiometer aboard the Earth Observing System using the Land Parameter Retrieval Model (Njoku et al., 2003). The observation is used to validate our estimates of top soil moisture changes ( SM top ). The AMSR-E product provides volumetric water content in the top layer derived from a passive microwave data (from NASA EOS Aqua satellite) and forward radiative transfer model. In this study, the level 3 product, available daily between June 2002 and June 2011 at 0.25 • × 0.25 • spatial resolution, is used (Owe et al., 2008). The measurements from ascending and descending overpasses are averaged for each frequency band (C and X). Then, the monthly mean value is computed by averaging the daily data within a month. To obtain the variation in the surface soil moisture, the long-term mean between June 2002 and June 2011 is removed from the monthly data. Regarding the different depth measured in CA-BLE and AMSR-E, the cumulative density function (CDF)matching technique (Reichle and Koster, 2004) is used to reduce the bias between the top soil moisture model and the observation. The CDF is built using the 2003-2004 data, and it is used for the entire period. There are no satellite-observed or ground-measured root zone soil moisture data for meaningful comparison with our results, particularly on a continental scale. Validation of SM rz on regional and continental scales is currently unachievable due to a complete lack of observations on this spatial scale.

In situ groundwater
The in situ groundwater levels from bore measurements are obtained from two different ground observation networks (see Fig. 1). The data in Queensland are obtained from the Department of Natural Resources and Mines (DNRM) while the data in Victoria are from the Department of Environment and Primary Industries (DWPI). More than 10 000 measurements are available from each network, but the data gap and outliers are present. Therefore, the bore measurement is firstly filtered by removing the sites that present no data or data gap longer than 30 months during the study period.
To obtain the monthly mean value, the hourly or daily data are averaged in a particular month. The outliers are detected and fixed using the Hampel filter (Pearson, 2005) where the remaining data gaps are filled using the cubic spline interpolation. To obtain the groundwater level variation, the longterm mean groundwater level computed between the study period is removed from the monthly values. The groundwater level variation ( L) is then converted to GWS using GWS = S y · L, where S y is specific yield. Based on Chen et al. (2016), S y = 0.1 is used for the Victoria network. Specific yields of Queensland's network have been found ranging from 0.045 (Rassam et al., 2013) to 0.06 (Welsh, 2008), and an averaged S y = 0.05 is used in this study. Finally, the mean value computed from all data (in each network) is used to represent the in situ data of the network.

Model-only performance
We study the model TWS changes under different meteorological forcing and land parameterization. A total of 210 estimates of monthly TWS (sum of SM top , SM rz , and GWS) are obtained between January 2003 and March 2016 from the ensemble run based on seven different precipitation inputs. Then, the averaged values of the TWS estimates are computed from the 30 precipitation-associated ensemble members. This results in seven sets of monthly mean TWS estimates from seven different precipitation data. For each set, the monthly TWS is computed by removing the long-term mean computed between January 2003 and March 2016.
The precipitation-based TWS values are then compared with the GRACE mascon solution (see Sect. 3.2) over 10 different Australian basins. The comparison is carried out between January 2003 and March 2016. Due to the availability of the data, the periods used are shorter in cases of GSWP3, Princeton, and MERRA precipitation (see Table 1). The metric used to evaluate a goodness of fit between the CABLE run and GRACE mascon estimates is the Nash-Sutcliffe (NS) coefficient (see Eq. B1) (Fig. 3).  The lower agreement is mainly due to the quality of rainfall estimates over Australia. The NS of ECMWF is around 0.4.
All model ensembles are consistent with the GRACE data over the Timor Sea and inner parts of Australia (e.g. Lake Eyre, LKE; Murray-Darling, MRD; North West Plateau, NWP) where the NS value can reach as high as 0.9 (see e.g. TRMM over TIM, Timor Sea). On the contrary, the lower agreement is found mostly over the coastal basins. By averaging all TWS estimates from seven different precipitation datasets, the mean-ensemble estimate (MN) delivers the best agreement with GRACE as seen by the highest average NS value (MN of AVG = 0.55) among all ensembles. Particularly, NS values are greater than 0.4 in all basins and no negative NS values are presented in MN. On average, it can be clearly seen that using the mean value (MN) is a viable option to increase the overall performance of the TWS estimates. Therefore, only the CABLE MN result will be used in further analyses. The comparison with the GRGS GRACE N. Tangdamrongsub et al.: On the use of the GRACE normal equation of inter-satellite tracking data solution was also evaluated (not shown here) and the overall results are similar to Fig. 3.

Contribution of GRACE
This section investigates the impact of the GC approach on the estimates of various water storage components. The TWS estimate obtained from the GC approach is demonstrated in Sect. 5.1, by comparing with the independent GRACE mascon solution. Figure 3 shows the GC result yields the highest NS values in all basins, outperforming all other CABLE runs. In the average (AVG), the NS value increases by ∼ 35 % (0.55 to 0.74) from the MN case. Similar behaviour is also seen when compared with the GRGS GRACE solution (not shown); the average NS value increases from 0.50 to 0.74. This is not surprising as the GC approach uses the fundamental GRACE tracking data as GRACE mascon and GRGS solutions do. Improvement of the NS coefficient indicates merely the successfulness of integrating GRACE data and the model estimates.  Fig. 4d), where CABLE overestimates TWS and produces phase delay between 2008 and 2010. The overestimated amplitude and phase delay seen in CABLE GWS during this above period (Fig. 4c) is caused by an overestimation of soil and groundwater storage. The positively biased soil and groundwater storage causes a phase delay by increasing the amount of time required for the subsurface drainage (baseflow) to reduce to soil and groundwater stores. The overestimation of water storage is the result of overestimated precipitation or underestimated evapotranspiration. The amplitude and phase of the water storage estimate are adjusted toward GRACE observation in the GC approach.
The impact of GRACE varies across the individual storage as well as across the geographical location (climate regime). In general, the major contributors to TWS are SM rz and GWS. Due to a small store size (only ∼ 2 cm thick), SM top contributes only ∼ 2 % to TWS. As such, SM rz and GWS have greater variations, which commonly lead to greater uncertainty compared to SM top , and, therefore, the stores anticipate greater shares from the GRACE update. This behaviour is seen over all basins where the differences between the CABLE-simulated and GC SM rz , and GWS estimates are greater (compared to SM top ). Furthermore, the impact of GRACE on SM rz and GWS is different across the continent. For example, over central and southern Australia (see e.g. LKE, MRD, NWP, SWP), the dry climate is responsible for a small amount of groundwater recharge and most of the infiltration is stored in soil compartments. In this climate condition, SM rz amplitude is significantly larger than GWS and it plays a greater role in TWS, and, consequently, the GRACE contribution is mostly seen in the SM rz component. Different behaviour is seen over northern Australia (GOC, NEC, TIM) where GWS amplitudes are greater (∼ 40 % of TWS) compared to other basins (only ∼ 17 % of TWS). This is due to the sufficient amount of rainfall over the wet climate region, replenishing groundwater recharges and resulting in greater variability in GWS. Therefore, compared to the dry climate basin, GRACE contributes to GWS over these basins by a larger amount.

Impact on long-term trend estimates
The spatial patterns of the long-term trends of water storage changes over January 2003 and March 2016 are analysed before and after applying the GC approach (Fig. 6). For comparison, the long-term trends of TWS derived from the ITSG-DDK5, mascon, and GRGS solutions are shown in Fig. 7. From Fig. 6b, GRACE effectively changes the longterm trend estimates in most basins in a way the spatial pattern of the TWS trend of the GC solution consistent to the GRACE solutions while satisfying the model processes and keeping the spatial resolution. The trend of SM top is insignificant (Fig. 6c) and the GC approach does not change (Fig. 6d). The largest adjustment is seen in SM rz and GWS components, to be consistent with the GRACE data in most basins ( Fig. 6 and h).
GRACE shows significant changes in the TWS trend estimates particularly over the northern and western parts of the continent (Fig. 7). The model estimates around the Gulf of Carpentaria basin show a strong negative trend that is inconsistent from the GRACE data. It is found that underestimated precipitation after 2012 is likely the cause of such an incompatible negative trend (see Fig. 4d). Applying the GC approach clearly improves the trend (Fig. 6a vs. Fig. 6b). The other example is seen over the western part of the continent (see rectangular area in Fig. 6a and b) where the averaged long-term trend of TWS was predicted to be −0.4 cm yr −1 but changed to be −1.2 cm yr −1 (see also Sect. 5.4) by the GC approach. The precipitation over Western Australia is understood to be overestimated after 2012, evidently seen by the fact that the model TWS is always greater than the GC solution (see e.g. Figs. 4h, 5d and p). The GC approach reveals that the water loss over Western Australia is at least 2 times greater than what has been predicted by the CABLE model.
In addition, the shortage of water storage in the southeastern part of the continent from the Millennium Drought  Fig. 6) after the rainfall between 2009 and 2012, while the western part is still drying out (seen as negative trends). The trend estimates in terms of mass change are discussed in more detail in Sect. 5.4.

Reduction of uncertainty
Influenced by climate pattern, the uncertainty in water storage estimates significantly varies across Australia. The uncertainty in the model estimate is computed from the variability induced by different precipitation and model parameters while the uncertainty in the GC solution is computed using Eq. (11). As expected, larger uncertainties are observed in SM rz and GWS than in SM top (an order of magnitude smaller) since SM top is smaller than others (Fig. 8). Over the wet basins, larger amplitude of the water storage leads to larger uncertainty, seen over the Gulf of Carpentaria, North East Coast, South East Coast, and Timor Sea where the CABLE-simulated TWS uncertainty is approximately 28 % larger than other basins. The smaller uncertainty is found over the dry regions (e.g. LKE, SWP). In most basins, the uncertainty in SM rz is larger than the GWS, except the wet basins (e.g. GOC, NEC, TIM) where the greater groundwater recharge leads to a larger uncertainty in GWS. Figure 8 demonstrates how much the formal error of each of the storage components is reduced by the GC approach. Overall, the estimated CABLE uncertainties averaged over all basins (AVG) are 0.2, 4.0, 4.0, and 5.7 cm for SM top , SM rz , GWS, and TWS, respectively. With the GC approach, the uncertainties in SM top , SM rz , GWS, and TWS decrease by approximately 26, 35, 39, and 37 %, respectively.
It is worth mentioning that the model uncertainty is mainly influenced by the meteorological forcing data. The uncertainty in precipitation derived from seven different precipi- tation products is shown in Fig. 8e. The spatial pattern of the precipitation uncertainty is correlated with the uncertainty in water storage estimates. The larger water storage uncertainty is deduced from the larger precipitation uncertainty. The quality of precipitation forcing data is found to be an important factor to determine the accuracy of water storage computation.

Soil moisture
The SM top estimates are compared with the AMSR-Ederived soil moisture. The processing of AMSR-E data is described in Sect. 4.3.1. The performance is assessed using Nash-Sutcliffe coefficients, given in Table 3. In general, CA-BLE (MN) shows a good performance in the top soil moisture simulation showing an NS value of > 0.4 for most of the basins. The top soil moisture estimate shows slightly bet-ter agreement with the C-band measurement of the AMSR-E product. This is likely caused by the greater emitting depth of the C-band measurement (∼ 1 cm), which is closer to the depth of the top soil layer (∼ 2 cm) used in this study (Njoku et al., 2003).
The GC approach leads to a small bit of improvement of the top soil estimate consistently from C-and X-band measurements and from all basins. No degradation of the NS value is observed in the GC solutions. The largest improvement is seen over LKE and NEC, where NS increases by 10-15 %. For other regions, the change in the NS coefficient may be incremental.

Groundwater
The GWS estimates from the model and the GC method are compared with the in situ data obtained from two different ground networks in Queensland and Victoria. For each network, all GWS data inside the groundwater network boundary (see polygons in Fig. 1) are used to compute the average GWS time series. From the comparison given in Fig. 9, it is found that the GC solutions of GWS follow the overall inter-annual pattern of CABLE but with a considerably larger amplitude. This results in a better agreement with the in situ GWS data seen from both networks. The NS coefficient of GWS between the estimates and the in situ data are given in Table 4. The CABLE GWS performs significantly better in Queensland (NS = ∼ 0.5) than Victoria (NS = ∼ 0.3). Significant improvement is found from the GC solutions in both networks, where the NS value increases from 0.5 to 0.6 (∼ 22 %) in Queensland and from 0.3 to 0.6 (∼ 85 %) in Victoria. Even greater improvement is seen when the inter-annual patterns are compared. The NS value increases from 0.5 to 0.7 (∼ 32 %) and 0.4 to 0.8 (∼ 93 %) in Queensland and Victoria, respectively.
The comparison of the long-term trend of GWS is also evaluated. The estimated trends in Queensland and Victoria are given in Table 4. Beneficially from the GC approach, the GWS trend is improved by approximately 20 % (from 0.4 to 0.6, compared to 1.6 cm yr −1 ) in Queensland. The increase in GWS is mainly influenced by the large amount of rainfall during the 2009-2012 La Niña episodes (see Fig. 9a). In Table 3. NS coefficients between top soil moisture estimates and the satellite soil moisture observations from AMSR-E products over 10 different Australian basins. The area-weighted average value (AVG) is also shown. Victoria, a significant improvement of the GWS trend by about 76 % (from 0.1 to −0.2, compared to −0.3 cm yr −1 ) is observed. Similar improvement of long-term trend estimates is seen in deseasonalized time series (improves by ∼ 15 % in Queensland and by ∼ 74 % in Victoria). The decrease in GWS in Victoria is mainly due to the highly demanded groundwater consumption by agriculture and domestic activities (Van Dijk et al., 2007;Chen et al., 2016). As the groundwater consumption is not parameterized in CABLE, the decrease in the GWS estimate cannot be properly captured in the model simulation. Applying GC approach effectively reduces the model deficiency and improves the quality of the groundwater estimations.

Assessment of mass variation in the past 13 years
Australia experiences significant climate variability; for example, the Millennium Drought starting from late 1990 (Van Dijk et al., 2013) and extremely wet conditions during several La Niña episodes (Trenberth, 2012;Han, 2017). These periods are referred as "Big Dry" and "Big Wet" (Ummenhofer et al., 2011;Xie et al., 2016). To understand the total water storage (mass) variation influenced by these two distinct climate variabilities, the water storage change obtained from the GC approach during Big Dry and Big Wet is separately investigated over 10 basins. The time window between January 2003 and December 2009 is defined as the Big Dry period while between January 2010 and December 2012 it is defined as the Big Wet period following Xie et al. (2016). In each period, the long-term trends of GC estimates of TWS, SM top , SM rz , and GWS are firstly calculated. Then, the total water storage variation (in m) is simply obtained by multiplying the long-term trend (in m yr −1 ) with the number of years in the specific period -7 years for Big Dry and 3 years for Big Wet. To obtain the mass variation,   During Big Dry (2003)(2004)(2005)(2006)(2007)(2008)(2009), a significant loss of total storage (40-60 Gt over 7 years) is observed over the LKE, MRD, NWP, and SWP basins. The largest groundwater loss of > 20 Gt is found from LKE and MRD. No significant change is observed over the tropical climate regions (e.g. GOC, NEC). The mass loss mostly occurs in the root zone and groundwater compartments where the sum of SM rz and GWS explains more than 90 % of the TWS value. The mass loss is also observed in SM top but is > 10 times smaller than SM rz and GWS.
During Big Wet (2010)(2011)(2012), the basins such as LKE, MRD, and TIM exhibit the significant total storage gain of > 100 Gt. The gain is particularly larger in SM rz over the basins that experienced the significant loss during Big Dry. For example, over LKE and MRD, the gain of SM rz is approximately 2-3 times greater than GWS. It implies that most of the infiltration (from the 2009-2012 La Niña rainfall) is stored as soil moisture through the long drought period and that the groundwater recharge is secondary to the SM rz increase. The opposite behaviour is observed over the basins (such as NEC and GOC) that experienced mass gain during Big Table 4. NS coefficient and long-term trend of GWS estimated from the model-only and GC solutions in Queensland and Victoria groundwater network. The long-term trend of the in situ data is also shown.

Queensland Victoria
In Dry. The water storage gain is greater in GWS compared to SM rz . In NEC, GWS gain is ∼ 8 times larger than SM rz during Big Wet. The soil compartment may be saturated during Big Dry and additional infiltration from the Big Wet precipitation leads to an increased groundwater recharge. The SM rz loss observed over GOC is simply caused by the timing selection of the Big Wet period, which ends earlier (∼ 2011) in GOC than in other basins. The SM rz gain becomes ∼ 26 Gt if the Big Wet period is defined as 2008-2011. During the post-Big-Wet period (2012 and afterwards), the decreasing trend of water storage is observed from all basins (see Figs. 4 and 5). This is mainly caused by the decrease in precipitation after 2012 and by gradual water loss through evapotranspiration (Fasullo et al., 2013). The overall water storage change in the last 13 years demonstrates that the severe water loss from most basins during Big Dry (the Millennium Drought) is balanced with the gain during Big Wet (the La Niña). The negative TWS estimated during Big Dry becomes positive in LKE, MRD, and SEC and less negative in TIM, and the greatest gain is observed from NEC by ∼ 50 Gt during the 13-year period (see Fig. 10c). However, the water mass loss is still detected over the western basins (e.g. IND, NWP, SWP, SWC), and their magnitudes are even larger than the mass loss during Big Dry. For example, the greatest TWS loss of ∼ 79 Gt is observed over NWP, which is ∼ 25 Gt greater than the loss during Big Dry (see Fig. 10a and c). The basin is less affected by the La Niña, and the rainfall during Big Wet is clearly inadequate to support the water storage recovery in the basin. Rainfall deficiency also reduces the groundwater recharge, resulting in even more decreasing of GWS, compared to the Millennium Drought period (see Fig. 10j and l). The continual decrease in water storage over western basins is likely caused by the interaction of complex climate patterns such as the El Niño-Southern Oscillation, Indian Ocean Dipole, and Southern Annular Mode cycles (Australian Bureau of Meteorology, 2012; Xie et al., 2016).

Comparison of the GC approach with alternatives
The simplest approach to estimate GWS is to subtract the model soil moisture component from GRACE TWS data, without considering uncertainty in the model output, as used in Rodell et al. (2009) andFamiglietti et al. (2011). This method is called Approach 1 (App 1). In Approach 2 (App 2) as in Tangdamrongsub et al. (2017), by accounting for the uncertainty in model outputs and GRACE data, the water storage states are updated through a Kalman filter: where h R , H, C R are described in Sect. 2; b is an observation vector containing GRACE-derived TWS; and R is an error variance-covariance matrix of the observation. The GRACE-derived TWS and its error information is obtained from the mascon solution. The matrix R is a (diagonal) error variance matrix since no covariance information is given in the mascon product. Note that the model uncertainty remains the same as in GC approach (Sect. 4.2). The different results from App 1 and App 2 are mainly attributed to the different estimates of the uncertainty. The GWS estimates from App 1, App 2, and GC in Queensland and Victoria are shown in Fig. 11. It is clearly seen that GWS values from App 1 are overestimated while the one from App 2 fits the ground data significantly better. This behaviour was also seen in Tangdamrongsub et al. (2017), where the water storage estimates tend to be overestimated when error components such as spatial correlation error were neglected as in App 1. GWS from App 2 shows clear improvements in terms of NS coefficients in both networks. Considering the deseasonalized GWS estimates, in Queensland, the trend increases from 0.39 ± 0.03 to 0.42 ± 0.03 cm yr −1 (improves by 1.5 %) and the NS value increases from 0.46 to 0.53. In Victoria, the trend decreases from 0.73 ± 0.10 to 0.46 ± 0.05 cm yr −1 (improves by 27 %) and the NS value increases from −0.89 to 0.30. Although App 2 is not yet as good as the GC solution based on the most comprehensive error propagation, this simple test demonstrates the importance of considering the uncertainty. The reason for App 2 being less accurate than GC is likely the too-simplified error information implemented in App 2.

Conclusion
This study presents an approach to combine the raw GRACE observation with model simulation to improve water storage estimates over Australia. Distinct from other methods, we exploit the fundamental GRACE satellite tracking data and the full data error variance-covariance information to avoid alteration of signal and measurement error information present in higher-level data products.
We compare groundwater storage estimates from the GC approach and two other approaches, subject to inclusion of GRACE uncertainty in the GWS calculation. Validating three results of GWS against the in situ groundwater data, we find that the GC approach delivers the most accurate groundwater estimate, followed by the approach based on incomplete information of GRACE's data error. The poorest estimate of groundwater storage is seen when the GRACE uncertainty is completely ignored. This confirms the critical value of using the complete GRACE signal and error information at the raw data level.
The analysis of water storage change between 2003 and 2016 reveals that half of the continent (5 out of 10 basins) has still not fully recovered from the Millennium Drought. The TWS decrease in Western Australia has been most characteristic, and the GC approach finds that the water loss mainly occurs in the groundwater layer. Rainfall inadequacy is attributed to the continual dry condition, leading to a greater decrease in groundwater recharge and storage over Western Australia.
The land surface model we used is deficient in anthropogenic groundwater consumption. The model calibration will never help, and the groundwater consumption must be brought in by external sources. On the contrary, statistical approaches similar to our GC approach may be useful to fill in the missing component and lead to a more comprehensive water storage inventory.
However, it is difficult to constrain different water storage components by only using total storage observation such as GRACE. In addition, it is challenging to improve surface soil moisture varying rapidly in time, using a monthly mean GRACE observation. Tian et al. (2017) utilized the satellite soil moisture observation from Soil Moisture and Ocean Salinity (SMOS, Kerr et al., 2001) in addition to GRACE data for their data assimilation and showed a clear improvement in the top soil moisture estimate. The GC approach with complementary observations at higher temporal resolution should be considered, particularly to enhance the surface soil moisture computation.
Furthermore, the GC approach can be simply extended for GRACE data assimilation. Assimilating the raw GRACE data into land surface models like CABLE enables the model state and parameter to be adjusted with the realistic error information, allowing more reliable storage computation.
Data availability. This study is based on third-party data. The data providers are acknowledged in the data sections ( Sects. 3 and 4). The CABLE model can be obtained after registration at https://trac. nci.org.au/trac/cable.
where y is the observation vector containing various kinds of L1B data including the inter-satellite ranging data; A is the design (partial derivative) matrix relating the data and the Earth gravity field variations; x contains the Stokes coefficients of time-varying geopotential fields (e.g. Wahr et al., 1998); and e is the L1B data noise, which has zero mean and covariance . Equation (A1) can be modified explicitly in terms of soil moisture and groundwater storage variations as y = ASYHh + e; e ∼ N (0, ), where S contains a factor used to convert TWS to geopotential coefficients considering the load Love numbers (e.g. Wahr et al., 1998) and Y converts the gridded data into the corresponding spherical harmonic coefficients. For convenience, the term Y = SY is used in the further derivation. A least-squares solution of Eq. (A2) is given as It can be simplified as where N = A T −1 A and c = A T −1 y. Equation (A4) is identical to Eq. (5).
Appendix B: Nash-Sutcliffe coefficient and area-weighted average The Nash-Sutcliffe coefficient (NS) is computed as follows: where y is an observation vector, y is the mean of the observation,x is a vector containing the simulated result, i is the index of observation, and N is the number of observations. Area-weighted average (Z) is compute as follows: where w is the area size, z is the mean value inside the considered area, j is the area index, and M is the number of considered areas.