Role of forcing uncertainty and model error background characterization in snow data assimilation

Accurate specification of the model error covariances in data assimilation systems is a challenging issue. Ensemble land data assimilation methods rely on stochastic perturbations of input forcing and model prognostic fields for developing representations of input model error covariances. This article examines the limitations of using a single forcing dataset for specifying forcing uncertainty inputs for assimilating snow depth retrievals. Using an idealized data assimilation experiment, 5 the article demonstrates that the use of hybrid forcing input strategies (either through the use of an ensemble of forcing products or through the added use of the forcing climatology) provide a better characterization of the model error background, which leads to improved data assimilation results, especially during the snow accumulation and melt time periods. The use of hybrid forcing ensembles is then employed for assimilating snow depth retrievals from the AMSR2 instrument over two 10 domains in the Continental U.S. with different snow evolution characteristics. Over a region near the Great Lakes where the snow evolution tends to be ephemeral, the use of hybrid forcing ensembles provide significant improvements relative to the use of a single forcing dataset. Over the Colorado Headwaters characterized by large snow accumulation, the impact of using the forcing ensemble is less prominent and is largely limited to the snow transition time periods. The results demonstrate 15 that the availability of a better model error background through the forcing ensemble enables the assimilation system to better incorporate the observational information. 1 Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-581, 2016 Manuscript under review for journal Hydrol. Earth Syst. Sci. Published: 22 November 2016 c © Author(s) 2016. CC-BY 3.0 License.


Introduction
Land Data Assimilation (DA) methods combine observations of land surface conditions from remote sensing platforms or ground measurements with model forecasts to produce temporally and spatially 20 continuous estimates of land surface fields.The merging of the observations and model forecasts is conducted by weighting them appropriately based on their respective sources of errors.As a result, the skill of the DA systems is critically reliant on the accurate specification of errors in observations and model background.
Despite their importance, the specification of input error covariances is challenging (Dee (1995); 25 Derber and Bouttier (1999); Reichle (2008); Reichle et al. (2008)).The sources of errors in observations include instrument errors, deficiencies of the observation operators (such as radiative transfer models) and representativeness issues from differences in spatial scales (Kumar et al. (2012)).Similarly, uncertainties in model parameters, forcing inputs and deficiencies in model physics contribute to the model background errors.The model error covariance specifications are often made through 30 idealized experiments using analysis of assimilation increments and innovations (Kumar et al. (2008(Kumar et al. ( , 2009))).Comparison of model simulations against independent observations is another approach for developing these specifications.However, given the lack of representativeness of the point-scale in situ measurements and the heterogeneity of the land surface, developing spatially distributed estimates of these model error covariances are difficult.As noted in Reichle (2008), the specification of 35 input error covariances remains a subjective process in current land data assimilation systems.
Ensemble data assimilation techniques such as the Ensemble Kalman Filter (EnKF) are widely used in land data assimilation applications (Crow and Wood (2003); Reichle et al. (2007); Kumar et al. (2009); Reichle et al. (2010);De Lannoy et al. (2012); Kumar et al. (2014)).The EnKF, a Monte-Carlo variant of the Kalman filter, uses an ensemble of model trajectories to represent the 40 model error structures.The model error covariance is diagnosed as the sample covariance of the ensemble of model forecasts.The ensemble is typically created by adding stochastic noise to the meteorological forcing, propagated to the model fields through the non-linear land surface model.In addition, stochastic perturbations are also commonly applied to the model prognostic fields.
Perturbations are sampled from randomly generated noise and are directly applied to the forcing 45 and model prognostic fields.The typical approach is to employ either normally distributed additive perturbations or lognormally distributed multiplicative perturbations, depending on the variable.For example, multiplicative perturbations are normally used for fields such as precipitation, since the use of additive noise could generate unphysical values (less than zero) or consistent positive biases during periods where precipitation is absent.In addition, to avoid introducing systematic biases in 50 the perturbed fields, the ensemble-mean of the perturbations are normally constrained to zero and one, for additive and multiplicative perturbations, respectively.
In this article, we examine how the reliance on ensemble perturbations of forcing fields to develop the model error background impacts the performance of data assimilation.Most land data is characterized by perturbing the meteorological fields from this single data source.Arguably, the accuracy of the model error covariance will greatly depend on the accuracy of the forcing input.For example, in a case where precipitation estimates are underestimated, the forcing uncertainty characterized by the resulting ensemble will lead to the underestimation of the model error covariance.
In contrast, alternate strategies such as the added use of the forcing climatology or multiple forcing 60 data sources are likely to provide better representations of the forcing uncertainty and a better characterization of the model error background.In this article, we examine the impact of such factors in the context of snow data assimilation case studies.
The article presents two sets of experiments: 1) An idealized experiment to demonstrate the impact of model error covariance underestimation and 2) A "real" data assimilation scenario where snow 65 depth retrievals (Oki et al. (2010); Kachi et al. (2013)), from the Advanced Microwave Scanning Radiometer 2 (AMSR2) aboard the Global Change Observation Mission-Water (GCOM-W) satellite are used.The assimilation of AMSR2 data is conducted over two different domains in the continental U.S. with different snow evolution characteristics.The different nature of the snow evolution in these domains is used to investigate the impact of model error background representations in snow 70 data assimilation.All experiments described in this article are conducted using the NASA Land Information System (LIS; Kumar et al. (2006)) which is an observation-driven land surface modeling and data assimilation system.The data assimilation subsystem in LIS (Kumar et al. (2008)) contains algorithms such as the EnKF and supports the assimilation of data from a variety of satellite sensors (Reichle et al. (2010); Liu et al. (2013); Kumar et al. (2014Kumar et al. ( , 2015)); Liu et al. (2015); Kumar et al.

Ensemble Kalman Filter and background error covariance representation
The filtering class of data assimilation algorithms seek the best estimate of the posterior state conditioned on the past observations, using the statistics of the uncertainties in the model and observations.The Kalman Filter (KF) is an optimal estimator for linear dynamical systems driven by 80 Gaussian noise.The EnKF is a reduced-rank variant of the KF, which assumes normality of model and observation errors and typically requires the use of a small number of ensembles to represent these error structures (Reichle (2008)).
EnKF is a sequential data assimilation approach, where the algorithm alternates between a forecast step and an analysis step.In the forecast step, an ensemble of model states is propagated forward in 85 time using the LSM.This is followed by an analysis step where the model forecast is updated based on observations.The analysis step is written in the general form as: where x b is the background model state vector, x a is the analyzed state vector, y is the observation vector and H k is the observation operator that relates the model states to the observations.The 90 subscript k indicates time and the superscripts b and a refer to the state estimates, before and after the update, respectively.K k is the gain matrix, which represents the weighting factor that determines the degree to which the model forecast is adjusted towards the observation.K k is expressed as: where R k and P b k are the observation and forecast model error covariances, respectively.The model 95 error covariance is computed as the sample covariance of the model ensemble.
EnKF relies on the second order statistics of the noise simulated by ensemble perturbations in the model and observations (drawn from Gaussian distributions), to characterize their probability density functions (PDFs).The accuracy of the sampled model error covariance, in particular, is dependent on the size of the ensemble and the presence of model errors (Li et al. (2009)).Prior studies have used 100 techniques such as covariance inflation (Anderson and Anderson (1999)), to deal with the covariance underestimation.These techniques, however, require significant tuning and rely on the assumption that the observation error covariances are known (Miyoshi and Yamane (2007)).In addition, these inflation techniques are ineffective when the model errors are significant and the resulting model error covariances are close to zero.In the examples below, the impact of underestimating the model 105 error background for snow data assimilation is examined.
Figure 1 shows a schematic of three strategies that are used to examine the issue of model covariance underestimation in this article.The first strategy (A), which is the typical practice in land data assimilation systems, is to use a single forcing dataset to drive the ensemble.The small perturbations applied to the input forcing variables help in simulating the ensemble spread.In the second strategy 110 (B), the ensemble is forced with both the given forcing and a climatology of that forcing.The added use of the forcing climatology helps in incorporating the representation of average conditions within the ensemble and in reducing the covariance underestimation due to the reliance and limitations of a single dataset.In the third approach (C), the model ensemble is driven using an ensemble of forcing products from different sources, providing a more realistic representation of the input forcing uncer-115 tainty.Note that small perturbations to the forcing variables are also applied to B and C forcing data to augment the ensemble size.As a result, no non-zero analysis increments are generated from the DA analysis and no changes in the snow depth fields from DA are observed until mid-December, 2012.In contrast, during the peak winter months, the snow depth estimates from DA_FSNGL are closer to the Control simulation, as the availability of a non-zero model error covariance allows DA to compute positive analysis increments.Further, the DA_FSNGL integration again fails to capture the late season snow events (late April and early May 2013), as the deficiencies in the model error background results in the inability of the analysis step to produce meaningful analysis increments.
In the above example, the main source of the model deficiencies is the errors in the forcing inputs, as the same model is used in the Control and open loop integrations.Two variants of this experiment are conducted by: 1) using the forcing climatology in combination with the input forcing to specify the ensemble (EXP-FCLIM) and 2) using an ensemble of forcing datasets to drive the ensemble

Impact of forcing ensemble in the assimilation of AMSR2 snow depth retrievals
The idealized experiments presented in the previous section demonstrate that the use of hybrid forc-210 ing ensemble strategies is helpful in providing a better characterization of the forcing uncertainty and the model background.We extend this approach to a "real" data assimilation scenario where passive microwave snow depth observations from the AMSR2 instrument are employed.These retrievals, available from 2012 July onwards, are obtained from the Japan Aerospace Exploration Agency (JAXA).In all the integrations assimilating AMSR2 retrievals, the standard deviation of the Table 1 shows a summary of the cumulative RMSE from various simulations at these locations.The cumulative RMSE from the OL_FSNGL is 381 mm, which reduces to 275 mm and 169 mm with the OL_FCLIM and OL_FENS, respectively.The cumulative RMSE in the DA integrations is 266  A similar set of evaluations are conducted over the CH domain, an area with deeper seasonal snow accumulation compared to the GL region.Figure 7 presents the RMSE improvement map for the CH domain (similar to Figure 5).Compared to the improvements observed in the GL domain, the patterns of improvements and degradations are more mixed in the CH domain.In addition, stronger  In this article, the limitations of using a single forcing dataset as the basis for developing model error background is examined in the context of snow data assimilation.When significant errors are present in the forcing fields (e.g.precipitation), the resulting model and ensemble estimates will have significant errors.In such instances, the use of an ensemble of forcing datasets, either based on climatology or a suite of independent datasets, is likely to provide a better representation of the 325 forcing uncertainty and the model error background.The article demonstrates these issues through both idealized and real data assimilation experiments.
The idealized experiment presents a case where the snow depth estimates are significantly underestimated due to the presence of precipitation biases.The application of stochastic perturbations using this biased precipitation input is inadequate in providing a realistic model error background in 330 the assimilation system.As a result, the snow depth fields in the DA system remain biased, especially during the snow evolution and spring melt periods.In contrast, when an ensemble of forcing datasets is used to drive the model, the representation of the model error background is more realistic.As a result, the assimilation system performs better in incorporating the impact of observations during the snow evolution and ablation periods.

335
The impact of using a forcing ensemble for developing the model error background is examined for the assimilation of snow depth retrievals from the AMSR2 instrument, over two domains in the Continental U.S. with different snow evolution characteristics.Over the region near the Great Lakes, the snow evolution tends to be shallow, with transitions between snow and no-snow conditions during  2011)).Such limitations contribute to the mixed results seen in these results, especially in the CH domain.In such instances, the poorer performance from the use of the forcing ensemble is a result of the poor skill of the retrievals.To improve the skill of the retrievals themselves, prior studies (Kumar et al. (2014); Liu et al. (2015)) have successfully employed objective analysis techniques such as optimal interpolation to blend in situ measurements with satellite retrievals prior to assimilation.These prior studies and the results of this article suggest that a strategy that combines the use of hybrid forcing inputs (to improve model error background) and in situ data based correction of observations to be assimilated (to enhance the satellite retrievals) is likely to provide a robust configuration for optimal DA performance.
It must be stressed that in the experiments presented in the article, the OL_FSNGL configurations purposely employ an inferior forcing dataset so that the differences between the OL_FSNGL and OL_FCLIM and OL_FENS simulations are more magnified.If the single forcing dataset being used is of high skill, then the added benefit of using the forcing ensemble is likely to be less.Overall, the results in this article indicate that use of a forcing ensemble is helpful in providing better representations of model error background and more positive and consistent improvements in data assimilation.Note also that the use of an ensemble of forcing products may be practical in operational assimilation environments for centers with ensemble prediction systems.Where not available, the combined use of the forcing climatology along with the single, operational forcing input may be an appropriate strategy to improve the skill of the data assimilation system, validated by the results in this paper.Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-581, 2016 Manuscript under review for journal Hydrol.Earth Syst.Sci.Published: 22 November 2016 c Author(s) 2016.CC-BY 3.0 License.

3
Assessing the impact of model error covariance underestimation through idealized experiments In this section, we present an idealized snow depth DA experiment to demonstrate the importance 120 of accurately characterizing the input model error covariances.The experiment is conducted at the Niwot Ridge site in Colorado (40.03 • N, 105.5 • W), which is part of the NRCS Snow Telemetry (SNOTEL) network.All model simulations are conducted using the Noah land surface model version 3.3.The DA experiment is set up as an identical twin experiment (Kumar et al. (2009)) with the following structure: First, the Noah LSM is run forced with a certain meteorology (FORCING1)125and is assumed to represent the "true" state of snow depth evolution at this location.This model integration is termed as the Control or "truth" simulation.Next, a set of synthetic snow depth observations is simulated from this Control run by introducing realistic retrieval errors.Similar to the strategies used in previous studies(Kumar et al. (2008)), to account for the limitations of the passive microwave sensors in retrieving snow depth under dense canopies, the observations are masked out 130 when Green Vegetation Fraction (GVF) values used in the model are greater than 0.6.In addition, observations are degraded by introducing multiplicative random noise with standard deviation of 0.05 to simulate the errors in the snow depth retrievals.An open loop (OL) integration is conducted using the same LSM, but forced with a different meteorology (FORCING2) which has a degraded set of precipitation inputs.A data assimilation integration is then conducted by incorporating the 135 simulated observations into the OL configuration using a one-dimensional Ensemble Kalman Filter (EnKF;Reichle et al. (2002)).The modeled estimates from the OL and DA integrations are compared against the true fields from the Control run to evaluate the impact of assimilation.An ensemble size of 20 is used in the integrations with perturbations applied to both meteorological forcing inputs and model prognostic fields to simulate model error background.Multiplicative 140 perturbations are applied to the precipitation and downwards shortwave fields with a mean of 1 and standard deviations of 0.3 and 0.5, respectively.Additive perturbations with a standard deviation of 50 W/m 2 are applied to the longwave radiation fields.The Noah LSM model fields of snow water equivalent (SWE) and snow depth are perturbed with multiplicative noise of 0.01 and 0.02, respectively.Time series correlations are imposed via a first-order regressive model (AR(1)) with a time 145 scale of 24 hours for forcing variables and 12 hours for the model fields.The perturbations to the forcing fields are applied hourly, whereas the model prognostic fields are perturbed at three hour intervals, similar to the configurations used inKumar et al. (2015) andKumar et al. (2014).

Figure 2
Figure 2 shows a time series of the snow depth fields from model integrations for the 2012-2013 winter season, from the Control, observations (OBS), open loop simulation forced with a single me-150

(
EXP-FENS).A climatological forcing dataset is developed by averaging the forcing inputs (used in OL_FSNGL) at each forcing timestep across 4 years (2012 to 2015).In EXP-FCLIM, the forcing climatology is used to drive 10 of the 20 ensemble members with the remaining 10 driven by the OL_FSNGL forcing data.In EXP-FENS, four different forcing datasets (different from the data used in the Control) is used to drive the model ensemble.Each forcing data is used to drive 5 ensemble members within the 20 member ensemble.As before, perturbations are applied to both forcing and model states.These strategies assume that a better representation of the forcing uncertainty and model error covariance can be developed by augmenting the ensemble through the use of multiple data sources.Panels (a) in Figure 3 show the time series of snow depth from open loop and DA integrations from the EXP-FCLIM and EXP-FENS experiments and panels (b) show comparisons of the snow depth ensemble spread from DA integrations.In the EXP-FCLIM experiment, it can be noted that the added use of forcing climatology with the OL_FSNGL forcing is helpful in increasing the ensemble spread in the DA integrations without a significant change to the mean snow depth estimates.The improved model error background representation, however, leads to improved DA performance, as the DA_FCLIM based estimates are improved relative to the DA_FSNGL estimates.The improvements are more apparent during the snow accumulation (Dec-Jan) and melt (April-May) time periods, though they are significantly underestimated relative to the Control.Quantitatively, the RMSE in the OL_FSNGL and OL_FCLIM integrations for the Oct 2012 to Jun 2013 time period is 85 mm.The DA_FSNGL integration with a single forcing dataset has a RMSE of 55 mm and the added use Hydrol.Earth Syst.Sci.Discuss., doi:10.5194/hess-2016-581,2016 Manuscript under review for journal Hydrol.Earth Syst.Sci.Published: 22 November 2016 c Author(s) 2016.CC-BY 3.0 License. of the forcing ensemble helps in further reducing the overall RMSE to 48 mm in the DA_FCLIM integration.Comparatively, the use of an ensemble of forcing products provides significantly improved performance in the assimilation of simulated observations.First, a significant portion of the bias in the snow depth estimates is reduced by the forcing ensemble based open loop (OL_FENS).The cumu-200 lative RMSE of the OL_FENS integration is 56 mm.The use of the forcing ensemble then helps in improving the DA simulations (DA_FENS), as it shows a closer match with the Control relative to all other DA integrations.In particular, DA_FENS shows improvements in the accumulation (Nov-Dec) and snow melt (Mar-Apr) periods, as the availability of an improved model background helps in generating meaningful (non-zero) analysis increments.A better characterization of the snow accu-205 mulation also helps in improving the model background and data assimilation during the peak snow time periods as well.The use of the forcing ensemble in data assimilation (DA_FENS) provides the lowest RMSE of 29 mm, for the time period of Oct 2012 to June 2013.
figure indicates, the DA integrations employing hybrid forcing inputs are systematically better than the DA_FSNGL simulation in most parts of the domain.Comparatively, the RMSE improvements are larger in the DA_FENS integration than the DA_FCLIM simulation.Note that the improved skill of DA_FENS in particular, is benefited by both the improved model background and the skill of the precipitation data sources that constitute the forcing ensemble, though it is hard to separate their 245

260
mm for DA_FSNGL, 262 mm in DA_FCLIM and 244 mm in DA_FENS.Note that the cumulative RMSE does not reflect the obvious improvement during the late season snow periods in DA_FENS (over OL_FENS), as the early season underestimation dominates these statistics.

Figure 6 (
Figure 6 (B) panel shows a similar time series comparison at point B with larger snow evolution.Similar to point A, OL_FSNGL underestimates the snow evolution throughout the season (RMSE of 265

280
improvements and degradations are observed in the DA_FCLIM and DA_FENS integrations relative to DA_FSNGL.To examine these patterns, the time series of snow evolution from various integrations are compared at two locations in the CH domain( Point C at 40.375, 106.875 and point D at   45.125, 109.875)  and are shown in Figure8.OL_FSNGL underestimates the snow evolution in both locations (RMSE of 424 mm and 276 mm at C and D, respectively as shown in Table1).The added 285 use of the climatology (OL_FCLIM) marginally improves the snow simulation at location C (RMSE of 402 mm) and provides more significant improvements at location D (RMSE of 142 mm).The use of the forcing ensemble (OL_FENS) provides a better match to the observations at location C (RMSE of 179 mm), but overestimates the snow accumulation at location D (RMSE of 215 mm).At location C, the assimilation of AMSR2 improves the snow depth estimates in DA_FSNGL (RMSE of 290 316 mm) and DA_FCLIM (RMSE of 309 mm) integrations relative to their respective OL, whereas DA leads to degradations in the forcing ensemble configuration (RMSE of 285 mm), compared to OL_FENS.At location D, the assimilation of AMSR2 retrievals leads to increased RMSE in the DA integrations (RMSE of 327, 312 and 309 mm for DA_FSNGL, DA_FCLIM and DA_FENS, respectively) These trends are reflective of the fact that the AMSR2 observations underestimate the snow 295 evolution in the peak winter months (Jan-Mar) and overestimates snow estimates in the spring melt time periods (Apr -May), at location C. At location, D, however, the AMSR2 snow observations are generally underestimated.The underestimation of snow at both these locations, is likely due to the fact that passive microwave based retrievals saturate for thick snow packs (Dong et al. (2005)).In general, the DA integrations (DA_FSNGL, DA_FCLIM and DA_FENS), have comparable per-300 formance at both these locations and they mostly follow the snow evolution patterns in the AMSR2 data.The influence of undersampling the model error background can be observed in the early part of the snow season at location C and during late season at location D, where the DA_FSNGL integrations fail to match the snow events captured by AMSR2.During the peak snow time periods, Hydrol.Earth Syst.Sci.Discuss., doi:10.5194/hess-2016-581,2016 Manuscript under review for journal Hydrol.Earth Syst.Sci.Published: 22 November 2016 c Author(s) 2016.CC-BY 3.0 License.however, the undersampling of model error background in OL_FSNGL is less of a problem over this domain.Though underestimated, the AMSR2 observations capture the seasonality of snow evolution.Once the initial snow accumulation occurs, it provides an adequate model background for subsequent data assimilation updates.Thus, the evaluation of the snow DA integrations at these two regions provide valuable insights on the importance of accurately characterizing the model error background.The use of the hybrid forcing ensemble and improved model background is more help-310 ful over the GL domain, where snow evolution is ephemeral.Over regions with large snowpacks such as the CH region, the representation of the model background is more important during the early accumulation and spring melt time periods.5 Summary Accurate specification of input model and observations error covariances in data assimilation sys-315 tems is challenging though these error specifications are critical in the development of a skillful data assimilation system.In offline ensemble land data assimilation systems, the model ensemble and model error background representation are typically generated by applying small perturbations to the model prognostic states and input meteorological forcing fields.Most Land DA studies are reliant on the use of a single forcing dataset to derive their driving meteorology. 320 Hydrol.Earth Syst.Sci.Discuss., doi:10.5194/hess-2016-581,2016   Manuscript under review for journal Hydrol.Earth Syst.Sci.Published: 22 November 2016 c Author(s) 2016.CC-BY 3.0 License.each snow season.In this region, the added use of the forcing climatology to drive the ensemble leads to improved DA performance, when compared to the in-situ ground observations of snow depth.The DA performance is further enhanced with the use of an ensemble of forcing inputs, partly aided by the enhanced skill of the precipitation inputs.Over the Colorado Headwaters, an area with large seasonal snow packs, the impact of precipitation biases on the simulation of snow states are largely limited to the snow evolution and ablation time periods.As the occurrences of transitions between snow and no-snow states are less common during the peak winter months in this region, the underestimation of the model error background is less problematic in the DA integrations during these time periods.As a result, the positive impact of the use of forcing ensemble is mostly prominent during the accumulation and ablation time periods.As noted above, the evaluation of snow depth estimates over CH region show mixed results, with several locations indicating worse performance with the use of the forcing ensemble compared to the use of a single forcing dataset.In regions with large snow accumulation (such as the CH region), passive microwave retrievals such as those from AMSR2 are known to have low skill due to issues such as saturation in deep snowpacks, signal loss in wet snow and overestimation in the presence of large snow grains(Dong et al. (2005);Foster et al. (2005);Durand et al. (

Figure 1 .
Figure 1.Schematic of the three strategies used to specify forcing uncertainty in the data assimilation integrations: (A) a single forcing dataset, (B) a single forcing dataset and its climatology and (C) an ensemble of forcing products.In all three cases, perturbations are applied to the forcing inputs to generate the ensemble.

Table 1 .
Cumulative RMSE (mm) from various model and DA integrations at the four locations in the Great Lakes and Colorado Headwaters domains used in the Figures 6 and 8. Earth Syst.Sci.Discuss., doi:10.5194/hess-2016-581,2016 Manuscript under review for journal Hydrol.Earth Syst.Sci.Published: 22 November 2016 c Author(s) 2016.CC-BY 3.0 License.