Assimilation of SMOS Brightness Temperatures or Soil Moisture Retrievals into a Land Surface Model

Three different data products from the Soil Moisture Ocean Salinity (SMOS) mission are assimilated separately into the Goddard Earth Observing System Model, version 5 (GEOS-5) to improve estimates of surface and root-zone soil moisture. The first product consists of multi-angle, dual-polarization brightness temperature (Tb) observations at the bottom of the atmosphere extracted from Level 1 data. The second product is a derived SMOS Tb product that mimics the data at a 40° incidence angle from the Soil Moisture Active Passive mission. The third product is the operational SMOS Level 2 surface soil moisture (SM) retrieval product. The assimilation system uses a spatially distributed ensemble Kalman filter (EnKF) with seasonally varying climatological bias mitigation for Tb assimilation, whereas time-invariant cumulative distribution function matching is used for SM retrieval assimilation. All assimilation experiments improve the soil moisture estimates compared to model-only simulations during the period 1 July 2010 to 1 May 2015 and for 187 sites across the United States. Especially in areas where the satellite data are most sensitive to surface soil moisture, large skill improvements (e.g. increase in anomaly correlation by 0.1) are found in the surface soil moisture. The domain-average surface and root-zone skill metrics are similar among the various assimilation experiments, but large differences in skill are found locally. The observation-minus-forecast residuals and analysis increments reveal large differences in how the observations add value in the Tb and SM retrieval assimilation systems. The distinct patterns of these diagnostics in the two systems reflect observation and model error patterns that are not well captured in the assigned EnKF error parameters. Consequently, a localized optimization of the EnKF error parameters is needed to further improve Tb or SM retrieval assimilation.


1 Introduction
Microwave satellite missions are collecting large amounts of data for soil moisture monitoring. It is not yet clear, however, how this wealth of data can be used in the most efficient way to obtain global estimates of soil moisture that can improve, e.g., weather prediction, flood and drought modeling, agricultural yield monitoring, or landslide predictions. Many such applications require knowledge of soil moisture in a deeper layer, where water is extracted by plant roots or stored to buffer drainage and runoff, not the approximately 5 cm surface layer to which the current L-band (∼ 1.4 GHz) microwave missions are sensitive. Moreover, L-band satellite observations have a fairly coarse spatial resolution (about 40 km) and are available only at particular overpass times, typically once every 2-3 days for a given location. The challenge is thus to derive soil profile moisture information at all times and locations through data assimilation, that is, through the merger of satellite observations with information from a dynamical land surface model.
The Soil Moisture Ocean Salinity (SMOS; Kerr et al., 2010) mission and the Soil Moisture Active Passive (SMAP; Entekhabi et al., 2014) mission are the two L-band observatories currently orbiting in space with the specific aim of measuring global soil moisture. These missions supply Level 1 (L1) brightness temperature (Tb) data, Level 2 (L2) surface soil moisture (SM) retrievals, and derived Level 3 (L3) products. The SMAP mission also provides an operational Level 4 surface and root-zone soil moisture product (L4_SM; Entekhabi et al., 2014; Reichle et al., 2016) that is based on the assimilation of L1 SMAP Tb data into Goddard Earth Observing System Model, version 5 (GEOS-5) land surface simulations. Alternatively, a soil moisture assimilation system could ingest L2 SM retrievals instead of L1 Tb observations.
Published by Copernicus Publications on behalf of the European Geosciences Union.
In this paper, we compare Tb and SM retrieval assimilation using a historical (5-year) record of SMOS observations over North America in an assimilation system similar to the SMAP L4_SM system. The main differences between the SMAP L4_SM system and the experiments in this paper pertain to the assimilated data, to the spatial resolution of the resulting soil moisture products (36 km in the current paper, see below; 9 km for the L4_SM product), and to the meteorological forcing input (re-analysis meteorology in the current paper; operational forecast meteorology corrected with gauge-based precipitation in the L4_SM product).
It is more difficult to assimilate Tb observations than SM retrievals because brightness temperatures are only indirectly connected with the land surface variables of interest and the Tb data come in multiple polarizations. SMOS Tb observations are even more complex because of their multi-angular nature. Some of the SMOS L1 Tb data complexity is reduced in the L3 SMOS Tb product and further addressed in Munoz-Sabater et al. (2014) and De Lannoy et al. (2015), who prepared the L1 SMOS Tb data for assimilation into (quasi-)operational systems.
Successful examples of SMOS Tb assimilation using a variety of simplifying assumptions are illustrated in Lievens et al. (2015), De Lannoy and Reichle (2016), and Kornelsen et al. (2016). These studies use a radiative transfer model (RTM) to dynamically invert Tb information into corrections to modeled soil moisture estimates. In this paper, we advance the spatially distributed multi-angle and dual-polarization Tb assimilation of De Lannoy and Reichle (2016) in the GEOS-5 land surface model with a new version of Tb observations and an improved spatial support and forward simulation of the Tb observation predictions. Moreover, to mimic SMAP Tb assimilation we also assimilate dual-polarization single-angle 40° SMOS Tb observations after fitting the multi-angle Tb data (De Lannoy et al., 2015).
A key disadvantage of a system that assimilates SM retrievals is that the SM retrievals may be produced with inconsistent ancillary data, for example soil temperature simulated by a model other than the one used in the assimilation system. The current SMOS SM retrievals by themselves have been found to be skillful (Al-Yaari et al., 2014; Fascetti et al., 2016), and research is ongoing to further improve them (Rodriguez-Fernandez et al., 2015; Ye et al., 2015; Zhao et al., 2015; van der Schalie et al., 2016; Wigneron et al., 2016). The use of these SMOS SM retrievals has been manifold, e.g., to derive enhanced estimates of precipitation (Wanders et al., 2015; Koster et al., 2016), to derive offline root-zone soil moisture estimates (Ford et al., 2014), or to offline downscale the data to higher-resolution soil moisture estimates (Piles et al., 2014). Other studies have assimilated SMOS SM retrievals online into land surface models to possibly downscale the retrievals and consistently improve soil moisture and other land surface variables (Ridler et al., 2014; Zhao et al., 2014; Lievens et al., 2015), leading to, e.g., improved estimates of floods (Alvarez-Garreton et al., 2015) and crop growth (Chakrabart et al., 2014). In this paper, we use a spatially distributed assimilation system to integrate SMOS SM retrievals into the GEOS-5 land surface model with the aim of inferring improved surface and root-zone soil moisture estimates. Our study mainly differs from the above SMOS SM retrieval studies in the continental and multi-year scale of the experiments, in the advanced quality screening and spatial support of the SM retrieval observations, and in the comparison between Tb and SM retrieval assimilation (also discussed in Lievens et al., 2015).
To assess the potential of Tb and SM retrieval assimilation, 5 years of SMOS Tb data or SM data are assimilated into the GEOS-5 land surface model using careful data quality control and data preprocessing. The observations are associated with a realistic antenna pattern, containing 50 % of the signal power in a circular area with 20 km radius. Special attention is paid to large-scale patterns of random and persistent forecast and observation errors in the different assimilation systems, and to the impact of the different assimilation schemes on the skill of surface and root-zone soil moisture estimates. Section 2 describes the SMOS observations, the various modeling components, and the in situ validation data. Section 3 highlights the technical differences between the various assimilation schemes, and Sect. 4 presents the results.
2 Data and model

2.1 SMOS Tb observations
The Microwave Imaging Radiometer with Aperture Synthesis (MIRAS) onboard SMOS provides multi-angle Tb data, with a nominal (3 dB) spatial resolution of 43 km and a global coverage approximately every 3 days (at either 06:00 or 18:00 local time, i.e., ascending or descending half-orbits, separately). The most recent version (v620) of the SCLF1C Tb data is used. Observations are retained for further processing only (a) in the alias-free zone, (b) when the data are not contaminated by point source radio frequency interference (RFI) or tails thereof, (c) when the values fall within the range 100-320 K, and (d) when valid data are available for both horizontal (H) and vertical (V) polarization. The flag for snapshot RFI is not activated, because it is currently [...]
Hydrol. Earth Syst. Sci., 20, 4895-4911, 2016, www.hydrol-earth-syst-sci.net/20/4895/2016/
From these preprocessed Tb data, two datasets are derived for assimilation: (i) a seven-angle Tb dataset, with incidence angles θ = [30, 35, 40, 45, 50, 55, 60]° (De Lannoy et al., 2013), and (ii) a fitted Tb dataset (De Lannoy et al., 2015) from which only the Tb at a 40° incidence angle is used to mimic the single-angle nature of SMAP Tb observations. We refer to these datasets as Tb_7ang and Tb_fit, respectively. Tb_fit data are only retained when the fitting error is less than 5 K and a minimum of 15 data points contribute to the entire fitted angular signature, with at least 5 data points above and below the 40° incidence angle and at least 10 data points in the incidence angle interval between 30 and 50°.
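The Tb_fit retention rule can be written down compactly. The following is only a minimal sketch, not the authors' preprocessing code: the function name `keep_tb_fit` and its arguments are hypothetical, and it assumes that "at least 5 data points above and below" means at least 5 on each side of 40° and that the 30-50° interval is inclusive.

```python
def keep_tb_fit(angles_deg, fit_error_k):
    """Return True if a fitted angular signature passes the Tb_fit screening.

    angles_deg  : incidence angles (degrees) of the Tb data points used in the fit
    fit_error_k : fitting error of the angular signature (K)
    Assumption: "above and below 40 deg" is read as >= 5 points on each side.
    """
    n_total = len(angles_deg)
    n_below = sum(1 for a in angles_deg if a < 40.0)
    n_above = sum(1 for a in angles_deg if a > 40.0)
    n_core = sum(1 for a in angles_deg if 30.0 <= a <= 50.0)
    return (
        fit_error_k < 5.0      # fitting error less than 5 K
        and n_total >= 15      # at least 15 points in the full signature
        and n_below >= 5       # at least 5 points below 40 deg
        and n_above >= 5       # at least 5 points above 40 deg
        and n_core >= 10       # at least 10 points between 30 and 50 deg
    )


# Hypothetical signature sampled every 2 deg from 25 to 55 deg (16 points).
example_angles = [25.0 + 2.0 * i for i in range(16)]
print(keep_tb_fit(example_angles, fit_error_k=3.0))
```

A signature that lacks low incidence angles fails the test, which is the situation described later for the outer portions of the swath.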

2.2 SMOS SM retrieval observations
The SMOS SM retrievals are extracted from the SMUDP2 product v552. Because this product version ends in early May 2015, we limit our study period to 1 July 2010-1 May 2015. (The reprocessed v620 version of the SM retrievals was not yet available at the time we conducted the experiments.) The SMOS retrieval algorithm simultaneously retrieves soil moisture and vegetation opacity, by fitting multi-angle Tb observations at both H- and V-polarization with simulations of the L-band Microwave Emission of the Biosphere Model (L-MEB, Wigneron et al., 2007). Based on the quality information provided within the SMOS products, the SM data are retained only if (a) all retrieved variables fall within a realistic range (0-0.6 m³ m⁻³ for soil moisture), (b) the SM uncertainty estimated by the SMOS retrieval algorithm is less than 0.1 m³ m⁻³, (c) the RFI probability for both H- and V-polarization is less than 0.3, and (d) SM retrieval flags are not raised for high topographic complexity, high urban fraction, high open water fraction, sea ice, coastal areas, and high total electron content. Further screening for frozen temperature and snow is based on GEOS-5 model output (Sect. 2.3). After the regridding from the 15 km DGG grid to the 36 km cylindrical EASEv2 grid, the data are screened for excessive sub-36 km heterogeneity (spatial standard deviation > 0.2 m³ m⁻³). SM values for a given 36 km EASEv2 grid cell are computed only if at least two valid DGG observations are available.
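As an illustration of the last two screening steps, the sketch below aggregates the valid DGG retrievals that fall in one 36 km grid cell. It is a simplified reading of the text, not the operational regridding: the function name `aggregate_sm` is hypothetical, and the unweighted mean and the population standard deviation are assumptions, since the text does not specify either.

```python
from statistics import mean, pstdev


def aggregate_sm(dgg_values, max_std=0.2, min_count=2):
    """Aggregate DGG soil moisture retrievals (m3 m-3) to one 36 km grid cell.

    dgg_values : retrievals inside the cell; None marks a screened-out value
    Returns the cell value, or None if the cell is rejected.
    Assumptions: unweighted mean and population standard deviation.
    """
    valid = [v for v in dgg_values if v is not None]
    if len(valid) < min_count:
        return None  # fewer than two valid DGG observations
    if pstdev(valid) > max_std:
        return None  # excessive sub-36 km heterogeneity
    return mean(valid)


print(aggregate_sm([0.20, None, 0.30]))
```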

2.3 Soil moisture and brightness temperature modeling
The land data assimilation system used here employs the GEOS-5 catchment land surface model (CLSM; Koster et al., 2000), along with an L-band tau-omega radiative transfer model (RTM; De Lannoy et al., 2013, 2014b). The CLSM simulations use GEOS-5 parameters (Mahanama et al., 2015; De Lannoy et al., 2014a) similar to those used in the SMAP L4_SM product, and are forced with 1/2° × 2/3° GEOS-5 forcing data from MERRA (Rienecker et al., 2011) bilinearly interpolated to the model grid. The study domain covers most of North America, with the northwestern corner at (125° W, 55° N) and the southeastern corner at (60° W, 24° N).
The computational elements are the 36 km EASEv2 grid cells. The land model computation time step is 7.5 min, and output is saved at 3 h intervals. At each grid cell, the surface soil moisture content (sfmc, 0-5 cm) and root-zone soil moisture content (rzmc, 0-100 cm) are diagnosed based on three prognostic variables: catchment deficit (catdef), root-zone excess (rzexc), and surface excess (srfexc). Similarly, the surface (skin) temperature is diagnosed from the prognostic land surface temperatures across the saturated (tc1), unsaturated (tc2), and wilting (tc4) sub-grid areas. Finally, the soil temperature (tp1 for the topmost layer) is diagnosed from the prognostic ground heat content (ght1 for the top layer). An overview of the model variables is given in Reichle et al. (2015), Koster et al. (2000), and Ducharne et al. (2000).
The L-band tau-omega RTM converts the 36 km CLSM soil moisture and temperature simulations into 36 km L-band Tb estimates when the soil is not frozen or covered with snow, when precipitation is less than 10 mm day⁻¹, and where the open water fraction is less than 5 %. For each 36 km grid cell, key parameters of the RTM are estimated by minimizing Eq. (B.1) in De Lannoy et al. (2014b), using a 5-year history of SMOS v620 Tb data, and computing observation predictions (see below) at the footprint scale. Specifically, all 36 km grid cells within one footprint area are initially assigned the same set of RTM parameters, while the dynamic background information is spatially variable. For each 36 km grid cell, the calibration estimates a spatially homogeneous set of RTM parameters for the entire associated footprint area, and the resulting values are assigned to the central (and typically dominant) 36 km grid cell only. For the forward calculation of the Tb observation predictions during the data assimilation, all 36 km pixels have a unique set of RTM parameters. The RTM is calibrated using all 5 years of available Tb data and aims at minimizing climatological biases. The data assimilation is performed over the same 5 years and aims at addressing random (or short-term) errors. The methodology is very similar to that in De Lannoy and Reichle (2016), but with the difference that, here, the RTM does not simulate atmospheric contributions (because the Tb observations are now a priori corrected for atmospheric contributions) and the observation predictions are now spatially aggregated using a realistic (but approximate) antenna pattern.
For the computation of differences between SMOS observations and footprint-scale model simulations in the RTM calibration and for the computation of the "observation-minus-forecast" (O-F) residuals in the assimilation system (Sect. 3.1, Fig. 1), the modeled 36 km soil moisture or Tb simulations are aggregated to the footprint scale by spatial convolution with weights given by an approximation of the SMOS antenna pattern. We also refer to these spatially aggregated model estimates as "observation predictions". The SMOS antenna pattern is approximated by a two-dimensional Gaussian function containing 50 % of the signal within a circle with a radius of 20 km. The simulations outside a radius of 40 km are discarded in the computation of the footprint-scale estimates.
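The Gaussian approximation fixes the width of the weighting function. For an isotropic two-dimensional Gaussian with standard deviation σ, the fraction of the signal inside radius r is 1 − exp(−r²/(2σ²)); setting this to 0.5 at r = 20 km gives σ = 20/√(2 ln 2) ≈ 17 km. The sketch below applies such weights to gridded values. It is an illustration under stated assumptions rather than the system's actual operator: the isotropy, the use of cell-centre distances, the renormalization after truncation at 40 km, and all function names are assumptions.

```python
import math

R50_KM = 20.0     # radius containing 50 % of the signal (from the text)
CUTOFF_KM = 40.0  # simulations beyond this radius are discarded (from the text)
# Isotropic 2-D Gaussian: 1 - exp(-r^2 / (2 sigma^2)) = 0.5 at r = R50_KM.
SIGMA_KM = R50_KM / math.sqrt(2.0 * math.log(2.0))


def antenna_weight(distance_km):
    """Unnormalized Gaussian weight of a grid cell at a given distance."""
    if distance_km > CUTOFF_KM:
        return 0.0
    return math.exp(-distance_km ** 2 / (2.0 * SIGMA_KM ** 2))


def observation_prediction(values, distances_km):
    """Weighted average of gridded simulations (Tb or SM) over one footprint.

    values       : simulated values at the 36 km grid cells
    distances_km : distance of each cell centre to the footprint centre
    Weights are renormalized over the cells inside the cutoff (assumption).
    """
    weights = [antenna_weight(d) for d in distances_km]
    total = sum(weights)
    if total == 0.0:
        return None
    return sum(w * v for w, v in zip(weights, values)) / total


# Hypothetical footprint: central cell plus two neighbours 36 km away.
print(round(SIGMA_KM, 2))
print(observation_prediction([250.0, 260.0, 240.0], [0.0, 36.0, 36.0]))
```

With 36 km cells, the central cell dominates the weighted average, which is consistent with the statement that single-cell and footprint-scale values differ only slightly.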
The number of 36 km EASEv2 grid cells included in one footprint area varies with latitude. The circular footprint shape is preserved everywhere on the globe. In contrast, the shape of the EASEv2 grid cells projected on the globe varies with the latitude, with an aspect ratio of 1 at 30° (north-south) latitude, larger than 1 towards the poles and less than 1 towards the Equator. Therefore, at higher latitudes multiple EASEv2 grid cells with the same latitude and various longitudes belong to one circular footprint, whereas towards the Equator, several EASEv2 grid cells with the same longitude and various latitudes contribute to the footprint. Overall, the difference between single 36 km simulations and footprint-scale values is small, but the number of valid Tb observation predictions at the footprint scale is reduced, because of the increased likelihood of finding a 36 km grid cell with a non-negligible water fraction, snow amount, or precipitation within the footprint area.
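The latitude dependence of the grid cell shape follows from the map projection. The sketch below assumes that the cylindrical EASEv2 grid is a cylindrical equal-area projection with a standard parallel of 30°, which is consistent with the aspect ratio of 1 at 30° stated above; the function name and the definition of the aspect ratio as north-south extent divided by east-west extent are choices made for this illustration.

```python
import math

STANDARD_PARALLEL_DEG = 30.0  # assumed standard parallel of the projection


def cell_aspect_ratio(lat_deg):
    """North-south extent divided by east-west extent of a grid cell on the ground.

    In a cylindrical equal-area projection, a square map cell is compressed
    east-west by cos(lat)/cos(lat0) and stretched north-south by the inverse,
    so the ratio is (cos(lat0) / cos(lat))**2.
    """
    c0 = math.cos(math.radians(STANDARD_PARALLEL_DEG))
    c = math.cos(math.radians(lat_deg))
    return (c0 / c) ** 2


# Southern edge, standard parallel, and northern edge of the study domain.
for lat in (24.0, 30.0, 55.0):
    print(lat, round(cell_aspect_ratio(lat), 2))
```

Under this assumption, cells at the northern edge of the study domain (55° N) are more than twice as tall as they are wide, so a circular footprint there spans several cells along a line of constant latitude.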

2.4 In situ soil moisture data and metrics
The assimilation results are evaluated using independent in situ measurements of surface and root-zone soil moisture from two sparse networks across the US: the US Natural Resources Conservation Service Soil Climate Analysis Network (SCAN; Schaefer et al., 2007) and the US Climate Reference Network (USCRN; Diamond et al., 2013; Bell et al., 2013). Surface soil moisture measurements are taken at approximately 5 cm depth. Root-zone soil moisture measurements are a weighted average of measurements at 5, 10, 20, and 50 cm depth, with respective weights of 0.1, 0.1, 0.27, and 0.53. Given the difference in spatial support between these point measurements and the 36 km gridded model and assimilation results, the skill is quantified in terms of anomaly time series correlation (anomR) and unbiased root-mean-square difference (RMSD_ub; Entekhabi et al., 2010), using all 3 h forecast and analysis time steps in the period 1 July 2010-1 May 2015, excluding times when the soil is frozen (top layer soil temperature < 274.15 K) or snow covered (snow water equivalent > 0 kg m⁻²). The anomaly correlation is based on anomaly time series obtained by subtracting a multi-year smoothed climatology from both the simulations and in situ observations. Note that the assimilation and open-loop simulations have, by design, the same climatological variability; the assimilation only corrects for random errors. Metrics at a single site are only calculated if at least 200 data points are available. Skill metrics across an entire network are calculated by clustering the sites within SCAN and USCRN to avoid densely sampled areas dominating the validation metrics and to ensure realistic confidence intervals (De Lannoy and Reichle, 2016). The number of clusters is estimated a priori after prescribing an average cluster radius of 3°, which approximately reflects the autocorrelation length of large-scale topographic and meteorological phenomena, or of large-scale soil moisture patterns (Vinnikov et al., 1996). The actual size of the clusters that results from the clustering algorithm varies strongly in space.
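The site-level quantities described in this section can be sketched as follows. This is an illustrative implementation with hypothetical function names, not the validation code used for the paper: the climatologies are taken as precomputed inputs (the smoothing is not reproduced), and the clustering and confidence intervals are omitted.

```python
import math

# Depth (cm) -> weight of the in situ root-zone average, as stated in the text.
ROOT_ZONE_WEIGHTS = {5: 0.1, 10: 0.1, 20: 0.27, 50: 0.53}


def root_zone_average(sm_by_depth):
    """Weighted root-zone soil moisture from measurements at 5, 10, 20, 50 cm."""
    return sum(w * sm_by_depth[depth] for depth, w in ROOT_ZONE_WEIGHTS.items())


def _mean(values):
    return sum(values) / len(values)


def unbiased_rmsd(sim, obs):
    """RMSD after removing the long-term mean difference between sim and obs."""
    ms, mo = _mean(sim), _mean(obs)
    return math.sqrt(_mean([((s - ms) - (o - mo)) ** 2 for s, o in zip(sim, obs)]))


def anomaly_correlation(sim, obs, sim_clim, obs_clim, min_points=200):
    """Correlation of anomalies; None if fewer than min_points pairs exist.

    sim_clim, obs_clim : climatological values matching each time step
    (assumed to be precomputed multi-year smoothed climatologies).
    """
    if len(sim) < min_points:
        return None
    a = [s - c for s, c in zip(sim, sim_clim)]
    b = [o - c for o, c in zip(obs, obs_clim)]
    ma, mb = _mean(a), _mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# A constant offset between simulation and observation leaves RMSD_ub at zero.
print(unbiased_rmsd([0.10, 0.20, 0.30], [0.30, 0.40, 0.50]))
```

Both metrics are insensitive to a constant bias, which is why they suit a comparison between point measurements and 36 km grid cell averages.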
3 Data assimilation

3.1 Distributed ensemble Kalman filter
For both Tb and SM retrieval assimilation, a spatially distributed (or three-dimensional, 3-D) ensemble Kalman filter (EnKF; Reichle and Koster, 2003; De Lannoy and Reichle, 2016) is used. This system simultaneously assimilates multiple spatially distributed observation sets, using horizontal and vertical error covariance structures, to update the simulations at each 36 km model grid cell. The details of the Tb assimilation system are explained in De Lannoy and Reichle (2016) and differ only in that the observations are here associated with a spatially variable antenna pattern reaching out to a radius of 40 km.
During the model integration, a data assimilation step is activated every 3 h. All the SMOS observations $y_i$ collected within 1.5 h of the analysis time $i$ are assimilated simultaneously to update the forecasted state $\hat{x}^{j-}_{k,i}$ at location $k$ as follows:

$$\hat{x}^{j+}_{k,i} = \hat{x}^{j-}_{k,i} + K_{k,i}\left[y^{j}_{i} - \hat{y}^{j-}_{i}\right], \qquad (1)$$

with $j$ denoting the ensemble member, $K_{k,i}$ the Kalman gain, $\hat{y}^{j-}_{i} = h_i(\hat{x}^{j-}_{i})$ the observation predictions, and $h_i(.)$ the observation operator mapping the simulated land surface variables to observed quantities. Bias in the observation-minus-forecast residuals is addressed prior to the analysis (Sect. 3.2). The ensemble is created by perturbing the model forcing, the model forecasts, and the observations (Sect. 3.3). The Kalman gain is calculated as

$$K_{k,i} = \mathrm{Cov}(\hat{x}^{-}_{k,i}, \hat{y}^{-}_{i})\left[\mathrm{Cov}(\hat{y}^{-}_{i}, \hat{y}^{-}_{i}) + R_i\right]^{-1}, \qquad (2)$$

where $\mathrm{Cov}(\hat{x}^{-}_{k,i}, \hat{y}^{-}_{i})$ is the (sample) error covariance (across the ensemble) between the forecasted land surface state and the forecasted Tb or SM. Similarly, $\mathrm{Cov}(\hat{y}^{-}_{i}, \hat{y}^{-}_{i})$ is the (sample) error covariance of the Tb or SM forecasts, and $R_i$ is the Tb or SM observation error covariance. The Kalman gain is identical for all ensemble members.
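To make the update and the gain concrete, the sketch below performs one analysis step for a single state variable and a single observation, so that the matrix inverse in the gain reduces to a division. It is a deliberately reduced illustration with hypothetical names, not the 3-D filter used here: there is no spatial localization, no multi-variate state, and the observation perturbations are passed in rather than generated.

```python
def sample_cov(a, b):
    """Sample covariance across the ensemble (divides by N - 1)."""
    n = len(a)
    ma = sum(a) / n
    mb = sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (n - 1)


def enkf_update(x_forecast, y_predicted, y_perturbed, obs_error_var):
    """One scalar EnKF analysis step.

    x_forecast    : forecast state, one value per ensemble member
    y_predicted   : observation predictions h(x), one per member
    y_perturbed   : perturbed observations, one per member
    obs_error_var : observation error variance R
    Returns the analysis ensemble and the (member-independent) Kalman gain.
    """
    gain = sample_cov(x_forecast, y_predicted) / (
        sample_cov(y_predicted, y_predicted) + obs_error_var
    )
    x_analysis = [
        x + gain * (y - y_hat)
        for x, y_hat, y in zip(x_forecast, y_predicted, y_perturbed)
    ]
    return x_analysis, gain


# Hypothetical three-member soil moisture ensemble with an identity operator.
x_f = [0.20, 0.25, 0.30]
x_a, k = enkf_update(x_f, x_f, [0.35, 0.35, 0.35], obs_error_var=0.0025)
print(k, x_a)
```

In this toy case the forecast and observation error variances are equal, so the gain is 0.5 and each member moves halfway towards its observation.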
In the case of SM retrieval assimilation, the observation operator $h_i(.)$ performs the spatial aggregation of soil moisture simulations from the 36 km grid cells to the satellite footprint; in the case of Tb data assimilation, the observation operator includes both the RTM and the spatial aggregation of gridded Tb simulations to the footprint (Sect. 2.3). For the Tb_7ang assimilation, one observation set at location $\kappa$ contains Tb observations at a maximum of seven angles and both H- and V-polarization, i.e., up to 14 individual observations $y_{\lambda,\kappa,i} \in y_{\kappa,i}$. The subscript $\lambda$ refers to the polarization and incidence angle of the individual Tb observations. In the middle part of the swath, all 14 observations are typically available, whereas slightly fewer observations are available in the outer portions of the swath, where the observations with lower incidence angles are missing.
For the Tb_fit assimilation, one observation set usually contains two observations, i.e., both H- and V-polarization Tb at a 40° incidence angle. For the SM retrieval assimilation, each observation set contains only one observation. In all cases, the observation vector $y^{j}_{i}$ collects multiple perturbed observation sets that are spatially distributed within an influence radius of 1.25° around the model grid cell $k$, and each observation vector $y^{j}_{i}$ has a forecasted counterpart $\hat{y}^{j-}_{i}$. After removal of the persistent errors (Sect. 3.2) from the O-F residuals (or innovations), the increments $K_{k,i}[y^{j}_{i} - \hat{y}^{j-}_{i}]$ are calculated and applied to the state variables. Figure 1 illustrates the forward simulation from 36 km gridded land surface simulations to footprint-scale observation predictions of Tb and the downscaling of the footprint-scale Tb innovations to 36 km gridded land surface increments.
The subset of prognostic variables updated in Eq. (1) differs depending on the assimilation experiment. The state vector for Tb assimilation (x = [catdef, srfexc, rzexc, tc1, tc2, tc4, ght1]^T) includes prognostic variables related to soil moisture and soil temperature (Sect. 2.3), because Tb observations are by definition sensitive to surface soil moisture and temperature. In contrast, the state vector for SM retrieval assimilation (x = [catdef, srfexc, rzexc]^T) contains only model prognostic variables related to soil moisture, because the SM retrievals do not carry direct information about the soil temperature. The selected updates will be propagated to all other variables within the land surface modeling system through energy and water exchange between various soil layers and land-vegetation-atmosphere compartments. For the discussion of the soil moisture increments we will focus on the total profile water increments (Δwtot = Δsrfexc + Δrzexc − Δcatdef) in units of kg m⁻² (that is, mm of water equivalent). This quantity is easily understandable and thus simplifies the discussion.
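The sign convention of the total profile water increment deserves a brief illustration: the catchment deficit measures missing water, so a negative catdef increment adds water to the profile. The helper below is a hypothetical one-line sketch of that bookkeeping, with made-up increment values.

```python
def total_profile_water_increment(d_srfexc, d_rzexc, d_catdef):
    """Total profile water increment (kg m-2, i.e. mm of water equivalent).

    The catchment deficit increment enters with a negative sign because a
    smaller deficit means more water in the profile.
    """
    return d_srfexc + d_rzexc - d_catdef


# Hypothetical increments: both excess terms increase and the deficit shrinks.
print(total_profile_water_increment(1.0, 2.0, -3.0))
```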
Figures 2 and 3 illustrate the concept for Tb assimilation and SM retrieval assimilation, respectively. Figure 2a-b show swaths of footprint-scale bias-corrected Tb_fit innovations (mapped onto the 36 km EASEv2 grid), for H- and V-polarization at a 40° incidence angle from the single-angle Tb assimilation system. The Tb innovations are then transformed into soil moisture and temperature increments using Eq. (1). Where Tb innovations are warm, the soil water is reduced and the temperature is increased. Figure 2c shows the total profile water increments Δwtot and Fig. 2d shows increments to the first soil layer temperature tp1. Increments to the surface temperature prognostic variables (Sect. 2.3; tc1, tc2, tc4) are similar (not shown). Finally, the increments are added to the forecasted fields to create spatially complete analysis maps of surface and root-zone soil moisture, as well as surface temperature and soil temperature (Fig. 2e-g).
Similarly, Fig. 3a shows the SM innovations from the SM retrieval assimilation at the same time as in Fig. 2. Areas with positive (wet) SM innovations in the SM retrieval assimilation roughly correspond to negative (cold) Tb innovations in the Tb assimilation system (Fig. 2a-b). Note that the color bars for Tb and SM throughout the paper are chosen according to the rule of thumb that a 2-3 K change in Tb corresponds to a 0.01 m³ m⁻³ change in soil moisture, but keep in mind that the relationship between Tb and SM is nonlinear and varies with time, location, and incidence angle. Next, the SM innovations are converted to soil moisture increments (Δwtot; Fig. 3b); no increment to surface or soil temperature is calculated. Figures 2c and 3b show that the Tb and SM retrieval assimilation systems produce Δwtot increments with somewhat different large-scale patterns, which is further discussed in Sect. 4.2. Finally, Fig. 3c-d show the resulting surface and root-zone soil moisture analysis fields obtained by adding the increments to the model forecast fields. For both the Tb and SM retrieval assimilation systems, the analysis increments blend smoothly into the forecast fields; that is, the analysis maps do not show sharp spatial edges that would reveal the geometry of the assimilated satellite swaths. Further details about this figure are discussed in Sect. 4.1.

3.2 Tb and SM innovation bias
To limit the long-term biases between Tb observations and simulations, the RTM was calibrated (Sect. 2.3). The CLSM soil moisture was not calibrated for lack of global observations that would support such an effort and because modeled soil moisture does not necessarily represent soil moisture as observed in the field anyway (Koster et al., 2009). Unlike biases in Tb innovations, the biases in the SM innovations are more stationary and do not depend on seasonal temperature variations. Therefore, the SM innovation biases are not corrected seasonally, but instead cumulative distribution function (CDF) matching between the observations and simulations is performed (Reichle and Koster, 2004) to reconcile the differences in long-term mean, variance, and higher moments, as in earlier retrieval assimilation studies (Liu et al., 2011; Draper et al., 2012). The observed and simulated SM CDFs are computed for the entire study period, i.e., for 1 July 2010-1 May 2015, at each 36 km grid cell individually.
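The idea behind CDF matching can be illustrated with empirical distributions: an observed value is replaced by the model value that has the same non-exceedance probability. The sketch below uses a nearest-rank mapping between two sorted records at one grid cell. It is a minimal illustration with hypothetical names and values; implementations in the cited studies may fit or smooth the CDFs differently.

```python
import math
from bisect import bisect_right


def cdf_match(value, obs_record, model_record):
    """Rescale one observation into the model climatology by CDF matching.

    obs_record, model_record : multi-year records at one grid cell
    Assumption: empirical CDFs with a nearest-rank quantile lookup.
    """
    obs_sorted = sorted(obs_record)
    model_sorted = sorted(model_record)
    # Number of observed values not exceeding the input value.
    rank = bisect_right(obs_sorted, value)
    # Same non-exceedance probability expressed as an index in the model record.
    idx = math.ceil(rank * len(model_sorted) / len(obs_sorted)) - 1
    idx = min(max(idx, 0), len(model_sorted) - 1)
    return model_sorted[idx]


# Hypothetical records: the observations are wetter and more variable.
obs = [0.10, 0.20, 0.30, 0.40]
model = [0.15, 0.20, 0.25, 0.30]
print(cdf_match(0.30, obs, model))
```

Because the whole distribution is mapped, the rescaled observations share the mean, variance, and higher moments of the model record, which is the property the text relies on.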

3.3 Random forecast and observation error
The imposed ensemble forecast perturbations for Tb and SM retrieval assimilation are identical to those of De Lannoy and Reichle (2016) and not repeated here. The total observation error standard deviation for SMOS Tb_7ang is set to 6 K, which yields near-optimal assimilation diagnostics on average across the globe. However, the diagnostics are not necessarily near-optimal in individual regions (De Lannoy and Reichle, 2016). The input observation error standard deviation for SM retrievals is 0.04 m³ m⁻³, in line with the soil moisture accuracy requirement for the recent SMOS and SMAP missions. The SM retrieval error standard deviation is rescaled following the CDF matching of the SM observations and results in an effective mean error standard deviation of 0.02 m³ m⁻³, with larger values in the wetter eastern part, which exhibits a higher temporal variability in soil moisture simulations, and lower values in the drier, western part of the study domain (not shown). In all cases, the spatial observation error correlation length is 0.25°. In the case of multi-angle Tb_7ang assimilation, interangular error correlations are imposed as in De Lannoy and Reichle (2016).
Observation errors in Tb data or SM retrievals are a combination of instrument error and representation error (Cohn, 1997; van Leeuwen, 2015). The 6 K Tb error consists of a radiometric error of about 4 K for individual incidence angles (instrument error) plus 4.5 K representation inaccuracies (in our system, i.e., based on the near-optimal 6 K observation error) due to errors in the RTM, the spatial aggregation, or other discrepancies between Tb observations and forecasts ($6 \approx \sqrt{4^2 + 4.5^2}$). For Tb_fit observations, the instrument error may be slightly reduced compared to that for Tb_7ang after the angular smoothing, but the representation error remains similar. SM observations contain retrieval errors due to errors in the RTM and in the input L1 Tb observations, as well as representation error due to, e.g., the inherently different nature of simulated and observed soil moisture (Koster et al., 2009). In either case, the representation error depends on the soil moisture and temperature dynamics and should ideally be modeled as a function of time and location, but we chose a constant input observation error standard deviation in this paper for simplicity. For SM retrieval assimilation, some spatial error variability is introduced after rescaling in line with the CDF matching.
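The error budget above combines independent error sources in quadrature. The sketch below reproduces that arithmetic and adds a simple way in which an error standard deviation could be rescaled alongside CDF matching. The second function is an assumption made for illustration: it scales the error by the ratio of the model and observation time series standard deviations, which the text does not spell out, and the numbers passed to it are hypothetical.

```python
import math


def total_error_std(instrument_std, representation_std):
    """Combine independent instrument and representation errors in quadrature."""
    return math.sqrt(instrument_std ** 2 + representation_std ** 2)


def rescale_error_std(obs_error_std, model_std, obs_std):
    """Rescale an observation error into model space (assumed linear scaling).

    model_std, obs_std : temporal standard deviations of the simulated and
    observed soil moisture at one grid cell (hypothetical inputs).
    """
    return obs_error_std * model_std / obs_std


# Tb error budget from the text: 4 K instrument plus 4.5 K representation error.
print(round(total_error_std(4.0, 4.5), 2))
# Hypothetical cell where the model varies half as much as the retrievals.
print(rescale_error_std(0.04, model_std=0.03, obs_std=0.06))
```

Under this assumed scaling, cells with more variable simulations keep a larger effective observation error, in line with the east-west contrast described above.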

3.4 Tb or SM retrieval assimilation
In our experiments, we do not expect the SMOS Tb and SM retrieval assimilation systems to yield the same results. During the SMOS L2 SM retrieval optimization, the Tb data are used to estimate surface soil moisture and vegetation opacity, given soil temperature background fields provided by the European Centre for Medium-Range Weather Forecasts (ECMWF) and look-up parameter information that differs significantly from the NASA GEOS-5 land data assimilation system. In contrast, our SMOS Tb assimilation scheme estimates soil moisture and temperature, given vegetation information. Furthermore, the data screening is necessarily different for Tb data and SM retrievals, and the approach for bias correction is intentionally different. The soil moisture information extracted during the L2 retrieval process or Tb assimilation is thus by design expected to be different. Finally, differences in the Tb and SM retrieval assimilation results could also be due to differences in how close each of the systems is to an optimal calibration of its model and observation error parameters.

4 Results
4.1 Observation and forecast diagnostics

4.1.1 Number of assimilated observations
Let us revisit Figs. 2a-b and 3a to further highlight some differences between the various assimilated SMOS observations. First, the swath width for Tb innovations is much narrower than that of the SM innovations because the assimilated Tb observations are strictly limited to the alias-free zone within the full swath, while the assimilated SM retrievals are retained in the extended alias-free zone. Furthermore, the swath width of the Tb_fit innovations is narrower than that of the multi-angle assimilation (not shown) because the fitting requires sufficient data at a range of incidence angles and lower angle data are not available at the outer edges of the swaths. Note that SMAP provides useable Tb measurements over a much wider swath (not shown).
The different swath widths result in different numbers of observation sets assimilated in each of the three experiments. Figure 4a-c show the average number of assimilated observation sets (defined in Sect. 3.1) over the study period 1 July 2010-1 May 2015. The number of observation sets is smallest (one every 4 days) for Tb_fit and largest for SM retrievals (one every 2 days), because the swath width is narrowest for Tb_fit and widest for SM retrievals. The northern areas and the western mountain ranges have the fewest observations, because data are not used when the soil is frozen or snow covered. Tb observations are not assimilated in many small areas scattered around the study domain, where more than 5 % of open water is found in the footprint, based on the underlying GEOS-5 land mask. For the SM retrievals, the screening for an excessive (> 5 %) water fraction is only based on the product science flags, not on GEOS-5 information. Data gaps in the SM retrievals are found in the western mountain ranges and in the vegetated southeastern part of the US. The data coverage is also different for Tb and SM retrieval assimilation because the availability of the climatological information needed for the innovation bias correction (Sect. 3.2) is different for the Tb and SM retrieval observations.

Actual observation and forecast errors
The long-term mean observation-minus-forecast differences (O-F, or innovations) are unbiased by design (Sect. 3.2). The Hovmüller plots for two data assimilation cases in Fig. 5 reveal that the temporal pattern in area-averaged biases is fairly random for the Tb_7ang assimilation case (very similar for Tb_fit assimilation, not shown), whereas it shows a slight seasonal pattern in the SM retrieval assimilation case. This small difference is not surprising, given that the Tb innovation bias is seasonally corrected, whereas the SM innovation bias is not.
The time series standard deviation of the innovations, that is, the root-mean-square difference (RMSD) between SMOS observations and simulations, represents the total observation and forecast error that is present in the assimilation system (Desroziers et al., 2005). The spatial patterns of this diagnostic are very different for Tb and SM retrieval assimilation. Figure 4d-e show values of about 7.4 K for Tb_7ang and Tb_fit, with larger values (exceeding 10 K) in the central plains and along the Mississippi, where agricultural practices, such as altering crop rotation and irrigation, are observed by SMOS, whereas interannual variations in vegetation are not simulated by the model or provided as input to the model. Along the eastern coast and in the southeast, the temporal standard deviation in the innovations is low (2-3 K): forests show a limited interannual variability, and under dense vegetation Tb is only marginally sensitive to soil moisture and depends primarily on vegetation characteristics and (physical) temperature.
The standard deviation in the SM innovations in the SM retrieval assimilation (Fig. 4f) is 0.03 m³ m⁻³, showing larger values in the wetter vegetated east and smaller values in the drier west, with the exception of the western coast. Surprisingly, even though altering crop rotation and irrigation are not simulated, the values over the central agricultural area are not higher than elsewhere in the domain. This good agreement between SMOS SM retrievals and our simulations is partly due to the bounded nature of SM (unlike Tb) and the CDF matching between both.
Our current system has a Tb sensitivity to soil moisture of about 1.3 K per 0.01 m³ m⁻³ across the domain, averaged over all incidence angles and polarizations. A standard deviation in SM innovations of 0.03 m³ m⁻³ would thus roughly correspond to a standard deviation in Tb innovations of about 4 K, but instead we find 7.4 K across the study domain in the Tb assimilation systems. The Tb observations thus either have a comparably higher observation (including representation) error or they contain more information than the SM retrievals. At this point, we anticipate that the larger Tb innovations in the central plains may indicate that the Tb observations contain more unfiltered information about soil moisture (e.g., irrigation) and that the Tb observation error is higher due to shortcomings, e.g., in the vegetation modeling (representation error).
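The conversion above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the numbers quoted in the text; the function and variable names are illustrative and are not part of the assimilation system.

```python
# Convert an SM innovation standard deviation into an equivalent Tb value,
# using the domain-average sensitivity quoted in the text.
SENSITIVITY_K = 1.3      # Tb change in K ...
SENSITIVITY_SM = 0.01    # ... per this soil moisture change in m3 m-3


def sm_std_to_tb_std(sm_std):
    """Linear conversion of a soil moisture std dev (m3 m-3) to Tb (K)."""
    return sm_std * SENSITIVITY_K / SENSITIVITY_SM


tb_equivalent = sm_std_to_tb_std(0.03)   # SM innovation std dev from Fig. 4f
excess_ratio = 7.4 / tb_equivalent       # actual Tb innovation std dev / equivalent

print(f"Tb-equivalent of the SM innovations: {tb_equivalent:.1f} K")
print(f"Actual Tb innovation std dev is {excess_ratio:.1f} times larger")
```

The linear conversion ignores the dependence of the sensitivity on vegetation, roughness, angle, and polarization, so it only supports the order-of-magnitude comparison made in the text.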

Actual vs. simulated observation and forecast errors
In a near-optimal filtering system, that is, a system that correctly simulates the actual model and observation errors, the standard deviation of the normalized innovations (the innovations divided by the square root of the corresponding diagonal element λλ of the simulated total forecast and observation error covariance) is close to unity (Reichle et al., 2002). Figure 4g-i show this metric. Averaged across the domain (and across all angles and polarizations for Tb assimilation), this metric is 1.14, 1.11, and 1.23 (-) for Tb_7ang, Tb_fit, and SM retrieval assimilation, respectively. The figure thus suggests that, on average, the simulated errors in the assimilation system only slightly underestimate the actual errors. But the figures also show that the metric varies strongly across the domain and exhibits very different spatial patterns for Tb and SM retrieval assimilation. For Tb_7ang and Tb_fit assimilation, values are much larger than 1 in the central area and much smaller than 1 in the eastern forested area. This indicates that the assigned observation and forecast errors are severely underestimated in the central area and overestimated in the eastern forested area. Over forests, it can be assumed that the assigned representation error (part of the observation error) should be smaller. The Tb forecast error is already very small (see below), because the Tb uncertainty is only marginally sensitive to soil moisture uncertainties under dense vegetation. For SM retrieval assimilation, the pattern is reversed, with the largest values in the eastern half of the domain, suggesting that here the simulated errors underestimate the actual errors. Values less than 1 are found in most of the western half of the domain, where the SM retrieval assimilation seems to overestimate the actual errors.
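As an illustration of this diagnostic, the sketch below computes the standard deviation of normalized innovations for a scalar observation type. It assumes the common normalization by the square root of the sum of the simulated forecast and observation error variances, and it uses synthetic Gaussian innovations rather than SMOS data; all names are hypothetical.

```python
import math
import random
import statistics


def normalized_innovation_std(innovations, forecast_var, obs_var):
    """Std dev of (O-F) / sqrt(forecast_var + obs_var) for scalar observations."""
    scale = math.sqrt(forecast_var + obs_var)
    return statistics.pstdev([d / scale for d in innovations])


# Synthetic, unbiased innovations with an "actual" std dev of 7.4 K (illustrative).
random.seed(0)
ACTUAL_STD = 7.4
innovations = [random.gauss(0.0, ACTUAL_STD) for _ in range(20000)]

# Case 1: assigned errors smaller than the actual errors (2 K forecast, 6 K observation).
underestimated = normalized_innovation_std(innovations, 2.0 ** 2, 6.0 ** 2)

# Case 2: assigned errors that match the actual total error.
calibrated = normalized_innovation_std(innovations, 0.0, ACTUAL_STD ** 2)

print(f"underestimated errors: {underestimated:.2f}")  # larger than 1
print(f"calibrated errors:     {calibrated:.2f}")      # close to 1
```

Values above 1 flag locations where the simulated errors are too small, and values below 1 flag locations where they are too large, which is how the maps in Fig. 4g-i are read.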
To further interpret the actual and simulated error magnitudes, Fig. 4j-k show the ensemble spread in the Tb forecasts (that is, the simulated forecast error standard deviation). Averaged across all angles and polarizations λ, the values are around 2 K when averaged across the entire domain. Larger values (3 K) are found in the central and dry western part, and smaller values (1 K) in the wetter eastern part. This pattern is similar for the SM ensemble spread in the SM retrieval assimilation system (Fig. 4l). In dry climates, the root-zone soil moisture often drops to the wilting point, remains stagnant and no longer replenishes the surface. This results in increased sensitivity of the surface soil moisture to perturbations in meteorological conditions, and thus in higher uncertainty estimates for surface soil moisture in dry climates.
Given that the Tb observation error [R_{κ,i}]_λλ is set to 6 K for each individual angle, polarization, and overpass time in the Tb assimilation, the approximate total assigned observation and forecast error is 6.3 K (√(6² + 2²)) across the study domain, 6.7 K (√(6² + 3²)) in the central area, and about 6 K (√(6² + 1²)) in the eastern Appalachian area. Because the assigned observation error is uniformly set to 6 K, the spatial variability in the total simulated errors is thus too small compared to the actual errors (Fig. 4d-e), which range from more than 10 K in the central area to around 2-3 K in the eastern Appalachian area.
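The quadrature sums of the assigned observation error and the ensemble spread can be evaluated directly. This is a small sketch with the rounded values from the text; the helper name is illustrative.

```python
import math


def total_assigned_error(obs_err_k, forecast_err_k):
    """Combine uncorrelated observation and forecast error std devs in quadrature."""
    return math.hypot(obs_err_k, forecast_err_k)


OBS_ERR_K = 6.0  # uniform Tb observation error (K), from the text

# Approximate ensemble spread (K) per region, from Fig. 4j-k as quoted in the text.
spread_by_region = {"domain average": 2.0, "central area": 3.0, "eastern area": 1.0}

totals = {
    region: total_assigned_error(OBS_ERR_K, spread)
    for region, spread in spread_by_region.items()
}

for region, total in totals.items():
    print(f"{region}: {total:.2f} K")
```

The range of these totals is well under 1 K, whereas the actual innovation standard deviations span roughly 2 K to more than 10 K, which is the mismatch discussed above.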
The SM observation error (after rescaling) is 0.02 m³ m⁻³ on average across the domain, with higher values in the eastern part and lower values in the western part, with the exception of Mexico, California, and western Oregon, where higher observation errors are found (Sect. 3.3). This general pattern is reversed in the SM forecast errors. Combined, the spatial variability in the SM observation and forecast errors does not capture the spatial variability in the actual errors (Fig. 4f), which leads to an overestimation of the errors in the west and an underestimation in the east.

Spatio-temporal patterns
The Kalman filter translates footprint-scale innovations into 36 km increments. Because of the spatially distributed (3-D) filtering (Sect. 3.1), the number of increments in Fig. 6a-c is about 1.4 times the number of assimilated observation sets (Fig. 4a-c). Many areas with missing observations (or observation predictions) are filled through interpolation and extrapolation. With SM retrieval assimilation, there is almost one increment per day.
The individual components of the wtot increments are shown in Fig. 6g-i for the surface excess (srfexc) increments, Fig. 6j-l for the root-zone excess (rzexc) increments, and Fig. 6m-o for the catchment deficit (catdef) increments. The patterns in wtot increments are dominated by catdef increments, and they generally reflect the patterns in the respective innovations' standard deviations (Fig. 4d-f), which are very different for Tb and SM retrieval assimilation. The catdef increments pertain to the entire profile depth (which typically ranges between 2 and 3 m) and they presumably have a relatively small impact on the upper 5 cm soil layer (surface soil moisture): the domain-averaged magnitude of 5.4, 4.9, and 3.5 mm for catdef increments due to Tb_7ang, Tb_fit or SM retrieval assimilation, respectively (Fig. 6m-o), would linearly scale to about 0.1 mm for a 5 cm soil layer. This is a rough approximation: in reality the part of catdef that contributes to the 5 cm soil moisture cannot be calculated without computing the entire balanced profile. However, the approximate 0.1 mm is considerably less than the 0.6, 0.4, and 0.4 mm for the corresponding srfexc increments (Fig. 6g-i), which are directly applied to the upper 5 cm soil layer. The increments in rzexc (Fig. 6j-l) are comparatively the smallest, because this variable is not perturbed by design. Both Tb and SM retrieval assimilation show similar spatial patterns in the standard deviations of srfexc increments (Fig. 6g-i): the largest increments are found in the dry west and the smallest in the wetter east. The patterns in srfexc increments agree with the patterns in the ensemble forecast uncertainty for this variable (not shown, but implied by the Tb and soil moisture uncertainty in Fig. 4j-l). The srfexc values are small with small uncertainties, and the increments are thus similarly bounded in both Tb and SM retrieval assimilation, yielding comparable spatial increment patterns.
Finally, Fig. 7 compares spatially and temporally collocated wtot, srfexc, and rzexc increments obtained with Tb_7ang assimilation, Tb_fit assimilation, and SM retrieval assimilation; i.e., the figure shows all pairs of increments available from two assimilation cases. The scatter plots show that the increments are usually small and unbiased. The correlation between the wtot increments (Fig. 7a) obtained by Tb_7ang and Tb_fit assimilation is 0.7, and aligns with the expectation that either Tb assimilation experiment roughly corrects for the same events. In contrast, the correlation between the increments obtained by Tb_7ang and SM retrieval assimilation is only 0.3 (Fig. 7b). The figure is similar when comparing the Tb_fit and SM retrieval assimilation (not shown). For srfexc and rzexc (Fig. 7c-f), the increments are again similar for Tb_7ang and Tb_fit assimilation, but different for Tb and SM retrieval assimilation. For all soil moisture prognostic variables, Tb assimilation leads to larger increments than SM retrieval assimilation. The different assimilation systems thus introduce distinct corrections to the modeled soil moisture trajectories.
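The linear scaling of the catdef increments to a 5 cm layer can be reproduced as follows. This is only a sketch of the rough approximation described above: the 2.5 m profile depth is an assumed midpoint of the 2-3 m range, and the names are illustrative.

```python
def scale_to_surface_layer(profile_increment_mm, profile_depth_m, layer_depth_m=0.05):
    """Distribute a profile-integrated increment uniformly and keep the top-layer share."""
    return profile_increment_mm * layer_depth_m / profile_depth_m


PROFILE_DEPTH_M = 2.5  # assumption: midpoint of the typical 2-3 m profile depth

# Domain-averaged increment magnitudes (mm) quoted in the text (Fig. 6m-o and 6g-i).
catdef_mm = {"Tb_7ang": 5.4, "Tb_fit": 4.9, "SM": 3.5}
srfexc_mm = {"Tb_7ang": 0.6, "Tb_fit": 0.4, "SM": 0.4}

scaled_mm = {
    name: scale_to_surface_layer(value, PROFILE_DEPTH_M)
    for name, value in catdef_mm.items()
}

for name in catdef_mm:
    print(f"{name}: catdef share {scaled_mm[name]:.2f} mm vs srfexc {srfexc_mm[name]:.1f} mm")
```

Under this uniform-distribution assumption the catdef contribution to the top 5 cm stays near 0.1 mm for all three experiments, well below the srfexc increments.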
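The comparison in Fig. 7 rests on two steps: collocating the increments of two experiments in space and time, and computing a Pearson correlation over the collocated pairs. The sketch below illustrates both steps with a handful of made-up increments keyed by (grid cell, time step); the data and the names are hypothetical.

```python
import math


def collocate(inc_a, inc_b):
    """Keep only the (grid cell, time step) keys present in both experiments."""
    keys = sorted(set(inc_a) & set(inc_b))
    return [inc_a[k] for k in keys], [inc_b[k] for k in keys]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical wtot increments (mm) from two experiments.
inc_tb_7ang = {(0, 0): 1.0, (0, 1): -2.0, (1, 0): 0.5, (1, 1): 3.0, (2, 0): -1.0}
inc_tb_fit = {(0, 0): 0.8, (0, 1): -1.5, (1, 1): 2.0, (2, 0): -0.2, (3, 0): 4.0}

x, y = collocate(inc_tb_7ang, inc_tb_fit)
r = pearson(x, y)
print(f"{len(x)} collocated pairs, R = {r:.2f}")
```

Increments that exist in only one experiment, for example because of the different swath widths, drop out of the comparison at the collocation step.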

Discussion
In a nutshell, Eq. (1) states that the increments are given by the product of the Kalman gain and the innovations. To explain the differences in increment patterns between Tb and SM retrieval assimilation, we must therefore consider each system's innovations and Kalman gains. The relatively larger magnitude of the Tb innovations compared to the SM innovations (Sect. 4.1.2) contributes to the fact that the Tb assimilation results in larger soil moisture increments. This is the case even though the SM retrieval assimilation (unlike Tb assimilation) applies increments only to moisture variables and does not adjust modeled temperatures.
Furthermore, the Kalman gain matrices K_{k,i} (Eq. 2) for Tb and SM retrieval assimilation are different because the two systems employ different observation operators h_i(.) and different observation error covariances R_i. First, we note that the nonlinear inversion of Tb innovations to soil moisture increments, driven by the RTM in the observation operator, is not responsible for the larger wtot increments in the central grass and crop areas, because these areas exhibit low values for the microwave roughness parameter (h < 0.2, not shown) and a high sensitivity of Tb to soil moisture (as confirmed by the high forecast Tb errors in Fig. 4j-k). That is, in these areas commensurately large Tb innovation (O-F) values result in only small updates to soil moisture.
Second, the choice of a spatially uniform observation error covariance in the Tb assimilation experiment creates an imprint of the innovation pattern in the increment pattern. Higher increments are found in the agricultural areas with large Tb innovation standard deviations (Fig. 4d-e), because irrigation is not modeled and vegetation is not accurately parameterized. Since the filter is not set up to correct the latter, occasional excessive increments to soil moisture and temperature may be introduced. Such shortcomings could be mitigated by a more sophisticated assignment of Tb observation (representation) errors.
For SM retrieval assimilation, the pattern of the SM innovation standard deviation (RMSD) is similarly visible in the increments, with smaller values in the west and higher values in the east. Here again, the true spatio-temporal nature of the observation errors is not captured in the assigned observation error covariance and therefore propagated into the increments. Note also that the 0.03 m³ m⁻³ SM innovation standard deviation (top 5 cm, Fig. 4f) is translated into a standard deviation of profile moisture increments of 0.002 m³ m⁻³ (Fig. 6f rescaled by profile depth), but these increments are not equally distributed; i.e., larger increments are found for surface soil moisture and smaller increments for the deeper profile.
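The relation between gain, innovation, and increment discussed in this section can be illustrated with a scalar ensemble update. The sketch below is not the 3-D EnKF of this study: it assumes a single soil moisture state, a hypothetical linear observation operator with the 1.3 K per 0.01 m³ m⁻³ sensitivity, and a made-up five-member ensemble.

```python
import statistics


def enkf_increment(state_ens, obs_pred_ens, obs, obs_err_std):
    """Scalar EnKF mean update: gain from ensemble covariances, times (O - F)."""
    n = len(state_ens)
    x_mean = statistics.fmean(state_ens)
    y_mean = statistics.fmean(obs_pred_ens)
    cov_xy = sum(
        (x - x_mean) * (y - y_mean) for x, y in zip(state_ens, obs_pred_ens)
    ) / (n - 1)
    var_y = statistics.variance(obs_pred_ens)
    gain = cov_xy / (var_y + obs_err_std ** 2)
    innovation = obs - y_mean
    return gain, innovation, gain * innovation


# Hypothetical soil moisture ensemble (m3 m-3) and linear Tb operator (K).
sm_ens = [0.18, 0.20, 0.22, 0.24, 0.26]
tb_pred = [280.0 - 130.0 * sm for sm in sm_ens]  # wetter soil gives lower Tb

# Same gain, two innovation sizes: uniform R lets the innovation pattern pass through.
gain, innov_small, inc_small = enkf_increment(sm_ens, tb_pred, 241.4, 6.0)
_, innov_large, inc_large = enkf_increment(sm_ens, tb_pred, 231.4, 6.0)

print(f"gain = {gain:.5f} m3 m-3 per K")
print(f"innovation {innov_small:.1f} K -> increment {inc_small:.4f} m3 m-3")
print(f"innovation {innov_large:.1f} K -> increment {inc_large:.4f} m3 m-3")
```

With a fixed observation error, doubling the innovation doubles the increment, which is the imprint of the innovation pattern on the increment pattern noted above. A locally larger observation error would shrink the gain and damp that imprint.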

In situ validation
The above discussion highlights similarities and stark contrasts in how the Tb and SM retrieval assimilation systems operate. In this section, we look at the effect of these differences on the skill of the assimilation estimates vs. in situ observations. Figure 8 shows the RMSD_ub (Sect. 2.4) for the model-only open-loop (OL) simulation, and the change in RMSD_ub between the OL simulation and either the Tb_7ang or SM retrieval data assimilation (DA) experiment (ΔRMSD_ub = RMSD_ub(DA) - RMSD_ub(OL)) at individual SCAN and USCRN sites, for the period 1 July 2010-1 May 2015. The gray background shading indicates areas with modest topographic complexity and vegetation cover and where the satellite observations are most sensitive to surface soil moisture (details in De Lannoy and Reichle, 2016). The OL simulation has an average RMSD_ub value of 0.054 m³ m⁻³ for surface soil moisture and 0.039 m³ m⁻³ for root-zone soil moisture. Looking more closely, the RMSD_ub values are generally higher in the central and wetter eastern regions. In dry areas, the RMSD_ub is limited, because the time series show a limited variability for lack of much precipitation. On average, both assimilation experiments introduce improvements at about 80 % of the sites for surface soil moisture, with spatially averaged ΔRMSD_ub values of −0.004 and −0.003 m³ m⁻³ for Tb_7ang and SM retrieval assimilation, respectively. (Spatial average metrics are computed using a cluster-based algorithm, Sect. 2.4.) The improvements are also propagated to the root-zone soil moisture (65 % of sites improved) with smaller average ΔRMSD_ub values of −0.002 and −0.001 m³ m⁻³, respectively.
The domain-average ΔRMSD_ub values caused by assimilation are only barely statistically significant for surface soil moisture in "favorable" areas, i.e., where the satellite observations are most sensitive to soil moisture (indicated with gray background shading in Fig. 8). The differences between Tb_7ang, Tb_fit, or SM retrieval assimilation are not significant. The assimilation contributes an average relative improvement in surface soil moisture of 7 % of the OL RMSD_ub in favorable locations and 4 % in non-favorable areas. Both Tb and SM retrieval assimilation show improvements in the central and eastern parts of the US, but perform poorly in the western dry mountain areas, where the RMSD_ub for the OL was small and the assimilation may have introduced some additional noise. The Tb_7ang assimilation shows the largest improvements in the central US, whereas the SM retrieval assimilation shows the largest improvements in the southeastern part, for both surface and root-zone soil moisture. It is possible that the Tb assimilation has a larger impact in the central US than the SM retrieval assimilation, because irrigation events may be filtered in the SM retrievals (and perhaps partly assigned to vegetation opacity retrievals).
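To make the skill metric concrete, the sketch below computes an unbiased RMSD and its change due to assimilation for one site. It assumes the usual definition, in which the mean difference is removed before the root-mean-square is taken (the exact definition used in this study is given in Sect. 2.4), and it uses made-up time series.

```python
import math


def ub_rmsd(model, insitu):
    """RMSD between two series after removing their mean difference (bias)."""
    n = len(model)
    bias = sum(m - o for m, o in zip(model, insitu)) / n
    return math.sqrt(sum((m - o - bias) ** 2 for m, o in zip(model, insitu)) / n)


# Hypothetical surface soil moisture series (m3 m-3) at one site.
insitu = [0.20, 0.25, 0.30, 0.22, 0.18]
open_loop = [0.28, 0.27, 0.33, 0.33, 0.24]
assimilation = [0.26, 0.29, 0.35, 0.29, 0.23]

rmsd_ol = ub_rmsd(open_loop, insitu)
rmsd_da = ub_rmsd(assimilation, insitu)
delta = rmsd_da - rmsd_ol  # negative values indicate an improvement

print(f"OL: {rmsd_ol:.3f}, DA: {rmsd_da:.3f}, change: {delta:.3f} m3 m-3")
```

Because the bias is removed first, a constant offset between model and in situ data does not affect the metric; only errors in the temporal variations count.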
The bar plots in Fig. 9 summarize the average anomR values for the open-loop and data assimilation experiments, after stratifying all SCAN and USCRN sites into "favorable" and "non-favorable" categories (gray vs. white background in Fig. 8). The figures show that the open-loop anomR values for surface soil moisture are similar for both the favorable and non-favorable areas (0.51 and 0.50, respectively). However, data assimilation has a larger impact in favorable areas, where all assimilation schemes introduce significant improvements (anomR = 0.63, 0.61, and 0.59 for Tb_7ang, Tb_fit, and SM retrieval assimilation). In non-favorable areas, the improvements are smaller but still significant (anomR = 0.57, 0.56, and 0.54, for Tb_7ang, Tb_fit, and SM retrieval assimilation).
In the root zone, data assimilation also improves the skill over the open-loop simulations, but without statistical significance. The open-loop simulations yield anomR values of 0.56 and 0.50 in favorable and non-favorable areas, respectively. In favorable areas, the assimilation increases the anomR to 0.64, 0.64, and 0.62, for Tb_7ang, Tb_fit, and SM retrieval assimilation. In non-favorable areas, the skill improvement is limited and the anomR values are 0.54, 0.54, and 0.52, for Tb_7ang, Tb_fit, and SM retrieval assimilation. In any case, with assimilation, all anomR values exceed 0.5, meaning that the skill becomes better than a climatological forecast (Brier skill score larger than 0). Overall, the skill metrics are comparable for the Tb_7ang and Tb_fit assimilation (Fig. 9). The results from SM retrieval assimilation are slightly worse than those from Tb assimilation, which may indicate that Tb observations indeed still contain more information (Sect. 4.2) than the SM retrievals, which are implicitly filtered during the retrieval process. However, the differences between the domain-averaged skill values of the various assimilation schemes are minimal. Furthermore, when running the assimilation scheme with different spatially constant Tb observation error parameters, the skill metrics only changed marginally. This shows that our skill metrics are relatively insensitive to uniform changes in the data assimilation parameters. One reason for this is that the skill metrics are presented as (clustered) spatial averages, which compensate for large local differences. It is expected that the skill of our data assimilation systems can only be further improved by using a more localized (in space and time) approach to optimizing the assimilated observations (e.g., L2 SM retrievals) and the forecast and observation error parameters in the EnKF.

Finally, unlike Liu et al. (2011), the skill improvements in this study are smaller when we correct the re-analysis precipitation input with gauge-based precipitation data (Reichle and Liu, 2014). This and other recent improvements in the GEOS-5 modeling system make it increasingly challenging to obtain significant skill improvements from the assimilation of microwave observations over areas for which high-quality forcing data are available, such as the domain studied here. The benefits of the microwave-based soil moisture assimilation system are expected to be greater in areas with poorer ancillary inputs to the modeling system. This aspect will be further investigated through the validation of the global SMAP L4_SM data product.

Conclusions
The SMOS and SMAP satellite missions currently provide a wealth of L-band data to monitor large-scale soil moisture.
A key question is how to make the best use of these data in current land surface data assimilation systems. The L1 Tb data from these missions are often complex, because of their multi-polarization and possibly multi-angle nature and their indirect connection with soil moisture. In theory, the best approach is to directly assimilate Tb observations using a consistent data assimilation system, but a correct global characterization of the Tb forecast and observation errors remains difficult. The L2 SM retrievals are easily handled products, but their assimilation is impacted by errors introduced by inconsistent ancillary information in the SM retrieval algorithm and the assimilation system. With further improvements in the assimilated retrievals and careful selection of the ancillary data, SM retrieval assimilation may become a coequal alternative.
Hydrol. Earth Syst. Sci., 20, 4895-4911, 2016, www.hydrol-earth-syst-sci.net/20/4895/2016/
Three different data products from the SMOS mission are assimilated separately into the GEOS-5 land surface model to improve estimates of surface and root-zone soil moisture and to study the workings of each assimilation system. The first product consists of L1-based data of multi-angle, dual-polarization Tb observations at the bottom of the atmosphere. The second product is a derived 40° Tb product that mimics SMAP data. The third product is the operational L2 SM dataset. Special care is taken during quality control and processing of the satellite observations prior to assimilation and within the assimilation system. The Tb assimilation uses a distributed EnKF with a temporally variable Tb bias mitigation, a system that is also used for the SMAP L4_SM product (Reichle et al., 2016). The SM retrieval assimilation uses a similar system, but with CDF matching instead to eliminate the more stationary SM innovation biases. The study covers most of North America for the period of 1 July 2010-1 May 2015.
The Tb and SM innovations show very different spatial patterns and the number of assimilated observations differs because of different needs for data screening and bias mitigation. Based on the average sensitivity of Tb to soil moisture, the magnitude of the Tb innovations is comparably larger than that of the SM innovations, which may either introduce more information or more error into the Tb assimilation system. The Tb and SM retrieval assimilation schemes also yield surprisingly different spatio-temporal increment patterns, leading to very different adjustments to the modeled soil moisture trajectories. Despite these stark differences, the various assimilation schemes yield soil moisture estimates with similar average skill metrics, computed from a set of 187 SCAN and USCRN sites across the US. Compared to in situ observations, both Tb and SM retrieval assimilations yield anomaly correlations around or larger than 0.6 for both the surface and root-zone soil moisture in "favorable" areas, where the satellite data are expected to better represent the soil moisture conditions, i.e., in areas with limited topographic complexity and limited vegetation. The anomaly correlation with data assimilation is between 0.5 and 0.6 in non-favorable areas. The data assimilation introduces significant improvements over the model-only simulations for surface soil moisture everywhere, but the improvements are much larger in favorable areas. For the root zone, improvements are also found, but without statistical significance. While no significant differences in domain-averaged skills can be found between the various assimilation systems, there are large local differences in performance between the Tb and SM retrieval assimilation which may be due to differences in information content and screening of the observations, and differences in how close each of the systems is to an optimal calibration of its model and observation error parameters. Therefore, we expect that soil moisture data assimilation systems can be further improved only if the systems manage to better simulate the spatial and temporal variations of the actual errors in the model and the observations. Furthermore, the SM retrieval assimilation results will benefit from any future improvement in the SM retrievals.
In line with our findings for the SMOS data assimilation, we anticipate that future versions of the Tb assimilation system for the SMAP L4_SM product may benefit from an improved characterization of spatial model and observation error structures, and from a better representation of some modeling components, such as vegetation. In addition, given that SMOS and SMAP both provide L-band Tb observations, future assimilation systems should consider a joint assimilation of SMOS and SMAP Tb data. In such a system, it is important to consider the different instrument, Tb processing, and Tb error characteristics of the two L-band missions (De Lannoy et al., 2015).

Data availability
The SMOS data are distributed by ESA. The model and assimilation results can be obtained from the authors upon request.

Figure 1. Flowchart of Tb assimilation. The forward simulation consists of (a) land surface model simulations and (b) Tb simulations on the 36 km EASEv2 grid. The Tb simulations are subsequently (c) aggregated using weights based on an approximate antenna pattern. The resulting footprint-scale brightness temperature observation predictions are compared to (d) SMOS observations to calculate innovations (O-F) at the footprint scale. (e) The three-dimensional EnKF maps the footprint-scale innovations to the 36 km EASEv2 grid based on the modeled error correlations between the footprint-scale Tb and the 36 km soil moisture and soil temperature state variables (per Eqs. 1 and 2).

Figure 4. Observation-space assimilation diagnostics for the period from 1 July 2010 to 1 May 2015. Number of assimilated observation sets for (a) Tb_7ang assimilation, (b) Tb_fit assimilation, and (c) SM retrieval assimilation. Standard deviation of the (d) Tb innovations from Tb_7ang assimilation, (e) Tb innovations from Tb_fit assimilation, and (f) SM innovations from SM retrieval assimilation. (g, h, i) Same as (d, e, f), but for normalized innovations (normO-F). Ensemble standard deviation of the (j) Tb forecast error for Tb_7ang assimilation, (k) Tb forecast error for Tb_fit assimilation, and (l) surface soil moisture forecast error for SM retrieval assimilation. The titles show the spatial mean (m) and standard deviation (s) across each map.

Figure 5. Hovmüller plots showing the temporal evolution of longitudinally averaged innovations (O-F) for the period from 1 July 2010 to 1 May 2015. (a) Tb_7ang innovations, averaged over H- and V-polarization, ascending and descending swaths, and over seven incidence angles. (b) SM innovations, averaged over ascending and descending swaths.

Figure 6. Statistics of the increments, calculated for the period from 1 July 2010 to 1 May 2015. Number of increments per day for (a) Tb_7ang assimilation, (b) Tb_fit assimilation, and (c) SM assimilation. Temporal standard deviation of total profile water (wtot) increments for (d) Tb_7ang assimilation, (e) Tb_fit assimilation, and (f) SM assimilation. (g, h, i) Same as (d, e, f) but for srfexc increments. (j, k, l) Same as (d, e, f) but for rzexc increments. (m, n, o) Same as (d, e, f) but for catdef increments. The titles show the spatial mean (m) and standard deviation (s) across each map.

Figure 7. Spatially and temporally collocated analysis increments from (a, c, e) Tb_fit assimilation and (b, d, f) SM retrieval assimilation vs. the same from Tb_7ang assimilation for (a, b) profile-integrated wtot increments, (c, d) srfexc increments, and (e, f) rzexc increments. Increments are from the period 1 July 2010 to 1 May 2015. The plot range is limited to the maximum value of 10 times the standard deviation in either experiment, and divided into 100 even sample bins. Colors indicate the number of sample points within each 1.5, 0.13, or 0.44 mm bin for wtot, srfexc, and rzexc, respectively. R is the spatio-temporal Pearson correlation coefficient between the individual increments from two assimilation experiments.

Figure 8. Unbiased RMSD (RMSD_ub) for the model-only open-loop (OL) simulation, and change in unbiased RMSD (ΔRMSD_ub) due to data assimilation at (circles) SCAN and (triangles) USCRN sites for (a, b, c) surface and (d, e, f) root-zone soil moisture. The skill of (a, d) the open-loop simulation is the reference value for the changes in skill due to (b, e) Tb_7ang and (c, f) SM retrieval assimilation. Statistically significant changes are marked by larger symbols (e.g., the southeastern US for SM retrieval assimilation). Metrics are calculated across 3 h time steps during the period from 1 July 2010 to 1 May 2015. The titles indicate the spatial mean (Δ)RMSD_ub across all sites with clustering (31 clusters). The gray background shading marks areas with limited vegetation and topographic complexity based on model parameters.

Figure 9. Performance of open-loop and data assimilation experiments in terms of anomaly correlations (anomR) calculated across 3 h analyses and forecast time steps from 1 July 2010 to 1 May 2015 for (a) surface and (b) root-zone soil moisture. The bars show skill metrics averaged over sites in either favorable or non-favorable areas, where favorable areas refer to the areas indicated by the gray background shading in Fig. 8. The variable N is the total number of SCAN and USCRN sites considered for each category, with the number of clusters in parentheses. The error bars reflect cluster-averaged 95 % confidence intervals.