Evaluating the utility of satellite soil moisture retrievals over irrigated areas and the ability of land data assimilation methods to correct for unmodeled processes

Introduction Conclusions References


Introduction
Examples of human-induced land surface changes include urbanization, deforestation, and agriculture, all of which have significant impacts on local and regional water and energy budgets and hydrologic and biogeochemical processes. The expansion of infrastructure and agriculture, necessitated by increasing societal demands, has led to significant transformation of the natural features of the land surface, affecting more than 50 % of the land area (Hooke et al., 2012). Most current land surface models are not only severely deficient in representing the impacts of such engineered artifacts but are also limited in representing features of many natural systems such as seasonal flood plains and wetlands. Remote sensing measurements offer a potential alternative for capturing the effects of such unmodeled processes. Moreover, data assimilation, which is a common approach to merge the information from observations with model estimates, may provide 4464 S. V. Kumar et al.: Utility of soil moisture retrievals for irrigation detection a possible mechanism for incorporating the effects of such unmodeled processes into model estimates.
Irrigation is an important land management practice that has had a significant impact on the global and regional water budgets. As noted in Gordon et al. (2005), the global increase in water vapor flows from irrigation is comparable to the decrease caused by deforestation. It has been estimated that as much as 87 % of the global fresh water withdrawals by humans have been used for agriculture (Douglas et al., 2009), which leads to significant alteration of the global and regional hydrological cycle. Though recent studies have reported the development of conceptual representations of irrigation in land surface models (de Rosnay et al., 2003;Ozdogan et al., 2010;Leng et al., 2013;Lawston et al., 2015), capturing and representing the nature of irrigation practices remains a hard problem. Therefore, in this article we focus on irrigation as an analog of a human-engineered process that is typically not represented in land surface models.
There is a long legacy of retrieving estimates of surface soil moisture from satellite microwave radiometry using a variety of sensors (Jackson, 1993;Njoku and Entekhabi, 1995). In the past decade, near-surface soil moisture retrievals have become available from a number of passive microwave and scatterometer-based platforms. They include Advanced Microwave Scanning Radiometer-Earth Observing System (AMSR-E) aboard the Aqua satellite, WindSat multifrequency polarimetric microwave radiometer aboard the Coriolis satellite, the Advanced Scatterometer (ASCAT), a C-band active microwave remote sensing instrument aboard the Meteorological Operational (METOP) satellites, the Advanced Microwave Scanning Radiometer 2 (AMSR2) onboard the Global Change Observation Mission-Water (GCOM-W) satellite, the Soil Moisture Ocean Salinity (SMOS) mission, and the Soil Moisture Active Passive (SMAP) mission. Except for AMSR-E, which stopped functioning in October 2011, all these instruments are currently providing measurements of surface soil moisture. In this article, we first examine if the satellite soil moisture retrievals are effective in capturing the effects of irrigation. The comparison is performed through a quantitative comparison of the probability density functions (PDFs) of remote sensing data sets against those of land surface model simulations that do not include formulations of irrigation.
The second focus of the article is to examine whether the current data assimilation practices are adequate if unmodeled processes such as irrigation are present in observations. When irrigation practices are employed, they lead to wetter soil moisture relative to non-irrigated time periods. Assuming that the physical model used in data assimilation does not have irrigation formulations in it, the assimilated observations would have a systematic bias (relative to the model) during irrigation periods. In real data assimilation systems, biases between model forecasts and observations are unavoidable, and they typically result from a combination of model deficiencies, instrument and retrieval errors.
Proper treatment of these biases is important, as the assimilation methods are primarily designed to work with errors that are strictly random (Dee and da Silva, 1998). Here we evaluate the impact of using common bias correction practices when unmodeled processes are the primary source of the biases between the model and observations.
Primarily, there are two approaches to handling biases in data assimilation systems (Dee, 2005): (1) "bias-aware" systems which are built to diagnose and correct the biases in the observations and/or the model forecasts during data assimilation integration, and (2) "bias-blind" systems which assume the observations and model forecasts to be unbiased. Ideally, biases must be estimated by comparing the observations and/or model states to the true mean states derived, for example, from in situ measurements. However, as noted in Draper et al. (2015), developing spatially distributed bias estimates is much harder for the land surface, compared to the atmosphere or ocean, since point-scale in situ observations are generally not representative of the spatial scale of remotely sensed or modeled states, due to the heterogeneity of land. Though there have been a number of studies that rely on online estimation of biases (De Lannoy et al., 2007;Reichle et al., 2010), the common practice in land data assimilation studies is to remove the bias between the observations and the model and to use a bias-blind assimilation approach to correct only short-lived model errors. This is typically achieved by rescaling the observations prior to assimilation, to have the same statistics as the model, using quantile mapping approaches so that the observational climatology matches that of the land model. This approach is easy to implement as a preprocessing step to the data assimilation system and has been used extensively in many land data assimilation studies (Reichle and Koster, 2004;Drusch et al., 2005;Crow et al., 2005;Reichle et al., 2007;Kumar et al., 2009;Liu et al., 2011;Draper et al., 2011Draper et al., , 2012Hain et al., 2012;Kumar et al., 2012Kumar et al., , 2014. A known disadvantage of the approach is that it assumes stationarity in model-observational biases and cannot easily adjust to dynamic changes in bias characteristics. Common quantile mapping approaches used for scaling observations into the model's climatology include the standard normal deviate based scaling (Crow et al., 2005) and the CDF (cumulative distribution function)-matching method (Reichle and Koster (2004); hereafter referred to as RK04). The standard normal-deviate-based scaling matches the first and second moments of the observation and model distributions, whereas the CDF matching approach corrects all quantile-dependent biases between the model and observations, regardless of the shape of the distributions.
When observations are rescaled prior to assimilation, standard normal deviates or percentiles (rather than the raw observations) are assimilated. This ensures that the model climatology is preserved in data assimilation and that assimilation only affects temporal patterns of the anomalies. In such cases, the influence of assimilation is likely to be greater at the shorter timescales (Kumar et al., 2014).
In this article, we argue that the approach of rescaling the observations could be problematic, particularly when the underlying distributions of the model estimates and the observations are different. Such differences in the distributions are possible when features from human-induced activities such as irrigation are present in observations and missing in modeled estimates. Through a suite of synthetic experiments, we demonstrate the limitations of the rescaling approaches when the reference climatology is fundamentally limited in representing unmodeled processes whose effects nevertheless impact the observations. In such cases, stationarity assumptions about the climatologies could also lead to spurious, statistical features in the assimilation results. As a result, the use of rescaling would become problematic for demonstrating short-term assimilation impacts.
The rescaling approach through CDF-matching for land data assimilation proposed by RK04 was motivated by the fact that the true climatology of soil moisture at the global scale remains unknown. The CDF-matching method is based on similar applications of the method for establishing rainfall-reflectivity relationships for the calibration of radar or satellite observations of precipitation (Calheiros and Zawadzki, 1987;Atlas et al., 1990;Anagnostou et al., 1999). The quantile mapping methods are also widely used for correcting biases of regional climate model simulations relative to observational data (Salathe Jr. et al., 2007;Li et al., 2010). These studies assume that the probability density functions of radar reflectivity and in situ rainfall are equivalent and therefore quantile mapping can be used to translate one into the other. RK04 extended this approach to soil moisture data assimilation by transforming the satellite soil moisture retrievals into the model's climatology for removing the relative biases between the model and observations. The important difference between the precipitation/climate downscaling studies and RK04 is that in the former, the remotely sensed retrievals/climate model data were rescaled to observed data, whereas in RK04 the satellite retrievals were rescaled to a modeled climatology. In this article, we demonstrate that rescaling to a model climatology that is not representative of the observations may distort the scale of the actual observational features and may lead to loss of valuable signals.
There are a number of alternative strategies for bias correction in bias-blind data assimilation systems. Instead of employing a single CDF (at each grid cell) that encapsulates the soil moisture dynamics across all seasons (called lumped CDFs), temporally stratified (monthly or seasonally) CDFs can be used. The finer temporal stratification would help to reduce the impact of statistical artifacts of using lumped CDFs, but would also require sufficient sample sizes to accurately derive CDFs for each temporal window. As demonstrated in Kumar et al. (2012), the land surface model could be calibrated against the retrieval products and the calibrated, unbiased model could then be used in assimilation. Though this strategy eliminates the biases in the variables being as-similated, the climatologies of other outputs from the model could be affected, unless additional constraints are included in calibration. De Lannoy et al. (2013) employed a similar strategy for assimilating SMOS L-band brightness temperatures, by calibrating the forward radiative transfer model parameters and by keeping the land surface model parameters unchanged. This strategy preserves the land surface model climatology but only works when radiance measurements are being assimilated through a forward model. Forman et al. (2014) suggested a similar strategy for assimilating passive microwave brightness temperatures for snow data assimilation through the use of an artificial neural network (ANN). The ANN uses the inputs from the land surface model and is trained to the observed brightness temperatures to be assimilated. The results of Forman et al. (2014) indicated that the ANN could serve as a computationally efficient observation operator instead of more complex radiative transfer models. In the current study, we evaluate the effectiveness of a number of these strategies for land data assimilation when the observations include the effects of processes that are not included in the land surface model.
The article is organized as follows: First, we examine the effectiveness of satellite soil moisture retrieval products in their ability to capture the effects of irrigation (Sect. 2). The evaluation is conducted by quantitatively comparing the probability distribution functions from various remote sensing soil moisture data sets and land surface model simulations. The article then focuses on the impact of various a priori bias correction approaches in data assimilation when the distributions of the model and the observations are significantly different due to unmodeled irrigation processes (Sect. 3). Section 3 presents a synthetic data assimilation experiment that explores the limitations of a suite of a priori bias correction strategies in such scenarios. Finally, Sect. 4 presents a summary and discussion of major conclusions of the study.

Evaluation of satellite remote sensing data over irrigated areas
In this section, we examine the utility of modern soil moisture remote sensing data sets towards the detection of irrigation features. The irrigation practices over the world differ in the method of irrigation, the trigger used and the amount of water used in irrigation. A typical irrigation practice in the US is to apply irrigation throughout the growing season at a level where the plants are not under transpiration stress. The introduction of irrigation at the beginning of the growing season would lead to increased surface soil moisture and a significant dry down would only occur at the end of the growing season when irrigation controls are removed. Figure 1 shows the MODIS-based irrigation grid-cell fraction map (%) derived by Ozdogan and Gutman (2008) and validated against USGS irrigation data. Some of the known To demonstrate the impact of irrigation, land surface model simulations are conducted at a single grid point located in the plains of Nebraska (as shown in Fig. 1). Figure 2 presents the time series of surface soil moisture (using a 10 cm thick surface layer) for a representative year from two simulations of the Noah (version 3.3; Ek et al., 2003) land surface model (LSM). The simulations demonstrate the impact of irrigation at this location: (1) model forced with a given meteorological forcing data (SIM1), and (2) a seasonal irrigation scheme simulated on top of the SIM1 configuration (SIM2). The simulations use the modified 20 category MODIS land cover data (Friedl et al., 2002) and are forced with meteorological boundary conditions from the North American Land Data Assimilation System Phase 2 (NLDAS-2; Xia et al., 2012) data. The initial conditions for the model simulations are generated by spinning up the LSM from 1979 to 2000. The irrigation scheme employed here simulates a demand driven, sprinkler irrigation technique, based on Ozdogan et al. (2010). Irrigation is triggered when the root zone soil moisture falls below the transpiration stress threshold for a particular grid cell. The scheme computes the irrigation requirement as an equivalent height of water, which is then applied as an addition to the precipitation input to the model. The irrigation scheme is only applied to irrigated land types such as crops and grasslands and is only enabled during growing seasons (when 40 % of the annual range of green vegetation fraction at a grid cell is exceeded). In addition, the irrigation requirement is enabled daily between 06:00 and 10:00 LT (local time), similar to the approach used in Ozdogan et al. (2010).
As the seasonal irrigation picks up in April, surface soil moisture in SIM2 gets much wetter compared to the SIM1 integration. Towards the end of the summer, the imposed irrigation is removed, which then causes the soil moisture to dry down and approach the SIM1-based estimates. For the purpose of developing climatologies, the model integrations were conducted for several years (2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012). Similar features are repeated in other years, leading to a seasonal wet bias in SIM2 compared to SIM1, mainly in the summer months.
A comparison of the soil moisture distributions from the two integrations are shown in Fig. 3, which shows a plot of the quantiles of SIM1 data against the quantiles of SIM2 data. For comparison, the figure also shows a 45 • reference line. If the two data sets come from similar distributions, the points in the q-q (quantile-quantile) plot should fall approximately along this reference line and the departure from the reference line indicates differences in the distributions of the two data sets. Figure 3 indicates that there are significant differences in the SIM1 and SIM2 distributions with shifts in location, scale and symmetry. The data points are systematically above the 45 • line indicating that the mean of the distributions are significantly different, with the distribution for SIM2 valued higher than that of SIM1. The q-q plot also shows a bimodal nature due to the seasonal effect of irrigation. Finally, the slope of the points on the q-q plot is higher than 1, indicating that there are differences in the spread or variances of the two distributions as well.
A two-sample Kolmogorov-Smirnov (K-S) test (Chakravarti et al., 1967) can be used to quantitatively compare the probability distributions of two data sets. The K-S statistic quantifies a distance between empirical distribution functions of two samples (F (x) and G(x) where x is the sampled variable) and is computed as follows: where m and n are the sample sizes of F and G. The null distribution of the K-S statistic is calculated under the null hypothesis that samples are drawn from the same distribution. The null hypothesis is rejected at level α if where c(α) is the inverse of the Kolmogorov distribution at α.
To examine the impact of irrigation on soil moisture distributions over a larger spatial domain, the SIM1 and the SIM2 experiments are extended to a larger domain, encapsulating the continental US at 0.125 • spatial resolution. Values of D closer to zero indicate that the soil moisture distributions from the SIM1 and SIM2 integrations are similar. Conversely, larger D values indicate locations where the soil moisture distributions from the two integrations differ. As the difference between the two integrations in this example is only due to the simulation of seasonal irrigation, the locations with positive K-S metric values in Fig. 4 indicate areas where the irrigation artifacts are applied and are consistent with the input irrigation intensity data used in the simulations. The K-S metrics, therefore, can be used to detect instances where the distributions of soil moisture retrievals and the model estimates differ significantly, including differences due to the treatment of irrigation. Figure 5 shows a quantitative comparison of the differences in soil moisture distributions using the K-S metrics from six remote sensing soil moisture retrievals and a land surface model simulation (SIM1 configuration), for the continental US. The remote-sensing-based products are (1) the blended multi-sensor soil moisture product from the European Space Agency (ESA) known as the essential climate variable (ECV) product (Liu et al., 2012b), (2) soil moisture retrievals from AMSR-E using the Land Parameter Retrieval Model (LPRM) algorithm (Owe et al., 2008), (3) soil moisture retrievals from WindSat, (4) soil moisture retrievals from the backscatter measurements acquired by ASCAT, (5) soil moisture retrievals from AMSR2, and (6)   The CDFs for each data set are computed using all available data. The available quality control information in each remote sensing data set is used to exclude data over regions with dense vegetation, radio frequency interference, precipitation and frozen ground. The model CDFs are computed using the simulated surface soil moisture estimates from 2000 to 2013. Differences in the dynamic range between observed and modeled soil moisture are normally removed prior to assimilation and here we remove the differences in the mean and variance prior to calculating the K-S metrics. The data values are normalized first with a standard score approach ((x t i − µ i )/σ i where x t i is the data value at time t and µ i and σ i are the mean and standard deviation of the data at grid point i) before computing the K-S metrics. Figure 5 shows the D estimates from the K-S test for each comparison at grid points where the null hypothesis of the K-S static is not rejected. Not surprisingly, the ECV data comparison shows the lowest D values, possibly due to the fact that the ECV product was generated by CDF matching soil moisture estimates from different sensors to a simulation of the Noah land surface model from the Global Land Data Assimilation System (GLDAS; Rodell et al., 2004). Comparatively, larger differences are seen in all other comparisons, which are likely caused by a mix of biases resulting from instrument error, retrieval algorithm errors, unmodeled processes, and other representativeness differences.
The spatial patterns of these metrics shown in Fig. 5 can also be potentially used as a first measure of whether a sensor captures observational features such as irrigation. Specif-ically, a relatively small K-S metric at a location known to have irrigation suggests that the remotely sensed observations did not detect that irrigation. However, the converse is not necessarily true, in that a large K-S metric does not necessarily indicate successful detection of irrigation (since it could be caused by other model/remotely sensed discrepancies). For example, a strong signal of vegetation density in the eastern US can be noticed in the K-S metric map for AMSR2. Similarly, in the ASCAT K-S metric map, large differences can be observed around several major cities such as Dallas, Houston and Atlanta. We focus on three key hotspots of irrigation in the US shown in Fig. 1: the plains of Nebraska, lower Mississippi Basin and California's Central Valley. Of these three regions, only the lower Mississippi has relatively higher K-S metric values, and only for AMSR-E, AMSR2 and SMOS.
To examine if the spatial patterns of differences in K-S metrics from various remote sensing data sets are in fact representative of observational artifacts such as irrigation, we examine the time series of soil moisture over these regions. For an equivalent comparison given the possible differences in the respective dynamic ranges, each data set is normalized first (using the standard score approach) before comparing them on the same graph. Figure 6 shows the normalized time series of soil moisture from the observations (from SMOS, AMSR2 and ASCAT), the Noah LSM driven with the NLDAS-2 forcing, with and without irrigation for the year 2013. A 5-day moving average is applied to the SMOS, AMSR2 and ASCAT retrievals to reduce the noise in the Hydrol. Earth Syst. Sci., 19, 4463-4478 satellite retrieval time series. As the grid-cell averages for the satellite retrievals reflect averages of the irrigated and nonirrigated pixels, a similar weighted estimate was produced for the LSM-based irrigated soil moisture values. The LSM time series representing irrigation in Fig. 6 is generated by weighting the model soil moisture estimate with and without irrigation for each grid cell by the irrigation fraction and (1 − irrigation fraction) of that grid cell, respectively. Finally, in each region, only grid cells with at least 30 % irrigation fraction indicated by the MODIS map are employed in computing the spatial averages. Figure 6 indicates that the SMOS and AMSR2 retrievals agree more closely with the LSM estimate without irrigation, in all three regions. In particular, there are few indicators of systematic differences between the observations and the model without irrigation in the summer months, suggesting the limited skill of the SMOS and AMSR2 retrievals for detecting features of seasonal irrigation. In contrast, ASCAT retrievals show better agreement with the LSM estimate with irrigation in the summer and fall months, over the plains of Nebraska and lower Mississippi Basin. In these regions, the ASCAT moisture signal shows a wetter trend in the late fall months, which are in agreement with the LSM estimate with irrigation. In California's Central Valley, however, no such distinct contrast due to irrigation is observed in all three satellite retrievals. Similar trends are seen in other years (not shown). From these results, it appears that neither SMOS nor AMSR2 retrievals capture the effects of irrigation, whereas the ASCAT retrievals are somewhat effective in detecting ir-rigation features in the plains of Nebraska and the lower Mississippi Basin.
It is important to note here that the apparent inability of SMOS and AMSR2 to capture the irrigation signal should not be assumed attributable to sensor deficiency; it may instead reflect their larger spatial footprints. The raw resolutions of SMOS and AMSR2 products used here are at least of 40 km, much coarser than the 0.125 • resolution employed in the LSM simulations. Thus, because Fig. 6 focuses on 0.125 • grid cells with at least 30 % irrigation, the SMOS and AMSR2 data (interpolated to that resolution) will necessarily include some soil moisture information from areas outside those defined by the 30 % threshold -areas that are, almost by definition, drier. ASCAT, with a raw resolution of ∼25 km does not seem as affected by this, perhaps in part due to its finer base resolution. Another possible reason may be related to the influence of intercepted water, which has opposite effects on the active and passive sensors. More analysis is needed, however, to understand the different behaviors of the sensors.
Note that in the model formulations, irrigation is simulated consistently from the late spring months to early fall months, though these assumptions about the timing and duration of irrigation in these regions may be imperfect relative to the actual practices in the field. In the plains of Nebraska, the agreement between the ASCAT and model with irrigation is consistent throughout the summer and early fall months (from late June to early October). In the lower Mississippi, on the other hand, the ASCAT time series indicates that the application of irrigation occurs in the later months (from late August onwards). The agreement between the model with irrigation and the ASCAT time series is lower in the early summer months. Though it is hard to ascertain the ability of ASCAT data for characterizing the timing of irrigation, it can be concluded that ASCAT retrievals perform better than the SMOS and AMSR2 retrievals in terms of capturing the anomalous wet soil moisture signals from irrigation over these areas known to be irrigated.

Evaluation of bias correction strategies in the presence of unmodeled processes
This section presents an examination of the effectiveness of a number of a priori bias correction strategies in data assimilation when unmodeled processes (such as irrigation) are a major source of biases between the model and the observations. A synthetic experiment setup based on the SIM1 and SIM2 configurations presented in Sect. 2 is used to explore these issues. If SIM2 represents the observations to be used in assimilation, the typical procedure in data assimilation systems is to rescale SIM2 estimates to the model climatology (SIM1 in this example). Figure 7 illustrates the impact of rescaling SIM2 to SIM1 climatology with CDF matching (using both lumped and monthly CDFs), for the year 2000. When lumped CDFs are used, rescaling leads to a wetter soil moisture time series (compared to SIM1) during the summer months (but significantly lower than SIM2), whereas during the nonirrigation months, rescaling leads to a much drier soil moisture time series, relative to SIM1. Lumped CDF matching attempts to keep the climatology of the rescaled time series to be close to the overall SIM1 climatology. As a result, higher soil moisture values during irrigation are compensated by lower soil moisture values during non-irrigated months to keep the overall climatology the same as that of SIM1. In this example, the lumped CDF-based rescaling approach introduces spurious statistical artifacts during non-irrigated periods. The statistical artifacts of rescaling during the nonirrigated months are greatly reduced if the CDF matching is performed in a more temporally stratified manner. As indicated by Fig. 7, when rescaling uses monthly CDFs, the resulting time series remain close to SIM1 both during the irrigated and non-irrigated periods. Note that most data assimilation studies Kumar et al., 2009;Liu et al., 2011;Draper et al., 2012;Kumar et al., 2012Kumar et al., , 2014 use the lumped CDF-scaling approach due to sampling density limitations of using temporally finer-resolved CDFs.

Structure of synthetic data assimilation experiments
The suite of data assimilation experiments employs an identical twin experiment setup. The model simulations are conducted at the single grid point shown in Fig. 1. The Noah LSM simulation forced with the NLDAS-2 data is termed the open loop (OL) integration. A scheme designed to mimic seasonal crop irrigation employed on top of the OL configuration is used as the "Control/Truth" simulation. All model integrations use the same forcing and parameter data sets as that of the experiment presented in Sect. 2. The time period from 2000 to 2012 is used here for various evaluations. From the truth simulation, observations are generated after incorporating realistic errors and limitations of passive microwave remote sensing retrievals. To account for difficulties in retrieving soil moisture products from microwave sensors, the observations are masked out when the green vegetation fraction values exceed 0.7 and when snow or precipitation are present. Random Gaussian noise with an error standard deviation of 0.02 m 3 m −3 is added to the truth soil moisture values to mimic measurement uncertainties, which is an optimistic estimate of the error levels in the current space-borne L-band radiometers (SMOS and SMAP). Finally, a data assimilation (DA) integration that assimilates the simulated observations in the OL configuration is conducted. The DA and OL integrations are compared against the known truth to evaluate the impact of observations.
Most synthetic experiment studies (Reichle et al., 2002b;Crow et al., 2005;Crow and Reichle, 2008;Kumar et al., 2009Kumar et al., , 2012 Fig. 2 and including the surface soil moisture time series from rescaling SIM2 to the SIM1 climatology. The red and blue lines represent the SIM2 integration rescaled to the SIM1 climatology using lumped and monthly CDF matching, respectively. and OL configurations to simulate the systematic biases that are often present (between observations and the model) in real data assimilation scenarios. Here we intentionally use a setup where the only difference between the control run and OL is a process (irrigation) that is not modeled in the OL simulation but is included in the control run. One could en-vision similar issues in real data assimilation systems, where features from engineered systems will be present in observations but not simulated in physical models. In this idealized scenario, biases between the model and the observations are purely from observational features that are not modeled.
Four different data assimilation integrations are conducted using the synthetic observations: (1) DA-NOBC, assimilating observations directly without any bias correction; (2) DA-CDFL, assimilating a priori-scaled observations using CDF matching (using lumped CDFs representing all years and seasons); (3) DA-CDFM, assimilating a prioriscaled observations using monthly CDF matching (the model and observation CDFs are generated separately for each calendar month); and (4) DA-ANN, assimilating the simulated observations directly and using a trained ANN as the observation operator in the data assimilation system (see Sect. 3.3 for details). In experiments DA-NOBC, DA-CDFL and DA-CDFM, the observation operator is the land surface model itself, whereas the observation operator is represented by the trained ANN in the DA-ANN experiment.
In the DA-CDFL experiment, the observation and model CDF are first computed independently for each grid cell using the 13-year (2000-2012) period. During data assimilation, the observations are rescaled (separately for each grid cell) using these lumped CDFs. As noted by Drusch et al. (2005), the climatologies between the model and observations may change with season, which is clearly the case in our synthetic experiment setup due to the influence of seasonal irrigation. In the DA-CDFM experiment, the observation and model CDFs are generated separately for each month and for each grid point. The 13-year record of data ensures that there is enough sampling density to accurately derive CDFs when the CDF calculation is stratified by calendar months.

Data assimilation method
The data assimilation integrations are conducted using a onedimensional ensemble Kalman filter (EnKF; Reichle et al., 2002a) algorithm. An ensemble size of 12 is used in the simulations with perturbations applied to both meteorological fields and model prognostic fields to simulate uncertainty in the model estimates. The determination of 12 as the ensemble size was based on prior works Kumar et al., 2008Kumar et al., , 2009Kumar et al., , 2012 and because the size of the model state vector is small (4 Noah soil moisture state variables). The EnKF alternates between an ensemble forecast step and a data assimilation step. An ensemble of model states is propagated forward in time using the land surface model during the forecast step. In the update step at time k, the model forecast is adjusted toward the observation based on the relative uncertainties, with appropriate weights expressed in the "Kalman gain" K k : where x k and y k represent the model state and observation vectors, respectively. The observation operator H k relates the model states to the observed variable. The superscripts i− and i+ refer to the state estimates of the ith ensemble member (−) before and (+) after the update, respectively. Equation (3) indicates that the analysis increments ( are computed by multiplying the Kalman gain K k with the innovations (y i k − H k x i− k ). In "bias-blind" data assimilation systems, observations (y k ) and model forecasts (H k x i− k ) are expected to be unbiased relative to each other, which presents two choices for bias correction: (1) rescale observations into the model climatology, so that the innovations are computed in the climatology of H k x i− k or (2) compute the innovations in the observation space by having an operator (H k ) that translates the model states into the observation space. The quantile mapping approaches fall in the first category, whereas the use of trained forward models as observation operator represents the second category. We examine the impact of using both sets of approaches when unmodeled processes dominate the sources of biases.

Use of a trained ANN as a forward observation operator
Artificial neural networks (ANNs) are data processing systems used for pattern matching applications and consist of a highly interconnected array of processing elements (called neurons), designed as a mathematical generalization of human cognition and learning. The basic architecture of an ANN consists of three layers: input, hidden and output layers. The inputs processed through the input layer are communicated to the hidden layers and the results are output through the output layer. The topology of the layers (defined by "activation functions") and the weights of the interconnections are used to develop accurate outputs. During the training phase, the ANN is presented with a set of inputs and corresponding outputs. The trained ANN can then be used for generating new predictions when presented with a new set of inputs. Figure 8 shows the structure of the ANN used in this study. The input layer consists of six inputs, which are a combination of the meteorological inputs (rainfall and snowfall), land surface model parameters (green vegetation fraction) and land surface model estimates (surface soil temperature, snow water equivalent and surface soil moisture). Note that the surface soil moisture in the input layer is from the LSM integration without irrigation. Five neurons were employed in the hidden layer based on a similar approach used in Cao et al. (2008) and Forman et al. (2014). For this study, a single output node that estimates surface soil moisture values is used. The ANN is trained to the simulated surface soil moisture observations at this grid point (generated from the truth integration) during the time period of 2000-2012. Since the entire observation record is used for estimating CDFs, we use the whole record for training the ANN as well, so that the experiments are comparable. Figure 9 shows a comparison of the simulated observations and the estimates from the trained ANN for the year 2000. It can be seen that the trained ANN model helps to capture the wetting of the soil due to seasonal irrigation during the spring and summer months. Similar patterns are observed in other years (not shown). The skill in turning on or off irrigation in the ANN is likely due to the incorporation of the information in the training inputs of soil temperature and green vegetation fraction. The trained ANN is then used in the DA experiments.

DA experiment results
The evaluation of the four DA experiments is presented in Figs. 10-12. Figure 10 shows daily averaged soil moisture estimates from various model integrations for the year 2000 as a representative time series. The DA-NOBC integration assimilates the raw observations (shown in the figure) and as a result provides soil moisture estimates closer to the Control simulation during times when observations are available. DA-CDFL and DA-CDFM integrations ingest rescaled observations (not shown), which do not show a systematic increase in the soil moisture values during the spring and summer months. Similar behavior is seen for the DA-ANN integration, which assimilates the raw observations, but does not represent the anomalously wet soil moisture of the Control simulation. The DA-CDFL, DA-CDFM and DA-ANN integrations do not deviate much from the open loop integrations as the size of the analysis increments (K k [y i k − H k x i− k ]) in these integrations is small. In DA-CDFL and DA-CDFM, the rescaling causes the innovations to be computed in the climatology of the model states whereas in the DA-ANN experiment, the innovations are computed in the climatology of the observations. In either case, when these small increments generated by the assimilation system are applied back to the soil moisture forecast values (in the open loop climatology), the anomalous wet signals in the observations are removed as bias artifacts and are never included in the analysis. Figure 11 shows the average seasonal cycle of RMSE (root mean squared error; stratified monthly across the entire simulation period of 2000-2012) of surface soil moisture from various model integrations. Similar to the trends in Fig. 10, the DA-NOBC integration shows significant improvements from data assimilation except for August. The peak of vegetation (determined based on the green vegetation fraction) occurs in August leading to observations being excluded from the data assimilation system. As a result, the improvements through assimilation are small during this time period. The seasonal nature of the RMSE estimates from DA-CDFL, DA-CDFM, and DA-ANN is similar and is close to the open loop RMSE estimates. The use of the scaled observations (in DA-CDFL and DA-CDFM) and the use of the trained forward model (in DA-ANN) causes the dampening and exclusion of the wet biases from irrigation in these DA integrations.
An important philosophical point, however, is warranted here. Implicit in the above discussion of Fig. 11 is the assumption that a higher RMSE reflects a poorer performance. Depending on application, this may not be true at all. It is a well-established fact that the soil moisture estimate from the model is essentially an index of wetness and a highly modeldependent quantity . As a result, care must be exercised when comparing model soil moisture directly to in situ or satellite measurements. The whole point of the scaling exercise is to convert a satellite-based soil moisture value, prior to its assimilation, to a value consistent with that of the LSM used. This allows the further use of the assimilated soil moisture value in that LSM, e.g., to initialize a forecast. If, once the data assimilation process is finished, a soil moisture value is needed that reflects a more "correct" climatology (e.g., with an irrigation-influenced seasonal cycle, as in the Control simulation), the data assimilation product can easily be scaled back to that climatology using the reverse of the original scaling approach. Viewed in this light, the data assimilation approach, with scaling, is essentially designed to capture the year-to-year or short-term variations in soil moisture anomalies rather than the structure of the seasonal cycle. Also note that, though the seasonal cycle of RMSE is lowest in the DA-NOBC integration, this configuration is not really viable in real data assimilation systems where biases are unavoidable. The DA-NOBC integration is included in the suite of experiments, as we have the knowledge of the exact sources and magnitudes that contribute to the biases in this synthetic configuration. For real data assimilation systems, metrics recommended by Entekhabi et al. (2010), which compute estimates of soil moisture accuracy while accounting for biases, may be more appropriate.
The EnKF algorithm assumes linear system dynamics. It further assumes model and observation errors that are Gaussian and mutually and serially uncorrelated. If these assumptions hold, then the distribution of filter innovations (observation minus model forecast residuals) normalized by their expected covariance will follow a standard normal distribution N [0, 1] (Gelb, 1974). The deviations from the N[0, 1] of the normalized innovation distribution is typically used as a measure of the degree of suboptimality of the data assimilation system (Reichle et al., 2002a;Crow and Van Loon, 2006;Reichle et al., 2007;Kumar et al., 2008). Figure 12 compares the distribution of the normalized innovations from These internal diagnostics are also often used for the estimation of input error parameters of the data assimilation system. For example, in adaptive filter implementations , the model and observation error specifications are continually adjusted to yield near-optimal behavior of the internal diagnostics (i.e., close to N [0, 1] response of normalized innovation distribution). The analysis presented above indicates that if unmodeled processes are present, these bi-ases are reflected in the innovation diagnostics as deviations from expected optimal measures. In such cases, the reliance on these assimilation diagnostics may be misleading if the end goal of the assimilation process is to correct the modeled seasonal cycle of soil moisture toward that of the observations. As noted above, though, the goal may instead be to capture, for various applications, year-to-year anomalies in soil moisture and, for this purpose, the innovation diagnostics reveal that the scaling approaches do provide superior behavior. Again, though, it should be pointed out that the synthetic experiment was designed to isolate the impacts of unmodeled irrigation. In practice, the effects of unmodeled irrigation will be conflated with bias issues that result from differences in land surface parameters and differences in the very meaning (such as layer-depth) of the modeled and retrieved soil moisture values.

Summary
Due to the heterogeneity of the land surface and the large impact of human activities, quantifying the variability of water and energy budgets on the land surface presents unique challenges compared to the atmosphere and ocean components of the Earth system. Irrigation is one of the pervasive human-induced land management practices that has a direct impact on the local and regional water budgets. In this article, we examine the utility of satellite soil moisture retrievals to detect irrigated areas. In addition, the article also examines the limitations of current data assimilation practices when the observations are dominated by processes that are not included in the land model.
Application of seasonal irrigation is likely to introduce systematic differences in the soil moisture distributions. Therefore, if the remote sensing data sets are skillful in detecting irrigation features, the resulting soil moisture distributions would be significantly different compared to a model simulation that does not simulate irrigation. We use this hypothesis to examine the effectiveness of modern remote sensing soil moisture products from ASCAT, AMSR2 and SMOS in their ability to detect irrigation. A two-sample Kolmogorov-Smirnov test is used to quantify the systematic differences between distributions of model and remote sensing data sets, over a continental US domain. The analysis reveals systematic differences in spatial patterns of the distributions of model and remote sensing data. Additional analysis, however, suggests that these differences are not always related to the detection of irrigation artifacts. Generally, AS-CAT retrievals were found to be somewhat more skillful than the SMOS and AMSR2 retrievals in their ability to capture features of irrigation on the land surface.
Overall, the analysis presented in the paper assumes a demand-driven irrigation scheme maintained throughout a growing season at a level where the plants are not under transpiration stress. In reality, however, the type and level of ir-rigation may not be seasonally persistent and therefore the nature of the expected biases in the soil moisture signal due to irrigation may not be systematic throughout a season. Further comparisons with in situ soil moisture data at irrigated locations will be required to confirm and isolate the limitations of the remote sensing data over these areas. A major source of the biases between the satellite retrievals and the LSM estimates is the differences in the land surface parameters used in the respective models. The biases from these parameter differences are likely to dominate the more subtle effects of irrigation. In addition, the scale mismatches between the model and the observations are also likely to have an influence in the comparisons presented here. The spatial resolution of the model (0.125 • ) and the observations (∼ 25-40 km) can be considered relatively coarse for detecting uniformly and simultaneously irrigated areas.
The second focus of the article is on the limitations of various a priori bias correction strategies in land data assimilation towards representing unmodeled processes. This issue is explored through a suite of synthetic data assimilation experiments. A simulation of seasonal irrigation is used as analog for an engineered process that is typically not included in large-scale land surface model simulations. The data assimilation integrations merge the observations generated from the irrigation simulation into a model in which irrigation is absent and features a free-running land surface model. The data assimilation integrations include simulations that employ no bias correction, a lumped CDF-matching correction, a seasonally varying CDF-matching correction or ANN as a forward observation operator. As the a priori bias correction approaches make no distinction of the source of the biases (unmodeled or from other sources), they treat all systematic differences between the model and observations as biases. As a result, all a priori bias correction strategies considered above cause the signal from seasonal irrigation (or other unmodeled processes) to be excluded in the DA results, though the analysis of the DA internal diagnostics indicate near-optimal performance for such configurations.
The challenge in data assimilation systems is to separate the biases that are due to instrument error and retrieval algorithm errors from the biases induced by unmodeled processes so that the true observational features are not excluded during data assimilation. Detecting spatial patterns of such differences could be useful in the utilization of soil moisture data sets in data assimilation systems. For example, even if a particular retrieval data set is not skillful in detecting irrigation in a given region, it could still perform adequately over other regions. If knowledge of such limitations are known a priori, at the very least the assimilation system could simply exclude the use of the retrievals over known irrigated areas. Ancillary information from other remote sensing platforms could be useful in such scenarios. For example, estimates of land surface temperature and evapotranspiration measurements can be obtained from thermal remote sensing platforms, which can also be used as an analog for detecting the effects of ir-rigation. Similar to the wetting effect irrigation has on surface soil moisture, it also has a cooling effect on the surface temperature. Increased water availability from irrigation also leads to increased evaporation. Finally, irrigated time periods also correlate with an increase in vegetation indices such as leaf area index (LAI) and normalized vegetation difference index (NDVI). Expected trends of anomalies in these data sets from irrigation could be used as added constraints in data assimilation to mask the known limitations of passive microwave soil moisture retrievals.
It is obviously difficult to attribute bias to unmodeled processes or other factors. Calibrating the land surface model parameters can be an effective a priori correction approach in this scenario. Land surface model calibration would incorporate the observational signals by altering the default model behavior. When the calibrated model is subsequently used in data assimilation, the observational signal is preserved. In contrast, calibrating a forward radiative transfer model (or using a trained ANN) has a different impact in terms of representing unmodeled processes. Calibration of forward observation operators would attribute the bias to its parameters; however, since the default land model behavior is unchanged, the unmodeled process is ultimately not represented. The land surface model parameter estimation in this context also has a number of disadvantages. The physical realism of the estimated parameters may be violated given that the calibration would attribute the error from all unmodeled processes to model parameters. If data from multiple sensors are being concurrently assimilated into such a "calibrated" model, the calibration approach would not be viable because a new set of calibrated model parameters would be needed for each sensor, leading to differing model climatologies and behaviors.
Another possible alternative may be to examine the characteristics of the differences between the model and observations over a larger domain and infer a general estimate of the relative biases. For example, instead of computing CDFs at each grid point, they can be computed by grouping and stratifying model and observation estimates based on the vegetation type across a larger domain. The grid points where the soil moisture PDFs differ significantly (based on the K-S metrics, for example) can be excluded in the computation of the CDFs. The model and observation CDFs computed for each vegetation type can then be used in the data assimilation system. The goal behind such an approach would be to develop an overall estimate of the true biases between the model and observations (i.e., an estimate that excludes biases due to unmodeled processes). The downside of such approaches is that they obviously disregard the importance of geographic specificity in bias correction strategies. The synthetic experiment that we have used above is not an ideal setup to examine this approach. Since the only difference between the observations and the open loop in this setup is the effect of irrigation, the use of the above mentioned approach would produce identical CDFs for the model and ob-servations and therefore the assimilation approach would be equivalent to that of DA-NOBC. We therefore leave the evaluation of such alternate approaches to future work.