Journal topic
Hydrol. Earth Syst. Sci., 22, 2255–2267, 2018
https://doi.org/10.5194/hess-22-2255-2018
Hydrol. Earth Syst. Sci., 22, 2255–2267, 2018
https://doi.org/10.5194/hess-22-2255-2018

Research article 13 Apr 2018

Research article | 13 Apr 2018

# Using lagged dependence to identify (de)coupled surface and subsurface soil moisture values

Using lagged dependence to identify (de)coupled surface and subsurface soil moisture values
Coleen D. U. Carranza1, Martine J. van der Ploeg1, and Paul J. J. F. Torfs2 Coleen D. U. Carranza et al.
• 1Soil Physics and Land Management Group, Wageningen University, Wageningen, the Netherlands
• 2Hydrology and Quantitative Water Management Group, Wageningen University, Wageningen, the Netherlands

Correspondence: Coleen D. U. Carranza (coleen.carranza@wur.nl)

Abstract

Recent advances in radar remote sensing popularized the mapping of surface soil moisture at different spatial scales. Surface soil moisture measurements are used in combination with hydrological models to determine subsurface soil moisture values. However, variability of soil moisture across the soil column is important for estimating depth-integrated values, as decoupling between surface and subsurface can occur. In this study, we employ new methods to investigate the occurrence of (de)coupling between surface and subsurface soil moisture. Using time series datasets, lagged dependence was incorporated in assessing (de)coupling with the idea that surface soil moisture conditions will be reflected at the subsurface after a certain delay. The main approach involves the application of a distributed-lag nonlinear model (DLNM) to simultaneously represent both the functional relation and the lag structure in the time series. The results of an exploratory analysis using residuals from a fitted loess function serve as a posteriori information to determine (de)coupled values. Both methods allow for a range of (de)coupled soil moisture values to be quantified. Results provide new insights into the decoupled range as its occurrence among the sites investigated is not limited to dry conditions.

1 Introduction

Although recent decades have seen great advances in remote sensing applications for mapping surface soil moisture , most hydrological studies that make use of soil moisture data require integrated values over a certain soil depth . Extrapolation of surface soil moisture from remote sensing techniques to depths beyond the sensor's capacity (up to 5 cm) is not a trivial task given the spatiotemporal variability of soil moisture. The vertical distribution of soil moisture, which determines integrated soil moisture content over a soil column, is rarely uniform, as more pronounced dynamics are expected closer to the surface compared to deeper in the soil . Currently, information derived from remote sensing are assimilated into hydrological models to obtain integrated soil moisture values . However, stressed that it is important to assess vertical variability, especially the strength of coupling between surface and subsurface soil moisture for improvement of data assimilation (DA) results. Analyses of vertical soil moisture distributions also have important implications for modeling studies, as they could be used for calibration or validation of model parameters .

The amount of soil moisture at any given time is controlled by factors operating at different timescales. While prevailing atmospheric conditions directly affect surface layers and control the temporal dynamics of soil moisture , it is the downward movement of water from the surface that dictates the amount of subsurface soil moisture at a given time . Flow rates to the subsurface are driven by hydraulic properties, which are in turn controlled by physical soil characteristics such as texture, bulk density, and structure. Relative to changes in atmospheric conditions, soil physical properties change over longer timescales. Vegetation further modifies vertical soil moisture distribution by root water uptake and by changing soil structure .

Given the variability along a soil column, during which conditions does surface soil moisture reflect subsurface soil moisture? Several studies have investigated this relation to address the correspondence between surface and subsurface soil moisture. One of the earliest studies is by , in which they compared modeling outputs with remote sensing measurements. Using very shallow depths of 5 mm and 5 cm, they referred to decoupling as the deviation from a linear correlation between these depths due to variable drying rates. Further assessment of decoupling from model-generated time series soil moisture data have been investigated using cross-correlation values . High correlation to the subsurface was obtained using lagged values of surface soil moisture. However, cross-correlation is limited to providing a single value throughout the range of soil moisture encountered per lag. Furthermore, cross-correlation generally aims to evaluate the strength of lagged linear dependence between two variables . However, lagged dependence between surface and subsurface soil moisture may not be linear given that nonlinear processes determine water flow along the soil profile. Using in situ field measurements, investigated spatial surface (0–6 cm) and subsurface (0–30 cm) soil moisture distribution by calculating statistical metrics and by means of a variogram. Decoupling between the two depths was observed, which they suggested was influenced by vegetation, especially root density at surface soil. Their results were also affected by the dry soil moisture range and emphasized the importance of distinguishing between surface and total soil moisture for future applications of remote sensing to atmospheric studies.

Based on previous studies, the term decoupling refers to a weak dependence between soil moisture contents at the surface and subsurface. Recognition of decoupling is important; however, most studies have been limited to providing qualitative characterization of conditions when decoupling occurs (e.g., dry period). Only identified a mid-range soil moisture ( 0.3 cm3 cm−3) when the surface and very near surface begin to decouple. Their results, however, are limited to a thin layer of the soil column. In this paper, our main objective is to quantitatively identify a range of surface soil moisture values that is decoupled from the subsurface. Furthermore, we consider depths greater than those investigated by . The ability to quantify (de)coupled surface and subsurface soil moisture contents will contribute to more effective estimation of depth-integrated soil moisture data using remote sensing methods and improved data assimilation results in hydrological models.

We utilized in situ time series datasets at depths of 5 and 40 cm to represent surface and subsurface, respectively. Values outside the decoupled range are considered coupled since soil moisture is inherently bounded up to a maximum value equal to soil's porosity. The investigation of (de)coupling is based on the idea that surface conditions will be reflected at the subsurface after a certain delay, indicating strong coupling between the two zones, and vice versa. More focus is given to the decoupled soil moisture range since it has greater implications for extrapolation of surface soil moisture values to deeper soil layers. We applied statistical methods to identify conditions of decoupling with no prior assumptions on the type of functional relation between surface and subsurface. As an exploratory step, we first assessed the dependence without considering lags using regression and residual analysis. The main approach for assessing decoupling was application of distributed-lag nonlinear models (DLNMs; Gasparrini et al.2010) to incorporate both the lag structure and the functional relation between surface and subsurface soil moisture. Applications of distributed-lag models to econometrics and environmental epidemiology have been well documented . However, their application to hydrological studies have rarely been explored.

2 Description of datasets and study sites

Four time series datasets from the Twente soil moisture and temperature monitoring network were used in this study (Fig. 1). Datasets from 2014–2016 are available with only short periods of missing data. The stations are located in agricultural fields with sensors installed at 5, 10, 20, and 40 cm depths. To investigate decoupling, only the 5 and 40 cm depths were considered because the largest possible distance was desired. Each station consists of EC-TM ECH2O capacitance probes (Decagon Devices, Inc., USA) that logged soil moisture data every 15 minutes. A calibration procedure using gravimetric measurements was applied prior to analysis .

Land cover in the area varies from corn in one field (SM05), to grass in two fields (SM05 and SM13), to a forest area (SM20). Values at 40 cm capture the root zone of vegetation for each site. In reality, rooting depths vary and depend on species composition, climate, and plant growth rate. However, the depth considered would still allow for approximation of root zone conditions. The landscape is characterized by flat to slightly sloping terrain. It is important to note that SM20 is located at the eastern foot of a small hilly terrain. Throughout the study period, either land cover remained unchanged or the same crop was planted. The soil types for the stations range from coarse sandy soils to weakly silty soils . A summary of the land cover and relevant characteristics of the stations are summarized in Table 1.

Figure 1Location of study site in the eastern part of the Netherlands (inset). Triangles represent stations used within the Twente soil moisture and temperature monitoring network . Squares represent meteorological stations. Symbols with similar colors indicate the pair of measurements used for the analysis.

Table 1Summary of land cover descriptions at each station covering the period of 2014–2016. Soil descriptions and codes are based on BOFEK 2012 . Both the slope and distance to nearest ditch were determined from 5 m resolution DEM. Datasets are from 2016 and were obtained from the publicly available national topographic database of the Netherlands (TOPNL).

Soil moisture values were averaged into daily values to match the available daily rainfall data from the Dutch national weather service (KNMI). For SM13 and SM20, there are some missing data from the beginning of 2014. The datasets from SM13 begins on 25 April 2014 while SM20 begins on 2 May 2015 (Fig. 2).

3 Methods

## 3.1 Regression and residual analysis

As an exploratory step, the dependence between surface and subsurface soil moisture was initially visualized using scatter plots. Conditional means for every 0.01 cm3 cm−3 interval and vertical bars representing ± standard deviation were added to show changes in vertical variability across the soil moisture range. Longer standard deviation bars indicate higher vertical variability (Fig. 3a). We referred to vertical variability as the unevenness or irregularity in the soil moisture distribution within a certain depth of interest along the vertical profile, in this case up to depths of 40 cm. For the rest of this paper, variability will refer to vertical soil moisture variability, unless otherwise stated. The plotted points were colored per month to show any impacts of seasonality. The effect of rainfall was included by adjusting the sizes of the points proportional to rainfall intensity measured from the nearest KNMI stations. For the overall measure of dependence, Spearman's rank correlation coefficient Rs was computed for every pair of ranked values in the time series. This was chosen as the assumption of linear dependence was not made.

Figure 2Time series plots of surface (5 cm in light blue) and subsurface (40 cm in dark blue) soil moisture. Vertical black bars at the top show daily precipitation data from the nearest KNMI station.

Figure 3Schematic diagram using hypothetical soil moisture values to show vertical variability. (a) Scatter plot showing the trend with a fitted loess function. The variability can be seen using the standard deviation bars. (b) Scatter plot of the residuals from the fitted function. Soil moisture variability is visible from the variance given by the vertical bars and the cumulative residual variance given by the black line. A change in variability at an intermediate soil moisture value is marked by a change in the slope of the cumulative variance line, indicated by θc.

A flexible nonparametric locally weighted regression function (commonly called a loess function, ) was fitted along the soil moisture range. This was used to explore and identify trends across the soil moisture range. A linear regression was also fitted only for comparison. Residuals were analyzed further for variability not captured by the fitted function (Fig. 3b). The residual variance for every 0.1 cm3 cm−3 interval as well as the resulting cumulative residual variance were analyzed to examine variability across the range. The degree of variability was related to the slope of the cumulative variance line, with steep slopes indicating high variability. In addition, a significant change in variance between two points was indicated by a significant change in the slope of the line. The soil moisture value where a significant change in slope occurred was marked by θc; this divides the soil moisture range into two groups. The group with a steeper slope was interpreted as the decoupled range, and vice versa. Since the measured variance is sensitive to sample size, a correlation coefficient was calculated to determine if there was significant dependence between the two variables. Residual variance were first normalized from 0 to 1 because of the varying soil moisture range encountered at each station.

Results of the exploratory methods were considered a posteriori knowledge for analysis of lagged dependence and interpretation of results.

## 3.2 Analysis of lagged dependence

### 3.2.1 Cross-correlation

Since decoupling is based on the strength of lagged dependence, the existence of lag between surface and subsurface soil moisture values was first determined. Cross-correlation is known to be a quick and easy method to apply for this objective. Lagged values of surface soil moisture were correlated with instantaneous values at the subsurface. A maximum cross-correlation at negative lags indicated that surface soil moisture is leading subsurface soil moisture, and vice versa . A 10-day lag was deemed long enough to show the presence of lag–lead relations in the time series since the maximum correlation occurred within this period.

Figure 4Scatter plots of 5 cm vs. 40 cm soil moisture values at lag = 0. Colors correspond to the months in a year and sizes of points are proportional to rainfall intensity. Trends along the soil moisture range shown with the fitted loess function (black line). A linear function (red line) is also fitted for comparison. The overall dependence using Spearman's rank correlation Rs is given in the upper left corner of each plot. Residual standard errors (RSEs) for loess (lo) and linear (lm) fits are also shown.

### 3.2.2 Distributed-lag nonlinear model

We incorporated delayed or lagged effects in evaluating the relation between surface and subsurface values, and eventually in determining the (de)coupled values. It should be emphasized that the analysis was primarily focused on examining the trends and relation between surface and subsurface soil moisture. Moreover, it was not intended to replace other existing models for estimating soil moisture or examining its patterns.

A distributed-lag nonlinear model developed by was applied to the 5 and 40 cm time series datasets at the study sites. Briefly, the model is capable of simultaneously representing both functional the dependence and delayed response between exposure and response values. We considered surface soil moisture as the exposure values that produced delayed effects to the response values at the subsurface. A nonlinear model was selected in order to capture the nonlinear dynamics of flow and transport along the soil profile . Furthermore, DLNM offered enough flexibility to model a variety of dependencies in the time series dataset by selecting a suitable basis function. As an analogy, a DLNM is to a linear time series model (e.g., autoregressive model) just as a generalized linear model is to a linear model, as can be seen in Eq. (1).

In assessing lagged dependence, event-scale patterns were of interest rather than large-scale trends within the time series . This required seasonal patterns to be addressed prior to applying the DLNM. This was done by fitting a loess function to the time series and then subtracting it from the original soil moisture values . Removal of seasonality was further justified by the scatter plot results (see Sect. 4.1). The influence of seasonality on the vertical soil moisture variability is indicated by clustering of observation points occurring within the same months (Fig. 4). Deseasonalized soil moisture values were used for identifying (de)coupled soil moisture conditions.

For consistency in modeling, the range of surface soil moisture values used was from 0–0.50 cm3 cm−3. This was based on the highest surface soil moisture value encountered among the four sites. A lag value of up to 30 days was considered long enough to investigate delayed effects. This period also approximated the recurrence of heavy rainfall within the study sites. A spline function was the basis function chosen to represent the functional dependence and delayed effects as it offered flexibility to capture nonlinearities. In addition, contributions from daily rainfall data were used to incorporate current and past meteorological conditions. This was applied as a covariate and was represented with an additional basis function. We only considered delayed effects in vertical flow as lateral movement is deemed negligible in a flat to slightly sloping terrain (Table 1). The analysis was performed in R software using dlnm and mgcv (Wood2006a) packages.

The following section concisely describes the mathematical formulation of a DLNM. However, the reader may choose to skip this section as the general description of the methods applied has already been given in the text above. For a more detailed explanation, readers are referred to and .

To more formally describe a DLNM, let us first consider a general time series model, where outcomes Yt with t= 1, n can be described by

$\begin{array}{}\text{(1)}& g\left({\mathit{\mu }}_{t}\right)=\mathit{\alpha }+\sum _{j=\mathrm{1}}^{J}{s}_{j}\left({x}_{tj};{\mathbit{\beta }}_{j}\right)+\sum _{k=\mathrm{1}}^{K}{\mathit{\gamma }}_{k}{\mathbit{u}}_{t}k,\end{array}$

where μE(Y), assumed to be derived from a Poisson distribution, and g is a monotonic link function. The functions sj denote relationships between the variables xj and vector parameters βj. Other uk variables with predictors are included in coefficients γk to specify their related effects. The relation between x and g(μ) is represented by s(x) through a basis function. The complexity of this estimated relationship depends on the type basis function chosen and its dimensions. In the presence of delayed effects, the outcome Y at any time t is explained by the past exposures xtl with l as the “lag” representing the elapsed time between exposure and response. The final goal of a DLNM is to simultaneously describe the dependency along both the predictor space and lag dimension. This is achieved by selecting two sets of basis functions that are combined to obtain the cross-basis functions .

Within the DLNM framework, a response Yt at time t= 1 is based on lagged occurrences of predictor xt, which is represented by vector qt=$\left[{x}_{t-{l}_{\mathrm{0}}}$xtL]T. The minimum and maximum lags are given by l0 and LT, respectively. The function represents dependence through

$\begin{array}{}\text{(2)}& s\left(q,t\right)=s\left({x}_{t{,}_{t-{l}_{\mathrm{0}}}},\mathrm{\cdots },{x}_{t-L}\right)=\sum _{l={l}_{\mathrm{0}}}^{L}f\cdot w\left({x}_{t-L},\mathbit{l}\right),\end{array}$

where fw(xtLl) represents the exposure–lag–response function, which is composed of two marginal functions: the exposure–response function f(x) and lag–response function w(l) in the space of the lag. Parameterization of f and w is achieved by application of the known basis functions to the vectors qt and l. The result can be expressed as matrices R and C with dimensions (Ll0+ 1) ×vx and (Ll0+ 1) ×vl, respectively.

The cross basis function s and parameterized coefficients η are given by

$\begin{array}{}\text{(3)}& s\left({x}_{t{,}_{t-{l}_{\mathrm{0}}}},\mathrm{\cdots },{x}_{t-L};\mathbit{\eta }\right)=\left({\mathrm{1}}_{L-{l}_{\mathrm{0}+\mathrm{1}}}^{T}{\mathbf{A}}_{t}\right)\mathbit{\eta }={\mathbit{w}}_{t}^{T}\mathbit{\eta }.\end{array}$

The values of w are derived from At, which is computed from the row-wise Kronecker product between matrices R and C. The dependence is expressed through w and parameters η. The cross-basis function represents the integral of s(xt) over the interval [l0L], summing the contributions from the exposure history. The estimated dependence to specific exposure values is determined by prediction of $\stackrel{\mathrm{^}}{\mathit{\beta }}$, called lag coefficients. The estimated $\stackrel{\mathrm{^}}{\mathit{\beta }}$ and covariance matrix $\mathbf{V}\left(\stackrel{\mathrm{^}}{\mathit{\beta }}\right)$ is given by

$\begin{array}{}\text{(4)}& & \stackrel{\mathrm{^}}{\mathit{\beta }}={\mathbf{A}}_{x}\stackrel{\mathrm{^}}{\mathbit{\eta }},\text{(5)}& & \mathbf{V}\left(\stackrel{\mathrm{^}}{\mathit{\beta }}\right)={\mathbf{A}}_{x}\mathbf{V}\left(\stackrel{\mathrm{^}}{\mathbit{\eta }}\right){\mathbf{A}}_{x}^{T}.\end{array}$

A further extension to DLNM is the application of penalties for smoothness of the lag structure and shrinkage of lag coefficients to null at very high lags. These penalties were applied in the analysis using a second-order difference (Wood2006b) and varying ridge penalties , respectively. Application of penalties was based on the assumption that, at higher lags, the lag coefficients become smaller and approach the null value.

## 3.3 Evaluating (de)coupled soil moisture values

Application of a DLNM resulted in the estimation of parameter $\stackrel{\mathrm{^}}{\mathit{\beta }}$ for each surface soil moisture value (Eqs. 4 and 5). This indicated the strength of dependence between surface and subsurface soil moisture. Higher $\stackrel{\mathrm{^}}{\mathit{\beta }}$ values indicated stronger dependence or coupling between the two. Hence, we referred to $\stackrel{\mathrm{^}}{\mathit{\beta }}$ as the relative influence of surface soil moisture on subsurface values.

Figure 5Residual variance plots from the fitted loess function. Vertical bars at the bottom of each plot represent the variance for every 0.1 cm3 cm−3 interval. The cumulative residual variance (gray line) shows a change in slope at θc, indicated by the vertical black dashed line. This separates the soil moisture values into a range with higher variance (steeper slope) and another range with lower variance (gentler slopes). The range with higher variance is considered decoupled, and vice versa.

4 Results

## 4.1 Regression and residual analysis

The overall dependence between surface and subsurface given by the Spearman's rank coefficient (Rs) range from 0.746 to 0.866 (Fig. 4). However, even with a high overall dependence, variability is not uniform across the soil moisture range (Fig. 4). Except for SM13, increased variability is observed towards drier soil moisture values. Furthermore, the degree of variability also differs among the four sites. The most pronounced variability is observed at SM13 and the least at SM05. Clustering of observation points occurring within the same months indicate that seasonality dictates soil moisture values and impacts soil moisture variability. Rainfall events measured on the same day do not show a clear effect on surface and subsurface soil moisture dependence. Observations with higher rainfall intensities appear scattered in the plots (Fig. 4). In addition, the said observation points do not necessarily fall along the fitted functions or at the wet soil moisture region of the scatter plots. As lag is not considered, the impact of rainfall on variability is not fully captured in the scatter plots alone.

Assessment of the regression fit quality was performed by comparison using residual standard errors (RSEs). The results for both linear and loess functions show highly similar values (Fig. 4). This indicates that, in this case, a linear function captures the relation between surface and subsurface values. Nevertheless, the more flexible loess function was preferred for further residual analysis because of its slightly better model fit and, using only visual inspection of Fig. 4, it more closely approximates the calculated conditional mean.

Figure 5 shows the residual plots with lines of the cumulative residual variance. The change in slope of the line is a feature consistent for all sites regardless of the magnitude of residual variance. The changes in variability are more clearly observed from the residuals than from the standard deviation bars in the scatter plots. The change in slope at θc is highlighted by the vertical dashed line. The decoupled soil moisture range corresponds to the section of cumulative residual variance line with a steeper slope. Specifically, the range of decoupled surface soil moisture values (in cm3 cm−3) is 0.08–0.21 for SM05, 0.12–0.27 for SM09, 0.30–0.39 for SM13, and 0.08–0.12 for SM20. Except for SM13, the decoupled values are within the dry to intermediate soil moisture range. The cumulative residual variance line for SM13 appears to increase exponentially with increasing surface soil moisture. This differs from the other three sites which show a distinct decrease in slope at increasing soil moisture values. For SM20, a second point is identified with a change in slope. The flat line starting from 0.24–0.28 cm3 cm−3 indicates there is still lowered variance at the very wet soil moisture range.

The correlation between normalized variance and sample size yielded a value of 0.24 (Fig. 6). This low-correlation magnitude confirms that the variance obtained for the soil surface moisture intervals was not strongly influenced by the sample size used.

Figure 6Scatter plot of sample size vs. normalized residual variance calculated for each 0.01 cm3 cm−3 interval. Colors indicate soil moisture conditions at each point. The plot of points show very weak linear dependence, which is further confirmed by a 0.24 correlation coefficient.

## 4.2 Cross-correlation

Figure 7 shows cross-correlation values at the four sites. Maximum correlation occurs at 1 to 2 days of lag, except at SM20. This translates to a 1–2-day lead of surface soil moisture values. For SM20, the maximum correlation occurs at positive lags. Correlation values from lag = 0 to lag = 10 are almost equal at SM20. Although this indicates leading subsurface values, it does not eliminate the possibility of having a lag between surface and subsurface values (see Sect. 5.2). Other factors may play a role in having leading subsurface values in the cross-correlation plots. Hence, SM20 was still analyzed for decoupling using DLNM.

## 4.3 Distributed-lag model

Figure 8 shows the overall $\stackrel{\mathrm{^}}{\mathit{\beta }}$ for each surface soil moisture value with 5 and 95 % confidence intervals in shaded gray regions. In order to identify a range that is decoupled, a threshold value (${\stackrel{\mathrm{^}}{\mathit{\beta }}}_{\mathrm{c}}$) must be specified. This value is comparable to the intermediate soil moisture θc identified from Sect. 4.1. The values of θc provided a suitable guide for identifying a threshold common to all four sites (Table 2). The corresponding $\stackrel{\mathrm{^}}{\mathit{\beta }}$ values obtained at θc were very close to 1; therefore, setting the threshold ${\stackrel{\mathrm{^}}{\mathit{\beta }}}_{\mathrm{c}}$= 1 seemed a reasonable choice. This was preferred over the exact $\stackrel{\mathrm{^}}{\mathit{\beta }}$ at each θc since the latter was defined using exploratory methods at lag = 0. Using the chosen ${\stackrel{\mathrm{^}}{\mathit{\beta }}}_{\mathrm{c}}$= 1, surface soil moisture values with $\stackrel{\mathrm{^}}{\mathit{\beta }}$< 1 are considered decoupled while those with $\stackrel{\mathrm{^}}{\mathit{\beta }}$ 1 are coupled.

Based on ${\stackrel{\mathrm{^}}{\mathit{\beta }}}_{\mathrm{c}}$, the identified decoupled values are generally in the dry to intermediate soil moisture range (Fig. 8), except for SM13 where decoupled values are at the wet range. Table 2 shows the decoupled values identified based on the selected ${\stackrel{\mathrm{^}}{\mathit{\beta }}}_{\mathrm{c}}$. The behavior and trends of $\stackrel{\mathrm{^}}{\mathit{\beta }}$ also differ for each station. For instance, at SM05 and SM09, there is a general increase in $\stackrel{\mathrm{^}}{\mathit{\beta }}$ from dry towards wet surface soil moisture values. SM20 also shows increasing $\stackrel{\mathrm{^}}{\mathit{\beta }}$ over a limited soil moisture range (0.1–0.25 cm3 cm−3). Outside this range, the estimated $\stackrel{\mathrm{^}}{\mathit{\beta }}$ values for SM20 were less than 1 and have very broad confidence intervals. Recall that the range used for DLNM was only for uniformity among the four study sites. The lack of or very few observations for very dry or very wet soil moisture conditions led to wider confidence intervals not only for SM20 but also for the other three sites. Compared to the three sites, the estimated $\stackrel{\mathrm{^}}{\mathit{\beta }}$ values for SM13 show decreasing values towards the wet soil moisture range (> 0.3 cm3 cm−3). From the intermediate to dry soil moisture conditions, the values fluctuate around the designated ${\stackrel{\mathrm{^}}{\mathit{\beta }}}_{\mathrm{c}}$.

Table 2List of surface soil moisture values (SSM in cm3 cm−3) obtained from Figs. 5 and 8. SSM at θc in Fig. 5 were used to determine $\stackrel{\mathrm{^}}{\mathit{\beta }}$ in Fig. 8. A common threshold ${\stackrel{\mathrm{^}}{\mathit{\beta }}}_{\mathrm{c}}$ was used for all sites since $\stackrel{\mathrm{^}}{\mathit{\beta }}$ are all close to 1. The resulting decoupled SSM values are shown in the fourth column.

Figure 7Cross-correlation plots of soil moisture values. The lagged surface soil moisture values at 5 cm are correlated with subsurface values at 40 cm. A 1–2 day lead of surface soil moisture is observed, except for SM20. This is indicated by maximum correlation values at lags of 1 to 2 days. At SM20, the maximum correlation occurs at positive lags.

Figure 8The relative influence of surface soil moisture on subsurface values obtained by summing the predicted $\stackrel{\mathrm{^}}{\mathit{\beta }}$ along the 30-day lag. The threshold value (${\stackrel{\mathrm{^}}{\mathit{\beta }}}_{\mathrm{c}}$) used to identify the decoupled range is indicated by the horizontal line. Surface soil moisture values below ${\stackrel{\mathrm{^}}{\mathit{\beta }}}_{\mathrm{c}}$ are considered decoupled. The 5 and 95 % confidence intervals of the predicted values are shown as shaded regions.

5 Discussion

## 5.1 Decoupled soil moisture values

Regression and residual analyses show that there is an inherent vertical variability between surface and subsurface soil moisture values based on the lack of 1 : 1 correspondence between the two (Fig. 4). This inherent variability is also not uniform as higher variability is observed at certain soil moisture ranges. The cumulative residual variance plots (Fig. 5) clearly indicate the soil moisture values where vertical variability starts to become consistently larger. The increase in variability further translates to weak lagged dependence which we observe as low $\stackrel{\mathrm{^}}{\mathit{\beta }}$ values from DLNM. The increase in vertical variability and weakening of lagged dependence is what we considered as decoupling between the surface and subsurface soil moisture.

Both residual analysis and DLNM were successful in identifying a decoupled soil moisture range, and there is good agreement between the results from both. Three out of four sites show decoupled values in the dry to intermediate soil moisture range (Fig. 5 and Table 2). These results agree with the known range where decoupling is expected . For SM05 and SM09, the intermediate soil moisture value θc that marks when decoupling begins (Table 2) is close to that identified by . They obtained a value of 0.3 cm3 cm−3 as the point below which decoupling begins. However, results for SM13 do not conform to the traditional concept of decoupling. This result is significant as it implies that decoupling may occur at any value and is not confined to dry soil moisture range.

The vegetation type at each site exerts some influence on the soil moisture variability and the resulting (de)coupled values. First, the vegetation type affects how much ground surface is directly exposed to atmospheric conditions. Forested areas and grass fields are almost fully covered by vegetation compared to a corn field where the crops are organized in equidistant rows. Vegetation or canopy cover will determine how atmospheric conditions affect the soil moisture values. For instance, the amount of intercepted precipitation and evaporation are both dependent on vegetation cover. This in turn will have direct impacts to the surface soil moisture dynamics at each of the sites. For comparison, the variability given by the standard deviation bars in Fig. 4 and variance in Fig. 5 at the cornfield (SM09) is higher compared to that of the grass field (SM05) or the forested area (SM20). In addition, the forested area (SM20) has the smallest range of soil moisture values among the four sites. This may be due to the large intercepted rainfall by the forest canopy. Root water uptake (RWU) is another method by which vegetation affects soil moisture variability. RWU can have significant influence on the subsurface dynamics. The influence of RWU may vary for different vegetation types as it can be exerted over a range of depths, leading to differences in the resulting (de)coupled values.

Among the four sites, the subsurface trends observed for the 40 cm values at SM13 show consistently high values, which can be more pronounced during winter months. This resulted in decoupling during wet soil moisture conditions (Fig. 8). This trend is different from the other three sites, which only show a slight increase in the subsurface values. Further inspection of the time series data at SM13 reveals no sudden disturbance in the signal which could be attributed to errors in the sensor. Field investigation confirmed an increase in silt content at 40 cm compared to the upper layers. The increase in silt content promotes a decrease in hydraulic conductivity over depth that results in a slower vertical flow towards deeper layers. The presence of burrowing and hibernating animals was also observed at the site during winter. These create macropores which eventually alter the hydraulic properties of the soil . We infer that, at the measurement domain of the sensor, these burrows or macropores facilitated faster vertical flow to the subsurface. Alternatively, if the burrows produced voids around the measurement domain, this would result in lowered soil moisture or data gaps due to the loss of sensor-to-soil-media contact. However, there were no gaps observed that coincided with the burrowing animals' period of hibernation. During precipitation events, soil moisture flowing from upper layers arrived more rapidly at 40 cm depths due to the presence of macropores. There it accumulated and flowed more slowly to deeper layers because of the low hydraulic conductivity promoted by the increase in silt content. The overall effect of these factors was the pronounced increase in soil moisture values at 40 cm compared to those at 5 cm during winter periods, as observed from the time series dataset (Fig. 2).

Site-specific characteristics at each station control the magnitude of variability as well as the range at which decoupling is observed. However, the occurrence of decoupling is independent of the magnitude of variability since it was observed from SM05, where variability is least up to that of SM13 where it is greatest. The methods applied in this study only identify conditions when decoupling occurs but do not explicitly determine its controls. Identification of controls for decoupling requires a separate analysis where mechanistic models or statistical approaches can be applied.

## 5.2 Assessing the use of lagged dependence for identifying decoupled conditions

To assess the applicability of the methods applied, we further discuss their strengths and weaknesses. We also present opportunities for further studies as well as foreseen limitations for other sites.

### 5.2.1 Strengths

The residual analysis and DLNM methods allow quantification of a range of soil moisture values where decoupling occurs. This provides further extension to previous studies where decoupling is only described qualitatively. As seen from the results at the four sites, decoupling can occur at any soil moisture value, and is not confined to dry periods or ranges. Furthermore, by making no initial assumptions on data distributions and the type of functional relation and lag structure, the methods applied were considered robust. Nonlinear functions were applied as they conform to the nonlinearity of water flow in the unsaturated zone. They can also handle a variety of bivariate dependence, even in cases where the relation is linear, as shown by the highly similar fit of the loess and linear functions in Sect. 4.1.

### 5.2.2 Weaknesses

The first aspect that needs to be further investigated is the selected ${\stackrel{\mathrm{^}}{\mathit{\beta }}}_{\mathrm{c}}$ value for identifying the decoupled soil moisture range. Although the selection in this study was based on trends identified from time series datasets, the methods applied should be tested further using other datasets to confirm the suitability of ${\stackrel{\mathrm{^}}{\mathit{\beta }}}_{\mathrm{c}}$= 1 for other depths and soil types. The choice of βc is crucial as it dictates which soil moisture values are expected to be decoupled. For instance, at the sites where decoupling occurs during dry conditions, a higher βc value would enlarge the decoupled range. A similar effect would be expected for the site with decoupling during wet conditions. However, a lower βc value could result to decoupling only during extreme soil moisture conditions (e.g., very wet or very dry).

Another aspect to further examine is the use of cross-correlation for confirming the presence of leading surface soil moisture values. Results from SM20 show maximum correlation at positive lags which indicate leading subsurface values (Fig. 7). The weakness of using cross-correlation as a test for the presence of lag can be two-fold. First, cross-correlation can also capture the effect of subsurface dynamics such as groundwater influence and lateral flow. We infer that in SM20, subsurface dynamics dominates and masks the lag relation sought. An additional covariate representing subsurface dynamics was not included in the DLNM analysis since a dominant downward vertical flow was assumed. This assumption was based on the flat slopes encountered at SM20 (Table 1). Therefore, the occurrence of subsurface lateral flow or groundwater influence pose limitations to the applicability of DLNM for assessing decoupling. Second, cross-correlation is limited to evaluating linear lagged dependence and, in incorporating nonlinear lagged dependence, can make the test more robust. Equivalent methods exist (e.g., mutual information content; ) but they are much more computationally demanding when the goal is simply to check for the existence of lag–lead relation.

Figure 9Subsurface soil moisture dynamics for vertically discrete (40 cm) and depth-average value. Left panels: time series of soil moisture at 40 cm and depth-averaged values. The dynamics observed for depth-average values are highly similar to those at 40 cm. Right panels: scatter plot showing that these two sets of values are highly correlated.

### 5.2.3 Opportunities

In relation to utilizing remote sensing techniques, our results imply that the accuracy of estimating subsurface values from surface soil moisture can be greatly affected by vertical coupling. Lower variability and hence lower uncertainties are expected in the coupled soil moisture range. Assessment of decoupling can be used in combination with modeling studies as a preliminary method to determine the range where variability is expected to be higher. Furthermore, it can be helpful in assessing whether simulation results capture the variabilities observed in both the coupled and decoupled ranges. Taking decoupling into account can also assist in evaluating the necessity of complex models for simulating vertical soil moisture content.

For data assimilation applications, (de)coupling methods can be used for cross-comparison of the vertical coupling derived from DA model outputs with those observed from long term in situ measurements. This can aid in examining the adequacy of the assumed inherent connection between surface and subsurface values. As pointed out, land surface models vary in their representation of the strength of this connection (e.g., weak or strong connection), which contributes to the degree to which modeling results are improved. They also suggested that strong coupling is a more robust choice unless independent information suggests that a more decoupled surface–subsurface representation is more realistic. In this aspect, the analysis applied in this study could be a valuable tool in determining which type of surface–subsurface coupling is the more optimal choice. Furthermore, the assumed connection strength is adopted for the whole range of soil moisture values. The results of our analysis show that at any given site, decoupling will occur regardless of the degree of soil moisture variability. A variable coupling strength could be adopted based on the soil moisture range where decoupling is likely to occur as an alternative to the single value for the whole range.

Although the study focused on vertically discrete values, the results are also applicable for depth-average values commonly used in remote sensing and DA applications. This requires that the vertically discrete values adequately capture the overall dynamics within zone being investigated. In such a case, we infer that the translation to depth-averaged values would result in (de)coupled values that are close, but not identical, to the values obtained when only comparing two discrete depths. As an illustration, we calculated the depth-average values using all the available measurements at each site (i.e., 5, 10, 20, and 40 cm depth) following the formula from . Figure 9 (left) reveals highly similar dynamics for both discrete and depth-average values. Therefore, it can be expected that the results from a regression and DLNM analyses using depth-average values would be highly similar to the original results in Figs. 5 and 8. However, if the vertically discrete values insufficiently represent the subsurface dynamics, larger deviations in the resulting decoupled values can be expected.

### 5.2.4 Limitations

In this study, only meteorological factors were incorporated into the DLNM analysis since vertical movement was assumed to be the dominant flow mechanism. However, the subsurface can also be influenced by lateral movement or groundwater by capillary rise. In such scenarios, decoupling will not be limited to changes in surface conditions. For this, SM20 provides an excellent example. This station is located at the foot of a small hill (Fig. 2) where the occurrence of lateral subsurface movement is highly probable. This shows that although the analysis would be limited to smaller scales, or even a single point, recognition of regional setting is important for interpretation of results. In addition, subsurface dynamics can also be affected by capillary rise in areas with shallow groundwater. For future applications, the effect of both capillary rise and lateral movements to subsurface dynamics should be assessed and included in the DLNM analysis, but caution should be exercised when interpreting the results. Assessment of decoupling with DLNM is deemed more applicable to areas where the subsurface has insignificant groundwater influence and where vertical downward movement is the dominant flow mechanism.

6 Conclusion

The methods applied in this study allow for investigation of vertical soil moisture variability. More importantly, application of DLNM allowed for a decoupled soil moisture range to be quantitatively identified. The results also reveal that decoupling is not confined to dry soil moisture range as implied by previous studies. The reasons for decoupling are manifold, and controls for the dry soil moisture range may differ from those for the wet range. The results of this study have implications for remote sensing and data assimilation methods, especially for uncertainties related to the use of surface soil moisture to obtain integrated soil moisture values.

Data availability
Data availability.

The datasets for soil moisture were obtained from the Water Resource Department of ITC-Twente University. At the moment, the datasets are not publicly available. Access to the datasets may be granted upon request from the institute through Rogier van der Velde, PhD (r.vandervelde@utwente.nl).

Author contributions
Author contributions.

CDUC and MPvdP initially conceptualized the idea for investigating the relation between surface and subsurface soil moisture values. PJJFT provided significant contributions to the statistical analysis applied. All three authors contributed to writing and editing of the paper.

Competing interests
Competing interests.

The authors declare that they have no conflict of interest.

Acknowledgements
Acknowledgements.

The authors are grateful for the Water Resources Department of ITC-Twente University, the Netherlands, for sharing the datasets from their network. We thank the three anonymous reviewers for providing critical insights that improved the paper. This work is part of the research programme Optimizing Water Availability through Sentinel-1 Satellites (OWAS1S) with project number 13871 which is (partly) financed by the Netherlands Organisation for Scientific Research (NWO). We also thank colleagues in the lab meeting group for insightful discussions as well as Demi Moore for proofreading earlier versions of the text.

Edited by: Nunzio Romano
Reviewed by: three anonymous referees

References

Albertson, J. D. and Montaldo, N.: Temporal dynamics of soil moisture variability: 1. Theoretical basis, Water Resour. Res., 39, 1274, https://doi.org/10.1029/2002WR001616, 2003. a

Almon, S.: The distributed lag between capital appropriations and expenditures, Econometrica, 33, 178–196, 1965. a

Angers, D. A. and Caron, J.: Plant-induced changes in soil structure: processes and feedbacks, in: Plant-induced soil changes: processes and feedbacks, Springer Netherlands, 55–72, 1998. a

Belmans, C., Wesseling, J., and Feddes, R. A.: Simulation model of the water balance of a cropped soil: SWATRE, J. Hydrol., 63, 271–286, 1983. a

Beven, K. and Germann, P.: Macropores and water flow in soils revisited, Water Resour. Res., 49, 3071–3092, 2013. a

Bhaskaran, K., Gasparrini, A., Hajat, S., Smeeth, L., and Armstrong, B.: Time series regression studies in environmental epidemiology, Int. J. Epidemiol., 42, 1187–1195, 2013. a

Brocca, L., Ciabatta, L., Massari, C., Camici, S., and Tarpanelli, A.: Soil Moisture for Hydrological Applications: Open Questions and New Opportunities, Water, 9, 1–20, 2017. a

Capehart, W. J. and Carlson, T. N.: Decoupling of surface and near-surface soil water content: A remote sensing perspective, Water Resour. Res., 33, 1383–1395, 1997. a, b, c, d, e

Cleveland, R. B., Cleveland, W. S., and Terpenning, I.: STL: A seasonal-trend decomposition procedure based on loess, J. Offic. Stat., 6, 3–73, 1990. a

Cleveland, W. S. and Devlin, S. J.: Locally weighted regression: an approach to regression analysis by local fitting, J. Am. Stat. Assoc., 83, 596–610, 1988. a

Das, N., Mohanty, B., Cosh, M., and Jackson, T.: Modeling and assimilation of root zone soil moisture using remote sensing observations in Walnut Gulch Watershed during SMEX04, Remote Sens. Environ., 112, 415–429, 2008. a

De Lannoy, G. J., Verhoest, N. E., Houser, P. R., Gish, T. J., and Van Meirvenne, M.: Spatial and temporal characteristics of soil moisture in an intensively monitored agricultural field (OPE 3), J. Hydrol., 331, 719–730, 2006. a

Dente, L., Verkedy, Z., Su, Z., and Ucer, M.: Twente Soil Moisture and Soil Temperature Monitoring Network, Tech. rep., ITC – University of Twente, Enschede, the Netherlands, http://www.itc.nl/library/papers_2011/scie/dente_twe.pdf (last access: 11 April 2018), 2011. a, b, c

Ford, T. W., Harris, E., and Quiring, S. M.: Estimating root zone soil moisture using near-surface observations from SMOS, Hydrol. Earth Syst. Sci., 18, 139–154, https://doi.org/10.5194/hess-18-139-2014, 2014. a

Gasparrini, A.: Distributed lag linear and non-linear models in R: the package dlnm, J. Stat. Softw., 43, 1–20, 2011. a

Gasparrini, A., Armstrong, B., and Kenward, M.: Distributed lag non-linear models, Stat. Med., 29, 2224–2234, 2010. a, b, c, d

Gasparrini, A., Scheipl, F., Armstrong, B., and Kenward, M.: A penalized framework for distributed lag non-linear models, Biometrics, 73, 938–948, https://doi.org/10.1111/biom.12645, 2017. a, b

Hirschi, M., Mueller, B., Dorigo, W., and Seneviratne, S.: Using remotely sensed soil moisture for land–atmosphere coupling diagnostics: The role of surface vs. root-zone soil moisture variability, Remote Sens. Environ., 154, 246–252, 2014. a

Houser, P. R., Shuttleworth, W. J., Famiglietti, J. S., Gupta, H. V., Syed, K. H., and Goodrich, D. C.: Integration of soil moisture remote sensing and hydrologic modeling using data assimilation, Water Resour. Res., 34, 3405–3420, 1998. a

Hupet, F. and Vanclooster, M.: Intraseasonal dynamics of soil moisture variability within a small agricultural maize cropped field, J. Hydrol., 261, 86–101, 2002. a

Jackson, T. J.: III. Measuring surface soil moisture using passive microwave remote sensing, Hydrol. Process., 7, 139–152, 1993. a

Kim, G. and Barros, A. P.: Space–time characterization of soil moisture from passive microwave remotely sensed imagery and ancillary data, Remote Sens. Environ., 81, 393–403, 2002. a

Kodešová, R., Kodeš, V., Žigová, A., and Šimúnek, J.: Impact of plant roots and soil organisms on soil micromorphology and hydraulic properties, Biologia, 61, S339–S343, 2006. a

Koster, R. D., Dirmeyer, P. A., Guo, Z., Bonan, G., Chan, E., Cox, P., Gordon, C., Kanae, S., Kowalczyk, E., Lawrence, D., et al.: Regions of strong coupling between soil moisture and precipitation, Science, 305, 1138–1140, 2004. a

Kumar, S. V., Reichle, R. H., Koster, R. D., Crow, W. T., and Peters-Lidard, C. D.: Role of subsurface physics in the assimilation of surface soil moisture observations, J. Hydrometeorol., 10, 1534–1547, 2009. a, b

Mahmood, R., Littell, A., Hubbard, K. G., and You, J.: Observed data-based assessment of relationships among soil moisture at various depths, precipitation, and temperature, Appl. Geogr., 34, 255–264, 2012. a

Martinez, C., Hancock, G., Kalma, J., and Wells, T.: Spatio-temporal distribution of near-surface and root zone soil moisture at the catchment scale, Hydrol. Process., 22, 2699–2714, 2008. a

Mohanty, B. P. and Skaggs, T.: Spatio-temporal evolution and time-stable characteristics of soil moisture within remote sensing footprints with varying soil, slope, and vegetation, Adv. Water Resour., 24, 1051–1067, 2001. a

Mohanty, B. P., Cosh, M. H., Lakshmi, V., and Montzka, C.: Soil Moisture Remote Sensing: State-of-the-Science, 16, 1–9, https://doi.org/10.2136/vzj2016.10.0105, 2017.  a

Njoku, E. G., Jackson, T. J., Lakshmi, V., Chan, T. K., and Nghiem, S. V.: Soil moisture retrieval from AMSR-E, IEEE T. Geosci. Remote, 41, 215–229, 2003. a

Obermeier, V., Scheipl, F., Heumann, C., Wassermann, J., and Küchenhoff, H.: Flexible distributed lags for modelling earthquake data, J. Roy. Stat. Soc. C, 64, 395–412, 2015. a

Qiu, J., Crow, W. T., Nearing, G. S., Mo, X., and Liu, S.: The impact of vertical measurement depth on the information content of soil moisture times series data, Geophys. Res. Lett., 41, 4997–5004, 2014. a, b

Rodriguez-Iturbe, I., Porporato, A., Ridolfi, L., Isham, V., and Cox, D.: Probabilistic modelling of water balance at a point: the role of climate, soil and vegetation, P. Roy. Soc. Lond. A, 455, 3789, https://doi.org/10.1098/rspa.1999.0477, 1999. a

Shumway, R. H. and Stoffer, D. S.: Time series analysis and its applications: with R examples, Springer Science & Business Media, New York, 2010. a, b

Wilson, D. J., Western, A. W., Grayson, R. B., Berg, A. A., Lear, M. S., Rodell, M., Famiglietti, J. S., Woods, R. A., and McMahon, T. A.: Spatial distribution of soil moisture over 6 and 30 cm depth, Mahurangi river catchment, New Zealand, J. Hydrol., 276, 254–274, 2003. a, b

Wilson, D. J., Western, A. W., and Grayson, R. B.: Identifying and quantifying sources of variability in temporal and spatial soil moisture observations, Water Resour. Res., 40, W02507, https://doi.org/10.1029/2003WR002306, 2004. a

Wood, S. N.: Generalized Additive Models: An Introduction with R, CRC Press, Wageningen, the Netherlands, 2006a. a

Wood, S. N.: Low-Rank Scale-Invariant Tensor Product Smooths for Generalized Additive Mixed Models, Biometrics, 62, 1025–1036, https://doi.org/10.1111/j.1541-0420.2006.00574.x, 2006b. a

Wosten, J., de Vries, F., Hoogland, T., Massop, H., Veldhuizen, A., Vroon, H., Wesseling, J., Heijkers, J., and Bolman, A.: BOFEK2012, de nieuwe bodemfysische schematisatie van Nederland, Tech. rep., Alterra, Wageningen, the Netherlands, 2013. a, b

Wu, W., Xiao, Y., Li, G., Zeng, W., Lin, H., Rutherford, S., Xu, Y., Luo, Y., Xu, X., Chu, C., et al.: Temperature–mortality relationship in four subtropical Chinese cities: a time-series study using a distributed lag non-linear model, Sci. Total Environ., 449, 355–362, 2013. a

Yu, G.-R., Zhuang, J., Nakayama, K., and Jin, Y.: Root water uptake and profile soil water as affected by vertical root distribution, Plant Ecol., 189, 15–30, https://doi.org/10.1007/s11258-006-9163-y, 2007. a

Zanobetti, A., Schwartz, J., Samoli, E., Gryparis, A., Touloumi, G., Atkinson, R., Le Tertre, A., Bobros, J., Celko, M., Goren, A., Forsberg, B., Michelozzi, P., Rabczenko, D., Aranguez Ruiz, E., and Katsouyanni, K.: The temporal pattern of mortality responses to air pollution: a multicity assessment of mortality displacement, Epidemiology, 13, 87–93, 2002. a