The relationship between terrestrial water storage (TWS) and atmospheric processes has important implications for predictability of climatic extremes and projection of future climate change. In places where moisture availability limits evapotranspiration (ET), variability in TWS has the potential to influence surface energy fluxes and atmospheric conditions. Where atmospheric conditions, in turn, influence moisture availability, a full feedback loop exists. Here we developed a novel approach for measuring the strength of both components of this feedback loop, i.e., the forcing of the atmosphere by variability in TWS and the response of TWS to atmospheric variability, using satellite observations of TWS, precipitation, solar radiation, and vapor pressure deficit during 2002–2014. Our approach defines metrics to quantify the relationship between TWS anomalies and climate globally on a seasonal to interannual timescale. Metrics derived from the satellite data were used to evaluate the strength of the feedback loop in 38 members of the Community Earth System Model (CESM) Large Ensemble (LENS) and in six models that contributed simulations to phase 5 of the Coupled Model Intercomparison Project (CMIP5). We found that both forcing and response limbs of the feedback loop in LENS were stronger than in the satellite observations in tropical and temperate regions. Feedbacks in the selected CMIP5 models were not as strong as those found in LENS, but were still generally stronger than those estimated from the satellite measurements. Consistent with previous studies conducted across different spatial and temporal scales, our analysis suggests that models may overestimate the strength of the feedbacks between the land surface and the atmosphere. We describe several possible mechanisms that may contribute to this bias, and discuss pathways through which models may overestimate ET or overestimate the sensitivity of ET to TWS.
Land–atmosphere feedbacks can result from the coupling of the terrestrial moisture state with temperature, precipitation, or radiation (Betts et al., 2014; Findell and Eltahir, 1997; Guillod et al., 2015; Koster et al., 2004). Land–atmosphere coupling occurs when terrestrial moisture anomalies influence the partitioning of surface energy between latent and sensible heat fluxes that, in turn, influence the development of the planetary boundary layer (PBL) (Seneviratne et al., 2010). Temperature coupling generally leads to a positive feedback, with wetter soil contributing to a higher evaporative fraction (EF; the ratio of the latent heat flux to the sum of the sensible and latent heat fluxes), a lower surface temperature, and decreased evaporative demand (Hirschi et al., 2011; Miralles et al., 2012). Precipitation coupling can lead to both positive and negative feedbacks, as the influence of EF on the development of the PBL can serve to either enhance or suppress cloud formation and precipitation (Findell and Eltahir, 2003; Guillod et al., 2015). Cloud radiative coupling can likewise lead to positive or negative feedbacks as insolation and evaporative demand, as a function of cloud cover, are either enhanced or suppressed (Betts, 2009; Cheruy et al., 2014). Temperature, precipitation, and radiation feedbacks each stem from coupling between terrestrial moisture and evapotranspiration (ET), which occurs most strongly in conditions of intermediate moisture availability (Seneviratne et al., 2010).
Evidence of these feedbacks has been observed in both in situ and remotely sensed data (Eltahir, 1998; Findell and Eltahir, 1997; Guillod et al., 2014, 2015). Some observational analyses have found land–atmosphere feedback strength to be relatively weak compared to the influence of large-scale atmospheric forcing (Alfieri et al., 2008; Phillips and Klein, 2014). Other observational studies have highlighted the role of these feedback mechanisms in the initiation and exacerbation of climatic extremes such as droughts and heat waves (Hirschi et al., 2011; Miralles et al., 2014; Whan et al., 2015).
Large-scale land–atmosphere coupling in general circulation models has been demonstrated by a series of experiments from the Global Land–Atmosphere Coupling Experiment (GLACE) project (Guo et al., 2006; Koster et al., 2004, 2006). The GLACE efforts found that coupled climate models differed greatly in the extent to which soil moisture variations affect precipitation and surface air temperature, but models generally agreed on the spatial distribution of relative coupling strength, with “hotspots” of strong coupling during boreal summer found in the central United States, northern Amazonia, the Sahel, western Eurasia, and northern India. These hotspots were found in regions of intermediate soil wetness, which is consistent with the understanding that strong land–atmosphere coupling occurs under conditions in which terrestrial moisture availability limits ET (Seneviratne et al., 2010). GLACE efforts also showed that correct soil moisture initialization improves seasonal forecast skill of temperature and, to a lesser extent, precipitation, particularly in cases with a large initial soil moisture anomaly (Koster et al., 2010, 2011).
Additional studies have considered land–atmosphere feedbacks in the coupled Earth system models (ESMs) used by the Intergovernmental Panel on Climate Change (IPCC) (Dirmeyer et al., 2013; Notaro, 2008; Seneviratne et al., 2006, 2013). Notaro (2008) was able to confirm the boreal summer GLACE hotspots, as well as identify several additional austral summer hotspots, in the models used for the IPCC Fourth Assessment Report (AR4). Analysis of long-term projections from phase 5 of the Coupled Model Intercomparison Project (CMIP5) indicated an increased control of land surface moisture on boundary layer conditions with climate change (Dirmeyer et al., 2013). The GLACE-CMIP5 experiment found that modeled coupling strength plays an important role in simulated response to global warming, with greater warming evident in more strongly coupled models due to interactions between soil moisture, temperature, and precipitation (Berg et al., 2015; May et al., 2015; Seneviratne et al., 2013).
Despite the importance of land–atmosphere coupling in both short-term predictability of climatic extremes and long-term uncertainty in climate change, validation efforts have suggested that climate models may not be correctly representing the strength, and in some cases even the sign, of these feedbacks (Ferguson et al., 2012; Hirschi et al., 2014). The metrics developed for the GLACE are based on model experiments with no direct observational equivalents. However, correlation-based metrics that do enable direct comparison with observations suggest that models may overestimate land–atmosphere coupling strength (Dirmeyer et al., 2006a). Zeng et al. (2010) found that version 3 of the Community Climate System Model (CCSM3) showed a higher coupling strength than reanalysis or observational data. Mei and Wang (2012) found that coupling strength was reduced when the Community Atmosphere Model (the land surface component of CCSM3) was updated from version 3 (CAM3) to version 4 (CAM4), though the coupling strength of the updated version was still stronger than observations and reanalysis.
The Local Land–Atmosphere Coupling (LoCo) project has focused on developing a suite of metrics for diagnosing land–atmosphere coupling strength in observations and models. LoCo metrics consider both the influence of soil moisture on EF, and the influence of EF on diurnal-scale boundary layer development (Santanello et al., 2009). Ferguson et al. (2012) used the LoCo approach to compare global remote sensing data sets of soil moisture, EF, and the lifting condensation level with several land surface models and reanalyses. They found that even though the models were able to simulate the correct spatial pattern of stronger coupling in moist–arid transitional regions, the models tended to simulate a stronger influence of soil moisture on surface turbulent fluxes than what was observed in the satellite data. Guillod et al. (2014) used a combination of flux tower, remote sensing, and reanalysis data sets to demonstrate that the measured strength of coupling between EF and precipitation depends greatly on the data source and scale, and that a strong coupling apparent in a previous analysis (Findell et al., 2011) was not consistent with the observations.
While many of the previously mentioned studies have confirmed the long-standing suspicion that models may overestimate coupling strength relative to observations, more recent work has indicated that observations and models may not even agree on the sign of the precipitation feedback. Taylor et al. (2012) performed a spatial analysis of the relationship between soil moisture and afternoon precipitation using data from remote sensing, reanalysis, and coupled models. They found evidence of a negative feedback in the remote sensing observations, with afternoon rain being more likely over regions of drier soil, as opposed to the positive feedback that was apparent in the models. Guillod et al. (2015) addressed these findings by replicating the spatial analysis and complementing it with a temporal analysis. They found a negative spatial feedback, consistent with the one found by Taylor et al. (2012), but a positive temporal feedback, with afternoon precipitation at a given location being more likely after mornings of relatively moist soil.
These studies highlight the need for continued efforts toward evaluating the coupling strength of models relative to observations using a wide array of data sources at a range of spatiotemporal scales. Apparent coupling strength depends greatly on the spatial scales of analysis (Hohenegger et al., 2009), indicating that observations at the scale of flux towers should not be expected to yield the same coupling strength as those at the scale of global climate models (Guillod et al., 2014). Consistency between the spatial scale of observations and models is greatly assisted by Earth observation satellites that have been continuously monitoring several relevant land surface and atmospheric variables over multiple years (Teixeira et al., 2014). Measured or modeled coupling strength will also depend on the timescales in question (Guillod et al., 2015), and while the LoCo efforts have improved the understanding of synoptic and diurnal-scale mechanisms, there is an additional need to examine these processes on seasonal to interannual time periods.
Here we introduce a set of metrics for measuring the strength of land–atmosphere interactions on seasonal timescales by combining satellite remote sensing data sets of terrestrial water storage, precipitation, shortwave radiation, and surface atmospheric temperature and water vapor during 2002–2014. These new metrics complement previous studies and are unique in several ways. In particular, we designed our metrics to consider interannual variability of entire seasons in order to complement the temporal resolution of LoCo metrics, which focus on day-to-day variability within one or more seasons. Land–atmosphere coupling on seasonal timescales has been shown to be essential in enabling tropical forests to survive during the dry season in the Amazon (Lee et al., 2005) and as a mechanism enabling seasonal forecasts of fire risk (Chen et al., 2013, 2016).
Until recently, studies using remote sensing data to look for evidence of land–atmosphere coupling relied on products that provide information about surface soil moisture (Ferguson et al., 2012; Taylor et al., 2012). Consideration of root-zone soil moisture has recently been accomplished only indirectly via data-assimilated estimates (Guillod et al., 2015). The inability to directly consider root-zone soil moisture has been suggested as an explanation for the relatively weak coupling observed using remote sensing data (Hirschi et al., 2014). In order to include root-zone soil moisture, as well as other sources of moisture available across entire seasons, here we analyzed remote sensing data of the entire terrestrial water storage (TWS) column.
The metrics introduced here were specifically designed to use the monthly TWS anomaly (TWSA) product from the Gravity Recovery and Climate Experiment (GRACE) mission (Landerer and Swenson, 2012; Wahr et al., 2004). The GRACE TWSA product integrates soil moisture at all layers along with surface, canopy, snow/ice, and aquifer storage, as each of these components represents a potential source of moisture for fulfilling evaporative demand. For example, in areas where agricultural ecosystems are important, diversion of lake and river water resources and withdrawal from aquifers may contribute to irrigation fluxes and thus ET. Furthermore, surface storage of liquid water and snow represents sources of water that are available for and potentially limiting to ET. Under these conditions, month-to-month TWS anomalies capture portions of the terrestrial water cycle that soil moisture alone may not.
Previous studies have largely focused on land surface moisture availability as a forcing mechanism on the atmosphere, as this relationship has important implications for seasonal predictability as well as the projection of the frequency and severity of climatic extremes. However, the land surface response to the atmosphere is governed by many of the same processes through which terrestrial moisture availability forces atmospheric conditions, and it determines the conditions that drive subsequent land surface forcing. It is therefore critical to assess the response of land surface moisture to atmospheric conditions, as an accurate representation of these processes is essential for generating the correct terrestrial moisture variability that will go on to influence the atmosphere. As far as we can tell, this response limb of the land surface feedback loop has not been systematically integrated with existing analyses of land–atmosphere coupling strength.
Our globally applicable approach used the annual cycle of TWS drawdown and recharge to isolate the months of the year during which the land surface loses moisture, which we refer to as the drawdown interval (Fig. 1a). We selected this interval because past work has shown that the land surface's influence on the atmosphere is most prevalent during summer in the Northern Hemisphere (Cheruy et al., 2014; Phillips and Klein, 2014) and during the dry season in tropical forests (Harper et al., 2013; Lorenz and Pitman, 2014). This approach allowed us to investigate land surface coupling at a global scale, and to extend metrics developed in previous work for pre-defined monthly intervals corresponding to boreal summer (e.g., Guo and Dirmeyer, 2013; Koster et al., 2006) to be applicable to any seasonality.
In our analysis, separate metrics were calculated to consider the influence of TWS at the onset of the drawdown interval on atmospheric conditions in subsequent months, and simultaneously, the influence of atmospheric conditions during the drawdown interval on terrestrial water storage at the end of the season. We refer to these two relationships as the forcing and response limbs, respectively, of the fully coupled feedback loop between the land surface and the atmosphere (Fig. 1b). We estimated the strength of these feedbacks during 2002–2014 using GRACE and other satellite remote sensing data (Table 1). We then used the satellite observations to evaluate the strength of these feedbacks in the Community Earth System Model (CESM) Large Ensemble (LENS) (Kay et al., 2014) and in several models that contributed simulations to CMIP5 (Table 2).
Conceptual description of coupling metrics:
We obtained level 3 TWSA data from GRACE using the University of Texas at
Austin Center for Space Research (CSR) spherical harmonic solutions
(Swenson, 2012). Global land data at a 1
Level 3 near-surface temperature and relative humidity were obtained
globally at a monthly, 1
As a first step, we used the mean annual cycle from GRACE to determine the
months of the maximum and minimum TWS anomalies in order to define the
drawdown interval at each 1
Remote sensing products used for analysis.
Month of maximum and minimum TWSA and the length of the drawdown
interval from GRACE
Existing literature generally defines “land–atmosphere coupling” as the extent to which atmospheric conditions are forced by the land surface state, and would use the term “atmosphere–land coupling” to refer to the land surface response to atmospheric drivers (Seneviratne et al., 2010). In this study, we develop what we refer to as “coupling metrics” to indicate the strength of both limbs of the fully coupled land–atmosphere feedback loop. We use the terms “forcing” and “response” to indicate whether we are considering the forcing of the atmosphere by the land surface, or the response of the land surface to the atmosphere (Fig. 1b).
We defined our forcing metric as the Pearson product-moment correlation
coefficient between the TWS anomaly at the onset of the drawdown interval (TWSA
Similarly, we defined our response metric as the correlation coefficient
between ATM
CMIP5 models used for analysis.
We note that these metrics do not provide distinctive information for measuring the strength of land–atmosphere coupling or the land surface response. While the metrics include the influence of direct land–atmosphere interactions, they are also potentially influenced by atmospheric and soil moisture persistence, as well as remote forcing from sea surface temperatures (SSTs) (Orlowsky and Seneviratne, 2010; Mei and Wang, 2011). Nevertheless, these metrics may still serve as useful benchmarks against which to evaluate the ability of ESMs to reproduce the proper relationships based on the combination of these factors.
Here we note that our evaluation of both the forcing and response metrics will follow a nomenclature that considers strong coupling as acting in the direction of an overall positive feedback loop. In regions with a strong positive feedback, higher than average TWS would be followed by lower than average VPD, as more available water is able to fulfill evaporative demand. Therefore, strong TWS forcing on VPD would be associated with a negative correlation coefficient. Higher VPD during the drawdown interval would increase evaporative demand, potentially leading to a more negative TWS anomaly; therefore, a strong response of the land surface to VPD would also be associated with a negative correlation coefficient.
Because the partitioning of surface fluxes can, depending on the
spatiotemporal scale, cause a change of either sign to cloudiness and
precipitation (Taylor et al., 2012; Guillod et al., 2015), correlation
coefficients of either sign could indicate strong land surface forcing on
PPT and SW
We used the metrics described above to evaluate feedback strength in the
CESM LENS. LENS comprises an
ensemble of 38 fully coupled runs in which air temperature initial
conditions are perturbed slightly (by an amount less than the round-off error)
to reveal the internal variability inherent within the coupled climate
model. LENS has demonstrated that the uncertainty in climate projections due
to internal climate variability inherent in CESM is comparable to the ranges
of output within the entire CMIP5 experiment (Kay et al., 2014). LENS uses
version 1 of CESM (CESM1) with version 5 of the Community Atmosphere Model (CAM5)
and version 4 of the Community Land Model (CLM4) at a horizontal
resolution of 1
The LENS data were chosen as a starting point for feedback evaluation for two reasons. First, the availability of a TWS variable in these simulations enabled a direct comparison with metrics derived using data from GRACE. The TWS field in CLM4 included water from surface and canopy storage, snow and ice, soil moisture, and a dynamic aquifer, in addition to river water storage terms from the coupled River Transport Module (RTM). The coupling of CLM4 with RTM has been shown to be important for simulating both the annual cycle and interannual variability of TWS in comparison with GRACE (Kim et al., 2009).
Second, the ensemble allowed us to test the importance of internal model variability for the diagnosis of feedback strength. Because the complete satellite record was relatively short (containing no more than 12 drawdown intervals at any location), comparison with an equivalent single time series of model output could be influenced by a model's internal decadal-scale variability (Kay et al., 2014). Analyzing the full ensemble from LENS enabled us to assess the sensitivity of our forcing and response metrics to this variability. We extracted from each ensemble member the equivalent months of the satellite record, with data prior to December 2005 coming from the historical runs, and data from January 2006 onward coming from the RCP8.5 simulations.
To assess the sensitivity of our metrics to observational uncertainty, we used a Monte Carlo sampling approach. For each of the 38 members of LENS, we calculated coupling metrics 10 times with random noise added to both TWSA and atmospheric variable time series at each grid cell. The noise was randomly generated from a Gaussian distribution with a mean of zero and a standard deviation equal to 25 % of the standard deviation of the original data. Comparing these results with those from the unaltered data provided some indication of how much our coupling metrics are degraded by random noise as an approximation of observational uncertainty.
In addition, to assess how our analysis may be influenced by uncertainty due to the selection of satellite data, we substituted data from the European Centre For Medium-range Weather Forecasting (ECMWF) Interim Reanalysis (ERA-Interim) (Dee et al., 2011) in place of AIRS-, GPCP-, and CERES-derived variables. We only used atmospheric reanalysis data for this sensitivity analysis, as these data benefit from assimilation of observations, while we continued to use GRACE for TWSA. Comparing results from this GRACE-reanalysis hybrid to those using only satellite data provided a general indication of how sensitive our coupling metrics were to the data source.
To extend our analysis to models that did not output an explicit TWS field, we compared accumulated residuals of precipitation, evapotranspiration, and total runoff (surface and subsurface) with the explicit TWS variable in the LENS simulations. We also compared coupling metrics calculated from LENS using accumulated residuals with those calculated from the explicit TWS field. After we determined that the accumulated residuals of the water balance represented much of the variability in the explicit TWS variable and yielded coupling metrics with similar distributions within LENS, we calculated equivalent metrics for several model simulations in the CMIP5 archive (Table 2). We selected the CMIP5 models that were similar to LENS (CESM1-CAM5 and CESM1-BGC) as well as the models that participated in the GLACE-CMIP5 experiment (Seneviratne et al., 2013) for which each necessary output field was available (CCSM4, GFDL-ESM2M, GFDL-ESM2G and IPSL-CM5A-LR).
Interannual variability (standard deviation) of TWSA
A comparison of the months of maximum and minimum terrestrial water storage
as determined by climatologies of GRACE and the LENS ensemble mean indicated
that the model largely reproduces the timing of TWSA seasonality evident in
the satellite observations (Fig. 2). Geographic patterns of seasonality
were consistent between the model and observations, though a phase shift in
the drawdown interval is apparent in eastern Canada and central Eurasia
where LENS had a 1-month early bias for both the maximum and minimum TWSA,
in southeast North America where the onset of the modeled drawdown interval
was slightly later than the observations, and in parts of east Asia and
Australia where the modeled drawdown interval ended earlier than in the
observations. However, despite capturing generally correct timing, the model
exhibited higher interannual variability of TWSA
A comparison of the interannual variability of atmospheric variables across
multiple drawdown intervals between the model and satellite data showed
various degrees of consistency (Fig. 4). The magnitude and geographic
pattern of VPD
Interannual variability (standard deviation) of VPD
Forcing and response metrics for VPD from GRACE/AIRS
Comparing both the timing of TWS dynamics and the interannual variability of TWS and the atmospheric variables between the observations and model output provides a context for interpreting the correlation-based metrics we present next. Although there are some inconsistencies, as noted above, the model largely reproduced the same patterns evident in the remote sensing data. In many regions, the interannual variability in model output was similar to the observed variability, indicating that CESM was able to simulate reasonably well the baseline properties (timing and variability) that influence feedback dynamics.
The forcing metric for VPD derived from GRACE and AIRS showed regions of
strong coupling, in which TWSA
The response metrics for VPD showed much stronger coupling than the forcing metrics in both the satellite data and the model (Fig. 5b and d). Satellite data yielded negative correlation coefficients nearly everywhere, with positive correlations found only in arid regions of low TWS variability. Particularly strong response metrics were found in eastern North America, northern South America, western Eurasia, the Sahel, India, and eastern Australia. The first ensemble member from LENS showed widespread negative correlations, and did not show the positive correlations found in the satellite data. Response coupling in LENS was much more spatially homogeneous than in the satellite data, though northern South America and western Eurasia still showed stronger coupling than elsewhere.
Forcing and response metrics for PPT from GRACE/GPCP
Forcing and response metrics for SW
Ensemble histogram of forcing metrics from the 38 simulations in LENS (gray bars) compared to satellite observations from GRACE/AIRS/GPCP/CERES (solid black line) and the alternate set of observations from GRACE and ERA-Interim (dashed black line), averaged across land regions within different latitude bands.
Many of the areas that showed a strong forcing metric for VPD also showed a
relatively strong forcing metric for PPT, though the PPT forcing metric was
overall weaker than that for VPD (Fig. 6a). The response metric for PPT
was generally positive, indicating that for much of the globe, a more
positive TWSA
The forcing metrics for SW
In temperate and tropical regions, forcing metrics were generally stronger
in LENS (more positive correlations for PPT, more negative for VPD and
SW
Response metrics were also generally higher in LENS than in both the
satellite and reanalysis data (Fig. 9). Noticeable exceptions were the VPD
and PPT response metrics in the tropics, which were close to the satellite
observations, and the boreal SW
Ensemble histogram of response metrics from the 38 simulation in LENS (gray bars) compared to satellite observations from GRACE/AIRS/GPCP/CERES (solid black line) and the alternate set of observations from GRACE and ERA-Interim (dashed black line), averaged across land regions within different latitude bands.
The internal variability across the ensemble of simulations in LENS yielded
a distribution of forcing and response metrics with a spread on the same
order of magnitude as the difference between modeled and satellite-derived
zonal averages. The distribution of coupling metrics from LENS revealed the
sensitivity of the relationships to decadal climate variability given the
relatively short TWS time series. Comparing this distribution with the
spread between the purely satellite-derived metrics and GRACE-reanalysis
hybrid indicated the sensitivity of our metrics to the choice of data
source. The differences between satellite and reanalysis metrics were
generally greater in the tropics, particularly for VPD and SW
Comparing the original LENS forcing and response metrics with those calculated after adding random noise to LENS (Figs. S1 and S2 in the Supplement) provided an estimate of the metrics' sensitivity to observational uncertainty. Adding random noise with 25 % of the standard deviation of the original data to the model time series of TWSA and atmospheric variables at each grid cell does degrade the metrics slightly, causing areal averages to be closer to zero, but the differences are relatively small compared to the differences between observed and modeled averages as well as the spread of the ensemble itself. This sensitivity analysis provided evidence that observational errors likely have a relatively small impact on the quality of our satellite-derived metrics.
Comparison of the explicit TWS field from LENS with the accumulated residuals of the surface water budget, as well as the forcing and response metrics calculated using both (Figs. S3 and S4), indicated that the alternative formulation provides an acceptable substitute when an explicit TWS field is not available from an ESM. More specifically, it suggests that water storage in rivers, lakes, and other parts of the terrestrial hydrologic system that are downstream from grid cell-level runoff did not significantly degrade the set of metrics evaluated here. Some degradation of the forcing metrics for PPT was apparent in the middle and low latitudes, but the remaining metrics are not highly sensitive to TWS formulation. This suggests that metrics calculated for CMIP5 output using accumulated residuals could be reasonably and effectively compared with the metrics derived from LENS and the observations (Fig. 10).
As with LENS, the metrics derived from CMIP5 output indicated generally
stronger coupling metrics than the observations for both the forcing and
response limbs. Exceptions include the VPD response metric in the tropics,
the boreal PPT and SW
Comparison of CMIP5 and LENS models indicated a mostly positive relationship between forcing and response metrics in temperate and tropical latitude bands. In boreal latitudes, there was little distinction between the forcing metrics of the different models, all of which were close to zero, though there were some clear differences within the response metrics. In temperate and tropical latitudes, models that showed the strongest forcing metrics generally also showed the strongest response metrics for a given variable. This relationship suggests that analysis of the response limb of the feedback loop is important for understanding how conditions are set up for subsequent forcing via land–atmosphere coupling.
The metrics developed here from satellite observations provide a means for evaluating land–atmosphere feedback strength on seasonal to interannual timescales in coupled ESMs. The use of correlation coefficients in this study does not enable a direct assessment of whether the relationships are directly causal, as correlation between atmospheric and terrestrial conditions could result from atmospheric persistence and remote forcing from SSTs (Orlowsky and Seneviratne, 2010; Mei and Wang, 2011). Nonetheless, the satellite-derived metrics provide a meaningful constraint against which coupled models can be benchmarked, as these models need to correctly represent the combined effects of persistence, remote SST forcing, and land–atmosphere coupling.
The forcing metrics, by indicating the relationship between antecedent TWS and subsequent atmospheric characteristics, provide observational constraints to complement previous research in large-scale land–atmosphere coupling in global models (e.g., Guo and Dirmeyer, 2013; Koster et al., 2006; Seneviratne et al., 2013). Observed forcing metrics were found to be strong in some of the regions of intermediate wetness in which ET is limited by terrestrial moisture availability, in addition to some regions in the moist tropics in which ET is generally considered to be energy limited. Recent observational analyses by Hilker et al. (2014) demonstrated that at least in the Amazon, deep rooting-zone water supplies can become seasonally depleted, leading to a stronger land–atmosphere coupling. This is consistent with findings that deep rooted plants vertically redistribute soil water to shallower layers, allowing higher levels of evapotranspiration to be sustained during the dry season (Lee et al., 2005). It is also consistent with recent work demonstrating that TWSAs can be used as predictors for fire season severity in the Amazon (Chen et al., 2013).
The inclusion of response metrics in our analysis allows the full feedback loop to be considered by recognizing the two-way dependence between the land surface and the atmosphere. The generally higher correlation coefficients in observed response metrics indicates the importance of the land surface response in priming the system for subsequent forcing on the atmosphere. For example, if the TWS response is too strongly coupled to the atmosphere, a small change in atmospheric conditions could yield an unrealistically large change in TWS. The unrealistically large TWS anomaly, in turn, would have the potential to impart a larger land surface forcing of the atmosphere in subsequent time steps. That models and ensemble members with high forcing metrics were also generally found to have high response metrics (Fig. 10) highlights the need to consider this.
Both the forcing and response metrics as calculated from the output of the ESMs analyzed in the current study indicated generally stronger coupling compared with those derived from the satellite observations. There are exceptions to this pattern, but it holds generally true, particularly across middle and lower latitudes, and particularly in the LENS data. This is consistent with previous studies conducted at finer temporal resolutions (Ferguson et al., 2012) and across more limited spatial domains (Hirschi et al., 2011). As described below, there are several possible explanations as to why models may simulate a stronger feedback than is observed in the satellite record.
Scatter plots of forcing and response metrics for LENS and CMIP5 models with observations, averaged across land regions within different latitude bands. For LENS, we show metrics calculated using the explicit TWS output (darker gray) and TWSA estimates from the accumulated residuals of the surface water balance (lighter gray). For CMIP5 models, we calculated metrics using TWSA estimates from the accumulated residuals of the surface water balance.
One set of possible explanations for the stronger coupling metrics in models relative to observations involves models overestimating the amount of water available for ET during the drawdown interval. The land surface influence on the atmosphere requires water to be a limiting factor to ET but not limiting enough to prevent it altogether. Under more moisture-limited conditions, a drawdown interval may experience multiple shorter time periods during which ET is inhibited due to insufficient water, and the terrestrial moisture state exerts no control over flux partitioning. These periods of insufficient moisture would tend to reduce the overall feedback strength integrated across the duration of the drawdown interval. Model shortcomings that make water too readily available for ET could reduce the amount of time spent in a periods of insufficient moisture during the drawdown interval, thereby unrealistically strengthening the longer-term feedback. We note that the opposite could take place under near-saturated conditions if a model overestimates the amount of time in which ET is energy limited, but we would not expect these conditions to be as prevalent during the drawdown interval that was the time period of focus in our analysis.
ESMs are known to simulate unrealistically homogeneous rainfall intensity, with overestimates of drizzle and underestimates of large infrequent events (Dai, 2006). Infrequent high-intensity rainfall events would yield much more runoff from saturated soil, which would lead to a weaker connection between the land and atmosphere than frequent low-intensity drizzle. If a model simulates too much drizzle, precipitation could lead to too much storage, which would cause a model to overestimate the response metrics. Too much storage also could allow water to be too readily available for ET, causing an overestimate of the forcing metrics. Contributions from drizzle could be offset if insufficient rainfall intensity does not allow for high enough throughfall or soil moisture recharge. The issue of rainfall intensity is related to issues of convective parameterization (described below), and may be addressed in future versions of ESMs through atmospheric superparameterization, in which a model's convective parameterization is replaced with embedded cloud-resolving models (Kooperman et al., 2016).
A misrepresentation of either the amount of bare soil or of bare soil processes also could lead to overestimates of the amount of water available for ET and thereby coupling strength. Current land surface schemes of ESMs are based on the “big leaf” model paradigm, which could lead to overestimates of ET if runoff and groundwater recharge are underestimated as a consequence of an unrealistically small bare soil fraction. In addition, even if bare soil fraction were correct, overestimates of ET due to an incomplete representation of surface resistance of bare soil, as found in CLM4 by Swenson and Lawrence (2014), would amplify positive feedbacks.
Additional explanations for why models may overestimate feedback strength include the parameterization of convection in the PBL or stomatal conductance responses to soil moisture. Previous work using a regional climate model (RCM) with a higher spatial resolution have determined that convective parameterizations are as important as spatial resolution in the simulation of precipitation coupling (Hohenegger et al., 2009). Taylor et al. (2013) similarly found parameterized convection in an RCM yielding a positive coupling in contrast to the negative coupling found in both observations and model runs with explicitly simulated convection. If negative coupling mechanisms are present in reality but absent from models, this could contribute to an overestimate of coupling metrics and underrepresentation of negative feedbacks in models. Similarly, the diversity of stomatal conductance parameterizations in CMIP5 ESMs is relatively low (Medlyn et al., 2011; Swann et al., 2016), and if stomatal apertures close too rapidly in response to an initial deficit in terrestrial water storage, transpiration–humidity feedbacks may be intensified in an unrealistic manner.
One factor that could contribute toward stronger coupling metrics in models relative to observations is the effect of observational uncertainty combined with a relatively short time series. Adding random error to one or more variables in a correlation analysis will reduce the correlation coefficient, and this degradation has been shown to be sensitive to the length of data sets used to establish metrics of land–atmosphere coupling (Findell et al., 2015). Given the relatively short time series available for the current analysis, the correlation coefficients derived from remote sensing data may be reduced due to observational uncertainty, unlike those derived from internally consistent models. We found that adding random noise to LENS at 25 % of the standard deviation of the original data caused some degradation of our area-averaged coupling metrics, but only by a small amount relative to the difference between LENS and the observations (Figs. S1 and S2). We chose 25 % as a qualitative upper bound on likely uncertainties introduced from random observational error within the TWSA and atmospheric variable time series. This highlights the need for developing more quantitative error estimates in remote sensing and reanalysis products. More generally, this sensitivity analysis suggests that our coupling metrics, when averaged across large areas, may be useful in identifying robust data–model differences.
Another possible explanation stems from the fact that our coupling metrics include co-variability due to atmospheric persistence and remote forcing by SST (Orlowsky and Seneviratne, 2010; Mei and Wang, 2011) alongside the direct influence of land–atmosphere interactions. For this reason, we caution that overestimates of coupling metrics do not imply that the land–atmosphere feedback is necessarily stronger, but could be due to an overestimate of SST-driven correlations between the land surface and the atmosphere. Wei et al. (2008) demonstrated that negative correlations between soil moisture and subsequent precipitation can be explained by precipitation persistence combined with negative temporal autocorrelation of precipitation associated with intra-seasonal modes such as the Madden–Julian Oscillation (MJO). Poor representation of the MJO period in CMIP5 models leads to unrealistic patterns of precipitation persistence (Hung et al, 2013). If models are failing to capture MJO-driven negative correlations, this could lead to overly strong positive correlations relative to observations. However, this would depend on the length of the drawdown interval relative to persistence time and the period of intra-seasonal modes.
The current study demonstrates the utility of the coupling metrics presented here, but conclusions are limited by the time span of the satellite record. While LENS enables the internal variability of these relationships to be investigated within the model, it is unclear how much natural climate variability affects these relationships in reality on timescales longer than the satellite record. Furthermore, we acknowledge that observational error over an insufficiently long time series could reduce the apparent strength of correlations (Findell et al., 2015). Therefore, the utility of the coupling metrics we present will increase alongside the length of the time series available from remote sensing platforms. This emphasizes the importance of the GRACE follow-on mission (Flechtner et al., 2014) and the need for continuity in the record between missions.
Furthermore, incorporating additional remote sensing products can reduce uncertainties inherent in the satellite-derived data sets. We presented metrics derived using ERA-Interim in place of AIRS, GPCP, and CERES in order to qualitatively illustrate this uncertainty. We found a non-negligible amount of uncertainty in both forcing and response metrics due to inconsistencies between the remote sensing and reanalysis products. Future work will address these uncertainties by incorporating additional observations and observationally constrained data sets such as those from the Global Soil Wetness Project (Dirmeyer et al., 2006b) and the Global Land Data Assimilation System (Rodell et al., 2004). In addition, as increasingly long time series of data become available from the Soil Moisture Ocean Salinity (Mecklenburg et al., 2012) and Soil Moisture Active Passive (Panciera et al., 2014) missions, the metrics developed here can be applied to those data sets as well, which will elucidate the importance of surface soil moisture relative to the total TWS column in these interactions.
Finally, the issue of causality and the possibility that correlations result primarily from atmospheric persistence and remote forcing from SST rather than land–atmosphere interactions may be addressed using sensitivity experiments similar to those of the GLACE and GLACE-CMIP experiments. While the previous experiments have tested the importance of soil moisture interaction with the atmosphere, additional experiments could expand upon these methods by treating SST variability similar to terrestrial soil moisture availability. Such experiments could determine the relative importance of remote SST forcing, including the effect of atmospheric persistence, and local land–atmosphere coupling in explaining correlations between TWS and atmospheric conditions.
As these sources of uncertainty are diminished, the coupling metrics introduced here may be used to assess whether improvements to model biogeophysics and parameterizations yield relationships that are more consistent with observations. CMIP5 models are known to have a high ET bias (Mueller and Seneviratne, 2014), which could be due in part to the explanations proposed as possible reasons for overestimated coupling metrics in models. As data become available from phase 6 of the Coupled Model Intercomparison Project (CMIP6), these metrics could provide an assessment of whether improvements to ET processes in models also improves the relationship between the land surface and the atmosphere.
We have developed a new approach for measuring the strength of the two-way feedback relationships between TWS and the atmosphere. This approach was designed specifically to take advantage of TWSA data from the GRACE mission, along with concurrently collected remote sensing and reanalysis data sets of atmospheric variables, in a manner that could then be applied to Earth system models. The coupling metrics described here quantify the relationships between both antecedent TWS and subsequent atmospheric conditions, as well as antecedent atmospheric conditions and subsequent TWS.
Regions of strong forcing, in which the TWSA at the beginning of the drawdown interval was related to the subsequent atmospheric state, coincided with the semi-arid zones previously found to be hot spots of land–atmosphere coupling, as well as some new tropical zones that may have moisture-limited ET regimes. Regions of strong response metrics, in which the TWSA at the end of the drawdown interval is related to the atmosphere, are much more widespread. Modeled coupling metrics are generally found to be stronger than those observed in the satellite record. If this discrepancy is due to models overestimating the two-way feedback between the land surface and the atmosphere, this could lead to models incorrectly projecting future warming trends and climatic extremes (e.g., Hirschi et al., 2011; Seneviratne et al., 2013; Cheruy et al., 2014; Miralles et al., 2014).
The results of this study are consistent with previous studies at smaller temporal scales indicating land–atmosphere coupling strength may be stronger in models than in observations. There are several possible mechanisms that may contribute to the overestimation of land–atmosphere coupling in models, and future studies may incorporate the metrics introduced here to assess the role of these mechanisms. These metrics will become increasingly useful as the temporal coverage of the remote sensing record grows longer and additional missions come online.
All data used in this analysis are publicly available. GRACE Monthly Land
Water Mass Grids NetCDF Release 5.0 is available from
We received funding support from the United States Department of Energy Office of Science Biogeochemistry Feedbacks Scientific Focus Area and the Climate Change Prediction Program, cooperative agreement (DE-FC03-97ER62402/A010). The CESM Large Ensemble Community Project was supported by the National Science Foundation (NSF) with supercomputing resources provided by the Climate Simulation Laboratory at NCAR's Computational and Information Systems Laboratory (CISL). We also acknowledge the World Climate Research Programme's Working Group on Coupled Modelling, which is responsible for CMIP, and we thank the climate modeling groups (listed in Table 2 of this paper) for producing and making available their model output. For CMIP, the US Department of Energy's Program for Climate Model Diagnosis and Intercomparison provides coordinating support and led the development of software infrastructure in partnership with the Global Organization for Earth System Science Portals. Edited by: P. Gentine Reviewed by: T. Ford and two anonymous referees