Traditional Budyko analysis is predicated on the assumption that the
watershed of interest is in dynamic equilibrium over the period of study, and
thus surface water partitioning will not be influenced by changes in
storage. However, previous work has demonstrated that groundwater–surface
water interactions will shift Budyko relationships. While modified Budyko approaches
have been proposed to account for storage changes, given the limited ability
to quantify groundwater fluxes and storage across spatial scales,
additional research is needed to understand the implications of these
approximations. This study evaluates the impact of storage changes on
Budyko relationships given three common approaches to estimating
evapotranspiration fractions: (1) determining evapotranspiration from
observations, (2) calculating evapotranspiration from precipitation and
surface water outflow, and (3) adjusting precipitation to account for storage
changes. We show conceptually that groundwater storage changes will shift the
Budyko relationship differently depending on the way evapotranspiration is
estimated. A 1-year transient simulation is used to mimic all three
approaches within a numerical framework in which groundwater–surface water
exchanges are prevalent and can be fully quantified. The model domain spans
the majority of the continental US and encompasses 25 000 nested watersheds
ranging in size from 100 km

The Budyko hypothesis states that the fraction of precipitation (

The simplicity of this relationship has since garnered much interest within the hydrologic community for its potential to predict watershed behavior using only climate variables, which are often easier to observe than many hydrologic variables, and without relying on computationally expensive or heavily parameterized numerical models. In recent years, the Budyko hypothesis has also been put forward as a way of predicting hydrologic sensitivity to climate change, especially in ungauged basins (e.g., Donohue et al., 2011; Jones et al., 2012; Renner et al., 2014). However, application of this method has been partially limited by spatial variability between watersheds and the required steady-state assumption.

The original Budyko curve presented a universal relationship between evapotranspiration and aridity (Budyko, 1974). Subsequent work has shown that, while the Budyko curve is generally robust, climate alone is not sufficient to predict watershed partitioning; the shape of the curve can vary between locations, especially for smaller watersheds. Differences in behavior between river basins have been attributed to seasonal lags in water and energy supply and vegetative and soil properties (Donohue et al., 2007). The original Budyko curve has been reformulated multiple times to incorporate additional free parameters to reflect these differences (Choudhury, 1999; Fu, 1981; Milly, 1994; Zhang et al., 2001, 2004), and numerous studies have used these modified formulations to relate curve parameters to physical basin characteristics in many settings (e.g., Li et al., 2013; Shao et al., 2012; Williams et al., 2012; Xu et al., 2013; Yang et al., 2009). For example, Li et al. (2013) and Yang et al. (2009) evaluated relationships between the shape of the Budyko curve and vegetation coverage. Similarly, Williams et al. (2012) and Zhang et al. (2004) found distinct shape parameters when comparing forested watersheds to grasslands, although it should be noted that they reached the opposite conclusion about their relative magnitudes. Others have focused on the role of soil moisture and noted differences in behavior based on plant water availability and seasonal lags in supply and demand (e.g., Milly, 1994; Yang et al., 2007; Yokoo et al., 2008).

Many previous studies have demonstrated good predictive abilities using modified Budyko formulations even when applied to smaller watersheds and shorter timescales than those originally intended. However, poor performance in some locations, especially over annual or seasonal time periods, has been attributed to the influence of storage changes that violate the steady-state assumption (Milly and Dunne, 2002; Zhang et al., 2008). Istanbulluoglu et al. (2012) and Wang et al. (2009) showed interannual storage changes can produce a negative correlation between the evapotranspiration ratio and aridity that is counter to the Budyko curve for baseflow-dominated basins in the Nebraska Sand Hills. Wang (2012) evaluated inter-annual storage changes for twelve watersheds in Illinois and showed that, on an annual timescale, variability in runoff and storage is larger than evapotranspiration, and accounting for storage can improve the performance of Budyko predictions. Du et al. (2016) presented a method for explicitly accounting for storage changes within the Budyko framework and demonstrated that this approach can greatly improve performance in arid regions, or over shorter timescales where the steady-state assumption is not valid.

These studies all indicate the potential importance of groundwater–surface water interactions within the Budyko framework and illustrate paths forward for incorporating groundwater–surface water interactions into Budyko analysis. However, the extensive field work needed to fully quantify groundwater–surface water exchanges is often not possible and is counter to the simplicity and minimal data requirements of the Budyko approach. Even in Budyko analysis focused on groundwater–surface water interactions, quantifying groundwater changes remains a limiting factor. For example, in some studies, the impact of groundwater storage changes have been inferred from variability around the Budyko relationship without directly measuring these changes (Milly and Dunne, 2002; Zhang et al., 2008). Others have addressed interactions more directly using baseflow separation techniques that require only streamflow observations (Wang et al., 2009) or lumped watershed models that parameterize baseflow and recharge (Du et al., 2016). However, with both of these approaches the groundwater system is still not directly simulated or observed. Istanbulluoglu et al. (2012) and Wang (2012) did use observations of water table depth to directly quantify storage changes and demonstrate the impact of this change within the Budyko framework; but the study areas with this approach were relatively limited (four watersheds for Istanbulluoglu et al., 2012, and twelve for Wang, 2012). Groundwater observations sufficient to precisely characterize watershed storage changes are difficult to obtain and are not widely available. Therefore, adding groundwater storage calculations into Budyko analyses remains infeasible in many cases, and more work is needed to understand the sensitivity of Budyko relationships to changes in storage.

There are three common approaches to estimating evapotranspiration (

evaluate the sensitivity of Budyko relationships to groundwater storage changes;

characterize systematic differences in the impact of storage changes on Budyko relationships;

illustrate variability between approaches across physical settings and spatial scales.

We use an integrated hydrologic model to simulate water and energy fluxes in
both the surface and the subsurface. Here we apply a high-resolution
(1 km

calculating evapotranspiration from simulated runoff and precipitation;

using simulated evapotranspiration values directly;

using simulated evapotranspiration values directly and taking into account storage changes.

Conceptual illustration of

The numerical modeling approach used here provides several important advantages for this type of analysis. Within the model, groundwater–surface water exchanges for every watershed in the system are fully characterized. This guarantees perfect closure of the water balance and means that we can mimic all three approaches within a consistent numerical framework where storage changes are directly accounted for. Furthermore, because the goal is to understand differences between approaches, and not to predict local Budyko parameters, the key advantage here is the ability to evaluate physically realistic behavior across a variety of physical settings and spatial scales where groundwater can be fully accounted for. Within this context, it should also be noted that the focus is on how groundwater storage changes perturb relationships. Therefore, uncertainty in local model parameters is much less important than realistic simulation of physical interactions for a range of storage changes and aridity values within a controlled numerical framework.

Sections 2.1 and 2.2 detail the numerical modeling approach and the continental-scale simulation used for analysis. An explanation of the source of each of the relevant water balance terms generated from the model is provided in Sect. 2.3. Sections 2.4 and 2.5 explain the three different approaches for ET estimation and how they are evaluated within the Budyko Framework.

Previous work has evaluated the Budyko curve using hydrologic models of varying levels of complexity. The “abcd” model employed by Du et al. (2016), among others, is a lumped water balance model that includes baseflow and groundwater recharge using calibrated parameters. Yokoo et al. (2008) used a different water balance model with a more complex groundwater formulation that includes saturated and unsaturated zones, but the authors noted limitations in simulating infiltration excess overland flow with this approach. Gentine et al. (2012) applied a water balance model that includes a soil bucket and can simulate infiltration excess overland flow; however it did not include topography and was only applied at the plot scale. While these approaches do account for storage in the subsurface, and varying levels of complexity in groundwater–surface water exchanges, they all take a lumped approach and rely on calibrated parameters that are not physically based. The lumped parameter approach is illustrated in Fig. 1a.

Increasing in sophistication, Troch et al. (2013)
used a semi-distributed model that included shallow perched aquifers as well
as root zone and soil moisture dynamics; and Koster and Suarez (1999)
evaluated a global circulation model that simulated land
surface and atmospheric processes using physically based equations.
Incorporating more sophisticated physical processes increases computational
expense, especially for large high-resolution domains. To address this,
Koster and Suarez (1999) used a global simulation but
at low spatial resolution (4

To the authors' knowledge, no one has evaluated Budyko behavior over large spatial scales using a hydrologic model that integrates lateral groundwater flow with surface processes (Fig. 1c). So-called integrated hydrologic models that incorporate physically based lateral groundwater flow with overland flow and land surface processes are a relatively new development in hydrologic modeling. These tools are ideal for capturing dynamic behavior and interactions throughout the terrestrial hydrologic cycle and they have been increasingly applied over the last decade. Achieving this level of complexity requires significant computational resources and detailed model inputs. These requirements have generally limited the application of integrated tools to regional-scale domains. Continental-scale high-resolution simulations have only recently become technically feasible.

For this analysis, we use the first high-resolution integrated groundwater–surface water simulation of the majority of the continental US (CONUS)
(Maxwell and Condon, 2016; Maxwell et al.,
2015). The CONUS simulation was developed using the integrated hydrologic
model ParFlow–CLM (Kollet and Maxwell, 2006, 2008;
Maxwell and Miller, 2005). ParFlow simulates three-dimensional
variably saturated groundwater flow using Richards' equation:

Overland flow is included in the groundwater flux term of Eq. (1) (i.e., in
the first term on the right hand side) using a free surface overland-flow
boundary condition that applies continuity of pressure and flux across the
boundary between the land surface and the subsurface. Overland flow is
solved using the kinematic wave approximation of the momentum equation where
the diffusion terms are neglected and it is assumed that the bed slope,

ParFlow is also coupled with a land surface model derived from the Common
Land Model (CLM) (Dai et al., 2003). In the combined
ParFlow–CLM model (Kollet and Maxwell, 2008), ParFlow solves
the water balance in the subsurface and CLM solves the combined water energy
balance at the land surface. At the land surface, the energy balance (

Map of the simulation domain extent (black box) with major river basins highlighted and labeled. Subbasins within the domain are outlined in grey. Major rivers are shown in blue for reference (note that the simulated river network is much more highly resolved, as illustrated in Maxwell et al., 2015).

This study focuses on simulated evapotranspiration

Similarly, transpiration is calculated by scaling the potential
evapotranspiration to account for stomatal and aerodynamic resistance as
follows:

The analysis presented here is based on a previously developed transient
ParFlow–CLM simulation of the majority of the CONUS
documented in Maxwell and Condon (2016). The CONUS domain
covers the majority of eight major river basins, shown in Fig. 2, and spans
roughly 6.3 million km

As detailed in Maxwell and Condon (2016) and Maxwell et al. (2015), the model extends 102 m below the subsurface, with five vertical layers that contour to the land surface using a terrain following grid formulation (Maxwell, 2013). The vertical resolution of the domain decreases with depth to better resolve the shallow subsurface. Layer thicknesses are 0.1, 0.3, 0.6, 1 and 100 m moving from the land surface down. Spatially heterogeneous physical parameters for the subsurface include porosity, saturated hydraulic conductivity and van Genuchten parameters. Subsurface spatial units were determined using a national permeability map developed by Gleeson et al. (2011) for the bottom 100 m of the domain and the soil survey geographic database (SSURGO) for the top 2 m. Maps of the subsurface units and their properties are available in Maxwell and Condon (2016) and Maxwell et al. (2015). The land surface was derived from the hydrologic data and maps based on the shuttle elevation derivatives at multiple scales (HydroSHEDS) digital elevation model using a topographic processing algorithm to ensure a fully connected drainage network (Barnes et al., 2016). Vegetation types were extracted from the USGS land-cover dataset using the IGBP land-cover classifications.

The model was first initialized to a steady-state groundwater configuration, using the ParFlow model without CLM, starting from a completely dry domain and providing a constant recharge forcing over the land surface to achieve a dynamic equilibrium. Development of this steady-state simulation and evaluation of the resulting groundwater configuration are provided in Condon et al. (2015) and Maxwell et al. (2015). Using the steady-state groundwater configuration as a starting point, and following some initialization period, the coupled ParFlow–CLM model was used to simulate the fully transient system including land surface processes for water year 1985 (i.e., 1 October 1984 through 30 September 1985), which was chosen as it is the most climatologically average within the past 30 years. The transient simulation was driven by historical hourly meteorological forcings for water year 1985 from the North American Land Data Assimilation System Phase 2 (NLDAS 2) (Cosgrove et al., 2003; Mitchell et al., 2004). Anthropogenic activities such as groundwater pumping and surface water storage are not included in the transient simulation. Therefore the simulation represents natural flows in a pre-development scenario, which is ideal for Budyko analysis. Complete details of the development of the transient simulation are available in Maxwell and Condon (2016).

The 1-year simulation presented here intentionally violates the steady-state assumption. The purpose of our analysis is to evaluate the impact of net storage changes on Budyko relationships, and therefore a steady-state simulation is not the goal. It can also be argued that storage changes will vary from year to year or depending on the multi-year period analyzed. The 1985 simulation year is not presented as a prediction of long-term storage variability, it is simply used to sample a range of groundwater–surface water exchanges across variable climates and physical settings. We present a general framework for understanding the impacts of storage changes in various Budyko formulations using water year 1985 as a representative example.

Similarly, because we are focused on a comparative analysis within the Budyko framework, the results are not dependent on local calibration between simulated results and observations. The discrepancies between approaches stem from differences in the variables used to create a water balance (refer to Sects. 2.3 and 2.4); these findings are not sensitive to parameter uncertainty in the model. Still, the transient simulation has been rigorously validated against all publicly available observations for water year 1985. This includes transient observations at varying frequencies from 3050 stream gauges, 29 385 groundwater wells and 378 snow stations for a total of roughly 1.2 million comparison points. Flux tower observations were not available over this period, but latent heat fluxes were also compared to the Modern Era Retrospective-analysis for Research and Application (MERRA) dataset. Complete details of the model validation are provided in the Supplement of Maxwell and Condon (2016).

Although there are of course limitations to the model and significant uncertainties in spatial model parameterization, especially for the subsurface, overall comparisons between simulated and observed values demonstrate that the modeling approach is robust. Stream-flow timing and magnitude are generally well matched in undeveloped basins, snowpack timing and melt is accurate, and spatial patterns in latent heat flux are reasonable. Most importantly for this analysis, the model validation shows that ParFlow is accurately capturing the relevant physical processes. Uncertainty in subsurface parameterization, bias in atmospheric forcing data and lack of anthropogenic activities were identified as key areas that could improve the local predictions of the model. However, as discussed above, the purpose of this work is not to predict Budyko curve parameters for water year 1985. The uncertainties listed here are therefore important to note, but do not limit the utility of this tool as a test bed for evaluating interactions across spatial scales and complex physical settings.

Outputs from the hydrologic simulation are used to quantify all of the
relevant water balance components for Budyko analysis. Precipitation is an
input to the ParFlow CLM model. Within the model, precipitation can
infiltrate to the subsurface, contribute to runoff or pond on the land
surface. Evaporation occurs from ponded water, bare soil and canopy
interception. Additionally, roots pull water from the subsurface to support
transpiration for plants and lateral groundwater flow redistributes moisture
within the subsurface and can further support overland flow. All of these
processes occur within every 1 km

At the watershed scale, precipitation

There are multiple ways to estimate groundwater contributions within the
model. Using gridded model outputs, the exchanges across the boundaries of
every river cell can be summed to determine net contribution of groundwater
to overland flow. Similarly, we can aggregate hourly changes in groundwater
storage for every subbasin to determine total storage exchanges. Because we
are interested in the net contribution of groundwater to streamflow and
evapotranspiration for this analysis, we can take a simpler approach here. Within
our numerical framework we have guaranteed closure of the water balance for
every watershed and therefore the net change in groundwater storage that
contributes to the surface water budget is simply

This approach is focused solely on the net contribution of groundwater to the surface water budget. Nested systems of local and regional lateral groundwater flow are simulated within the model and previous work has evaluated spatial patterns and physical drivers of lateral groundwater imports and exports across the domain (Condon et al., 2015; Maxwell et al., 2015) as well as groundwater residence times (Maxwell et al., 2016). Here we focus only on net exchanges with the surface that are relevant to the Budyko formulation. We do not need to quantify lateral exchanges in the subsurface directly for these purposes; however, it should be noted that the lateral redistribution of groundwater that occurs within the model is still vital to generating realistic groundwater configurations and supporting groundwater–surface water exchanges.

In addition to the simulated evapotranspiration (

Figure 3 maps the aridity index (

Within this annual simulation, Fig. 3d shows that groundwater–surface water exchanges (

Maps of

The groundwater contribution ratio map also illustrates the importance of
lateral groundwater flow at multiple spatial scales within the system.
Groundwater storage gains (i.e., positive values of

We have identified three common treatments of evapotranspiration within
Budyko analyses. As will be demonstrated later on, these three approaches
are identical in systems where the steady-state assumption is valid and no
storage changes are occurring. However, when this is not the case, we
hypothesize that the different formulations for evapotranspiration will
yield systematically different results. Here we summarize the three
approaches to

Precipitation and runoff are generally much easier to measure at the
watershed scale than evapotranspiration or groundwater storage changes. As a
result, in many Budyko analyses evapotranspiration is not actually measured
directly, but is calculated as the difference between precipitation and
surface outflow (e.g., Greve et al., 2015; Jones et al.,
2012; Renner et al., 2014; Wang et al., 2009; Xu
et al., 2013; Yang et al., 2009). This approach relies on the
assumption that changes in storage are negligible. We refer to this as the

A more direct, if less common, approach is to quantify evapotranspiration
from field observations. This approach does not require a steady-state
assumption when calculating evapotranspiration but it does require more
rigorous field observations and is therefore not feasible for Budyko
analysis of data-sparse areas. Within our simulation results, however,
“data” is not a limitation. Our modeled outputs include gridded hourly
evapotranspiration for the entire domain. Simulated

Finally, the most rigorous, and data-intensive, approach is to quantify both
evapotranspiration and groundwater–surface water exchanges directly. This
approach has been used in recent studies seeking to evaluate storage impacts
on Budyko relationships (e.g., Istanbulluoglu et al., 2012;
Wang, 2012). Changes in groundwater storage are not used to
adjust evapotranspiration values directly but they can be applied to
precipitation estimates to better reflect the quantity of water that is
available to partition into overland flow or evapotranspiration. This is
defined as effective precipitation and is calculated as precipitation minus
groundwater contribution (

It should be noted here that the first two approaches (i.e., inferred and direct evapotranspiration) are commonly used in analyses that rely on the standard equilibrium assumption while the final method is designed for situations where this is not the case. By comparing results between all three, we consider the impact of nonzero groundwater contributions both for approaches that assume it is negligible and those that account for it.

Budyko's original formulation expressed the evapotranspiration ratio (

Illustration of the Budkyo framework showing curves with three different
shape parameters (black lines,

Budyko plots for the three approaches

Here we apply the commonly used Budyko formulation from Fu (1981) and Zhang et al. (2004):

Figure 4 plots Eq. (8) for a range of

In the following sections, Budyko relationships are plotted and shape
parameters are evaluated for all three approaches using variations of Eq. (9)
as follows:

inferred evapotranspiration: evapotranspiration is calculated from
precipitation and outflow so (

direct evapotranspiration: Eq. (9) is applied as written;

effective precipitation: precipitation is replaced by effective
precipitation (

Results and discussion are divided into two sections. In Sect. 3.1 the three approaches to evapotranspiration fractions are compared across the entire simulation domain. Systematic differences are identified and evaluated as a function of groundwater contributions. A conceptual framework is presented to explain the biases between approaches. In Sect. 3.2 the potential implications of these differences are illustrated by comparing spatial patterns between the three approaches as well as relationships across spatial scales.

Figure 5 plots every watershed in the domain shown in Fig. 2 using the three
approaches to estimating the evapotranspiration fraction. In all three figures
the watershed points follow the overlaid Budyko curves; 77 % of the
watersheds fall within the 1.6 to 3.6 shape parameter lines for the inferred
evapotranspiration approach, 51 % for the direct approach and 72 % for
the effective precipitation approach. This demonstrates that Budyko
relationships are recreated with the integrated hydrologic model. However,
there are some notable differences between methods. With the inferred

Illustration of the treatment of groundwater contributions for each of the three approaches. The black lines show the water and energy limits and an example Budyko curve, similar to Fig. 4. Arrows indicate the water balance component represented above and below the curve in each case.

Systematic differences between the Budyko plots shown in Fig. 5 are
explained by the way groundwater contributions influence each approach. This
is illustrated conceptually in Fig. 6. In systems with groundwater–surface water interactions, incoming precipitation is equal to the sum of
evapotranspiration, outflow and ground water contributions (Eq. 6). This
means that the difference between precipitation and outflow will
only equal evapotranspiration if there are no storage changes (i.e.,

The direct evapotranspiration approach avoids the limitations of the inferred approach by evaluating Budyko relationships as a function of the evapotranspiration fraction as intended in Eq. (10). However, groundwater contributions will still bias the results with this approach because the difference between precipitation and evapotranspiration is outflow plus groundwater contribution (Eq. 6). Thus, the curve in Fig. 6b represents the evapotranspiration fraction (as with Fig. 4) but now the partitioning is occurring between evaporation and runoff plus groundwater contributions, not just runoff. This means that the maximum evapotranspiration fraction (i.e., the upper water limit) is not 1, but 1 minus the groundwater contribution fraction.

This shift in the upper limits of water availability explains the values greater than 1 in Fig. 5b; in these watersheds groundwater contributions are negative (i.e., groundwater is supplying water to the land surface) and this allows for evapotranspiration values that are greater than the incoming precipitation. Similar shifts in the upper limits of the system for arid locations were found by Potter and Zhang (2009) who noted that evapotranspiration was actually approaching a fixed portion of potential evapotranspiration for high-rainfall years in arid basins in Australia.

Comparison of shape parameters to groundwater contribution ratios for
the three approaches in every watershed

The effective precipitation approach is designed to maintain focus on
partitioning between evapotranspiration and overland flow by removing
groundwater contributions from the denominator of both ratios
(i.e., adjusting both the

The systematic differences explained in Fig. 6 are evaluated by calculating
the shape parameter (Eq. 9) for the curve corresponding to every watershed
plotted in Fig. 5. Figure 7a–c plot the resulting shape parameters as a
function of groundwater contribution fraction colored by aridity for each of
the three approaches. Recall from Fig. 4 that larger curve numbers fall
closer to the upper limits on the Budyko plots and positive groundwater
contribution fractions occur when there is a net flux from the surface water
to the groundwater (i.e., net infiltration). Positive

Both the inferred (Fig. 6a) and direct approaches (Fig. 6b) show clear, but
contradictory, relationships with groundwater–surface water exchanges. There
is a positive relationship between the shape parameter and groundwater
contribution fraction for the inferred evapotranspiration approach at the
lower limits of the system, as delineated by the dashed line in Fig. 7a. This
indicates that in arid watersheds, increased groundwater
contributions are correlated with larger evapotranspiration fraction
(i.e., with larger curve numbers). This behavior is consistent with Fig. 6a; because
the groundwater contribution is included in the evapotranspiration fraction
when evapotranspiration is inferred from precipitation and outflow
(i.e.,

Taking this idea further, Fig. 7d shows that a constant positive groundwater contribution applied across aridity values will vertically shift the Budyko curve relative to a scenario with no storage changes if evapotranspiration is inferred. In the case of a positive groundwater contribution, this vertical shift moves points closer to the water and energy limits of the system, and therefore increases their shape parameters. Note that in the Fu equation (Eq. 9), Budyko curves with different shape parameter are not parallel to one another and converge at low aridity values; therefore the same groundwater contribution value changes the shape parameter differently depending on the location within the Budyko plot. The linear trend traced along the lower portion of the scatter plot in Fig. 7a shows that for the lowest curve numbers, occurring in watersheds with high aridity, there is a roughly linear relationship between groundwater contribution and shape parameters. This approximate linearity occurs because the Fu curves become almost parallel for high aridity values (see Fig. 4). For lower aridity values, this is not the case and the relationship between groundwater contribution and shape parameter will be positive but nonlinear.

Figure 7b plots groundwater contributions versus shape parameters similar to Fig. 6a but for the direct evapotranspiration approach. Recall that with this approach the groundwater contributions are now essentially lumped with the outflow fraction (as opposed to the evapotranspiration fraction with the inferred approach; refer to Fig. 6a and b). This means that rather than shifting points vertically in the Budyko plot (i.e., Fig. 7d), positive groundwater contributions change the total water that is available for evapotranspiration. This can be conceptualized as shifting the limits of how much total water is available for evapotranspiration.

In this case a positive groundwater contribution (i.e., surface water
infiltrating to groundwater) is essentially a loss to the surface water
system and decreases the upper limit of water available.
Figure 7e illustrates this point for a constant groundwater contribution across the
entire Budyko plot. When groundwater contributions are present, the upper
water limitation on the system shifts from 1 to 1

Finally, a scatter plot of shape parameters versus groundwater contribution fraction for the effective precipitation case (Fig. 6c) shows similar patterns with aridity but no clear correlation between storage changes and shape parameters. This is to be expected because the effective precipitation approach adjusts for groundwater contributions in both the evapotranspiration ratios and the aridity index before plotting. However, it should be noted that some dependence on groundwater contribution is still to be expected, to the extent that groundwater–surface water exchanges are also correlated with other watershed properties. For example, groundwater contribution levels can also be correlated with vegetation type, soil properties and other watershed characteristics, which have been correlated to shape parameters in previous research (e.g., Li et al., 2013; Shao et al., 2012; Williams et al., 2012; Xu et al., 2013; Yang et al., 2009).

This is true for the other approaches too; while the effect of groundwater contributions within each space can be precisely determined using Eqs. (7) and (9), it is important to note that the watersheds evaluated here are also heterogeneous in land cover, topography and seasonality. Therefore, in the scatter plots shown in Fig. 7, the relationships between shape parameters and groundwater contribution explained by Fig. 7d and e appear as limits rather than strong predictors. This point is also made by Istanbulluoglu et al. (2012) who evaluated the impact of groundwater storage changes on Budyko relationships using the inferred evapotranspiration approach and adjusting for storage changes using estimates from groundwater observations. They provide a similar conceptual model to Fig. 7d, describing consistent shifts within the Budyko space as a function of groundwater contribution. However, for the four basins in Nebraska that they evaluated they found a negative relationship between inferred evapotranspiration ratios and aridity. This was attributed to a strong negative correlation between groundwater contribution fraction and aridity index. In other words, for this subset of basins, they show that the resulting trend is controlled by the dependence of groundwater contribution on other watershed characteristics.

Comparison of shape parameters between the three approaches for every watershed. Points are colored by groundwater contribution fraction as shown in Fig. 3d. The dashed line on each plot is a one-to-one line for reference.

Figure 8 compares the shape parameters calculated with each approach to illustrate the way that different assumptions can bias derived Budyko relationships. Figure 8a shows the differences between the inferred and direct evapotranspiration approaches, which are commonly used in studies that assume no change in storage. Because groundwater contributions are incorporated into different components of the water balance with these methods, Fig. 8a shows that, for positive groundwater contributions (green points), the inferred shape parameters are systematically higher than the direct shape parameters, while the inverse is true when groundwater contributions are negative (purple points). Furthermore, when groundwater contributions are large (i.e., the dark green circles in Fig. 8a), the direct method has uniformly low shape parameters, but the inferred method still shows a range of shape parameters. This is to be expected from the conceptual model of the direct evapotranspiration approach (Fig. 7e), where we showed that high groundwater contributions decrease the upper limit of the evapotranspiration ratio. This shift biases the system towards uniformly low shape parameters that are less sensitive to other watershed characteristics.

The direct and inferred evapotranspiration methods are also compared to the effective precipitation approach, which does account for groundwater contributions (Fig. 7b and c). As would be expected, the direct and inferred approaches have inverse biases relative to the effective precipitation method; shape parameters are systematically higher with the inferred approach relative to effective precipitation and lower for the direct approach. Here too, the trends with groundwater contributions are reversed, with positive contributions creating a positive bias for the inferred case and a negative bias for the effective precipitation case. This result is in keeping with the conceptual model of groundwater contributions to each approach; with the inferred evapotranspiration approach, groundwater contributions are lumped with evapotranspiration, while in the direct approach they are lumped with outflows.

Also, there is a much stronger correlation between the inferred
evapotranspiration and effective precipitation approaches (Fig. 8b) than
between direct evapotranspiration and effective evapotranspiration
approaches (Fig. 8c) (

Section 3.1 explored the relationship between groundwater storage and shape parameters using the three different approaches to evapotranspiration fractions. Here, we illustrate the impacts of these differences on spatial patterns in shape parameters and scaling relationships. The intent is to provide a demonstration of how systematic differences will propagate across spatial scales using the 1985 simulation as a test case. Obviously local differences will vary depending on the time period used for analysis and the associated levels of groundwater contribution.

Figure 9 maps shape parameters for all of the roughly 25 000 nested watersheds
in the simulation domain, calculated using the three different approaches to
evapotranspiration ratios. Even though the 1-year transient simulation
used for the analysis presented does not meet the Budyko equilibrium
criteria, Figs. 4c and 8c show that realistic Budyko relationships are still
found when groundwater contributions are accounted for using the effective
precipitation approach. Xu et al. (2013) built a neural
network model to predict shape parameters using long-term observations from
224 watersheds with drainage areas ranging from 100 to 10 000 km

Map of shape parameters calculated for the 24 235 nested watersheds
using the

All three maps demonstrate local variability and regional trends in the shape parameters. This spatial variability is partially caused by the spatial patterns in groundwater contribution fraction shown in Fig. 3d; however, it is also a reflection of variability in catchment characteristics such as vegetative properties, topography and climate that have been correlated to Budyko relationships by previous studies (e.g., Li et al., 2013; Milly, 1994; Shao et al., 2012; Williams et al., 2012; Xu et al., 2013; Yang et al., 2009; Yokoo et al., 2008). The purpose here is not to isolate all of the sources of spatial heterogeneity, but rather to illustrate how spatial patterns change depending on the treatment of storage.

Budyko plots of evapotranspiration ratio versus aridity index using
the effective precipitation method with watersheds grouped by drainage area
[km

Spatial patterns are consistent between the three approaches in the more
humid eastern portion of the domain, where groundwater contribution ratios
are generally smaller (Fig. 3d), but in the more arid western portion of the
domain significant differences are observed. For both the inferred
evapotranspiration and effective precipitation approaches there are large
red areas indicating shape parameters greater than four where the
evapotranspiration ratio is falling very close to the water limitation. The
areas with the highest shape parameters (i.e., greater than four) are
generally consistent between the inferred evapotranspiration and effective
precipitation approaches, but the inferred approach results in higher curve
numbers throughout the western portion of the domain than the effective
precipitation approach. This is consistent with Fig. 8b, which showed strong
correlations between the shape parameters of these two approaches
(

Conversely, with the direct evapotranspiration approach, the western portion of the domain has much lower shape parameters and less spatial variability. Again, this finding is consistent with Fig. 7b and e, which show that when groundwater contributions are high, the curve numbers are uniformly low because the flux from the surface water system to the groundwater shifts the upper limit of the evapotranspiration fraction down. The systematic differences in Fig. 9, both with respect to the shape parameter values and the spatial patterns in these parameters, where groundwater–surface water exchanges are occurring indicate the potential to arrive at fundamentally different conclusions about spatial trends in shape parameters, depending on the approach used.

Next, we evaluate groundwater impacts as a function of drainage area. Budyko
originally limited analysis to large basins (which he defined as drainage
areas greater than 10 000 km

The shape parameters estimated with the effective precipitation approach are arguably the most comparable to other long-term studies that have assumed equilibrium conditions (assuming that the watersheds they studied actually were in equilibrium over the study period). The simulated median value found here is slightly lower than the original Budyko value of 2.6 and the median value of 2.56 found by Greve et al. (2015) using the 411 Model Parameter Estimation Experiment (MOPEX) catchments in the US. However, it compares well with 1.8 median value for large MOPEX basins in the US reported by Xu et al. (2013); although, it should be noted that Xu et al. (2013) report a higher 2.6 median value for small basins, and the median small basin value reported here is 2.0. Part of this bias can likely be attributed to the concentration of MOPEX basins in the eastern portion of the US where Fig. 9 shows that shape parameters are generally higher. Overall, the consistency in spatial patterns and convergence around the Budyko curve for large drainage areas indicates that the ParFlow–CLM model recreates Budyko relationships even over a relatively short annual simulation period, as long as groundwater contributions are adjusted for (i.e., using the effective precipitation approach). However, for smaller watersheds, variability in catchment characteristics is still an important consideration.

While all three approaches have decreased variance with increased drainage area, the median and variance are not necessarily consistent between methods. Figure 11 shows the interquartile range of shape parameters for each approach with increasing drainage area. In all three cases, the 75th percentile shape parameters decrease and the 25th percentile shape parameter increases with increasing area. Again this indicates increased importance of watershed characteristics at smaller scales; local variability is muted and the probability of observing very high or very low shape parameters decreases as the scale increases from smaller to larger watersheds. In the case of the inferred and direct evapotranspiration approaches, because groundwater contributions are not accounted for in the calculations, some of this variability can also be attributed to spatial patterns in groundwater–surface water exchanges and lateral groundwater flow. As previously noted, the groundwater contribution map (Fig. 3d) shows that the largest (positive or negative) groundwater contribution fractions generally occur in small headwater basins. Across larger areas, local groundwater–surface water exchanges balance out and the overall groundwater contribution fractions for large watersheds tend to be smaller.

Box plots showing the interquartile range (i.e., 25–75th percentile values) of shape parameters for all three approaches grouped by drainage area. Dashed lines are at 1.6 and 2.6 for reference.

Consistent with Figs. 7 and 8, the inferred evapotranspiration and effective precipitation approaches are the most similar. For the largest drainage areas, the median shape parameter is 1.8 for the inferred evapotranspiration approach, 1.5 for the direct evapotranspiration approach and 1.7 using effective precipitation. The direct evapotranspiration formulation has systematically lower shape parameters than the other two approaches; the median value for this method is consistently below the other two. Again this agrees with Sect. 3.1 where we demonstrated an inverse relationship between shape parameters and groundwater contributions. The direct evapotranspiration approach also has a consistently smaller interquartile range than the other two methods. This results from the negative correlation with groundwater contribution and the decreased sensitivity that was shown for small shape parameters in arid locations. Figure 11 shows that all three approaches will yield qualitatively similar scaling relationships and convergence for large basins; however, the shape parameter values will vary.

One of the primary assumptions of the Budyko hypothesis is that watersheds are in equilibrium and there are no changes in storage. This means that all incoming precipitation will either leave the watershed as evapotranspiration or overland flow. While the original Budyko curve has been well verified with observations from around the globe, it is also now widely accepted that the relationship between evapotranspiration ratios and aridity indices is not universal, and some additional curve parameters are needed to account for spatial variability between watersheds. Many subsequent studies have related curve parameters to catchment properties such as vegetation, topography and seasonality (e.g., Li et al., 2013; Shao et al., 2012; Williams et al., 2012; Xu et al., 2013; Yang et al., 2009). More recently, additional studies have shown that if groundwater–surface water exchanges are present this can also influence the shape of the curve and account for additional variability between watersheds (Milly and Dunne, 2002; Zhang et al., 2008).

While methods have been developed to account for storage changes within the Budyko framework (e.g., Du et al., 2016), very few studies have sufficient data on groundwater–surface water interactions to evaluate the validity of the equilibrium assumption, much less to precisely quantify storage changes in their analysis. One of the key advantages of the Budyko approach is its ability to predict behavior based on a small number of relatively easy-to-obtain observations. Given its common application to data-sparse watersheds, where even evapotranspiration measurements are often not available, directly quantifying groundwater–surface water exchanges in these locations seems unlikely. Therefore, it is important to understand the sensitivity of Budyko relationships to uncertainty in storage changes in a general context that can be used to interpret results where precise measurements are not available.

Previous work has demonstrated systematic shifts in Budyko plots caused by groundwater–surface water interactions (Du et al., 2016; Istanbulluoglu et al., 2012; Milly and Dunne, 2002; Wang, 2012; Zhang et al., 2004). Here we demonstrate that the influence of groundwater storage changes on Budyko results will vary depending on how evapotranspiration is handled in the study. If evapotranspiration is measured directly, positive groundwater contributions (i.e., net infiltration from the surface to the subsurface) shift shape parameters down; conversely, if evapotranspiration is estimated using precipitation and runoff, positive groundwater contributions will increase shape parameters. In both cases the sensitivity of the shape parameter to storage changes varies non-linearly with both the aridity of the watershed and the evapotranspiration fraction.

Using a 1-year simulation with an integrated hydrologic model, we demonstrate these differences can result in different conclusions about spatial patterns in Budyko relationships and the median shape parameter across spatial scales. This indicates that it is important to consider the approach used for estimating evapotranspiration fractions when comparing results between studies, and provides a demonstration of the types of bias that would be expected if different methods are used.

These results also have implications for the myriad of studies that seek to relate shape parameters for Budyko curves to other watershed characteristics. The conceptual models shown here illustrate that groundwater contributions will shift points in consistent and predictable ways when other variables are held constant (i.e., if you apply a consistent groundwater contribution across the entire range of aridity values or consider the shift of a single point with a given aridity value). However, we use the results from our integrated hydrologic model to demonstrate that, within complex heterogeneous domains, groundwater–surface water exchanges are spatially heterogeneous and depend on watershed characteristics such as aridity values, which can also influence Budyko relationships. The scatter in Figs. 6 and 7 demonstrates that groundwater contributions cannot easily serve as an independent predictor of the shape of Budyko relationships. This also shows that in large comparative studies, the bias caused by groundwater–surface water interactions may not be readily apparent because it will vary from watershed to watershed.

The intention of these comparisons is not to discredit previous approaches, but rather to illustrate the potential impacts of assuming equilibrium conditions across a broad range of physiographic settings and spatial scales without the ability to verify this assumption. Our results show that even when changes in storage are occurring, large watersheds still roughly follow the Budyko curve; however, the shape parameter and scatter will vary with groundwater contribution and depending on how evapotranspiration is quantified. We suggest that studies that cannot verify the equilibrium assumption using groundwater observations include additional analysis to evaluate the sensitivity of their findings to uncertainty in storage changes by perturbing points using the conceptual models presented here. Even if groundwater contributions cannot be directly incorporated into analyses, this can help determine whether differences in shape parameters are actually resulting from unique basin characteristics or uncertainty in storage.

All data from this analysis are available upon request. Instructions for accessing the ParFlow simulations used here are provided in Maxwell and Condon (2016).

The authors declare that they have no conflict of interest.

Funding for this work was provided by the US Department of Energy Office of Science, Offices of Advanced Scientific Computing Research and Biological and Environmental Sciences IDEAS project. The ParFlow simulations were also made possible through high-performance computing support from Yellowstone (ark:/85065/d7wd3xhc) provided by National Center for Atmospheric Research's Computational and Information Systems Laboratory, sponsored by the National Science Foundation. Edited by: M. Bierkens Reviewed by: four anonymous referees