Stream recession curves and storage variability in small watersheds

The pattern of streamflow recession after rain events offers clues about the relationship between watershed runoff (observable as river discharge) and water storage (not directly observable) and can help in water resource assessment and prediction. However, there have been few systematic assessments of how streamflow recession varies across flow rates and how it relates to independent assessments of terrestrial water storage. We characterized the streamflow recession pattern in 61 relatively undisturbed small watersheds (1–100 km2) across the coterminous United States with multiyear records of hourly streamflow from automated gauges. We used the North American Regional Reanalysis to help identify periods where precipitation, snowmelt, and evaporation were small compared to streamflow. The order of magnitude of the recession timescale increases from 1 day at high flow rates ( ∼1 mm h−1) to 10 days at low flow rates (∼0.01 mm h−1), leveling off at low flow rates. There is significant variability in the recession timescale at a given flow rate between basins, which correlates with climate and geomorphic variables such as the ratio of mean streamflow to precipitation and soil water infiltration capacity. Stepwise multiple regression was used to construct a six-variable predictive model that explained some 80 % of the variance in recession timescale at high flow rates and 30–50 % at low flow rates. Seasonal and interannual variability in inferred storage shows similar time evolution to regional-scale water storage variability estimated from GRACE satellite gravity data and from land surface modeling forced by observed meteorology, but is up to a factor of 10 smaller. Study of this discrepancy in the inferred storage amplitude may provide clues to the range of validity of the recession curve approach to relating runoff and storage. Correspondence to: N. Y. Krakauer (nkrakauer@ccny.cuny.edu)


Introduction
The observation that river flow gradually decreases after a rainstorm was modeled mathematically by the beginning of the 20th century with the concept of a "recession curve" that describes the characteristic decay of flow rate with time during rainless periods. Such recession curves have been used to forecast flows, estimate the probabilities of low flows, infer groundwater storage or aquifer characteristics, and detect change in watershed characteristics over time. Analytical expressions for the form of the recession curve could be derived for idealized basin shapes and subsurface flow properties, or curves could be fit empirically from streamflow measurements. Early applications were held back in part by the lack of a systematic procedure for determining an appropriate functional form and parameters for the recession curve shape from river discharge measurements (for reviews, see Hall, 1968;Tallaksen, 1995). Brutsaert and Nieber (1977) developed a procedure for visualizing the recession curve for a given river that has been widely used and adapted. A family of functions describing river recessions is given by the power laẇ where Q is river discharge,Q is its rate of change (dQ/dt), and a and b are parameters. b = 1 corresponds to exponential decay of the flow, Q(t) = Q(0)·e −at , while for b = 1, Eq. (1) implies For a river recession that follows this pattern, a scatter plot of log(Q) vs. log(Q) should approximate a straight line, since from Eq. (1), log(−Q) = log(a) + blog(Q), when Q > 0. This is convenient because values for a and b can be fitted to recession data by linear regression. Further, the appropriateness of the power law functional form can be assessed by how well the points in the scatter plot are described by a straight line. In practice, Brutsaert and Nieber (1977) took time series of daily streamflow for several streams in New York state and plottedQ as the difference in stream flow between consecutive days (that were at least 5 days after the most recent rain) against the corresponding Q (estimated as the average of the two days). a and b were then estimated by drawing lines to match the lower envelope of the cloud of (log(Q),log(−Q)) points. The slope of this lower envelope, corresponding to b, tended to be higher than 1 -around 1.5 at low flows, possibly increasing to about 3 at high flows. Wittenberg (1999) fit a power law to data from German watersheds, also finding that b was around 1.5. A number of studies have determined or assumed that, at least well after rain events, b = 1 to acceptable accuracy, so that only one parameter, a, must be estimated from measured streamflow (e.g. Vogel and Kroll, 1992;Brandes et al., 2005;Eng and Milly, 2007; van Dijk, 2010). In this case the reciprocal of a is the recession timescale τ , corresponding to the ratio −Q/Q. In general, for other shapes of the recession curve (functional relationships between Q andQ), this recession timescale τ would vary as a function of the flow rate Q. Brutsaert (2008) argued that the recession timescale is fairly constant not only for a given stream but also across streams, at least during summer low flows and for large basins, at 45 ± 15 days.
A large number of studies have examined the variability in recession timescale across streams, most often in small regions (e.g. Bingham, 1986;Vogel and Kroll, 1992;Brutsaert and Lopez, 1998;Wittenberg, 1999;Zhang et al., 2009;Biswal and Marani, 2010;Zhu et al., 2010), but also on continental and larger scales ( van Dijk, 2010;Peña-Arancibia et al., 2010), and related inter-stream variability in the recession timescale to climate, topographic, or geologic factors. However, these studies have tended to concentrate on lowflow periods and fit a simple functional form of the recession curve, generally the power law (Eq. 1, often with b = 1 so that the recession timescale is taken to be constant), to derive the recession timescale. Therefore, it is not clear how the recession timescale and its spatial distribution varies across flow rates. Thus, one purpose of the current study is to examine the characteristics of recession curves derived with uniform procedures from hourly streamflow data across a range of climate and terrain. Kirchner (2009) refined the estimation of recession curves based on measured Q andQ. In the approach of Kirchner (2009), the time series of hourly streamflow Q is sorted into bins and the averageQ is determined for each bin, using only time periods when both measured precipitation and estimated potential evaporation are small compared to streamflow, and parameters describing the relationship between Q andQ are fit for the binned data. Excluding time periods when evaporation might be a substantial part of the water budget avoids possible bias in the recession curve due to evaporation, which would be expected to increase the streamflow recession rate −Q at a given Q (Weisman, 1977;Wittenberg, 2003). It adds the complication of requiring hourly, rather than daily, streamflow data, since in most tropical and temperate watersheds, potential evaporation under rainless conditions is only likely to be much lower than streamflow at night; until recently, only daily streamflow data have generally been made available to the hydrological community. Kirchner (2009) showed that for two small watersheds in Wales, an empirically chosen quadratic functional form fits the binned hourly data well. b was found to be close to 2, and the departure (expressed by the quadratic term) from the log-linear power-law relationship was found to be small but significantly different from zero. We adopt the approach of Kirchner (2009) as a starting point because it has the advantage of making use of all hours for which streamflow data is available, excluding only those where other fluxes such as precipitation and evaporation are likely to be significant. Other common selection criteria, such as fitting the lower envelope of log(−Q)) or excluding streamflow records from a certain number of days after rain events, involve arbitrary thresholds and make it difficult to estimate the error of the fitted recession timescale. Brutsaert and Nieber (1977) chose to fit the lower envelope of log(−Q) on physical grounds -to select conditions under which groundwater flow is dominant, as opposed to other modes of flow with shorter recession timescales. In this study, our interest is in total streamflow, not in the groundwater component as such. For this objective, averaging all streamflow data that meet the selection criteria is more appropriate. Using hourly, as compared to daily, streamflow data enables the selection of low-evaporation periods and avoids bias in recession time estimates at higher flow rates when the recession timescale τ = −Q/Q is 1 day or less (Rupp and Selker, 2006a;Rupp and Woods, 2008). In this study, we consider only small watersheds (<100 km 2 ), so that the lag between runoff generation within the watershed and streamflow at the gauge is not much more than an hour and the measured discharge gives a reasonable estimate of hourly runoff.
The recession curve, expressed as the function τ (Q), can relate the streamflow Q to the basin water store S. The rate of change of storage,Ṡ, is the sum of the water fluxes in and out of the basin, namely streamflow, precipitation P , and evaporation E: When streamflow Q is the major flux of water in or out of the basin (precipitation and evaporation are small), we can make the approximatioṅ and therefore Having estimated τ as a function of Q for periods with low P and E from recession curve analysis, one can therefore estimate the change in storage S − S 0 corresponding to observed streamflows Q: where the reference streamflow Q 0 and reference storage level S 0 = S(Q 0 ) are arbitrary. Kirchner (2009) observed that by invoking water balance, one can estimate not only the rate of change in storage but also the net flow into the watershed F = P − E: The fitted function τ (Q) is derived from the recession curve over hours with negligible precipitation and evaporation; estimating S and F for other conditions from the above equations requires the assumption that the deduced relationship between Q and S continues to hold. Kirchner (2009) argued that this is indeed the case for his two study watersheds, as evidenced by the ability of the basin storage-discharge relationship estimated using Eq. (7) to successfully infer rainfall using only streamflow observations, and to predict the evolution of streamflow using measured precipitation and estimated evaporation. If so, streamflow time series could be used to infer other water flows (precipitation, evaporation) and stores at the catchment scale, which Kirchner (2009) termed "doing hydrology backward" compared with the conventional hydrology approach of deriving streamflow from meteorological forcing and the basin characteristics, which are often quite uncertain.
One recent application of "doing hydrology backward" is by Palmroth et al. (2010), who applied recession curve analysis to construct storage-discharge relationships to estimate evapotranspiration over parts of North Carolina state, although correlation with independent (eddy covariance) measurements of evapotranspiration was found to be poor. Brutsaert (2010) quantified changes in summer terrestrial water storage across the central United States in recent decades based on changes in summer streamflows along with an assumed recession timescale τ based on previous studies. The inferred changes in water storage were found to be consistent with groundwater observations in Illinois state (Brutsaert, 2008). To utilize this approach more widely, however, the validity of water storage changes inferred from recessioncurve analysis warrants further testing and comparison with available watershed-scale hydrometeorological data.
Multi-year time series of streamflow measurements at high (sub-hourly) temporal resolution are now freely available for many streams. Here, we employ these data to construct recession curves across a range of topography, geology, and climate in order to answer the following questions: 1. What is the variability across streams of the recession timescale at different flow rates? How much of this variability is correlated with factors such as climate and topography?
2. How does the variability in basin water storage inferred from streamflow recession curve analysis compare to basin water storage variability inferred from other, independent methods?

Streamflow data
The Hydroclimatic Data Network (HCDN) includes about 1500 stream gauge records from the United States Geological Survey (USGS) stream gauge network chosen to represent streams with long monitoring histories and whose flow has experienced minimal human disturbance (Slack and Landwehr, 1992). For this study, we chose HCDN gauges draining small watersheds (area under 100 km 2 ) in the coterminous United States (i.e. the first 48 states, excluding Alaska and Hawaii) which had daily records over at least 2/3 of the period 1979-2008. High-resolution streamflow measurements for these streams were obtained from the USGS Instantaneous Data Archive (IDA, http://ida.water.usgs.gov/ ida/; Showstack, 2007). The selection criteria yielded 75 streams, of which 61 had flow records at hourly or better resolution available through IDA. The median basin area for these 61 streams was 59 km 2 (range: 6.1-98 km 2 ), and basin locations covered a wide range of climate as well as topography, although most were in the wetter regions of the country, near the Atlantic and Pacific coasts (Fig. 1). Median streamflow per unit basin area was 1.17 mm day −1 (range: 0.14-4.10 mm day −1 ) ( Fig. 1), compared to an average of about 0.44 mm day −1 for the coterminous United States (USA) over the same period (Krakauer and Fung, 2008). The IDA streamflow records, typically at 15-min resolution, were averaged to generate hourly streamflow series. Streamflow values within one minute of the turn of the hour were assigned half-weight for estimating both hours' streamflow. Only hours with at least one streamflow measurement (or two measurements at their borders) were used in the analysis. The 61 streams in this study had an average of 148 thousand usable hours (equivalent to 17 years; range: 87-232 thousand). While these hours were not all consecutive (records frequently had gaps), this is not a problem for our analysis procedures.

Meteorology
Precipitation and evaporation records for the study watersheds were used to isolate hours with low/no precipitation or evaporation to use for estimating recession curves. Since sub-daily field measurements of precipitation and evaporation for each of the watersheds were not in general available, precipitation and evaporation were obtained from the National Centers for Environmental Prediction (NCEP) North American Regional Reanalysis (NARR) (Mesinger et al., 2006) for 1979-2008, which uses the Eta Model to simulate regional atmospheric circulation at relatively high horizontal resolution (32 km). For each watershed, meteorological fields were taken from the closest grid cell to the stream gauge -because the watersheds are all <100 km 2 while a NARR grid cell is ∼10 3 km 2 , the entire watershed is likely to lie within a single grid cell.
Precipitation in NARR assimilates rain gauge and satellite observations and is therefore much more accurate than previous reanalyses. Evaporation in NARR is simulated by the Noah land surface model and is only indirectly tied to observations (Mesinger et al., 2006).

Binning streamflow recession data
Following Brutsaert and Nieber (1977), hourly −Q was estimated as 1 t (Q h −Q h+1 ), the difference between streamflow in adjacent hours, while the corresponding hourly Q was the average for those hours, Q = 1 2 (Q h + Q h+1 ). Pairs of hours were selected for which both precipitation plus snowmelt and evaporation were less than 10 % of average streamflow (Fig. 2).
The NARR output fields include snowmelt, but the precipitation field is not divided into rain vs. snow. For determining rain-free hours, we required that the precipitation plus snowmelt be less than 0.1 of streamflow. For determining hours with low evaporation, we required that evaporation be less than 0.1 of streamflow (cf. Kirchner, 2009).
NARR output fields are available at 3 h time resolution and were matched to the corresponding hourly streamflow data. To account for delay between runoff generation and streamflow at the gauge location, we followed Kirchner (2009) in estimating this lag for each basin from the position of the maximum lagged cross-correlation between precipitation andQ. The streamflow time series were then shifted relative to the precipitation and evaporation time series by that amount before deciding what hours to exclude from the recession curve analysis as affected by precipitation or evaporation. The small size of the basins we analyzed kept this lag small (0-2 h for 50/61 basins and 0-5 h for 59/61, ranging up to 11 h).
Because the NARR grid size is bigger than the areas of our watersheds, there remains the concern that localized bursts of precipitation are not reflected in the NARR precipitation record. To reduce the effect that such unrecorded precipitation might have on the estimated recession curve, we further excluded periods of two or more consecutive positive hourlyQ (rising streamflow), on the assumption that these correspond to precipitation or snowmelt events not necessarily captured in NARR. Altogether, exclusion based on NARR precipitation, snowmelt, and evaporation and on observed rising flow left 0.6-27 % (median 7.3 %, or 10 thousand hours) of the original number of hours for constructing the streamflow recession curve, with the lower percentages found in arid basins (where there was often very little streamflow) and the higher percentages found in more humid basins.
The selected values ofQ (i.e. those when the lagged precipitation and snowmelt were small, and excluding periods of rising streamflow) were then averaged over ranges of Q. These ranges were selected as follows (cf. Kirchner, 2009): (1) begin with the top 1 % of the logarithmic range in Q; (2) compute the mean and standard error ofQ for all Q in that range; (3) if the number of values in the range is less than 9 or the meanQ is nonnegative or the standard error inQ is more than half its absolute mean value, expand the bin by another 1 % of the logarithmic range; (4) otherwise, keep the mean Q andQ of the bin and continue with the next 1 % of the logarithmic range. This resulted in typically 20-70 bins (median 58 bins), each with a mean Q, meanQ, and standard error ofQ. Regression on the binned values is basically equivalent to weighted regression on the original hourly data, where weighting is by the inverse variance ofQ over a small range in Q.
Occasionally (1 of our 61 sites) this procedure did not converge (i.e. the scatter inQ was large enough that the standard error did not drop sufficiently even when the bins were expanded). In such cases, the range in Q was divided into bins each with roughly equal numbers of elements (approximately the square root of the number of usable hours), and the mean and the standard error ofQ in each bin was calculated; bins with nonnegativeQ were simply discarded.

Fitting recession curves
We experimented with different functional forms for the recession curve, as fit to the binned log(Q) and log(−Q), including the linear-in-logs (power law) relationship (Eq. 3) and the quadratic-in-logs relationship (Eq. 4). The goodness of fit of different functional forms was assessed by fitting the functional form to one half of each streamflow record and calculating the misfit between the fitted values and the binned values found for the other half of the same streamflow record. We found that for our sample of watersheds, a nonparametric functional form corresponding to locally-weighted least squares linear regression (similar to LOWESS, Cleveland, 1979) gave a better fit than the linear or quadratic relationships. At each bin's value of log(Q) (log(Q bin )), a smoothed value of log(−Q) is obtained by weighted linear regression with weights that favor adjacent bins, namely wherew i is weighting based on the standard error at each bin (w i = (E i /log(−Q i )) −2 , where E i is the standard error of the binnedQ i ) and α is a parameter that sets the size of the neighborhood that is considered in the locally weighted linear regression. Between the points log(Q i ), the function was taken to be piecewise linear in log(Q). We used generalized cross validation (GCV, Craven and Wahba, 1979;Krakauer et al., 2004) to estimate a suitable value for α for each stream, which typically was around 0.3 log units. A lower limit of 0.1 log units was imposed on α to limit the tendency of GCV to occasionally return undersmoothed curves (Silverman, 1984). At the limit α → ∞, the result approaches a straight line (Eq. 3), while as α → 0 the fitted function becomes less smooth and approaches linear interpolation.
Given the above weights, uncertainties for the fitted function value were then calculated using standard linear regression methodology. If E i /log(−Q i ) in fact reflect the error standard deviation of the binned values and if this error is normally distributed, then r, the weighted sum of squares of the residuals from the fitted function, (where log(−Q i ) is the value of the fitted function at log(Q i )), should be close to the effective number of degrees of freedom n−m, where n is the number of bins and m the effective number of fitted parameters (which will increase as α decreases). In practice, we found that the r was often somewhat larger than this, presumably reflecting non-lognormal errors in streamflow measurements that propagate to the calculated −Q; the median ratio r/(n − m) across our sample was 1.5. Therefore, calculated uncertainties were multiplied by r n−m . This adjusted uncertainty for the fitted recession curve was found to be realistic: the difference between the binned log(−Q) and log(Q) derived from one half of a streamflow record and the fitted recession curve function derived from the other half was consistent with the adjusted uncertainty in the fit. Figure 3 shows the fitted function and its estimated uncertainty for an example site.
Alternative flexible functions to fit the recession curve are available, and should perform similarly to the piecewise linear (LOWESS-like) method we used. As an example, Fig. 4 shows a smoothing cubic spline fit (Cook and Peters, 1981) to binned discharge data along with the piecewise linear fit, with the smoothing parameter determined to approximate the mean square misfit of the corresponding piecewise linear curve (Woltring, 1986). (In this particular case, estimating the cubic spline smoothing parameter with GCV resulted in a clearly undersmoothed fit.) Both the piecewise linear and the cubic spline functions fit the data acceptably. In this study, we chose to use the piecewise linear fit rather than the cubic spline because the former gives straightforward estimates of the local uncertainty of the fit and because it has more predictable behavior at the extremes of the range of available streamflows. By contrast, linear regression does not represent the trend of the data well, while quadratic regression gives reasonable results for most of the observed range but does not capture the observed flattening of the recession time at both extremes (Fig. 4). Given that the slope of the relationship between log(−Q) and log(Q) is known to vary over the period following a rain event (Mizumura, 2005), reflecting changing saturated-zone thickness (Szilagyi, 2009), and that the details of this response will be modulated by heterogeneity in hydraulic properties within the watershed ) as well as drainage morphology (Biswal and Marani, 2010), a linear or quadratic function with few adjustable parameters is likely to be generally less suitable for modeling the streamflow recession curve as compared to a flexible locally smooth function.
Given the fitted recession curve, which gives log(−Q) as a piecewise linear function of log(Q) with associated uncertainty, the corresponding recession time τ (Q) is −Q/Q, with fractional uncertainty equal to that inQ (Fig. 3b). A lookup table was generated for the estimated storage S corresponding to various streamflow levels in the observed range by numerically integrating τ (Q)dQ (Eq. 8). Monthly mean watershed storage was then computed as the mean of hourly S calculated from the observed Q, and the variability of this monthly storage was compared with satellite and model estimates of water storage variability.

Explaining inter-stream variability in recession curves
For understanding the relationship between the recession curve and watershed hydrological processes, analysis of factors correlated with variability in the recession curve, as expressed by the function τ (Q), is helpful. For each stream, the function τ (Q) was expressed as 31 values for τ logarithmically spaced in Q for Q ranging between 0.0048 and 1.3 mm h −1 , corresponding to the median range of the streamflow bins used to fit the recession curves. Potential predictor variables used were mean streamflow (calculated from the streamflow time series); gauge longitude, latitude, and elevation, basin area, precipitation, climatological January minimum temperature, mean elevation, percent forest cover, percent lake cover, and soil water infiltration capacity, and stream length and slope (taken from the HCDN data files); and annual mean precipitation, snowfall, and evaporation (taken from NARR). The nonparametric (Spearman) correlation coefficient of each variable against log(τ ) at each of the 31 flow rates was calculated, and the mean square correlation coefficient across the 31 flow rates was compared to that obtained from regressions with 1000 random permutations of the predictor values to assess whether this variable is a significantly correlated to τ (Q) at the 0.05 level.
Stepwise multiple linear regression against log(τ ) (weighted by its site-specific estimated uncertainty) was also performed, where at each step the predictor variable was added whose inclusion most increased the weighted mean R 2 of the regression model. The procedure was terminated when the increase in R 2 from another variable being added to the regression model was not significant at the 0.05 level, as quantified by comparing to the increase in R 2 when values for that variable were randomly permuted before being added to the regression. Any missing values for predictor variables in the HCDN data file were filled in with the average value for that variable, to minimize bias in the estimated regression coefficients.

Water storage data
Monthly terrestrial water storage anomalies on a 1 • × 1 • grid for 2002-2008, estimated from gravitational anomalies reflected in the GRACE satellite positions, were obtained from the Jet Propulsion Laboratory (JPL). The water storage anomalies were derived from gravitational fields estimated from the satellite orbits by CSR (U. Texas/Center for Space Research), destriped to minimize processing artifacts, and rescaled based on a land surface model to minimize error in the storage anomaly magnitudes due to contributions from adjacent areas to the measured gravity anomaly (Swenson and Wahr, 2006;Swenson, unpublished). Another set of estimates of monthly terrestrial water storage anomalies on a 1 • × 1 • grid for 2001-2008 from the Noah land surface model run using observed meteorological forcing as part of the Global Land Data Assimilation System (Ek et al., 2003;Rodell et al., 2004)  . Variability in recession curves across streams is consistently much greater than the uncertainty associated with recession-curve estimation for individual streams, meaning that most of the variability in recession times seen across streams is real.
Measures of water storage variability calculated from the stream recession curves and for the corresponding grid cells in the GRACE and Noah datasets included the seasonal cycle amplitude (the standard deviation of the mean seasonal cycle) and interannual variability (the standard deviation of monthly storage anomalies once the mean seasonal cycle has been removed). Additionally, we calculated correlation coefficients to assess to what extent the spatial pattern of seasonal to interannual variability magnitudes was consistent between the streamflow-based, GRACE, and Noah storages, and to compare the temporal variation in water storage during the period of overlap of the different estimates.

Recession curves
Recession time as a function of streamflow Q showed broadly similar patterns across the sample of watersheds, characteristically decreasing from ∼ 10 days at the lowest streamflow rates resolvable with our binning method (∼ 0.005 mm h −1 ) to ∼ 1 day at high streamflow rates seen soon after rain or snowmelt (∼ 1 mm h −1 ) (Fig. 5). These timescales are similar to the pattern seen in stream recession curves constructed in previous studies (Brutsaert and Nieber, 1977;Vogel and Kroll, 1992;Brandes et al., 2005;Kirchner, 2009; van Dijk, 2010;Peña-Arancibia et al., 2010), although notably smaller than the 45 ± 15 day timescale for low-flow conditions seen in the studies cited by Brutsaert (2008), perhaps because of differences in the estimation procedure. The lower-envelope approach would tend to give smallerQ and larger τ compared to an averaging like the one used here. Note that the function τ (Q) is on average almost flat at low streamflows (corresponding to b ≈ 1 in the  Fig. 6. Inverse-variance weighted mean recession timescale τ across sites (as in Fig. 5b) along with the mean recession timescale for sensitivity analyses where the hours considered (1) included periods of high evaporation, (2) included periods of rising flow, (3) used an absolute rather than a relative precipitation-intensity threshold for excluding hours with precipitation. power-law relationship, Eq. 1), while it decreases with increasing Q at higher streamflows (corresponding to b ≈ 1.6); thus, no single power-law relationship can represent accurately the typical recession curve.
We see from Fig. 5a that there is across-stream variability of an order of magnitude in the recession time τ for any given flow rate Q. Comparing with Fig. 3b suggests that this variability is larger than the uncertainty in the fitted recession time curve for any one stream. This is confirmed by comparing the weighted mean fit uncertainty (spread of lines in Fig. 5b) with the total standard deviation across streams (error bars in Fig. 5b), which is much greater across flow rates.
We performed three sensitivity analyses to understand the impact of the criteria for choosing suitable hours on the recession curve. In one analysis, we did not exclude hours with high evaporation, as most previous analyses did not. This typically resulted in little change in the recession curve at high flows (when evaporation was likely small compared to flow), but lowered the recession timescale by up to 40 % during low flow, qualitatively similar to the finding of Weisman (1977) that flow diminished faster during periods of high evaporation than during periods of low evaporation (Fig. 6). In a second analysis, we did not exclude hours with rising streamflow (but little precipitation according to NARR). This resulted in a longer recession timescale at all flow rates, with the recession timescale almost doubling at low flow rates (Fig. 6). In a third analysis, we excluded hours with precipitation based on an absolute cutoff (precipitation of 0.01 mm h −1 or more) rather than a relative cutoff (precipitation equal to 10 % of streamflow or more). This led to small increases (up to 11 %) in the mean recession timescale at low flow rates (where some hours with slight precipitation were now included) and small decreases (up to 4 %) in the mean recession timescale at high flow rates (where some hours with slight precipitation were now excluded) (Fig. 6).
Given the variability across streams in recession timescales, it is of interest to determine what basinspecific factors could influence the recession timescale. Longer recession timescales were significantly correlated (in decreasing order of significance) with higher ratio of streamflow to precipitation, larger channel slope, higher elevation, more forest cover, higher basin soil infiltration capacity, lower longitude (i.e. western as compared to eastern USA), lower temperature, higher latitude (i.e. northern as compared to southern USA), and higher cold-season precipitation fraction. Many of these predictor variables were highly correlated with each other (for example, slope and elevation, or latitude and temperature). Recession timescales for the studied sample of small watersheds were not significantly correlated with lake cover, channel length, watershed area, or mean precipitation, streamflow, or evaporation. Ratios of stream channel length to drainage area, which occur in many analytical expressions for recession time constants of idealized aquifers (Rupp and Selker, 2006b, Table 3), also were not significant predictors in this sample of small watersheds.
Stepwise multiple regression analysis yielded a model with six predictor variables, in the order they were added the model: (1) longitude, (2) soil infiltration capacity, (3) latitude, (4) channel length, (5) forest cover, (6) HCDN precipitation. Of these, (4) and (6) were not found to be significant predictors in the univariate analysis, (cf. Fig. 7a). The relationship of the predictor variables to the recession timescale varied across flow rates: for example, the soil infiltration capacity showed a significant positive association with recession timescale only at low and moderate flow rates, while forest cover showed a significant positive association with recession timescale only at higher flow rates; the correlation of recession timescale with latitude was positive only at high flow rates (Fig. 7a). The multivariate model best predicted recession timescales at high flow rates, where R 2 was above 0.8, while at low flow rates R 2 was 0.3-0.5; the average R 2 across the range of flow rates was 0.57 (Fig. 7b). (Modeling recession timescale with the same predictor variables across flow rates simplifies the determination of overall model statistical significance, since the recession timescales at different flow rates are not assumed to be independent. We also tried an alternative approach, where a multivariate regression model was fitted independently for each flow rate. The significant predictor variables were different between flow rates, as suggested by Fig. 7a, and high flow rates tended to have more significant predictor variables and higher model R 2 compared to low flow rates.) Spatial semi-variograms of the recession timescale at particular flow rates (not shown) showed no evidence of spatial clustering, beyond the continental-scale east-west and northsouth trends captured by the linear relationship with longitude and latitude.

Basin storage
Seasonal variability in storage inferred from streamflow and the recession curve showed good coherence with the variability in terrestrial water storage inferred from GRACE, with a median (across sites) coefficient of determination (R 2 ) between the two of 0.69, compared with a median R 2 of 0.41 between the modeled (Noah) seasonal cycle and GRACE over the same grid cells. This performance of the recession curve technique in matching the phase of the GRACE seasonal cycle in water storage is particularly impressive because of the scale difference between the watershed size and the GRACE data (tens of km 2 versus ∼ 10 4 km 2 ), as compared to the similarity in scale between GRACE and the Noah simulations. Interannual variability in storage (computed for each site over months when the two data sets overlapped) was less coherent between the streamflow and GRACE approaches, with a median R 2 of 0.22, but this was better than the correlation of Noah interannual variability with GRACE, where the median R 2 was only 0.06. Both the seasonal and interannual variability in storage as inferred from the recession curves were generally lower than those derived from GRACE by around a factor of 10 (Fig. 8). The median ratio between streamflow-inferred and GRACE standard deviation in storage was 0.081 for the annual cycle and 0.106 for interannual variability. Storage variability in the Noah model was lower than the results from GRACE, because Noah does not represent variability in groundwater and surface water (Syed et al., 2008), but was still generally higher than the variability inferred from the recession curves. The median ratio between streamflowinferred and Noah standard deviation in storage was 0.389 for the annual cycle and 0.227 for interannual variability. Also, there was little correlation across watersheds between storage amplitude inferred from streamflow and that inferred from GRACE (Fig. 8).

What accounts for variability in recession timescales across streams?
In our sample, inter-stream variability in the recession timescale τ (Q) was correlated with measures of climate (ratio of streamflow to precipitation, forest cover, temperature), topography (elevation, channel slope), and soil (infiltration capacity). The geographic patterns observed (with higher recession timescales in the west and north) are probably related to the continental gradient in climate and geomorphology. The influencing variables and the direction of correlations between them and streamflow are largely consistent with those found in previous studies.
In studies by van Dijk (2010) in Australia and Peña-Arancibia et al. (2010) across the tropics and subtropics, basin aridity was found to be a dominant control on recession time, with more arid areas having shorter recession timescales. In the current sample of temperate-zone watersheds, we found a similar pattern: a lower ratio of streamflow to precipitation correlated with shorter recession timescale.
High forest cover, typically associated with moist conditions, was associated with longer recession times, as also found by Peña-Arancibia et al. (2010). The positive correlation between forest cover and recession timescale was found only at relatively high flow rates. Vegetation cover and density has a major impact on the spatial organization of soil moisture (Mohanty et al., 2000;Gómez-Plaza et al., 2000Qiu et al., 2001;Cantón et al., 2004;Temimi et al., 2010). The presence of vegetation fosters the retention of water in the canopy, litter layer, and root zone, which leads to slower drainage and therefore longer recession timescales immediately after storms. Roering et al. (2010) found that trees modify the topography around them by promoting soil formation and porosity and reducing erosion, which would tend to enhance the percolation of precipitation. The reduction of surface evaporation by vegetation shading (Hébrard et al., 2006) would also tend to increase recession timescale. On the other hand, the high transpiration rates of forests tend to drive down deep soil moisture during dry spells and reduce summer low flows, which would correspond to shorter recession timescales (Federer, 1973;Johnson, 1998). These evaporation-related impacts should be less pronounced in our analysis because we excluded periods with high evaporation when computing recession curves.
We found that channel slope and basin elevation were positively correlated with the recession timescale. Peña-Arancibia et al. (2010) also found a positive correlation between basin slope and recession timescale. This contradicts the theoretical expectation of a negative correlation: if the contributing aquifer has a larger slope, it would be expected to drain faster, all else being equal (Brutsaert and Nieber, 1977;Vogel and Kroll, 1992). A negative correlation is also seen in some observational studies covering smaller spatial scales (Zecharias and Brutsaert, 1988;Brandes et al., 2005). One possible explanation for this discrepancy is that local or watershed-level attributes like channel slope and elevation do not necessarily correspond to aquifer properties, which would depend more on regional topography and geology (Temimi et al., 2010). The positive correlation between soil infiltration capacity and recession timescale is more intuitive, although Peña-Arancibia et al. (2010) found no correlation between the recession timescale and mapped soil infiltrability and drainage indices.
An original contribution of this study is the attempt to quantify the factors controlling the recession timescale at different flow rates, rather than estimating a single recession timescale for each watershed. We found that some variables were significant predictors of the recession timescale only at high or low flow rates, showing the value of explicitly including flow rate in this sort of regression analysis. Particularly at low flow rates, there also appeared to be substantial interstream variability in the recession timescale not captured by the set of predictor variables we used. Studies of variability in recession timescales across smaller spatial scales of tens to hundreds of km point to important geological controls associated with indicators such as bedrock porosity, drainage density (based on total channel length, including tributaries), and soil group (e.g. Bingham, 1986;Brandes et al., 2005) that were not available for our sample of watersheds. van Dijk (2010) found that, after controlling for aridity, variability in recession timescales in Australia was spatially correlated over distances of 100-150 km, presumably reflecting geologic or topographic controls on soil and bedrock properties that was not reflected in the set of predictor variables used.
Intensive studies of flow pathways in research watersheds as well as studies of large samples of small gauged watersheds with watershed properties estimated from remote sensing and other distributed data sets can help characterize the link between watershed geology and morphology, on the one hand, and stream hydrology as reflected in the recession timescale, on the other, on a regional to global scale.

Why are storage amplitudes inferred from the recession curve so small?
We found that recession timescales derived from the recession curve constructed for periods of low precipitation, evaporation, and snowmelt to estimate a watershed storagedischarge relationship can be used to estimate monthly storage fluctuations that are coherent with those inferred from GRACE -more so, in fact, than those estimated by a state-ofthe-art land surface model forced by observed meteorology (Noah/GLDAS). Thus, recession curve analysis is promising for extending the GRACE record of terrestrial water storage variability to higher spatial and temporal resolution or over longer time periods. Additional study is required to extend discharge-based storage estimates to larger basins with concentration times of days or weeks, where the method used here to construct recession curves probably would not work, for direct comparisons with GRACE and with models over a regional spatial scale. The annual cycle and the interannual variability estimated from streamflow recession curve analysis prove to be similar in time evolution to those measured by GRACE, but are typically smaller by a factor of 10. This suggests that quantitative estimates of basin storage based on streamflow fluctuations and recession analysis should be treated with caution. We see several possible reasons for the small dynamic storage found with the recession curve approach. No single reason appears sufficient to explain the full magnitude of the disparity with GRACE, but several of them taken together may do so.
1. Because of the scale mismatch between GRACE and our study watersheds, the storage variability is being compared over quite different spatial scales. It is possible that our sample of watersheds represents a subset of the coterminous USA with particularly low water storage capacity, perhaps because these are disproportionally mountainous watersheds with high hydraulic gradients and limited soil profile development. However, we see low storage amplitude, as compared to GRACE, even in watersheds with relatively little topographic relief. Further, we would expect that if all things are equal, storage amplitude measured over a small watershed would be larger than the regional mean sensed by GRACE, because storage variations in adjacent watersheds within a region partly cancel out in the regional mean to the extent that they are not completely in phase.
2. It is possible that our assumption is false that during periods with low precipitation, evaporation, and snowmelt (according to the reanalysis), streamflow is the dominant flux of water in or out of the watershed. If so, then the streamflow-storage relationship we construct would be biased. It is clear from the streamflow record that the reanalysis frequently misses periods of heavy snowmelt in high-mountain basins, partly because the large topographic relief in these basins (which impacts the periods of snow accumulation and melt) is not captured by the ∼32 km grid spacing of the reanalysis. The atmospheric model used in NARR tends to underestimate mountain snowfall, so that snow is added to the model land surface in the analysis steps to nudge NARR toward observed snow cover (Luo et al., 2007). However, this would not explain the low storage amplitudes compared to GRACE observed in basins where snow is not a major part of the water budget.
3. Schaller and Fan (2009) suggested that groundwater movement out of headwater basins that is not reflected in streamflow could be an important term in basin-scale water balance over the USA, which would also seriously bias the storage-discharge relationships we constructed. However, given that basins with net groundwater outflow must be largely balanced on the continental scale by those with net groundwater inflow, it is difficult to see how this effect could be large and consistent enough to account for the systematic underestimation of the storage amplitude seen in almost our entire sample.
4. We assumed that when flow drops to zero, basin storage remains constant at a minimum value extrapolated from the storage-discharge relationship for periods of positive flow. This unquestionably results in an underestimate of the storage variability, since evaporation will result in decreasing storage over periods of zero flow. However, this cannot explain the disparity seen, since only 15/61 of our sample basins have any recorded hours of zero flow.
5. As we saw, assuming that a particular stream follows a single recession curve can be taken to imply that discharge for that stream is a single-valued function of basin water storage. While this assumption holds in analytical solution of some very simple aquifer models and for flow systems dominated by deep, homogenous aquifers (Brutsaert and Nieber, 1977;Dewandel et al., 2003;Rupp and Selker, 2006b), more complex watershed models show a clear dependence of flow rate on the time history of water input (rainfall or snowmelt), so that flow is not a single-valued function of basin storage and a recession curve plot will show systematic scatter (Sloan, 2000;Rupp et al., 2009). In such cases our approach to fitting a recession curve will produce averages of the rate of change in flowQ at given flow rates Q. However, using these averages may lead to biased estimates of the recession timescale τ (Q) = −Q/Q (since in general −Q/Q = −Q(1/Q)). This bias propagates to the dynamic storage, estimated by integrating τ (Q) (Eq. 8). For example, if −Q at a given Q is distributed lognormally across recession events, our average τ = −Q/Q would be biased low, which would lead to too-low calculated storage amplitude: specifically, if the standard deviation of log(−Q) is σ , the actual average τ would be e σ 2 times the value estimated using our method.
To roughly estimate the magnitude of this bias, we fit power law curves (Eq. 2) to individual recession events (defined as at least 48 consecutive hours of declining flow, with negligible rain and snowmelt, but including hours with non-negligible evaporation) for streams in our sample. We found that the standard deviation σ across recession events was typically 0.3-0.6 log units (Fig. 9), and inspection of published data for other streams (e.g. Fig. 7 of Rupp et al., 2009) shows similar spreads. If this spread is dominated by real variability rather than by measurement error or error arising from our approximation of each recession event as following a power law, then the mean recession timescales and storage amplitude may be some 10 %-40 % higher (e σ 2 −1) than estimated from the mean recession curve. Further study of how best to quantify and correct for this source of bias in recession curve analysis is needed.
6. Finally, the assumption that the storage-discharge relationship as determined for hours of low precipitation, evaporation, and snowmelt is valid for other periods, which is necessary for computing monthly-mean storage, may not hold. For example, the pool of water contributing to evaporation (largely soil moisture) and the pool of water contributing to streamflow (largely groundwater) may be partly decoupled, so that changes in basin storage due to evaporation would not be reflected in streamflow to the same extent as changes in basin storage due to streamflow. Similarly, when there is precipitation, streamflow generation is likely to be qualitatively different than during periods without precipitation, with overland flow and greater contributing area, resulting in a different storage-discharge relationship than that inferred here. This would mean that "doing hydrology backward" based on the storage-discharge relationship inferred from recession curve analysis using Eq. (9) does not work for small watersheds generally, although the concept may nevertheless have value when it is approximately valid, for example, over periods of light precipitation. Note, however, that explaining the majority of the observed discrepancy as due to a non-constant storage-discharge relationship would require this relationship to undergo very large fluctuations over periods of high precipitation, evaporation, and/or snowmelt (enough to increase the mean recession timescale by a factor of 10; cf. Eq. 7), compared to the relatively modest change seen when the subset of hours chosen for constructing the recession curve is modified (Fig. 6).

Conclusions
We have outlined a systematic method for constructing recession curves for small watersheds based on high-frequency streamflow measurements combined with reanalysis meteorology. We found that for the selected continent-wide sample of small, undisturbed watersheds, recession curves, as constructed by a uniform method intended to minimize the impacts of precipitation, snowmelt, and evapotranspiration, had broad similarities, with recession timescales typically increasing by a factor of 10 going from high flows as seen immediately after storms to flows near the median level, and leveling off at low flows. We were able to quantify the uncertainty in each recession curve, and linked variability in the recession timescale across watersheds to known climatic and geomorphological factors, but with a component of smallscale variability (particularly at low flow rates) which needs to be investigated in larger samples or with more explanatory variables. Storage variations inferred from the recession curve agree in terms of timing, but not amplitude, with independent gravimetric estimates. Study of the discrepancy in the inferred storage amplitude may provide clues to the range of validity of the recession curve constructed according to the method used here.