Flood risk reduction and flow buffering as ecosystem services: a flow persistence indicator for watershed health

Flood damage depends on location and adaptation of human presence and activity to inherent variability of river flow. Reduced predictability of river flow is a common sign of degrading watersheds associated with increased flooding risk and reduced dry-season flows. The dimensionless FlowPer parameter (Fp), representing predictability, is key to a parsimonious recursive model of river flow, Qt = FpQt-1 + (1-Fp)(Pt-Etx), with Q, P and E expressed in mm d-1. Fp varies between 0 and 1, and can be derived from a time-series of measured (or modeled) river flow data. The spatially averaged precipitation term Pt and preceding cumulative evapotranspiration since previous rain Etx are treated as constrained but unknown, stochastic variables. A decrease in Fp from 0.9 to 0.8 means peak flow doubling from 10 to 20% of peak rainfall (minus its accompanying Etx) and, in a numerical example, an increase in expected flood duration by 3 days. We compared Fp estimates from four meso-scale watersheds in Indonesia and Thailand, with varying climate, geology and land cover history, at a decadal time scale. Wet-season (3-monthly) Fp values are lower than dry-season values in climates with pronounced seasonality. A wet-season Fp value above 0.7 was achievable in forest-agroforestry mosaic case studies. Interannual variability in Fp is large relative to effects of land cover change; multiple years of paired-plot data are needed to reject no-change null-hypotheses. While empirical evidence at scale is understandably scarce, Fp trends over time serve as a holistic scale-dependent performance indicator of degrading/recovering watershed health.


Introduction
Degradation of watersheds and its consequences for river flow regime and flooding intensity are a widespread concern (Brauman et al., 2007; Bishop and Pagiola, 2012; Winsemius et al., 2013). Current watershed rehabilitation programs that focus on increasing tree cover in upper watersheds are only partly aligned with current scientific evidence of effects of large-scale tree planting on streamflow (Ghimire et al., 2014; Malmer et al., 2010; Palmer, 2009; van Noordwijk et al., 2007, 2015; Verbist et al., 2010). The relationship between floods and change in forest quality and quantity, and the availability of evidence for such a relationship at various scales, has been widely discussed over the past decades (Andréassian, 2004; Bruijnzeel, 2004; Bradshaw et al., 2007; van Dijk et al., 2009). The ratio between peak and average flow decreases from headwater streams to main rivers in a predictable manner; while mean annual discharge scales with (area)^1.0, maximum river flow scales with (area)^0.7 on average (Rodríguez-Iturbe and Rinaldo, 2001; van Noordwijk et al., 1998). The determinants of peak flows are thus scale-dependent, with space-time correlations in rainfall interacting with subcatchment-level flow buffering in peakflows at any point along the river.
Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2015-538, 2016. Manuscript under review for journal Hydrol. Earth Syst. Sci. Published: 19 January 2016. © Author(s) 2016. CC-BY 3.0 License.
Whether and where peakflows lead to flooding depends on the capacity of the rivers to pass on peakflows towards downstream lakes or the sea, assisted by riparian buffer areas with sufficient storage capacity (Baldasarre et al., 2013). Well-studied effects of forest conversion on peak flows in small upper stream catchments (Alila et al., 2009) do not necessarily translate to flooding downstream. As summarized by Beck et al. (2013), meso- to macroscale catchment studies (>1 and >10 000 km2, respectively) in the tropics, subtropics, and warm temperate regions have mostly failed to demonstrate a clear relationship between river flow and change in forest area. Lack of evidence cannot be firmly interpreted as evidence for lack of effect, however. A recent econometric study for Peninsular Malaysia by Tan-Soo et al. (2014) concluded that, after appropriate corrections for space-time correlates in the data-set for 31 meso- and macroscale basins (554-28,643 km2), conversion of inland rain forest to monocultural plantations of oil palm or rubber increased the number of flooding days reported, but not the number of flood events, while conversion of wetland forests to urban areas reduced downstream flood duration. This study may be the first credible empirical evidence at this scale. The difference between results for flood duration and flood frequency and the result for draining wetland forests warrant further scrutiny. Consistency of these findings with river flow models based on a water balance and likely pathways of water under the influence of change in land cover and land use has yet to be shown.
Two recent studies for Southern China confirm the conventional perspective that deforestation increases high flows, but contrast in the effects they report for reforestation. Zhou et al. (2010) analyzed a 50-year data set for Guangdong Province in China and concluded that forest recovery had not changed the annual water yield (or its underpinning water balance terms precipitation and evapotranspiration), but had a statistically significant positive effect on dry season (low) flows. Liu et al. (2015), however, found for the Meijiang watershed (6983 km2) in subtropical China that while historical deforestation had decreased the magnitudes of low flows (daily flows ≤ Q95%) by 30.1%, low flows were not significantly improved by reforestation. They concluded that recovery of low flows by reforestation may take much longer than expected, probably because of severe soil erosion and the resultant loss of soil infiltration capacity after deforestation.
The statistical challenges of attribution of cause and effect in such data-sets are considerable, with land use/land cover interacting with spatially and temporally variable rainfall and geological configuration, and with land use not changing in random fashion or following any pre-randomized design (Alila et al., 2009; Rudel et al., 2005). Hydrologic analysis across 12 catchments in Puerto Rico by Beck et al. (2013) did not find significant relationships between the change in forest cover or urban area and change in various flow characteristics, despite indications that regrowing forests increased evapotranspiration. Yet, the concept of a 'regulating function' on river flow regime for forests and other semi-natural ecosystems is widespread. The considerable human and economic costs of flooding at locations and times beyond where this is expected make the presumed 'regulating function' on flood reduction of high value (Brauman et al., 2007), if only we could be sure that the effect is real beyond the local scales (<10 km2) of paired catchments where ample direct empirical proof exists (Bruijnzeel, 1990, 2004). Here we will explore a simple recursive model of river flow (van Noordwijk et al., 2011) that (i) is focused on (loss of) predictability, (ii) can account for the types of results obtained by the cited recent Malaysian study (Tan-Soo et al., 2014), and (iii) may constitute a suitable performance indicator of watershed 'health' through time, combining statistical properties of the local rainfall regime, land cover effects on soil structure and any engineering modifications of water flow (Ma et al., 2014).
Fig. 1
Figure 1 is compatible with a common dissection of risk as the product of hazard, exposure and vulnerability. Extreme discharge events plus river-level engineering co-determine hazard, while exposure depends on topographic position interacting with human presence, and vulnerability can be modified by engineering at a finer scale. A recent study (Jongman et al., 2015) found that human fatalities and material losses between 1980 and 2010, expressed as a share of the exposed population and gross domestic product, were decreasing with rising income. Yet, the planning needed to avoid extensive damage requires quantification of the risk of higher than usual discharges, especially at the upper tail end of the flow frequency distribution.
The statistical scarcity of 'extreme events' and the challenge of data collection where they do occur make it hard to rely on empirical data as such. Existing data on flood frequency and duration, as well as human and economic damage, are influenced by topography, human population density and economic activity, interacting with engineered infrastructure (steps 5-9 in Fig. 1), as well as the extreme rainfall events that are their proximate cause. Common hydrological analysis of flood frequency (called 1 in 10-, 1 in 100-, 1 in 1000-year flood events, for example) does not separately attribute flood magnitude to rainfall and land use properties, and analysis of likely change in flood frequencies in the context of climate change adaptation has been challenging (Milly et al., 2002; Ma et al., 2014). There is a lack of simple performance indicators for watershed health (step 3 in Fig. 1) that align with local observations of river behavior and concerns about its change and that can reconcile local, public/policy and scientific knowledge, thereby helping negotiated change in watershed management (Leimona et al., 2015). The behavior of rivers depends on many climatic (step 4 in Fig. 1) and terrain factors (step 1 in Fig. 1) that make it a challenge to differentiate between anthropogenically induced ecosystem structural and soil degradation (step 0) and intrinsic variability (Fig. 1).
Hydrologic models tend to focus on predicting hydrographs and are usually tested on data-sets from limited locations. Despite many decades of hydrologic modeling, current hydrologic theory, models and empirical methods have been found to be largely inadequate for sound predictions in ungauged basins (Hrachowitz et al., 2013). Efforts to resolve this through harmonization of modelling strategies have so far failed. Existing models differ in the number of explanatory variables and parameters they use, but are generally dependent on empirical rainfall data that are available for specific measurement points but not at the spatial resolution that is required for a close match between measured and modeled river flow. Spatially explicit models have conceptual appeal (Ma et al., 2010) but have too many degrees of freedom and too many opportunities for getting right answers for wrong reasons if used for empirical calibration (Beven, 2011). Parsimonious, parameter-sparse models are appropriate for the level of evidence available to constrain them, but these parameters are themselves implicitly influenced by many aspects of existing and changing features of the watershed, making it hard to use such models for scenario studies of interacting land use and climate change. Here we present a more direct approach, deriving a metric of flow predictability that can bridge local concerns and concepts to quantified hydrologic function: the 'flow persistence' parameter (step 3 in Fig. 1).
In this contribution to the debate on forests and floods we will first define the metric 'flow persistence' in the context of temporal autocorrelation of river flow and derive a way to estimate its numerical value. We will then apply the algorithm to river flow data for a number of contrasting meso-scale watersheds, representing variation in rainfall and land cover, and test the internal consistency of results based on historical data: one located in the humid tropics of Indonesia, and one in the unimodal subhumid tropics of northern Thailand. As a next step we show how projected changes in rainfall patterns (frequency, intensity, temporal and spatial autocorrelation) are expected to interact with changes in land cover, soil infiltration behaviour and landscape-level buffering elements such as wetlands and impoundments, on the regularity of river flow, as captured by the flow persistence metric.
Possible applications of the flow persistence metric to questions on low flows are left for a later analysis. In the discussion we will consider the new flow persistence metric in terms of three groups of criteria (Clark et al., 2011; Lusiana et al., 2011; Leimona et al., 2015) based on salience (1, 2), credibility (3, 4) and legitimacy (5-7):
1. Does flow persistence relate to important aspects of watershed behavior?
2. Does its quantification help to select management actions?
3. Is there consistency of numerical results?
4. How sensitive is it to noise in data sources?
5. Does it match local knowledge?
6. Can it be used to empower local stakeholders of watershed management?
7. Can it inform local risk management?
2 Flow persistence as a suitable hydrological metric: theory

Basic equations
One of the easiest-to-observe aspects of a river is its day-to-day fluctuation in water level, related to the volumetric flow (discharge) via rating curves (Maidment, 1992). Without knowing details of upstream rainfall and the pathways the rain takes to reach the river, observation of the daily fluctuations in water level allows important inferences to be made. It is also of direct utility: sudden rises can lead to floods without sufficient warning, while rapid decline makes water utilization difficult. Indeed, a common local description of watershed degradation is that rivers become more 'flashy' and less predictable, having lost a buffer or 'sponge' effect (Joshi et al., 2004; Ranieri et al., 2004; Rahayu et al., 2013). Probably the simplest model of river flow at time t, Qt, is that it is similar to that of the day before (Qt-1), to the degree Fp, a dimensionless parameter called 'flow persistence' (van Noordwijk et al., 2011), plus an additional stochastic term ε:
$$Q_t = F_p Q_{t-1} + \varepsilon \qquad (1)$$
Qt is for this analysis expressed in mm d-1, which means that measurements in m3 s-1 need to be divided by the relevant catchment area, with appropriate unit conversion. If river flow were constant, it would be perfectly predictable, i.e. Fp would be 1.0 and ε zero; in contrast, an Fp value equal to zero and ε directly reflecting erratic rainfall represents the lowest possible predictability.
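The unit conversion mentioned here can be sketched as follows. This is a minimal illustration only: the function name `discharge_to_mm_per_day` and the example discharge and catchment area are hypothetical and do not come from the study sites.

```python
SECONDS_PER_DAY = 86_400


def discharge_to_mm_per_day(q_m3_s: float, area_km2: float) -> float:
    """Convert discharge (m3 s-1) to an area-normalised flow depth (mm d-1)."""
    if area_km2 <= 0:
        raise ValueError("catchment area must be positive")
    volume_m3_per_day = q_m3_s * SECONDS_PER_DAY  # daily flow volume
    area_m2 = area_km2 * 1e6                      # km2 -> m2
    depth_m_per_day = volume_m3_per_day / area_m2
    return depth_m_per_day * 1000.0               # m -> mm


# Hypothetical example: 100 m3 s-1 leaving an 864 km2 catchment,
# which corresponds to a flow depth of about 10 mm d-1.
q_mm = discharge_to_mm_per_day(100.0, 864.0)
print(round(q_mm, 3))
```

Expressing flow as a depth per day puts Q on the same footing as precipitation and evapotranspiration, which is what allows the three terms to be combined in a single recursive equation.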
The Fp parameter is conceptually identical to the 'recession constant' commonly used in hydrological models, typically assessed during an extended dry period when the ε term is negligible and streamflow consists of baseflow only (Tallaksen, 1995); empirical deviations from a straight line in a plot of the logarithm of Q against time are common and point to multiple rather than a single groundwater pool that contributes to base flow.With increasing size of a catchment area it is increasingly likely that there indeed are multiple, partly independent groundwater contributions.
As we will demonstrate, it is possible to derive Fp even when ε is not negligible. In climates without a distinct dry season this is essential; elsewhere it allows a comparison of apparent Fp between wet and dry parts of the hydrologic year. A decrease of Fp over the years indicates 'watershed degradation' (i.e. greater contrast between high and low flows), and an increase 'improvement' or 'rehabilitation' (i.e. more stable flows).
If we consider the sum of river flow over a sufficiently long period, we can expect ΣQt to closely approximate ΣQt-1, and thus
$$\Sigma Q_t = F_p \Sigma Q_{t-1} + \Sigma\varepsilon \approx F_p \Sigma Q_t + \Sigma\varepsilon \qquad (2)$$
From this relationship we obtain a first way of estimating the Fp value if a complete hydrograph is available:
$$F_p = 1 - \Sigma\varepsilon / \Sigma Q_t \qquad (3)$$
Rearranging Eq. (3) we obtain
$$\Sigma\varepsilon = (1 - F_p) \Sigma Q_t \qquad (4)$$
The Fp term is equivalent with one of several ways to separate baseflow from peakflows. The Σε term reflects the sum of peak flows in mm, while Fp ΣQt reflects the sum of base flow, also in mm. For Fp = 1 (the theoretical maximum) we conclude that all ε must be zero, and all flow is 'base flow'. The stochastic ε can be interpreted in terms of what hydrologists call 'effective rainfall' (i.e. rainfall minus on-site evapotranspiration, assessed over a preceding time period tx since the previous rain event):
$$Q_t = F_p Q_{t-1} + (1 - F_p)(P_{tx} - E_{tx}) \qquad (5)$$
where Ptx is the (spatially weighted) precipitation (assuming no snow or ice) in mm d-1 and Etx, also in mm d-1, is the preceding evapotranspiration that allowed for infiltration during this rainfall event (i.e. evapotranspiration since the previous soil-replenishing rainfall that induced empty pore space in the soil for infiltration and retention). More complex attributions are possible, aligning with the groundwater replenishing bypass flow and the water isotopic fractionation involved in evaporation (Evaristo et al., 2015).
The multiplication of effective rainfall times (1-Fp) can be checked by considering the geometric series (1-Fp), (1-Fp)Fp, (1-Fp)Fp^2, ..., (1-Fp)Fp^n, which sums to 1 - Fp^(n+1). This approaches 1 for large n, suggesting that all of the water attributed to time t, i.e. Pt - Etx, will eventually emerge as river flow. For Fp = 0 all of (Pt - Etx) emerges on the first day, and river flow is as unpredictable as precipitation itself. For Fp = 1 all of (Pt - Etx) contributes to the stable daily flow rate. For declining Fp (1 > Fp > 0), river flow gradually becomes less predictable, because a greater part of the stochastic precipitation term contributes to variable rather than evened-out river flow.
Taking long-term summations of the right- and left-hand sides of Eq. (5) we obtain:
$$\Sigma Q_t = F_p \Sigma Q_{t-1} + (1 - F_p)(\Sigma P_{tx} - \Sigma E_{tx}) \qquad (6)$$
which, with ΣQt-1 approximating ΣQt, is consistent with the basic water budget, ΣQ = ΣP - ΣE, at time scales at which changes in soil water buffer stocks can be ignored. As such the total annual, and hence the mean daily, river flow is independent of Fp. This does not preclude that processes of watershed degradation or restoration that affect the partitioning of P over Q and E also affect Fp.
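The independence of total flow from Fp can be checked numerically with a short sketch of the recursive model. The function name `simulate_flow` and the effective rainfall series are illustrative assumptions, not data or code from the study; the long dry tail is included so that nearly all stored water has drained by the end of the series.

```python
def simulate_flow(effective_rain, fp, q0=0.0):
    """Apply Q_t = Fp*Q_{t-1} + (1-Fp)*(P - E) day by day (all in mm d-1)."""
    if not 0.0 <= fp <= 1.0:
        raise ValueError("Fp must lie between 0 and 1")
    flows = []
    q_prev = q0
    for rain in effective_rain:
        q_prev = fp * q_prev + (1.0 - fp) * rain
        flows.append(q_prev)
    return flows


# Hypothetical effective rainfall (P - E): three events, then a long dry spell
rain = [0.0, 30.0, 0.0, 0.0, 12.0, 45.0] + [0.0] * 600

for fp in (0.0, 0.5, 0.8, 0.9):
    q = simulate_flow(rain, fp)
    # Total flow stays (almost) equal to total effective rainfall,
    # while the peak daily flow drops as Fp increases.
    print(fp, round(sum(q), 3), round(max(q), 2))
```

In this sketch Fp only redistributes the same volume of water over time: the totals agree for every Fp, whereas the daily maximum is strongly Fp-dependent.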

Low flows
The lowest flow expected in an annual cycle is Qx Fp^Nmax, where Qx is flow on the first day without rain and Nmax the longest series of dry days. Taken at face value, a decrease in Fp has a strong effect on low flows, with a flow of 10% of Qx reached after 45, 22, 14, 10, 8 and 6 days for Fp = 0.95, 0.9, 0.85, 0.8, 0.75 and 0.7, respectively. However, the groundwater reservoir that is drained, equalling the cumulative dry season flow if the dry period is sufficiently long, is Qx/(1-Fp). If Fp decreases to Fpx but the groundwater reservoir (Res =

Flow-pathway dependent flow persistence
A further interpretation of Eq. (1) can be that three pathways of water through a landscape contribute to river flow (Barnes, 1939): groundwater release, interflow and overland flow, each with its own flow persistence (Fp,g, Fp,i and Fp,o, respectively):
$$Q_t = F_{p,g} Q_{g,t-1} + F_{p,i} Q_{i,t-1} + F_{p,o} Q_{o,t-1} + \varepsilon \qquad (7)$$
$$F_p = (F_{p,g} Q_{g,t-1} + F_{p,i} Q_{i,t-1} + F_{p,o} Q_{o,t-1}) / Q_{t-1} \qquad (8)$$
On this basis a decline or increase in overall weighted average Fp can be interpreted as an indicator of a shift of dominant runoff pathways through time within the watershed.
Similarly, a second interpretation of Fp emerges based on the fractions of total river flow that are based on groundwater, overland flow and interflow pathways:
$$F_p = F_{p,g} (\Sigma Q_g / \Sigma Q_t) + F_{p,o} (\Sigma Q_o / \Sigma Q_t) + F_{p,i} (\Sigma Q_i / \Sigma Q_t) \qquad (9)$$
Beyond the type of degradation of the watershed that, mostly through soil compaction, leads to enhanced infiltration-excess (or Hortonian) overland flow (Delfs et al., 2009), saturated conditions throughout the soil profile may also induce overland flow, especially near valley bottoms (Bonell, 1993; Bruijnzeel, 2004). Thus, the value of Fp,o can be substantially above zero if the rainfall has a significant temporal autocorrelation, with heavy rainfall on subsequent days being more likely than would be expected from general rainfall frequencies.
If rainfall following a wet day is more likely to occur than following a dry day, as is commonly observed in Markov chain analysis of rainfall patterns (Jones and Thornton, 1997; Bardossy and Plate, 1991), the overland flow component of total flow will also have a partial temporal autocorrelation, adding to the overall predictability of river flow. In a hypothetical climate with evenly distributed rainfall, we can expect Fp to be 1.0 even if there is no infiltration and the only pathway available is overland flow. Even with rainfall that is variable at any point of observation but has low spatial correlation, it is possible to obtain Fp values of (close to) 1.0 in a situation with (mostly) overland flow (Ranieri et al., 2004).

Flow persistence as a simple flood risk indicator
For numerical examples (implemented in a spreadsheet model) flow on each day can be derived as:
$$Q_t = \sum_{j \le t} F_p^{\,t-j} (1 - F_p)\, p_j P_j \qquad (10)$$
where pj reflects the occurrence of rain on day j (reflecting a truncated sine distribution for seasonal trends) and Pj is the rain depth (drawn from a uniform distribution). From this model the effects of Fp (and hence of changes in Fp) on maximum daily flow rates, plus maximum flow totals assessed over a 2-5 d period, were obtained in a Monte Carlo process (without Markov autocorrelation of rainfall in the default case; see below). Relative flood protection was calculated as the difference between peak flows (assessed for 1-5 d duration after a 1 year 'warm-up' period) for a given Fp versus those for Fp = 0, relative to those at Fp = 0. This way a relative flood protection, expressed as reduction of peak flow, could be related to Fp (Fig. 4A). Relative flood protection decreased to less than 10% at Fp values of around 0.5, with slightly weaker flood protection when the assessment period was increased from 1 to 5 days (between 1 and 3 d it decreased by 6.2%, from 3 to 5 d by a further 1.3%).
Two counteracting effects are at play here: a lower Fp means that a larger fraction (1-Fp) of the effective rainfall contributes to river flow, but the increased flow is less persistent.In the example the flood protection in situations where the rainfall during 1 or 2 d causes the peak is slightly stronger than where the cumulative rainfall over 3-5 d causes floods, as typically occurs downstream.
As we expect peak flow to be proportional to (1-Fp) times peak rainfall amounts, the effect of a change in Fp not only depends on the change in Fp that we are considering, but also on its initial value, with greater Fp values leading to more rapid increases in high flows (Fig. 4B).
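A short worked example of this proportionality, assuming the same peak effective rainfall before and after the change in Fp. The first ratio corresponds to the decrease from 0.9 to 0.8 mentioned in the abstract; the values 0.5 and 0.4 in the second ratio are illustrative only and are not taken from the case studies.

```latex
\frac{Q_{\mathrm{peak}}(F_p = 0.8)}{Q_{\mathrm{peak}}(F_p = 0.9)}
  \approx \frac{1 - 0.8}{1 - 0.9} = 2.0,
\qquad
\frac{Q_{\mathrm{peak}}(F_p = 0.4)}{Q_{\mathrm{peak}}(F_p = 0.5)}
  \approx \frac{1 - 0.4}{1 - 0.5} = 1.2
```

The same absolute decrease of 0.1 in Fp thus doubles the peak when the initial persistence is high, but raises it by only 20% at intermediate persistence.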
However, flood duration rather responds to changes in Fp in a curvilinear manner, as flow persistence implies flood persistence (once flooding occurs), but the greater the flow persistence the less likely such a flooding threshold is passed (Fig. 4C).The combined effect may be restricted to about 3 d of increase in flood duration for the parameter values used in the default example, but for different parametrization of the stochastic ε other results might be obtained.
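The Monte Carlo procedure described in this section can be sketched roughly as follows. This is not the authors' spreadsheet model: the rain probability ceiling `p_max`, the maximum depth `depth_max`, the number of runs and all function names are assumptions chosen for illustration, so the numbers it produces are not expected to reproduce Fig. 4.

```python
import math
import random


def rain_series(n_days, rng, p_max=0.6, depth_max=60.0):
    """Daily rain (mm): occurrence follows a truncated sine, depth is uniform."""
    series = []
    for day in range(n_days):
        # Negative half of the sine is truncated to zero (dry season).
        p_rain = max(0.0, p_max * math.sin(2.0 * math.pi * day / 365.0))
        wet = rng.random() < p_rain
        series.append(rng.uniform(0.0, depth_max) if wet else 0.0)
    return series


def peak_flow(rain, fp, window=1, warmup=365):
    """Largest flow total over `window` consecutive days after the warm-up year."""
    q, flows = 0.0, []
    for r in rain:
        q = fp * q + (1.0 - fp) * r
        flows.append(q)
    flows = flows[warmup:]
    return max(sum(flows[i:i + window]) for i in range(len(flows) - window + 1))


def relative_flood_protection(fp, window=1, n_runs=50, seed=1):
    """Mean reduction of peak flow relative to the Fp = 0 reference."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_runs):
        rain = rain_series(3 * 365, rng)
        reference = peak_flow(rain, 0.0, window)
        total += (reference - peak_flow(rain, fp, window)) / reference
    return total / n_runs


for fp in (0.5, 0.7, 0.9):
    print(fp, round(relative_flood_protection(fp), 2),
          round(relative_flood_protection(fp, window=5), 2))
```

Using the same seed for every Fp means each value is evaluated against identical rainfall realisations, so differences between rows reflect Fp rather than sampling noise.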

An algorithm for deriving Fp from a time series of stream flow data
Equation (3) provides a first method to derive Fp from empirical data if these cover a full hydrologic year. In situations where there is no complete hydrograph and/or in situations where we want to quantify Fp for shorter time periods (e.g. to characterise intraseasonal flow patterns) and the change in the storage term of the water budget equation cannot be ignored, we need an algorithm for estimating Fp from a series of daily Qt observations.
Where rainfall has clear seasonality, it is attractive and indeed common practice to derive a groundwater recession rate from a semi-logarithmic plot of Q against time (Tallaksen, 1995).
As we can assume for such periods that ε = 0, we obtain Fp = Qt /Qt-1, under these circumstances.We cannot be sure, however, that this Fp,g estimate also applies in the rainy season, because overall wet-season Fp will include contributions by Fp,o and Fp,i as well (compare Eq. 9).In locations without a distinct dry season, we need an alternative method.
A biplot of Qt against Qt-1 (as in Fig. 3) during times of flow recession will lead to a scatter of points above a line with slope Fp, with points above the line reflecting the contributions of ε >0, while the points that plot on the Fp line itself represent ε = 0 mm d -1 .There is no independent source of information on the frequency at which ε = 0, nor what the statistical distribution of ε values is if it is non-zero.Calculating back from the Qt series we can obtain an estimate of Qadd as the realization of the stochastic ε for any given estimate of Fp, and select the most plausible value.For high Fp estimates there will be many negative Qadd values, for low Fp estimates all Qadd values will be larger.An algorithm to derive a plausible Fp estimate can thus make use of the corresponding distribution of 'apparent Qadd' values as estimates of ε (Qadd = Qt -Fp Qt-1).While ε, and thus in theory Qadd cannot be negative, small negative Qadd estimates are likely when using real-world data with their inherent errors.The FlowPer Fp algorithm (van Noordwijk et al., 2011) derives the distribution of Qadd,Fptry estimates for a range of Fp,try values (Fig. 5B) and selects the value Fp,try that minimizes the variance Var(Qadd,Fptry) (or its standard deviation) (Fig. 5C).It is implemented in a spreadsheet workbook that can be downloaded from the ICRAF website (****).
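The variance-minimising step described above can be sketched as a plain grid search. This is a simplified reading of the description, not the FlowPer spreadsheet itself: the function names, the grid step of 0.01, the use of all consecutive day pairs and the synthetic test series are assumptions made for illustration.

```python
import random


def _variance(values):
    """Population variance of a list of floats."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def estimate_fp(flows, n_steps=100):
    """Return the Fp,try on a 0..1 grid that minimises Var(Qadd),
    with Qadd = Q_t - Fp,try * Q_{t-1}."""
    if len(flows) < 3:
        raise ValueError("need at least three daily flow values")
    best_fp, best_var = 0.0, float("inf")
    for k in range(n_steps + 1):
        fp_try = k / n_steps
        q_add = [flows[t] - fp_try * flows[t - 1] for t in range(1, len(flows))]
        var = _variance(q_add)
        if var < best_var:
            best_fp, best_var = fp_try, var
    return best_fp


# Synthetic check 1: a pure recession (no rain) generated with Fp = 0.9
recession = [10.0]
for _ in range(60):
    recession.append(0.9 * recession[-1])

# Synthetic check 2: a noisy series generated with Fp = 0.85 and random rain
rng = random.Random(42)
noisy = [5.0]
for _ in range(3000):
    rain = rng.uniform(0.0, 40.0) if rng.random() < 0.3 else 0.0
    noisy.append(0.85 * noisy[-1] + 0.15 * rain)

print(estimate_fp(recession), estimate_fp(noisy))
```

Without further constraints, minimising the variance of Qadd over Fp,try is equivalent to taking the least-squares slope of Qt on Qt-1, so on real data the estimate inherits the sensitivity of such a regression to a few extreme flow days.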

Fig. 5
A consistency test is needed that the high-end Qt values relate to Qt+1 in the same way as do low or medium Qt values. Visual inspection of Qt+1 versus Qt, with the derived Fp value, provides a qualitative view of the validity of this assumption.

GenRiver model for effects of land cover on river flow
The

Empirical data-sets
Table 1 provides summary characteristics of four meso-scale watersheds used for testing the Fp algorithm and application of the GenRiver model. Basic site-specific parameterization is given in Table 2 and land-cover specific default parameters in Table 3, while Table 4 describes the six scenarios of land-use change that were evaluated in terms of their hydrological impacts.

Bootstrapping
We used a bootstrap approach to estimate the minimum number of observations (years of data) required for a pair-wise comparison test between two time series of stream flow data (representing two land-use scenarios) to be distinguishable from a null hypothesis of no effect. We built a simple macro in R (R Core Team, 2015) using the following steps:
(i) take a sample of size n from both time series, with replacement, N times;
(ii) apply the Kolmogorov-Smirnov test and record the p-value;
(iii) repeat (i) and (ii) for different values of n;
(iv) tabulate the p-values for the various n and determine the value of n at which the p-value becomes equal to or less than 0.025.
The associated n represents the minimum number of observations required. Appendix 1 provides an example of the macro in R.
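The bootstrap steps can be sketched in Python as below. This is not the R macro of Appendix 1. Several choices are assumptions made for illustration: the p-value uses the standard asymptotic Kolmogorov series with a small-sample correction rather than an exact test, the N p-values for each n are summarised by their median, and the two input series are synthetic.

```python
import math
import random
from statistics import median


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov distance between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_pvalue(d, n_a, n_b):
    """Asymptotic p-value for a KS distance d (approximation)."""
    root = math.sqrt(n_a * n_b / (n_a + n_b))
    lam = (root + 0.12 + 0.11 / root) * d
    if lam < 0.1:
        return 1.0  # the series below is numerically 1 in this range
    total = 0.0
    for k in range(1, 101):
        total += 2.0 * (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
    return min(1.0, max(0.0, total))


def min_sample_size(series_a, series_b, max_n, n_boot=200, alpha=0.025, seed=0):
    """Smallest n whose median bootstrap p-value is <= alpha, else None."""
    rng = random.Random(seed)
    for n in range(2, max_n + 1):
        p_values = []
        for _ in range(n_boot):
            sample_a = rng.choices(series_a, k=n)  # resample with replacement
            sample_b = rng.choices(series_b, k=n)
            p_values.append(ks_pvalue(ks_statistic(sample_a, sample_b), n, n))
        if median(p_values) <= alpha:
            return n
    return None


# Synthetic yearly Fp values for two hypothetical scenarios (no overlap)
scenario_a = [0.80 + 0.005 * i for i in range(20)]
scenario_b = [0.50 + 0.005 * i for i in range(20)]
print(min_sample_size(scenario_a, scenario_b, max_n=20))
```

For series that overlap as strongly as the simulated Fp distributions reported for the case studies, the returned n would be much larger, or `None` if no n up to `max_n` reaches the threshold.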

Empirical data of flow persistence as basis for model parameterization
Overall the estimates from modeled and observed data are related with 16% deviating more than 0.1 and 3% more than 0.15.The flow persistence estimates derived from the wettest three-month period are about 0.2 lower than those derived for the driest period, when baseflow dominates (Fig. 6).If we can expect Fp,i and Fp,o to be approximately 0.5 and 0, this difference between wet and dry periods implies a 40% contribution of interflow in the wet season, a 20% contribution of overland flow or any combination of the two effects.Among the four watersheds there is consistency in that the 'forest' scenario has the highest, and the 'degraded lands' the lowest Fp value (Fig. 7), but there are remarkable differences as well: in Cidanau the interannual variation in Fp is clearly larger than land cover effects, while in the Way Besai the spread in land use scenarios is larger than interannual variability.In Cidanau a peat swamp between most of the catchment and measuring point buffers most of landcover related variation in flow, but not the interannual variability.Considering the frequency distributions of Fp values over a 20 year period, we see one watershed (Way Besai) where the forest stands out from all others, and one (Bialo) where the degraded lands are separate from the others.Given the degree of overlap of the frequency distributions, it is clear that multiple years of empirical observations will be needed before a change can be affirmed.
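The arithmetic behind the 40% interflow or 20% overland flow interpretation of the wet-versus-dry difference can be made explicit with a small sketch. It treats overall Fp as a flow-weighted average of pathway-specific values and assumes Fp,g = 1.0 for the groundwater pathway; that value, the function name `weighted_fp` and the dictionary keys are illustrative assumptions, while Fp,i = 0.5 and Fp,o = 0 follow the text.

```python
def weighted_fp(fractions, fp_values):
    """Flow-weighted mean Fp over groundwater (g), interflow (i) and overland (o)."""
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("flow fractions must sum to 1")
    return sum(fractions[k] * fp_values[k] for k in fractions)


# Fp,g = 1.0 is an assumption for illustration; Fp,i and Fp,o follow the text
FP_PATHWAY = {"g": 1.0, "i": 0.5, "o": 0.0}

dry = weighted_fp({"g": 1.0, "i": 0.0, "o": 0.0}, FP_PATHWAY)
wet_interflow = weighted_fp({"g": 0.6, "i": 0.4, "o": 0.0}, FP_PATHWAY)
wet_overland = weighted_fp({"g": 0.8, "i": 0.0, "o": 0.2}, FP_PATHWAY)

# Both differences are 0.2, matching the observed wet/dry contrast
print(round(dry - wet_interflow, 3), round(dry - wet_overland, 3))
```

Because different mixes of interflow and overland flow give the same overall Fp, the wet-season value alone cannot distinguish between the two pathways.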
Figure 8 shows the frequency distributions of expected effect sizes on Fp of a comparison of any land cover with either forest or degraded lands. Table 5 translates this information to the number of years that a paired plot (in the absence of measurement error) would have to be maintained to reject a null hypothesis of no effect, at p = 0.05. As the frequency distributions of Fp differences of paired catchments do not match a normal distribution, a Kolmogorov-Smirnov test can be used to assess the probability that a no-difference null hypothesis can yield the difference found. By bootstrapping within the years where simulations supported by observed rainfall data exist, we found for the Way Besai catchment, for example, that 20 years of data would be needed to assert (at p = 0.05) that the ReFor scenario differs from AgFor, 16 years that it differs from Actual and 11 years that it differs from Degrade. In practice, that means that empirical evidence that survives statistical tests is unlikely to emerge, even though effects on watershed health are real.
Fig. 8
Table 5
At process level, the increase in 'overland flow' in response to soil compaction due to land cover change has a clear and statistically significant relationship with decreasing Fp values in all catchments (Fig. 8A), but both year-to-year variation within a catchment and differences between catchments influence the results as well, leading to considerable spread in the biplot.
Contrary to expectations, the disappearance of 'interflow' by soil compaction is not reflected in measurable change in Fp value. The temporal difference between overland flow and interflow (one or a few days) gets easily blurred in the river response that integrates over multiple streams with variation in delivery times; the difference between overland- or interflow and
Tree cover has two contradicting effects on baseflow: it reduces the surplus of rainfall over evapotranspiration (annual water yield) by increased evapotranspiration (especially where evergreen trees are involved), but it potentially increases soil macroporosity that supports infiltration and interflow, with relatively little effect on water holding capacity measured as 'field capacity' (after runoff and interflow have removed excess water). Fig. 6 shows that the total volume of baseflow differs more between sites and their rainfall pattern than it varies with tree cover. Between years, total evapotranspiration and baseflow totals are positively correlated (see supplementary information), but for a given rainfall there is a tradeoff. Overall these results support the conclusion that generic effects of deforestation on decreased flow persistence, and of (agro)/(re)-forestation on increased flow persistence, are small relative to interannual variability due to specific rainfall patterns, and that it will be hard for any empirical data process to pick up such effects, even if they are qualitatively aligned with valid process-based models.

Discussion
In view of our results the lack of robust evidence in the literature of effects of change in forest and tree cover on flood occurrence may not be a surprise; effects are subtle and most data sets contain considerable noise. Yet, such effects are consistent with current process and scaling knowledge of watersheds. The key strength of our flow persistence parameter, that it can be derived from observing river flow at a single point along the river, without knowledge of rainfall events and catchment conditions, is also its major weakness. If rainfall data exist, and especially rainfall data that apply to each subcatchment, the Qadd term does not have to be treated as a random variable and event-specific information on the flow pathways may be inferred for a more precise account of the hydrograph. But for the vast majority of rivers in the tropics, advances in remotely sensed rainfall data are needed to achieve that situation and Fp may be all that is available to inform public debates on the relation between forests and
Legitimacy aspects are "Does it match local knowledge?", "Can it be used to empower local stakeholders of watershed management?" and "Can it inform risk management?". As the Fp parameter captures the predictability of river flow that is a key aspect of degradation according to local knowledge systems, its results are much easier to convey than full hydrographs or exceedance probabilities of flood levels. By focusing on observable effects at river level, rather than prescriptive recipes for land cover ("reforestation"), the Fp parameter can be used to more effectively compare the combined effects of land cover change, changes in the riparian wetlands and engineered water storage reservoirs, in their effect on flow buffering. It is a candidate for shifting environmental service reward contracts from input to outcome based monitoring (van Noordwijk et al., 2012). As such it can be used as part of a negotiation support approach to natural resources management in which leveling off on knowledge and joint fact finding in blame attribution are key steps to negotiated solutions that are legitimate and seen to be so (van Noordwijk et al., 2013; Leimona et al., 2015).
Quantification of Fp can help assess tactical management options (Burt et al., 2014), as in a recent suggestion to minimize negative downstream impacts of forestry operations on stream flow by avoiding land clearing and planting operations in locally wet La Niña years. But the most challenging aspect of the management of floods, as of any other environmental risk, is that the frequency of disasters is too low to intuitively influence human behavior where short-term risk-taking benefits are attractive. Wider social pressure is needed for investment in watershed health (as a type of insurance premium) to be mainstreamed, as individuals waiting to see evidence of necessity are too late to respond. In terms of flooding risk, actions to restore or retain watershed health can be similarly justified as an insurance premium. It remains to be seen whether or not the transparency of the Fp metric and its intuitive appeal are sufficient to make
Qx/(1-Fp)) is not affected, initial flows in the dry period will be higher ($Q_x F_{p,x}^{\,i} (1-F_{p,x})\,\mathrm{Res} > Q_x F_p^{\,i} (1-F_p)\,\mathrm{Res}$ for $i < \log\big((1-F_{p,x})/(1-F_p)\big)/\log(F_p/F_{p,x})$). It thus matters how low flows are evaluated: from the perspective of the lowest level reached, or as cumulative flow. The combination of climate, geology and land form are the primary determinants of cumulative low flows, but if land cover reduces the recharge of groundwater there may be impacts on dry season flow that are not directly reflected in Fp. If a single Fp value would account for both dry and wet season, the effects of changing Fp on low flows may well be more pronounced than those on flood risk. Tests are needed of the dependence of Fp on Q (see below). Analysis of the way an aggregate Fp depends on the dominant flow pathways provides a basis for differentiating Fp within a hydrologic year.
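The inequality above can be checked numerically. The following is a minimal sketch, assuming that Fpx is the lower of the two persistence values (0 < Fpx < Fp < 1) and that the factors Qx and Res are common to both sides and cancel; the function names are hypothetical and not taken from the paper.

```python
import math


def crossover_day(fp: float, fpx: float) -> float:
    """Upper bound on i for which fpx**i * (1 - fpx) > fp**i * (1 - fp).

    Assumes 0 < fpx < fp < 1 (our reading of the inequality in the text).
    """
    if not (0.0 < fpx < fp < 1.0):
        raise ValueError("expected 0 < fpx < fp < 1")
    return math.log((1.0 - fpx) / (1.0 - fp)) / math.log(fp / fpx)


def recession_term(fp: float, i: int) -> float:
    """Term fp**i * (1 - fp); the common factors Qx and Res are left out."""
    return fp ** i * (1.0 - fp)


if __name__ == "__main__":
    fp, fpx = 0.9, 0.8  # illustrative values only
    limit = crossover_day(fp, fpx)
    print(f"lower-Fp flow is higher for i < {limit:.2f} days")
    for i in range(8):
        print(i, round(recession_term(fpx, i), 4), round(recession_term(fp, i), 4))
```

For these illustrative values the bound lies between 5 and 6 days: up to day 5 the lower-persistence case delivers the higher flow, and from day 6 onward the order reverses.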

Figure 2 provides an example of the way a change in Fp values (based on Eq. 1) influences the visual pattern of river flow for a unimodal rainfall regime with a well-developed dry season. The increasing 'spikedness' of the graph as Fp is lowered indicates reduced predictability of flow on any given day during the wet season on the basis of the flow on the preceding day. A bi-plot of river flow on subsequent days for the same simulations (Fig. 3) shows two main effects of reducing the Fp value: the scatter increases, and the slope of the lower envelope containing the swarm of points is lowered (as it equals Fp). Both of these changes can provide entry points for an algorithm to estimate Fp from empirical time series, provided the basic assumptions of the simple model apply and the data are of acceptable quality (see Section 3 below). For the numerical example shown in Fig. 2, the maximum daily flow doubled from 50 to 100 mm when the Fp value decreased from a value close to 1 (0.98) to nearly 0.

In further analyzing this numerical example, we evaluated the maximum flow by accumulating over a 1-5 d period (in a moving average routine) and compared the maximum obtained for each Fp with what, for the same Monte Carlo realization, was obtained for Fp of
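The recursion of Eq. 1 and the two entry points for estimating Fp (lower envelope and scatter of the bi-plot) can be illustrated with a minimal sketch. It rests on several assumptions that are ours, not the paper's: the net input (Pt - Etx) is replaced by a synthetic, non-negative, temporally independent series; all function names are hypothetical; and the variance-minimizing estimator is only one possible way to use the scatter, not necessarily the algorithm of Section 3.

```python
import random


def simulate_flow(fp, inputs, q0=0.0):
    """Recursive model Q_t = Fp*Q_{t-1} + (1-Fp)*x_t, with x_t standing in for (P_t - E_tx)."""
    flows, q = [], q0
    for x in inputs:
        q = fp * q + (1.0 - fp) * x
        flows.append(q)
    return flows


def synthetic_inputs(n_days, p_wet=0.3, mean_mm=20.0, seed=1):
    """Hypothetical net input in mm/d: independent wet/dry days, exponential amounts."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / mean_mm) if rng.random() < p_wet else 0.0
            for _ in range(n_days)]


def lower_envelope_slope(flows, min_flow=1e-6):
    """Smallest ratio Q_t/Q_{t-1}; on days without net input this ratio equals Fp."""
    return min(b / a for a, b in zip(flows, flows[1:]) if a > min_flow)


def min_variance_fp(flows):
    """Fp,try that minimizes the variance of Qadd = Q_t - Fp,try*Q_{t-1}.

    The variance is quadratic in Fp,try, so the minimum is cov/var in closed form.
    """
    prev, curr = flows[:-1], flows[1:]
    n = len(prev)
    mean_prev, mean_curr = sum(prev) / n, sum(curr) / n
    cov = sum((a - mean_prev) * (b - mean_curr) for a, b in zip(prev, curr))
    var = sum((a - mean_prev) ** 2 for a in prev)
    return cov / var


if __name__ == "__main__":
    rain = synthetic_inputs(3650)
    for fp in (0.9, 0.7, 0.5):
        q = simulate_flow(fp, rain)
        print(fp, round(max(q), 1), round(lower_envelope_slope(q), 3),
              round(min_variance_fp(q), 3))
```

With error-free synthetic data the lower envelope recovers Fp exactly, whereas the variance-based estimate carries sampling error and would be biased if the net input were autocorrelated, as seasonal rainfall usually is.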
The GenRiver model (van Noordwijk et al., 2011) is based on a simple water balance concept with a daily timestep and a flexible spatial subdivision of a watershed that influences the routing of water and employs spatially explicit rainfall. Land cover affects rainfall interception losses as well as soil macroporosity (bulk density), modifying infiltration rates. Any land-cover change scenarios are interpolated annually between measured time-series data. The model may use measured rainfall data, or use a rainfall generator that involves Markov chain temporal autocorrelation (rain persistence). The model itself, a manual and application case studies are freely available (**weblink**; van Noordwijk et al., 2011).
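To make the notion of rain persistence concrete, the sketch below generates daily rainfall with a two-state (wet/dry) first-order Markov chain. It is a generic illustration, not GenRiver's actual generator: the transition probabilities, the exponential distribution of wet-day amounts and all names are assumptions made for this example.

```python
import random


def markov_rain(n_days, p_wet_after_dry=0.2, p_wet_after_wet=0.6,
                mean_mm=15.0, seed=42):
    """Daily rainfall (mm) from a two-state first-order Markov chain.

    The chance of a wet day depends only on whether the previous day was wet;
    p_wet_after_wet > p_wet_after_dry gives rain persistence. Wet-day amounts
    are drawn from an exponential distribution (an assumption of this sketch).
    """
    rng = random.Random(seed)
    wet = False  # the series is assumed to start after a dry day
    series = []
    for _ in range(n_days):
        p_wet = p_wet_after_wet if wet else p_wet_after_dry
        wet = rng.random() < p_wet
        series.append(rng.expovariate(1.0 / mean_mm) if wet else 0.0)
    return series


if __name__ == "__main__":
    rain = markov_rain(365)
    wet_days = sum(1 for r in rain if r > 0.0)
    print(f"{wet_days} wet days, {sum(rain):.0f} mm in total")
```

Setting both transition probabilities to the same value removes the temporal autocorrelation and reduces the generator to independent wet and dry days.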

4.2 Fp effects for scenarios of land cover change

Apparently, according to our model, the high macroporosity of forest soils that allows interflow, and that may be the 'sponge' effect attributed to forest, delays delivery to rivers by one or a few days, with little effect on the flow volumes at locations downstream where flow of multiple days accumulates. The difference between overland flow or interflow and baseflow in time-to-river of rainfall peaks is much more pronounced.

We will discuss the flow persistence metric against criteria based on salience, credibility and legitimacy. Key salience aspects are "Does flow persistence relate to important aspects of watershed behavior?" and "Does it help to select management actions?". Figures 2 and 6 show that most of the effects of a decreasing Fp value on peak discharge (which is the basis for downstream flooding) occur between Fp values of 1 and 0.7, with the relative flood protection value reduced to 10% when Fp reaches 0.5. As indicated in Fig. 1, peak discharge is only one of the factors contributing to flood risk in terms of human casualties and physical damage. The Fp value has an inverse effect on the fraction of recent rainfall that becomes river flow, but the effect on peak flows is less, as higher Fp values imply higher base flow. The way these counteracting effects balance out depends on details of the local rainfall pattern
(including its Markov chain temporal autocorrelation), as well as the downstream topography and the risk of people being in the wrong place at the wrong time, but the Fp value is an efficient way of summarizing complex land use mosaics and upstream topography in its effect on river flow. The difference between wet-season and dry-season Fp deserves further analysis. In climates with a real rainless dry season, dry-season Fp is dominated by the groundwater release fraction of the watershed, regardless of land cover, while in the wet season it depends on the mix (weighted average) of flow pathways. The degree to which Fp can be influenced by land cover needs to be assessed for each landscape and land cover combination, including the locally relevant forest and forest-derived land classes, with their effects on interception, soil infiltration and time pattern of transpiration. The Fp value can summarize results of models that explore land use change scenarios in local context. To select the specific management actions that will maintain or increase Fp, a locally calibrated land use/hydrology model is needed, such as GenRiver or SWAT (Yen et al., 2015). The empirical data summarized here for (sub)humid tropical sites in Indonesia and Thailand show that values of Fp above 0.9 are scarce in the case studies provided, but values above 0.8 were found, or inferred by the model, for forested landscapes. Agroforestry landscapes generally presented Fp values above 0.7, while open-field agriculture or degraded soils led to Fp values of 0.5 or lower. Despite differences in local context, it seems feasible to relate typical Fp values to the overall condition of a watershed.

Key credibility questions are "Consistency of numerical results?" and "How sensitive are results to noisy data sources?". Intra-annual variability of Fp values was around 0.2 in our results, interannual variability in either annual or seasonal Fp was generally in the 0.1 range, while the difference between observed and simulated flow data as basis for Fp calculations was mostly less than 0.1. With current methods, it seems that effects of land cover change on flow persistence that shift the Fp value by about 0.1 are the limit of what can be asserted from empirical data (with shifts of that order in a single year a warning sign rather than a firmly established change). When derived from observed river flow data, Fp is suitable for monitoring change (degradation, restoration) and can be a serious candidate for monitoring performance in outcome-based ecosystem service management contracts. Where further uncertainty is introduced by the use of modeled rather than measured river flow, the lack of fit of models similar to the ones we used here would mean that scenario results are indicative of directions of change rather than a precision tool for fine-tuning combinations of engineering and land cover change as part of integrated watershed management.
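The question of how many years of data are needed to detect a shift in Fp can be approached with a two-sample Kolmogorov-Smirnov test, as used for Table 5. The sketch below is illustrative only and rests on assumptions of ours: it uses the large-sample critical value of the test rather than exact small-sample probabilities, the function names are hypothetical, and the yearly Fp values are invented for the example rather than taken from the study sites.

```python
import bisect
import math


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for v in sorted(set(a) | set(b)):
        cdf_a = bisect.bisect_right(a, v) / len(a)
        cdf_b = bisect.bisect_right(b, v) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_critical(n, m, alpha=0.05):
    """Large-sample critical value: sqrt(-ln(alpha/2)/2) * sqrt((n+m)/(n*m))."""
    return math.sqrt(-math.log(alpha / 2.0) / 2.0) * math.sqrt((n + m) / (n * m))


def years_needed(reference, scenario, alpha=0.05):
    """Smallest number of leading years for which the no-change null is rejected."""
    for n in range(2, min(len(reference), len(scenario)) + 1):
        if ks_statistic(reference[:n], scenario[:n]) > ks_critical(n, n, alpha):
            return n
    return None  # not detectable within the available series


if __name__ == "__main__":
    # Invented yearly Fp values, for illustration only.
    forest = [0.80, 0.82, 0.78, 0.85, 0.81, 0.79]
    degraded = [0.60, 0.62, 0.58, 0.65, 0.61, 0.59]
    print(years_needed(forest, degraded))
```

Even when the two invented series do not overlap at all, several years are required before the test rejects the null hypothesis; with overlapping distributions, as expected for Fp shifts of about 0.1, the required record length grows accordingly.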

Figure 1. Steps in a causal pathway that relates ecosystem structure to function, human land use and a perceived ecosystem service of 'avoided flood damage'; blue (open) arrows refer to water flow, black (solid) arrows to influences; plot-level processing of incoming rainfall (1) influences the total blue-water yield (2) and its temporal pattern (3), in dependence of the time-space pattern of rainfall (4); extreme discharge events (5), jointly with the (engineered) river channel (6) and topography, determine flood frequency and duration (7); human population density and activity (8) together with flood characteristics determine victims, damage and its economic consequence (9); attributing 'avoided flood damage' (10) to land cover (0) and its influences on step 1 is thus complex, especially as ceteris paribus assumptions do not generally hold and interactions are common.

Figure 2. Example of daily river flow for a unimodal rainfall regime with clear dry season, in response to change in the flow persistence parameter Fp.

Figure 3. Biplots of Q(t) versus Q(t-1) for the same simulations as Figure 2.

Figure 5. Example of the derivation of the best fitting Fp,try value for an example hydrograph (A) on the basis of the inferred Qadd distribution (cumulative frequency in B), and three properties of this distribution (C): its sum, frequency of negative values and standard deviation; the Fp,try minimum of the latter is derived from the parameters of a fitted quadratic equation.

Figure 6. Inter- (A) and intra- (B) annual variation in the Fp parameter derived from empirical versus modeled flow: for the four test sites on annual basis (A) or three-monthly basis (B).

Figure 7. Effects of land cover change scenarios (Table 1) on the flow persistence value in four watersheds, modelled in GenRiver 21 over a 20-year time-period, based on actual rainfall records; the left-side panels show the average water balance for each land cover scenario, the middle panels the Fp values per year and land use, the right-side panels the derived frequency distributions (best fitting Weibull distribution).

Table 2. Parameters of the GenRiver model used for the four site specific simulations (van Noordwijk et al., 2011 for definitions of terms; sequence of parameters follows the pathway

Table 5. Number of years of observations on flow persistence required to reject the null-hypothesis of 'no land use effect' at p-value = 0.05 using the Kolmogorov-Smirnov test. The probability of the test statistic in the first significant number is provided between brackets and, where the number of observations exceeds the time series available, results are given in italics.
A. Natural Forest as reference