Characterization and evaluation of controls on post-fire streamflow response across western US watersheds

This research investigates the impact of wildfires on watershed flow regimes, specifically focusing on evaluation of fire events within specified hydroclimatic regions in the western United States, and evaluating the impact of climate and geophysical variables on response. Eighty-two watersheds were identified with at least 10 years of continuous pre-fire daily streamflow records and 5 years of continuous post-fire daily flow records. Percent change in annual runoff ratio, low flows, high flows, peak flows, number of zero flow days, baseflow index, and Richards–Baker flashiness index were calculated for each watershed using preand post-fire periods. Independent variables were identified for each watershed and fire event, including topographic, vegetation, climate, burn severity, percent area burned, and soils data. Results show that low flows, high flows, and peak flows increase in the first 2 years following a wildfire and decrease over time. Relative response was used to scale response variables with the respective percent area of watershed burned in order to compare regional differences in watershed response. To account for variability in precipitation events, runoff ratio was used to compare runoff directly to PRISM precipitation estimates. To account for regional differences in climate patterns, watersheds were divided into nine regions, or clusters, through k-means clustering using climate data, and regression models were produced for watersheds grouped by total area burned. Watersheds in Cluster 9 (eastern California, western Nevada, Oregon) demonstrate a small negative response to observed flow regimes after fire. Cluster 8 watersheds (coastal California) display the greatest flow responses, typically within the first year following wildfire. Most other watersheds show a positive mean relative response. In addition, simple regression models show low correlation between percent watershed burned and streamflow response, implying that other watershed factors strongly influence response. Spearman correlation identified NDVI, aridity index, percent of a watershed’s precipitation that falls as rain, and slope as being positively correlated with post-fire streamflow response. This metric also suggested a negative correlation between response and the soil erodibility factor, watershed area, and percent low burn severity. Regression models identified only moderate burn severity and watershed area as being consistently positively/negatively correlated, respectively, with response. The random forest model identified only slope and percent area burned as significant watershed parameters controlling response. Results will help inform post-fire runoff management decisions by helping to identify expected changes to flow regimes, as well as facilitate parameterization for model application in burned watersheds.


Introduction
The number of wildfires in the western United States (US) is increasing annually, on average costing federal agencies billions of dollars a year in suppression efforts (Whitlock, 2004) and causing an increase in flood events destructive to both life and infrastructure (Neary et al., 2005;Pausas et al., 2008).Westerling et al. (2006) and Westerling (2016) showed that the western fire regime exhibited a significant transition from infrequent and short-duration events to higher-frequency, longer-duration regimes during the mid-S.Saxe et al.: Characterization of post-fire streamflow 1980s.The greatest increases in fire frequency were found to occur in mid-elevation forests, most commonly in the Northern Rockies, Sierra Nevada, southern Cascades, and western coast ranges in northern California and southern Oregon (Littell et al., 2009).This marked change is strongly correlated with climate change impacts, such as warmer springs and longer dry seasons, commonly in occurrence with reduced winter precipitation rates and earlier spring snowmelt.Overall, Westerling et al. (2016) determine that, though landuse history may be a significant factor in the spatial distribution of wildfires within specific forest types, changes in fire regimes in the western US are most likely attributable to recent changes in climate.Other notable research has also provided significant correlative evidence between climate change and wildfire occurrences (Littell et al., 2009;Moritz et al., 2010).
Though wildfires are a part of the natural process of vegetation dynamics, they cause wide-ranging changes to ecosystems (Neary, 2003;Santos et al., 2015) depending on numerous factors, most importantly burn severity.Studies examining the effects of wildfires on a small scale, such as in plotsized and laboratory experiments, show high fire temperatures can result in the combustion of organic matter within soils and cause permanent alteration to the chemical structure of local clays, decreasing soil stability (Shakesby and Doerr, 2006).Water-repellent soil layers can be created in a discrete layer on or below the soil surface through chemical bonding of the combusted organic matter to mineral particles, potentially increasing overall topsoil erosion rates in burned regions (Wilkinson et al., 2009), though this hydrophobicity is highly variable, depending on fire behavior, burn severity, and soil properties (DeBano, 2000).
At larger scales, such as entire watersheds or multiple watershed systems, studies of post-fire erosion rates have shown incompatible conclusions (Moody and Martin, 2001;Owens et al., 2013;Smith et al., 2011), though this is most likely due to the variability of precipitation events and general climate patterns (Moody et al., 2013).In terms of water quality, contaminant levels can be dramatically increased for many years after a wildfire in both soil (Burke et al., 2010) and stream systems (Emelko et al., 2011;Stein et al., 2012;Burke et al., 2013), increasing the workload on source water protection organizations in communities reliant upon burned watersheds for drinking and agricultural water.Furthermore, wildfires are readily attributed as the cause of substantial increases in debris flows (Benavides-Solorio and MacDonald, 2001;Cannon et al., 2001;Meyer et al., 2001).
Studies evaluating post-fire water yield change are highly disparate owing to the transient nature of climate patterns, variations in basin geomorphology, and vegetation recovery patterns, and the resulting complex interactions (Moody et al., 2013).For example, studies in rangeland regions of the US found moderate increases in flow, infiltration, and erosion rates after major wildfires, with trends continuing for as long as 15 years (Emmerich and Cox, 1994;Pierson et al., 2009;Hester et al., 1997;Kinoshita and Hogue, 2015).Fires in chaparral environments, such as in southern California, exhibited increased flows of up to as much as 2 orders of magnitude, with much of this occurring in the dry season (Coombs and Melack, 2013;Kinoshita andHogue, 2011, 2015;Loáiciga et al., 2001).Fires in other chaparral environments were found to also yield flow increases, such as in South Africa (Lindley et al., 1988;Scott, 1993), Cyprus (Hessling, 1999), and France (Lavabre et al., 1993).Additional increases to post-fire flow regimes were found in temperate, forested catchments as well (Neary et al., 2005;Watson et al., 2001).A concise summary of historic changes in US post-fire stream systems is found in Neary et al. (2005), documenting changes in first-year runoff and peak flows, encompassing a range of ecological regions.Conversely, several studies found limited or no significant changes to hydrologic systems post fire, or attributed fluctuations to natural annual variability (Aronica et al., 2002;Bart and Hope, 2010;Britton, 1991;Townsend and Douglas, 2000).
These discrepancies in post-fire flow response lead to the question of which watershed characteristics have the greatest influence over the observed response.Moody et al. (2013) provide a succinct summary of soil-related theories, such as reduced infiltration due to increases in soil-water repellency, increased overland flow velocities due to increased bare ground, and reduced infiltration caused by soil-sealing.Theories commonly found in the literature attribute flow changes to a wider range of factors, including reduction in interception and evapotranspiration (Lavabre et al., 1993;Scott, 1993) and increased hydrophobicity of soils (Neary et al., 2005).In regards to altered peak flows, conflicting evidence is found regarding the importance of burned watershed areas, with some studies finding an inverse correlation between peak flows and watershed size (Biggio and Cannon, 2001;Neary et al., 2005) and others finding no relation at all (Bart and Hope, 2010).
The current research builds upon prior studies but develops a more comprehensive and systematic assessment of post-fire streamflow dynamics by examining burned watersheds that encompass a wide spectrum of climatological and geophysical parameters across the western US.A variety of statistical parameters are also examined which describe changes to flow regimes at several levels.Furthermore, the variability in response by distinct regions is investigated, anticipating distinct differences influenced by regional climate.With downstream communities at risk of flooding, and also relying on catchment runoff for water supply, investigating alterations in post-fire discharge over large scales will provide critical information for regional managers on post-fire runoff mitigation.In addition, understanding factors controlling discharge response will help inform development and calibration of surface water models used for post-fire streamflow predictions.

Study area
A total of 82 burned watersheds were utilized in the current study (Fig. 1), encompassing a range of spatial, temporal, climatological, and topographic factors (Figs. 2 and 3).Selected watersheds were limited to those with burn areas of > 5 % and adequate discharge records (continuous 15-year daily flow, including 10 years pre-fire and 5 years post-fire) (U.S. Geological Survey, 2014) identified through the GAGES-II database (Falcone, 2011).The majority of available watersheds are overwhelmingly found in the western US, predominantly in California, Oregon, and Idaho, with several located in the northeast, Florida, and Kentucky.Due to discharge and burn severity data limitations, the fires cover a temporal range from water years 1984 through 2010.A water year is the 12-month period 1 October through 30 September designated by the calendar year in which it ends.Percent area burned ranges from 5 to 97 %, with a mean of 25 %, over a range of watershed areas from 4.6 to 7000 km 2 .The wide spatial distribution of the studied watersheds results in mean elevations and burn-area slopes varying from 13 to 2760 m and from 0.11 to 16 %, respectively (Fig. 2).
The most important difference between many of these watersheds is the variability in climate, the values of which were collected from the GAGES-II dataset (Falcone, 2011).GAGES-II watersheds have at least one 20-year period with continuous daily flow records.Average basin precipitation ranges from 29 to 220 mm yr −1 , with a mean of 72 mm yr −1 , and average temperature ranges from 1.4 to 23 • C, with a mean of 10 • C. Important for identifying snow dominated regions is the percent of precipitation (PPT) that falls as snow (%Snow/PPT), which ranges from 0 to 72 % for the study watersheds.Relative humidity ranges from 39 to 73 %, with a mean of 55, and potential evapotranspiration ranges from 400 to 1200 mm yr −1 , with a mean of 633 mm yr −1 .
Vegetation types vary across the watersheds as well.Evergreen forest and shrub vegetation are the dominant land cover type over most watersheds used in this study (Fig. 2).The high proportion of evergreen is due to the dominance of this study's high elevation fires in mountainous regions and shrub prevalence is due to the abundance of study watersheds found in the chaparral regions of southern California.Grassland, mixed forest, and developed land account for a smaller proportion of land cover types.Barren land and wetland account for only a small percentage of land cover types throughout the watersheds in this study.

Data collection
Watershed parameters were chosen to encompass the variability in geophysical, burn area, and climatic conditions found throughout the watersheds used in this study (Appendix A).

Aridity index
Percent snow/PPT Figure 3. Boxplots summarizing the distribution of watershed area, elevation, percent area burned, percent precipitation that falls as snow, and aridity index across study watersheds.

Watershed selection
Selected watersheds were required to have continuous daily flow records (> 95 % of daily flow records accounted for in each year) for a minimum of 10 years pre-fire and 5 years post-fire.Using the approximately 9000 watersheds in the GAGES-II dataset (Falcone, 2011), delineated watersheds were spatially cross-referenced with the Monitoring Trends in Burn Severity (MTBS) database of historic wildfires from 1984 to 2010 (Eidenshink et al., 2007).The results were again cross-referenced with USGS daily flow records (U.S. Geological Survey, 2014) to identify watersheds with the required flow records, resulting in 263 unique watersheds in the US with greater than 5 % total burn area in a single water year.Of these, 23 contained 2-3 wildfires within the same year, burning over 5 % of the total area.The remainder contained only a single significant fire in the year of interest.Further exclusion of watersheds was based on the presence of major dams within the watershed flow regimes extracted from the GAGES-II database (Falcone, 2011), resulting in a final collection of 82 watersheds.

Streamflow and precipitation data
Daily flow and peak flow data were obtained from the USGS National Water Information System (U.S. Geological Survey, 2014) subset for the range of 10 years pre-fire and 5 years post-fire for each selected watershed.Monthly precipitation data were collected from the PRISM database, a nation-wide 4 km resolution gridded monthly dataset that extrapolates station climate measurements over unmonitored areas using a topographic-and climate-based algorithm (PRISM Climate Group, 2004).Monthly national precipitation rasters were averaged for each watershed for all months within the flow record period.

Climatological data
Watershed climatological parameters included percent of precipitation that falls as snow (%Snow/PPT) and the aridity index (AI).The %Snow/PPT for each watershed was available through the GAGES-II dataset (Falcone, 2011) and the aridity index was calculated for each watershed as where P avg is average annual precipitation and PET avg is average annual potential evapotranspiration, both of which were available in the GAGES-II dataset.

Burn severity data
Burn severity is the classification of burn areas relating visible changes in living and non-living biomass, fire byproducts, and soil exposure within one growing season, including low, moderate, and high severity categories (Eidenshink et al., 2007).Though categorization varies by region, some generalizations can be made.Typical high severity burns result in complete kills of canopy trees and almost complete consumption of surface litter and organic soil layers (Neary et al., 2005).Characteristics of moderate burn severity include partial canopy cover kill, completely charred or consumed understory vegetation, and widespread destruction of the soil organic layer.Low severity burns lightly scorch trees, char or consume surface litter, and produce little to no charring of the soil organic layer.Wildfires are almost always a patchwork of varying degrees of burn severity.More specifically, burn severity is the qualitative assessment of the heat pulse directed toward the ground during a fire, relating soil heating, fuel consumption, and mortality of buried plant parts (National Wildfire Coordinating Group, 2016).Burn severity data were obtained for each unique fire through the MTBS database (Eidenshink et al., 2007) and quantified as the percent area burned of the total watershed area.Values were categorized as unburned low severity, moderate severity, and high severity (Eidenshink et al., 2007).

Vegetation data
Vegetation data were identified for each burn area prior to the fire event.Due to the temporal gaps in the National Land Cover Database (Homer et al., 2004), an averaged normalized difference vegetation index (NDVI) was collected for pre-fire burn areas to quantitatively summarize vegetation conditions, similar to previous studies (Barbosa et al., 1999;Kinoshita and Hogue, 2011;Lee and Chow, 2015).NDVI is defined as where a nir and a vis are surface reflectance averaged over the ranges of wavelengths in the near infrared and visible spectrums, respectively.Despite NDVI having been shown to have accuracy issues related to atmospheric interference and variations in soil brightness (Carlson and Ripley, 1997), the extended timespan over which values were being averaged may have muted any such error responses.Average values were collected for each watershed through national 32-day NDVI rasters hosted on Google Earth Engine (GEE) (Google, Inc., 2015), in turn calculated from Land-sat5 composite satellite data freely available through the U.S. Landsat archive at the USGS Earth Resources Observation and Science (EROS) Center (Woodcock et al., 2008).Mean NDVI values were produced for the burn areas of all watersheds for 4 years pre-fire.

Soils data
Soils data were collected through an adapted version of the State Soil Geographic (STATSGO) database, a national collection of over 78 000 polygons containing a host of soil characteristics (Schwartz and Alexander, 1995).The soil erodibility factor (KFact) was utilized to numerically represent average soil types, as it provides a quantitative description of a soil's erodibility.The calculation included values of particle size, percent organic matter, percent clay, soil structure index, and profile-permeability class factor (Goldman et al., 1986).KFact increases as the potential erodibility of a soil increases.

Topographic data
Utilized topographic information included watershed area and the elevation, slope, and aspect of burn areas, calculated through a 30 m resolution continental US (CONUS) digital elevation model (U.S. Geological Survey, 2015).Percent burn areas of watersheds were calculated by intersection of watersheds with MTBS fire polygons.

Response variables
A range of response values were selected to quantify post-fire flow changes across a variety of flow conditions, including flows relating to dry season (low flows, baseflows) and wet season (high flows, peak flows).

Low, high, and peak flows
Low flow and high flow metrics were calculated for each of the 10 water years prior to the fire water year and averaged to produce a single value.A low flow value for water was computed as the average of mean daily flows with a 90 % exceedance.To reduce calculation bias due to zero flow days commonly found in ephemeral stream systems, zero flow days were eliminated from exceedance value calculations.High flows were defined similarly, with a 10 % exceedance threshold used to isolate larger volume flows (Kinoshita and Hogue, 2015).Changes in low flows and high flows were calculated as the post-fire percent change from the average of 10 water years pre-fire.Post-fire values were calculated for the first year, the second year, and the 5-year mean.Peak flows were defined as the largest daily mean flow measurement each water year and post-fire changes were calculated as the percent change of the first year, second year, and 5year mean peak flow measurements from the pre-fire 10-year mean.Percent changes in the number of zero flow days were calculated similarly.

Runoff ratios
The runoff ratio is defined as the fraction of total annual runoff depth over total annual precipitation: where P tot is total annual precipitation, Q tot is total annual runoff depth, and A ws is watershed area.Runoff ratio was calculated for the 10 years pre-fire and 5 years post-fire using PRISM precipitation and USGS mean daily flow data.
Post-fire runoff ratio response was calculated as the percent change between the first year, second year, and average 5year values post-fire and the pre-fire 10-year mean.

Baseflow and Richards-Baker indices
Baseflow index, defined as the fraction of total streamflow that is baseflow (Baker et al., 2004), was calculated for each water year through the "hydrostats" R package (Bond, 2015), which applies the Lyne-Hollick filter (Ladson et al., 2013).Baseflow index response was calculated as the percent change in the first-, second-, and 5-year post-fire from the mean of the 10-year pre-fire.The Richards-Baker index quantifies the frequency and rapidity of short-term changes in streamflow (flashiness) based where q is mean daily flow and i is time (Baker et al., 2004).Richards-Baker response was calculated as the percent change from the average value over the 10 years pre-fire to the average 1, 2, and 5 years post-fire.

k-means clustering
To create region-specific regression models, watersheds were classified into unique areas through k-means clustering (MacQueen, 1967) based on all GAGES II watersheds.This approach partitions an N-dimensional population of observations into clusters with minimal variation, allowing for relatively simple similarity grouping.For the current study, the ideal ensemble of clusters was one that produced easily recognizable regions with unique climatological characteristics.Large-scale clustering methods have been applied in prior watershed classification studies, but utilized more complex streamflow and ecological indices as parameters (McManamay et al., 2014;Poff, 1996).
Wildfires in this study are primarily in western evergreen and shrub environments, so clustering using only these watersheds would likely produce regions biased by fire occurrence.To limit this impact, we applied the mclust package in R (Fraley et al., 2012) to cluster using the entire 9000 GAGES-II watershed to produce national regions.The mclust package was chosen over the standard kmeans function in R due to its inclusion of numerous model-based approaches and application of the Bayes information criterion (BIC) to determine the most accurate model and cluster count (Schwarz, 1978).Various groupings of simple parameters were used for clustering, including watershed latitude and longitude, elevation, AI, %Snow/PPT, and mean monthly and seasonal flow statistics.

Control evaluation methods
Characterization of the influence of watershed geophysical parameters on flow response was performed using three approaches.First, calculation of the Spearman correlation coefficient between independent and response variables.Second, calculation of two types of regression equations: (1) a standard multiple linear regression model, and (2) a logistic regression model where responses greater than 20 % are assigned a value of 1 and responses less than 20 % relative to pre-fire conditions are assigned a value of 0. Third, application of a conditional inference tree algorithm through the "party" package in R (Hothorn et al., 2006).This algorithm, also called the random forest method, first applies a significance test to identify the independent variables that have the strongest association with the response variable.It then  searches for the best split points in those independent variables to partition the data into a tree.

Clustering
The k-means clustering performed on the GAGES-II watershed set yielded nine clusters or regions (Fig. 4).The most useful clusters for the current study are 6 through 9, which assemble 77 of the 82 watersheds into unique regions (Fig. 5).These four clusters have unique characteristics (Fig. 6).Watersheds in Cluster 6, based on the median, have the highest KFact, though most are burned less than 20 %.They have relatively moderate %Snow/PPT values and the lowest AI values.Watersheds in Cluster 7 have the highest median watershed area and elevation, and accordingly the highest %Snow/PPT and the lowest NDVI.Cluster 8 contains watersheds with the widest range of relative fire sizes, including watersheds burned from as little as 10 % to as great as 97 %.Watersheds in Cluster 8 also have the lowest median elevations and watershed areas, as well as the smallest %Snow/PPT and low AI.The percent of the burned area rated as high burn severity is also the greatest, based on the me-  dian, in Cluster 8 watersheds.Cluster 9 watersheds have the lowest median KFact and second highest median elevations.These watersheds also have high %Snow/PPT and AI.

Response variable distribution and analyses
Calculated flow response variables indicate an extremely wide range of post-fire system responses (Fig. 7).The greatest ranges occur within variables representing changes in low flows, such as first-year change to the number of zero flow days (SD = 243 %), first-year baseflow index (236 %), and 5-year change to the number of zero flow days (201 %).
The tightest ranges are typically found within mean 5-year variables where extreme changes are muted, such as 5-year Richards-Baker (28 %), runoff ratio (39 %), and baseflow index (48 %).Response variable means range from as low as 0.92 % (5-year Richards-Baker) to as great as 115 % (firstyear change to the number of zero flow days).Due to the nature of response calculations, the number of zero flow day values were limited.First-and second-year zero flow day responses were found for only 22 watersheds, and for 27 watersheds for the 5-year values.

Trend analysis
Comparison of first-year response variables to 5-year mean variables typically results in slope values greater than one, with a mean slope of 1.9 (Fig. 8) and the greatest slopes being found in first-year low-flow, high-flow, and baseflow index response variables.Best fit lines of second-year response variables versus 5-year mean variables produce significantly lower slopes, with a mean of 0.78.We can infer from this that the greatest increases in these response variables are typically generated in the first year following a fire.Similarly, in comparing response variables to percent area burned (Fig. 9), best fit lines show the greatest slopes in firstyear values for low flows, high flows, and peak flows.Slopes in second-year values are almost always less than those of the first-year values and almost always greater than those of the 5-year mean values.The exception to this are second-year numbers of zero flow day values that decrease with increased fire size.Overall, the number of zero flow days is observed to increase post-fire, though due to both a small sample size and a short time period, results are most likely unreliable.Only Richards-Baker indicates little linear correlation with burn area, yielding marginal first-and second-year slopes, and 5year mean values decrease with percent area burned.
These findings largely confirm those of previous studies which posit that streamflow response tends to be greatest immediately following a wildfire.While there is substantial noise in the data, we find similar results.While response tends to increase overall within 5 years of a wildfire, the greatest increases are frequently found within the first year.

Response variability by cluster
Boxplots for response variables show differences across the four useful clusters noted above (Fig. 10).In this instance and that of the CONUS plots, all variables are scaled by dividing the variable by the percent area burned of the watershed in order to show relative response.Generalizing variability and magnitudes, dramatic differences can be established between clusters.Most noteworthy is Cluster 9, which dominantly produces negative streamflow responses post-fire.The largest magnitude relative responses are found in Cluster 6, which results in the greatest variability between variables (SD = 8.5).The most positive responses are found in Cluster 8, which frequently exhibits large positive responses and infrequent low negative responses.Cluster 7 produces low magnitude responses.Though they are, on average, positive, there are also several negative responses.
Evaluating cluster responses when they are not scaled relative to fire size percent area burned shows many similarities to the transformed results (not shown).For instance, average response in Cluster 9 is still dominantly negative with an overall low magnitude.Cluster 8 demonstrates the greatest overall variability and magnitude.On average, response in this cluster is strongly positive.Cluster 7 responses are small relative to other clusters in this study with a negative trend.Cluster 6 still exhibits high variability and low magnitude.What we surmise from these results is that watersheds in the Cluster 8 region will most likely be impacted the most from wildfire in terms of flow response.Watersheds in Clusters 7 and 9 see substantially lower responses, if any at all.Cluster 6 may prove to be the most difficult to predict due to its high variability in response.
Evidence of temporal patterns is sometimes noticeable when examining response by cluster.In Clusters 6 and 9, there are no distinct trends between first-, second-, and 5year mean relative values (Fig. 10).In some response variables, there are increases in the values of second-year variables within Cluster 9. the largest response values in the first year after a fire.Relative values tend to be negative in the second year and positive over the 5-year mean.Actual values are still positive in the second year.
By examining response values by cluster, we are able to identify more intricate and robust trends than by simply examining the dataset as a whole.Spatially, we find that watersheds in Cluster 8 produce much greater and more predictable post-fire flow responses than watersheds in Clusters 6, 7, and 9. Responses in Clusters 7 and 9, overall, tend to be low magnitude and negative.Cluster 6 watersheds yield highly variable responses.Watersheds in Cluster 8 follow the temporal trend found when examining all study watersheds as a whole, with the greatest responses occurring in the first post-fire year and decreasing over time.However, the magnitude of the response of these fires skewed the results of the generalized examination at the beginning of this section.Cluster 7 watersheds, in fact, produce decreased responses in the first-year post-fire with increased flows occurring at the second-year and 5-year mean time periods.Clusters 6 and 9 exhibit little to no temporal trends at all.
Variability in response variables also generally decreases with increasing percent watershed burned.Linear regression modeling of response variables by percent watershed burned yields the error statistics found in Fig. 11 (i.e., decreasing sample size with increasing percent area burned).Fig-ures include the adjusted R 2 and p-value significance tests (α = 0.05).Generally, linear modeling of first-year response variables increases in accuracy as included values are limited by increasing percent burn area.First-year low-flow, highflow, peak-flow, and Richards-Baker variables show substantial increases in adjusted R 2 once included watersheds are limited to those exceeding a 50 % burn area (n = 12).Included p values indicate that several of the first-year lowflow and peak-flow models are statistically significant.
Applying the same methods to second-year values shows dissimilar results, with adjusted R 2 exceeding 0.5 in only a single instance (second-year high flows), and few significant p values.However, in the cases of second-year low flow and Richards-Baker, R 2 values increase with increasing percent burn threshold.Simple regression of 5-year mean values versus percent area burned produces mixed results, with only 5-year mean low flow and peak flow allowing for adjusted R 2 values greater than 0.5, few of which are statistically significant.The 5-year mean baseflow index shows a single instance of a high adjusted R 2 value, but is statistically insignificant.Simple regression modeling was also performed on response variables by limiting included watersheds by decreasing percent area burned (i.e., decreasing sample size with decreasing percent area burned) and results demonstrated zero significant adjusted R 2 values.The results from Cluster 8, a dominantly chaparral environment, agree with other studies within the same region (Coombs and Melack, 2013;Kinoshita andHogue, 2011, 2015;Loáiciga et al., 2001) and in chaparral environments outside the United States (Hessling, 1999;Lavabre et al., 1993;Lindley et al., 1988;Scott, 1993).As mentioned in the introduction, some studies find little to no change in streamflow post-fire (Aronica et al., 2002;Bart and Hope, 2010;Britton, 1991;Townsend and Douglas, 2000).

Spearman correlation
Spearman correlation between the independent and dependent variables, averaged across response variables, shows inconclusive results (Fig. 12).Generally, NDVI, AI, Snow/PPT, and slope are positively correlated with flow response.Low burn severity area, KFact, and watershed area are typically negatively correlated with flow response.Thus, as NDVI, AI, %Snow/PPT, and slope increase, response increases.As low burn severity, KFact, and watershed area increase, response decreases.More difficult to interpret are the results for other independent variables, including moderate burn severity, high burn severity, and elevation.The correlation of these variables appears to switch from positive to negative by cluster.Ascertaining a significance for their relationships may be impossible via correlation for such a small sample size.
Correlation coefficients tend to be most different within Cluster 6, where NDVI, AI, percent area burned, KFact, high burn severity, and moderate burn severity show values contradicting those produced for Clusters 7-8, as well as for the fires as a whole.This is most likely indicative of a significant difference in response patterns for the region.Unfortunately, identifying significant trends is difficult given the relatively small dataset currently available.
Figure 13 demonstrates the correlation coefficients found across all burned watersheds.The largest absolute correlation value is 0.32, only 20 % of coefficients are greater than 0.10, and only 10 correlations are statistically significant according to asymptotic p values.As relative fire size is progressively limited to larger and larger values, correlation coefficients increase.NDVI, AI, percent watershed burned, slope, high burn severity, and moderate burn severity are dominantly positively correlated with flow response.Low burn severity, KFact, and watershed area are dominantly negatively correlated with flow response.Elevation and %Snow/PPT display more varied correlations, but are generally negatively correlated with flow response.From these results, we can surmise that as the NDVI, AI, percent area burned, slope, and moderate/high burn severity area increase, flow response values will increase as well.Furthermore, as low burn severity, KFact, and watershed area increase, flow response will decrease.agreement, showing negative correlation with response.All other independent variables that were dominant under Spearman correlation are either too variable between clusters to accurately characterize, or are opposed to the correlation results.For instance, NDVI and AI can be strongly positively or negatively correlated with response depending on cluster.Clusters 7 and 8 demonstrate an expected response of inverse correlation between NDVI and runoff; i.e., as vegetation becomes denser and ET processes recover, less water exits the watershed through streamflow.

Regression
Similarly, as AI increases, streamflow increases; i.e., in more arid watersheds post-fire vegetation is likely less dense, using less water and releasing more water through streamflow.However, in Cluster 6, NDVI appears to be positively correlated with streamflow and AI negatively correlated.Thus, as vegetation density increases and aridity decreases, streamflow increases.These unexpected deviations in trends may be due in part to climate fluctuations that can have a strong impact on results due to the small sample size.High burn severity is shown to be negatively correlated with response for Clusters 6, 7, and 8 and all clusters combined.Because the sample size of these clusters is so small (ranging from n = 12 to 29), regression results may not be appropriate.

Random forest
Application of the random forest method provided little further insight into controlling watershed parameters.Applying the algorithm to the response values resulted in significant trees for only first-year low flow, high flow, runoff ratio, and peak flow, as well as for second-year low flow.For the low-flow response variables, area burned and aspect were the dominant controlling independent variables.For first-year low flow, watersheds burned greater than 23.3 % show the greatest response.Of those burned less than that threshold, some significant responses are found when burn areas have an aspect greater than 215 • .For second-year low flow, the largest responses are found in watersheds burned greater than 37 %.First-year high flow, runoff ratio, and peak flow are identified as being significantly affected by slope.Slopes of 7.0, 7.0, and 9.8 • divide response, respectively.Watersheds greater than these slopes demonstrate much greater high flow, runoff ratio, and peak flow in the first year than those with gentler slopes.This may be because steeper watersheds produce greater overland flow velocities or support shallower soil layers and thus have lower water absorption rates than in more gently sloped watersheds.

Discussion of geophysical parameters
Through Spearman correlation, the independent variables NDVI, AI, percent area burned, slope, moderate burn severity, and high burn severity are positively correlated with postfire flow response.Low burn severity, KFact, and watershed area are negatively correlated with response.The results of several regression models are less clear, with inconstant relationships across all fires and clusters.Generally, they show that only moderate burn severity is dominantly positively correlated with response and that watershed area and, somehow, high burn severity are negatively correlated with response (Fig. 15).Random forest analysis shows that significant relationships can be found for the first-year low-flow, highflow, runoff-ratio, and peak-flow response variables, as well as second-year low flow.These relationships are dominated by dependence on either percent burn area or by slope.
Overall, results of geophysical parameter characterization are somewhat inconsistent, likely due to the sample size of this study.Though it is one of the largest to date, there are still too few fire events relative to the number of geophys-1233 ical parameters to produce consistent results.What can be gleaned from the various methods used in this section is that slope is frequently a strong predictor of response.Possible reasons for this are that water absorption into soil may decrease along steeper slopes due to greater overland flow velocity (i.e., less time available for absorption), or that soils along steeper slopes may be shallower, thus decreasing potential absorption volume.Both of these possibilities may account for increased water contribution to streamflow in regions with greater slope.
Support for this argument is found in the Spearman correlation coefficient analysis in Figs. 12 and 13, where slope is shown to be strongly correlated with low-flow, high-flow, peak-flow, and runoff responses, as well as in the random forest analysis where slope is one of the few independent variables to be identified as significant.
Determining the influence of independent variables on response at the cluster level seems, with this sample size, unreasonable.Furthermore, several explanatory variables are significantly correlated.Although averaged linear and logistic regression and random forest methods identify correlated values, reducing the overall number of variables in future studies may yield more understandable models.There is too much variance in flow regimes to complete a trend analysis with groupings of the size produced in this study.However, over the next decade the sample size of this study should be able to be increased significantly and provide a more robust dataset for analysis.

Conclusions
Post-fire changes in streamflow are found to be highly variable across regions of the western US and some trends can be difficult to discern or may be non-existent.In general, flow response for the study watersheds was found to be greatest in both magnitude and variability within the first year following a fire and shown to decrease over a 5-year period.Variations in flow response can be found by examining groups of watersheds clustered by climatic variables.We find that these general trends are dominated by the high-magnitude responses of watersheds from Cluster 8.While watersheds from Clusters 6 and 9 do not show identifiable trends, Cluster 7 shows decreased responses in the first year following fires and positive, increasing responses in just the second year and in the 5year means, indicating that Cluster 7 watersheds have a more delayed flow response.Trends within Cluster 8 agree with other studies within the same region and in chaparral environments in other parts of the world.This study similarly identifies several regions (Clusters 6 and 9) that also do not exhibit significant distinguishable trends.
Identification of controlling watershed parameters on response yielded weak correlation metrics.Various methods showed sometimes contrary results, or none at all.Spearman correlation indicated that watershed slope, normalized difference vegetation index (NDVI), aridity index (AI), and percent of precipitation that falls as snow (%Snow/PPT) are positively correlated with flow response.Percent area low burn severity, soil erodibility factor (KFact), and watershed area were shown to be negatively correlated with response.These correlations largely agree with the positions asserted by Moody et al. (2013), Biggio and Cannon (2001), and Neary et al. (2005).Regression models showed that only moderate burn severity and watershed area are consistently positively and negatively correlated, respectively, with flow response.Random forest models indicated that percent area burned and slope are the only significant factors, and only for first-year response metrics.
The observed changes in streamflow following wildfire identified have wide-ranging implications for better understanding post-fire water budgets, downstream flood response, long-term water yield, and watershed modeling.Improved streamflow predictions will ultimately allow water resource managers in water-scarce regions to anticipate surplus volume in their budgeting forecast calculations and also help flood forecasters to identify areas at greater risk of damage and infrastructure overload.Identification of the influence of watershed geophysical parameters on post-fire streamflow should also enable improved calibration of regional models for burned watersheds.

Figure 1 .
Figure 1.Continental United States-scale map of the locations of the 82 watersheds utilized in this study.

Figure 4 .Figure 5 .
Figure 4. CONUS-scale map of the results of k-means clustering of the GAGES-II watershed set by climate variables.

Figure 6 .
Figure 6.Boxplots of the variation in explanatory variables by cluster.

Figure 7 .
Figure 7. Boxplot of the distribution of response variables utilized in this study.Nzero refers to the number of zero flow days.

Figure 11 .
Figure 11.Scatterplots of the adjusted R 2 and p-value results (r) of simple linear regression modeling of response variables versus an increasing percent burn limitation.Response variables are abbreviated as LF (low flow), HF (high flow), RO (runoff ratio), PF (peak flow), BFI (baseflow index), and RB (Richards-Baker).

Figure 12 .
Figure 12.Heatmap of the averaged Spearman correlation coefficients of independent variables versus response variables, by cluster.WS refers to watershed.

Figure 14 Figure 14 .
Figure14shows the beta (coefficient) values produced by averaging the results of the linear and logistic regression models.The resulting betas are highly variable and thus trends are either difficult to interpret or nonexistent.To clarify results, Fig.15is provided with logical values, where positive and negative betas are shown as values 1 and −1, respectively.Some trends are in agreement with those of the correlation results, such as moderate burn severity being positively correlated with response.Watershed area is also in

Figure 15 .
Figure 15.Heatmap of logical values of the averaged regression model beta values for each cluster shown in Fig. 14.WS refers to watershed.