Developing a representative snow monitoring network in a forested mountain watershed

A challenge in establishing new ground-based stations for monitoring snowpack accumulation and ablation is to locate the sites in areas that represent the key processes affecting snow accumulation and ablation. This is especially challenging in forested montane watersheds where the combined effects of terrain, climate, and land cover affect seasonal snowpack. We present a coupled modelling approach used to objectively identify representative snow monitoring locations in a forested watershed in the western Oregon Cascades mountain 10 range. We used a binary regression tree (BRT) non-parametric statistical model to classify peak snow water equivalent (SWE) based on physiographic landscape characteristics in an average snow year, an above average snow year, and a below average snow year. Training data for the BRT classification were derived using spatially distributed estimates of SWE from a validated physically-based model of snow evolution. The optimal BRT model showed that elevation and land cover type were the most significant drivers of spatial variability in peak 15 SWE across the watershed (R = 0.93, p-value < 0.01). Geospatial elevation and land cover data were used to map the BRT-derived snow classes across the watershed. Specific snow monitoring sites were selected randomly within the dominant BRT-derived snow classes to capture the range of spatial variability in snowpack conditions in the McKenzie River Basin. The Forest Elevational Snow Transect (ForEST) is a result of this coupled modelling approach and represents combinations of forested and open land cover types at low, mid, and high 20 elevations. After five years of snowpack monitoring, the ForEST network provides a valuable and detailed dataset of snow accumulation, snow ablation, and snowpack energy balance in forested and open sites from the rain-snow transition zone to the upper seasonal snow zone in the western Oregon Cascades.


Introduction
Mountain snowpack is declining as a result of the warming climate (Kunkel et al., 2016;Knowles, 2015;Pederson et al., 2011Pederson et al., , 2013Rupp et al., 2013;Mote, 2006), subsequently shifting timing (Fritze et al., 2011;Clow, 2010) and volume of streamflow (Woodhouse et al., 2016;Berghuijs et al., 2014;Luce and Holden, 2009) across the western United States. Luce et al. (2013) argued that the declining snowpack is also the result of weakening westerlies leading to a decline in mountain precipitation in the interior western USA. The volume and seasonality of water produced from these snow-dominated watersheds varies spatially and temporally as a function of precipitation and temperature (Tennant et al., 2015;Barnett et al., 2005;Regonda et al., 2005), as well as local physiographic effects of topography, geology, and vegetation dynamics (Molotch and Meromy, 2014;Clark et al., 2011;Jefferson et al., 2008;Ffolliott et al., 1989).
Snow water equivalent (SWE) is a critical hydrologic resource in the montane western USA that has been actively monitored for decades by the Natural Resources Conservation Service (NRCS). The NRCS currently manages approximately 858 Snowpack Telemetry (SNOTEL) stations across the western USA (http://www.wcc.nrcs.usda. gov/snotel/SNOTEL_brochure.pdf). These stations provide near-real-time measurements of SWE, temperature, and precipitation, which is essential data for operational streamflow forecasts used by water managers, who balance a wide range of needs including irrigation, aquatic habitat, hydropower, recreation, and municipal water use. While most SNOTEL sites have been operating since the early 1980s, the data are meant to be used as indices to forecast discharge. These records are valuable but the stations were not designed to be, nor are they representative of, the total snow volume across a basin (Meromy et al., 2013;Molotch and Bales, 2006a). The SNOTEL monitoring stations in the Oregon Cascades are located within a narrow elevation range (1140-1510 m) that may not capture the inherent variability in the spatial distribution of snow under present-day or warmer climate conditions (Nolin, 2012;Brown, 2009).
Modeling has been shown to be an effective means of augmenting remote sensing, and a valuable tool for predicting spatially distributed snow conditions in the rugged, forested, and frequently cloud-covered montane watersheds of the Pacific Northwest (Sproles et al., 2013;Tague and Grant, 2009;Veatch et al., 2009;Luce et al., 1999;Cline et al., 1998). Landscape characteristics have been used to predict snowpack conditions at hillslope scales using non-parametric binary regression tree (BRT) statistical classification models Anderton et al., 2004;Erxleben et al., 2002;Winstral et al., 2002;Balk and Elder, 2000;Elder et al., 1998). Larger-scale BRT approaches have also been conducted using remotely sensed snow-covered area and interpolation methods (Molotch and Meromy, 2014;Molotch and Bales, 2006b). However, no study to date has used landscape characteristics in conjunction with modeled and validated physically based and spatially distributed SWE data to understand physiographic drivers of snow accumulation at broad scales (watersheds > 1000 km 2 ) or to identify optimal locations for snowpack monitoring. Additionally, most of the research on the physiographic relationships to snow processes has been done in cold, dry continental snowpacks where mid-winter melt events are infrequent and wind redistribution is substantial Erxleben et al., 2002;Winstral et al., 2002;Balk and Elder, 2000). Much less is known about how physiographic conditions influence the temperature sensitive snowpacks in the forested maritime basins of the Pacific Northwest.
To objectively identify optimal site locations to distribute a snow monitoring network, which explicitly captures the spatial variability of snow accumulation relative to the physiographic landscape, we used a combination of physically based, statistical, and geospatial models. This paper presents this objective and relatively simple methodology to distribute a snow monitoring network, which captures landscapedriven spatial variability in snow accumulation and includes four major objectives: 1. determine the key physiographic drivers of spatial variability in snow accumulation; 2. classify snow classes in the watershed based on key physiographic drivers using a non-parametric statistical model; 3. spatially distribute these snow classes across the watershed, using a geospatial model; 4. select site locations for a snow-monitoring network, which spans the spatial variability in snow water equivalent in the McKenzie River basin.

Study site
The McKenzie River, located in the western Oregon Cascades, is a major tributary of the Willamette River (Fig. 1). The McKenzie River basin (MRB) drains an area of 3041 km 2 , and covers about 12 % of the land area in the greater Willamette River basin. The MRB is a densely forested mountainous watershed, ranging in elevation from 150 to 3150 m, which is a managed for timber production throughout much of the seasonal snow zone. Brooks et al. (2012), determined that 60-80 % of summer flow in the Willamette River originated from elevations above 1200 m in the Oregon Cascades. The porous basalts in this geologically young landscape allow much of the snowmelt to percolate into groundwater systems Grant, 2004, 2009;Jefferson et al., 2008). The groundwater-fed McKenzie River provides 25 % of the late season volumetric base flow to the Willamette River at its confluence with the Columbia River (Hulse et al., 2002).

Data sources
Gridded data were obtained for physiographic variables shown in the literature to influence snow accumulation and ablation, including elevation, slope, aspect, incoming solar radiation, wind, and three vegetation variables from the following sources for the extent of the MRB. A digital elevation model (DEM) was obtained from the National Elevation Dataset at a 10 m resolution. Slope, aspect, and incoming solar radiation were calculated from the DEM using the spatial analyst and solar radiation tool boxes in Ar-cGIS 10.1 (ESRI, Redlands, CA). The upwind contributing area data, which captures the variability in snow deposition as a result of wind redistribution for each cell throughout the watershed , was calculated following Molotch et al. (2005). The 2006 National Land Cover Data (NLCD) were used to classify land cover across the watershed (Fry et al., 2011). Land cover data were reclassified into a binary product of forest and open land cover classes. The US Geological Survey (USGS) LANDFIRE Data Distribution Site provided the existing vegetation-percent canopy cover data at 30 m spatial resolution. Normalized Difference Vegetation Index (NDVI) data were obtained from the Moderate Resolution Imaging SpectroRadiometer (MODIS) MOD13Q1 -vegetation indices, 16-day land product for the earliest date possible in April 2009, at a 250 m spatial resolution. Watershed boundaries were defined using the USGS National Hydrography Dataset. Public land ownership data were provided by the Oregon Department of Forestry, and obtained from the website, http://www.oregon.gov/odf/ pages/gis/gisdata.aspx. All spatial data were masked to the McKenzie River basin and converted to the same projection and spatial resolution: NAD83, UTM Zone 10, and a 100 m grid cell size. Spatial data were processed using ArcGIS 10.1 using bilinear interpolation for continuous data and nearest neighbor interpolation for discrete data. Modeled and gridded SWE data across the MRB (Fig. 2) were provided by Sproles et al. (2013). These data were developed using a physically based spatially distributed snow mass and energy balance model, SnowModel (Liston and Elder, 2006). SnowModel uses micrometeorological and topographic data to distribute snow across the landscape accounting for climatic, topographic, and vegetation variability. The model was modified by Sproles et al. (2013) to account for rain/snow precipitation phase partitioning using a linear function of air temperature from −2 to 2 • C (USACE, 1965), and snow albedo decay in forested landscapes using an empirically based exponential decay function (Burles and Boon, 2011). This model was calibrated and validated using data from the four SNOTEL sites, meteorological data from the HJ Andrews Long-term Ecological Research site and National Weather Service stations, and Landsat fractional snow-covered area data over the sampling period 1989-2009 (Sproles et al., 2013). The model was run at 100 m spatial resolution on a daily time step. We used modeled peak SWE data as the predicted variable in the BRT model. Sproles et al. (2013) showed that 2009 was considered an average snow year (normal snow year); therefore, we used peak SWE from 2009 (5 days centered on 4 April 2009) as our reference year. Additionally, we used peak SWE from 2008 (5 days centered on 24 April 2008) as an above-average snow year (high snow year), and peak SWE from 2005 (5 days centered on 20 April 2005) as a below-average snow year (low snow year).

Analysis
A BRT model was developed to characterize the spatial variability of snow accumulation across the MRB based on independent physiographic variables using the Classified and Regression Trees (CART) software (Salford Systems, San Diego, CA). The BRT model is a hierarchical non-parametric statistical model that characterizes the mean and variance of a dependent variable using a suite of independent explanatory variables. Modeled SWE and physiographic variable data were used as input data for each cell where snow was present during peak SWE in 2009 (5 day average centered on 4 April 2009). An optimal tree was produced to minimize the standard error of the model, which was pruned to the simplest tree possible within 1 standard error of the optimal tree and so each terminal node represented at least 1 % of the variability in peak SWE. The resultant tree identified 21 terminal nodes that characterized the spatial variability in snow accumulation through combinations of independent drivers into 21 BRT-derived snow classes (Table 1). The BRT model identified elevation, land cover, NDVI, insolation, percent canopy cover, slope, and wind as significant explanatory drivers of the spatial variability of peak SWE (all selected variables had p values < 0.05 and are listed above in order of significance). Although elevation and land cover were the dominant predictive variables, and all other physiographic variables each explained less than 1 % of the variability in peak SWE. In order to reduce the multi-collinearity between related variables and reduce the risk of overfitting the model, we simplified the final optimal model to only include elevation and land cover. Within the CART software, the final optimal BRT model was validated using reserved data from an independent set of 20 000 randomly selected grid cells from within the MRB. The final parameters developed in this optimal tree for peak SWE in a normal snow year (2009), were used to develop equivalent BRT models using peak SWE input for a high snow year (2008), as well as to peak SWE during a low snow year (2005).
Using a Geographic Information Systems (GIS) geospatial model and statistically derived parameters, the 21 BRTderived snow classes were spatially distributed across the MRB. The geospatial model used physiographic data to distribute the areal extent of each BRT class across the MRB by assigning cells that met the statistically derived criteria for each BRT class. Because the BRT model did not determine a lower elevation limit on snow extent, we excluded areas with an elevation less than 600 m to prevent over-prediction of snow-covered area below elevations where it was observed in the modeled data. Total volumetric SWE (SWE depth × area) was calculated for each BRT class across the watershed, using the mean and variance of SWE, and the spatial extent of each BRT class. To validate the spatial distribution of the BRT-derived snow classes, we calculated the overall accuracy of the high snow year (2008) and low snow year (2005) relative to the reference year (2009) snow classes using an error matrix of omission vs. commission statistics (Campbell and Wynne, 2011).
To create a set of feasible locations for the in situ snowmonitoring network, we evaluated the accessibility of locations within the MRB. Using a GIS-based binary selection model, we masked out all private lands and public lands where the presence of endangered northern spotted owl prevented permitted access. To prevent contamination from the road network, but still define accessible site locations, we also identified areas within 100-500 m of a snowmobileaccessible road. From these accessible areas, the final sites were then randomly selected from each of the dominant BRTderived snow classes within the seasonal snow zone.

Results
Modeled peak SWE for all years had a positively skewed distribution across the range of elevations throughout the MRB (2009; kurtosis = 1.8, skewness = 1.62; 2008, kurtosis = 2.1, skewness = 1.7; 2005, kurtosis = 1.5, skewness = 1.7) with the greatest volume of snow located in the mostly forested area between 1300 and 1500 m in elevation (Fig. 3). The final optimal BRT model from the normal snow year (2009)  Table S1 in the Supplement). The final BRT model applied to the low snow year (2005) characterized SWE across the MRB into 21 snow classes with similar spatial variability relative to land cover but differing extents relative to elevation than during the normal snow year (2005 BRT model; R 2 = 0.895, p value < 0.01, RMSE = 0.09 m) ( Fig. 4; Table S2).
Elevation explained the most variance in modeled SWE across the basin, and was the primary driver of all snow classes (2009 BRT model with only elevation; R 2 = 0.91, p value < 0.01). In the middle elevations, land cover was also statistically important in distinguishing snow classes be-   from 1193 to 1747 m. In the high elevations, above the tree line, only elevation was statistically important in classifying the spatial variability in snow accumulation. Snowpack accumulation increased with increasing elevation, resulting in a greater mean SWE per unit area at the highest elevations. Although deep snowpack at the highest elevations only covers a small aerial extent of the MRB, which resulted in decreasing contribution of total basin-wide SWE above approximately 1700 m during the normal and high snow years. In contrast, during the low snow year, the highest elevation classes contributed the most to total basin-wide SWE (Fig. 5).
The BRT-derived volumetric SWE estimates had a similar positively skewed distribution across the elevational gradient as the SnowModel-derived SWE data in the MRB (Fig. 5). The BRT-derived estimate of 1.49 km 3 total SWE stored in the snowpack on 4 April 2009 within the MRB was less than 1 % greater than the SnowModel-derived estimate of 1.48 km 3 . The BRT-derived estimate of 1.94 km 3 total SWE stored in the snowpack on 24 April 2008 within the MRB was less than 1 % greater than the SnowModel-derived estimate of 1.93 km 3 . The BRT-derived estimate of 0.38 km 3 total SWE stored in the snowpack on 20 April 2005 within the MRB was 2.6 % less than the SnowModel-derived estimate of 0.39 km 3 . The final optimal BRT model from the nor-  (Tables S1 and S2). The BRT model performed well across the low and high elevations, where errors of omission and commission were generally lowest (Tables S1 and S2). Although across the mid-elevations, which consist of a patchwork of forest harvest and fire disturbance, were the areas with the greatest error between the BRT models. The high elevations above tree line, were the most consistently classified areas with low error between BRT models. The high error across the mid-elevations was due at least in part to the renumbering of classes when the model is rerun for each year, and therefore these statistics may underrepresent the accuracy of the BRT model in predicting overall spatial patterns of physiographically derived snow classes between years. The BRT-modeled snow classes captured the spatial variability in peak SWE across the MRB relative to elevation and land cover during an average, above-average, and below-average snow year and were used to objectively inform the site selection of a snow-monitoring network.
The geospatial selection model identified 16 of the 21 classes as being accessible during winter (on public land without permit restrictions and within 100-500 m of a snowmobile-accessible road). The highest elevations in the MRB are far from winter-accessible roads and difficult to monitor due to steep and avalanche prone slopes. Within the area covered by these 16 classes, random site locations were selected within the six most abundant classes across the MRB to capture low, medium, and high elevations, with forested and open land cover classes. The resultant Forest Elevation Snow Transect (ForEST) monitoring network site locations were thus objectively selected to sample across the range of spatial variability in SWE (Fig. 4). The ForEST network, composed of six meteorological stations and snow survey transects, was deployed in November 2011, and continues to provide high-quality snow and climate data to evaluate snow-forest-climate interactions in the MRB (Fig. 6).

Discussion
With warming winter temperatures, mountain snowpack in the western USA will likely continue to decline with potential impacts to forest health (Albright and Peterson, 2013) and streamflow (Jung and Chang, 2011;Cayan et al., 2010), as well as snow-related recreation and tourism (Gilaberte-Búrdalo et al., 2014;Nolin and Daly, 2006). There remains uncertainty around the magnitude of these impacts (Warren et al., 2011;Maurer, 2007;Xu et al., 2005); thus, it is important that monitoring networks capture not only normal snowpack conditions but also the range of variability in peak SWE across the landscape and through time.
Pacific Northwest forests play a key role in affecting snow accumulation and ablation across multiple scales; however, most research has been conducted at the stand scale (Storck et al., 2002) or in areas with cold, dry continental snowpacks (Ellis et al., 2013;Pomeroy et al., 2012). By distinguishing snow classes based on forest vs. open land cover across a range of elevations, this study emphasizes the watershed-scale control that vegetation and particularly land cover change relative to timber harvest (and potentially fire disturbance) has on snowpack accumulation in the maritime western Oregon Cascades. During low snow years, the significant influence of forest cover on the spatial variability in snow accumulation moved up in elevation from a normal snow year, suggesting that forest effects may have a more profound influence at higher elevations under future warming climate conditions. Understanding the forest structure effects on snow accumulation and ablation across elevation gradients is increasingly important to help guide decision making by local and regional water and forest managers in response to a changing climate.
We developed a snow-monitoring network representative of the spatial variability of peak SWE relative to physiographic landscape characteristics across the MRB for an average, above-average, and below-average snow year, by coupling a spatially distributed physically based SnowModel, a BRT statistical classification model, and a geospatial selection model. This objective method is a useful tool in classifying snow characteristics across the landscape to determine representative locations for intelligent snowpack monitoring particularly in physiographically complex landscapes. Although it is an improvement over more commonly used heuristic approaches to site selection, the method incorporates uncertainty as a result of compounding physically, statistically, and spatially based models which justifies caution in implementing these estimates in management decisions. However, the method meets assumptions of non-parametric data analysis, is performed with relative ease, and if data are available for the research basin of interest, it can be well validated. As even physically based models incorporate inherent empirically based historically derived assumptions, there is also uncertainty in using this approach to represent future spatial variability in snow accumulation.
The ForEST network contributes to the existing SNOTEL network to explicitly investigate snow-vegetation-climate interactions across the range of elevations and forest types in the watershed. The ForEST network is unique in that the monitoring site locations were selected based on statistical classification and geospatial analysis, rather than subjective methods that may incorporate bias. The paired forest-openland-cover site selection process has already led to important understanding of key sub-canopy snow processes (Storck et al., 2002;Golding and Swanson, 1986). But here, the assumptions driving paired site selection process have been validated using coupled physically based spatially distributed snow model input data and non-parametric BRT statistical modeling across a forested montane watershed. After 5 consecutive years of snow monitoring, we have created a valuable and detailed dataset of snow accumulation, snow ablation, and snowpack energy balance that spans the spatial variability in forest and open land cover types across an elevational gradient (Fig. 6).

Conclusions
The BRT model characterized peak SWE conditions in an average year, an above-average year, and below-average year to provide spatially distributed SWE volume estimates based on physiographic landscape characteristics. This integrated approach informed the distribution of an objective and representative monitoring network that spans the spatial variability in the seasonal snowpack across the MRB (Fig. 4). Throughout the maritime Pacific Northwest, it is critical we monitor snow-vegetation interactions across the elevation gradient, particularly at higher elevations where snowvegetation interactions may be more relevant in low snow years and under a warming climate.
By quantifying the spatial variability in the key drivers of natural resource distribution, researchers can focus on sensitive areas, which may not be identified through traditional site selection means. The use of validated model outputs as a predictor of the spatial variability in snow-vegetation interactions is not new (Randin et al., 2014). The novelty of this research stems from the application of the method, where by the coupling of a traditional BRT classification process with a validated physically based spatially distributed model, we improved snow observational network design in a forested montane watershed.
As the scientific community turns to more complex models to predict ecosystem responses to change, there is still a place for simple modeling approaches to inform scientific research priorities as well as natural resource monitoring and manage-ment. Particularly in rugged and densely forested mountain regions, such as the western Oregon Cascades, where there are few alternatives to modeling spatially distributed SWE, this coupled modeling approach provides a validated hypothesis to guide representative and objective snow-monitoring efforts.

Data availability
Data for the ForEST transect are available through the Oregon State University Scholars Archive at http://hdl.handle. net/1957/59984.
The Supplement related to this article is available online at doi:10.5194/hess-21-1-2017-supplement.