Statistical modelling of the snow depth distribution in open alpine terrain

The spatial distribution of alpine snow covers is characterised by large variability. Taking this variability into account is important for many tasks including hydrology, glaciology, ecology or natural hazards. Statistical modelling is frequently applied to assess the spatial variability of the snow cover. For this study, we assembled seven data sets of high-resolution snow-depth measurements from different mountain regions around the world. All data were obtained from airborne laser scanning near the time of maximum seasonal snow accumulation. Topographic parameters were used to model the snow depth distribution on the catchment-scale by applying multiple linear regressions. We found that by averaging out the substantial spatial heterogeneity at the metre scales, i.e. individual drifts and aggregating snow accumulation at the landscape or hydrological response unit scale (cell size 400 m), that 30 to 91 % of the snow depth variability can be explained by models that are calibrated to local conditions at the single study areas. As all sites were sparsely vegetated, only a few topographic variables were included as explanatory variables, including elevation, slope, the deviation of the aspect from north (northing), and a wind sheltering parameter. In most cases, elevation, slope and northing are very good predictors of snow distribution. A comparison of the models showed that importance of parameters and their coefficients differed among the catchments. A “global” model, combining all the data from all areas investigated, could only explain 23 % of the variability. It appears that local statistical models cannot be transferred to different regions. However, models developed on one peak snow season are good predictors for other peak snow seasons.

human activities, including agriculture (e.g. irrigation), hydropower or water resources management, depend on information on the spatio-temporal variability of snow cover and melt. In order to capture the temporally varying run-off caused by snowmelt, hydrological models need to account for the spatial distribution of snow.
Recent years have seen advances in modelling snow distributions by applying sophisticated physically based snow cover redistribution models (Pomeroy and Li, 2000;Liston and Elder, 2006;Lehning et al., 2008;Schneiderbauer and Prokop, 2011). Such models have been operated at grid resolutions of a few metres (e.g. Essery et al., 1999;Liston and Elder, 2006;Schneiderbauer and Prokop, 2011) to hundreds of meters (e.g. Bernhardt et al., 2009;Bavay et al., 2009Bavay et al., , 2013MacDonald et al., 2010;Magnusson et al., 2010). These models require high level input, including meteorological data, sometimes detailed flow fields  and sometimes high-resolution digital elevation models. They are therefore very expensive in terms of required information and calculation resources, and have not been applied for larger areas or longer time frames. In general, there is a trade-off between model complexity and generality on the one hand and computation time on the other hand although reasonable compromises have been achieved using the hydrological response unit concept (MacDonald et al., 2009;Fang and Pomeroy, 2009) which derives from that of landscape units for stratified snow sampling (e.g. Steppuhn and Dyck, 1974).
Topography strongly influences snow distribution but is not a causative factor in itself (e.g. McKay and Gray, 1981;Pomeroy and Gray, 1995). The spatial heterogeneity of the snow cover is attributed to a number of different processes which act on different scales: local precipitation amounts are strongly affected by the interaction of the terrain with the local weather and climate (Choularton and Perry , 1986;Daly et al., 1994;Beniston et al., 2003;Mott et al., 2013). Wind plays a major role for the deposition and redistribution of snow (e.g. McKay and Gray, 1981;Pomeroy and Gray, 1995;Essery et al., 1999;Trujillo et al., 2007;Lehning et al., 2008). Moreover, snow can be relocated by avalanches and sloughing (Blöschl and Kirnbauer, 1992;Gruber, 2007;Bernhardt and Schulz, 2010), and the local radiation and energy balance influence the spatially varying ablation processes (Cline et al., 1998;Pomeroy et al., 1998Pomeroy et al., , 2004Pohl et al., 2006;Mott et al., 2011).
Many studies have attempted to exploit the relationship between topography and snow in statistical models. Topographic parameters serve as explanatory variables for explaining the spatial heterogeneity of snow depth, snow water equivalent (SWE) or melt rates. Several studies have successfully developed binary regression tree models (e.g. Elder et al., 1998;Balk and Elder, 2000;Erxleben et al., 2002;Anderton et al., 2004;Molotch et al., 2005;Lopez-Moreno and Nogues-Bravo, 2006;Litaor et al., 2008), where the response data (e.g. snow depth) are repeatedly split along a predictor (e.g. terrain parameter) into sub-groups by minimising the sum of the residuals. Such regression tree models were capable of explaining 18 to 90 % of the snow-cover variability. The performance of the models depends on both the complexity of the tree (number of explanatory parameters and number of splits), and on the quality and complexity of the data that are analyzed.
A second very common statistical approach is multiple linear regression. As in the regression tree models, terrain parameters serve as explanatory variables. An early example is the multivariate regression model developed by Golding (1974) and applied to Marmot Creek Research Basin, Alberta. Golding (1974) found that elevation, topographic position, aspect, slope, and forest density were the most important variables for predicting snow accumulation but together these could only explain 48 % of the variation of SWE. Based on 106 snow poles Lopez-Moreno and Nogues-Bravo (2006) were able to explain more than 50 % of the large scale snow depth variability over the Pyrenees with elevation, elevation range, radiation and two location parameters. Chang and Li (2000) used monthly SWE data averaged from 13 to 36 snow courses in Idaho to build regression models with a combination of elevation, slope, aspect and a relative position. The models could explain 60 to 90 % of the monthly SWE variability for different regions and large catchments. Marchand and Killingtveit (2005) found that it was possible to model snow depth in open and forested areas of a large (849 km 2 ) Norwegian catchment with different measures of elevation, aspect, curvature and slope (R 2 = 5-48 %). For their analysis, they aggregated a very large number of snow depth measurements, obtained by georadar and hand probing to a grid of 1000 m. Pomeroy et al. (2002) used a parametric model derived from a physically based snow interception model to explain snow accumulation in the boreal forest of Saskatchewan and Yukon, Canada, as a function of winter plant area index with an R 2 of 79 %; the parametric model had almost identical form to a regression model using canopy cover developed by Ku'zmin (1960) for snow accumulation in Russian boreal forests, suggesting commonality in the relationship of snow accumulation to forest structure in circumpolar boreal forests.
Beside such applications for very large areas, several studies have applied linear regression for smaller catchments: for each of 19 topographic sub-units, Jost et al. (2007) averaged 60 manual snow depth and 12 density measurements of a 17 km 2 catchment in Canada. A linear regression model with elevation, aspect and forest cover could explain up to 89 % of the large scale variability in SWE. In order to apply universal kriging of SWE for a 8 km 2 glacier in Austria, Plattner et al. (2006) built a linear regression model with elevation, curvature, distance from ridge and a shelter index (R 2 = 41 %). The SWE measurements were obtained from the average of three snow depth measurements at 165 sample sites and snow densities from five snow pits. Elevation, potential radiation and forest cover were good for representing 66 % of the spatial variability of SWE, obtained from 33 snow cores in a small Swiss catchment (Hosang and Dettwiler, 1991). In order to optimise a snow depth sampling scheme, Elder et al. (1991) applied regression models with elevation, slope and radiation, which explained 27 to 40 % of the SWE variability of a three year data set for the Emerald Lake basin (California). SWE was calculated from density measurements and 86 to 354 manual snow depths taken by manual probing.
Nonetheless, in evaluating such results the sample density, data quality, and scale of the investigation must be considered. All the studies listed up to now in this paper were limited by the small amount and spatial coverage of snow depth observations available for the analysis. Most are based on manual snow depth measurements and only tens to a few hundreds of samples could be used for the regression modelling.
In the recent years, high-resolution and high-quality snow depth data from terrestrial (e.g. Prokop et al., 2008;Grünewald et al., 2010) and airborne laser scanning (ALS) (Hopkinson et al., 2004;Deems et al., 2006;Grünewald and Lehning, 2011) have become available. Grünewald et al. (2010) used snow depth data obtained by terrestrial laser scanning and built linear regression models (elevation, slope, aspect, radiation/elevation, slope, maximum SWE, wind speed) which could explain 30 and 40 % of the spatial variability of daily ablation rates in the Wannangrat area (Davos, CH). In a recent study, Lehning et al. (2011) analyzed airborne Lidar measurements for two small catchments in Switzerland, and showed that a two-parameter model with elevation and a fractal roughness parameter can explain more than 70 % of the snow depth distribution when clustering data to sub-areas. Even though restricted to two small catchments, Lehning et al. (2011) speculate that it might be possible to transfer their findings to other mountain areas, and that a universal relationship of snow depth and accessible topography parameters might exist.
In this paper we would like to test this hypothesis using a much larger data set. We assembled high-resolution snow depth data from different mountain regions. For the first time, such high-resolution snow depth data from different areas are combined and compared in a statistical analysis. This permits testing the hypothesis of whether the same or similar topographic parameters dominate the snow depth distribution on the catchment scale (100 to 10 000 m) in different areas. In Sect. 2, we describe the climatological and morphological characteristics of each data set and present the methods of data aggregation and linear regression in detail. In the Results section, the models developed for each region are compared with each other and results from model combinations are discussed.

Airborne laser scanning
Several studies have successfully applied snow depth data obtained by ALS in the recent years (e.g. Deems et al., 2006Deems et al., , 2008Trujillo et al., 2007Trujillo et al., , 2009Grünewald et al., 2010;Grünewald and Lehning, 2011). ALS enables area-wide data to be gathered in a very high spatial resolution and accuracy. Hopkinson et al. (2004) and Deems et al. (2006) showed that ALS was an appropriate and accurate method for gathering snow depth measurements, and since then, a rising number of data sets have become available (Deems et al., 2006;Moreno Banos et al., 2009;Dadic et al., 2010a;DeBeer and Pomeroy, 2010;Grünewald and Lehning, 2011;Schöber et al., 2011;Hopkinson et al., 2012). Detailed descriptions of the ALS measurement principle can be found in Geist et al. (2009), Baltsavias (1999 and Wehr and Lohr (1999).
From the point clouds obtained by ALS, we calculated digital surface models by averaging the points to a regular grid of 1 m cell size. Snow depth maps were then produced by subtracting a digital surface model, obtained by the same technology, in snow-free conditions from a digital surface model that was measured in the winter. Unrealistic data such as negative snow depths or extremely high values are rare but still need to be filtered. Such outliers mainly occur in very steep and rough terrain, which is attributed to larger measurement errors in such terrain Grünewald and Lehning, 2011;Bollmann et al., 2011;Hopkinson et al., 2012). In this study, we set negative snow depths to zero and excluded extremely high snow depths from the data. The threshold for extremely high snow depth was set by visual inspection of the distribution separately for each data set, and was between 5 and 15 m. As ALS cannot measure snow depth on forested mountain slopes with reasonable accuracy and snow redistribution by interception and accumulation in forests were outside of the scope of this study, areas with forests and high bushes were masked. Most data sets currently available were measured near the peak of the local snow accumulation and represent the maximum of the seasonal snow amounts.
The accuracy of ALS-data depends mostly on the measurement platform. Most data sets mentioned above were gathered from airplanes, which are very effective in terms of the area that can be surveyed. The larger the flight height of the plane, which is typically several hundreds of metres above the surface, and the lower the impact angles of the laser beam in steep terrain, the lower is the accuracy. Nevertheless, the vertical error is usually below 30 cm for snow depth in alpine areas (Geist and Stötter, 2008;Moreno Banos et al., 2009;DeBeer and Pomeroy, 2010) but can be much larger in steep and rough areas. Alternatively, ALS campaigns can be performed with helicopter-based systems (Vallet and Skaloud, 2005;Skaloud et al., 2006;Vallet, 2011). The reduced terrain following flying height and the potential of Table 1. Data sets. "Date" is the date of the winter survey, "Area" the area of the data set, "Mean acc." the mean accuracy in vertical direction as quoted in the reference column and "Platform" the measurement plaform. the system to be tilted in the direction of the target (improved incident angel) results in a higher accuracy. Comparing helicopter-based snow depth maps with terrestrial laser scanning and tachymeter surveys, Grünewald et al. (2010) found that the error of the ALS data was below 10 cm. The typical area covered by ALS data sets obtained for this study is between 1 and 30 km 2 and they all have a spatial resolution close to one metre. Accuracy estimations and references to the data analyzed in this study are given in Table 1.

Site description
We assembled a large data set consisting of seven investigation areas located in six different regions, including the Swiss and the Austrian Alps, the Spanish Pyrenees, and the Canadian Rocky Mountains. Two catchments are dominated by glaciers whereas the remaining areas are free of ice. The borders of the investigation areas analysed in this study have been defined manually. This study focuses on the snow depth distribution of alpine areas. Therefore, effects of vegetation, such as interception of snow by trees, were not considered and all parts of the investigation areas which are covered by significant vegetation were masked from the data. Furthermore, it has been shown that ALS snow depth measurements are more accurate over alpine areas than adjacent forest covered areas (Hopkinson et al., 2012). In the following, we give a brief overview of the study sites. For more details, the reader may refer to Table 1, Figs. 1 to 3 and to the references given in the text.

Wannengrat, CH
The Wannengrat (WAN) area (Fig. 1a) is a small alpine catchment that is located in the eastern part of the Swiss Alps. The site has been an intensive investigation area for snow studies in recent years Grünewald and Lehning, 2011;Mott et al., , 2011Mott et al., , 2013Wirz et al., 2011;Egli et al., 2012). Most of the area is above the local tree line and no tall vegetation is present. Elevations range from 1940 to 2650 m a.s.l. and the lower part is typically characterised by gentle slopes whereas the summit region is formed by talus and steep rock faces. Storms which dominate the snow accumulation in the area usually come from the north-west . Two helicopter-based (Vallet and Skaloud, 2005;Skaloud et al., 2006;Vallet, 2011) winter ALS data sets were obtained in the area on 26 April 2008 and 9 April 2009, which represent the maximum snow accumulation in the area for the two years. The area of the second flight covers 5 km 2 , whereas the first flight only covers a small part of the domain (1.5 km 2 , see Grünewald et al., 2010;Grünewald and Lehning, 2011).

Piz Lagrev, CH
The Piz Lagrev (LAG) study area ( Fig. 1b) includes the south face of the Piz Lagrev, which is part of the mountain range defining the northern rim of the Engadin valley in the southeast of the Swiss Alps. The site is above the local tree line, and elevations range from 1800 to 3080 m. The Piz Lagrev is characterised by steep talus slopes which are surrounded by steep rock faces. Two pronounced, sheltered bowls dominate the upper part of the area . ALS measurements were obtained on 7 April 2009 using the same technology as for the Wannengrat, and cover an area of about 5 km 2 . Analysis of a nearby weather station indicates that the prevailing wind direction is south-west.

Haut Glacier d'Arolla, CH
The study domain Haut Glacier d'Arolla (ARO) is located on the main Alpine divide in the south-west of Switzerland at that point of time, elevations below 2700 m a.s.l. had already been affected by surface melt. Note that changes of the surface of the DEM due to glacier dynamics and the settling of firn in the accumulation area were not accounted for when calculating the snow depths. This resulted in a potential underestimation of the true snow depth in the accumulation area of the glacier, and in an overestimation in the ablation zone, whereas the overestimation in the ablation zone may be countered by the previously mentioned surface melt. An analysis of weather stations in the surrounding area showed that wind directions are variable, but that the synoptic main flow is from the southwest. This is also confirmed by temporary measurements in the catchment (Dadic et al., 2010a,b).

Hintereisferner, AT
The investigation area Hintereisferner (HEF) is situated in the upper Rofen valley, which is part of theÖtztal Alps at the Austrian-Italian border (Fig. 2b). The catchment, as defined for this study, covers an area of 25 km 2 and is characterised by glaciers, steep slopes and rock faces. Elevations range from 2300 to 3740 m a.s.l. and the largest glaciers are the valley glacier Hintereisferner (2002: ∼ 6.5 km 2 ) and the Kesselwandferner (2002: ∼ 3.8 km 2 ). On the local scale, wind directions are very variable, but the main direction is northwest. In this study, we analyze data obtained in three measurement campaigns in the beginning of May 2002May , 2003May and 2009, and one data set acquired in June 2002 (Geist and Stötter, 2008, Table 1). The first three flights were near the end of the seasonal snow accumulation in the area, whereas snowmelt clearly had an effect on the lower parts of the catchment in June 2002. To calculate snow depth, each of the winter DSMs was subtracted from a summer DEM obtained in the previous autumn of the respective year. As for ARO, glacier dynamics were neglected in the calculations. More details on the investigation area and the ALS campaigns can be found in Geist and Stötter (2008) and Bollmann et al. (2011).

Vall de Núria, ES
Vall de Núria (NUR) (Fig. 3a) is located at the main divide in the Eastern part of the Spanish Pyrenees. An ALS campaign covering an area of 38 km 2 was performed on 9 March 2009, which is near the time of the peak of the seasonal accumulation for the area (Moreno Banos et al., 2009). After masking significant vegetation and human infrastructure 28 km 2 of the area remained. Elevations range from 2000 to 2900 m a.s.l. and the mean slope is 28 • . The area is characterised by a mixture of gentle slopes and some steeper, rocky areas in the summit region. The most frequent synoptic wind direction in the area is from the northwest, even though a large variability of the wind direction is present on the local scale.

Marmot Creek, CA
Marmot Creek Research Basin in the Kananaskis Valley, Alberta is located within the Front Range of the Canadian Rocky Mountains. For our analysis we selected two small domains (Fig. 3b). The investigation area named Marmot Creek (MAR) in this study is the alpine zone surrounding Mount Allan and includes the Marmot Creek basin in the East. The

Aggregation of sub-areas
The aim of this study is to explain the spatial variability of snow depth on the catchment scale. With high-resolution snow depth measurements at hand, a statistical assessment considering snow depth and topography at the highest possible resolution is not meaningful. The difficulty becomes obvious if one thinks about cornices or other drift features. Maximum snow depth in a cornice is not the result of topographic features (e.g. slope) directly at this point but is shaped by the upwind topography interacting with the wind. Therefore, a systematic correlation between snow depth and terrain parameters will not be found at very small scales. Regression models of snow depth of the different investigation areas (not shown) confirmed that and showed that at the highest resolution (metre scale without aggregation), only 2 to 37 % of the variability could be explained by the terrain. Because of the large spatial variability at scales of metres (Shook and Gray, 1996;Watson et al., 2006), we define subareas in order to smooth this small-scale variability. We aggregated the data to spatially continuous subareas with length scales of some hundreds of metres. Two different methods for the aggregation are applied. The first is the same as performed by Lehning et al. (2011), who manually defined sub-areas from a subjective impression of the terrain (Fig. 4). The length scale of these subareas was approximately 500 m. Such an aggregation follows the concept of hydrological response units (HRU), which has successfully been applied for semi-distributed hydrological models (Kite and Kouwen, 1992;Rinaldo et al., 2006;Pomeroy et al., 2007) and is consistent with the concept of stratified snow sampling (Steppuhn and Dyck, 1974) where landscape units were defined to minimise within-unit variance of SWE and maximise difference in SWE between units. The facets applied by Daly et al. (1994) for the calculation of precipitation gradients or to the concept of snow accumulation units outlined by Hopkinson et al. (2012) are also similar approaches. The second method is a simple automatic approach which divides the domain into quadratic subareas. This approach is very similar to a simple resample of the data to larger cell sizes. We have tested different grid sizes ranging from 100 to 800 m. An example which illustrates the aggregation is shown in Fig. 4. The criteria which were applied to select the cell size appropriate for the study are: (i) the small scale variability should be smoothed out to an extent that the remaining variability can be reasonably explained by the topographic parameters, and (ii) the number of sub-areas must be large enough to be able to fit robust regression models (N 30). In order to avoid small clusters which do not represent a significant footprint of the catchment, very small sub-areas (< 10 000 m 2 ) were automatically excluded from the aggregated data. The approach is similar to the concept of representative elementary area (Wood et al., 1988;Blöschl et al., 1995), which aims to identify ideal scales for hydrological modelling applications.
As the second method is a straightforward and fully automated approach, it is possible to compose multiple statistical models by randomly shifting the origin of the grid. This approach helps to identify the range of model performance and if a selected grid results in a representative model. The impact of random effects, which might be caused by a single grid or by subjective classification of the sub-areas, can therefore be assessed.

Statistical models
We use multiple linear regression to model the spatial variability of snow depth. The dependent variable is the relative snow depth dHS (deviation from the catchment mean). Simple parameters that describe the surface topography and that can be derived from a summer digital elevation model serve as explanatory variables. Elevation reflects the effect of decreasing air temperatures with altitude and can therefore be used as a proxy for parameters such as freezing level or the amount of energy available for melt (Clark et al., 2011), and for orographic effects of precipitation (Frei and Schär, 1998;Grünewald and Lehning, 2011) though it is recognized that there are substantial differences in orographic effects between windward and leeward slopes (Pomeroy and Brun, 2001). In order to be able to combine data sets from different areas, elevation was changed to relative elevation (dE), which is the difference between the absolute and the minimum elevation of the area. This mapping respects the fact that all data sets are from the alpine zone at or above tree line but that this corresponds to different absolute elevation in the different areas. The slope (SL) represents gravitational processes such as sloughing and avalanching, which can have a significant effect on the snow distribution (Blöschl and Kirnbauer, 1992;Gruber, 2007;Sovilla et al., 2010;Bernhardt and Schulz, 2010). In combination with northing (NO), the slope also describes the amount of solar energy which is available for the ablation and settling of snow. NO is also important for the deposition and redistribution of snow by wind (Seyfried and Wilcox, 1995;Lehning et al., 2008), as more snow is usually accumulated in the leeward sides of slopes and mountains. An additional parameter, which represents the effect of the wind, is the mean sheltering index SX , which has been calculated for the direction of the main flow (see site descriptions) and for a maximum distance of 100 m. Several studies showed that SX is a good measure for sheltering and exposure of the local terrain, which gives a simple but reasonable representation of the local flow field and therefore of the redistribution of snow by wind (e.g. Winstral and Marks, 2002;Anderton et al., 2004;Erickson et al., 2005;Molotch et al., 2005;Litaor et al., 2008;. Finally, different measures for the surface roughness are applied. The standard deviation of the elevation (σ (E)) and slope (σ (SL)) represent classical morphometric roughness measures (Evans, 1972;Speight, 1974;Shepard et al., 2001). The surface roughness strongly affects the redistribution of snow by wind and gravitational processes (Jost et al., 2007), and can be seen as the capability of the surface to trap snow. As suggested by Lehning et al. (2011), we also tested the fractal dimension D and the ordinal intercept γ of the semi-variogram, which have been identified as good measures for the surface roughness (Goodchild and Mark, 1987;Power and Tullis, 1991;Klinkenberg, 1992;Klinkenberg and Goodchild, 1992;Xu et al., 1993;Sun et al., 2006;Taud and Parrot, 2006).
For each of the aggregated data sets, we separately calculated all possible two-and three-parameter regression models and selected the most significant models (p value < 0.001). Further selection criteria are the explanatory power (R 2 ) of the model and -if the R 2 was similar -the number of parameters included in the model. For data sets with a twoparameter model, which had a similar performance as the best three-parameter-model, the two-parameter-model was selected. To account for possible interactions of the parameters included in the models, factor combinations (multiplication of parameters) have been tested and included where appropriate. The preconditions of linear regression (normal distribution and constant variance of residuals) were examined by analyzing the Quantil-Quantil plots and the Tukey-Anscombe plots of the model residuals (not shown). Additionally, the physical meaningfulness of the models (e.g. sign of the coefficients) has been examined, and models have not been selected if the sign of the coefficient was counter-intuitive. As mentioned before, the automatic aggregation to quadratic sub-areas allows the calculation of multiple random grids. This in turn can be used to analyse whether the best models are consistent (same explanatory variables) and show a small standard deviation of the coefficients when calculated for different grid realizations. Note that the fractal parameters D and γ could not be included in the multiple model runs, as their computation is computationally very demanding. D and γ were therefore only applied for the models with manually defined sub-areas and for few selected quadratic models.
To assess the transferability of the results, we also built a "global" model by combining all data sets. In addition, subsets of the data were combined based on their morphological characteristics. The two morphological models discussed here are "Glacier", which includes the two sites which are currently glaciated, and "No glacier", which combines all remaining, non-glaciated data sets from alpine terrain.
The multiple-year data sets for HEF and WAN allow a validation in order to investigate the inter-annual consistency of the models. For HEF, models calculated from data sets containing three of the snow depth surveys in the area are validated with the remaining years. We repeated this procedure for each combination. Scatter plots and the coefficients of determination (R 2 det ) are used to assess the performance of the validation. As the extent of the 2008 data set for WAN is very small it could not be used to compose an own regression model. Nonetheless it could be used to validate the model obtained from the 2009 data set.

Aggregation of sub-areas
In order to find the appropriate level of aggregation for the automatic approach, we calculated models for different grid sizes and compared the results in terms of model performance. Figure 5 illustrates such a result for the Wannengrat area. The explanatory parameters of the model are dE, SL and NO. The figure illustrates a strong decrease in the snow depth variability with increasing grid size (Fig. 5a), which is negatively correlated to an increased model performance (Fig. 5b). This relationship is not surprising since an increasing portion of the variability is smoothed out by the aggregation. Nevertheless, at cell sizes of about 400 m the effect of additional smoothing appears very limited. Based on the analysis shown in Fig. 5, this level therefore appears to be a good choice. It can explain much of the remaining larger scale spatial variability, while the number of sub-areas still remains large enough to calculate robust statistical models. For the other investigation areas, similar results (not shown) were obtained. Cell sizes between 200 and 500 m showed to perform best. This is consistent with the length scale of > 100 m found by Shook and Gray (1996) for minimal autocorrelation in snow depth data. We therefore decided to apply a grid size of 400 m to all areas. Choosing a 200 m grid instead of the 400 m would result in very similar models but with lower performance: while the model parameters remain the same, the modelled R 2 would be reduced by 0.01 to 0.14 for the different data sets. For example for WAN R 2 would decrease from 73 to 59 % (Fig. 5b). Moreover, a 400 m grid is a similar level of aggregation as for the manually defined sub-areas. A comparison of models, calculated with sub-areas identified by the manual and the automatic (400 m) approach, showed that the results are very similar in terms of the best parameters selected for the final models, and in terms of model performance. As it is an objective and fully automatic method, we only show results from the automatic procedure in the following.

Statistical models
The model assumptions for linear regressions (see above) were fulfilled satisfactorily for all data. Nevertheless, small deviations from the normal distribution of residuals for very large positive and negative residuals have to be reported for ARO, HEF and SKO. Logarithmic transformation of the variables has been tested. As this did not yield improvements, the original, non-transformed variables were used. Table 2 summarises the best models for each of the single investigation areas. The table indicates that similar parameter combinations dominate most areas. Such, dE, NO and SL are included in almost all models. SX is included for NUR and for MAR. A roughness index is only present in the model for NUR (σ (SL)) and SKO (γ ). Nevertheless substituting γ by, for example, (σ (SL)) would result in a model with a similar performance. For some areas, different parameter combinations than the ones presented have only a slightly lower performance (not shown). Factor combination significantly improved the model for NUR, MAR and ARO.

Single catchment models
Adding an additional parameter would further improve the models for some of the areas (NUR, MAR, SKO). In order to keep the models as simple as possible and to avoid an overparameterization of the small areas, we do not include such additional variables.
As an example, we now focus on the Wannengrat area (first row in Table 2). The model represents a positive elevation gradient of the snow depth which is 15 cm per 100 m, a decrease of dHS with the slope angle (2 cm ( • SL) −1 ) and a decrease of dHS toward southerly aspects (4 cm (10 • NO) −1 ). Figure 6 compares a dHS map resulting from the model with measured dHS, aggregated to the model resolution. The upper image shows the snow depth distribution as aggregated from the ALS measurements, while the lower image presents the snow cover as calculated with the model. The figure illustrates that the statistical approach seems suitable to characterise the general spatial patterns of the snow depth distribution. Deviations occur especially in regions of extreme dHS, for example, in single cells in the northwest, the southeast or the southwest. This indicates that the model does not properly capture the extremes of the distribution. Nevertheless, the deviations are only small.
Returning to Table 2 and focussing on the different areas, it appears that both the parameters and the model coefficients are varying between the data sets. It is also obvious that the performance of the models (R 2 ) is quite different. While the model for LAG can explain more than 90 % of the snow depth variability, only 30 % could be explained for MAR and 41 % for SKO. For most areas, the R 2 is between 50 and 70 % which is a very good score in comparison to published results obtained at similar scales. The large range in the model performance might be attributed to the topographic diversity within the single catchments. Smaller, less heterogeneous catchments, such as LAG or WAN perform much better than the very large (HEF, ARO, NUR) or very complex (MAR, SKO) areas. Reasons for the relatively low R 2 for MAR and SKO can be found in the characteristics of the data sets, where the topography in the summit region of the two areas is dominated by extremely steep rock faces (Fig. 3b), which are exposed to wind and avalanches. In the period before the snow depth data had been acquired, several heavy storms occurred in the area. The heavy winds eroded and sublimated much of the snow cover in the summit region and, as a result, the major portion of the summit region was nearly free of snow. This is a common characteristic of alpine snow in the relatively windy Canadian Rockies (MacDonald et al., 2010). A further possible reason for the lower performance of some of the models could be the accuracy of some parts of the ALS data. As mentioned before, it must be expected that biases are larger in steep and rough terrain, especially for the data obtained from airplanes. Such terrain-based errors will also influence the models on resolutions as presented here and must be considered when interpreting results from such areas, especially when slope serves as an explanatory variable.
In general, the results confirm that topography dominates the snow depth distribution at that scale. The deviations between the models indicate that the influences on the snow depth distribution are variable in the different areas. For example, the magnitude of the elevation gradient is between 6 and 25 cm/100 m, and the coefficient for slope varies by a factor of four (ARO not included because of factor combinations). For NUR and MAR, the elevation gradient dE seems not to be relevant at all. Instead, the sheltering effect of the terrain (SX) is more important.
The representativeness of the results obtained from the selected grid (Table 2, Fig. 4) was assessed by calculating multiple model runs, for which the origins of the grid were shifted randomly. Results are summarised in Table 3. Fifty runs were performed for the best model, as identified in Table 2 (but without factor combinations), for each investigation area. The table shows the sample space (25 to 75 % quantiles) for all model parameters and for the model performance. The last column indicates the R 2 of the model run that is based on the same grid as the models presented in Table 2.
Generally the table indicates relatively small interquantile ranges for most models and parameters. For example for WAN (first line) the difference between the upper and lower quantile and the median of α i is only 6 to 10 % of the median. Similar ranges are characteristic for most parameters in the other data sets. Only α 1 and α 3 for SKO and α 3 for ARO show larger deviation of the quantiles and the median. For R 2 , interquantile ranges between 0.01 and 0.12 (Table 3) were calculated. Nevertheless, it is still necessary to note that for a few data sets outliers with much larger or much smaller R 2 values were observed. It is therefore an important finding from these ensemble calculations that the scatter between Table 3. Comparison of the model performance of the multiple random runs and the single selected model as presented in Table 2   the single runs is only moderate. Taken as a whole, Table 3 shows that most of the models represent the total average well. Moreover, we can conclude that the absolute position of the subareas only has a minor effect on the models. The spatial averaging procedure appears to result in relatively stable statistical relations, at least at the scale of the 400 m grid.

Combined models
For practical applications on a large scale, it would be very beneficial if it were possible to find a "global" model, which would be able to predict a large portion of the spatial variability for all data sets. We therefore combined all data to one large data set and built a new model, following the same procedure described above. However, the resulting model (Table 4), which selected dE and SL as explanatory variables, was only able to explain 23 % of the total snow depth variability of the aggregated data. Additional model parameters or factor combinations could not significantly improve the model. A model that only includes data from the alpine areas, which are presently not glaciated, is capable of explaining 30 % of the variability. This is clearly worse than the models for most of the single data sets (Table 2). It appears that as suggested by Pomeroy and Gray (1995), the interaction between topography and snow depth is not the same at different sites, and that a statistical relationship, developed for one area, cannot directly be transferred to a different site. In contrast, the "glacier" model features a similar performance to the models for the single glaciers (ARO and HEF). This could indicate that similar characteristics are important for the snow cover on glaciated catchments or that the sample size is too small to show substantial variability in glaciated sites. However, since the model consists of the same data as the HEF model presented in Table 2, only supplemented by the comparably small ARO data, the ARO data will only have a minor effect on the model and its R 2 value. Finally, when combining data sets from different sources, one needs to be aware of possible differences in the platforms and sensors applied for the data acquisition. Sensors of different manufacturers can be very diverse in their design, operational envelope and laser pulse specifications which might result in different levels of performance and error propagation. Nevertheless, the influence on statistical relationships should be marginal.

Inter-annual consistency
Results of the analysis of the inter-annual consistency, as described in Sect. 3.2.1, are shown in Table 5. Figure 7 illustrates the scatter plots for the validation, as shown in lines two (a) and five (b) of  Fig. 7b shows that for the WAN data the model is appropriate for both years for most sub-areas.
From these findings, we conclude that the models can be transferred between various years, at least for the points of time at which maximum snow accumulation occurred. This confirms a finding of , who detected a large consistency in the snow depth distribution at the Wannengrat between two years. A high inter-annual correlation was also reported by Deems et al. (2008) for two mountain sites in Colorado. However, at Wolf Creek Research Basin in Yukon, Canada, Pomeroy et al. (2004) and MacDonald et al. (2009) found substantially different snow accumulation distributions from year to year as primary wind directions of the major storms shifted between southerly and westerly. From our data, we cannot verify if the models can also be applied for earlier periods in the season. As the snow cover present at the end of the winter is the result of many individual snow fall events which are all affected by the topography, we speculate that the models, at least at a scale as presented in this paper, might also have explanatory power if applied to the accumulation season. The study of , who concluded that few major storms shape the snow depth distribution at the end of the winter, supports that hypothesis.

Conclusions
In this paper, we have applied linear multiple-regression modelling of snow depth distributions with topographic parameters for several small and medium size catchments in different mountain regions of the world. We showed that statistical modelling is capable of explaining large parts of the spatial distribution of snow depth on the catchment scale. This is only achieved if the data is aggregated to length scales of some hundreds of metres, in order to smooth out the very large heterogeneity caused by drifting snow at small scales. After arbitrary clustering of the data to quadratic cells with a grid size of 400 m, 30 to 91 % of the variability could be explained by including only three explanatory parameters. For the first time, a systematic investigation of the explanatory power as a function of scale has been made. The parameters which are most frequent in the models are the elevation gradient (dE), the slope (SL), a substitute for the aspect (NO), and a wind-sheltering index (SX). These results are typically better than those obtained by other studies because of the systematic aggregation. It is important to notice that owing to the high resolution of the original data applied in our study, the full small-scale variability of the snow cover is captured. This is not the case for most earlier studies, which had to rely on a limited number of point observations, such as snow stakes or SNOTEL sites. This means that not only is it likely that the full variability is not represented in these data sets, but also that there is a potential bias in the data, e.g. if only flatter portions of a catchment are sampled .
Studies which compare data from different regions are also rare. Our analysis showed that there are clear differences between the models for the specific regions, but that a global model still explains a significant fraction of the variability (23 or 30 % without glaciers). This indicates that the relationship between snow and topography is less universal than hypothesized by Lehning et al. (2011) and the application of a "global" model is limited. On the contrary, it could be shown that statistical models, developed for one year, can be well applied to other years, at least for the points of time of the maximum seasonal accumulation. This is an important finding for applications in hydrology or glaciology. A method, as presented here, can serve as a straightforward approach to characterise the spatial distribution of the snow cover. Applications from year to year should be applied with caution in some areas which have been reported to show high interannual changes in wind regimes that transport blowing snow.
We expect that more area-wide snow depth data sets will become available in the coming years. It might soon be possible to cover a larger spectrum of mountain regions for all over the world.
It will be interesting to see if the main topographic parameters that provided some explanation of snow depth distributions here for a limited section of climatic and physiographic environments are also important in other environments. Combining statistical with parametric or physically based methods might also yield more robust, yet still simple, models of snow accumulation. Another question which needs to be addressed is if such statistical relations are also valid for different points of time during the snow season, and how the statistical relationships change with time. This would require area-wide snow depth data for a larger area from different points of time during a season.
It should also be possible to extend similar models to vegetated areas. This would lead to the introduction of additional explanatory parameters which represent the influence of trees and vegetation, for example, a measure of winter plant-area index as used by Pomeroy et al. (2002).