Testing and Development of Transfer Functions for Weighing Precipitation Gauges in WMO-SPICE

Adjustments for the undercatch of solid precipitation caused by wind were developed for different weighing gauge/wind shield combinations tested in WMO-SPICE. These include several different manufacturer-provided unshielded and single-Alter shielded weighing gauges, an MRW500 precipitation gauge within a small, manufacturer-provided shield, and host-provided precipitation gauges within double-Alter, Belfort double-Alter, and small Double-Fence Intercomparison Reference (SDFIR) shields. Previously-derived adjustments were also tested on measurements from each weighing gauge/wind shield combination. The transfer functions developed specifically for each of the different types of unshielded and single-Alter shielded weighing gauges did not perform significantly better than the more generic and universal transfer functions developed previously using measurements from eight different WMO-SPICE sites. This indicates that

winds, and was almost indistinguishable from the full-sized DFAR used as a reference. In general, the more effective wind shields, which were associated with smaller unadjusted errors, also produced more accurate measurements after adjustment.

Introduction
Precipitation measurements are frequently underestimated due to the interactions among wind, the precipitation gauge, and hydrometeors in the air around the gauge. The magnitude of this underestimation is affected by the wind speed and the phase, size, density and crystal habit of precipitation. Many past observational (e.g., Rasmussen et al., 2012; Wolff et al., 2013; Ma et al., 2015; Wolff et al., 2015; Chen et al., 2015) and theoretical (Theriault et al., 2012; Colli et al., 2015; Colli et al., 2016; Nespor and Sevruk, 1999; Sevruk et al., 1991) studies support this finding, including the first World Meteorological Organization (WMO) Solid Precipitation Measurement Intercomparison performed in the 1990s (Yang et al., 1998; Yang et al., 1995). A new WMO Solid Precipitation Intercomparison Experiment (WMO-SPICE) was initiated in 2010 to study and correct the effects of wind-induced errors on automated solid precipitation measurements, and also to evaluate new and existing precipitation and snow depth sensors in different configurations and climate regimes.
Measurements from the Spanish WMO-SPICE site were used to develop adjustments for tipping bucket gauge measurements (Buisán et al., 2017). Measurements from two WMO-SPICE test sites that pre-date the recent intercomparison have been used to describe and correct wind-induced undercatch for different types of wind shields (Kochendorfer et al., 2017b; Wolff et al., 2015). In addition, measurements from eight different WMO-SPICE sites during the intercomparison were used to derive multi-site adjustments for single-Alter shielded and unshielded gauges (Kochendorfer et al., 2017a).
These results indicate that despite some climate- or site-specific biases, multi-site adjustments (or transfer functions) can be used to effectively minimize the wind-induced undercatch of solid precipitation. In addition to the host-provided measurements used to derive the Kochendorfer et al. (2017a) multi-site single-Alter shielded and unshielded precipitation transfer functions, WMO-SPICE included several manufacturer-provided weighing precipitation gauges for evaluation, and also weighing gauges tested within other types of shields for more specific national and scientific interests. These measurements were processed using standardized methods developed and implemented by WMO-SPICE (Kochendorfer et al., 2017a), allowing for both the creation of new transfer functions and the evaluation of existing transfer functions derived from independent measurements.
In the present study, transfer functions were developed and tested using WMO-SPICE measurements from several types of weighing gauges and shields, including gauge types that have never been intercompared before, and for which no other adjustments are currently available. Previously-derived adjustments were also applied to these measurements to test their applicability and efficacy for each of the gauge/shield types under evaluation, and also to test the hypothesis that for a given shield type, the same corrections can be used to minimize wind-induced errors for different types of heated weighing precipitation gauges of a similar size and shape. Assuming that the type of shield (or the lack of a shield) is the primary determinant of undercatch, we used these measurements to compare transfer functions derived specifically for these gauges with more generic transfer functions derived previously using other gauges and measurements. Recommendations are made based on these evaluations.
Hydrol. Earth Syst. Sci. Discuss., https://doi.org/10.5194/hess-2017-228. Manuscript under review for journal Hydrol. Earth Syst. Sci. Discussion started: 12 June 2017. © Author(s) 2017. CC BY 3.0 License.

Precipitation Measurements
Weighing gauges and shield configurations tested at six of the WMO-SPICE testbeds (Fig. 1) are included in this study.
They were all evaluated during 2013-2015, with measurements during the two winter seasons (1 October to 30 April) from this period considered in the present analysis. The Double Fence Automated Reference (DFAR), which was defined as the working automated reference for WMO-SPICE, consisted of either a Pluvio² or a Geonor T-200B3 within a DFIR shield, and was used as the reference for all of the measurements evaluated here. The DFIR shield has been described in detail by Kochendorfer et al. (2017b) and Goodison et al. (1998), and comprises two concentric, octagonal fences constructed out of 1.5 m long wooden lath, with the outer shield having a diameter of 12 m, and the inner shield having a diameter of 4 m. At the centre of the inner shield, the weighing gauge is installed in a single-Alter shield. Typically the top of the single-Alter shield and the inlet of the weighing gauge are at a height of 3 m, but at Weissfluhjoch and Haukeliseter they were installed higher than this (at 3.5 and 4.05 m, respectively) to prevent drifting snow from burying the shield and gauge.
The TRwS 405 (MPS Systém, TRwS 405) has a heated 400 cm² orifice, a 750 mm capacity, and uses a strain gauge to measure the mass of precipitation accumulated within its bucket. It was provided by the manufacturer without a wind shield and tested at both the Haukeliseter and Marshall testbeds. The MRW500 (Meteoservis, MRW500, Czech Republic), with a heated 500 cm² orifice and a 1400 mm capacity, also employs strain gauges for weight measurement, and was tested at both the Marshall and Bratt's Lake testbeds. Both an unshielded (Fig. 2b) and a shielded gauge (Fig. 2c) were installed at each site, with the small, manufacturer-provided shield constructed out of fixed metal slats (similar to a Tretyakov shield) and attached to the same base as the gauge. An unshielded and a single-Alter shielded (Fig. 2d) Total Precipitation Gauge (TPG) provided by Sutron (Sutron, TPG, VA, USA) were tested at the Marshall testbed. The Sutron TPG uses a load cell to quantify the amount of accumulated precipitation, has a 914 mm capacity, an 8" diameter inlet (20.32 cm diameter, 324.3 cm² area), and was provided with a heater. The T-200-MD3W (1500 mm) from Geonor (Geonor, Oslo, Norway) was tested at Marshall, Bratt's Lake, Weissfluhjoch, and Caribou Creek, and was provided with a heater.
the Pluvio² at CARE using the manufacturer-provided heater and the Geonor at Marshall using the US Climate Reference Network heater (described in NOAA Technical Note NCDC No. USCRN-04-01). The Belfort double-Alter shield, which has the same sized footprint as the standard double-Alter shield, but with longer slats (46 cm long for the inner shield, and 61 cm long for the outer shield) that do not taper like those of the double-Alter and a lower porosity (30% vs the standard double-Alter porosity of 50%), was also tested at both CARE and Marshall. Another important distinction is that the standard double-Alter slats rotate freely, while the Belfort double-Alter slats employ springs to limit their travel to within 45° of the vertical.
Like the standard double-Alter, the Belfort double-Alter was tested with a Pluvio² at CARE and a 600 mm Geonor T-200B3 at Marshall. The small DFIR (SDFIR) shield, which is 2/3 the size of the standard DFIR shield and was designed to be more easily constructed out of commonly-sized North American lumber, was tested only at the Marshall site. Like the standard DFIR shield, the SDFIR configuration comprises three concentric wind shields. The wooden laths on the two outermost concentric shields were 1.2 m long, and the diameters of the two outer shields were 8.0 m and 2.6 m. The height of the inner wooden shield was 10 cm lower than the outer shield. A standard single-Alter shield, mounted at the same height as the gauge inlet and 10 cm lower than the inner wooden shield, was mounted around the gauge. Table 1 summarizes the different gauges and shields that were tested, and includes some statistics describing the available measurements.

Wind speed and direction
Wind speed measurements from each site were used to create 30-minute-average wind speeds. Because transfer functions developed from both the gauge height and the 10 m height wind speed were desired, and not every site included wind measurements at both heights, the available 30-min measurements and the logarithmic wind profile were used to determine either the gauge height or 10 m wind speed, when necessary. The methods used to do this are described in detail in Kochendorfer et al. (2017a).
Quality control of the wind speed measurements included removal of wind speeds equal to zero, removal of 'stuck' wind speeds at the Haukeliseter site, where the wind speed would occasionally remain unchanged for several hours at a time, and removal of 10 m height wind speeds at the CARE site that were less than the gauge height wind speed. At the Marshall site, a composite 10 m height wind speed was created using two separate 10 m height wind speed sensors, as the sensors were observed to shadow each other from either due north or due south. All of these steps are described in detail in Kochendorfer et al. (2017a). For the TRwS 405 precipitation gauge at Haukeliseter, an additional screening for wind direction was performed based on the gauge's position relative to the DFAR; precipitation measurements with wind directions between 115° and 140° were excluded from the analysis due to wind shielding by the DFAR.
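The height conversion described above can be illustrated with a neutral logarithmic wind profile. The following is a minimal sketch only: the function name, the roughness length `z0`, and the displacement height `d` are illustrative assumptions, and the values actually used in WMO-SPICE should be taken from Kochendorfer et al. (2017a).

```python
import math


def log_profile_wind(u_known, z_known, z_target, z0=0.01, d=0.4):
    """Scale a wind speed from height z_known to z_target (both in m).

    Assumes a neutral logarithmic profile. z0 (roughness length, m) and
    d (displacement height, m) are ASSUMED example values, not values
    stated in this text.
    """
    if z_known - d <= z0 or z_target - d <= z0:
        raise ValueError("heights must lie above d + z0")
    return u_known * math.log((z_target - d) / z0) / math.log((z_known - d) / z0)


# Hypothetical example: 4.0 m/s measured at a 2 m gauge height,
# extrapolated to the 10 m height.
u_10m = log_profile_wind(4.0, 2.0, 10.0)
# The same function converts back from 10 m to gauge height.
u_gauge = log_profile_wind(u_10m, 10.0, 2.0)
```

Because the ratio of logarithms is simply inverted when the two heights are swapped, one function serves both directions of the conversion.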

Quality Control and Selection of 30-min Periods of Precipitation
The methods applied to the available 6-s and 1-min data include: a range filter, to remove values that were above or below realistic limits, or outside of the operational limits for a given gauge; a 'jump' filter, to remove sudden changes in accumulation exceeding a specified threshold; and a Gaussian filter, to remove high-frequency noise. For use in developing transfer functions, the resultant 1-min (all 6-s data were aggregated to 1-min), quality-controlled measurements were then used to create 30-min datasets that included only periods of precipitation. To ensure that precipitation was occurring, a minimum threshold of 0.25 mm of precipitation as measured by the DFAR was used. In addition, based on independent optical precipitation detector measurements, precipitation had to occur for at least 60% of every 30-min period (18 min). Methods created within the WMO-SPICE project to smooth and quality control precipitation measurements are also detailed elsewhere (Kochendorfer et al., 2017a; Reverdin, 2016).
Following Kochendorfer et al. (2017a), a minimum precipitation threshold was identified for every gauge or shield under evaluation to help create an unbiased pool of measurements available for analysis. All precipitation measurements below the minimum threshold determined for each specific gauge/wind shield configuration were excluded from the analysis. The minimum precipitation thresholds were calculated by multiplying the minimum DFAR precipitation of 0.25 mm by the median catch efficiency of the gauge under test, using only solid precipitation measurements (mean T_air < -2 °C) with high winds (5 m s⁻¹ < U_10m < 9 m s⁻¹). Also following Kochendorfer et al. (2017a), a maximum catch efficiency threshold was calculated as 1.0 plus three times the standard deviation of the catch efficiency of the gauge under test, and all measurements exceeding the relevant maximum catch efficiency threshold were excluded from the analysis.
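The event selection and threshold rules above can be sketched as follows. This is a simplified illustration, not the WMO-SPICE processing code: the function names are hypothetical, the standard deviation is assumed to be the sample standard deviation taken over all events for the gauge, and the comparisons at the exact limits (for example, whether 0.25 mm itself passes) are assumptions.

```python
import statistics

DFAR_MIN_MM = 0.25        # minimum DFAR accumulation per 30-min period
MIN_PRECIP_PERCENT = 60   # share of the period with detected precipitation


def is_precip_event(dfar_mm, precip_minutes, period_minutes=30):
    """True when a 30-min period qualifies as a precipitation event."""
    long_enough = precip_minutes * 100 >= MIN_PRECIP_PERCENT * period_minutes
    return dfar_mm >= DFAR_MIN_MM and long_enough


def gauge_thresholds(catch_eff, t_air, u_10m):
    """Return (min_precip_mm, max_catch_eff) for one gauge/shield type.

    catch_eff, t_air (deg C) and u_10m (m/s) are equal-length sequences
    of 30-min event values.
    """
    # Median catch efficiency for solid precipitation in high winds.
    high_wind_solid = [
        ce for ce, t, u in zip(catch_eff, t_air, u_10m)
        if t < -2.0 and 5.0 < u < 9.0
    ]
    min_precip_mm = DFAR_MIN_MM * statistics.median(high_wind_solid)
    # ASSUMPTION: sample standard deviation over all events.
    max_catch_eff = 1.0 + 3.0 * statistics.stdev(catch_eff)
    return min_precip_mm, max_catch_eff


# Hypothetical values, for illustration only.
min_mm, max_ce = gauge_thresholds(
    catch_eff=[0.4, 0.5, 0.6, 0.9],
    t_air=[-5.0, -5.0, -5.0, 1.0],
    u_10m=[6.0, 7.0, 8.0, 2.0],
)
```

Scaling the 0.25 mm reference threshold by the high-wind catch efficiency keeps low-catch events from being preferentially discarded, which is the motivation given above for an unbiased measurement pool.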

Transfer Function Models
The Kochendorfer et al. (2017a) Equations 3 (hereafter KOC-Eq. 3) and 4 (hereafter KOC-Eq. 4) were fit to the resultant 30-minute weighing gauge measurements. KOC-Eq. 3 was fit as a function of wind speed (U) and air temperature (T_air), while KOC-Eq. 4 was fit separately to solid and mixed precipitation measurements as a function of wind speed only. In the latter case, precipitation type was determined using T_air, with solid precipitation defined as T_air < -2 °C, and mixed defined as 2 °C ≥ T_air ≥ -2 °C. For some of the gauges examined here, KOC-Eq. 4 unrealistically over-predicted catch efficiency at low wind speeds when insufficiently constrained by the available measurements, and in these cases a more constrained function was used to describe realistic corrections for sparser or noisier results, especially for gauges with fewer low wind speed measurements:
$$CE = a\,e^{-bU} \qquad (1)$$
where CE is the catch efficiency, U is wind speed, and a and b are coefficients fit to the data. Following Kochendorfer et al. (2017a), both gauge height wind speed corrections and 10 m height wind speed corrections were developed for all of the transfer functions tested.
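For orientation, the three transfer function forms can be written as short functions. This sketch rests on stated assumptions: the functional forms of KOC-Eq. 3 and KOC-Eq. 4 are not given in this text and are written here as we understand them from Kochendorfer et al. (2017a), so they should be checked against that paper; Eq. 1 is taken to be the two-coefficient exponential in U with coefficients a and b; and no fitted coefficients are implied.

```python
import math


def koc_eq3(u, t_air, a, b, c):
    """Catch efficiency from wind speed u (m/s) and air temperature (deg C).

    ASSUMED form: CE = exp(-a * U * (1 - atan(b * T_air) + c)).
    """
    return math.exp(-a * u * (1.0 - math.atan(b * t_air) + c))


def koc_eq4(u, a, b, c):
    """ASSUMED form: CE = a * exp(-b * U) + c, fit per precipitation type."""
    return a * math.exp(-b * u) + c


def eq1(u, a, b):
    """More constrained two-coefficient form: CE = a * exp(-b * U)."""
    return a * math.exp(-b * u)


def precip_class(t_air):
    """Classify precipitation type from mean air temperature (deg C)."""
    if t_air < -2.0:
        return "solid"
    if t_air <= 2.0:
        return "mixed"
    return "liquid"
```

With no additive offset, Eq. 1 cannot exceed a at U = 0, which is one way to see why it is less prone to over-predicting catch efficiency at low wind speeds when low wind data are sparse.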

Maximum Wind Speed Threshold
For the application of transfer functions, a maximum wind speed threshold (U_thresh) above which the transfer function should not be applied was determined based on a visual assessment of the KOC-Eq. 3 transfer function fit to the available measurements. This was done by viewing the catch efficiency function of wind speed and air temperature superimposed over the actual measurements, and identifying the wind speed above which all temperature ranges below 2 °C were not generally well represented by the available measurements. The same threshold was applied to the Eq. 1 and the KOC-Eq. 4 transfer functions as well. In practice, when the wind speed is above the maximum wind speed threshold, the wind speed should be forced down to the maximum wind speed threshold to adjust the precipitation. A diagram describing the effects of the maximum wind speed threshold on an example transfer function is shown in Figure 3, using the unshielded KOC-Eq. 4 transfer function developed in Kochendorfer et al. (2017a).
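Applying the threshold amounts to clamping the wind speed before the catch efficiency is evaluated. The sketch below assumes the usual convention that adjusted precipitation equals measured precipitation divided by the predicted catch efficiency; the function names and the coefficients in `example_ce` are hypothetical and chosen only for illustration.

```python
import math


def adjust_precip(measured_mm, u, ce_func, u_thresh):
    """Adjust a 30-min accumulation for wind-induced undercatch.

    ce_func maps wind speed (m/s) to catch efficiency. Wind speeds above
    u_thresh are forced down to u_thresh before ce_func is evaluated.
    """
    u_eff = min(u, u_thresh)
    return measured_mm / ce_func(u_eff)


def example_ce(u):
    """Hypothetical transfer function; coefficients are placeholders."""
    return 0.7 * math.exp(-0.2 * u) + 0.3


at_threshold = adjust_precip(1.0, 8.0, example_ce, u_thresh=8.0)
above_threshold = adjust_precip(1.0, 12.0, example_ce, u_thresh=8.0)
```

Here `above_threshold` equals `at_threshold`, so the adjustment is never extrapolated beyond the wind speeds represented in the measurements used for the fit.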

Testing of Transfer Functions
When measurements from more than one site were available for a specific gauge or shield combination, all of the available measurements were used to create a common transfer function, and the transfer function was then tested on data from each site independently to determine the magnitude of site biases and the appropriateness of the transfer function for each individual site. For gauges that were only tested at one site, a 10-fold cross validation was relied upon to maintain some independence between the measurements used to produce and test the transfer functions. The 10-fold cross validation was performed in 10 separate iterations, using 90% of the measurements to determine the transfer function and the remaining 10% to test the transfer function. The resulting error statistics were based on the average of all ten iterations.
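The 10-fold cross validation can be sketched as below. Several simplifications are assumed: only RMSE is computed, the data are synthetic, and the two-coefficient exponential is fit by ordinary least squares on ln(CE), which is a substitute chosen to keep the example dependency-free and is not necessarily the fitting method used in the study.

```python
import math
import random


def fit_eq1(u, ce):
    """Fit CE = a*exp(-b*U) via a least-squares line through (U, ln CE)."""
    n = len(u)
    y = [math.log(v) for v in ce]
    mean_u, mean_y = sum(u) / n, sum(y) / n
    sxx = sum((x - mean_u) ** 2 for x in u)
    sxy = sum((x - mean_u) * (v - mean_y) for x, v in zip(u, y))
    slope = sxy / sxx
    return math.exp(mean_y - slope * mean_u), -slope


def ten_fold_rmse(u, measured, reference, n_folds=10, seed=0):
    """Average RMSE (mm) of adjusted precipitation over held-out folds."""
    idx = list(range(len(u)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    fold_rmse = []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]   # ~90% of events
        a, b = fit_eq1([u[i] for i in train],
                       [measured[i] / reference[i] for i in train])
        sq = [(measured[i] / (a * math.exp(-b * u[i])) - reference[i]) ** 2
              for i in test]                            # ~10% of events
        fold_rmse.append(math.sqrt(sum(sq) / len(sq)))
    return sum(fold_rmse) / n_folds


# Synthetic, purely illustrative events (not WMO-SPICE data).
rng = random.Random(1)
u = [rng.uniform(0.5, 8.0) for _ in range(200)]
reference = [rng.uniform(0.25, 2.0) for _ in range(200)]
measured = [0.95 * math.exp(-0.15 * w) * math.exp(rng.gauss(0.0, 0.05)) * r
            for w, r in zip(u, reference)]

cv_rmse = ten_fold_rmse(u, measured, reference)
raw_rmse = math.sqrt(sum((m - r) ** 2 for m, r in zip(measured, reference)) / 200)
```

Each event is held out exactly once, so every error contributing to the averaged statistic comes from a transfer function that was fit without that event.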

Errors in the adjusted measurements were estimated by applying the appropriate transfer function and comparing the results to the corresponding DFAR measurements. The errors were then used to calculate the root mean square error (RMSE), the mean bias, the correlation coefficient (r), and the percentage of 30-min events with errors less than 0.1 mm (PE_0.1mm). These statistics were estimated for the KOC-Eq. 3 transfer functions using all of the available precipitation measurements. For the Eq. 1 and KOC-Eq. 4 transfer functions, the error statistics were estimated by separating the datasets using the mean air temperature into liquid (T_air > 2 °C), mixed (2 °C ≥ T_air ≥ -2 °C), and solid (T_air < -2 °C) precipitation, correcting the mixed and solid precipitation using the appropriate transfer functions, combining these results with the uncorrected liquid precipitation measurements, and comparing the results to the corresponding DFAR measurements. The liquid precipitation measurements were not corrected because warm-season precipitation measurements were not included in the WMO-SPICE dataset, and the resultant liquid precipitation catch efficiencies were not significantly different from one. However, some liquid precipitation measurements were necessary for the successful creation of KOC-Eq. 3 type transfer functions, so the available liquid precipitation measurements were included both in the derivation and the evaluation of the KOC-Eq. 3 type transfer functions.
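The four error statistics can be computed as in the following sketch. It assumes that errors are defined as adjusted minus DFAR (so that a negative bias indicates under-correction), that r is the Pearson correlation between adjusted and DFAR accumulations, and that "less than 0.1 mm" refers to the absolute error; the function name and the example values are hypothetical.

```python
import math


def error_stats(adjusted_mm, dfar_mm, tol_mm=0.1):
    """RMSE, mean bias, Pearson r, and percentage of events within tol_mm."""
    n = len(adjusted_mm)
    errors = [a - ref for a, ref in zip(adjusted_mm, dfar_mm)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    bias = sum(errors) / n
    mean_a = sum(adjusted_mm) / n
    mean_r = sum(dfar_mm) / n
    cov = sum((a - mean_a) * (ref - mean_r)
              for a, ref in zip(adjusted_mm, dfar_mm))
    var_a = sum((a - mean_a) ** 2 for a in adjusted_mm)
    var_r = sum((ref - mean_r) ** 2 for ref in dfar_mm)
    r = cov / math.sqrt(var_a * var_r)
    pe = 100.0 * sum(abs(e) < tol_mm for e in errors) / n
    return {"rmse": rmse, "bias": bias, "r": r, "pe": pe}


# Three hypothetical 30-min events (mm).
stats = error_stats([1.0, 2.0, 3.05], [1.0, 2.2, 3.0])
```

For the wind-speed-only transfer functions, the `adjusted_mm` list would hold corrected solid and mixed events together with uncorrected liquid events, as described above.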

Evaluation of Independent Transfer Functions
In addition to developing new adjustments for all of the weighing gauges tested within WMO-SPICE, the WMO-SPICE weighing gauge measurements were also used to evaluate other independently-derived transfer functions that are available. using eight different testbeds, and transfer functions determined from pre-SPICE measurements recorded at the Marshall testbed (Kochendorfer et al., 2017b).
Single-Alter and unshielded transfer functions developed previously by Kochendorfer et al. (2017a) from WMO-SPICE host-provided weighing gauges (either Geonor T-200B3 or OTT Pluvio²) were tested on measurements from all of the manufacturer-provided unshielded and single-Alter shielded gauges evaluated within WMO-SPICE. The hypothesis behind this testing is that the response of a gauge to wind speed and air temperature is more sensitive to wind-shielding or a lack thereof than to the gauge type. Although the transfer functions from Kochendorfer et al. (2017a) were not developed for these specific gauges, they include measurements from eight different sites, and therefore may be more robust and universally applicable than transfer functions developed from measurements at the limited number of sites where a specific manufacturer-provided gauge was tested. A robust transfer function should arguably be developed from measurements representing a wide variety of precipitation types and wind speeds, as any transfer function is only valid for the range of conditions represented during its development. In addition, as shown in Kochendorfer et al. (2017a), significant site biases exist, indicating that the use of data from several sites for the creation of a transfer function is preferable to the use of data from just one or two sites. Because of this, it is possible that a generic transfer function developed using data from several sites may be more universally applicable to a manufacturer-provided weighing gauge than the gauge-specific transfer function. For the double-Alter, Belfort double-Alter, and the SDFIR shielded gauges, which were not tested as broadly within WMO-SPICE as the single-Alter shielded and unshielded gauges, the WMO-SPICE measurements recorded at Marshall and CARE were used to evaluate transfer functions created by Kochendorfer et al. (2017b)
adjustments were calculated from differences between the transfer function catch efficiencies and the measured catch efficiencies. In addition, RMSE values of the adjusted catch efficiencies were estimated by applying the appropriate transfer function to the measurements and calculating the resultant error in the catch efficiency; ideally the adjusted catch efficiency should be equal to one, so the difference between the adjusted catch efficiency and one was used to quantify errors in the adjusted catch efficiency. This evaluation was limited to solid precipitation (T_air < -2 °C) measurements for ease of presentation, and the KOC-Eq. 3 type transfer functions were used for all of the different gauge configurations examined.
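The RMSE of the adjusted catch efficiency can be illustrated as follows. The sketch assumes that the adjusted catch efficiency is the adjusted accumulation divided by the DFAR accumulation, which is equivalent to the measured catch efficiency divided by the catch efficiency predicted by the transfer function; the function name and the example numbers are hypothetical.

```python
import math


def adjusted_ce_rmse(measured_mm, dfar_mm, ce_predicted):
    """RMSE of (adjusted catch efficiency - 1) over a set of events."""
    errors = [
        (m / ce) / ref - 1.0
        for m, ref, ce in zip(measured_mm, dfar_mm, ce_predicted)
    ]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Two hypothetical solid-precipitation events: the first is adjusted
# perfectly, the second is left 20% low by the transfer function.
rmse_ce = adjusted_ce_rmse([0.5, 0.8], [1.0, 1.0], [0.5, 1.0])
```

Expressing the error relative to one, rather than in millimetres, makes gauge configurations comparable across events with different accumulations.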

Results and Discussion
Using the methods described above, transfer functions were created for the weighing gauges and wind shields included in WMO-SPICE. For all of the gauges and shields tested, KOC-Eq. 3 was fit to the catch efficiency measurements using the measured wind speed and air temperature. In addition, either the KOC-Eq. 4 or the Eq. 1 type adjustment was fit to the mixed (2 °C ≥ T_air ≥ -2 °C) and solid (T_air < -2 °C) precipitation as a function of wind speed. These transfer functions were created for both the gauge height wind speed and the 10 m height wind speed. Statistics describing the transfer function errors were calculated from the differences between the adjusted precipitation measurements and the DFAR precipitation measurements. In addition, transfer functions available from other studies were tested on the precipitation measurements, and errors in the uncorrected measurements were also described.

Sutron TPG
Two Sutron TPG gauges were tested in WMO-SPICE, and both were installed at the Marshall testbed. One was unshielded and the other was provided with a single-Alter shield. Separate transfer functions were developed using measurements from each configuration. In addition, the more universal Kochendorfer et al. (2017a) transfer functions for weighing gauges in unshielded and single-Alter configurations were tested using measurement data from the Sutron gauge in the corresponding configuration.

Unshielded Sutron TPG
Transfer functions were developed from the 30-min measurements as a function of wind speed for both mixed and solid precipitation (KOC-Eq. 4), and also as a function of both wind speed and air temperature (KOC-Eq. 3). Because this gauge type was only available at one site, the RMSE (Fig. 4a), bias (Fig. 4b), correlation coefficient (Fig. 4c), and PE_0.1mm (Fig. 4d) were estimated using 10-fold cross-validation (Section 2.6). One of the 'universal' unshielded transfer functions developed by Kochendorfer et al. (2017a) was also tested on these independent measurements, and performed as well as the corrections developed specifically for this gauge-shield combination (Fig. 4). As a result of this, our recommendation is to use the more 'universal' corrections from Kochendorfer et al. (2017a) for the unshielded Sutron TPG; the resultant transfer function coefficients developed for this specific gauge are therefore not included here.

Single-Alter Sutron TPG
The different transfer functions developed and tested for the single-Alter (SA) shielded Sutron TPG performed similarly (Fig. 5). As with the unshielded Sutron TPG, the custom transfer function developed for this gauge did not perform significantly better than the 'universal' transfer function from Kochendorfer et al. (2017a). Because the 'universal' functions from Kochendorfer et al. (2017a) were developed from single-Alter gauges at eight separate sites and included more measurements and a wider range of wind speeds and precipitation types, they are more widely applicable than the transfer functions developed specifically for the Sutron gauge with the single-Alter shield using measurements from a single site.

Meteoservis MRW500
Meteoservis MRW500 weighing gauges were tested at the Bratt's Lake and Marshall testbeds, with unshielded and shielded gauges tested at each site. Error statistics were calculated for different forms of the transfer functions. The 'universal' unshielded transfer function was applied to the unshielded gauge measurements. Although the shielded MRW500 weighing gauge was provided with a custom Tretyakov-type shield, which was smaller than a standard single-Alter shield and constructed out of metal slats mounted at a fixed angle (Fig. 1c), the 'universal' single-Alter shielded adjustment was tested on this configuration as previously-derived adjustments for an automated gauge within a comparable shield were available.

Unshielded MRW500
The resultant error statistics were similar for all the transfer functions developed from these data, with no significant differences between the KOC-Eq. 3 and KOC-Eq. 4 type adjustments for the gauge height wind speed and the 10 m wind speed (Fig. 6). All of the error statistics also showed significant improvement in the corrected measurements compared to the uncorrected measurements (Fig. 6). The universal unshielded function derived in Kochendorfer et al. (2017a) performed as well as the unshielded functions developed specifically for this gauge. We therefore recommend using the universal transfer function derived in Kochendorfer et al. (2017a), rather than the transfer function fit to these specific gauge measurements, as there appears to be no significant advantage to using the transfer function coefficients derived specifically for this type of unshielded weighing gauge.

Shielded MRW500
All of the different transfer functions developed and tested using the Marshall and Bratt's Lake measurements effectively reduced the RMSE and bias values, and increased the correlation coefficients and PE_0.1mm (Fig. 7). In addition, the 'universal' SA transfer function was tested with this gauge. The resultant RMSE values were similar to the RMSEs for the transfer function developed by Kochendorfer et al. (2017a) (Fig. 7a), and the PE_0.1mm values were actually improved by the use of the SA transfer function (Fig. 7d). However, the negative bias resulting from the application of the universal single-Alter correction indicates that this gauge was generally under-corrected by the single-Alter transfer function, particularly at Bratt's Lake (Fig. 7c). For this reason, the new transfer function coefficients provided in Table 2 should be used to correct this gauge-shield combination, rather than the 'universal' single-Alter correction from Kochendorfer et al. (2017a).
For the exponential wind speed transfer functions developed separately for solid and mixed precipitation, Eq. 1 was developed and used for the shielded MRW500 because KOC-Eq. 4 resulted in unreasonably high transfer function results as the wind speed approached 0 m s⁻¹. This result is probably more closely linked with the scarcity of low wind speed events from these two sites and random errors in the low wind speed catch efficiencies than with the gauge configuration. In addition, although an exponential fit was used for these data because it is more realistic at high wind speeds (where a linear fit would predict negative catch efficiencies), for the data available, the shielded MRW500 catch efficiency responded quite linearly to wind speed. Unfortunately, there were insufficient high-wind data available from the Bratt's Lake and Marshall sites to evaluate this shield at higher winds, where the catch efficiency would presumably asymptote at a minimum value that was greater than zero.

Unshielded MPS TRwS 405
The MPS TRwS 405 gauge relies on a strain gauge to measure the mass of water accumulated within it. It was provided without a shield, and was tested at the Haukeliseter and Marshall testbeds. Transfer functions were developed specifically for this gauge by combining the Marshall and Haukeliseter measurements, and the efficacy of the transfer functions was tested by applying the corrections to the measurements and calculating statistics using the resultant errors at each site. The RMSE, bias, correlation coefficients, and PE_0.1mm values (Fig. 8)  unshielded correction performed as well as the unshielded correction fit specifically to this gauge, and because of this the gauge-specific transfer function coefficients developed here are neither recommended nor shown.

Single-Alter shielded Geonor T-200-MD3W (1500 mm)
The Geonor T-200-MD3W is based on the same design as the 600 mm and 1000 mm Geonor T-200B3 3-wire, vibrating-wire gauges, but it has a taller cover, taller bucket with increased capacity (1500 mm), and different vibrating wire transducers. This gauge was tested at the Marshall, Bratt's Lake, Weissfluhjoch, and Caribou Creek sites. For reasons that are not well understood, the catch efficiency of the single-Alter shielded Geonor 1500 mm gauges at Weissfluhjoch and Caribou Creek did not decrease significantly with wind speed. The 1500 mm Geonor at Caribou Creek was installed near low trees, which may have sheltered the gauge from the wind from some directions; however, a lack of wind direction measurements available in the WMO-SPICE event dataset from this site prohibits confirmation of this hypothesis. The Weissfluhjoch site has previously been demonstrated to be less sensitive to wind than other sites (Kochendorfer et al., 2017a), for reasons that are also difficult to understand or confirm. The relative insensitivity of the catch efficiency to wind speed at these two sites is apparent from the fact that even using the custom 1500 mm transfer functions to correct the measurements, there was no improvement in the RMSE at either of these sites (Fig. 9a), and the bias (Fig. 9b)  the 1500 mm Geonor transfer functions (derived from the measurement data from this specific gauge configuration) overcorrected the measurements at Caribou Creek and Weissfluhjoch. Using the universal transfer function, the 1500 mm Geonor measurements from Weissfluhjoch and Caribou Creek were further over-corrected (Fig. 9b).
The significant differences between the 'universal' transfer function and the transfer functions fit to the 1500 mm Geonor measurements are attributed mainly to differences in the available measurements used to create the universal transfer function and the 1500 mm transfer functions; 35% of the available 1500 mm Geonor measurements were recorded at Weissfluhjoch, where the catch efficiency for all the gauges did not drop off with wind speed to the same degree as at most of the other sites (Kochendorfer et al., 2017a). When the number of Weissfluhjoch measurements contributing to the 1500 mm transfer functions was artificially reduced, for example, the resultant 1500 mm transfer function was similar to the universal single-Alter adjustment. Because there is no obvious physical explanation for why the Geonor 1500 mm gauge would have a higher catch efficiency than the 600 mm or 1000 mm Geonor gauges (the collecting area is the same for each configuration), and the reasons for the relatively poor performance of the universal function may be due more to the specific population of 1500 mm Geonor measurements available within this intercomparison, the universal transfer function is still recommended over the custom transfer functions derived from the 1500 mm Geonor measurements. At the Marshall site, where the catch efficiency decreased with wind speed as expected, the universal transfer function performed better than the gauge-specific transfer functions, and at Caribou Creek, the differences between the custom 1500 mm adjustment and the universal single-Alter adjustment were small.
In general, the different error statistics generated from the 1500 mm Geonor measurements indicate that this gauge was subject to more noise than the host-provided gauges used to develop the universal single-Alter transfer functions in Kochendorfer et al. (2017a). For example, at Marshall, the 1500 mm single-Alter Geonor RMSE values were about 0.25 mm and the PE_0.1mm values were about 60%, while for the 600 mm Geonor at this same site, the RMSE values were about 0.15 mm, and the PE_0.1mm values were about 70%.

Double-Alter
The double-Alter shield was tested at both CARE and Marshall; the CARE site had an OTT Pluvio 2 in a double-Alter shield, and the Marshall site had a 600 mm Geonor T-200B3 in a double-Alter shield. The pre-SPICE double-Alter transfer function (Kochendorfer et al., 2017b) performed about as well as the WMO-SPICE transfer function, and the different types of transfer functions developed from the WMO-SPICE measurements all performed similarly (Fig. 10). A total of 1392 measurements were available from the pre-SPICE Marshall measurements (Kochendorfer et al., 2017b), and only 723 measurements were available from the WMO-SPICE measurements (Table 1). However, the new WMO-SPICE correction is arguably more defensible for application to all double-Alter measurements than the pre-SPICE transfer function, because it was developed using measurements from two sites. Table 3  season, liquid precipitation measurements included in the SPICE datasets, use of the KOC-Eq. 3 transfer functions presented here is not recommended when T air exceeds 5 ˚C, as they produce unrealistically high warm-temperature precipitation catch efficiencies. If a KOC-Eq. 3 type function is needed that is also applicable to warm-season measurements, we recommend using the pre-SPICE function, which was demonstrated in the testing performed here to be quite similar to the WMO-SPICE functions (Fig. 10).

Belfort double-Alter
The Belfort double-Alter shield, which has the same footprint as the standard double-Alter shield but longer slats and a lower porosity of about 30% (compared with about 50% for the standard double-Alter), was more effective at reducing undercatch than the standard double-Alter. This is demonstrated by the generally small improvements of the corrected measurements over the uncorrected measurements (Fig. 11a), and also by the near-zero uncorrected biases for the gauges at both Marshall and CARE (Fig. 11b). These measurements, recorded at two separate sites, confirm the efficacy of the Belfort double-Alter shield documented by Kochendorfer et al. (2017b) using measurements from a single site. In terms of data availability, 919 30-min measurements were included in the present WMO-SPICE measurements (Table 1), and 1204 30-min measurements were available for the pre-SPICE Marshall transfer function development. Although the two datasets resulted in similar transfer functions (as evidenced by the equivalent performance of the pre-SPICE transfer function when tested on these new measurements in Fig. 11), we recommend the transfer functions determined from the WMO-SPICE measurements in this work, because they include measurements from two sites, and are therefore more broadly applicable.
Like the double-Alter WMO-SPICE transfer function, the KOC-Eq. 3 Belfort double-Alter transfer functions were developed from measurements that did not include many liquid precipitation events, but in this case, the resultant transfer functions were more realistic at warm temperatures, and can therefore be recommended for use in all conditions. The associated transfer function coefficients are provided in Table 4.

Small DFIR
Tested only at the Marshall testbed, the SDFIR was the largest wind shield tested, and the uncorrected and corrected SDFIR measurements were associated with the lowest RMSE and bias values (Fig. 12a and 12b) and the highest correlation coefficient and PE 0.1 mm values (Fig. 12c and 12d)  function coefficients determined from the WMO-SPICE measurements are not included here, as the pre-SPICE transfer function was developed using measurements from the same gauge/shield, at the same site, over a much longer period.
However, because Kochendorfer et al. (2017b) did not include KOC-Eq. 4 coefficients, the KOC-Eq. 4 transfer functions determined from the WMO-SPICE measurements, which were unaffected by the lack of warm-temperature precipitation in the WMO-SPICE measurements, are included in Table 5.
In general, the necessity of transfer function adjustments for SDFIR-shielded measurements is disputable, as the corrected measurements were not significantly better than the uncorrected measurements (Fig. 12), with only a small improvement in the mean bias (Fig. 12b). In addition, the different correction types all performed similarly. These corrected and uncorrected SDFIR-shielded measurements are still interesting, however, because they indicate the magnitude of the errors that remain when two well-shielded gauges are compared. They are a good representation of the current limits in accuracy for precipitation measurements recorded using two different gauges at the same site, both of which are well-shielded. The inferences that can be drawn from such well-shielded measurements are further emphasized in the following section comparing the different shields and adjustments.

Synthesis
Examples of all of the recommended KOC-Eq. 3 type adjustments for solid precipitation are included in Fig. 13a, with the three-dimensional transfer functions plotted against the gauge height wind speed at T air = -5 ˚C. The T air value of -5 ˚C was selected because it was fairly representative of the solid precipitation events included in this analysis, which had a median T air of -5.2 ˚C. The unshielded and single-Alter 'universal' multi-site KOC-Eq. 3 transfer functions from Kochendorfer et al. (2017a) are also included, as these were generally recommended over the gauge-specific unshielded or single-Alter transfer functions developed here. These results demonstrate the relative magnitudes of the adjustments for different wind shields, with the more effective shields (SDFIR, Belfort double-Alter) resulting in much higher adjusted catch efficiencies than less effective shields (single-Alter, MRW500 shield) or unshielded gauges.
The uncertainty in each transfer function was estimated for different wind speeds. Errors in the resultant transfer functions were calculated from differences between the measured catch efficiency and the adjustment (or transfer function) fit to the appropriate catch efficiencies. The resultant RMSE values were relatively insensitive to wind speed (Fig. 13b). This is significant, because catch efficiency was presumably less affected by the interaction of snow crystals and wind at low wind speeds, so it suggests that the uncertainty in these adjustments may have other, more important causes, such as random variability in the precipitation gauge measurements and the natural spatial variability in precipitation.
In addition, errors in the adjusted catch efficiencies were calculated by applying the appropriate adjustments to the measurements and calculating RMSE values from the resultant catch efficiencies (Fig. 13c). The relationship between the 13a and 13c. Measurements that required larger adjustments experienced larger errors in the adjusted catch efficiencies. This is due, at least in part, to basic arithmetic; for example, a precipitation measurement associated with a predicted catch efficiency of 50% would be doubled by adjustment, and any errors in the measured catch efficiency would likewise be doubled by the adjustment. At a given wind speed, the errors in the adjusted catch efficiencies (Fig. 13c) are approximately equal to the errors in the measured catch efficiency (Fig. 13b) divided by the catch efficiency predicted by the transfer function (Fig. 13a). In other words, the errors in the measured catch efficiency were amplified by the adjustments.
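The arithmetic described above can be sketched in a few lines of Python. This is an illustration only: the function names and the numerical values are hypothetical, and the sketch assumes that an adjustment consists of dividing the measurement by the catch efficiency predicted by the transfer function.

```python
def adjust_measurement(measured_mm, predicted_ce):
    """Divide a gauge accumulation (mm) by the predicted catch efficiency."""
    if predicted_ce <= 0.0:
        raise ValueError("predicted catch efficiency must be positive")
    return measured_mm / predicted_ce


def adjusted_ce_error(ce_error, predicted_ce):
    """Approximate error in the adjusted catch efficiency.

    The error in the measured catch efficiency is scaled by the same
    factor (1 / predicted_ce) that is applied to the measurement itself.
    """
    if predicted_ce <= 0.0:
        raise ValueError("predicted catch efficiency must be positive")
    return ce_error / predicted_ce


# Hypothetical case: the same catch efficiency error of 0.1 at three
# predicted catch efficiencies (no adjustment, doubling, and tripling).
for ce in (1.0, 0.5, 1.0 / 3.0):
    adjusted = adjust_measurement(1.0, ce)
    error = adjusted_ce_error(0.1, ce)
    print(f"CE = {ce:.2f}: 1.0 mm -> {adjusted:.1f} mm, error 0.1 -> {error:.2f}")
```

With a constant error in the measured catch efficiency, the error after adjustment grows as the predicted catch efficiency falls, which is consistent with the larger adjusted errors found for the less effective shields.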

Conclusions
New transfer functions were developed using precipitation measurements from both host- and manufacturer-provided WMO-SPICE weighing gauges, and tested alongside existing transfer functions. The resultant errors in corrected precipitation measurements were presented, and recommendations for the correction of different types of weighing gauges were made. These transfer functions were demonstrated to reduce the mean bias of weighing gauge measurements relative to the DFAR, and the remaining uncertainty in the corrected measurements was described using different statistics.
For the unshielded and single-Alter weighing gauges provided by different manufacturers for testing in WMO-SPICE, the multi-site transfer function developed by Kochendorfer et al. (2017a) typically worked as well as the gauge-specific transfer functions developed in this study. The more universal unshielded and single-Alter multi-site transfer functions from Kochendorfer et al. (2017a) are recommended for adjusting measurements from all the unshielded and single-Alter-shielded weighing gauges tested.

The low-porosity double-Alter shield developed by Belfort performed well relative to the DFAR, with an uncorrected bias of only -0.04 mm, or -5.4%. Comparison of results with weighing gauges in traditional single- and double-Alter shields indicated better performance for the Belfort double-Alter, suggesting that it is a viable, high-efficacy option for networks or sites that do not have the resources to build, site, and maintain a large wooden shield like the SDFIR or DFIR. The performance of the Belfort double-Alter shield also indicates that future improvements in shield design may yet be possible, especially considering the significant resources required to site, install, and maintain large wooden shields like the SDFIR or DFIR.
Precipitation measurements from weighing gauges in higher-efficacy shields, such as the SDFIR and the Belfort double-Alter, showed not only much smaller uncorrected biases relative to the corresponding reference configurations, but also smaller corrected RMSE and higher corrected PE 0.1 mm. Further, measurements from these gauge/shield configurations required less correction than the unshielded gauges tested, and the resultant errors estimated by comparing the corrected measurements to the DFAR measurements were also much smaller. The errors that remained after correcting the unshielded and single-Alter shielded measurements were much larger than the errors experienced by the more effectively shielded gauges. Upon closer inspection and bin-averaging by wind speed, it appears that the uncorrectable errors in less effectively shielded measurements are amplified by the adjustments required to remove the undercatch. At higher wind speeds, where such measurements require doubling or even tripling, the uncertainty in the measurements was also doubled or tripled, accordingly. This suggests that there is a limit to the amount of uncertainty that can be removed by such adjustments, and the transfer functions presented here may already be approaching this limit. These results also suggest that although adjusted unshielded and single-Alter shielded gauge measurements can be used to effectively measure the total amount of precipitation without a large bias, the only way to significantly reduce the uncertainty of the measurement is to shield the gauge more effectively using a shield such as the DFIR, SDFIR, or Belfort double-Alter.
Hydrol. Earth Syst. Sci. Discuss., https://doi.org/10.5194/hess-2017-228. Manuscript under review for journal Hydrol. Earth Syst. Sci. Discussion started: 12 June 2017. © Author(s) 2017. CC BY 3.0 License.

Acknowledgements
The authors thank Hagop Mouradian from Environment and Climate Change Canada for contributing the mapped site locations (Fig. 1). We thank the manufacturers that provided many of the sensors used to produce these results. We also thank the World Meteorological Organization for supporting this intercomparison.

Disclaimers
Many of the results presented in this work were obtained as part of the Solid Precipitation Intercomparison Experiment (SPICE) conducted on behalf of the World Meteorological Organization (WMO) Commission for Instruments and Methods of Observation (CIMO). The analysis and views described herein are those of the authors at this time, and do not necessarily represent the official outcome of WMO-SPICE. Mention of commercial companies or products is solely for the purposes of information and assessment within the scope of the present work, and does not constitute a commercial endorsement of any instrument or instrument manufacturer by the authors or the WMO.