Spatio-temporal assessment of WRF, TRMM and in situ precipitation data in a tropical mountain environment (Cordillera Blanca, Peru)

The estimation of precipitation over the broad range of scales of interest for climatologists, meteorologists and hydrologists is challenging at high altitudes in tropical regions, where the spatial variability of precipitation is high while in situ measurements remain scarce, largely due to operational constraints. Three different types of rainfall products – ground based (kriging interpolation), satellite derived (TRMM3B42), and atmospheric model outputs (WRF – Weather Research and Forecasting) – are compared for 1 hydrological year in order to retrieve rainfall patterns at timescales ranging from sub-daily to annual over a watershed of approximately 10 000 km² in Peru. An ensemble of three different spatial resolutions is considered for the comparison (27, 9 and 3 km), as well as a range of timescales (annual totals, daily rainfall patterns, diurnal cycle). WRF simulations largely overestimate the annual totals, especially at low spatial resolution, while correctly reproducing the diurnal cycle and locating the spots of heavy rainfall more realistically than either the ground-based kriging with external drift (KED) or the Tropical Rainfall Measuring Mission (TRMM) products. The main weakness of kriged products is the production of annual rainfall maxima over the summit rather than on the slopes, mainly due to a lack of in situ data above 3800 m a.s.l. This study also confirms that one limitation of TRMM is its poor performance over ice-covered areas, because ice on the ground scatters microwave energy in a similar way to rain or ice drops in the atmosphere. While all three products are able to correctly represent the spatial rainfall patterns at the annual scale, it is not surprising that none of them meets the challenge of representing both accumulated quantities of precipitation and frequency of occurrence at the short timescales (sub-daily and daily) required for glacio-hydrological studies in this region.
It is concluded that new methods should be used to merge various rainfall products so as to make the most of their respective strengths.

The tropics are thermally characterized by an annual temperature variation that is smaller than the diurnal cycle (e.g., Kaser, 1999; Baraer et al., 2012). This applies to the Cordillera Blanca, where homogeneous thermal conditions are observed throughout the year (Juen et al., 2007). For instance, at Querococha, located in the southern part of the Cordillera Blanca, mean monthly temperature variation is less than 1 °C (Kaser et al., 2003).
By contrast, there is a strong seasonality of precipitation, controlled by the upper air circulation, with easterly winds transporting moisture from the Amazon plain (Aceituno, 1987) and westerly flow causing dry conditions due to the Humboldt Current (Garreaud et al., 2003). This results in two distinct seasons: the wet season from October to April with an average of 80 % of the annual precipitation (Vuille et al., 2008a), and the dry season from May to August. The wet season corresponds to the South American Monsoon System (SAMS) (e.g., Vera et al., 2006; Garreaud, 2009; Marengo et al., 2012), bringing humidity far to the west. The dry season is associated with the North American Monsoon System, the Intertropical Convergence Zone (ITCZ) being located at its northernmost position. The inter-annual variability of rainfall is large and related to the fluctuations of the sea surface temperature (SST) of the North Atlantic and to the El Niño-Southern Oscillation (ENSO) (e.g., Espinoza Villar et al., 2009; Lavado Casimiro et al., 2012; Lavado Casimiro and Espinoza, 2014). According to Lavado Casimiro and Espinoza (2014), the Rio Santa catchment belongs to an area where positive precipitation anomalies are observed during strong El Niño as well as during strong La Niña events.
The rainfall climatology is also characterized by strong spatial gradients at all temporal scales. First of all, the main annual rainfall pattern between 5 and 30° S is the contrast between the dry and cold conditions on the Pacific coast, stretching to the western slopes of the Andes, and the warm, humid and rainy conditions prevailing on the eastern slopes (Garreaud, 2009). This results in high precipitation amounts on the windward slopes of the Andes in easterly flow situations (up to 6000 mm yr⁻¹) and much smaller precipitation amounts on the leeward side, even at high altitudes (under 530 mm yr⁻¹), between 5° N and 20° S (Espinoza Villar et al., 2009). Superimposed onto this large-scale spatial pattern, the influence of the topography becomes more and more important when considering smaller temporal scales, at which convective and orographic processes have a strong influence. Rainfall hotspots, heavy rainfall gradients over a few kilometers and flash floods (Young and Leon, 2009; Espinoza et al., 2015) are the most prominent hydro-meteorological patterns induced by the rough topography of the region.
Another issue arises from the high altitude, meaning that a significant amount of precipitation falls as snow above 4800 m a.s.l. This requires reliable measurement of both solid and liquid precipitation all year round, something that is far from guaranteed and that remains a major difficulty in mountain hydrology.
The estimation of precipitation over the broad range of scales of interest for climatologists, meteorologists and hydrologists is thus especially challenging in this region characterized by very uncommon geographical features. And yet socio-economic stakes are high as far as potentially drastic changes of the water cycle related to precipitation variability and long-term changes are concerned, affecting access to drinkable water in urban areas, the yields of agricultural projects and the operation of numerous hydroelectric power plants.
The driving question of this study is to identify and compare the precipitation data sets that can be used for properly characterizing the water balance over catchments of the region from sub-daily to yearly temporal scales. Both the accumulated quantities of precipitation and the frequency of occurrence have to be properly estimated if one is to compute coherent water budgets over this large range of temporal scales, an accomplishment that no single precipitation data set can claim to achieve on its own.
Each precipitation data set has its own strengths and weaknesses. Starting with ground data, their main shortcoming (beyond their key advantage of being the only direct measurement of rainfall) is a poor sampling of the spatial variability, which is especially large in mountainous regions (Scheel et al., 2011). This is compounded by the difficulty of installing and maintaining ground stations in a harsh environment, making whole areas very difficult to access (Salzmann et al., 2013; Schwarb et al., 2011). Rain gauges are thus most often available in the vicinity of villages, meaning that uninhabited areas are virtually not sampled, especially at high altitudes, where distinguishing between liquid and solid precipitation is a major issue.
For their part, satellite rainfall products provide the global coverage that is lacking for ground data sets. However, the early satellite rainfall products developed in the mid-1980s were solely based on infrared data, affecting their accuracy in the case of convective rainfall and, more generally, in the presence of a strong rainfall gradient. The most recent products now make use of various sources of information, blending infrared and microwave satellite data and often incorporating ground data, which makes them more effective in spotting the patches of intense rainfall. Nevertheless, there are still significant differences between the most commonly used satellite rainfall products, especially in the tropics and for orographically forced rainfall (Ward et al., 2011). This means that the ability of these satellite products to fulfill users' expectations must be scrutinized on a case-by-case basis. Note also that satellite products are rather weak in distinguishing between liquid and solid precipitation.
With a view to quantifying the spatial and temporal variability of water budgets over catchments, another possibility for providing the required rainfall component is to use the precipitation produced by climate models. This presents two main advantages: (i) the physical coherency of the various elements of the water budgets computed by these models and (ii) the possibility of studying the evolution of the water budgets in the future in a context of global warming. Note, however, that global climate models usually fail to properly simulate the regional processes and their spatial variability, especially for precipitation in mountainous areas, a shortcoming that is particularly critical in the Andes due to their complex topography (Giovannettone and Barros, 2009). To remedy these limitations, downscaling approaches based on the nesting of regional climate models (RCMs) into global models are frequently used. The performance of nested regional models depends on the study area, the spatial resolution and the parameterization used (Box and Bromwich, 2004), which means that their added value, as compared to the other sources of rainfall information, should also be considered on a case-by-case basis.

Study area
Draining an area of 11 930 km² located between 8 and 10° S and 79 and 77° W, the Rio Santa runs northward, between the Cordillera Negra to the west and the Cordillera Blanca to the east (Mark and Seltzer, 2003), before making its way to the Pacific; 41 % of the catchment area is above 4000 m a.s.l., including the highest point of the cordillera, Huascaran, peaking at 6768 m a.s.l. (Fig. 1). The upper Rio Santa catchment, with an outlet at Condorcerro, drains an area of about 10 000 km² and will be our main study area.
Some modeling projections based on the mean of meteorological variables from four GCM grid points predict the disappearance of ice cover by 2080 in some sub-watersheds of the Rio Santa (Juen et al., 2007), which would have a significant impact on the flow regime of the river, since glacier meltwater regulates its annual flow. For a sub-watershed of the upper Rio Santa watershed (4700 km², 8 % glaciated), glacier meltwater currently provides 10-20 % of the annual discharge, and up to 40 % in the dry season (Kaser et al., 2003; Mark and Seltzer, 2003; Baraer et al., 2012; Condom et al., 2012).
The larger study area is a rectangle of 84 000 km² (Fig. 1). It can be divided into four hydrological sub-regions from the north-east to the south-west. The first is the Rio Marañon catchment, located on the Amazon side, where the highest annual precipitation was measured in situ during the hydrological year 2012-2013 (> 1100 mm yr⁻¹). The second sub-region is the western side of the Cordillera Blanca, draining into the Pacific. Stations in this area are located inside the Rio Santa catchment. In situ measured precipitation amounts in the Cordillera Blanca area range from 478 to 1000 mm yr⁻¹ (Table 1 and Fig. 1). The third region is the Cordillera Negra, which is much drier (from 44 to 434 mm yr⁻¹) and lower in altitude (Table 1 and Fig. 1). This zone includes all stations located west of the Rio Santa. Finally, the dry area near the Pacific Ocean, named Costa, is defined as the land area whose altitude ranges from 0 to 1000 m a.s.l. The topography data shown in Fig. 1 are from SRTM (90 m resolution). While we will be looking at the entire 84 000 km² region, our analysis is focused on the precipitation falling over the upper Santa watershed, because this is our region of interest from a hydrological standpoint and because it is where we have the best ground network coverage.

In situ data
It was not an easy task to gather data from a sufficiently large number of stations in order to properly document our study area. First of all, there was the need to obtain some background climatological information; 10 stations operated by the Servicio Nacional de Meteorología e Hidrología de Perú (SENAMHI) since 1965 (Table 1) allow computation of monthly and yearly long-term averages. However, their specific location and loose spatial sampling prevent one from correctly estimating the long-term average rainfall either over the upper Rio Santa catchment or over the whole study area. Data from an additional set of eight SENAMHI stations cover the period August 2012 to July 2013 at a daily resolution. We also had access to three stations from the Unidad de Glaciología y Recursos Hídricos (UGRH) of the Autoridad Nacional de Agua (ANA). These stations are of a tipping bucket type; they have the double advantage of being located at higher altitudes and of providing data at sub-daily time steps. As compared to previous studies in this region, the key new information used comes from a database of 16 meteorological stations with hourly data located in the Ancash region of Peru. They were installed in the framework of a project (Centro de Información e Investigación Ambiental de Desarrollo Regional Sostenible, CIIADERS) operated by the Universidad Nacional Santiago Antúñez de Mayolo (UNASAM) of Huaraz. These stations provide essential information for understanding the spatial (increased sampling density) and temporal (hourly resolution) distribution of precipitation within our study area. The SENAMHI data are routinely quality controlled, using standard procedures in use in meteorological services worldwide. For the UGRH and UNASAM data, we had to carry out our own quality check, for instance by comparing precipitation amounts reported by stations located in the same area, leading to the removal of errant values.
Unfortunately, the CIIADERS network has been in operation since 2012 only, limiting this study to 1 hydrological year (August 2012 to July 2013). The average pluviometric index of this 1-year study period, which corresponds to a reduced centered (standardized) anomaly, is close to 0 (0.0774), meaning that the annual precipitation is close to the mean precipitation of the 1965-2014 period as calculated from the 10 long-term stations among our total of 37. Note also that stations with more than 25 % of missing data during that year have been removed, leaving only 32 stations available to compute our ground-based rainfall grids (Table 1 and Fig. 1).
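As an illustration of the pluviometric index mentioned above, the following minimal sketch computes a reduced centered anomaly per station and averages it over the network. The function names, the use of the population standard deviation and the numerical values are assumptions made for this example only; they are not taken from the study.

```python
from statistics import mean, pstdev


def pluviometric_index(annual_totals, target_total):
    """Reduced centered anomaly of one annual total.

    annual_totals: long-term series of annual precipitation (mm yr-1).
    target_total: annual total of the year being assessed (mm yr-1).
    The population standard deviation is an assumption of this sketch.
    """
    mu = mean(annual_totals)
    sigma = pstdev(annual_totals)
    return (target_total - mu) / sigma


def network_index(long_term_series, target_totals):
    """Average the station indices over all long-term stations."""
    indices = [
        pluviometric_index(series, target)
        for series, target in zip(long_term_series, target_totals)
    ]
    return mean(indices)


# Hypothetical values for two stations (not data from the study).
series = [[500.0, 600.0, 700.0], [900.0, 1000.0, 1100.0]]
targets = [600.0, 1100.0]
print(round(network_index(series, targets), 4))
```

An index near 0, as reported for 2012-2013, indicates a year close to the long-term mean, whereas values near +1 or -1 indicate a year about one standard deviation wetter or drier.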
A weakness of this 32-station network is the lack of data for the dry Cordillera Negra and the high-altitude areas of the Cordillera Blanca (only three stations located above 3800 m a.s.l.). This shortcoming was partly overcome by using accumulation data provided by the UGRH for the Artesonraju and Yanamarey glaciers at nearly 5000 m a.s.l., which are net accumulations over 1 year, including solid precipitation and melting over the period. Concerning snow, it is important to keep in mind that the rainy season occurs during austral summer, when temperature is slightly higher and, consequently, little solid precipitation is observed below 4600 m a.s.l. (Condom et al., 2011).

Gridded precipitation from in situ data
A major problem when comparing precipitation data sets from different sources relates to their different spatial sampling. Satellite and atmospheric model data are provided as gridded products, while rain gauges provide point data. A spatial interpolation procedure is thus required to get each product on the same grid. There is a considerable amount of literature on selecting an appropriate interpolation method for computing rain grids from point data. This is an especially tricky problem in regions of rough topography.
Several studies showed that kriging with external drift (KED), using altitude as an external variable, provides good results over complex terrain (e.g., Masson and Frei, 2014; Tobin et al., 2011; Ochoa et al., 2014). Block kriging with altitude as an external drift was thus chosen here as our reference interpolation method. Note, however, that other types of kriging interpolators were tested, and a cross-validation evaluation showed KED to be the most efficient of all in our case. While accounting for the strong influence of topography on the structure of rain fields is crucial in mountainous regions, another issue arises from the type of variogram to be used and whether it is allowed to vary from day to day. Related to this topic, different concepts of spatio-temporal kriging have been tested in previous studies (Amani and Lebel, 1997; Vischel et al., 2011; Gräler et al., 2012). Daily evolving variograms assume a relationship between the precipitation amounts of days D and D − 1, and information from the previous days is considered with a weight chosen by the user (10 % is used in this study). This is the method that was finally chosen to compute daily gridded precipitation at 27 km, 9 km and 3 km spatial resolutions, thus matching the resolution of the satellite and Weather Research and Forecasting (WRF) model products that will be presented below in Sects. 2.4 and 2.5.

TRMM product
Tropical Rainfall Measuring Mission (TRMM) Multi-Satellite Precipitation Analysis (TMPA) products have been available since 1998. This study makes use of the TRMM3B42 version 7 product, which provides precipitation data at a 3 h time step from a combination of remote sensing observations (microwave imager, precipitation radar, visible and infrared scanner) and monthly in situ observations (Huffman et al., 2007; Huffman and Bolvin, 2012). This product will simply be referred to as TRMM in the rest of the study. The TRMM data set covers a region between 50° S and 50° N, with a spatial resolution of 0.25° (approximately 27 km) (Table 3). This product can be used for hydrological applications in regions with scarce in situ data. Even though the TRMM mission was focused on the monitoring of tropical rainfall, it suffers from a number of drawbacks, the main one being its poor time sampling, reduced to one or two passages per day depending on the area considered. This causes a significant loss of information for short-duration storms (Roca et al., 2010; Condom et al., 2011; Ward et al., 2011). The effect of these time sampling errors is reduced when aggregating in time (Scheel et al., 2011; Mantas et al., 2014), but TRMM products still show significant biases in monthly values in the tropical Andes (Condom et al., 2011) as well as in solid precipitation (Maussion et al., 2014).

WRF simulation
In this study we use high-resolution simulations from the WRF model version 3.4.1 (Skamarock et al., 2008), which has had only a few applications in the tropical Andes (Murthi et al., 2011; Ochoa et al., 2014; Sanabria et al., 2014). WRF is a nonhydrostatic model and uses a terrain-following vertical coordinate (sigma). The limited-domain simulations are forced at their boundaries every 6 h by the National Centers for Environmental Prediction (NCEP) Final Analyses (FNL) of the Global Forecast System (GFS), with a horizontal resolution of 1° in latitude and longitude. The elevation data set is from the USGS GTOPO30. A large tropical Andes domain was first delimited for simulations at a 27 km resolution (WRF27). Two sub-domains were then used for carrying out simulations at a 9 km (WRF9) and 3 km (WRF3) resolution, respectively, both being centered on the Santa River basin (Tables 2, 3). The WRF9 (WRF3) simulations are forced by the WRF27 (WRF9) simulations using a one-way nesting technique. The simulations begin in April 2012, the first 4 months being used as a spin-up period for producing 1 year of data to be compared to the KED and TRMM products. Figure 2 shows the boxes corresponding to each simulation domain, and Table 2 lists the resolutions and coordinates of each configuration. Table 3 lists the parameterizations used in the simulations. We use the Thompson microphysical scheme (Thompson et al., 2008) and the Grell-Devenyi ensemble scheme for the cumulus parameterization. We also use a topographic correction for surface winds, previously tested in the complex orographic terrain of the Iberian Peninsula (Jimenez and Dudhia, 2012). The Noah-MP (Multi-Physics) land surface model is used for the representation of land-atmosphere interaction processes (Yang et al., 2011). Noah-MP is an extended version of the Noah land surface model with enhanced multi-physics options to address critical shortcomings in Noah for long-term soil state spin-up and snow modeling.
In particular, this version of the Noah model has shown improvements in the representation of surface energy fluxes, snow cover and snow albedo treatment. The partitioning of precipitation into rainfall and snowfall was set to option 2 (opt_snf = 2) using the Biosphere-Atmosphere Transfer Scheme, which assumes all precipitation as snowfall when the air temperature is lower than the freezing point plus 2.2 K, and rainfall otherwise.
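The rain/snow partitioning rule described above can be sketched as follows. This is a minimal illustration of the threshold rule as stated in the text, not the Noah-MP source code; the function names and the freezing point value of 273.15 K are assumptions of this example.

```python
def snow_fraction_bats(air_temp_k, t_freeze_k=273.15, offset_k=2.2):
    """Return 1.0 (all snow) below the threshold, 0.0 (all rain) otherwise.

    The threshold is the freezing point plus 2.2 K, as stated in the text.
    t_freeze_k = 273.15 K is an assumed value for this sketch.
    """
    return 1.0 if air_temp_k < t_freeze_k + offset_k else 0.0


def partition_precipitation(precip_mm, air_temp_k):
    """Split a precipitation amount into (rainfall, snowfall), both in mm."""
    snow_fraction = snow_fraction_bats(air_temp_k)
    rainfall = precip_mm * (1.0 - snow_fraction)
    snowfall = precip_mm * snow_fraction
    return rainfall, snowfall


# 10 mm falling at 274 K is below the 275.35 K threshold: all snow.
print(partition_precipitation(10.0, 274.0))
# 10 mm falling at 276 K is above the threshold: all rain.
print(partition_precipitation(10.0, 276.0))
```

Because the rule is a step function, a small temperature bias near the threshold switches the whole amount between rain and snow, which matters at the altitudes where the rain/snow limit is located.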
The overestimation of precipitation is a frequent bias in numerical models (e.g., Mearns et al., 1995), particularly in complex orographic regions. Preliminary sensitivity tests with various WRF parameterizations (including different cumulus schemes, cloud microphysics, planetary boundary layer and land surface options) have been carried out in the tropical Andes at a 27 km horizontal resolution; a clear overestimation of precipitation was observed with all these configurations and over the whole domain, including the high mountain areas. The biases found with other configurations were very similar to those of the configuration selected here in terms of the spatial distribution of precipitation, with quantitative differences more pronounced on the eastern slopes of the Andes and in the Amazon region than in high mountain zones like the Cordillera Blanca. The configuration finally retained for this study (Table 2) has been selected because (i) it minimizes the positive precipitation bias in the tropical Andes above 3500 m a.s.l., and (ii) it correctly simulates the spatial distribution of precipitation in the region, including the zones of maximum precipitation situated in the Amazon basin and on the eastern slopes of the Andes (Fig. 2), when compared with the TRMM2B31 data. At 3 km resolution, the Noah-MP option was found to decrease the precipitation overestimation in the Cordillera Blanca and to show a more realistic snow distribution when compared with previous observations.

Methods and criteria used for comparing the rainfall products
A total of seven gridded rainfall products are compared here, as described in Table 4. These products differ from one another on two accounts: (i) the type of information used (ground data, satellite data, atmospheric model) and (ii) the spatial resolution, ranging from 27 km, corresponding to the size of the TRMM satellite product grid mesh, down to 3 km, the finest resolution at which the WRF model was run. These seven gridded products are available at the daily scale, which is the key scale for the comparison carried out in this paper. While TRMM products and WRF simulations are inherently gridded, in situ data need to be interpolated in order to build grids at the three spatial resolutions: 27, 9 and 3 km.

Computation of daily precipitation grids from in situ data
The performance of the KED outputs is determined based on a "leave-one-out" cross-validation procedure (Li and Heap, 2008). It consists in leaving aside one measurement point at a time and estimating the value at that point from the remaining 31 stations. The procedure is applied successively to each of the 32 measurement stations, allowing one to compute the bias (Eq. 1), the root mean square error (RMSE) score (Eq. 2) and the correlation coefficient, as follows:
$$\mathrm{bias} = \frac{1}{nm}\sum_{d=1}^{m}\sum_{i=1}^{n}\left(\hat{P}_{i,d} - P_{i,d}\right) \qquad (1)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{nm}\sum_{d=1}^{m}\sum_{i=1}^{n}\left(\hat{P}_{i,d} - P_{i,d}\right)^{2}} \qquad (2)$$
where $\hat{P}_{i,d}$ is the daily precipitation estimated at point i for day d, using all the other gauges, $P_{i,d}$ is the corresponding measured daily rainfall, n is the number of stations (32 when there are no missing data on day d) and m is the number of days studied. In the following, the gridded daily precipitation products at 27, 9 and 3 km spatial resolutions will, respectively, be referred to as KED27, KED9 and KED3 (see Table 4 and Fig. 3). The daily RMSE value is large (3.41 mm d⁻¹) compared to the mean daily precipitation over all stations (1.85 mm d⁻¹), and errors are reduced with aggregation on a yearly basis (RMSE of 271 mm yr⁻¹ for an average in situ amount of 572 mm yr⁻¹ for the 32 stations and a correlation coefficient of 0.78). For yearly values, kriging products will thus be the basis of our comparison to TRMM data and WRF outputs. Despite some bias in the estimation of annual and daily rainfall, it is assumed that the most important spatial pattern is captured by KED.
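The leave-one-out procedure described above can be sketched as follows. To keep the example short and self-contained, inverse-distance weighting is used as a stand-in interpolator; it is not the block KED used in the study. The function names, the data layout and the station values are hypothetical, and missing data are not handled.

```python
import math


def idw_estimate(target_xy, neighbors, power=2.0):
    """Inverse-distance-weighted estimate (stand-in for KED in this sketch).

    neighbors: list of ((x, y), value) pairs for the remaining stations.
    """
    numerator = 0.0
    denominator = 0.0
    for (x, y), value in neighbors:
        distance = math.hypot(x - target_xy[0], y - target_xy[1])
        if distance == 0.0:
            return value
        weight = distance ** -power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


def leave_one_out(coords, daily_obs):
    """Return (bias, rmse) over all stations and days.

    coords[i] is the (x, y) position of station i.
    daily_obs[d][i] is the rainfall measured at station i on day d.
    """
    errors = []
    for day in daily_obs:
        for i, observed in enumerate(day):
            # Withhold station i and estimate it from all the others.
            others = [(coords[j], day[j]) for j in range(len(day)) if j != i]
            errors.append(idw_estimate(coords[i], others) - observed)
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return bias, rmse


# Three hypothetical stations on a line, one day with an isolated shower.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
bias, rmse = leave_one_out(coords, [[0.0, 3.0, 0.0]])
print(round(bias, 3), round(rmse, 3))
```

The isolated shower illustrates why daily errors are large: the wet station is estimated as dry and its dry neighbors as wet, so the errors only partly cancel in the bias but accumulate in the RMSE.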

Comparing the daily and annual precipitation products
Daily precipitation is defined as the accumulation of rainfall between 00:00:00 LT (local time) and 23:59:00 LT. An important point is that all gridded products suffer from weaknesses and, thus, the aim of the comparison is to analyze differences between products. The daily products are compared from three different standpoints: the statistical distribution of non-zero rainfall, the grid annual values and the seasonal cycle. The frequency of daily precipitation at one location (one station or the corresponding grid mesh) was studied through the cumulative distribution function of the non-zero precipitation (Sambou, 2004; Eq. 3), where F(x) is the cumulative frequency of the daily precipitation amount above 1 mm d⁻¹ and x is the daily precipitation (mm d⁻¹). To assess the statistical performance of the 3 km resolution products against point in situ data at a daily timescale, the contingency table for rainfall/no rainfall was built (Table 5). The bias score (BIAS: ratio of the number of rainy days simulated (A + B) over the number of rainy days observed (A + C)), false alarm rate (FAR: ratio of the number of rainy days incorrectly simulated (B) over the total number of rainy days simulated (A + B)), probability of false detection (POFD: ratio of the number of rainy days incorrectly simulated (B) over the number of days without rain in the observations (B + D)) and the frequently used Heidke skill score (HSS) (Eqs. 4-6) were calculated, where N is the size of the statistical population, and the A, B, C and D values are explained in Table 5. A perfect product would have a BIAS of 1, a FAR of 0, a POFD of 0 and an HSS of 1.
Annual grids were computed by temporal aggregation of the daily grids. With a view to studying the water balance for hydrological applications, each product was evaluated in terms of the volume of water precipitated over the area of the upper Rio Santa watershed, corresponding to the watershed limited by the outlet at Condorcerro (Fig. 1).
Table 5. Contingency table used to assess the statistical performances of the 3 km resolution products against point in situ data at a daily timescale. The B value corresponds, for example, to a day with no precipitation in the in situ data and precipitation > threshold mm d⁻¹ in the 3 km grid product.
3 km grid product | In situ P_j: Yes | In situ P_j: No
Yes | A | B
No | C | D
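The contingency scores defined above can be sketched in a few lines. BIAS, FAR and POFD follow the ratios given in the text; the HSS expression used here is the standard textbook form written with N = A + B + C + D and may differ in layout from Eqs. (4)-(6). Applying the same threshold to the in situ data and to the gridded product is an assumption of this sketch, as are the function names and the counts.

```python
def contingency_counts(simulated, observed, threshold=1.0):
    """Count hits (A), false alarms (B), misses (C), correct negatives (D)."""
    a = b = c = d = 0
    for sim, obs in zip(simulated, observed):
        sim_wet = sim > threshold
        obs_wet = obs > threshold  # same threshold for both: an assumption
        if sim_wet and obs_wet:
            a += 1
        elif sim_wet:
            b += 1
        elif obs_wet:
            c += 1
        else:
            d += 1
    return a, b, c, d


def contingency_scores(a, b, c, d):
    """Return BIAS, FAR, POFD and HSS (no guard against zero denominators)."""
    n = a + b + c + d
    # Number of correct outcomes expected by chance.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n
    return {
        "BIAS": (a + b) / (a + c),
        "FAR": b / (a + b),
        "POFD": b / (b + d),
        "HSS": (a + d - expected) / (n - expected),
    }


# Hypothetical counts for 100 days (not results from the study).
for name, value in contingency_scores(50, 10, 5, 35).items():
    print(name, round(value, 3))
```

Running the scores for a series of thresholds (for instance 0.1 to 15 mm d⁻¹) gives one curve per score and per product, which is how such scores are usually compared.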
Finally, to evaluate the seasonal cycle of precipitation at one site, we used the temporal standard score $S_t$ (Eq. 7):
$$S_t = \frac{P_j^{10} - \bar{P}_j}{\sigma_j} \qquad (7)$$
where $P_j^{10}$ is the running mean of daily precipitation amounts over 10 days at one location, and $\bar{P}_j$ and $\sigma_j$ are the temporal average and standard deviation of the daily precipitation, respectively. It is important to mention that when comparing the performances at one location of the KED daily products with those of TRMM and WRF, use is made of the cross-validation products, so that the local information is not taken into account; using it would artificially favor the ground product over the satellite and model products.
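The temporal standard score can be illustrated with the short sketch below. The trailing (rather than centered) running window and the population standard deviation are assumptions of this example, as are the function names and the sample values.

```python
from statistics import mean, pstdev


def running_mean(values, window=10):
    """Trailing running mean; the first window - 1 days yield no value."""
    return [
        mean(values[k - window + 1:k + 1])
        for k in range(window - 1, len(values))
    ]


def temporal_standard_score(daily_precip, window=10):
    """Standardize the running mean by the mean and standard deviation
    of the daily series at the same location."""
    average = mean(daily_precip)
    std = pstdev(daily_precip)  # population form: an assumption
    return [(rm - average) / std for rm in running_mean(daily_precip, window)]


# Hypothetical 4-day series with a 2-day window, for readability.
print([round(s, 3) for s in temporal_standard_score([0.0, 2.0, 4.0, 6.0], window=2)])
```

Because each series is standardized by its own mean and standard deviation, products with very different totals can be compared purely on the timing and shape of the seasonal cycle.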

Assessing the quality of the WRF3 hourly precipitation grids
To facilitate the comparison among all stations, the hourly precipitation amounts were normalized by dividing them by the mean of hourly values during the year. Few studies deal with hourly rainfall amounts from WRF modeling. In this study, we compared the timing of the precipitation peak from hourly rain gauge data and from WRF3 simulation outputs. Studying hourly data allowed us to see whether short time processes governing precipitation in the Rio Santa Valley are well represented in WRF3, considering in situ hourly measurement as the reference.
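The normalization and the peak-timing comparison described above can be sketched as follows. The data layout (a flat list of hourly values starting at hour 0 of the first day), the averaging of each hour over all days and the function names are assumptions of this example.

```python
def normalized_diurnal_cycle(hourly_precip):
    """Mean precipitation for each hour of the day, divided by the mean
    of all hourly values. Incomplete trailing days are ignored."""
    n_days = len(hourly_precip) // 24
    hourly = hourly_precip[:n_days * 24]
    overall_mean = sum(hourly) / len(hourly)
    cycle = []
    for hour in range(24):
        hour_mean = sum(hourly[day * 24 + hour] for day in range(n_days)) / n_days
        cycle.append(hour_mean / overall_mean)
    return cycle


def peak_hour(cycle):
    """Hour of the day (0-23) at which the normalized cycle is highest."""
    return max(range(len(cycle)), key=lambda hour: cycle[hour])


# Two hypothetical days with rain at 17:00 only.
observations = [0.0] * 48
observations[17] = 2.0
observations[24 + 17] = 4.0
cycle = normalized_diurnal_cycle(observations)
print(peak_hour(cycle), cycle[17])
```

Applying the same two functions to a gauge series and to the co-located WRF3 grid cell gives two peak hours whose difference measures the timing error, independently of any bias in the amounts.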

Frequency and intensities of daily precipitation amounts
In this section, we first analyze the statistics of daily precipitation, the temporal scale for which all eight products are available (Table 4), and present them for the Corongo location (no. 2 in Table 1 and Fig. 1). This station, located in the northern part of the Rio Santa watershed, was selected because it is representative of the 16 stations located inside the upper Rio Santa catchment in terms of the precipitation areal averaging effect, except when comparing the differences between the three different spatial-resolution products of WRF. In a second part, we study daily precipitation occurrences based on the contingency table indices (see Sect. 3.2, Table 5) for all stations located in the Sierra area. Figure 4 shows the cumulative frequency of daily precipitation above 1 mm d⁻¹ for the Corongo location, comparing (i) the three spatial resolutions of WRF (Fig. 4a), (ii) the three spatial resolutions of KED (Fig. 4b), (iii) TRMM, WRF and KED products at 27 km (Fig. 4c), and (iv) WRF and KED products at 3 km spatial resolution vs. in situ point data (Fig. 4d). The number in the box of each graph represents the number of days with precipitation over 1 mm d⁻¹ ($n_{p>1}$) for each product. Regarding KED data, the three spatial resolutions show a few differences that can also be seen in the values of $n_{p>1}$ (Fig. 4b). Concerning the 27 km spatial resolution, KED27 and TRMM are more similar to each other than to WRF27 (Fig. 4c), despite an underestimation of $n_{p>1}$ for TRMM (108 days) compared to KED27 (183 days). WRF3, like WRF27 (Fig. 4c and d), does not correctly report daily precipitation amounts, with stronger values compared to the other data sets. In this comparison, KED3 seems to underestimate daily precipitation amounts and overestimate $n_{p>1}$ in light of in situ data, but this can be related to a resolution effect between the 3 km resolution grid and point measurements.
Noting that WRF products are unrealistic in terms of daily precipitated quantities, we will now evaluate their performances in terms of occurrence, a notion that is essential in glacio-hydrological studies. This can be seen in the results of the contingency table and is studied by comparing KED3 and WRF3 with in situ data for different daily precipitation thresholds in Fig. 5. The results are shown for the Sierra region, but are similar for the Cordillera Negra and Marañon areas.
WRF3 largely overestimates the amount of strong daily precipitation, which can be linked to the overall overestimation by this product (Fig. 4d). The FAR, POFD and HSS show that there is an important improvement when considering only precipitation above 1 mm d⁻¹ in KED3 and that the amount of daily precipitation between 0 and 1 mm d⁻¹ is largely overestimated by this product (Fig. 5b-d). POFD can be seen as an inter-comparison indicator as it does not depend on the number of predicted events. Above 1 mm d⁻¹, KED3 is thus a better estimator of precipitation occurrence than WRF3. However, we faced the same spatial-resolution problem as above when comparing the 3 km mesh grid and in situ data for low-precipitation amounts. HSS indicates that daily precipitation in KED3 is in better accordance with in situ data than that in WRF3, with only a few rainy days well predicted by WRF3. Although we noted a spatial-resolution effect for daily precipitation quantities under 1 mm d⁻¹, KED3 appears to be a good estimate of precipitation in terms of daily average quantities and occurrences, and will be considered later as a basis for comparison between different gridded products.

Annual cumulated precipitation amounts during the hydrological year 2012-2013
The estimates of the annual precipitation over the upper Rio Santa catchment (about 10 000 km²) for the 27 km resolution products range from 570 mm yr⁻¹ for TRMM to 2910 mm yr⁻¹ for WRF27 (and 830 mm yr⁻¹ for KED27) (Table 4). Thus, even at this large integrative scale, the 27 km products display large discrepancies. KED annual rainfall is 15 % larger at the 3 km resolution (950 mm yr⁻¹) than at the 27 km resolution, whereas WRF annual rainfall decreases by about 30 % (1970 vs. 2910 mm yr⁻¹). Figure 6 shows those annual precipitation amounts for all different products used in this study. Even though the KED3 estimate is certainly not devoid of bias, it is clear that WRF overestimates rainfall. WRF products, compared to KED, show more spatial variability in precipitation amounts at both 3 and 9 km resolutions, with a stronger altitudinal gradient. TRMM and KED27 are closer along the Rio Santa Valley, as they both incorporate rain gauge data. However, on the Marañon watershed side, TRMM integrates the tropospheric flows from the Amazonian lowlands, unlike KED27, whose ground observations are undersampled over this area and do not catch the rainfall effect of the moisture influx from the Amazon basin. Although coarse-resolution products (TRMM and WRF27) do not provide acceptable rainfall grids for hydrological applications in areas of complex topography because of their lack of representation of the finer spatial pattern, they are not totally useless at this annual scale. They correctly represent the longitudinal precipitation gradient between the humid and rainy conditions of the Amazon plain, the orographic influence of the Cordillera Blanca and the dry and cold Pacific coast conditions (Fig. 6f and h). Those products may thus be used as indicators of spatial precipitation pattern for the study of long-term trends in precipitation (that are costly to generate with WRF3, and not available with KED, because half of the gauge network was installed only in 2012).
[Figure 5 caption, partial: ... and Heidke skill score (d). Calculated for KED3 (black) and WRF3 (gray) against rain gauge precipitation data located in the Sierra area. Scores have been evaluated for several daily precipitation thresholds: 0.1, 0.5, 1, 3, 5, 10 and 15 mm.]
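The resolution effect on the annual totals quoted in this section can be checked with a short calculation. The helper name `relative_change_percent` is an assumption for this example; the four annual totals are the values given in the text.

```python
def relative_change_percent(reference, value):
    """Relative change of `value` with respect to `reference`, in percent."""
    return 100.0 * (value - reference) / reference


# Annual totals (mm/yr) quoted in the text for the 27 km and 3 km products.
ked_change = relative_change_percent(830.0, 950.0)    # KED27 -> KED3
wrf_change = relative_change_percent(2910.0, 1970.0)  # WRF27 -> WRF3
```

Under these numbers the KED increase is close to the 15 % stated above, and the WRF decrease is slightly above 30 %.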

Orographic influence on annual amount at 9 and 3 km spatial resolution
Field data are too sparse, with no measurements at high altitude, to provide information on the altitudinal gradient of precipitation. On a longitudinal transect near the Huascaran peak, we observed important differences in annual precipitation amount and spatial pattern between KED products and WRF outputs (Fig. 7b and c). At very high altitude, we compared precipitation to accumulation data measured at 5100 m a.s.l. on the Artesonraju glacier (station no. 5 from Table 1 and Fig. 1). We can observe in Fig. 7c that KED3 and KED9 products suffer from one major impediment: in regions of low gauge density, the spatial pattern is solely driven by the altitude, not taking into account the effect of local slopes and orientation. As a consequence, annual rainfall maxima produced by KED are located over the summits, whereas it is well known that these maxima are rather located on the slopes, as correctly simulated by WRF3 (Fig. 7b). The only area with less precipitation in WRF3 compared to WRF9 is the upper zone of the Cordillera Blanca mountain range, near the highest peaks (Fig. 7b). In WRF3, the altitudinal variation is greater than in WRF9, with the summit reaching 5000 m a.s.l.; the spatial resolution is finer, and in this configuration, the orographic processes on the eastern slopes of the Andes are more pronounced and correctly represented at the 3 km spatial resolution.

Seasonal changes along the Rio Santa Valley
The annual cycle is presented in detail for cells corresponding to three stations located along the Rio Santa Valley, Corongo (station no. 2), Shilla (station no. 12) and Shancayan (station no. 16) (Fig. 1, Table 1), as these three stations are representative of others located in the Sierra area. Day 1 in Fig. 8 corresponds to the beginning of the hydrological year, 1 August 2012. The upper panels (Fig. 8a-c) correspond to the three products available at the 27 km spatial resolution (TRMM, KED27, WRF27). During the dry period, between days 1 and 50, and 300 to 350, TRMM largely overestimates the precipitation amount for Shilla (Fig. 8b). The percentage of ice-covered area in the mesh corresponding to Shilla station is up to 10 %, while it is less than 0.5 % for the meshes of Corongo and Shancayan. The dry-season error at Shilla can be attributed to the poor handling of ice-covered surfaces in the TRMM algorithm, as ice on the ground scatters energy in a similar way as precipitation drops in the atmosphere (Yin et al., 2004). Temporal trends of KED27 and WRF27 are similar, with occasional shifts of a few days in heavy rainfall events (for example, between days 200 and 230 for Corongo station; Fig. 8a).
Concerning the finer spatial resolution (Fig. 8d-f), KED3 and in situ data have strong similarities for the three stations, which supports the use of the 3 km spatial resolution to compare gridded data with in situ point data. Regarding WRF3, the intensities of precipitation peaks are incorrect in the heart of the rainy season, but the temporal distribution remains close to that of rain gauge precipitation.

Diurnal cycle of precipitation along the Rio Santa Valley
Half of the rain gauges available over the region of study are daily reading stations; the network of recording rain gauges is consequently too sparse and too unevenly distributed to permit the computation of relevant rainfall grids at a subdaily scale. WRF3 thus remains the only product able to account for the diurnal cycle of precipitation by providing hourly rainfall grids (even though TRMM3B42 is available at a 3-hourly time step, the fact that the satellite overpasses the studied area only once or twice daily makes it difficult to trust its accuracy for sub-daily timescales). This is important since the diurnal cycle in a glaciological context controls the precipitation phase and consequently the surface albedo (one strength of WRF is that it produces liquid as well as solid precipitation).
In situ data at Corongo (station no. 2), Shilla (station no. 12) and Shancayan (station no. 16) display a clear precipitation peak in the late afternoon, between 16:00 and 19:00 LT (Fig. 9). This diurnal cycle is visible in the WRF3 simulations, even though somewhat less pronounced (more rainfall around noon), and with a slight lag at Shilla and Shancayan. Looking at the diurnal cycle of precipitation at a regional scale (Fig. 10), it is noteworthy that the peak hour of precipitation occurs later in the bottom of the Rio Santa Valley (dark green for altitudes below 4000 m a.s.l., and around 19:00 LT) than in the surrounding mountains (light green color, around 17:00 LT). A lack of hourly information at high altitudes prevents one from validating these hourly scale characteristics with observations, but they correspond to well-documented orographic processes (valley and mountain breezes) (Biasutti et al., 2012; Barros, 2013). In the afternoon, moisture is transported to the peaks by anabatic winds. At the beginning of the night, moisture descends into the valley with katabatic winds. In a physical climate model like WRF, the representation of thermal and orographic circulations theoretically benefits from a finer resolution (Jimenez et al., 2013; Weckwerth et al., 2014), and mountain-valley breezes seem to be accurately estimated for the 3 km resolution runs.
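A minimal sketch of how a mean diurnal cycle and its peak hour can be extracted from an hourly series is given below. It assumes that the series starts at 00:00 local time and covers whole days; the function name and the synthetic two-day series are illustrative and do not reproduce the processing used for Figs. 9 and 10.

```python
import numpy as np


def diurnal_cycle(hourly_mm):
    """Return the mean precipitation per hour of day and the peak hour.

    Assumes the series starts at 00:00 local time; an incomplete final day
    is discarded.
    """
    hourly = np.asarray(hourly_mm, dtype=float)
    n_days = hourly.size // 24
    by_day = hourly[: n_days * 24].reshape(n_days, 24)
    mean_by_hour = by_day.mean(axis=0)
    peak_hour = int(np.argmax(mean_by_hour))
    return mean_by_hour, peak_hour


# Synthetic two-day hourly series (mm/h), not data from the paper.
series = np.zeros(48)
series[17] = 2.0       # day 1, 17:00
series[18] = 1.0       # day 1, 18:00
series[24 + 17] = 4.0  # day 2, 17:00
mean_by_hour, peak_hour = diurnal_cycle(series)
```

Applied cell by cell to hourly model grids, the peak hour gives a map comparable in spirit to the regional view described above.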

Summary and conclusions
Over the past 40 years, the warming climate of the tropical Andes has led to a significant melting of the glaciers, impacting the hydrological cycle to an extent that remains to be assessed, both for present and for future times. One obstacle to doing so is our limited ability to evaluate properly the precipitation falling over high-altitude catchments, if only because of the difficulty of installing and maintaining sufficiently dense in situ networks. In addition, the rough topography generates strong spatial gradients that are very challenging to sample. In such a context, remote sensing and modeling appear to be attractive means for complementing the information provided by in situ measurements. With this in mind, this paper has presented a comparison of rainfall products based on three different sources of information: rain gauge measurements, satellite imagery and atmospheric model outputs. While TRMM3B42 is a widely used standard, making it a natural candidate to represent the family of satellite rainfall products, there is a larger range of possibilities for selecting a ground-based product and an atmospheric model product. Preliminary tests, the results of which are not detailed in this paper, were used to finally select kriging with external drift interpolation (KED) as a typical ground-based product, the external drift being the altitude. As for atmospheric models, the retained product is made of WRF simulations, WRF being run in a configuration minimizing the differences between the observations and the model outputs over the Cordillera Blanca.
The TRMM3B42 product has a resolution of 27 km; the same resolution was thus used for the computation of coarse rainfall grids from gauge measurements (KED27) and for WRF simulations (WRF27). Then gauge rainfall grids and WRF simulations were also produced at the finer resolutions of 9 km (KED9 and WRF9) and 3 km (KED3 and WRF3). This makes a total of seven gridded precipitation products that were computed and inter-compared over the region of the Rio Santa in Peru, a glaciated catchment and the second largest river flowing from the tropical Andes to the Pacific.
Each process leading to the computation of gridded rainfall products has its own weaknesses: interpolation errors for the rain gauge products, indirect measurement of rainfall for the satellite products, sub-mesh parameterization for the WRF model outputs. Therefore none of them can be taken as an indisputable reference, whether it be in terms of quantities or in terms of occurrence. This is why the performances of each product were assessed from a double perspective. A comparison with measured on-site data was carried out when relevant (diurnal and seasonal cycle, statistics of rainfall occurrence), while the ability of each product to reproduce some well-known spatial features of precipitation fields at various timescales (from annual down to daily) was analyzed when no obvious quantitative reference could be used.
In line with the results of other studies, WRF27 simulations are found to be totally unrealistic in terms of annual quantities. WRF9 and WRF3 simulations are better in this respect but still largely overestimate the annual total, with WRF9 being additionally unable to capture properly the details of the spatial pattern that are well reproduced by WRF3. This shortcoming of WRF9 can be explained by its resolution that is still too coarse to reproduce correctly the orographic influence, because a number of key features are smoothed out (for instance, grid meshes reaching altitudes above 5000 m a.s.l. are found in the WRF3 topography, which is not the case in the WRF9 topography). TRMM, with its coarse spatial resolution of 27 km, performs poorly over ice-covered surfaces, because ice on the ground behaves in a similar way as rain or ice drops in the atmosphere in terms of scattering the microwave energy (Yin et al., 2004). Using TRMM in glaciated mountain ranges should thus be avoided, especially at small timescales where spatial error compensation does not occur, as it might do when averaging annual totals over large areas. On the other hand, TRMM might provide some useful information over the lowlands on the Amazonian side of the Andes, as already mentioned by Lavado Casimiro et al. (2009). Coarse-resolution products (TRMM and WRF27), however, correctly represent the large spatial gradient between the humid Amazonian lowlands and the dry Pacific coast, and their long-term precipitation series can thus be used to study the interannual variability of the spatial patterns at a large regional scale and possible long-term trends linked to climate change.
Comparing the diurnal cycle of the hourly WRF3 simulations with observations in meshes containing one recording rain gauge leads to the conclusion that this diurnal cycle is fairly realistic. Of course, the large overestimation of precipitation by WRF3 prevents one from using the WRF3 grids directly as inputs to hydrological models. The challenge is thus to combine the hourly temporal distribution of precipitation in WRF3 with more accurate precipitated amounts. In this respect, one path to explore is to use the WRF3 diurnal cycle for disaggregating the KED daily grids.
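The disaggregation path mentioned here could be sketched as follows: each daily total is distributed over the 24 hours in proportion to the hourly model amounts of the same grid cell. This is one possible reading of the idea, not a method implemented in the study. The function name, the array shapes and the fallback to a uniform distribution when the model is dry are assumptions.

```python
import numpy as np


def disaggregate_daily(daily_mm, model_hourly_mm):
    """Split daily totals into hourly amounts using model hourly weights.

    daily_mm:        array of shape (ny, nx), daily totals (e.g. a KED grid)
    model_hourly_mm: array of shape (24, ny, nx), hourly model amounts
    Returns an array of shape (24, ny, nx) whose sum over hours equals daily_mm.
    """
    daily = np.asarray(daily_mm, dtype=float)
    model = np.asarray(model_hourly_mm, dtype=float)
    model_total = model.sum(axis=0)
    wet = model_total > 0
    # Avoid division by zero where the model produced no rain on that day.
    safe_total = np.where(wet, model_total, 1.0)
    # Assumption: fall back to uniform hourly weights in model-dry cells.
    weights = np.where(wet, model / safe_total, 1.0 / model.shape[0])
    return daily * weights


# Synthetic example with two grid cells (hypothetical values).
model = np.zeros((24, 1, 2))
model[17, 0, 0] = 6.0
model[18, 0, 0] = 2.0          # the second cell stays dry in the model
daily = np.array([[4.0, 2.4]])
hourly = disaggregate_daily(daily, model)
```

By construction the hourly amounts sum back to the daily total in every cell, so the interpolated gauge amounts are preserved while the timing follows the model.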
A more general conclusion is that the topography and the associated rainfall gradients are too steep in this region for rainfall products at the spatial resolution of either 9 or 27 km to provide good rainfall estimates and good rainfall spatial patterns for glacio-hydrological purposes. Moreover, due to poor sampling at high altitudes, kriging with external drift does not take into account local slope and orientation effects, as the spatial pattern is solely driven by altitude. In summary, combining the daily rain gauge measurements with the spatial patterns generated by WRF3 appears to be a promising way of building daily rain fields. There are several techniques to do so, one being to use the WRF3 rain field, instead of the topography, as the external drift when interpolating the in situ measurements with a KED technique.
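To make the last suggestion concrete, the sketch below solves a basic kriging-with-external-drift system in which the drift variable is a model rain field sampled at the gauges and at the target points. It is a simplified illustration and not the KED configuration used in this study: the exponential covariance model, its sill and range, the function names, and all station coordinates and values are hypothetical.

```python
import numpy as np


def exponential_covariance(dist_km, sill=1.0, range_km=30.0):
    """Exponential covariance model (an assumed choice, not from the paper)."""
    return sill * np.exp(-dist_km / range_km)


def ked_predict(xy_obs, z_obs, drift_obs, xy_new, drift_new,
                sill=1.0, range_km=30.0):
    """Kriging with external drift: z is modelled as a + b * drift + residual."""
    xy_obs = np.asarray(xy_obs, dtype=float)
    xy_new = np.asarray(xy_new, dtype=float)
    z_obs = np.asarray(z_obs, dtype=float)
    drift_obs = np.asarray(drift_obs, dtype=float)
    drift_new = np.asarray(drift_new, dtype=float)
    n = z_obs.size

    # Standardize the drift for numerical conditioning; the prediction is
    # unchanged because the basis also contains a constant term.
    mean, std = drift_obs.mean(), drift_obs.std()
    d_obs = (drift_obs - mean) / std
    d_new = (drift_new - mean) / std

    dist_oo = np.linalg.norm(xy_obs[:, None, :] - xy_obs[None, :, :], axis=-1)
    dist_on = np.linalg.norm(xy_obs[:, None, :] - xy_new[None, :, :], axis=-1)

    # Left-hand side: covariances bordered by the drift basis [1, drift].
    basis_obs = np.column_stack([np.ones(n), d_obs])
    lhs = np.zeros((n + 2, n + 2))
    lhs[:n, :n] = exponential_covariance(dist_oo, sill, range_km)
    lhs[:n, n:] = basis_obs
    lhs[n:, :n] = basis_obs.T

    # Right-hand side: one column per target point.
    rhs = np.vstack([exponential_covariance(dist_on, sill, range_km),
                     np.ones(d_new.size), d_new])
    weights = np.linalg.solve(lhs, rhs)[:n]  # drop the Lagrange multipliers
    return weights.T @ z_obs


# Hypothetical gauges (coordinates in km) and model rain at the same points.
xy_obs = [[0.0, 0.0], [10.0, 5.0], [20.0, 0.0], [5.0, 20.0], [25.0, 15.0]]
model_at_gauges = np.array([900.0, 1400.0, 1100.0, 2000.0, 1600.0])
gauge_rain = 100.0 + 0.5 * model_at_gauges  # synthetic, exactly linear in the drift
prediction = ked_predict(xy_obs, gauge_rain, model_at_gauges,
                         [[12.0, 10.0]], [1500.0])
```

Because the synthetic gauge values are an exact linear function of the drift, the unbiasedness constraints make the prediction follow the same linear relation at the target point. With real data, the covariance parameters would have to be fitted to the residuals, which is outside the scope of this sketch.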