Evaluation of TRMM 3B42 precipitation estimates and WRF retrospective precipitation simulation over the Pacific–Andean region of Ecuador and Peru

The Pacific–Andean region in western South America suffers from rainfall data scarcity, as is the case for many regions in the South. An important research question is whether the latest satellite-based and numerical weather prediction (NWP) model outputs capture well the temporal and spatial patterns of rainfall over the region, and hence have the potential to compensate for the data scarcity. Based on an interpolated gauge-based rainfall data set, the performance of the Tropical Rainfall Measuring Mission (TRMM) 3B42 V7 and its predecessor V6, and the North Western South America Retrospective Simulation (OA-NOSA30) are evaluated over 21 sub-catchments in the Pacific–Andean region of Ecuador and Peru (PAEP). In general, precipitation estimates from TRMM and OANOSA30 capture the seasonal features of precipitation in the study area. Quantitatively, only the southern sub-catchments of Ecuador and northern Peru (3.6–6 ◦ S) are relatively well estimated by both products. The accuracy is considerably less in the northern and central basins of Ecuador (0–3.6 ◦ S). It is shown that the probability of detection (POD) is better for light precipitation (POD decreases from 0.6 for rates less than 5 mm day −1 to 0.2 for rates higher than 20 mm day −1). Compared to its predecessor, 3B42 V7 shows modest regionwide improvements in reducing biases. The improvement is specific to the coastal and open ocean sub-catchments. In view of hydrological applications, the correlation of TRMM and OA-NOSA30 estimates with observations increases with time aggregation. The correlation is higher for the monthly time aggregation in comparison with the daily, weekly, and 15-day time scales. Furthermore, it is found that TRMM performs better than OA-NOSA30 in generating the spatial distribution of mean annual precipitation.


Introduction
Precipitation is the primary driver of the hydrologic cycle and the main input of most hydrologic studies. Accurate estimation of precipitation is therefore essential. The availability of rainfall data, in particular in developing countries, is hampered by the scarcity of accurate high-resolution precipitation. Since its inception, rainfall measurement principles remained unchanged; non-recording and recording rain gauges are still the standard equipment for ground-based measuring precipitation notwithstanding that they only provide point measurements. Rainfall amounts measured at different locations are traditionally extrapolated to obtain areal-averaged rainfall estimates. These estimates from point gauge measurements will only improve if over time the rain gauge network density increases. The latter is not always the case in developing countries. In fact, in many regions gauge densities are decreasing (Becker et al., 2013). One potential way to overcome the limitations of rain-gauge-based networks and weather radar systems in estimating areal rainfall is by using satellite-based global climate information and numerical weather prediction (NWP) products. Compared with rain gauge observations, satellite rainfall data provide observations in otherwise data sparse areas but their disadvantage is that they are indirect estimates of rainfall. On the other hand, increased computational power and improvement of NWP models have resulted in a considerable advancement in the ability to estimate rainfall. However, the main limitation for NWP models is that they cannot resolve weather features that occur within a single model grid box. To improve the accuracy of satellite rainfall estimation and NWP models, and facilitate their intake over data sparse areas, the evaluation of both products needs to be region-specific and user-oriented.
A wide range of satellite-derived precipitation products have emerged in the last decade and their performance over different regions of the world has been evaluated. Several studies have been conducted to assess the accuracy of three of the most widely used satellite-based methods producing global precipitation estimates, such as the Climate Prediction Center morphing method (CMORPH), Precipitation Estimation from Remotely Sensed Information using Neural Networks (PERSIANN) and the Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA) 3B42 (Romilly and Gebremichael, 2011). TMPA 3B42 V6 version performance has been evaluated over the tropical Andes of South America at high-altitude regions (> 3000 m a.s.l.) by Scheel et al. (2011) with focus on the Cuzco and La Paz regions in the central Andes. Ward et al. (2011) conducted a similar investigation in the Paute region (> 1684 m a.s.l.) situated in the southern Ecuadorian Andes, and Arias-Hidalgo et al. (2013) explored its applicability as input for hydrologic studies on a catchment in the Pacific-Andean region in central Ecuador. They all concluded that by disregarding the limitations at the small temporal scale (daily), the performance of this product increases with time aggregation, and they highlighted the potential to use TMPA 3B42 V6 on large-scale basins. Dinku et al. (2010) conducted a wider evaluation covering different climatological regions and altitudinal ranges of the Colombian territory. Results showed good performance when the temporal scale increases (10 days); however, they are region distinct, yielding the best performance over the eastern Colombian plain. The availability of the improved version, the TMPA 3B42 V7, opens a new question concerning its usefulness for South American regions. Recently, Zulkafli et al. (2014) assessed the improvement of the V7 over the V6 and reported a lower bias and an improved representation of the rainfall distribution over the northern Peruvian Andes and the Amazon watershed. The diversity of South American environments demands new comparisons over regions with different precipitation regimens and mechanisms.
On the other hand, NWP models' capabilities keep evolving and providing precipitation fields at high spatiotemporal resolutions. In general, NWP models are not only valuable tools for weather forecasting but also for climate reconstruction. Numerical weather prediction can be initialized and bounded by assimilated observational data describing the large-scale atmospheric conditions throughout the reconstructed period. Periods of years to decades can be retrieved using NWP models, commonly known as "regional atmospheric reanalysis". Although this technique is still in its early stages, in tropical South America, some NWP model applications were conducted by . Their study follows a three-level hierarchical approach. Globalscale analysis and/or general circulation model (GCM) outputs are generated and then used as boundary conditions for the mesoscale meteorological models, which in turn provide information for tailored applications. In a regional atmospheric reanalysis setting, the weather research and forecasting model (WRF, Skamarock et al., 2005) was forced by applying boundary conditions of the National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) Reanalysis project (NNRP, Kistler et al., 2001) to retrieve, for the first time, meteorological data for northwestern South America in the so-called OA-NOSA30 product. The aim of the retrospective simulation was to provide input data for hydrologic and healthepidemiological models with the hypothesis that the WRF retrospective simulation may add skill to GCMs in countries where the Andes provides complex disturbances that global models cannot solve.
The westernmost N-S axis of South America, which embraces the Pacific-Andean region of Ecuador and northern Peru (PAEP), is a region with below-average density and unevenly distribution of meteorological stations. Because of its location, contrasting landscapes, and complex topography, it includes humid regions of the western Andean foothills and arid areas offshore the coastal line. The PAEP region provides a unique case to evaluate the potentials and drawbacks of satellite and numerical model rainfall estimates. Consequently, the objective of this study is to provide an evaluation of the performance of the TMPA V7 and its predecessor, the TMPAV6 version and the OA-NOSA30 products versus regionalized ground data over the PAEP region. Specifically, emphasis is given to determine whether there are regions and time aggregation scales on which precipitation estimates may be considered as an alternative and/or complementary information source for poorly gauged basins.

Study area
The western coast of South America is a region with contrasting landscapes and a rather complex orography. Near the equator the coastal area of Ecuador is characterized by a high precipitation regimen and supports dense vegetation down to the shore. However, at the southern margin and along the northern Peruvian littoral, the coast is almost devoid of vegetation. The PAEP region (ca. 100 800 km 2 ) is located along the N-S axis between 0-6 • S and drains the westernmost slope of the Andes (Fig. 1a). The various steep Andean ridges down to the coast together with the Ecuador-Peru "Cordillera costanera" shapes thirteen Pacific-Andean valleys from north to south: Chone (1), Portoviejo (2), Guayas (3), Taura (4), Cañar (5), Naranjal-Pagua (6), Jubones (7), Santa Rosa (8), Arenillas (9), Zarumilla (10), Puyango-Tumbes (11), Catamayo-Chira (12), and Piura (13) (Fig. 1b) each one with particular geomorphological and climatological features. The proximity of the Andes to the coastal line is the main influence on the basin's relief and climatology. Short and steep basins, i.e., Puyango (10), descend from nearly 4000 m of altitude in less than 240 km of river length. On the other hand, large basins host the largest plains and lowland valleys in the Ecuadorian littoral with roughly 70 % of its area below an elevation of 200 m. The Guayas (3), which is one the most important fluvial systems on the western coast of South America, is such a basin.

Climate
The coastal region of Ecuador has a seasonal rainfall distribution characterized by a single rainy period, with 75-90 % of the rainfall occurring between December and May. Overall, in the PAEP region the rainy season starts around late November and ends in June, with a peak between February and March. Over the humid Andean foothills in the coastal plains a 2-3 month dry period separates the rainy seasons. On top of this seasonal rainfall pattern, the distribution of precipitation is affected by the seasonal latitudinal migration of the Inter-Tropical Convergence Zone (ITCZ) and eastern tropical Pacific sea surface temperature (SST) variations. The northern-southern seasonal ITCZ displacement and SST variations bring to the area air masses of different humidity and temperature. When the ITCZ and the equatorial front are in their southernmost position near the equator, Ecuador's coastal regions are under the influence of warm moist air masses, originating from the northwest, bringing significant rainfall and rising air temperatures. The latter mainly defines the rainy season. Inversely, the northernmost ITCZ displacement and the equatorial front result in the presence of cooler and dryer air masses descending from upwelling regions in the southwest, influencing the dry season (Rossel and Cadier, 2009).
The most important feature of the rainfall variability in the PAEP region is the occurrence of interannual anomalies as related to the large-scale circulation phenomena such as El Niño-Southern Oscillation (ENSO). The PAEP region is bounded by the limit of the strong ENSO influence defined by Rossel et al. (1999) as the region where the increase in mean annual precipitation is greater than 40 %. Therefore, in ENSO years abrupt changes in the mean annual rainfall conditions are considerable, with a coefficient of variation reaching 0.40 (Rossel and Cadier, 2009). Such increase is not uniform region wide; there are important regional differences in heavy rainfall formation during El Niño (EN) events  and the EN influence on rainfall variability may change substantially in short distances in the same Pacific-Andean hydrological unit (Pineda et al., 2013). Furthermore, since 2000 an atypical meteorological response to EN and La Niña (LN) conditions has been reported over the coastal plains and the western Andean highlands (Bendix et al., 2011). All this results in a very complex spatiotemporal distribution of rainfall patterns during ENSO and non-ENSO years. These considerations are of paramount interest when dealing with data quality control of unevenly distributed rain gauges in the PAEP region.

Rain gauge data
A ground precipitation network of 131 rain gauges with daily data (∼ 1964-2010) in the PAEP region was provided by the Ecuadorian and Peruvian meteorological and hydrological services, INAMHI and SENAMHI, respectively (Fig. 1b). Records with gaps higher than 20 % were excluded, resulting in 107 locations with long-term daily rainfall time series.
In a first step, a regionalization analysis was conducted on the long-term records to group spatially homogeneous stations. A station was considered as spatially homogenous if it showed proportionality in the cumulative monthly volumes with regards to a control station in the same sub-catchment. The most reliable records were identified by selecting records with no changes in location and instrument type and then set as control stations for a double mass analysis (Wilson, 1983). In the double mass analysis, the hierarchical criteria to check proportionality between the control and the candidate station involves: (i) neighboring, (ii) similarity in altitude, and (iii) exposure to the same meso/synoptic climatological feature (e.g., ENSO).
Next, the temporal homogeneity of each record was checked against error measurements. A record was considered as temporally homogenous if the record showed no step changes (shifts in the means) or if the detected step changes were attributed only to climatic processes. The Rbased RHtests_dlyPrcp software package, developed by the Climate Research Division of the Meteorological Service of Canada and which is available from the Expert Team on Climate Change Detection, Monitoring and Indices (ETC-CDMI) website (Wang and Feng, 2012), was used to identify multiple step changes at documented or undocumented change points. It is based on the integration of a Box-Cox power transformation into a common trend two-phase regression model suitable for non-Gaussian series such as non-zero daily precipitation (Wang et al., 2010). Documented changes (EN driven) are referred as those defined by Rossel and Cadier (2009) and are the sequence of at least 3 consecutive months where the monthly SST anomalies are above 23 • C and exhibit a positive anomaly equal to or greater than 1 • C. Such events occurred in the years 1965, 1972-1973, 1976, 1982-1983, 1987, 1992, and 1997-1998. For LN-driven changes, the year 2008 was also considered. Nonhomogeneous periods were considered as modifications in the field during data collection and set as Not Available (NA) and then retested to verify whether they are homogeneous with the disregarded period(s).

Gridded rainfall data set
In this study we compare basin station-gridded precipitation fields against basin-averaged precipitation products. Rather than rescaling the products to an arbitrary resolution, the products were evaluated at the sub-catchment scale identified during the regionalization analysis. Namely, instead of a punctual comparison, spatial averages were calculated for the precipitation products using the proportional coverage of each grid cell. The analysis was performed for the 1998-2008 11 year period. This period was chosen as common between the TMPA products and the WRF retrospective simulation. All data-quality-checked records were interpolated to obtain spatial averages in each sub-catchment, except the few whose data is available through the Global Telecommunication System (GTS). Data from these stations may have been used for adjusting TRMM estimates. Three GTS stations were identified in our data set and excluded. The locations of the GTS stations (five) are shown in Fig. 1b.
Using the kriging approach for spatial interpolation of daily rainfall over complex terrains, the incorporation of correlation with topography/altitude has been suggested to improve performance; see Buytaert et al. (2006) for highlands ∼ 3500 m a.s.l. and Cedeño and Cornejo (2008) for the coastal region below 1350 m a.s.l. in Ecuador. Also, in a climatological study for Ecuador and north Peru, Bendix and Bendix (1998) showed that the inclusion of the altitude increases the performance of kriging significantly.
In parallel, several interpolation techniques of increasing complexity have been developed and evaluated using the gstat R package (Edzer Pebesma, 2011). Inverse distance weighting (IDW) and ordinary kriging (OK) are fairly similar; both take into account the distance between stations, but OK has a more complex formulation and therefore is expected to be more accurate. Linear regression (LR) is supposed to perform similar to kriging with external drift (KED) since they both implement regression with altitude. KED is, however, more accurate at accounting for kriging of residuals, which means that distance between stations influences interpolation as well. KED was applied on daily basis, the variogram analysis was performed at each time step to determine the spatial variability function of precipitation, and then parameters were estimated from regression residuals for each time step (zero values were included in the semi-variogram fitting). To discern among different interpolation techniques, Li and Heap (2008) recommend assessing the performance by cross-validation methods.
A key issue in this study is whether the change of spatial support provides a sound reference for comparison with TMPA and WRF products. In general, errors and uncertainty in a gridded data set arise from many sources, including errors in the different steps of the data supply chain (measurements, collection, homogeneity) and in the interpolation technique. It would be ideal to split and quantify all of them. This is, however, not possible without the possibility of tracking them back. Kriging provides a measure of the expected mean value and its variance at an interpolated point. Several climate studies have used the kriging variance as a proxy of uncertainty. However, it is acknowledged that kriging variance is not a true estimate of uncertainty (Yamamoto, 2000 andHaylock et al., 2008). A solution would be to perform an ensemble of stochastic simulations from which uncertainty can be estimated at the expense of highly computational resources. Such detailed analysis is out of the scope of this work.
We therefore adopted the alternative method by Yamamoto (2000) for assessing kriging uncertainty using just the data provided by a single realization. We quantify the total residual variance and split it up in its main contributing residual variance sources (input (data) and kriging interpolation (geostatistical model)) based on a variance decomposition technique (Willems, 2008(Willems, , 2012 in order to estimate the fraction of each contributing source. The total residual variance is assessed based on statistical analysis of the residuals between each precipitation product (Y PP ) and KED estimates (Y KED ). The underlying assumption of the variance decomposition is that the (causes of the) errors on the Y PP and Y KED precipitation estimates are highly different, hence that they can be assumed independent. The residuals are converted into homoscedastic residuals by means of a Box-Cox (BC) transformation (Box and Cox, 1964). After this conversion, the total Y PP residual variance (S 2 BC(Y PP,Residual) ) is decomposed into the precipitation product error variance, hereafter called model error variance (S 2 BC(Y PP,Model) ), and the KED error variance (S 2 BC(KED) ) (Eq. 1). The KED uncertainty is evaluated using just the random field provided by a single realization with prescribed parameters (i.e., mean structure, residual variogram) (Yamamoto, 2000). We estimate the total (Y PP ) residual variance at every tile ( PP-KED ). By subtracting the KED error variance from the total residual variance of Y PP based on Eq. (1), we obtain indirect estimates of the model error variance and map its spatial distribution.

TMPA TRMM 3B42 products
The TMPA 3B42 V7 and its predecessor version V6 are used in this study. The TMPA 3B42 V6 consists of hourly rainfall rates (mm h −1 ) at surface level with a global coverage between 50 • N and S since 1998. This method combined precipitation estimates of four passive microwave (PMW) sensors, namely TRMM Microwave Imager (TMI); Special Sensor Microwave/Imager (SSM/I) F13, F14, and F15; Advanced Microwave Scanning Radiometer-EOS (AMSR-E); and Advanced Microwave Sounding Unit-B (AMSU-B). The TMPA V6 algorithm is described in Huffman et al. (2007). The improved version, the 3B42 V7, includes consistently reprocessed versions for the data sources used in 3B42 V6 and introduces additional data sets, including the Special Sensor Microwave Imager/Sounder (SSMIS) F16-17 and Microwave Humidity Sounder (MHS) (N18 and N19), the Meteorological Operational satellite programme (MetOp) and the 0.07 • Grisat-B1 infrared data. The changes in the V7 algorithm at various processing levels are described in Huffman et al. (2010) and Huffman and Bolvin (2014). It is useful to review some of the efforts in validating TMPA V6 and/or comparing V6 and V7 at low and high altitudes in the tropical Pacific because it has a bearing on the choice of the satellite products used in our study. While evaluating several precipitation products, Dinku et al. (2010) reported that V6 outperforms other satellite products (i.e., CMORPH) at 10-day accumulation over the dry northern Colombian littoral. The converse was found over the wet western Pacific coast where CMORPH was slightly better especially at daily scale. In an evaluation of V7 daily rainfall estimates to analyze tropical cyclone rainfall, Cheng et al. (2013) found improved skill scores over coastal and island sites in the tropical Pacific. Also, Zulkafli et al. (2014) reported that the improvement of V7 against V6 is a reduction of the bias, especially in the Peruvian Pacific lowlands. To assess whether such improvements are seen in the PAEP region, we use both TMPA versions. TMPA 3B42 V6 and 3B42 V7 precipitation estimates having 3-hour, 0.25×0.25 degrees resolution were aggregated to daily data for the 11 year period.

WRF retrospective simulation
The Scientific Modeling Center of Venezuela (CMC) and the National Institute of Hydrology and Meteorology from Ecuador (INAMHI) developed a North Western South America Retrospective simulation. The data set, called OA-NOSA30, is available online at the International Research Institute for Climate and Society (IRI) web page . The simulation provides numerous climate variables with a 30 km spatial and 6 h temporal resolution for the period January 1996 to December 2008 and a global coverage between 11 • S to 17 • N and 98 • W to 50 • E. The accumulated precipitation was extracted on a daily basis for the 11 year common period.
OA-NOSA30 is the simulation result from the weather research and forecasting (WRF) model, a regional climate model (RCM) herein used to downscale the meteorological data from the NCEP/NCAR Reanalysis Project (NNRP or R1, details at Kistler et al., 2001). This project is defined as the combination of global climate model outputs and observations. The WRF configuration for the microphysics parameterization, governing the outputs, was applied.  explained that the microphysics were modeled by the Kessler scheme (RRTM), the Dudhia schemes were used for the modeling of the long-wave and shortwave radiation, respectively; the Monin-Obukhov (Janjic) scheme for modeling of the surface-layer; and the thermal diffusion with five soil levels for modeling the land-surface physics. Finally the Mellor-Yamada-Janjic turbulent kinetic energy scheme was applied for describing the boundary-layer option, in which the SST update option was selected.

Rainfall products evaluation
Bias, root mean square error (RMSE), and Pearson's correlation (γ xy ) were applied to analyze the accuracy of the TMPA and OA-NOSA30 estimates comparing them with rain-gauge-interpolated estimates at sub-catchment scale (Eqs. 1 to 3). RMSE includes both systematic (bias) and nonsystematic (random) errors.
where P pp is the precipitation products value, P gauge the interpolation estimate from rain gauge values, and n the number of observations. Additionally, skill scores were calculated to quantify the products accuracy in detecting daily accumulation at different precipitation thresholds and they were calculated based on average sub-catchment precipitation. The probability of detection (POD) gives the fraction of rain occurrences that were correctly detected; it ranges from 0 to a perfect score of 1. The equitable threat score (ETS) measures the fraction of observed and/or detected rain that was correctly detected, and it adjusted for the number of hits that could be expected due purely to random chance. A perfect score for the ETS is 1. The frequency bias index (FBI) is the ratio of the number of estimated to observed rain events; it can indicate whether there is a tendency to underestimate or overestimate rainy events. It ranges from 0 to infinity with a perfect score of 1. The false alarm rate (FAR) measures the fraction of rain detections that were actually false alarms. It ranges from 0 to 1 with a perfect score of 0 (Su et al., 2008).
The ETS is commonly used as an overall skill measurement by the numerical weather prediction community, whereas the FBI, FAR, and POD provide complementary information about bias, false alarms, and misses. To evaluate the performance of the products for light and heavy precipitation events, they were calculated for each sub-catchment and for several thresholds: 0.1, 0.5, 1, 2, 5, 10, and 20 mm day −1 (Schaefer, 1990;Su et al., 2008).
Seasonality accuracy at the sub-catchment level was evaluated comparing precipitation estimates against interpolated average monthly rainfall depths. Furthermore, in order to evaluate precipitation products on increasing time scales, daily, weekly, 15-day, and monthly estimates were accumulated deriving Pearson's correlation (Eq. 3) and relative bias. The relative bias was calculated for daily/weekly/15day/monthly time aggregations by normalizing the bias (Eq. 1) in order to compare different time resolutions. Finally, annual mean precipitation was calculated for interpolated rain gauges and precipitation products and depicted spatially.

Data quality verification, interpolation and uncertainty
The double mass analysis discriminated 21 sub-catchments within which rainfall is spatially correlated. The proportionality is strong in the coastal areas where the altitude range is narrow but is less marked at higher altitudes. Four stations do not have significant correlation with any other station, and the sub-catchments in which they are situated were ranked as independent.
The temporal homogeneity check for each station reported several change points, with a statistical significance of 5 %. However, most of them were attributed to EN regional variations and therefore rejected as artificial change points. Besides the documented changes, several change points appeared repeatedly in nearby locations. They were interpreted as a common modification in the local climate and therefore disregarded as change points. Despite these considerations, non-homogeneous periods significant at 5 % were found in 30 stations. Those periods were discarded and the stations tested again for homogeneity. Nine stations did not pass the test. Therefore they were no longer taken into account, resulting in a quality checked set of 98 time series. From this data set, the 11-year period, January 1998 to December 2008, was taken for the comparison between OA-NOSA30 and the TMPA estimates, and rain gauge precipitation data. The 98 homogeneous stations together with the 21 homogenous subcatchments are shown in Fig. 1b. The area and the density of the rain gauge stations per sub-catchment are listed in Table 1. The highest density is found in Quiroz, Upper Guayas, Alamor, and Chipillico and the lowest in Naranjal-Pagua, Lower Guayas, Piura, and Tumbes. Table 2 reports the mean cross-validation results of the four investigated techniques for gridding daily precipitation in the period 1998-2008. Correlation for KED (0.49) is twice the value than for IDW, LR, and OK techniques (0.26, 0.28, and 0.21, respectively). Not only is its mean higher but correlation on almost every day was better than for any other technique. The mean square error (MSE) for KED is less than for    Figure 2a shows that the OA-NOSA30 estimates are subject to the largest model residual variance, which strongly correlates with the high topographic precipitation gradients as seen over the inner-mountain foothills (i.e., Upper Guayas (5), Cañar (7) and Jubones (9)), and to a lesser extent over the moderate slopes of the Cordillera Costanera (i.e., Chone (1)). The KED uncertainty has the highest contribution to the total residual variance in these regions whereas in the remaining stations the contribution of the KED uncertainty is more or less proportional to the total residual variance. In the comparison of TMPAV6-V7 (Fig. 2b and c) with KED estimates the spatial trends are less evident. Correlation with elevation still takes place in the V6 analysis but the largest total residual variance does not show clear distinction between middle (∼ 500 m a.s.l.) and high altitudes (∼ 3000 m a.s.l.). For the V7 analysis, the uncertainty mapping shows a more scattered distribution with almost no spatial trends. In both the V6 and V7 cases, the KED contribution to the total uncertainty remains slightly larger than the precipitation product error variance. All results together suggest that when comparing precipitation products against KED estimates, the TMPAV7based product, in the first place, followed by the V6 product, offers the best precipitation estimates since the precip- itation uncertainty is less affected by the topographic setting that provides the basis for our proposed gridded data set. The largest errors are encountered in the comparison between OA-NOSA30 and KED estimates at high altitudes. This has implications for our catchment-averaged analysis. These limitations are relevant for the results presented in the following sections. Figure 3a, b, c show the bias, RMSE, and Pearson's correlation between precipitation products and daily KED estimates accumulated over each sub-catchment unit and ranked from N-S within the period 1998-2008. These statistics reveal a strong spatial variation; for 3B42 V6 and OA-NOSA30, bias and RMSE decrease from north to south while correlation increases, whereas for TMPA V7 significant bias reduction and increase in correlation seems sub-catchment and precipitation regimen dependent.

Daily verification
TMPA V7 and V6 overestimate precipitation in all sub-catchments, with an average range between 0 to ∼ 2 mm day −1 . Conversely, OA-NOSA30 underestimates precipitation, except in Quiroz (17) and Chipillico (19), the range of over-/underestimation is within ∼ 0.5 to −1.5 mm day −1 (Fig. 3a). The RMSE ranges from 4 to 9 mm day −1 for both TMPA estimates. The RMSE gives more weight to the extremes because residuals are squared and they are typically higher for precipitation extremes. Given that, particularly for TMPA V6, the bias is very high in wet seasons, RMSE values are higher for TMPA V6 estimates than for OA-NOSA30 (Fig. 3b).   Figure 3c shows that the Pearson correlation is very similar between TMPA V6 and OA-NOSA30, oscillating between 0.3 and 0.6 except in Arenillas (11) where OA-NOSA30's detection fails. In the northern region the highest correlation (0.5) is found at Lower/Middle Guayas (3)/(4) and the rest of the northern sub-catchments record correlations ∼ 0.3. In the central region, average correlation is about 0.35. In the southern region, correlation consistently rises to 0.5 in a large area (Catamayo-Chira and Piura catchments). TMPA V7 shows a very modest region-wide improvement over TMPA V6 only with a notable correlation increase in Chone (1), Upper Guayas (5), Taura (6), Jubones (9), and Zarumilla (12).
OA-NOSA30 presents almost no region-wide bias for precipitation rates less than 1 mm day −1 . For the southern sub-catchment: Alamor (15), Macará (16), Quiroz (17), Chira (18) and Piura (21) this is up to 10 mm day −1 ; over such a threshold precipitation is systematically underestimated. TMPA V7 and V6 overestimate precipitation amounts smaller than 10 mm day −1 in sub-catchments in the central and southern regions. For lowland areas in the north, this threshold changes to 20 mm day −1 . As well as for OA-NOSA30, precipitations over 20 mm day −1 are systematically underestimated. Figure 4a-c show categorical scores POD, ETS, FBI and FAR for representative sub-catchments distributed in the northern, central and southern region corresponding to the TMPA V7, V6, and OA-NOSA30 estimates. The four subcatchments shown in Fig. 3 were chosen as representative according to their location and dominant precipitation regime. In the humid northern part, Chone (1), a coastal and ocean-exposed sub-catchment, and Middle Guayas, (4) in the inner core and greatly influenced by the continental climate divide, were selected. In the central region, Jubones (9) with a pronounced leeward effect and Chira (18), in the southern arid coast, were considered. Their indexes lead to conclusions which can also describe the situation of the surrounding sub-catchments in each region. The difference between scores of TMPA V7 (Fig. 4a) and V6 (Fig. 4b) is almost undistinguished, both estimates shows a POD value of 0.6, on average, for precipitation rates less than 5 mm day −1 . It gradually decreases to ∼ 0.2 when the threshold is higher than 20 mm day −1 . A close inspection reveals a marginal improvement of V7 over V6 only evident in Middle Guayas (4) at higher thresholds. Equitable threat scores, for precipitation estimates equal or lower than 5 mm day −1 , are on average 0.25. Equitable threat scores, a summary score that penalizes for hits that could occur due to randomness, can be used to compare performance across regimes. A slight improvement of V7 across all threshold is restricted to Chone (1). The false alarm rate and FBI increase with higher thresholds. This means that overestimation exists over 1 or 2 mm day −1 and false alarms are then also present. In general, TMPA products detect amounts of precipitation higher than 5 mm day −1 but it overestimates them, while amounts of precipitation less than 2 mm day −1 are detected with a low fraction of FAR, although bias is present. TMPA scores are better in the southern region, Chira (1). Figure 4c shows the same categorical scores for OA-NOSA30. In all sub-catchments, POD decreases when the threshold increases, indicating that the NWP estimates small precipitation events better. POD decreases abruptly to 0 when considering thresholds of 5 and 10 mm day −1 thresholds. The behavior of ETS scores is the same as for POD but the average scores are half the amount of POD. For small amounts of precipitation, i.e., less than 3 mm day −1 , OA-NOSA30's POD scores are around 0.6 while ETS scores are 0.3. The FBI plot shows underestimation. False alarms increase with higher thresholds with FAR values typically in the range 0.2 to 0.5. There are no FAR values given for thresholds over 5-10 mm day −1 since the POD of OA-NOSA30 is zero for those precipitation depths. Spatially, POD and ETS show a better probability of detection in the southern region and FBI shows lower bias in that region compared to the northern and central regions; however FAR is lower in the northern region Middle Guayas (4).

Monthly verification
Although Fig. 5a-c show the mean monthly precipitation within the period 1998-2008 for KED estimates against TMPA V7, V6, and OA-NOSA30 for the four selected subcatchments, the analysis below corresponds to all 21 subcatchments. In general, Figure 5c reveals that the three approaches yield comparable results for the southern region, which includes the sub-catchments Alamor (15), Macará (16), Quiroz (17), Chira (18), and Chipillico (19). In most of the sub-catchments, all data sets depict well seasonality showing wet conditions within the period January-May. In the northern and central regions, during the wet season, TMPA V7 and V6 overestimate precipitation while OA-NOSA30 underestimates (Fig. 5a, b). The pattern of overand underestimation is not that clear in all data sets during the dry season. Maussion et al. (2011) showed that the WRF and TRMM well estimated the precipitation distribution, but depths and positions of maxima do not match. Additionally, they showed that WRF usually predicts more rainfall over larger areas; notwithstanding, WRF may better reflect reality than TRMM.
The density of rain gauges in the Catamayo-Chira catchment is higher and also the quality of data is better (fewer missing gaps and change points). This might indicate that KED estimates are better for this area. However, in most of the southern region TMPA and OA-NOSA30 estimates are similar to KED estimates, even in the high altitude subcatchment, i.e., Quiroz (17), which is not the case for the rest of the sub-catchments. Also, there are other sub-catchments such as Catamayo (14) and Upper Guayas (5) where the precipitation estimates are neither similar between them nor to KED estimates, despite the high quality of data. Thus, KED estimates prove to be good references and the dependence of the interpolation technique on the rain gauge density (Table 1) as well as the error seen at high altitudes when comparing OA-NOS30 and KED is not substantially affecting the analysis. This is a very important issue given that the density of rain gauges is relatively low and building up a gridded rainfall data set that is the least influenced by this fact is crucial. Notice that the success of KED technique may differ for areas with lower gauge densities, which was not tested in this study. TMPA's overestimation occurs for any precipitation amount when aggregated per month (Fig. 5), unlike daily aggregation where over-underestimation occurs according to the amount of precipitation (see FBI scores in the Fig. 4a and b).

Verification on multi-temporal resolutions
The Pearson correlation (Fig. 6a) and bias (Fig. 6b) were calculated on daily, weekly, 15-day, and monthly time scales for TMPAV7, V6, and OA-NOSA30. In general, correlation increases with time scale, and is higher for monthly than 15day and weekly time aggregated periods. Bias seems to accumulate when time aggregation increases, as found for WRF in other regions (Cheng and Steenburgh, 2005;Ruiz et al., 2010). The purpose of finding the relative bias in the estimates is to quantify respectively the over-/underestimation of the precipitation depth. The relative bias is consistent with the correlation coefficient, decreasing as the time aggregation increases. Although the daily bias is high in Jubones (9) (∼ 1000 % for V7 and ∼ 1200 % for V6) and in Middle Guayas (4) (higher for V7 than V6), on a weekly-to-monthly scale the bias percentage decreases. The worst performance of both TMPA estimates was found in Jubones, where correlation is lowest and bias percentage is highest. For OA-NOSA30, that is the case for Chone (1) and Jubones (9). The results found for TMPA, i.e., that correlation increases and bias reduces as time aggregation increases, are in agreement with previous studies (Scheel et al., 2011;Habib et al., 2009;among others). Aggregation of the mean annual rainfall was performed to compare the spatial performance of the three approaches (OA-NOSA30, TMPAV6, and V7) against KED estimates in the study area (Fig. 7). Comparison shows that the TMPA estimates are closer to the spatial pattern of the mean annual rainfall, though mean annual rainfall in the north and southeast are overestimated. OA-NOSA30 presents a huge underestimation and does not reflect spatial variability, except over the southern region. Over the latter region, OA-NOSA30 bias is small enough to represent a spatial pattern approaching the one based on TMPA estimates.

Discussion
Our analysis shows that both TMPA products overestimate precipitation in the 21 sub-catchments of the heterogeneous PAEP region. Key challenges in the estimation of precipitation from satellite estimates arise from the processing scheme for microwave (MW) and IR data. The problem with IR data processing is that global algorithms do not consider the altitude of the hydrometeor. Dinku et al. (2011) suggest that overestimation over dry areas may be attributed to sub-cloud evaporation. While this mechanism may have implications on the overestimation of TMPA onshore the coastal plain, especially in the arid Peruvian littoral where a dry low atmosphere is common year-round; the attribution of TMPA overestimation to sub-cloud evaporation on the middle/high altitude sub-catchments is inconclusive.  showed that, over the Ecuadorian territory and surroundings, average cloud-top height increases from W-E, showing more stratiform cloud dynamics in the Pacific area and the coastal plains, and that the western Andes is a true division for Pacific influence. These authors describe the seasonal spatial pattern of cloud-top height distribution within December-May (wet season), possessing a well-defined blocking height (∼ 4.5 < 5.0 km) between 0-3 • S, but less marked southward. Given that IR data processing scheme infers precipitation from the IR brightness temperature at the cloud top (implicitly cloud height) it would be expected that overestimation follows the same spatial pattern. However, our analysis showed that even though TMPA overestimation matches the increasing W-E cloud-top gradient it does not explain the large overestimation in the northern bottom valleys (i.e., Lower Guayas and Chone catchment). The regional differences in cloud properties between the northern and southern catchments help to explain the differences in TMPA overestimation. Over the northern region ∼ 0 • (Quito-transect) , cloud frequency is substantially higher than the reduced cloudiness at ∼ 4 • S (Loja-transect). To illustrate these differences Fig. 8ac show cloud density patterns using anomalies of interpolated outgoing long-wave radiation (OLR) (Liebmann and Smith, 1996) as proxy for cloudiness (negative anomalies imply  increased cloudiness) during the rainy season within 1998-2008. During December-January (Fig. 8a) symmetrical patterns of cloudiness are observed over northern and southern sub-catchments, followed by increased cloudiness which concentrates over the northwestern edge during January-February (Fig. 8b). Then, cloudiness exhibits a marked north-southeast gradient in April-May (Fig. 8c). This suggests that in addition to the error introduced by the estimation of the cloud-top, the TMPA overestimation for the northern catchments may also be influenced by frequent occurrence of low stratiform clouds (typical on the coastal area) which under stable conditions are detached from precipitation patterns . This high density of non-rain producing clouds would affect the IR data retrieval resulting in overestimation.
The largest deficiencies of TMPA estimates are encountered in separating the windward/leeward effect of the Andean ridges on orographic rainfall which is particularly witnessed in Jubones where the leeward effect is dominant. West of the climate divide there is no typical precipitation gradient. Through blocking at the ridges and through reevaporation, rainfall of any origin more frequently affects higher elevations than valley floors (Emck, 2007).
TMPA V7 and V6 estimates show different region-wide skills on a daily basis but they yield comparable results particularly in the southern region (3.6-6 • S) in weekly to monthly time aggregations. TMPA V7 shows localized skill that is higher than V6 on short-steep coastal and oceanexposed sub-catchments but similar or lower skills on large inland basins. The improvement is seen in the detection capacity of light orographic precipitation on coastal oceanexposed sub-catchments, where the spatial sampling seems to capture small precipitation gradients. Over coastal areas, the orographic enhancement is a small spatial-scale event (Minder et al., 2008, Cheng et al., 2013. In the innermost sub-catchments where gradients of annual precipitation may reach 700 mm/100 m at 3400 m a.s.l. (Emck, 2007) the temporal sampling of V7 cannot capture the rapid evolution of orographic rainfall and the overestimation is similar to that of the V6 version. Notice that inland the total residual variance and the KED uncertainty increase with elevation (especially for V6). This could influence the apparent decrease of the V6 performance seen for the innermost sub-catchments. This is, however, restricted to very few sub-catchments where the spatial average is dominated by the weight of high altitude stations.
OA-NOSA30 product only shows reasonable skills in the southern region (3.6-6 • S) where amount and occurrence are relatively well represented. The greatest NWP limitations are encountered in representing the fast enhancement of rain rates due to the effect of the coastal mountains as a first barrier for moisture transport in short-steep coastal subcatchments (3-3.6 • S). The nearly null NWP detection capability is likely related to the unique rainfall rates that occur on the ocean facing foothills of the Cordillera Costanera. Unlike in most tropical mountains where convective rainfall dominates, in southeast Ecuador vigorous advection shapes a monotonic increasing precipitation gradient with altitude. In the core of the southern region, Emck (2007) reported that rainfall originates from an equal balance of advective-topographic (light) and convective (heavier) genesis. Such a characteristic, over the southern region, suggests that the NWP parameterization for OA-NOSA30 is particularly suited to solve for this type of precipitation. For the northern regions, which are more affected by the annual movement of the ITCZ, the influence of the continental climate divide and the occurrence of more stratiform cloud, deep convection (likely the dominant mechanism) is not emulated by the NWP model. A complete description of the errors in the NWP implementation is out of the scope of this study; we therefore only highlight some of the major sources. The lateral boundary conditions (reanalysis data set) have presumably a major role on the degradation of WRF product quality. The poor representation of the Andes in the reanalysis model has been shown to contribute to a modest simulation of meteorological fields such as wind (Schafer et al., 2003). Maussion et al. (2011) found that some undesired numerical effects and, eventually, inadequate input data can affect the operational output of the WRF model, in particular for extreme events; that is probably by overstressing certain physical processes. Jankov et al. (2005) found that the greatest variability in rainfall estimates from the WRF model originates from changes in the choice of the convective scheme, although notable impacts were observed from changes in the microphysics and planetary boundary layer (PBL) schemes. However, Ruiz et al. (2010) found that rainfall estimates only vary slightly among different configurations, but biases increase with time aggregation. Those findings agree with previous studies (Blázquez and Nuñez, 2009;Pessacg, 2008) and suggest that there is a common deficiency in the convective schemes used for this model.

Conclusions
In general, TRMM V7, V6, and OA-NOSA30 estimates capture the most prominent seasonal features of precipitation in the study area. Quantitatively, only the southern subcatchments of Ecuador and northern Peru are well estimated by both satellite and NWP estimates. There is low accuracy of both approaches in the northern and central regions where TMPA V7 and V6 overestimate precipitation while OA-NOSA30 systematically underestimates. The improvement of V7 over V6 is not evident region wide. It appears that V7 detects better light precipitation rates on coastal and ocean-exposed basins. Inland the differences of the two versions of TRMM 3B42 are almost unnoticeable. The separation of the windward/leeward Andean effect on orographic precipitation appears the main challenge for TMPA algorithms. It was found that the detection probability is better for small rainfall depths (less than 5 mm day −1 ) than for high amounts of precipitation. OA-NOSA30 showed the best skills in detecting a balanced advective/convective regime of precipitation in the southern region.
Analysis of daily, weekly, 15-day, and monthly time series revealed that the correlation with station observations increases and bias decreases with the time aggregation. Differences are considerably larger for daily than weekly aggregation. The correlation and bias values are similar in the northern and southern region but in the central region correlation is smaller and bias is higher for all time aggregations. TMPA V7, V6, and OA-NOSA30 are able to capture relatively well the spatial pattern in the southern region of the study area, but the performance of both approaches reduces in the northern and central region. In general the two TMPA versions perform better than OA-NOSA30.
In view of hydrological and water resources management applications, it has been demonstrated that the potential intake of both satellite and NWP estimates in the PAEP region differs among catchments and precipitation regimes. Our analysis has shown that both approaches capture the mean spatial and temporal features of precipitation at weekly to monthly accumulations over a particular region of southern Ecuador and northern Peru. These findings are relevant for these poorly gauged regions where there is growing pool of modeling work that relies on the use of satellite-based rainfall estimates as forcing data. Also dynamical weather prediction becomes more frequently applied, but this prediction is still in an experimental stage. However, for operational applications such as flood warnings, which demand high temporal resolution rainfall data, accurate depth and storm location estimates are mandatory. The usefulness of both estimates is less promising.