Remapping annual precipitation in mountainous area based on vegetation pattern: a case study in the Nu River basin

Accurate high-resolution estimates of precipitation are vital to improving the understanding of basin-scale hydrology in mountainous areas. Traditional interpolation methods and satellite-based remote sensing products are known to have limitations in capturing the spatial variability of precipitation in mountainous areas. In this study, we develop a fusion framework to improve annual precipitation estimation in mountainous areas by jointly utilizing satellite-based precipitation, gauge-measured precipitation, and a vegetation index. The development consists of vegetation data merging, vegetation response establishment, and precipitation remapping. The framework is then applied to the mountainous area of the Nu River basin for precipitation estimation. The results demonstrate the reliability of the framework in reproducing the high-resolution precipitation regime and capturing its high spatial variability in the Nu River basin. In addition, the framework can significantly reduce the errors in precipitation estimates as compared with the inverse distance weighted (IDW) method and the TRMM (Tropical Rainfall Measuring Mission) precipitation product.


Introduction
Precipitation plays an important role in hydrological processes, land-atmospheric processes, and ecological dynamics. Accurate high-resolution precipitation is crucial for streamflow prediction, flood control, and water resources management in data-sparse regions such as mountainous areas (Song et al., 2016). However, it is a great challenge to obtain accurate precipitation in mountainous areas due to the sparse gauge network and the remarkable spatiotemporal variability of precipitation. Conventional gauge networks can provide accurate rainfall measurements at point scales, which can be interpolated within the region of interest to give estimates of precipitation in ungauged areas. However, such interpolated estimates might not be reliable in mountainous areas considering the very limited gauges there (Phillips et al., 1992; Mair and Fares, 2011; Jacquin and Soto-Sandoval, 2013; Wang et al., 2014; Borges et al., 2016).
Precipitation estimates can be influenced by a variety of ambient factors (e.g., topography, vegetation). To correct for the effects of topography on precipitation estimates, digital elevation models (DEMs) have been widely used in the spatial interpolation of precipitation over mountainous areas (Marquínez et al., 2003; Lloyd, 2005). However, the relationship between elevation and precipitation is not clear. Meanwhile, strong correlations between the normalized difference vegetation index (NDVI) and precipitation have been found by several studies (Li et al., 2002; Kariyeva and Van Leeuwen, 2011; Li and Guo, 2012; Sun et al., 2013; Campo-Bescós et al., 2013). As such, establishing statistical models between the NDVI and precipitation so as to improve the spatial resolution of TRMM products in mountainous areas is becoming popular (Immerzeel et al., 2009; Jia et al., 2011; Duan and Bastiaanssen, 2013; Chen et al., 2014; Xu et al., 2015; Mahmud et al., 2015; Jing et al., 2016). For instance, Immerzeel et al. (2009) downscaled TRMM-3B43 to 1 km based on an exponential relationship between NDVI and TRMM precipitation on the Iberian Peninsula of Europe. Jia et al. (2011) established four multivariable linear regression models between TRMM-3B43 precipitation and two other factors (i.e., DEM and NDVI) at different resolutions (0.25, 0.5, 0.75, and 1.0°) to get 1 km estimates of precipitation in the Qaidam basin of China. Duan and Bastiaanssen (2013) used a nonlinear relationship between TRMM-3B43 and NDVI to downscale precipitation to 1 km in a humid area and a semi-arid area. Chen et al. (2014) established a spatially varying relationship between TRMM, NDVI, and DEM by using a local regression analysis approach known as geographically weighted regression (GWR) in South Korea. Xu et al. (2015) also used the GWR method to explore the spatial heterogeneity of the relationships between remote-sensing-based precipitation (RSBP) and NDVI and between RSBP and DEM over two mountainous areas in western China.
However, the present RSBP-NDVI-based schemes have several limitations: (1) significant errors can be introduced during the downscaling given the nonlinear relationship between RSBP and NDVI; (2) large uncertainties exist in the RSBP for mountainous areas; and (3) inter-comparison of existing NDVI datasets is missing in deriving the RSBP-NDVI relationships. In this study, we develop a fusion framework to obtain more accurate high-resolution estimates of precipitation in mountainous areas based on the relationship between precipitation and vegetation response. More specifically, in addition to RSBP, gauge measurements and different vegetation datasets will be used in this study to overcome the aforementioned limitations in current RSBP-NDVI-based schemes. The paper is organized as follows: Sect. 2 describes the development of the fusion framework; Sect. 3 documents the study area and related datasets; Sect. 4 presents the results of the fusion framework and discusses impacts of different determinants on the performance of the fusion framework; and Sect. 5 summarizes this work.

Vegetation data merging
Vegetation closely interacts with soil moisture and is recognized as a good proxy of precipitation. Remote sensing provides various high-resolution vegetation products such as NDVI, EVI (enhanced vegetation index), and LAI (leaf area index). Among the vegetation indices, NDVI, an indicator of plant density and growth, is chosen as the proxy of precipitation in this study due to its wide availability. Considering the crucial role of NDVI in deriving precipitation estimates under our framework, we conduct an inter-comparison of data accuracy between two NDVI datasets (termed datasets A and B hereinafter) to reduce the error. First, the systematic errors of both datasets are eliminated by multiplying by a reduction factor or by using a simple regression model. After the correction, the final dataset is obtained by selecting, for each element, the better value of A and B if the quality criteria are satisfied, and by filling in an anomaly value otherwise.
It should be noted that since the vegetation growth is suppressed or promoted on some land covers (e.g., rivers, lakes, snow and ice, and urban areas), the vegetation data of these land covers are excluded by filling anomaly values. Besides, due to the strong influence of farming activities (e.g., irrigation, fertilization, and harvest) on crop growth, vegetation data of farmland are excluded as well. We note that although Moran's index (Li et al., 2007) is widely employed to detect anomalies in vegetation data (Jia et al., 2011; Duan and Bastiaanssen, 2013), it is not used in this study because it is inapplicable in large areas with continuous anomaly pixels (e.g., farmland). As such, we identify anomaly pixels simply by land-use type: pixels categorized as water, wetland, urban, cropland, snow/ice, and barren will be identified as anomalies. The detected anomaly pixels are excluded from the original NDVI dataset and then filled with interpolated values using the IDW method so as to generate an optimized NDVI dataset.
Based on the optimized NDVI dataset, the NDVI data at the gauge locations are retrieved with the neighbor-average method (i.e., the value of a certain grid is determined as the average of all its eight neighboring grids) and will be used for the precipitation-vegetation regression.
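The neighbor-average retrieval described above can be illustrated with a minimal Python sketch. The function name, the NaN convention for anomaly pixels, and the handling of grid edges (using only the neighbors that exist) are assumptions made for illustration; the text only specifies that the value is the average of the eight neighboring grids.

```python
import math

def neighbor_average(grid, row, col):
    """Mean of the (up to) eight neighbours of grid[row][col].

    Assumptions for this sketch: anomaly pixels are stored as NaN and
    skipped, and neighbours falling outside the grid are ignored.
    """
    values = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # the centre pixel itself is not part of the average
            r, c = row + dr, col + dc
            if 0 <= r < len(grid) and 0 <= c < len(grid[0]):
                value = grid[r][c]
                if not math.isnan(value):
                    values.append(value)
    return sum(values) / len(values) if values else math.nan

# Hypothetical 3 x 3 NDVI window with a gauge located in the centre pixel.
ndvi = [
    [0.30, 0.40, 0.50],
    [0.40, 0.90, 0.60],
    [0.50, 0.60, 0.70],
]
gauge_ndvi = neighbor_average(ndvi, 1, 1)
```

Excluding the centre pixel means a single unrepresentative pixel under the gauge (here 0.90) does not dominate the value paired with the gauge record.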

Precipitation-vegetation regression
As far as we know, there is no widely accepted form of the precipitation-vegetation relationship. Therefore, the final regression form will be determined from several candidate relationships, including polynomial, exponential, logarithmic, and linear forms, according to five metrics: correlation coefficient (R), coefficient of determination (R²), root-mean-square error (E_RMS), mean relative error (E_MR), and mean absolute relative error (E_MAR), which are defined in Eq. (1), where Ō is the mean annual precipitation of all gauges, O_i the mean annual precipitation of gauge i, P_i the estimated precipitation at gauge i, and n the total number of gauges. Also, considering the annual variability of precipitation, the regression model is further determined for two temporal scales: (1) the entire period covering all the study years and (2) each individual year of the study period. The regression models for the entire study period and for individual years are termed RME and RMI, respectively. RME can utilize the full knowledge of precipitation characteristics of the entire study period, whereas RMI reflects the inter-annual variability. Besides, RME can reasonably reconstruct the precipitation series of the years when data gaps exist.
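As a worked illustration of the five metrics, the sketch below uses their commonly used definitions. These forms are an assumption: in particular, R² is computed here as one minus the ratio of the squared error to the variance of the observations, and the relative errors are normalized by the observed value at each gauge. The function name and the sample numbers are hypothetical.

```python
import math

def evaluation_metrics(obs, est):
    """R, R2, E_RMS, E_MR and E_MAR for observed (O_i) and estimated (P_i) values.

    Standard textbook definitions are assumed; they may differ in detail
    from the forms used in the study.
    """
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(est) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, est))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in est)
    sse = sum((p - o) ** 2 for o, p in zip(obs, est))
    return {
        "R": cov / math.sqrt(var_o * var_p),
        "R2": 1.0 - sse / var_o,
        "E_RMS": math.sqrt(sse / n),
        "E_MR": sum((p - o) / o for o, p in zip(obs, est)) / n,
        "E_MAR": sum(abs(p - o) / o for o, p in zip(obs, est)) / n,
    }

# Hypothetical annual precipitation (mm) at three gauges.
observed = [500.0, 1000.0, 1500.0]
estimated = [550.0, 1000.0, 1350.0]
metrics = evaluation_metrics(observed, estimated)
```

In this example the positive and negative relative errors cancel in E_MR but not in E_MAR, which is why the two are reported together.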
The calibration-validation procedure for each candidate model is conducted under three scenarios with different numbers of gauges and/or years:
Scenario a (fully random): a random number of gauges and a random number of years are independently used for calibration and validation;
Scenario b (all gauges, partial period): all the gauges are involved in both procedures, but only 2/3 of the years are randomly chosen for calibration, and the other years are used for validation;
Scenario c (partial gauges, entire period): all years are used, but only 1/3 of the gauges are randomly chosen for calibration, and the other gauges are used for validation.
For each scenario, the calibration-validation procedure will be performed for 100 samples determined based on the above criteria, and the five evaluation metrics (i.e., R, R², E_RMS, E_MR, and E_MAR) will be calculated for each sample accordingly. The best model is then determined based on the metrics.
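The sampling for Scenarios b and c can be sketched as follows. The function names, the fixed random seed, the rounding of the 2/3 and 1/3 fractions to whole numbers, and the 10-year list of years are assumptions for illustration only; the 59 gauges and the 100 samples per scenario follow the text.

```python
import random

def split_scenario_b(years, rng):
    """All gauges, partial period: 2/3 of the years calibrate, the rest validate."""
    n_cal = round(len(years) * 2 / 3)
    cal = set(rng.sample(years, n_cal))
    return sorted(cal), [y for y in years if y not in cal]

def split_scenario_c(gauges, rng):
    """Partial gauges, entire period: 1/3 of the gauges calibrate, the rest validate."""
    n_cal = round(len(gauges) / 3)
    cal = set(rng.sample(gauges, n_cal))
    return sorted(cal), [g for g in gauges if g not in cal]

rng = random.Random(0)            # fixed seed so the sketch is repeatable
years = list(range(2003, 2013))   # assumed 10-year study period
gauges = list(range(59))          # gauge indices 0..58

# 100 random calibration/validation splits per scenario.
samples_b = [split_scenario_b(years, rng) for _ in range(100)]
samples_c = [split_scenario_c(gauges, rng) for _ in range(100)]
```

Each split would then be passed to the regression fit (calibration part) and to the metric calculation (validation part).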

RSBP product remapping
With the optimized vegetation dataset and the precipitation-vegetation regression model, the RSBP product is then remapped over the study region. Thanks to the finer resolution of the NDVI dataset than the RSBP product and the accurate estimate of precipitation by gauges, the remapped RSBP product is expected to provide more detailed spatial characteristics of precipitation over mountainous areas.
Study area and datasets for framework application

Study area
The Nu-Salween basin (Fig. 2a), where 6 million people live, is one of the largest river basins in South Asia and spreads across three countries with an area of 324 000 km². This study focuses on the Chinese part of the Nu-Salween basin (termed the Nu River basin hereafter), where the elevation ranges from 446 to 6134 m and the narrowest part is only 24 km wide. The annual precipitation of the Nu River basin ranges from 400 to 2000 mm with an average of 900 mm, and the mean annual runoff is 69 km³. The precipitation of the Nu River basin generally decreases from southwest to northeast and demonstrates high variability due to mountain weather systems (e.g., the difference in annual precipitation between the mountaintop and valley of Gongshan is larger than 1000 mm). The seasonal distribution of rainfall also varies significantly across this region. Figure 2b shows the within-year rainfall distributions of seven stations located in the upstream, middle, and downstream parts of the Nu River basin. The upstream and downstream parts have similar rainfall distributions, with larger rainfall occurring in summer than in winter, while the middle part receives relatively large rainfall in winter and spring. Thanks to the adequate rainfall and minimal human perturbation, the Nu River basin has an extensive vegetation coverage, dominated by grassland on the Qinghai-Tibetan Plateau (upper basin) and by mixed forest in Yunnan Province (lower basin). However, the dense vegetation cover increases the difficulty of conducting precipitation observations, and only 13 gauges are very unevenly distributed over the whole basin of 142 479 km², which makes it highly challenging to obtain accurate spatial precipitation characteristics with traditional interpolation approaches. Although RSBP products are available for this area, they are too coarse (usually with a spatial resolution of ∼ 50 km) to capture the high spatial variability of precipitation.
Considering the limited number of gauges (i.e., 13) in the Nu River basin, an enlarged area covering 23-33° N and 91-101° E is chosen for the application of the fusion framework, where 59 gauges are available and the climatic and topographic conditions are similar: both regions are characterized as mountainous areas under the subtropical climate influenced by the southeast and southwest monsoons. Besides, given that no rain gauges are available outside of China in this study region, the non-Chinese region is excluded from the study area.

Vegetation data
In this study, we use two MODIS (MODerate resolution Imaging Spectroradiometer) vegetation products, MOD13A3 (termed MOD hereafter) and MYD13A3 (termed MYD hereafter), in the application of the fusion framework. Both the MOD and MYD datasets contain 10 sub-datasets, including NDVI, EVI, and pixel reliability. The temporal and spatial resolutions of the MOD13A3 and MYD13A3 products are 1 month and 1 km, respectively. The pixel reliability is a per-pixel indicator of data quality and has four valid values: 0 for good accuracy, 1 for marginal accuracy, 2 for snow/ice, and 3 for cloud. Based on the pixel reliability information, NDVI values are retained for pixel reliability levels of 0 and 1 and discarded as anomalies otherwise.
The MOD dataset is used as a benchmark, while MYD is taken as the alternative for occasions when MOD data are missing or have large uncertainties. Since the MOD and MYD datasets are acquired by different satellites at different transit times, systematic differences may exist between the two datasets. As such, we construct two regressions to remove their systematic errors: one is based on a subset with both MOD and MYD of good reliability (= 0), and the other on a subset with MOD of marginal reliability (= 1) and MYD of good reliability (= 0). After the removal of systematic errors, a merged dataset of MOD and MYD (termed MMD hereafter) is generated under the criteria given as follows: The annual MMD dataset is then calculated by averaging the 12 monthly images.
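To make the merging step concrete, the sketch below shows one plausible per-pixel selection rule. The exact merging criteria are not reproduced in the text, so the rule used here (keep only reliability levels 0 and 1, prefer the lower reliability code, and let MOD win ties because it is the benchmark) is an assumption, as are the function names and the choice to skip missing months in the annual mean. Systematic-error correction is assumed to have been applied beforehand.

```python
import math

def merge_pixel(mod_ndvi, mod_rel, myd_ndvi, myd_rel):
    """Choose one NDVI value for a pixel from MOD and MYD (assumed rule).

    Reliability codes follow the text: 0 good, 1 marginal, 2 snow/ice, 3 cloud.
    """
    mod_ok = mod_rel in (0, 1)
    myd_ok = myd_rel in (0, 1)
    if mod_ok and (not myd_ok or mod_rel <= myd_rel):
        return mod_ndvi   # MOD is the benchmark and wins ties
    if myd_ok:
        return myd_ndvi   # MYD fills in when MOD is missing or less reliable
    return math.nan       # neither product is usable: flag as anomaly

def annual_mean(monthly_values):
    """Average the monthly images for one pixel, skipping anomaly (NaN) months."""
    valid = [v for v in monthly_values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan
```

For example, a pixel with marginal MOD data and good MYD data would take the MYD value under this rule, while a pixel that is cloudy in both products would be left as an anomaly.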

Land-use data
The MCD12Q1 Version 51 (MODIS/Terra+Aqua Land Cover Type Yearly L3 Global 500 m SIN Grid V051) land-use dataset for the period 2001-2013 is used to identify the outliers of MMD, and the IGBP (International Geosphere Biosphere Programme) classification is adopted because of its wide application. Because the spatial resolutions of the MMD and MCD12Q1 datasets differ, the MCD12Q1 dataset is upscaled to the 1 km resolution of MMD for outlier identification. It should be noted that if any of the four 500 m pixels in MCD12Q1 is classified as water, urban, snow or ice, or cropland, the upscaled 1 km pixel is assigned a missing value (i.e., −9999) and the corresponding NDVI pixel is identified as an outlier.
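The upscaling rule above can be sketched as a 2 x 2 block aggregation. The IGBP class codes in the excluded set are assumptions that should be checked against the MCD12Q1 product documentation, and the majority rule used for blocks without excluded classes is also an assumption, since the text only specifies when a pixel becomes missing.

```python
from collections import Counter

MISSING = -9999
# Assumed IGBP codes: 0 water, 12 cropland, 13 urban, 15 snow/ice.
EXCLUDED = {0, 12, 13, 15}

def upscale_landcover(lc500):
    """Aggregate a 500 m land-cover grid to 1 km using 2 x 2 blocks.

    A 1 km pixel is set to MISSING if any of its four 500 m pixels belongs
    to an excluded class; otherwise it takes the most common class
    (the majority rule is an assumption of this sketch).
    """
    rows, cols = len(lc500) // 2, len(lc500[0]) // 2
    upscaled = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            block = [lc500[2 * r + dr][2 * c + dc]
                     for dr in (0, 1) for dc in (0, 1)]
            if any(code in EXCLUDED for code in block):
                out_row.append(MISSING)
            else:
                out_row.append(Counter(block).most_common(1)[0][0])
        upscaled.append(out_row)
    return upscaled

# Hypothetical 2 x 4 tile: the right-hand block contains one cropland pixel.
tile = [[10, 10, 1, 12],
        [10, 8, 1, 1]]
mask_1km = upscale_landcover(tile)
```

The "any of four" rule is deliberately conservative: a single cropland or water pixel is enough to remove the whole 1 km NDVI pixel from the regression.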

Weather data
Datasets consisting of daily precipitation and air temperature collected at the 59 gauges in the study area are obtained via the China Meteorological Data Sharing Service system (http://data.cma.cn/data/detail/dataCode/SURF_CLI_CHN_MUL_DAY_V3.0/keywords/v3.0.html). The air temperature measurements will be used for dependence analysis later in Sect. 4.5. The streamflow data provided by Yunnan University will be used for calculating sub-basin-scale precipitation based on water balance. The five hydrological stations are Gongshan, Liuku, Jiucheng, Gulaohe, and Dawanjiang, with drainage areas of 101 146, 106 681, 6308, 4185, and 7986 km², respectively. The MODIS evapotranspiration (ET) product MOD16 (http://www.ntsg.umt.edu/project/mod16), with a spatial resolution of 1 km and a weekly temporal resolution, will also be used in calculating precipitation based on water balance.
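The water-balance calculation mentioned above can be illustrated with a short sketch. It assumes the simplest annual balance, in which precipitation equals runoff depth plus evapotranspiration and the change in storage is negligible unless supplied; the function names and the ET value are hypothetical, while the runoff volume and basin area are the figures quoted for the Nu River basin.

```python
def runoff_depth_mm(volume_km3, area_km2):
    """Convert an annual runoff volume (km^3) over a drainage area (km^2) to depth in mm."""
    return volume_km3 / area_km2 * 1.0e6   # km of water -> mm of water

def water_balance_precip_mm(volume_km3, area_km2, et_mm, storage_change_mm=0.0):
    """Annual basin precipitation from P = R + ET + dS (dS assumed zero by default)."""
    return runoff_depth_mm(volume_km3, area_km2) + et_mm + storage_change_mm

# Basin-wide runoff of 69 km^3 over 142 479 km^2; the 400 mm ET is a hypothetical value.
basin_runoff = runoff_depth_mm(69.0, 142479.0)
basin_precip = water_balance_precip_mm(69.0, 142479.0, et_mm=400.0)
```

With these inputs the runoff depth is roughly 484 mm, so the inferred precipitation depends strongly on the ET product used to close the balance.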

Model calibration and validation
Based on the results of the evaluation metrics for different regression form candidates (Fig. 3a), the second-order polynomial is chosen as the regression model form in this study: p = a · NDVI² + b · NDVI + c, where p denotes the precipitation amount in millimeters, and a, b, and c are regression coefficients. The results of regression coefficients and evaluation metrics are given in Table 1, and the NDVI-precipitation relationships for the study period are demonstrated in Fig. 3b. The best performance of the regression model is found within 0.2 < NDVI < 0.7 and 400 mm yr⁻¹ < p < 1500 mm yr⁻¹. Larger errors are found at pixels with NDVI larger than 0.7 or annual rainfall higher than 1500 mm, implying that the water supply is no longer a determinant of vegetation growth once annual rainfall exceeds a certain threshold.
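A second-order polynomial of this kind can be fitted by ordinary least squares. The sketch below solves the normal equations directly in plain Python; the ordering of the coefficients (a on the squared term), the function names, and the NDVI-precipitation pairs are assumptions for illustration and are not the study's data or fitted values.

```python
def fit_quadratic(x, y):
    """Least-squares fit of y = a*x**2 + b*x + c via the normal equations."""
    # Columns of the design matrix: x^2, x, 1.
    cols = [[xi ** 2 for xi in x], list(x), [1.0] * len(x)]
    # Augmented normal equations (A^T A | A^T y).
    m = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols]
         + [sum(u * yi for u, yi in zip(ci, y))] for ci in cols]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(3):
        pivot = max(range(i, 3), key=lambda k: abs(m[k][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for k in range(3):
            if k != i:
                factor = m[k][i] / m[i][i]
                m[k] = [mk - factor * mi for mk, mi in zip(m[k], m[i])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

def predict_precip(ndvi, coeffs):
    """Evaluate the fitted polynomial for one NDVI value."""
    a, b, c = coeffs
    return a * ndvi ** 2 + b * ndvi + c

# Hypothetical gauge data: annual NDVI and annual precipitation (mm).
ndvi_obs = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
precip_obs = [420.0, 720.0, 980.0, 1200.0, 1380.0, 1520.0]
coeffs = fit_quadratic(ndvi_obs, precip_obs)
```

Once fitted, predict_precip can be applied to every valid 1 km NDVI pixel, keeping in mind that values outside the fitted NDVI range are extrapolations.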
In general, the RMIs demonstrate better performance than RME, which can be attributed to the lower variability of precipitation in a single year than over the whole study period. It is also noted that the R² values of RMIs for drier years (2003, 2009, and 2011) are lower than those for wetter years, indicating a weaker coupling between vegetation growth and precipitation.
The performance of regression models is assessed under three scenarios as described in Sect. 2.2. A total of 300 tests are conducted and performance metrics (i.e., R, R², E_RMS, and E_MAR) are calculated accordingly (Fig. 4 and Table 2). The high R values (> 0.85) indicate a strong correlation between NDVI and precipitation independent of sampling method. Also, the regression models demonstrate good performance, with R² larger than 0.75 and E_MAR less than 20 %.
Hydrol. Earth Syst. Sci., 21, 999-1015, 2017, www.hydrol-earth-syst-sci.net/21/999/2017/
In addition, the metrics of regression models fluctuate around that of the RME, with narrow inter-quartile ranges, indicating the regression models have remarkable consistency with the RME model. Scenario a is designed to examine inter-annual stability in the performance of regression models, where the good performance indicates the acceptable ability of the RME model in estimating precipitation during periods when precipitation measurements are not available. Scenarios b and c investigate the impacts of spatial and temporal coverages of measurements, respectively. It is noteworthy that under Scenario b better performance in regression models is observed as compared with Scenario c, implying the greater importance of spatial coverage of measurements in conducting the regressions. In addition, as expected, the results of calibration are better than those of validation for all metrics. However, the differences between calibration and validation are not significant, implying the consistent performance of regression models under various scenarios.
The performance of RME is further assessed by comparing the estimates against observations (Fig. 5), and good agreement between estimates and observations is observed. It should be noted that the RME has difficulty in estimating precipitation higher than 2000 mm (cf. the dashed line in Fig. 5), reflecting a limitation of the fusion framework inherited from the oversaturation effect of the vegetation index.
The effect of elevation on the relationship between precipitation and NDVI also needs to be considered. An overall negative relationship is found between precipitation and elevation for the whole elevation range (i.e., 0-5000 m) with an R² value of 0.62 (Fig. 6a), whereas only a weak relationship is found within individual elevation bands (Fig. 6b-f). Given the spatial heterogeneity of orographic effects on precipitation (Brunsdon et al., 2001; Daly et al., 2008) and the insufficient data of this study, a more thorough investigation of the relationship between precipitation and elevation needs to be conducted with more information that might be available in the future. Positive precipitation-NDVI relationships are found at different elevation bands (Fig. 7), with the best and worst fits observed at elevation band 2000-3500 m with an R² value of 0.94 and at elevation band 0-2000 m with an R² value of 0.62, respectively. By comparing the three regressions at different bands with the global regression, we notice that the regression at band 0-2000 m gives markedly higher precipitation estimates than the other three regressions for lower NDVI values (< 0.4), whereas the regression at band > 3500 m gives markedly higher precipitation estimates than the other three regressions for higher NDVI values (> 0.5).

Spatial characteristics of precipitation
The spatial characteristics of the precipitation of the study area are investigated with RME for the whole study period (Fig. 8). Annual precipitation in the Nu River basin is observed to decrease from south to north and from west to east with prominent spatial variability. Two "hot-spot" regions, whose annual precipitation exceeds 1500 mm, can be identified in the study area: one near the southern border and the other close to the southwestern mountain border. The eastern part of the Nu River basin, featuring a dry and warm climate, receives an average annual precipitation of 800 mm with large inter-annual variability. A precipitation product (DEMP) based on a precipitation-elevation relationship is compared with RME. The DEMP product shows no obvious distribution pattern of precipitation (Fig. 9a) and a smaller spatial variability than RME, indicating the advantage of RME in representing the spatial variability of annual precipitation. An overall underestimation of precipitation is also observed in the DEMP product across the whole study area (Fig. 9b). In addition, the pixels in Fig. 8 with a value outside the valid range (i.e., 400 mm yr⁻¹ < P < 1500 mm yr⁻¹) may have a relatively large error, as discussed in Sect. 4.1. As there is no justifiable method for such a correction and given the limited fraction of invalid pixels (10 % in the whole study area and 7 % in the Nu River basin), the figure can be used to demonstrate a full picture of the spatial precipitation pattern in the study area, but we note that those pixels have large uncertainties and should be interpreted with caution.

Model performance comparison
The performances of the IDW approach, the TRMM product, and the fusion framework are compared in this section. IDW is one of the most popular methods for spatial interpolation of rainfall due to its easy implementation and flexibility in incorporating other auxiliary information (e.g., elevation). In general, the IDW approach is unable to demonstrate the high spatial variability, though it can capture the general spatial distribution over the whole basin (Fig. 10a), as can TRMM (Fig. 10b). Due to its coarse spatial resolution, TRMM cannot capture the high variability in the river valley, where the elevation varies significantly. Although large rainfall (> 1800 mm) is observed in both our product and the TRMM product in the southwest of the study area, our product gives lower rainfall than TRMM. As discussed above, the regression model tends to underestimate rainfall when the annual rainfall exceeds a certain threshold because the water supply is no longer a determinant of vegetation growth.
To demonstrate the advantage of the fusion framework, a cross-validation is conducted against randomly sampled gauge observations by varying the number of samples (1-40). The cross-validation shows the highest E_RMS for the IDW approach, followed by TRMM and RME (Fig. 11a). A higher mean E_MR of 15 % is observed for TRMM than for IDW (8 %) and RME (5 %), while the differences in E_MAR are minimal between TRMM and IDW. The results indicate that TRMM overestimates precipitation as compared to gauge observations. Table 3 summarizes the maximum, minimum, and mean values of each method and shows the relative difference between RME and the other two methods.
On average, the E_RMS of RME is smaller than that of IDW and TRMM by 20.4 and 17.4 %, respectively. In general, the fusion framework demonstrates better performance than the other approaches.
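The cross-validation of the IDW baseline can be sketched as follows: a number of gauges are withheld at random, their precipitation is interpolated from the remaining gauges, and the error is summarized as E_RMS. The distance power of 2, the use of all remaining gauges as neighbors, the planar distance, the function names, and the gauge coordinates and values are all assumptions for illustration.

```python
import math
import random

def idw_estimate(x, y, gauges, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (gx, gy, precip) tuples."""
    num = den = 0.0
    for gx, gy, gp in gauges:
        d = math.hypot(x - gx, y - gy)
        if d == 0.0:
            return gp              # exactly on a gauge: use its value
        w = d ** -power
        num += w * gp
        den += w
    return num / den

def cross_validate(gauges, n_removed, rng):
    """Withhold n_removed random gauges and return the E_RMS of IDW at them."""
    held_out = rng.sample(gauges, n_removed)
    kept = [g for g in gauges if g not in held_out]
    sq_errors = [(idw_estimate(gx, gy, kept) - gp) ** 2
                 for gx, gy, gp in held_out]
    return math.sqrt(sum(sq_errors) / len(sq_errors))

# Hypothetical gauges: (x km, y km, annual precipitation mm).
gauge_set = [(0.0, 0.0, 800.0), (2.0, 0.0, 1200.0),
             (1.0, 0.0, 1000.0), (0.0, 2.0, 900.0)]
e_rms = cross_validate(gauge_set, 2, random.Random(1))
```

Repeating the call for increasing values of n_removed gives an error curve for IDW; the same withheld gauges could be compared against TRMM pixels or RME estimates to reproduce the style of comparison described above.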
To further evaluate the performance of RME, the annual averages of precipitation of five hydrological stations (Fig. 12a) and the whole basin estimated by the three approaches (IDW, RME, and TRMM) are compared. At the whole basin scale, the estimate by RME is 5.2 % higher than that of IDW but 7.9 % lower than that of TRMM. Although the difference between the three approaches is minimal at the basin scale, the difference at the sub-basin scale is remarkable. In the upstream region (i.e., the Gongshan sub-basin) located on the Tibetan Plateau, TRMM overestimates precipitation by 13.2 %, while IDW underestimates it by 7.6 % as compared with RME. In the other four downstream sub-basins, estimates by RME are larger than those by IDW and TRMM. In general, in the midstream and downstream regions with large variability in terrain height, RME gives larger estimates of precipitation than IDW and TRMM.
To validate the accuracy of different precipitation estimates, we utilize the monthly MODIS (MOD16) global ET (evapotranspiration) product with 1 km spatial resolution (Mu et al., 2011) [...] elevation band), DEMP, TRMM, and IDW (Fig. 12b). Although all five products underestimate the sub-basin-scale precipitation, RME and BandP give the closest estimates to the water-budget-based precipitation, indicating the effectiveness of the precipitation-NDVI relationship in precipitation remapping.
We also compared our products with the Multi-Source Weighted-Ensemble Precipitation (MSWEP) product. The dataset takes advantage of a wide range of data sources, including gauges, satellites, and atmospheric reanalysis models, to obtain the best possible precipitation estimates at the global scale with a high 3-hourly temporal and 0.25° spatial resolution (Beck et al., 2016). Comparison in the annual mean precipitation between the gauge measurements and predictions by the MSWEP and TRMM products (Fig. 13) shows acceptable performance of both MSWEP and TRMM in predicting the precipitation with an overall overestimation. The RMSE values for MSWEP, TRMM, and RME are 241, 196, and 174 mm, respectively, indicating that RME gives the best prediction among the three products. A possible reason why MSWEP shows no superiority over TRMM in predicting annual precipitation is that very few gauges are available in this region, which might limit the applicability of the MSWEP methodology. However, the MSWEP methodology does provide insights into the production of high temporal resolution (3-hourly) rainfall, which we believe will be helpful to our future work.

Influence of different vegetation indices
Considering the possible degradation in model performance caused by oversaturation of NDVI in high biomass areas, another vegetation indicator, the enhanced vegetation index (EVI), is suggested as an alternative for estimating vegetation growth (Matsushita et al., 2007; Liao et al., 2015). As such, we also test the fusion framework with EVI in addition to NDVI and the results are assessed against the gauge observations. Based on the chosen metrics, EVI is found to outperform NDVI with better regression quality (Table 4): the EVI-based regression model gives higher R² and smaller E_RMS and E_MAR than the NDVI-based model. Also, a remarkable difference is observed in the precipitation estimates based on the two vegetation indices (Fig. 14). It is noted that the curvature of the EVI-based model is larger than that of the NDVI-based model, suggesting higher sensitivity of the EVI-based model in a humid environment. Although the EVI-based model demonstrates better performance than the NDVI-based one, it should be noted that NDVI is the most popular vegetation index used in operational applications among the available vegetation index products. Besides, NDVI has a relatively longer temporal coverage than other vegetation index products. For instance, the AVHRR (Advanced Very High Resolution Radiometer) NDVI data have been available since 1982 with a global coverage. As such, when EVI is unavailable, NDVI is a satisfactory index that can be used in the fusion framework.

Influence of other ambient determinants
One major assumption of the proposed framework is that precipitation is the only determinant of vegetation growth, and thus NDVI is regarded as a proxy for precipitation. However, other ambient factors, such as soil properties, solar radiation, air temperature, and elevation, may significantly influence the vegetation growth as well as NDVI values. Considering the data availability of various ambient factors, air temperature and elevation, in addition to NDVI, are adopted as extra determinants to establish the regression models, which are thus termed RME + T and RME + H for air temperature and elevation, respectively. We note that, for simplicity, the extra determinants are assumed to have a linear relationship with precipitation.
The differences in R², E_RMS, and E_MAR between the three models are minimal, and the regression coefficients of the three models are very close to each other (Table 5). The negative regression coefficient of temperature in RME + T indicates opposite trends in precipitation and temperature. Since the temperature decreases with the increase in elevation, RME + T and RME + H essentially provide consistent estimates of precipitation, as is also clearly shown in Fig. 15. It is also noted that the information added by the extra determinants (i.e., air temperature and elevation) is in fact minimal. Overall there is little difference between RME and the other two products. As such, we consider the RME, which is based only on the vegetation index, to be a simple and efficient model for precipitation estimation.

Conclusion
In this study, a satellite-gauge-vegetation fusion framework has been developed for estimating the precipitation in mountainous areas by establishing a regression relationship between gauge-based precipitation observations and a satellite-based vegetation dataset. The fusion framework was then applied to the Nu River basin. The fusion framework for the Nu River basin adopted a second-order polynomial form and demonstrated promising ability in capturing the high spatial variability of precipitation in the river valley. Five evaluation metrics, including R, R², E_RMS, E_MR, and E_MAR, indicated good performance of the fusion framework in precipitation estimation. The performance of the fusion framework was also compared with the IDW approach and the TRMM product, and the comparison results indicated that the fusion framework generally outperformed the other approaches in estimating precipitation in mountainous areas. On average, the E_RMS of the fusion framework is 20.4 and 17.4 % smaller than that of IDW and TRMM, respectively. The E_MR of the fusion framework is 1.2 and 71.5 % smaller than that of IDW and TRMM, respectively. The E_MAR of the fusion framework is 18.9 and 28.3 % smaller than that of IDW and TRMM, respectively.
The successful application of the fusion framework in the Nu River basin sheds light on precipitation estimation in mountainous areas using multi-source datasets. However, this framework does have certain limitations that are important to appreciate. First, the framework is applied only in the Nu River basin. More mountainous areas under different climates need to be examined to further test the robustness of this framework. In addition, although the RME model can utilize the full knowledge of precipitation in the entire study period compared with RMI models, the difference in the coefficients suggests apparent inter-annual variability of precipitation that should be considered when applying these models. Depending on the duration of the study period and the purpose, we suggest that the RME model be used for identifying the long-term climatology and the RMI models for examining inter-annual variability. Also, to fully verify the theoretical basis of this framework, namely that vegetation actively interacts with precipitation in mountainous areas, future work is required to refine the spatiotemporal resolution of this study to enable better scrutiny of vegetation-precipitation interactions at submonthly scales and across more detailed vegetation species.

Figure 1. Flow chart of the satellite-gauge-vegetation fusion framework development.

Figure 2. (a) Terrain map of the study area (the Nu-Salween basin and its adjacent areas). (b) The distribution of rainfall during the year across the Nu River.

Figure 3. (a) Different regression forms between annual precipitation and NDVI; (b) the NDVI-precipitation relationships for RME and RMI.

Figure 4. Box plots of R, R², and E_RMS of the RME model under three scenarios: (a) fully random; (b) all gauges, partial period; and (c) partial gauges, entire period. See Sect. 2.2 for details of the three scenarios. The triangle marker corresponds to the value (R, R², E_RMS) of the RME model. Plus signs mark outliers, i.e., sample values outside the range from (Q1 - 1.5 IQR) to (Q3 + 1.5 IQR), where Q1 and Q3 are the lower and upper quartiles and IQR = Q3 - Q1.

Figure 5. Comparison in annual precipitation between the gauged measurements and predictions by the regression model for scenarios (a) fully random; (b) all gauges, partial period; and (c) partial gauges, entire period. See Sect. 2.2 for details of the three scenarios.

Figure 9. (a) The map of precipitation estimates of DEMP; (b) difference in precipitation estimates between RME and DEMP.

Figure 10. Spatial distribution of mean annual precipitation of 2003-2012 estimated by (a) IDW and (b) TRMM.

Figure 13. Comparison in mean annual precipitation between the gauged measurements and predictions by MSWEP, TRMM, and RME.

Table 1. Regression model performance and regression coefficients.

Table 2. Statistics of regression models for validation and calibration under three scenarios.

Table 4. Regression model performance and coefficients of regression.

Table 5. Results of two regression models established with extra independent variables: RME + T for temperature, RME + H for elevation.

Figure 11. Performance in terms of E_RMS, E_MR, and E_MAR of the three methods for different numbers of removed gauges.