Calibration of a parsimonious distributed ecohydrological daily model in a data-scarce basin by exclusively using the spatio-temporal variation of NDVI

. Ecohydrological modeling studies in developing countries, such as sub-Saharan Africa, often face the problem of extensive parametrical requirements and limited available data. Satellite remote sensing data may be able to ﬁll this gap, but require novel methodologies to exploit their spatiotemporal information that could potentially be incorporated into model calibration and validation frameworks. The present study tackles this problem by suggesting an automatic calibration procedure, based on the empirical orthogonal function, for distributed ecohydrological daily models. The procedure is tested with the support of remote sensing data in a data-scarce environment – the upper Ewaso Ngiro river basin in Kenya. In the present application, the TETIS-VEG model is calibrated using only NDVI (Normalized Difference Vegetation Index) data derived from MODIS. The results demonstrate that (1) satellite data of vegetation dynamics can be used to calibrate and validate ecohydrological models in water-controlled and data-scarce regions, (2) the model calibrated using only satellite data is able to reproduce both the spatio-temporal vegetation dynamics and the observed discharge at the outlet and (3) the proposed automatic calibration methodology works satisfactorily and it allows for a straightforward incorpora-tion of spatio-temporal data into the calibration and validation framework of a model.


Introduction
Drylands cover 30 % of the Earth's land surface and 50 % of Africa (Franz et al., 2010).Projections of the IPCC (Intergovernmental Panel on Climate Change, 2007) indicate that the extent of these regions will likely increase in the coming decades.Dryland expansion would have a considerable additional impact on water resources, which should be taken into account by water management plans (Franz et al., 2010).
In water-controlled ecosystems, the vegetation assumes a critical role influencing all components of the hydrological cycle (Rodriguez-Iturbe et al., 2001;Manfreda and Caylor, 2013).For instance, actual evapotranspiration (aET) may account for more than 90 % of the annual precipitation in watercontrolled areas (Zhang et al., 2016;Jasechko et al., 2013).Montaldo et al. (2005) affirmed that the use of constant LAI (leaf area index) values, commonly used in hydrological applications, produces large errors in land surface flux predictions.Given the strong control exerted on aET by the vegetation, reliable estimates of spatio-temporal variations of vegetation patterns are vital to obtain trustworthy predictions of available water resources (Andersen, 2008).In this sense, ecohydrological modeling becomes essential in order to include the vegetation dynamics as an additional state variable (Rodriguez-Iturbe et al., 2001).
G. Ruiz-Pérez et al.: Calibration of a parsimonious distributed ecohydrological daily model Evidence of aET being a prevalent driver of hydrological records of streamflow and water table depth, i.e., available water resources, has been observed in many studies (Bond et al., 2002;Nyholm et al., 2003;Gribovszki et al., 2008;Conradt et al., 2013).Recently, Tsang et al. (2014) showed that adding a better evapotranspiration scheme in a widely used runoff model improves streamflow predictions.Conradt et al. (2013), who compared three different strategies for deriving sub-basin aET, affirmed that incorporating spatial variation of aET in a semi-distributed model increases its robustness.Conversely, Stisen et al. (2011) and others stressed that those improvements are not necessarily seen in the outlet hydrograph.However, it could also be interpreted in the inverse sense; good performances in terms of the outlet hydrograph do not necessarily mean more reliable estimates of aET.
The streamflow record is traditionally the only observation used for the calibration of hydrological models, but several studies demonstrated the limited capabilities of such an approach when models are validated at interior points of a river basin.Discharge represents an integrated catchment response, and hence provides only limited insight on the lumped behavior of a catchment (Stisen et al., 2011;Koch et al., 2016a;Michaud and Sorooshian, 1994;Reed et al., 2004;Smith et al., 2013).In that sense, Conradt et al. (2013) provided several examples for large simulation errors within the model domain and they mentioned, among others, the outcomes given by Feyen et al. (2008), Merz et al. (2009) and Smith et al. (2012).Moreover, Wi et al. (2015) also pointed out that caution is needed when using an outlet calibration approach for streamflow predictions under future climate conditions.This leads to the idea of using spatial state variables, and the new era of distributed (temporal and spatial) models emerges in order to balance the conceptual distributed nature of this kind of model (Stisen et al., 2011).
Traditional observation, which generally consists of point data with little spatial support, is effectively strengthened by remote sensing data, which offer the capacity to provide detailed spatial coverage and pattern information (Franssen et al., 2008;McCabe et al., 2008;Stisen et al., 2011).Additionally, satellite data have the great advantage of also being available in data-scarce areas.In this sense, the application of remotely sensed data represents an excellent source that provides information with a fairly good spatial and temporal resolution (Yang et al., 2012).In modeling, remote sensing data have been utilized in three different ways: (1) as forcing data (Xiao et al., 2004;Yuan et al., 2010;Samaniego et al., 2011;Stisen et al., 2011), (2) as a priori information of particular parameters (Winsemnius et al., 2008;Stisen et al., 2011) and (3) for model calibration and validation (see next section for an in-depth discussion of this point).
Satellite imagery provides not only temporal information but also valuable information on spatial patterns, which can facilitate a spatial-pattern-orientated model evaluation.As highlighted by Koch et al. (2015), spatial model evaluation is an active field of research not only in hydrology but also in other disciplines such as atmospheric sciences (Brown et al., 2011;Gilleland et al., 2010).However, up to now, there exists no formal guideline on how to assess the goodness of fit of the simulated spatial patterns and little information can be found about how to effectively utilize spatio-temporal data.Some authors, such as Conradt et al. (2013), Graf et al. (2014) and Koch et al. (2015Koch et al. ( , 2016b)), focused on developing and testing metrics to be employed when spatio-temporal data are involved.For example, Koch et al. (2015) compared kappa statistics, fuzzy theory and empirical orthogonal function (EOF) analysis in an attempt towards a true spatial model evaluation of distributed models.But, besides these efforts, there are only a limited number of spatial validation studies that fully embrace the availability of satellite remote sensing data by means of true spatial performance metrics (Koch et al., 2016b).EOF analysis is a versatile methodology to investigate the spatio-temporal patterns of fluxes and states in the soil-vegetation-atmosphere continuum (Fang et al., 2015).As mentioned previously, Koch et al. (2015) carried out a validation of a distributed model using satellitebased land surface temperature data by means of an EOF analysis.With other statistical purposes, the EOF analysis was used by Graf et al. (2014), Kim and Barros (2002) and Liu (2003).A fine-scale study was carried out by Drewry and Albertson (2006) who used EOF analysis to associate spatial pattern in the errors of a canopy-atmosphere model with errors in the parameters.However, to our knowledge, EOF analysis has not been applied in model calibration yet.
In summary, the main objectives of this research are the following: (1) to incorporate spatio-temporal data into the calibration process by applying the EOF methodology as an objective function and (2) to exploit satellite data as a proxy of reliable estimates of vegetation dynamics for both the calibration and validation of an ecohydrological model.To address these key challenges, a distributed parsimonious ecohydrological model was applied in a water-controlled basin located in Kenya.

Satellite data and model calibration and validation
The applicability of remote sensing to calibrate and/or validate a model by exploiting information on spatial patterns still remains a challenging task.In order to better understand this issue in more detail, a bibliographic survey of the ISI Web of Knowledge science citation index was undertaken using the following word combinations in the topic search: (1) satellite calibration, (2) satellite implementation, (3) satellite ecohydrological modeling and (4) remote sensing ecohydrology.From the total number of publications obtained by this search, only those that incorporated satellite data to specifically model calibration were selected.
On the plot scale, Quevedo and Francés (2008) and Pasquato et al. (2015) calibrated and validated a parsimonious ecohydrological model using satellite information.More recently, Ruiz-Pérez et al. ( 2016) discussed the applicability of satellite data during the calibration process comparing the results obtained by a parsimonious model calibrated using only satellite data against the results obtained by a complex model calibrated using field measurements on the pixel scale.Similarly, Quevedo and Francés (2008) 2013) calibrated a semi-distributed hydrological model using streamflow data and satellite-based aET.Regarding other satellite products, GRACE data, which can be used to detect variations in terrestrial moisture storage (e.g., Lettenmaier and Famiglietti, 2006), have been used to calibrate both global and regional-scale surface hydrology models, in combination with stream discharge data (e.g., Lo et al., 2010).Zhang et al. (2011) calibrated the AWRA-L model with streamflow, NOAA-AVHRR LAI and TRMM-MI (Tropical Rainfall Measuring Mission's, TRMM, Microwave Imager, TMI) soil moisture using a multi-objective calibration framework.Only few studies carried out the calibration exclusively against remote sensing data.For instance, Gutmann et al. (2010) calibrated landscape hydraulic properties in the Noah land surface model using only MODIS surface temperatures from 14 different sites and using observed flux data for model verification.Also, Velpuri et al. (2012) modeled Lake Turkana water levels using only satellite information.All the above-mentioned studies shared the same conclusion: including remote sensing data into the model calibration and/or validation improves the overall performance.
In general, from the total of reviewed publications, calibration using only satellite data was performed in the 47 % of cases, while a combination of satellite data and field measurements (especially streamflow at the outlet) was used in the remaining contributions.Similar results were obtained regarding the validation: 35.3 % of publications only adopted field measurements (especially historical streamflow), employing satellite data exclusively for the model calibration; 47 % used a combination of field measurements and satellite data; 11.8 % only used satellite data; and one publication without any specification.However, more interesting is how the different calibrations were carried out.In most of the cited examples, a sort of multi-objective calibration was used adopting only some points or pixels to calibrate the entire catchment.In other cases, lumped or semi-distributed models were implemented instead of fully distributed ones, considering aggregated values of the satellite data.In other words, the spatial heterogeneity of the basin is neglected and the full potential of satellite imagery, namely the information on spatial patterns, is not fully exploited.Therefore, a method able to make use of the potential of the spatio-temporal information contained in remote sensed data is highly desirable and a calibration scheme which relies solely on remote sensing data will be greatly beneficial in modeling of data-scarce catchments (Kunnath-Poovakka et al., 2016).

Study area and data
The upper Ewaso Ngiro basin is located in the Laikipia region of Kenya (Fig. 1).The basin is part of the Laikipia plateau, which lies between Mount Kenya (south east) and the Aberdare mountains (south west).The basin has a G. Ruiz-Pérez et al.: Calibration of a parsimonious distributed ecohydrological daily model drainage area of 15 200 km 2 , with the largest river being the Ewaso Ngiro.This region is characterized by distinct rainy and dry seasons.The first rainy season occurs from March to May, while the second rainy season occurs from October to December.Both air temperature and precipitation patterns are heavily influenced by elevation.A full description of the precipitation patterns in the region can be found in Franz (2007).
Soil texture ranges from sandy clay to clay soils (according to the 1980 UNESCO soil map).Although the most characteristic landscape is savanna, higher elevations are dominated by forests and a large piece of land has been converted to cropland (Franz, 2007).The remaining surfaces of the study region are classified as grassland, shrubland and wooded grassland (savanna ecosystems).
As meteorological data, we used the weather stations of the natural resource monitoring, modeling and management project (NRM3) of Nanyuki, Kenya (illustrated in Fig. 1).Daily precipitation and temperature from 1959 to 2003 were validated by Franz et al. (2010).Considering the available hydrological information, we selected a subbasin with an area of about 4600 km 2 for the present study (Fig. 1).The selected catchment is equipped with a streamflow gauge at the outlet (operational from 1980 to 2002).The reference evapotranspiration (ET 0 ) was calculated using the Penman-Monteith equation with the simplifications proposed by Allen et al. (2006).This approach is extremely useful to describe the spatial distribution of solar radiation and to derive the ET 0 maps during any day of the year (Manfreda et al., 2013).
Regarding the satellite data, we adopted the Normalized Difference Vegetation Index (NDVI) included in the MOD13Q1 and MYD13Q1 products provided by NASA (NASA Land Processes Distributed Active Archive Center, LP DAAC).This satellite product is available from 2000 to present.The MOD13Q1 and MYD13Q1 data are provided every 16 days at 250 m of spatial resolution.The used NDVI products (MOD13Q1 and MYD13Q1) are in level 3, which means that they are not raw satellite data.NDVI indices are retrieved from daily, atmosphere-corrected, bidirectional surface reflectance.
Finally, based on previous experiences (Ruiz-Pérez et al., 2016;Pasquato et al., 2015) and in similar climatic conditions, we declined to use other products such as LAI or ET derived from MODIS because these kinds of products are produced by models.And, for example, Ruiz-Pérez et al. (2016) found large discrepancies between the LAI provided by satellite and the LAI measured in field.At this point, we had no information to determine the accuracy of these particular models and the spatial information used to implement them.In contrast, NDVI values are calculated by direct differences of spectrum bands, i.e., no models are involved and we therefore decided to use this latter product instead of satellite LAI and/or ET.

Model description: TETIS-VEG
The proposed model, called TETIS-VEG, is based on a distributed hydrological model called TETIS (Francés et al., 2007) coupled with a dynamic vegetation model.Both models have simplicity of model structure in common.The used equations are as simple as possible in order to reduce the number of parameters (Table 1).The sub-models are interconnected.The transpiration calculated in the hydrological sub-model depends on the LAI simulated by the dynamic vegetation model.At the same time, the simulated LAI depends on the water stress, which is calculated using the hydrological sub-model.The hydrological sub-model can be used on different timescales (from a few minutes up to daily time steps), while the vegetation dynamic sub-model has to be applied on a daily scale.Hence, the TETIS-VEG model must be used on a daily scale.Both sub-models can be used on a broad range of spatial scales.In this research, the resolution of the implemented model was 90 m × 90 m.

4.1
The hydrological sub-model: TETIS TETIS's conceptual scheme consists of a series of connected reservoirs, each one representing different water storages in the soil column: (i) vegetation interception, (ii) first static soil layer (retained water by upper soil capillary forces, i.e., below field capacity plus water detention in surface puddles; evaporation and transpiration can occur), (iii) second static soil layer (retained water in deeper soil by capillary forces; only transpiration can occur), (iv) surface (for overland runoff), (v) gravitational soil layer (upper soil water content above field capacity for interflow) and (vi) aquifer (for river baseflow).Vertical connections between reservoirs describe the precipitation, evapotranspiration, infiltration and percolation processes.The horizontal flows describe the three different hydrological responses that give the discharge at the catchment outlet: overland runoff, interflow and baseflow.A more detailed description of the TETIS model can be found in Francés et al. (2007) and GIMHA (2014).
The TETIS model uses a split structure for the effective parameter value at each cell (Francés and Benito, 1995;Francés et al., 2007).The effective parameter is calculated using a correction factor multiplied by the estimated value of the parameter in each cell using all the available information (land cover map, soil type map, DEM, depth of roots and soil layer, etc.) and expert's knowledge.Hence, the effective parameter in each grid cell is computed as the product of two terms: (1) a common correction factor for each type of parameter that takes into account the model, information and input errors and the temporal and spatial scale effects; and (2) the a priori estimated value at each cell.For a given parameter, the a priori and effective values are different from cell to cell, while the correction factor is common for all cells (and different from map to map).
Table 1.Summary of the initial values, the search range and the final value of the parameters or correction factors of both sub-models (hydrological and dynamic vegetation sub-models) as well as the units and the reviewed references.

Model Correction factor or parameter a Units Initial value Search range Final value References
Hydrological sub-model With the split-parameter structure, only nine correction factors are calibrated.Each one related to one of these estimated parameter maps: maximum static storage, reference evapotranspiration, infiltration capacity, hillslope velocity, percolation capacity, horizontal saturated conductivity for interflow, horizontal saturated conductivity for aquifer and river channel velocity.

The dynamic vegetation sub-model: LUE model
The proposed dynamic vegetation sub-model is based on the concept of light use efficiency (LUE; Medlyn, 1998) and calculates the leaf biomass (Bl) according to the Eq. ( 1).The LUE is the proportionality between plant biomass production by terrestrial vegetation and absorbed photosynthetically active radiation (APAR) in optimal conditions.However, the LUE can be strongly affected by stress conditions.The key factors contributing to the variation of this efficiency are: soil moisture content, air temperature (Landsberg and Waring, 1997;Sims et al., 2006) and nutrient levels (Gamon et al., 1997;Ollinger et al., 2008).Since this model is designed to be used in water-controlled areas, the nutrient levels are not considered.
where B l is the leaf biomass, LUE is the above-mentioned light use efficiency, ε is the water stress factor, PAR is the photosynthetically active radiation, fPAR is the fraction of photosynthetically active radiation, Re is the respiration, ϕ l (B l ) is the fractional leaf allocation and k l is the leaf natural decay factor to reproduce the senescence.The water stress factor depends on the amount of water contained in the two static reservoirs and it is calculated according to Porporato et al. (2001).Basically, the stress factor is equal to 1 (maxi-mum stress) if the water storage is less than the water storage at wilting point; it is equal to 0 (minimum stress) if the water storage is higher than the water storage at critical point (plants start the stomatal closure); and, it varies from 0 to 1 using a potential function which depends on the wilting point, the critical point and an exponent set to equal 2.Then, this stress multiplies the LUE index, reducing the efficiency when its value is lower than 1 (non-optimal conditions).
The LAI is simulated through the product between the leaf biomass, the specific leaf area (SLA) and the vegetation fractional cover.Later, the LAI is used to calculate the transpiration in the hydrological sub-model according to Eq. ( 2).
where T i is the transpiration from the i soil layer, ET 0 is the reference evapotranspiration, EI is the evaporation of the intercepted water, LAI is simulated by the model, Z i is the percentage of roots in the i soil layer and fc is the coverage factor.
Therefore, the LUE model has eight parameters to be calibrated: (1) Specific leaf storage (the maximum interception storage is calculated as the product between the specific leaf storage and the LAI simulated by the model), (2) the LUE index (explained above), (3) the coverage factor, (4) the distribution of roots between the first and the second static storage layers, (5) the maximum LAI sustainable by the system (the simulated LAI is limited by a maximum), ( 6) the light extinction coefficient k, (7) the SLA and (8) the optimal temperature (the stress factor also depends on the temperature).

Methodology
One of the main objectives of this research was to explore the potential of satellite remotely sensed data for model calibration.The TETIS-VEG model was calibrated purely against MODIS NDVI according to the following three steps: (1) a manual calibration in order to obtain a first approximation of model parameters, (2) an automatic calibration based on the combined use of EOFs and a genetic algorithm in order to refine model parameterization and (3) a model validation carried out with both remote sensed data and traditional data (such as streamflow measurements).Since the meteorological data (precipitation and temperature) were available from 1960 to 2003 and the MODIS NDVI was available from 2000 to present, we decided to use the year 2003 as the calibration period and the period from 2000 to 2002 for validation.In order to avoid the effect of the initial conditions, we used one year as warming-up period (the year 2002 and 1999 for model calibration and validation, respectively).
For these purposes, we adopted the NDVI as a descriptor of the state of the vegetation assuming that LAI and NDVI are intimately related.Studies on various vegetation types, e.g., agroecosystems (Cohen et al., 2003), grasslands (Friedl et al., 1994), shrublands (Law and Waring, 1994), conifer forests (Chen and Cihlar, 1996) and broadleaf forests (Frassnacht et al., 1997) have led to the general conclusion that the spectral vegetation indices such as NDVI have considerable sensitivities to LAI.The relationship between LAI and NDVI can be considered linear for low values, while it becomes nonlinear for the higher values of NDVI due to the greenness saturation (e.g., Turner et al., 1999).In this case study, the maximum LAI values are around 2.0-2.5, according to Franz (2007), and are lower than the greenness saturation threshold.Therefore, the relationship between the observed NDVI and the simulated LAI is expected to be linear.

Empirical orthogonal function method (EOF)
The EOF method is generally used to analyze the spatiotemporal variability of a single variable, but a comparison between different variables can also be performed using coupled EOF techniques (Björnssson and Venegas, 1997).The method decomposes a dataset in a time series and spatial patterns.Furthermore, the method allows for estimating a measure of the "importance" of each spatial pattern.We refer to the spatial patterns as the EOFs (in literature, they are also referred to as principal components), and to the time variation as loadings (in literature, there are several terms: expansion coefficient time series, expansion coefficients, EOF time series, principal component time series, etc.).
The EOF method is essentially a linear algebra methodology based on matrix transformation.The first step is thus the conversion of the spatio-temporal data to be analyzed into a matrix.Basically, we construct a matrix (F) in which each column is the temporal variation of the data in a particular cell while each row represents the cells values during a particular time step.Usually, the second step is to compute the anomalies of the analyzed data, which was not needed in this study because we used normalized data (for reasons that will be explained below).
The next step of the applied EOF method consists of the calculation of the spatial F covariance matrix (R) according to Eq. (3).Then, the eigenvalue problem is solved by Eq. ( 4).
is a diagonal matrix containing the eigenvalues λ i of R. The c i column vectors of C are the eigenvectors of R corresponding to the i-respective eigenvalues.Each of these eigenvectors can be regarded as a map which denote the EOFs (or principal spatial patterns).In what follows, we always assume that the eigenvectors are ordered according to the value of the eigenvalues.Thus, EOF1 is the eigenvector associated with the biggest eigenvalue.The fraction of the total variance in R explained by EOF i is found by dividing the λ i by the sum of all the other eigenvalues.The time evolution of an EOF i (a i ) is calculated according to Eq. ( 5).The components of these time vectors are referred to as loadings in this paper.
Using the spatial covariance calculated according to Eq. ( 3), the EOF technique provides three different results: the main patterns or EOFs, their time evolution whose components are called loadings and the portion of spatial variance explained by each EOF, which is calculated by dividing each λ by the trace of .

Manual calibration
The manual calibration was done with a dual purpose.First, we wanted to test the applicability of the proposed TETIS-VEG model in the study basin.Second, we wanted to obtain a first approximation for the parameters and, at the same time, constrain the automatic calibration.This manual calibration consisted of the usual ad hoc method (manual adjustment of parameter values) considering the Pearson correlation coefficient between the simulated LAI and the observed NDVI in a total of 32 different points inside the basin.These points were selected within homogeneous areas defined according to the main spatial patterns of the observed NDVI (EOFs) and the available maps of land cover, soil texture, DEM, slope and soil depth.
In this case, the EOF analysis was used to identify the main spatial patterns of the observed NDVI.Once the main spatial patterns were identified, we combined our own human perception with the confusion matrices between the main spatial patterns and the spatial maps of model parameterization.Confusion matrices are widely applied for map comparison in distributed modeling comparing actual to predicted values for each specific category defined previously (García-Arias et al., 2016;Bennett et al., 2013;Van Vliet et al., 2013 among many others).Generally, the rows in the matrix represent the values predicted by the model, whereas the columns represent the actual values.By its nature, the confusion matrix is an overall measure for similarity between two categorized maps.However, the comparison of numerical maps is feasible if they are categorized previously.While most of the spatial maps of the basin characteristics (land use, soil type, etc.) were categorical, the main patterns obtained by the EOF analysis were numerical.To build the confusion matrices, the main patterns were therefore discretized according to the number of river basin features (such as land cover map, soil type map, etc.) and based on the similitude between the corresponding histograms.Once the discretization was done, by a cell-by-cell comparison of the discretized NDVI main pattern maps obtained after the EOF analysis and the available spatial maps, the confusion matrices were built.
These confusion matrices allowed the calculation of the weighted kappa (κ) coefficient (Cohen, 1968).This coefficient, whose maximum value of 1 represents a perfect agreement, was employed to identify which spatial maps (land cover map, soil type map, DEM, etc.) were linked with the main patterns of the observed NDVI.Then, they were used in order to select the most appropriate points for the manual calibration.

Automatic calibration
The most innovative aspect of the proposed procedure was the direct use of the EOF analysis in the automatic calibration.As proposed by Koch et al. (2015), we decided to build one integral matrix by concatenating both the observed and predicted data: the matrix contained the normalized values of the NDVI provided by MODIS and the normalized values of the LAI simulated by the model.In this way, the upper part of this matrix contained the temporal variation of the normalized observed NDVI in all cells as columns, while the lower part contained the temporal variation of the normalized simulated LAI in all cells as columns.We decided to use the normalized values of the NDVI and LAI because, although they are correlated, they differ in range.
However, normalization implies that some spatial information is lost.In order to avoid these losses, we added two rows in the matrix F: the first containing the difference between the temporal mean of the observed NDVI at a particular cell and the general mean using the complete NDVI dataset, and the second with the same content but referred to the simulated LAI.In this way, we included the spatial gradient of the observed NDVI and the spatial gradient of the simulated LAI.These two rows represent two additional maps included in the evaluation of the model performance.If they were similar, it would mean that the spatial gradient remains and is properly reproduced.
The number of pixels was 1 034 706.For the calibration period (year 2003), there were 44 NDVI maps (one every 8 days more or less).Hence, the built integral matrix's size was 90 rows (44 + 44 + 2 additional rows) × 1 034 706 columns.After the construction of this matrix, the EOF analysis was applied obtaining the following: the EOF maps for the matrix containing both NDVI and LAI, the portion of variance explained by each EOF map and the loadings of each EOF map.The combined EOF analysis yielded orthogonal EOF maps that explained the combined intervariability and intravariability of both datasets.For each time step, the loadings express how much the respective LAI and NDVI maps contribute to the direction of the corresponding EOF.Hence, if the observed NDVI and the simulated LAI were completely correlated, the temporal evolution of the EOF maps for both, NDVI and LAI, would be essentially equal.
The automatic model calibration was carried out by trying to minimize the differences between the loadings of simulated and observed data.The used objective function was based on that idea and it also took into account the portion of variance explained by each EOF in order to consider that the variance contribution decreases consecutively for the EOFs.The adopted error measure is described in the follow-G.Ruiz-Pérez et al.: Calibration of a parsimonious distributed ecohydrological daily model ing equation: where Error is the objective function to minimize, w i is the portion of variance explained by the EOF i , load_sim i,j is the loading of the EOF i at time step j for the simulated data (in this particular case, the normalized LAI) and load_obs i,j is the loading of the EOF i at time step j for the observed data (in this particular case, the observed NDVI).
The calibration was performed using a genetic algorithm called Pyevolve (see http://pyevolve.sourceforge.net/).This algorithm needs a seed (initial values of the parameters) and a searching boundary of the parameters to be calibrated.We used the results obtained after the manual calibration explained above as seed and made sure that the searching boundaries were wide enough (Table 1).
In order to explore the outcomes of the proposed calibration framework, we additionally calculated both the temporal Pearson correlation coefficient between the NDVI provided by MODIS and the LAI simulated by the TETIS-VEG model in each cell and the spatial Pearson correlation at each time step.For the spatial and temporal correlation coefficients, we used the original values of both datasets (NDVI and LAI), not the normalized values as used by the EOF analysis.It is important to mention that the Pearson correlation coefficient between two datasets X and Y is positive if X and Y tend to be simultaneously greater than, or simultaneously less than, their respective means.Hence, the mean should be representative.For this reason, in the case of the spatial correlation coefficient, we distinguished between the main land covers whose means can be significantly different: tree, shrubs and grass.

Validation
The period selected for the model validation was the three years from 2000 to 2002.As during the calibration period (year 2003), there were data of precipitation, temperature and, also, NDVI provided by MODIS.To validate the model, we used the same performance indexes applied during the automatic calibration process.Keeping the parameter values obtained by the automatic calibration, we built the matrix concatenating the normalized value of the observed NDVI and the normalized value of the simulated LAI with two additional rows used to incorporate the spatial gradient of both datasets as explained above.We also plotted these two maps and compared them as we did during the model calibration.Using EOF techniques, we obtained the coupled EOF maps and their associated loadings and portion of variance explained by them.As during the calibration, we compared the deviation of the loadings for each EOF map and we calculated the Error function defined in Eq. ( 6).We also calculated the temporal and spatial Pearson correlation coefficients.
In addition to this, we explored the reliability of the calibrated model in reproducing streamflow.In fact, during the validation period, the observed discharge at the outlet point was available unlike during the calibration period.The reliability of the hydrological sub-model in reproducing the streamflow was an extremely challenging task considering that the entire modeling structure had been calibrated using only vegetation data from remote sensing along with physical information about the basin.
Furthermore, we included the Nash and Sutcliffe efficiency index (NS, Nash and Sutcliffe, 1970) and the bias (or volume) error (E) value between the observed and simulated discharges at the basin outlet in the model validation.We also decided to strengthen our discharge analysis by using the concept of flow duration curves (FDCs).FDCs are simple and powerful tools, commonly used in hydrology, to describe the runoff regime in a river basin that can be representative of the model's ability to reproduce the different components of the streamflow (e.g., Manfreda et al., 2005).In fact, FDCs represent the relationship between magnitude and frequency of streamflows, thus providing an important synthesis of the relevant hydrological processes occurring on the basin scale (Pumo et al., 2013).Actually, the shape of a flow-duration curve in its upper and lower regions is particularly significant in evaluating the stream and basin characteristics (Coopersmith et al., 2012).The shape of the curve in the high-flow region indicates the type of flood regime the basin is likely to have, whereas the shape of the low-flow region characterizes the ability of the basin to sustain low flows during dry seasons (Cheng et al., 2012).Hence, the FDC represents the full spectrum of variability in terms of their magnitudes (Wagener et al., 2013).

Manual calibration
The main objective of this a priori manual calibration was the identification of the most appropriate points where the model could be tested.To accomplish that, we identified the main spatial patterns of the observed NDVI and then we compared the EOFs with the spatial features of the river basin (such as land cover map, DEM, soil type map, etc.).
Using our own perception, we identified a certain relationship between EOF 1 (which explained the 61.5 % of the observed NDVI spatial variance) and the land-use map.This potential relationship was supported by the κ coefficient (described in Sect.5) that assumed a value of 0.34.This is not a high value but it showed the existence of a relationship between the two maps.Regarding EOF 2 (which explained the 10.5 % of the observed NDVI spatial variance), no connections with the basin physical characteristics were found.It might contain a mix of several drivers and, therefore, it can't be directly linked to a single one.Conversely, EOF 3 showed a good agreement with the soil texture map (the κ coefficient was 0.32).Therefore, we can state that the observed patterns of NDVI are strongly influenced by the spatial distribution of land cover and soil texture.In the following, we combined these two maps, extracted all possible combinations and randomly selected two points of each of these combinations obtaining 32 points covering all of the catchment area.
After conducting the manual calibration, the Pearson correlation coefficient between the observed NDVI and the simulated LAI was positive in 25 points of the 32 considered points.All points with negative correlations had in common the fact that they were located near to the Mount Kenya or Aberdare mountains (Fig. 2).
Finally, Table 1 shows the obtained set of parameters.This set was used as seed during the automatic calibration.It must be underlined that all parameters had values consistent with the reviewed literature (references embedded in Table 1).

Automatic calibration
The proposed automatic calibration is based on the assumption that the closer the loadings of the simulated values are to the loadings of the observed values, the higher the similarity between the spatial patterns is.Calibration produced a good agreement between the observed and simulated loadings of EOF 1 (Fig. 3; upper left panel) but small deviation between the observed and simulated loadings related to EOF 2 and EOF 3 .The loadings of the remaining EOFs were completely scattered mainly due to their corresponding low contribution (low weight) in the objective function of the automatic calibration process (Eq.6).It is needed to remark that EOF 1 explained more than 60 % of the dataset spatial variance, while EOF 2 and EOF 3 explained around 10 % each.The remaining EOFs explained less than 3 % each, but in any case they were considered during the calibration process (weighted by the portion of variance explained by each one).
We also used three additional metrics to evaluate the model performance: (1) the temporal Pearson correlation coefficient evaluated in each cell, (2) the spatial Pearson correlation distinguishing between trees, shrubs and grasses computed at any time and (3) comparison of the spatial gradient maps.First, the temporal Pearson correlation coefficient between the observed NDVI and the simulated LAI was higher than 0.4 (Fig. 4; left panel) in most of the catchment.The weakest correlations were obtained in the two highest areas of the basin near to the Mount Kenya and Aberdare mountains with zero to negative values.
The spatial Pearson correlation coefficients were calculated excluding the regions with negative temporal Pearson correlation coefficient.Although slightly worse than the results in terms of temporal correlation, the mean spatial correlations were higher than 0.45 for all main land covers: trees (mean = 0.58), shrubs (mean = 0.49) and grasses (mean = 0.55) (Fig. 5; upper panel).The best scores were obtained in cells classified as trees.In fact, the median was almost 0.60 and the variance was not high (standard deviation = 0.16).Conversely, the cells classified as grasses obtained the worst results with the lowest median and the highest variance (standard deviation = 0.18).
Figure 6 (upper panels) shows the comparison between the maps which represent, in each cell, the difference between the temporal mean and the general mean of the observed NDVI and the simulated LAI, respectively.No great differences were found by comparing both maps indicating the good spatial performance of the ecohydrological model, at least from the vegetation point of view.

Validation
Similarly to the calibration process, EOF 1 explained more than 60 % of the spatial variance, while EOF 2 and EOF 3 explained around 10 % and 5 %, respectively, for the validation period.The remaining EOF maps are not presented because  none of them explained more than 3 %.The simulated and observed loadings of EOF 1 were almost equal, while the obtained results in relation to EOF 2 and EOF 3 were slightly worse (Fig. 3; lower row of panels).However, it is important to stress that both showed the same clear temporal dynamics.Indeed, the resulting "Error" for the validation period was 4.03, just slightly worse than the Error for the calibration period.It must be considered that the Error value was calculated considering all EOFs (Eq.6).
The temporal Pearson correlation map between simulated LAI and NDVI showed the same pattern observed in the cal-ibration period: the two areas located near Mount Kenya and the Aberdare mountains had a temporal correlation coefficient equal to zero or negative.However, in more than 80 % of the catchment, this coefficient was between 0.3 and 0.9 (Fig. 4; right panel).
Regarding the spatial Pearson correlation coefficient between simulated LAI and NDVI in the three main land covers, the results were not as good as the results obtained in terms of temporal correlation.Nevertheless, there were no negative spatial correlation coefficients at any time step.In the case of shrubs and grasses, both the mean and median Hydrol.Earth Syst.Sci., 21, 6235-6251, 2017 www.hydrol-earth-syst-sci.net/21/6235/2017/ were almost 0.4, while the corresponding value for the trees was around 0.35 (Fig. 5; lower panel).The variance obtained during the validation period was narrower than that obtained during the calibration period for the three land covers.The spatial pattern of LAI was, as for the calibration period, well captured by the model (Fig. 6; see the lower panels).The cells with high differences between their own temporal mean and the general mean were consistent in both maps.Finally, since there was observed discharge at the basin outlet during the years 2000, 2001 and 2002, it was possible to compare the discharge simulated by the model with the observations.The volume error (E) was equal to −0.40, while the NS index was equal to 0.32.E is strongly affected by the results obtained at the beginning of the validation period, probably due to the absence of information regarding the initial conditions.Although we used a year as warmingup period, the simulations improved only after 2001.In fact, having calculated the performance indexes in each year, E decreased in magnitude from −0.88 in 2000 to only −0.17 during the year 2002 (Fig. 7).Regarding the NS index, the worst result was also obtained in the first year and it improved from a negative value in 2000 to 0.35 during the year 2002, as one should expect considering the visual comparison in Fig. 7.This trend is emphasized in the plot of the FDCs (Fig. 8), where the underestimation in the first 2 years is clearly highlighted.The upper panel (Fig. 8a) compares the FDC of observations and simulations within the whole period, while the lower panels (Fig. 8 b-d) compare the corresponding FDCs within the 2000, 2001 and 2002.In these plots, the simulation seems to closely interpret hydrological response in the year 2002.

Discussion
From the a priori manual calibration step to the model validation step, it was possible to identify a behavioral pattern which would also be observed during the following automatic calibration and validation steps: EOF 1 explains more than 60 % of the spatial variance, EOF 2 around 10 % and EOF 3 around 5 %, while the remaining EOFs could be considered negligible.The fact that EOF 1 and EOF 3 of the observed NDVI were related to the land cover and soil type maps, respectively, was consistent with our expectations that NDVI is a suitable proxy of vegetation dynamics.
After the automatic calibration, the model matched the observed loadings of EOF 1 and its accuracy was slightly poorer for EOF 2 and EOF 3 .Thus EOF 1 captured the predominant pattern that was found in both the observed NDVI and the simulated LAI data.On one hand, the temporal variation of EOF 1 loadings seemed to be related to the two typical growing seasons in the catchment: the first one during March-May and the second one during October-December (Franz et al., 2010;Fig. 3).On the other hand, the loadings of EOF 2 and EOF 3 were not strongly connected with any feature.The loadings of the remaining EOFs were scattered, which implies that mainly measurement and model noise were covered by these EOFs.
The weakness of the proposed calibration methodology is that, although the associated weights to the loading deviation in Eq. ( 6) are needed, they are also misleading some spatial information.New ways to weigh the loading deviations must emerge in future research as proposed by Koch et al. (2015).In fact, due to the portion of variance explained by EOF 1 , this first main pattern controlled the calibration process.In future applications, the proposed error index may be improved by focusing or excluding specific EOFs.A popular method for deciding which EOF to keep and which to discard is to use "selection rules".Basically, there are three classes of selection rules depending on whether they focus on the amount of variance explained by each EOF, the loadings or the EOF maps (Preisendorfer, 1988).Another option could be to rotate the EOFs as proposed by Bonaccorso et al. (2003).As each rotated EOF will not explain the same variance of the unrotated one, this approach would be an option to use different combinations of EOFs, which explain different amounts of variance in order to reduce the influence of EOF 1 .The automatic calibration process worked satisfactorily as shown by the additional metrics: temporal Pearson correlation coefficient, spatial Pearson correlation coefficient in the main land covers and the comparison between the gradient maps.In terms of spatial Pearson correlation coefficient, the weakest values were obtained in the higher portion of the basin near Mount Kenya and the Aberdare mountains, while the remaining cells within the study area showed a good agreement between observed NDVI and simulated LAI.This same behavior was also observed when calibrating manually.Two reasons could explain such results.First, the observed NDVI in some cells of those areas had a really bad quality as testified by the unrealistic oscillations of the NDVI from 0.8 to 0.1 (even zero) in just one week.These unrealistic oscillations could be produced by the presence of clouds over the area near to the mountains.The second reason is related to the conceptual limitation of the proposed model.The TETIS-VEG was designed to be used only in watercontrolled areas.Franz (2007) analyzed the correlation between the fractional woody cover and the mean annual precipitation within the catchment and they were strongly correlated.However, two different slopes were observed.The transition point, which indicates when water availability had a smaller influence on the fractional woody cover, occurred approximately around 800 mm year −1 .Physically, the transition point is believed to be a good approximation of the transition from a water-controlled ecosystem to a nutrientcontrolled ecosystem.This approach allowed us to define the higher areas within the study catchment (where Mount Kenya and the Aberdare mountains are included) as nitrogen limited ecosystems instead of water-controlled.
With the exemption of these two areas, a strong correlation between NDVI and LAI existed, i.e., the model captured the temporal dynamic of LAI.However, this did not necessarily mean that the magnitude of LAI was reasonable.This last point was proven by calculating the good performance in terms of spatial Pearson correlation and the comparison between the gradient maps.No differences and good agreements were observed along the main land covers: trees, shrubs and grasses.
Finally, there were four parameters in the automated calibration which changed substantially (in relative terms) in comparison to the values obtained during the manual calibration: the correction factor of the maximum static storage, the correction factor of the reference evapotranspiration, the factor related to the distribution of roots between the first and second static storage layers and the maximum LAI sustainable by the system (Table 1).These parameters directly affect the transpiration process and the amount of available water to be consumed by the plants.In any case, all obtained values were consistent with the reviewed literature (embedded in Table 1).All of them are completely included in the searching boundary used during the automatic calibration.
The validation process confirmed (1) the model was able to capture completely EOF 1 while the model performance worsened in EOF 2 and EOF 3 , (2) the simulated LAI and the observed NDVI were temporally correlated in most of the catchment and (3) the spatial distribution of LAI was consistent as shown by the comparison between the gradient maps and the value of the spatial Pearson correlation coefficient at any time.
An additional interesting outcome provided by the validation was the comparison between simulated and observed hydrograph data at the outlet.The simulated stream flow was promising, but not completely convincing.This limitation is obvious since the model parameters were calibrated on NDVI data, i.e., the model was calibrated on vegetation dynamics.Therefore, a direct comparison between hydrographs should not be too exigent when no information about the parameters involved in the river flow routing or aquifer discharge was included in the calibration.
We strengthened our discharge analysis by using the concept of FDCs.By graphical comparison (Fig. 8), it could be observed that the model is able to reproduce the shape of the observed FDC, while some discrepancies were found in terms of magnitude.However, its performance improved considerably year by year.Since the FDC shape is an important indicator of the relevant hydrological processes occurring on the basin scale, this result pointed out the capability of the proposed model calibration methodology to reproduce the main hydrological behavior of the study basin.
The main two objectives of this research were (1) to explore if it is possible to calibrate and validate an ecohydrological model using only satellite information and (2) to incorporate spatio-temporal data about a model state variable into the automatic calibration process.In order to tackle these challenges, a parsimonious distributed ecohydrological model was calibrated by exclusively using NDVI data provided by MODIS.A methodology based on EOF analysis was proposed to carry out manual and automatic calibration of the model.Finally, the results were validated using satellite data referring to different periods and the observed discharge at the basin outlet, which was not used for calibration.
In general, the proposed model was able to properly reproduce the vegetation dynamics and the observed stream flow.The results highlight the usefulness of satellite data.It was possible to implement the hydrological and the vegetation components of the TETIS-VEG daily model using only NDVI data and also to validate the model with satisfactory results.Such outcomes are promising because they demonstrate that satellite data could be exploited in order to predict river discharge in ungauged basins.More specifically, we expect this result given the key role played by vegetation in water-controlled areas such as the upper Ewaso Ngiro river basin in Kenya, where having an appropriate description of vegetation and transpiration is critical for a correct description of the water balance on the local and basin scale.
The proposed automatic calibration was completely designed to incorporate spatio-temporal data in order to take maximum advantage of the available satellite data.After calibrating, the simulated vegetation patterns display good agreement with measured NDVI in most of the basin except for some portions at higher altitudes.This non-satisfactory result may be due to the bad quality of the NDVI data and/or the limitation of the vegetation sub-model (that was specifically designed for semiarid regions).
Model limitations along with poor data quality and resolution negatively affected the overall model performance, but the proposed procedure allowed us to exploit the amount of information available addressing the critical issue of identifying a procedure for the calibration of a distributed model.This allowed us to obtain a correct description of vegetation dynamics in space and time also providing, as a marginal benefit, a fairly good streamflow prediction.In this context, it was mandatory to adopt a daily time step in order to have a coherence with satellite NDVI data and also removing the need for a runoff propagation module in our model.
Finally, we should consider that the potential of the present study is due to the large availability of remote sensing information (not only satellite) concerning spatial state variables and more information will be available in the future.Many efforts are being made to improve the quality and quantity of remote sensing data (drones, better devices, etc.).Additionally, the scientific community must also be ready to ex-ploit the enormous amount of information contained in this data (temporal, spatial and spatio-temporal).Therefore, we should identify the best way to use all of this new available information, not only for data assimilation but also for model calibration and validation.
Data availability.The MODIS data were obtained through the online data pool at the NASA Land Processes Distributed Active Archive Center (LP DAAC), USGS Earth Resources Observation and Science (EROS) Center, Sioux Falls, South Dakota (https: //search.earthdata.nasa.gov/search).
Competing interests.The authors declare that they have no conflict of interest.

Figure 1 .
Figure 1.Map showing the general location of the upper Ewaso Ngiro river basin within the boundaries of the east sub-Saharan Africa region.The study sub-catchment (study basin) was selected because of the density of rainfall stations (points in dark red).
and Pasquato et al. (2015) used time series of NDVI to validate a parsimonious ecohydrological model named HORAS.On the catchment scale, Immerzeel and Droogers (2008) used satellite-based evapotranspiration in combination with observed streamflow to calibrate the semi-distributed SWAT model.Zhang et al. (2009) concluded that multi-objective calibration of the SIMHYD model against streamflow and satellite-based aET produced better daily and monthly runoff compared to calibration with streamflow alone.More recently, Rientjes et al. (

Figure 2 .
Figure 2. Location of the points where the manual calibration was carried out.The value of the Pearson correlation coefficient between the satellite NDVI and the simulated LAI appears together with the point used for the manual calibration of the model.

Figure 3 .
Figure 3.The three first EOFs during the calibration (a) and during the validation (b) are represented.The y axes reflect the unitless loadings of each EOF.The x axes reflect the time step.

Figure 4 .
Figure 4. Temporal Pearson correlation coefficient between the NDVI provided by MODIS and the LAI simulated by the model during the calibration and validation periods.The two areas with negative values correspond to the Mount Kenya and Aberdare mountains.

Figure 5 .
Figure 5. Spatial Pearson correlation coefficient during the calibration (a) and during the validation (b) distinguishing between the main land covers: tree, shrubs and grass.The whiskers were calculated according to the 98th percentile and the outliers were plotted as × symbol.The median is the line inside the boxplot and the mean is the square tile symbol.

Figure 6 .
Figure 6.Comparison between the maps where each pixel color represents the difference between the temporal mean calculated in this particular pixel and the general mean calculated using all datasets of the simulated LAI (a, c) and observed NDVI (b, d) in both periods: calibration (a, b) and validation (c, d).This difference is a measure of the spatial gradient of both variables (LAI and NDVI).

Figure 7 .
Figure7.Time series of rainfall and observed and simulated daily discharge (m 3 s −1 ) during the validation period(2000, 2001 and 2002).