Improving the complementary methods to estimate evapotranspiration under diverse climatic and physical conditions

Reliable estimation of evapotranspiration (ET) is important for the purpose of water resources planning and management. Complementary methods, including complementary relationship areal evapotranspiration (CRAE), advection aridity (AA) and Granger and Gray (GG), have been used to estimate ET because these methods are simple and practical in estimating regional ET using meteorological data only. However, prior studies have found limitations in these methods especially in contrasting climates. This study aims to develop a calibration-free universal method using the complementary relationships to compute regional ET in contrasting climatic and physical conditions with meteorological data only. The proposed methodology consists of a systematic sensitivity analysis using the existing complementary methods. This work used 34 global FLUXNET sites where eddy covariance (EC) fluxes of ET are available for validation. A total of 33 alternative model variations from the original complementary methods were proposed. Further analysis using statistical methods and simplified climatic class definitions produced one distinctly improved GG-modelbased alternative. The proposed model produced a singlestep ET formulation with results equal to or better than the recent studies using data-intensive, classical methods. Average root mean square error (RMSE), mean absolute bias (BIAS) andR2 (coefficient of determination) across 34 global sites were 20.57 mm month −1, 10.55 mm month −1 and 0.64, respectively. The proposed model showed a step forward toward predicting ET in large river basins with limited data and requiring no calibration.


Introduction
A reliable estimate of ET (evapotranspiration) in river basins is important for the purpose of water resources planning and management. ET represents a significant portion of rainfall in the water balance especially in semiarid regions where most rainfall is typically lost as ET (FAO, 1989). Therefore, the uncertainty in estimating ET can lead to the inaccurate prediction of water balance. A careful screening of available meteorological, land use/land class and related hydrologic data in typical rural river basins suggest that ET is more challenging to calculate given the limited data. Data limitations in most rural river basins highlighted the importance of using alternative methods as opposed to the classical methods using land use/land cover data. While remote sensing techniques are available to estimate ET, such methods are expensive and necessary data may not be readily available for verification (Jimenez et al., 2011). Complementary methods initially proposed by Bouchet (1963) and others are alternative methods that can be used to calculate ET using meteorological data such as relative humidity, temperature and sunshine hours.
There are several classical methods presently available to estimate potential ET whereas estimating actual ET requires detailed local data such as land cover/land use, crop pattern and growing cycle. Typically, these classical methods predict crop ET from crop covered areas during the growing season to manage agricultural water demands. Crop ET is nothing but the potential ET multiplied by an appropriate crop coefficient, which is sometimes called the two-step approach (Allen et al., 1998). However, the actual water loss In water resources planning, the important estimate is the total water loss from the land surface that may or may not include transpiration from crop areas.
For several decades, complementary methods, including CRAE (complementary relationship areal evapotranspiration; Morton, 1983), AA (advection aridity; Brutsaert and Stricker, 1979) and GG (Granger and Gray;Granger and Gray, 1989) methods, have been used to estimate ET or total water loss from the land surface independent of land cover. These methods are attractive due to simplicity and practicability in estimating ET, wet environment ET (ETW) and potential ET (ETP) at the regional scale using meteorological data only. Previous studies attempted to use the complementary methods with little success (Doyle, 1990;Hobbins et al., 2001;McMahon et al., 2013;Szilagyi and Kovacs, 2010;Xu and Singh, 2005) given the limited understanding of the methods and the conflicting definitions of different terms. Still the complementary methods offer a distinct advantage over the classical methods given the simplicity, ready availability of required data and the ability to estimate total water loss as opposed to crop ET only.
Any improvements to the complementary methods cannot be conducted without the use of actual ET measurements. As part of this study, it is important to use measured ET data for model validation. Currently, ET fluxes are directly measured using the eddy covariance (EC) method that uses surface energy fluxes for weather forecasting and hydrologic modeling. These fluxes include sensible heat (H ) and latent heat (LE) fluxes. Compared to other methods such as lysimeters, an EC system produces minimal physical disturbance to the surrounding environment and captures the areal fluxes within the footprint area (Luo et al., 2010). Most importantly, EC data are freely accessible worldwide, for example, FLUXNET (http://fluxnet.ornl.gov/), which is a global network of micrometeorological sites that use the EC methods to measure land-atmosphere exchange of carbon dioxide, water vapor and energy fluxes (Baldocchi et al., 2001). FLUXNET comprises of free-access regional networks such as AmeriFlux, AsiaFlux, EuroFlux and CarboAfrica. Given the task of finding a large set of global data with different climatic conditions and physical conditions, this study used the FLUXNET sites similar to many other studies (Castellvi and Snyder, 2010;Huntington et al., 2011).
The major limitation of the EC method is the lack of energy balance closure (i.e., H + LE = R n -G soil , where R n is net radiation and G soil is soil heat flux) that causes underestimation of ET (Wilson et al., 2002). Twine et al. (2000) and Wang et al. (2008) showed that underestimation of ET can be as high as 15 %, however, others, Castellvi et al. (2008), Huntington et al. (2011) and Wilson et al. (2002), found lower percentages within measurement uncertainty that can be < 5 %. These studies showed that the impact of energy im-  Morton, 1983). balance in the EC method may not be significant as thought earlier (Castellvi and Snyder, 2010). Hence, the EC method is still attractive and served as the standard method for direct measurement of ET fluxes (Castellvi et al., 2008;Luo et al., 2010). Hobbins et al. (2001) and Xu and Singh (2005) found limitations to the complementary methods in different physical and climatic conditions especially in arid settings. Some of these limitations lead to many unanswered questions such as how applicable are the complementary relationship to estimate ET? Are these methods only valid within humid climates? What are the limitations in the different complementary methods? Have complementary methods been compared to measured ET data under a variety of climatic and physical conditions? Given these unanswered questions, it is important to address the validity of the complementary methods in a scientifically justifiable manner.
It is found that there is no single study where the ET estimates from the complementary methods have been extensively predicted and evaluated using data from EC sites. To evaluate the applicability of the complementary methods and to propose suitable changes, the methods need to be evaluated under a variety of land cover/land use classes and climatic conditions. In addition, the three complementary methods, CRAE, AA and GG, have not been cross-compared and evaluated using measured ET data. Therefore the goals of this study are to investigate the applicability of the complementary methods in estimating ET in contrasting environments, perform necessary revisions to the existing methods to improve estimates if necessary and finally propose a universal model of estimating ET that is calibration-free, simple, robust and uses minimum data.  Bouchet (1963). The theory states that a complementary relationship exists between ET and ETP as shown in Fig. 1 (see Davenport and Hudson, 1967;Pettijohn and Salvucci, 2009). ETW, however, is ET that would occur if the soil-plant surface is wet enough so that ET could approach its potential value, ETP (Granger, 1989). The development of the complementary relationships and the definitions of various terms are discussed in detail by Brutsaert and Stricker (1979), Granger and Gray (1989), Lhomme and Guilioni (2006), McMahon et al. (2013), Morton (1983) and Pettijohn and Salvucci (2009). The three definitions of ET are related as where ET, ETW and ETP are in millimeters per month (mm month −1 ). Equation (1), which is the Bouchet original expression, indicates that an increase in ET is accompanied by an equivalent decrease of ETP; i.e., δET = −δETP. In other words, as the surface dries, actual ET decreases causing a reduction in humidity and an increase in temperature of the surrounding air, and as a result ETP will increase. Once ETP and ETW are estimated, ET is subsequently derived. In the literature, the complementarity relationship between ET and ETP shown in Eq. (1) is of controversy among scientists who claimed that many inherent assumptions of Bouchet's theory lack sufficient evidence (Granger, 1989;Lhomme and Guilioni, 2006). Recently, there have been several attempts to improve the complementary relationship and its predictive power of different ET definitions (see Brutsaert and Stricker, 1979;Granger and Gray, 1989;Morton, 1983). Han et al. (2012) developed a nonlinear approach to the complementary relationship but the results require further study and verification. Yet, Lhomme and Guilioni (2010) proposed a different model that can describe the complex relationship between ET and ETP based on the convective boundary layer.

CRAE method
ETP is estimated by solving the energy balance and vapor transfer equations iteratively (Morton, 1983). ETP is calculated by solving at equilibrium temperature (T P in • C) at which the energy balance and vapor transfer equations for a moist surface are equivalent. The procedure describing the iterative solution is given by Morton (1983, Appendix C). The energy balance equation to estimate ETP is given as where R T is net radiation for soil-plant surfaces (mm month −1 ) at air temperature T ( • C), λ is the heat transfer coefficient (mbar • C) and f T is the vapor transfer coefficient (mm month −1 mbar −1 ). To estimate ETW in Eq. (4), net radiation for soil-plant surfaces at T P (R TP ) is first computed using Eq. (3).
where γ is the psychrometric constant (mbar • C −1 ), b 1 is a constant representing advection energy, b 2 is a constant and P is the rate of change of saturation vapor pressure with T at T P (mbar • C −1 ). Constants b 1 and b 2 were calibrated using climatic data from arid regions in North America and Africa (Morton, 1983). ETP from Eq. (2) and ETW from Eq. (4) are used in Eq. (1) to calculate ET of the CRAE method.

AA method
In the AA method, Penman's (1948) equation (ET PEN ) is used to estimate ETP as shown in Eqs. (5) and (6).
where is rate of change of saturation vapor pressure with T (mbar • C −1 ), R n is net radiation (mm month −1 ), G soil is soil heat flux (mm month −1 ), E a is drying power of air (mm month −1 ), β is a constant and usually equals to 1.0, U is wind speed at 2 m above ground level (m s −1 ), e s is saturation vapor pressure at T (mm Hg) and e a is vapor pressure of air (mm Hg). In the wind formulation of Penman (1956), β was updated to 0.5. Although both wind function formulae (when β = 1 or 0.5) are widely used in hydrology, Penman preferred a β value of 1 (see Brutsaert and Stricker, 1979;McMahon et al., 2013). Brutsaert and Stricker (1979) mentioned that their method is insensitive to the wind function. The first term of Eq. (5) is called equilibrium ET and the second is aerodynamic ET that is generated by large-scale advection effects (see Hobbins et al., 2001). When advection is minimal, the interactions of atmosphere with the soil-plant system will be completely developed and an equilibrium condition is approached (Brutsaert and Stricker, 1979). ETW of the AA method is calculated using ET PT of Priestley and Taylor (1972) in which minimal advection is assumed and given by Eq. (7).
where α is a coefficient that typically equals to 1.26 or 1.28 (Priestley and Taylor, 1972). The AA method in this study used the values α of 1.28 and β of 1. ETP from Eq. (5) and ETW from Eq. (7) are used in Eq.
(1) to calculate ET of the AA method.

GG method
The complementary relationship given in Eq.
(1) is primarily used by the CRAE and AA methods. In the GG method, Equation (8) is reduced to Eq. (1) only when γ = . In this method, two new concepts were proposed and empirically correlated together; relative drying power (D) and relative evaporation (G) shown in Eqs. (9) and (10), respectively.
where D indicates surface dryness, i.e., D becomes larger as the surface becomes drier. G is ET that occurs under similar wind and humidity conditions from a saturated surface at the actual temperature (Granger and Gray, 1989).
In the original work, G was defined as G 1 through Eq. (11) where this equation was empirically derived using data from two stations in a semiarid region of Western Canada. Granger and Gray (1989) mentioned that G 1 is independent of land use.
where c 1 = 1.0, c 2 = 0.028 and c 3 = 8.045. In the GG method, the selection of the function to calculate relative evaporation (G) has great impact on the actual ET estimates and any modification to this empirical formula may be significant in improving the predictability of the GG method. In essence, there is more research required in this effort. Thus, Eq. (11) was later modified by Granger (1998) to account for different surface conditions as shown in Eq. (12).
where c 4 = 0.793, c 5 = 0.2, c 6 = 4.902 and c 7 = 0.006. Therefore G in Eq. (10) can be substituted by G 1 of Eq. (11) or G 2 of Eq. (12). ETW required to solve Eq. (8) is obtained from Eq. (5), used earlier in the AA model. Thereafter G 1 is used in Eq. (10) together with Eq. (9) to solve for ET in Eq. (8). The final equation describing ET in the GG method is therefore given as where ET, R n , G soil and E a are in millimeters per month.
Although the CRAE, AA and GG methods enable the direct prediction of ET without the need for surface parameters (temperature and vapor pressure), but the GG method is the only method that does not require a prior estimate of ETP (Granger, 1989).

Alternative method (ASCE)
In the popular ASCE (American Society of Civil Engineers) method (Allen et al., 2005), input data to calculate net radiation (R ASCE ) are similar to those of the CRAE method. More specifically, the ASCE method requires minimum and maximum temperature data, which sometimes are not available. In such a case the procedure described by Allen et al. (2005, Eq. E.5) is followed. One major difference between the CRAE and ASCE methods is the albedo calculation. In the former, albedo is calculated using a set of equations whereas albedo is fixed at 0.23 in the latter. The ASCE method also requires wind speed measurements to calculate ETP while estimating crop ET requires detailed information of land cover/land use, crops, cropping pattern and the growing cycle. The ASCE method is specifically utilized in this study to compare R ASCE with R T and R TP . The ASCE method is also used to calculate G soil using monthly averages of temperature data.

Sites of EC data
In this study 34 global sites were selected with measured meteorological and flux data and these sites are distributed as follows: 17 from AmeriFlux sites, 11 from EuroFlux sites, 5 from AsiaFlux sites and 1 CarboAfrica site (see Fig. 2). Unfortunately, efforts to obtain data from other sites in Car-boAfrica have not been successful. The selection of the 34 sites was based on data availability and climatic variability. The details of the sites and data collected are shown in Table 1 and Fig. 2. The reason to select 34 sites is that prior studies have typically used a smaller number of sites and in most cases under similar climatic conditions. By using a variety of global sites in contrasting physical and climatic conditions with  where P ann is average annual precipitation in millimeters and T ann is average annual T in degrees Celsius. Unlike other aridity indices, AI M indicates the availability of both water and energy from readily available data. In effect, the sites were sorted to the following climatic classes; very humid (AI M ≥ 35), humid (28 ≤ AI M < 35), subhumid (24 ≤ AI M < 28), Mediterranean (20 ≤ AI M < 24), semiarid (10 ≤ AI M < 20) and arid (AI M < 10). As shown in Table 1, the 34 sites have different geographic and climatic conditions. The data set consists of 1657 monthly measurements across the 34 sites. The P ann values range from 196 mm at site 25 to 2231 mm at site 4, and T ann varies between −1.7 • C at site 3 and 26.3 • C at site 4. It is noticed that many sites fall within the very humid climatic class. The surface conditions also differ considerably from grasslands to forests. Data are available from 12 to 120 months from 1992 to 2010. At site 1, for example, data from 24 months are available in 2006 and 2007, while at site 4 there are no ET data in April 2003. Therefore, the total number of months included in the calculations from 2002 to 2005 is 47 instead of 48.
The EC tower heights vary from 2 m at site 24 to 103 m at site 14 with a median value of 10 m at site 7 and an average value of 17.1 m. The EC tower height reflects the vertical flux footprint that usually indicates the upwind area captured by the instruments mounted on the tower. Starting from very humid, humid, subhumid, Mediterranean, semiarid to arid climatic classes, the average EC tower heights are 24.8, 28.2, 15.8, 10.2, 4.6 and 6.8 m, respectively. It is no surprise that the tower heights are highest in the very humid sites where the land cover is dominated by forests of high canopy altitudes. However, low tower heights are required for arid and semiarid sites naturally characterized by grassland or shrubland covers. The high range of EC tower heights explains the suitability of selecting these particular 34 EC sites that have flux footprints of the scale of the complementary methods. This observation may lead to the conclusion that a perfect correlation between the EC and complementary methods may exist.
Compared to the lowest average-ET EC flux (10.5 mm month −1 ) that occurs at site 25, site 4 has the maximum of 134.3 mm month −1 . It is observed that site 4 has the highest ET EC fluxes across the 34 sites because the site is located in tropical peat swamp forests where soil moisture is relatively high throughout the year (Hirano et al., 2005) and the site is also exposed to high energy demands. In general, the wide ranges of ET EC fluxes and AI M values reflect the diversity of hydrologic and climatic conditions present in this study.

Measured flux data from EC systems
In comparison to finer-resolution data, collecting data at a monthly scale is easier in rural and sparse areas, less problematic when data quality is poor and more appropri-ate for regional-scale studies. Thompson et al. (2011) examined model performance using different timescales from half hourly to interannual and found that a monthly time step is preferable. Data in this study were directly downloaded from its regional network website and sometimes obtained (or complemented) through personal communications. In cases where monthly data were not readily available, average monthly data were aggregated from finer-time-resolution data, e.g., daily or hourly. To keep minimal changes to the input data, only months of available data (50 % or more) were considered in the analysis.
Input data requirements are often the driver to select a specific method to estimate ET. Even in rural regions where data limitations are common, data to calculate R n with the CRAE method (Morton, 1983) only requires monthly averages of temperature, humidity (or dew-point temperature) and sunshine hours (or solar radiation). Again, the CRAE method calculates two types of R n , R T and R TP , at the same time. It is obvious that the CRAE method can also estimate ETP, ETW and ET using the same data. However, both AA and GG methods, similar to any classical method, need wind speed measurements to calculate ET (see Eq. 6).
The performance indicators used to assess the model predictions are root mean square error (RMSE), mean absolute bias (BIAS) and coefficient of determination (R 2 ). As the number of sites is large, the BIAS, which indicates the disparity of predicted and measured ET, is preferred over the mean bias value itself because negative values of mean bias cannot cancel positive values.

Model development and results
The approach used here is a systematic model sensitivity analysis across the three existing complementary methods to identify the major model components contributing to predicting ET compared to the EC observations. The findings from each step of the sensitivity analysis is later used to propose a universal model that is calibration-free and capable of predicting ET (or total water loss) independent of land cover/use. The proposed approach can be divided into four stages: (1) first, the three original complementary methods are applied across all 34 sites to identify the relative accuracy of each method, (2) using the results obtained from the first stage, a set of model variations representing the different model structures will be developed, (3) next the model variations with acceptable results will be selected for further analysis, and, finally, (4) a statistical analysis will be conducted to differentiate between the final model(s) to identify a universal model capable of predicting ET across all sites without calibration. To further test the proposed model, the results of this study will be compared with the results of recently published ET studies.

Comparison between original complementary methods
The ET estimates computed using the three original complementary methods were compared to the measurements from the EC sites (ET EC ) and the results are given in  Xu and Singh (2005) evaluated three sites of diverse climates and found that the predictive power of the methods increases with humidity. This conclusion contradicts the results in Table 2 as the CRAE and AA methods perform best in arid climates. In general, the three methods work relatively well under extreme climatic conditions, either arid or humid. Also the predictions of the GG method are slightly better in humid climates than arid as found by Xu and Singh (2005). Overall, the CRAE method is the best according to RMSE and R 2 while the GG method has the lowest BIAS. Still, the computed ET estimates are not close enough to the ET EC measurements indicating that there is a need for improvements to the existing methods.

Development of alternative model variations
The prior estimates of ET are highly dependent on R n . Net radiation computed by Morton (1983) is denoted as R T , which is net radiation at T while R TP is net radiation at T P . Net radiation from Allen et al. (2005) is denoted as R ASCE . When compared to the R n measurements from the EC sites, the three estimates of net radiation perform better as humid-ity increases. Although detailed results are not shown here, the average R 2 values of R T and R ASCE estimates range from 88 to 98 % and from 92 to 98 %, respectively. While R ASCE is the overall best estimator of R n , R T performs better in arid and semiarid regions. The results of this analysis clearly indicate that the net radiation prediction is dependent on the climatic class and therefore, any improvements should consider climate dependency. Selecting the correct equations to calculate ETP, ETW and even ET may significantly influence the accuracy of the net radiation estimates. This work used the original model equations of the CRAE, AA and GG methods in different ways. This study is not meant to explore all possible relationships between ETP and ETW; instead the focus here is developing a reliable predictive model of actual ET that is applicable under a variety of climatic and physical conditions. Therefore, the relationships and model equations of the original methods were used here in a manner to preserve the physical processes controlling ET. Similarly, there are two formulae to describe the complementary relationship, namely Eqs. (1) and (8). It is true that there may be other possible formulae to simulate the complementary relationship between ET, ETW and ETP. The drawback of these approaches is the need for calibration for which the revised model will be applicable for a given site or region. This condition is against the original purpose of this study that attempts to develop a model that is widely applicable for many different climatic and physical conditions.
In stage 2, different combinations of model formulations are considered to develop a set of alternative model variations that may be better than the original methods. For instance, these alternative model variations can decide if R T is a better estimator of net radiation compared to R ASCE or not. Similarly another question is if the complementary relationships are adequately presented by Eq. (1) or Eq. (8) or if a different formulation is needed. In selecting these different alternative model variations, the criteria for the sensitivity analysis used are the method to calculate R n , the representation of the complementary relationship, the value of   Table 3 for subsequent analysis. As discussed earlier, this is a systematic parameter sensitivity exercise to identify the best alternative model variation. Although more model variations are possible, the 17 listed alternative model variations are adequate at this stage. For example, the AA and GG methods have four criteria each (R n , complementary relationships, α and β) producing 16 model variations. An important consideration in the development of these model variations is the conclusions of others. For instance, Hobbins et al. (2001) found that changes to the AA method did not necessarily produce superior results, especially by perturbing β (see Brutsaert and Stricker, 1979). The ET estimates produced by these 17 alternative model variations across the 34 sites were compared to the EC measurements and the results are shown in Fig. 3. It should be noted that Fig. 3 shows the anomalies from the original method for each model variation. In effect, the results are considered to show improvements if the anomaly of RMSE is negative. The same trend is valid for BIAS but opposite for R 2 . It is observed that none of the CRAE-or AA-based alternative model variations improved RMSE and BIAS. Among the CRAE-based model variations, CRAE2 has the minimum deterioration of RMSE and BIAS while showing some improvement of R 2 . A similar behavior is noticed with AA4 of the AA-based model variations. However, the GG-based model variations have obvious improvements across all three metrics. GG1, GG3, GG5 and GG7 model variations showed improved RMSE and BIAS values when compared with the original GG method. The only common feature among these four GG model variations is Eq. (1) representing the complementary relationship and not Eq. (8), which was used by the original GG method. This observation indicates that Eq. (1) is superior in representing the complementary relationships between ET, ETW and ETP. The deterioration of results in the GG-based model variations is deemed minor when compared to other model variations. The conclusion from stage 2 is that these GG model variations perform better than the CRAE and AA model variations.
Although ETP is usually given under saturated conditions in the equation of Penman (1948) as shown in the original AA method, the definition of ETW still has some ambiguity (Lhomme and Guilioni, 2006). One important difference of the original GG method compared to the other two methods is the equation describing ETW. ETW of the original CRAE and AA methods is derived from the ET PT equation (Eq. 7) while the original GG method uses the ET PEN equation or Eq. (5) (Brutsaert and Stricker, 1979;Granger and Gray, 1989;Morton, 1983). Given this departure of the GG model  from others, we further studied the GG model variations based on the model describing ETW. Accordingly, another set of alternative model variations from the GG model is possible. These variations consist of 16 models (GG8 through GG23) and the details are given in Table 4. In these variations, β is no longer changed while α in the ET PT equation will be changed. ETW in all these variations will use the Priestley-Taylor equation (Table 4). In total, 24 GG model variations (GG1-GG23 from Tables 3 and 4) are now considered for the next stage.

Selection of best performing GG model variation
For the purpose of selecting the best GG model variation(s), each model from the latest 24 was run and the results were compared with EC observations (see Table 5). The performance metrics were used to identify the best GG model variation in each climatic and performance metric combination and the results are shown in Table 5. For example, GG3 was the best for RMSE, GG1 for BIAS and GG17 and GG23 for R 2 in the very humid class. In essence, 11 GG model variations became eligible from the 24 selected earlier from stage 2. It is also observed that GG20 is the best for six combinations of performance metric and climatic class combinations. In contrast, GG3 is the best only in RMSE for the very humid class. GG1, GG3, GG11 and GG13 are the best models each for one combination of metric and climatic class combination only. Therefore these GG model variations were rejected and the remaining seven (GG7, GG14, GG17, GG18, GG20, GG22 and GG23) were selected for further consideration.
There are other key observations made from the prior analysis. First, the original GG method uses the complementary relationship given by Eq. (8) (Granger, 1989), yet, five of the seven promising model variations selected earlier uses Eq. (1). In essence, this observation suggests that Eq. (1) is better in capturing the variability of ET compared to Eq. (8). Second, six of these seven promising GG model variations use the ET PT equation to calculate ETW. Third, a comparison between R T and R ASCE shows that six of these promising GG model variations use R ASCE to denote net radiation that supports the conclusion drawn earlier. Fourth, five of these GG model variations use Eq. (12) to calculate G. Lastly, changing the value of α in the ET PT equation and varying the equation describing G did not alter the results. The next step of the analysis will be to identify the best model variation of the seven selected earlier. Before proceeding to the next step, the six climatic classes are simplified to represent climatic variability using three simple classes; wet (from original very humid and humid), moderate (from original subhumid and Mediterranean) and dry (from original semiarid and arid). This revision shall not affect the results and will make the analyses and conclusions simple. Using these new definitions, the original 34 global sites are now reallocated as 18, 6 and 10 into wet, moderate and arid classes, respectively. Figure 4 shows the results of performance metrics to these seven models using the simplified climatic classes of wet, moderate and dry. For all climatic classes, GG17 has the highest RMSE and GG7 has the highest BIAS values. GG7 performs well only in the wet climatic class, while it performs poor in the moderate and dry classes. The GG17 and GG23 model variations have identical behaviors since these differ in the α value only. Both models fail in the moderate climatic class. It is also noticed that GG14 does not simulate ET well in the moderate climatic class.
Overall, GG22 has the lowest median and average values of RMSE that are 16.20 and 20.23 mm month −1 , respectively. These results indicate that GG22 has the potential to be the best model variation. Based on BIAS for all sites, the lowest average value is 10.55 mm month −1 for GG18, while the lowest median value is 7.45 mm month −1 for GG20. Comparing the three model variations, both GG18 and GG20 have same R 2 of 0.64 and GG22 produced 0.62. It is therefore reasonable to state that GG18, GG20 and GG22 are the best GG model variations for further consideration.
There is no evidence to suggest that a specific model variation from these three models is superior in a particular climatic class. The climatic class with poorest performance is the moderate class. The reason may be the low number of sites in this class and therefore extreme values such as those of site 24 can dramatically influence the results. In the moderate climatic class, GG22 has the lowest average RMSE and BIAS, however, GG18 and GG20 share the highest average R 2 . It is also noted that all three model variations have the following similarities; net radiation is calculated by R ASCE , the complementary relationship is represented by Eq. (1) and the ETW is computed by Eq. (7).
The performance metrics (RMSE, BIAS and R 2 ) for the three model variations can be compared with uncertainty associated with observed EC-based fluxes to assess the overall accuracy of the methods. For example, Mauder et al. (2007) showed that RMSE and bias of LE sensors normally range from 38 to 61 mm month −1 and from −29 to 30 mm month −1 , respectively. In another study, it was found that EC data are comparable to weighing lysimeter ET measurements (Castellvi and Snyder, 2010) when the RMSE was 26 mm month −1 and R 2 was 0.98. These results indicate the high efficiency of the three model variations, namely GG18, GG20 and GG22, in predicting the actual ET.

Statistical analysis
The applicability of the three GG model variations, GG18, GG20 and GG22, is further investigated using the analysis of variance (ANOVA) to assess if these three models are similar or not (Berthouex and Brown, 2002). The ANOVA test was used on the time series consisting of 1657 estimates of ET from each model variation and measured ET EC . The average values of ET across the 34 sites are 35.9, 33.8, 33.2 and 32.0 mm month −1 , for GG18, GG20, GG22 and measured data, respectively. There is a tendency to underestimate average ET by all three model variations. The reason may be the similarity in structure of the three GG model variations.
The ANOVA F test statistic (F V 1,V 2,1−CI ) was computed for the four time series (three GG model variations simulated and ET EC observations) at 95 % confidence level (V 1 is number of models minus 1, V 2 is number of measurements minus the number of models, and CI is the confidence interval) and compared to that of the F test of ANOVA. Simply, if the F test is smaller, methods are alike. In this case, F 3,1653,0.05 is found to be 2.60 (Berthouex and Brown, 2002, Table C in Appendix) while the F test is 4.58. Therefore, it is obvious at 95 % confidence, the averages of the four time series are not equal; however, the test cannot identify which model variation is different compared to the others. For this purpose, Dunnet's method (Berthouex and Brown, 2002) was used to compare the three GG model variations to the measured ET EC fluxes. Dunnet's method has the advantage to answer two questions: a confidence interval in which average values are alike and the direction of the difference. The results of Dunnet's method showed that at 95 % confidence interval, the average ET is between 32.3 and 39.4 mm month −1 . In other words, GG22 is statistically different while the difference in each of the other two model variations is likely to be insignificant. Figure 5 shows the average ET estimates across 33 sites according to the climatic class. At site 4, none of the models can simulate the elevated ET fluxes measured. In general, GG22 underestimates ET as humidity increases. However, the scatter of data around the 1:1 line for most climatic classes is more pronounced with GG18 and GG20. The similarity between GG18 and GG20 is visible because the only difference between the two models is α in the ET PT equation that does not influence the results. In fact, GG18 has two advantages over the other two model variations: it has the closest average ET value to that of the ET EC fluxes and closest to the 1:1 line (see Fig. 5). Hence, GG18 is deemed to be the best from the seven promising GG model variations.
In Fig. 6, the performance metrics of GG18 are shown for each site in the three climatic classes. The R 2 values have a minor increasing trend with humidity. The R 2 values at sites of the wet climatic class mostly lie above the average value and vice versa for the dry climatic class. There is no such trend with RMSE and BIAS. However, the RMSE and BIAS values at most sites of the dry climatic class are below the average value. Again, it is emphasized that site 4 has specific data issues that have to be further inspected. Generally, Fig. 6 demonstrates that GG18 is consistently predicting ET across these 34 sites that have diverse climatic and physical conditions. It also indicates that there is no evidence that the flux footprint (EC tower height) plays a major role or directly impacts the accuracy of the results.
The average R 2 values of GG18 over the wet, moderate and dry classes are 0.72, 0.61 and 0.52, respectively. Since the ET fluxes differ between the wet and dry climates, the absolute values of RMSE may not be simply compared to each other. Instead, the RMSE value at each site is divided by the average ET EC value shown in Table 1 such that the relative RMSE is computed and compared across all sites. The values of relative RMSE for GG18 range from 0.23 at site 11 to 1.59 at site 34 with an average of 0.69.

Comparison with recent studies
In this section, the results of the proposed modified complementary method, specifically GG18, are compared to the  results from recently published studies using the classical methods and original complementary methods. Suleiman and Crago (2004) estimated hourly ET using radiometric surface temperatures in two grassland sites in Oklahoma and Kansas and validated using EC data. The results showed the RMSE values ranging from 32 to 53 mm month −1 while R 2 varied between 0.78 and 0.94. Mu et al. (2007) used data from 19 AmeriFlux EC sites to validate the estimates of a remotely sensed ET using a revised Penman-Monteith equation. The average RMSE, bias and R 2 were 29 mm month −1 , −6 mm month −1 and 0.76, respectively. When used with 46 AmeriFlux sites (Mu et al., 2011), the results showed average RMSE, absolute bias and R 2 of 26 mm month −1 , 10 mm month −1 and 0.65, respectively. Kuske (2009) estimated ET using Penman-Monteith and Priestley-Taylor equations and compared estimates to EC data. Both models were significantly overestimating the high ET fluxes and slightly underestimating the low ET fluxes. Thompson et al. (2011) tested ET "null" model that couples the Penman-Monteith equation to a soil moisture model at 14 AmeriFlux sites from which 8 sites are used in the present study. RMSE varied between 56 and 208 mm month −1 and therefore, changes were made to further improve the model to produce RMSEs of 34-175 mm month −1 .
However, complementary methods to predict ET have not been extensively compared with EC-based ET measurements. With the exception of Ali and Mawdsley (1987), researchers have recently started paying attention to the complementary methods. A monthly ET map using a modified Morton method was produced using MODIS imagery for Hungary (Szilagyi and Kovacs, 2010) and verified using three EC sites. At two sites, R 2 values were 0.79 and 0.80 and bias ranged between −19 and 21 mm month −1 . At the third site, however, the authors found a difference of 44 % with the EC measurements due to physical conditions at that particular EC tower (see Szilagyi and Kovacs, 2010). Shifa (2011) examined the wind function of the AA model using data under wet and dry conditions. With the original AA method, RMSE was 17 and 29 mm month −1 for the wet and dry conditions, respectively. The author found that the AA method performs best using calibrated wind function coefficients under wet conditions in which RMSE and R 2 were 12 mm month −1 and 0.7, respectively. Huntington et al. (2011) tested the AA method using data from arid shrublands at five EC sites in eastern Nevada. It was found that RMSE, R 2 and percent bias were 13 mm month −1 , 0.77 and 18 %, respectively. RMSE, R 2 and percent bias of a modified AA method were 11 mm month −1 , 0.71 and 1 %, respectively. Han et al. (2011) proposed an enhanced GG model at four sites under different land covers and compared it to the original GG method and EC-based ET data. The enhanced model was better than the original GG method at three sites and RMSE of the enhanced GG model ranged from 4 to 16 mm month −1 . Table 6 shows the results from a set of the abovementioned studies compared with the results of the proposed GG18 model variation. The comparison shows that the results of the GG18 model variation are equal or better and more reliable considering the wide range of physical and climatic conditions of the 34 global EC sites used in this study. More importantly, the ET estimates of GG18 outperform the estimates of ET of other studies given the minimal cost and data needed to compute reliable regional ET using meteorological data only. Furthermore, GG18 is a single-step method that does not require local calibration and therefore suitable to use in rural river basins with minimal data and monitoring while providing the total water loss from the land surface that is appropriate in water resources planning.
The GG18 model is close to a "universal model" and shows better behavior among the 34 sites and the results are more consistent across the spectrum of climatic classes as shown in Fig. 6. The ET estimates of the GG18 model for the moderate-climate sites are comparable to both wet or dry climatic classes (Fig. 6), and those of the most recent ET studies (Table 6). None of the original (CRAE, AA and GG) methods, however, succeeded to estimate ET under subhumid and Mediterranean climatic classes (see Table 2). The discrepancy is clear when compared to the more extreme conditions, i.e., dry and humid categories (Table 2). For example, one may argue that the average values of performance metrics of the GG18 model are slightly better than those of the original CRAE method that does not need wind measurements. The comparison cannot be made only between the overall average values given by the CRAE method and the GG18 model. There are other statistics (e.g., standard deviation) that show the accuracy (or distribution) of the ET estimates among the 34 sites. As discussed earlier, one major problem of the CRAE method is that it fails to estimate ET under subhumid and Mediterranean climatic classes (see Table 2). Under the diverse physical and climatic conditions, the GG18 model variation is quantitatively and qualitatively outperforming all original complementary methods. The model structure of the proposed GG18 model variation is given in Fig. 7.
One last concern is about the most proper temporal resolution of the GG18 model. It is known that the original AA and GG methods are usually used at daily timescales, while the original CRAE method is typically used at a monthly timescale. The goal of this study is to propose a universal ET model that can be successfully used for data deficit conditions under which daily data are missing or unavailable. It is believed that the regional estimates of ET entail monthly time resolution. Thus, the question now is whether applying the GG18 model at a monthly timescale will change the parameters of the model used in daily time steps or not. In order to answer this question, the proposed GG18 model was applied to a countrywide study of Ghana where daily data were available and climate varies from semiarid in the north to tropical humid in the south (Anayah et al., 2013). The predictions using monthly data from 2000 to 2005 were very much comparable to the daily estimates of the GG18 model. These results suggested that the GG18 model can accommodate both daily and monthly time steps to produce consistent results. The reader may refer to Anayah (2012) and Anayah et al. (2013) for further details.

Summary and conclusions
Complementary methods have the potential to predict regional ET using minimal meteorological data. However, prior studies used small data sets representing limited climatic variability and physical conditions that were not successful in improving the methods. A few of the successful studies used locally calibrated parameters that may not have the universal applicability simply due to the two-step approach required to compute ET. In addition, water resources studies require the total water loss from the land surface irrespective of the land use/land class. In this regard, complementary methods provide the distinct advantage over the classical methods that only provide crop ET using detailed input data such as land use/land class, cropping patterns and crop calendar. The state of the complementary methods is such that there is no single methodology consistently used over a wide variety of climatic and physical conditions. This study is aimed at developing a calibration-free universal model using the complementary relationship that requires meteorological data only to predict regional ET.
In this work, 34 global sites with measured ET data via the EC method are used to develop the proposed model using systematic sensitivity analysis conducted with the three original complementary methods. The sites have different climatic and physical conditions to ensure the universal application of the proposed model. The three original complementary methods consisting of CRAE, AA and GG are first evaluated and the need for improvement to all methods is determined. Based on the models' structures, 20 alternative model variations are proposed. The GG method was found to be the most attractive compared to the other two methods and therefore the GG method is further analyzed. The ETW that uses the Priestley-Taylor equation produced 16 GG model variations. Climates of the FLUXNET sites were initially sorted to six climatic classes based on the aridity index proposed by De Martonne (1925). The initial results identified seven promising model variations. Given the complexity of using six different climatic classes, the analysis later reduced this number to three distinct climatic classes consisting of wet, moderate and dry climates. This simplification identified three promising model variations from the earlier seven variations. Statistical analyses conducted via ANOVA testing and the Dunnet method showed that two of the model variations are similar while one GG model variation, GG18, clearly provided a different distribution and results. Therefore the GG18 model variation was considered the best. Also the comparison of results from recent studies showed that the GG18 model variation is capable of producing equal or better results while capturing a wide variety of physical and climatic conditions.
In the proposed GG18 model, net radiation R n is computed using R ASCE calculated by Allen et al. (2005), which outperforms R T developed by Morton (1983). It is evident that the simple complementary relationships suggested by Eq. (1) can describe the behavior of ET fluxes better than the more generic complementary relationship of Eq. (8). Most importantly, the predictive power of the GG method (Granger and Gray, 1989) is improved when the ET PT equation is used to calculate ETW. There is a strong indication that the proposed GG18 model can significantly enhance the accuracy of ETW using the GG method and consequently to predict regional ET using meteorological data only and without calibration. Furthermore, this one-step estimation method can reliably estimate ET regardless of the prevailing climatic conditions. Such an estimate will unequivocally lead to reliable predictions of water resources, in particular recharge estimation and impacts due to climate change.