Three perceptions of the evapotranspiration landscape: comparing spatial patterns from a distributed hydrological model, remotely sensed surface temperatures, and sub-basin water balance

A problem encountered by many distributed hydrological modelling studies is high simulation errors at interior gauges when the model is only globally calibrated at the outlet. We simulated river runoff in the Elbe River basin in central Europe (148 268 km²) with the semi-distributed eco-hydrological model SWIM (Soil and Water Integrated Model). While global parameter optimisation led to Nash–Sutcliffe efficiencies of 0.9 at the main outlet gauge, comparisons with measured runoff series at interior points revealed large deviations. Therefore, we compared three different strategies for deriving sub-basin evapotranspiration: (1) modelled by SWIM without any spatial calibration, (2) derived from remotely sensed surface temperatures, and (3) calculated from long-term precipitation and discharge data. The results show certain consistencies between the modelled and the remote sensing based evapotranspiration rates, but there seems to be no correlation between remote sensing and water balance based estimations. Subsequent analyses for single sub-basins identify, amongst others, input weather data and systematic error amplification in inter-gauge discharge calculations as sources of uncertainty. The results encourage careful utilisation of different data sources for enhancements in distributed hydrological modelling.


Improving spatial representativeness of distributed models
A distributed hydrological model which accurately simulates discharges at the basin outlet while producing poor results at interior points seems to be a paradox. But this feature has been shown by many studies on distributed modelling where inner point discharges were evaluated. Examples for larger simulation errors within the model domain are given by Andersen et al. (2001), Güntner and Bronstert (2004), Ajami et al. (2004), Ivanov et al. (2004) (suggesting a synthesis of modelling with remote sensing data to realise "the true value of the distributed approach"), Mo et al. (2006), Moussa et al. (2007), Feyen et al. (2008), and Merz et al. (2009). Bergström and Graham (1998) and Das et al. (2008) also report better model performances with increasing basin size for (semi-)lumped approaches. Pokhrel and Gupta (2010) and Pechlivanidis et al. (2010) tried parameter-sparse approaches for multi-site calibration but achieved generally poor model performances at interior points. Finally, respective results obtained from numerous models in the first phase of the Distributed Model Intercomparison Project (Reed et al., 2004) gave rise to adding more stream gauges at interior points for the second project phase (Smith et al., 2012a), which confirmed the observed trend of model fidelity increasing with basin size (Smith et al., 2012b).
T. Conradt et al.: Three perceptions of the evapotranspiration landscape
Yet another example from the Elbe River basin in central Europe (148 268 km²) gave reason to this study: for estimating water-related climate change impacts, the semi-distributed eco-hydrological model SWIM (Soil and Water Integrated Model) had been applied to project natural water discharges under scenario conditions (Conradt et al., 2012b, 2013a). Single global calibration by measured discharges at the basin outlet appeared to be insufficient: comparing the simulated discharges from higher-order tributaries with the respective gauge data often revealed grave deviations in water volume. Figure 1 shows the relative volume errors decreasing with increasing sub-basin area. Other comparisons showed poor model performance in simulating peak or low flow phases for some sub-areas of the basin. Nevertheless, a Nash-Sutcliffe efficiency of 0.9 had been achieved for long-term series of daily discharge at the main outlet gauge Neu Darchau. Spatial calibration might minimise sub-catchment uncertainties through increasing site-specific representativeness of the model. In conjunction with distributed hydrological modelling, spatial calibration usually means individual multi-site calibration (Santhi et al., 2008; Zhang et al., 2008). This study uses the term in the same sense. Pokhrel and Gupta (2011) argue that enhancements of spatial model representativeness are not necessarily seen in the outlet hydrograph. But they agree with other researchers that incorporating additional site-specific information in a distributed hydrological model increases its robustness (Stisen et al., 2011). Especially remote sensing data are valued as a useful complement to station based time series (Finger et al., 2011; Liu et al., 2012).
Figure 1 (caption fragment). For compatibility of positive and negative deviations, the logarithm of the ratio of simulated to measured mean discharge has been used as error measure.
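The logarithmic error measure mentioned above can be illustrated with a minimal sketch. The function name and the use of the natural logarithm are assumptions; the text only states that the logarithm of the ratio of simulated to measured mean discharge is used, and the discharge values are hypothetical.

```python
import math

def log_volume_error(sim_mean_q, obs_mean_q):
    """Logarithm of the ratio of simulated to measured mean discharge.

    Assumption: natural logarithm; the base is not stated in the text.
    """
    if sim_mean_q <= 0 or obs_mean_q <= 0:
        raise ValueError("mean discharges must be positive")
    return math.log(sim_mean_q / obs_mean_q)

# Hypothetical mean discharges in m^3/s: overestimating and underestimating
# by the same factor give errors of equal magnitude and opposite sign.
over = log_volume_error(20.0, 10.0)
under = log_volume_error(5.0, 10.0)
```

With a plain relative error, the same two cases would score +100 % and -50 %, which is why a logarithmic measure makes positive and negative deviations comparable.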
In our case of semi-distributed eco-hydrological modelling of the Elbe River basin (Conradt et al., 2012a,b), sub-basin discharges were fitted to (management corrected) gauge observations by individual evapotranspiration corrections. Having calibrated the model globally beforehand, most sub-basin evapotranspiration (ET) adjustment factors differed significantly from one. High and low values were spatially clustered, but no functional relationship to certain land use classes or soil types could be identified. An independent mapping of the spatial ET pattern by means of remote sensing could probably explain these observations and help to identify probable error sources.

Hydrological modelling and remote sensing
The idea of integrating remote sensing into hydrological modelling is relatively old (e.g. Klemeš, 1983, 1988; Schultz, 1987, 1988), and despite many systematic and practical problems (cf. Kite and Pietroniro, 1996; Beven, 1996, 2001) a lot of modellers continued working with remotely sensed data in recent years. As satellite data availability has increased considerably within the last decade, current research is finally measuring up to many expectations of the 1980s (Nagler, 2011). For example, an operational, multiple-source data assimilation system integrating remote sensing information is currently being put into service in Australia (van Dijk and Renzullo, 2011; Glenn et al., 2011).
We use remotely sensed land surface temperatures to map the ET pattern in the Elbe River basin. Recent studies that also make use of thermal and optical sensors range from "classical" rainfall-runoff modelling with remotely sensed pattern comparison (like our contribution) to integrated data assimilation systems. Examples of the former are from Boegh et al. (2004) for 10 km² of agricultural landscape in Denmark and Vinukollu et al. (2012) with a global ET pattern comparison, as well as a substantial contribution from Schuurmans et al. (2011), who first compare and then assimilate the modelled and remotely sensed actual ET patterns of an area of 70 km² in the middle of the Netherlands; however, observed differences between the two data sources remain partly unexplained.
Despite the fact that remote sensing does not directly provide measurements that a hydrological model could be calibrated to, the idea of using the additional spatial information for improving distributed models seems to be an elegant way between the extremes of validation only and direct data assimilation. Immerzeel and Droogers (2008), for example, applied the SWAT (Soil and Water Assessment Tool) model to the Upper Bhima catchment in southern India (45 678 km²) and adjusted the monthly evapotranspiration for each sub-basin to the $ET_a$ estimates of SEBAL (Surface Energy Balance Algorithm for Land; Bastiaanssen et al., 1998a,b) applied to thermal imagery from the Moderate Resolution Imaging Spectroradiometer (MODIS). Singh et al. (2010) and Jhorar et al. (2011) used remotely sensed ET rates for improving agro-hydrological models on irrigated plots. Githui et al. (2012) demonstrated a multi-objective and spatial calibration of a semi-distributed model using data from two runoff gauges and remotely sensed ET for 59 sub-basins of the 600 km² Barr Creek catchment in northern Victoria, Australia.

Objectives of this study
Originally, our intention was to present an alternative spatial calibration of our Elbe River basin model by means of remote sensing. But we will make a more fundamental assessment by comparing annual evapotranspiration patterns in space: (1) simulated by the semi-distributed eco-hydrological model SWIM, (2) derived from remotely sensed land surface temperatures, and (3) estimated with the water balance method.
The objectives are to show the feasibility of our remote sensing approach, to evaluate the correspondences and differences between the results of all three methods, and to find reasonable explanations for systematic or individual subbasin deviations.

The Elbe River basin
Before we present the three methods in detail, the research domain shall be introduced. The Elbe River basin, located in central Europe, covers 148 268 km² (FGG Elbe, 2005), thereof approximately one third within the Czech Republic and two thirds within Germany; less than 1 % belongs to Austria and Poland. Figure 2 provides two maps of the basin.
The model domain was restricted to 134 890 km², including the drainage area of the main outlet gauge Neu Darchau (131 950 km²). The lower part of the stream is influenced by tide, which renders continuous discharge measurements impossible.
Approximately 50 % of the area is lowland below 200 m a.m.s.l. This landscape dominates the north of the basin. Formed by the last glaciations, it is characterised by sandy plateaus with loam-covered riparian zones and wetlands in between. Due to the low slopes, sandy soils, and comparably low-intensity rainfall, the hydrological behaviour is governed by groundwater dynamics. Major land uses are grassland, forestry, and agriculture, often on poor soils.
The higher elevated regions can be divided up into hilly mountain forelands (32 %, 200-500 m a.m.s.l.) and mountainous areas (18 %, above 500 m a.m.s.l.). The hilly mountain forelands are covered by loamy-silty substrates and loess areas of highest field capacities. These productive soils are mainly used for agriculture. The mountainous areas have relatively poor soils, typically thin cambisols from weathered rock sediments. Their major land use is coniferous forest. Climatically, the Elbe River basin is located at the transition of the maritime temperate zone towards continental climate. Precipitation shows a rather uniform intra-annual distribution. The long-term mean is 702 mm a⁻¹, and the average discharge at the river mouth of 861 m³ s⁻¹ equals 183 mm a⁻¹, which means an average evapotranspiration of 519 mm a⁻¹ (FGG Elbe, 2005, and own calculations). The spatial distribution of precipitation depends strongly on topography: near Magdeburg, in the lee of the Harz Mountains, less than 500 mm a⁻¹ is measured, while more than 1200 mm a⁻¹ can be observed within the mountainous regions. Evapotranspiration follows a distinct annual cycle. Negligible in winter, local ET rates reach up to 7 mm d⁻¹ in summertime. There are huge lignite open cast mining areas in the sub-basins of the rivers Spree, Schwarze Elster, and Weiße Elster. These are hydrologically important: a groundwater deficit of 13 × 10⁹ m³ had been created by draining (Grünewald, 2001), and ongoing recultivation activities are to produce over 200 km² of new water surfaces. Besides direct effects on river discharge, the landscape alterations affect local hydrometeorology (Conradt et al., 2007).
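The basin-wide figures quoted above can be checked with a few lines of arithmetic. The following sketch converts the mean discharge into an annual runoff depth and closes the long-term balance; the function name and the use of a mean year of 365.25 days are assumptions made for illustration.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: mean year of 365.25 days

def discharge_to_depth_mm(q_m3_s, area_km2):
    """Convert a mean discharge (m^3/s) into an annual runoff depth (mm/a)."""
    volume_m3 = q_m3_s * SECONDS_PER_YEAR      # annual runoff volume
    area_m2 = area_km2 * 1.0e6
    return volume_m3 / area_m2 * 1000.0        # m -> mm

# Values quoted in the text: 861 m^3/s, 148 268 km^2, 702 mm/a precipitation.
runoff_mm = discharge_to_depth_mm(861.0, 148268.0)
et_mm = 702.0 - runoff_mm                      # long-term P minus Q
```

Rounded to whole millimetres, the sketch reproduces the 183 mm a⁻¹ of runoff and the 519 mm a⁻¹ of evapotranspiration given above.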

The spatial pattern of climatic inputs and a multitude of different landforms, soil characteristics, and land uses within the Elbe River basin make it an interesting large-scale domain for distributed hydrological modelling. Examples are the contributions by Krysanova et al. (1999), who observed unsatisfactory model performance in the lowlands (in particular the Havel River) where the runoff regime is dominated by river-groundwater interactions and the related transpiration fluxes in the riparian areas, and Krause and Bronstert (2007), who focused their investigation on these processes.
In contrast to the similar studies of Immerzeel and Droogers (2008) and Githui et al. (2012), records from 133 gauging stations within the Elbe area could be utilised for comparison. As the water balance method requires long-term observations, mean discharges of 1961-1990 were used where available. Some gauge data were restricted to shorter periods that fell into this time span. Comparisons with model results were always made for matching periods; the same applies to the remote sensing estimations.

General model structure
The semi-distributed eco-hydrological model SWIM (Krysanova et al., 1998, 2000) is a variant of the well-known SWAT (Arnold et al., 1993, 1998; Srinivasan et al., 1998; Gassman et al., 2007). Semi-distributed means that the model domain is not represented in gridded manner (fully distributed) but by landscape patches with uniform hydrological behaviour, the so-called hydrotopes. For this study, the model domain had initially been divided up into 2278 sub-basins. In the following, they shall be addressed as "model sub-basins" to distinguish them from (gauged) sub-basins in general. A total of 133 calibration sub-basins are gauged aggregations of these model sub-basins. The hydrotopes are sub-units of these model sub-basins, defined by an intersection of soil and land use maps so that each hydrotope is a unique combination of sub-basin, soil type and land use.
For each hydrotope, vegetation growth and water and nutrient fluxes between various storages are modelled. This comprises, e.g., water seepage and capillary rise between soil layers, water and nutrient stress for plants, or evapotranspiration. Discharge components are accumulated and routed through the sub-basin structure by the Muskingum approach. The model works on a daily time step.
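For readers unfamiliar with Muskingum routing, the textbook form of the scheme can be sketched as follows. This is not SWIM's actual routing code; the function name and the parameter values (`k`, `x`) are hypothetical and only serve to show how a flood wave is delayed and attenuated in one reach on a daily time step.

```python
def muskingum_route(inflow, k, x, dt, initial_outflow=None):
    """Route an inflow series through one reach (textbook Muskingum scheme).

    k  : storage time constant, same time unit as dt (hypothetical value)
    x  : weighting factor between 0 and 0.5 (hypothetical value)
    dt : time step, e.g. 1 for a daily model with k given in days
    """
    denom = 2.0 * k * (1.0 - x) + dt
    c0 = (dt - 2.0 * k * x) / denom
    c1 = (dt + 2.0 * k * x) / denom
    c2 = (2.0 * k * (1.0 - x) - dt) / denom    # c0 + c1 + c2 = 1
    outflow = [inflow[0] if initial_outflow is None else initial_outflow]
    for t in range(1, len(inflow)):
        outflow.append(c0 * inflow[t] + c1 * inflow[t - 1] + c2 * outflow[t - 1])
    return outflow

# Hypothetical flood wave in m^3/s, routed with k = 1.5 d, x = 0.2, dt = 1 d.
wave = [10, 10, 30, 60, 40, 20, 10, 10, 10, 10, 10, 10]
routed = muskingum_route(wave, k=1.5, x=0.2, dt=1.0)
```

Because the three coefficients sum to one, a constant inflow passes through unchanged, while a flood peak leaves the reach later and lower than it entered.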
Daily climate input was provided by measurements of 853 climate stations: 352 fully instrumented plus 501 additional rain gauges; their spatial density was lower in the Czech part of the model domain compared to the German part, cf. Fig. 3.
Input variables were precipitation, global radiation, air humidity, and maximum, minimum, and mean air temperature. These data were interpolated to the model sub-basins with inverse-distance weighting. Elevation dependencies were considered individually for each variable: when a linear regression on elevation yielded a coefficient of determination of 0.4 or more, only the residuals were interpolated and the trend component added afterwards.
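The interpolation rule described above can be sketched in a few functions. This is a minimal illustration, not the preprocessing code used for the study: the function names, the distance exponent of 2 and the station data are assumptions; only the threshold of 0.4 for the coefficient of determination is taken from the text.

```python
def linear_fit(x, y):
    """Ordinary least squares y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        return my, 0.0, 0.0
    b = sxy / sxx
    return my - b * mx, b, (sxy * sxy) / (sxx * syy)

def idw(points, values, target, power=2.0):
    """Inverse-distance weighted mean; power=2 is an assumption."""
    num = den = 0.0
    for (px, py), v in zip(points, values):
        d2 = (px - target[0]) ** 2 + (py - target[1]) ** 2
        if d2 == 0.0:
            return v                       # target coincides with a station
        w = d2 ** (-power / 2.0)
        num += w * v
        den += w
    return num / den

def interpolate(points, elevations, values, target, target_elev, r2_min=0.4):
    """IDW of residuals plus elevation trend if R^2 >= r2_min, else plain IDW."""
    a, b, r2 = linear_fit(elevations, values)
    if r2 >= r2_min:
        residuals = [v - (a + b * z) for v, z in zip(values, elevations)]
        return a + b * target_elev + idw(points, residuals, target)
    return idw(points, values, target)

# Hypothetical stations: temperature falls by 0.6 K per 100 m of elevation.
stations = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
elev = [100.0, 300.0, 500.0, 900.0]
temp = [9.4, 8.2, 7.0, 4.6]
t_target = interpolate(stations, elev, temp, (5.0, 5.0), 400.0)
```

In this constructed case the elevation trend explains the station values completely, so the residuals vanish and the result is simply the trend value at the target elevation.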

The evapotranspiration calculus
Of the numerous approaches for estimating ET from meteorological data (cf. McMahon et al., 2013), a modified Turc-Ivanov approach (DVWK, 1996), which is applicable without wind speed data, was chosen for calculating reference evapotranspiration, because it had been originally developed for East Germany (Richter, 1984; Wendling and Schellin, 1986), largely resembling the German part of the Elbe Basin. The original formula by Turc (1961) is replaced by another approach originally proposed by Ivanov (1954) for $T < 5$:
$$ET_p = \begin{cases} \ldots & \\ \ldots \cdot (100 - rF) & \text{for } T < 5. \end{cases} \tag{1}$$
This combined equation yields daily potential or reference evapotranspiration $ET_p$ in mm from average temperature $T$ in °C, net radiation $R_n$ in J cm⁻², and relative humidity rF in %. The dimensionless factor varies monthly between 0.7 for December and January and 1.25 for May.
According to ATV-DVWK, the reference $ET_p$ values from Eq. (1) were modified by land use specific factors ranging between 0.9 for cropland and 1.3 for water surfaces.
Daily actual evapotranspiration $ET_a$ is then calculated for each hydrotope as the sum of soil evaporation ES and plant transpiration EP with an approach similar to that of Ritchie (1972).
Plant transpiration is first calculated as a preliminary value $EP_0$ from the reference $ET_p$ depending on the leaf area index LAI. $EP_0$ is then reduced to EP according to the plants' actual water use, which is calculated for each of up to 10 soil layers separately according to the approach of Williams and Hann (1978): a potential water use $WUP_i$ for layer $i$ is estimated from a rate depth parameter RDP, the root zone depth parameter $RZD_i$ of layer $i$, and RD, the fraction of the root zone that contains roots. (Plant growth including root development is dynamically simulated and these parameters change accordingly.) The actual water use from that layer, $WU_i$, depends on the ratio of available soil water $SW_i$ to the field capacity $FC_i$, and the sum of all soil layer contributions equals the actual plant transpiration:
$$EP = \sum_i WU_i.$$
Soil evaporation is treated in similar steps; starting with potential soil evaporation which depends on LAI, the value is reduced according to the extent of dry periods and available water in the top 30 cm of the soil. The amount of evapotranspired water is subtracted from the soil layer storages and accordingly reduces percolation and subsurface and ground water runoff and, subsequently, the accumulated discharge.

The remote sensing approach
Evapotranspiration cannot be measured directly from space, but several methods exist to estimate ET values by means of remote sensing. One common approach is based on surface temperature, which can be inferred from thermal radiation and is partly governed by energy partitioning into sensible and latent heat. Most studies following this approach aimed at estimating evapotranspiration more or less solely from remotely sensed data; their comparisons with ground measurements show correlations, but typically high noise levels (Moran et al., 1994; Kite and Droogers, 2000; Garatuza-Payan et al., 2001; Jiang and Islam, 2001; Jacobs et al., 2004; Patel et al., 2006; Wloczyk, 2007; Hoedjes et al., 2008; Galleguillos et al., 2011). Bastiaanssen et al. (1998a,b) developed SEBAL to account for many error sources by taking the coolest ("wet") and the warmest ("dry") pixel of a scan as calibration basis. This approach may well be the most popular in terms of applications, derived variants and further developments, e.g. Gómez et al. (2005).
Many problems of ET estimation from thermal radiances, which also contribute to the challenges of this study, can be explained from a closer look at the relationships between energy and water fluxes. The general energy balance for any surface spot on Earth reads
$$R_n + G + S = \lambda \mathrm{ET} + H. \tag{6}$$
On the left-hand side, the energy inputs net radiation $R_n$, ground heat flux $G$ and heat advection $S$ are summed up. They equal the outgoing fluxes on the right-hand side: latent heat by evapotranspiration $\lambda \mathrm{ET}$ and sensible heat $H$. Net radiation is principally the driving force for evapotranspiration. The other input terms, $G$ and $S$, may be neglected for 24 h and a fortiori for annual integrations, but both net radiation and Bowen ratio (of sensible to latent heat) have to be determined.

Determining net radiation
Net radiation is the sum of all radiation components at the ground:
$$R_n = (1 - \alpha)\,R_{sg} - R_{le} + \varepsilon_e R_{la}. \tag{7}$$
In detail, $R_n$ consists of that part of the incoming short-wave global radiation $R_{sg}$ which is not reflected at the surface (therefore $\alpha$, the land cover dependent albedo), and the long-wave components: surface radiation $R_{le}$ towards the sky (therefore negative) and the absorbed part (governed by the surface emissivity $\varepsilon_e$) of the atmospheric back radiation $R_{la}$. While $\alpha R_{sg}$ and $R_{le}$ may be quite directly measured by a remote sensor (only corrected for atmospheric extinction), assumptions or ground measurements have to be made for determining $R_{la}$ and the total global radiation $R_{sg}$, or $R_{la}$ and $\alpha$, respectively. The relationship between thermal radiances and actual surface temperature provides additional room for errors, because the Stefan-Boltzmann law $R = \varepsilon \cdot \sigma \cdot T^4$ contains the emission coefficient $\varepsilon$, which depends on the radiating material. $T$ denotes the temperature in K and $\sigma$ is the Stefan-Boltzmann constant of 5.67 × 10⁻⁸ W m⁻² K⁻⁴. Both $R_{la}$ and $R_{le}$ can be expressed in terms of specific $\varepsilon$ and $T$ values:
$$R_{la} = \varepsilon_a\,\sigma\,T_a^4, \qquad R_{le} = \varepsilon_e\,\sigma\,T_e^4. \tag{8, 9}$$
While $\varepsilon_e$ varies only within a small range around 0.95 for natural surfaces (Albertz, 1991), the assumption of a single temperature $T_a$ for the atmosphere is a common simplification.
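The radiation terms described above can be combined in a short numerical sketch. It is an illustration only: the function names, the atmospheric emissivity of 0.8 and all input values are assumptions, whereas the Stefan-Boltzmann constant and the surface emissivity of about 0.95 are taken from the text.

```python
SIGMA = 5.67e-8  # Stefan-Boltzmann constant in W m^-2 K^-4

def thermal_emission(emissivity, temperature_k):
    """Stefan-Boltzmann law R = eps * sigma * T^4 in W m^-2."""
    return emissivity * SIGMA * temperature_k ** 4

def net_radiation(r_sg, albedo, t_surface_k, t_air_k,
                  eps_surface=0.95, eps_air=0.8):
    """Absorbed short-wave minus emitted plus absorbed long-wave radiation.

    eps_air = 0.8 is a purely illustrative atmospheric emissivity.
    """
    r_le = thermal_emission(eps_surface, t_surface_k)  # surface emission
    r_la = thermal_emission(eps_air, t_air_k)          # atmospheric back radiation
    return (1.0 - albedo) * r_sg - r_le + eps_surface * r_la

# Hypothetical midday situation: 500 W m^-2 global radiation, albedo 0.23.
rn = net_radiation(r_sg=500.0, albedo=0.23, t_surface_k=295.0, t_air_k=290.0)
```

The sketch also makes the error sensitivity visible: because temperature enters with the fourth power, small errors in $T$ or $\varepsilon$ translate into several W m⁻² of long-wave radiation.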
We utilised spatially interpolated, ground-measured air temperature and radiation data: net radiation is routinely derived by SWIM from standard input data containing daily values of global radiation $R_{sg}$, air temperature $T_a$, and relative humidity rF. The formulae in the applied SWIM version generally follow the recommendations of DVWK (1996). Equation (7) is fed with albedo depending on vegetation density and possible snow cover:
$$\alpha = \begin{cases} \ldots\,15\,\nu & \text{for} \le 5\ \text{mm water equivalent} \\ 0.6 & \text{for thicker snow cover,} \end{cases} \tag{10}$$
with $d_v$ being the biomass density in kg ha⁻¹ dynamically calculated by the crop and vegetation growth routines. Furthermore, Eqs. (8) and (9) are merged (assuming $T_a \approx T_e$) to a net emittance with the effective emission coefficient $\varepsilon$ and a cloud cover factor $\omega$, using the approximation of Brunt (1932) based on vapour pressure $e$,
$$\varepsilon = 0.34 - 0.044\,\sqrt{e},$$
which, despite its age, seems to perform better than more recently developed alternatives (cf. Bilbao and Miguel, 2007; Choi et al., 2008), and that of Wright and Jensen (1972) with coefficients by Doorenbos and Pruitt (1977). The vapour pressure $e$ is calculated from $T_a$ and rF according to DVWK (1996), and $R_{max}$ is the theoretically possible clear-sky radiation on the given day at the mean latitude of the model domain (disregarding elevation).

Determining the Bowen ratio
Rearranging Eq. (6), neglecting $G$ and $S$, and dividing by $\lambda = \rho_w \cdot r_v$, which is the energy needed to evaporate one volume unit of water (water density $\rho_w$ times heat of vaporisation $r_v$), yields ET when both $R_n$ and $H$ are known:
$$\mathrm{ET} = \frac{R_n - H}{\lambda}.$$
The calculation of net radiation has been discussed above.
The question remains how much of $R_n$ is transformed into sensible heat and what remains for evapotranspiration, i.e. the Bowen ratio (Bowen, 1926a,b; Lewis, 1995) has to be determined.
The sensible heat flux $H$ is driven by the vertical temperature gradient $\partial T / \partial z$. This gradient is usually represented by the temperature difference $\Delta T = T_s - T_a$ between the soil or plant canopy surface temperature $T_s$ and the 2 m air temperature $T_a$. We follow this approach, being aware that there might be some difference between surface temperature and aerodynamic temperature at the ground and that tall vegetation, especially forests, would require a more elevated measurement level. However, the utilised air temperature measurements from meteorological stations are always taken above the vegetation canopy (usually lawn), and their spatial interpolation allows for a consistent $\Delta T$ mapping. The explicit consideration of different air temperatures in space is an advantage over the uniformly calibrated SEBAL algorithm that should balance negative side effects of some necessary assumptions.
The sensible heat flux can then be formulated either via an exchange coefficient $C$ or an aerodynamic resistance for heat $r_{ah}$:
$$H = \rho_a\,c_p\,C\,\Delta T = \frac{\rho_a\,c_p\,\Delta T}{r_{ah}}.$$
In this equation, $c_p$ denotes the specific heat capacity of the air and $\rho_a$ its density. Aerodynamic resistance (and likewise the exchange coefficient) depends on atmospheric stability, wind velocity $u$ (at a reference height $z$) and geometric surface characteristics (cf. Brutsaert, 1982).
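A short numerical sketch shows how the resistance formulation turns a temperature difference into a sensible heat flux and, via the energy balance with $G$ and $S$ neglected, into an ET rate. The constants are typical textbook values and, like the function names and inputs, are assumptions rather than values from this study; the latent heat is expressed per unit mass, so that 1 kg m⁻² of evaporated water equals 1 mm.

```python
RHO_AIR = 1.2          # air density in kg m^-3 (typical near-surface value)
CP_AIR = 1005.0        # specific heat capacity of air in J kg^-1 K^-1
LAMBDA_MASS = 2.45e6   # latent heat of vaporisation in J kg^-1 (about 20 degC)

def sensible_heat(delta_t, r_ah):
    """H in W m^-2 from the difference T_s - T_a (K) and r_ah (s m^-1)."""
    return RHO_AIR * CP_AIR * delta_t / r_ah

def evapotranspiration_mm_per_day(r_n, delta_t, r_ah):
    """ET from the energy balance with G and S neglected."""
    latent = r_n - sensible_heat(delta_t, r_ah)   # latent heat flux, W m^-2
    return latent / LAMBDA_MASS * 86400.0         # kg m^-2 d^-1 = mm d^-1

# Hypothetical daily means: R_n = 100 W m^-2, delta T = 2 K, r_ah = 100 s m^-1.
et = evapotranspiration_mm_per_day(r_n=100.0, delta_t=2.0, r_ah=100.0)
```

With these inputs roughly a quarter of the net radiation leaves as sensible heat, and the remainder corresponds to about 2.7 mm d⁻¹ of evapotranspiration.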
We do not have these data, but with reliable area averages of $ET_a$ from the SWIM modelling, denoted by $ET_{SWIM}$, the aerodynamic resistances can be estimated. The spatial mean of $ET_a$ must equal the sum of the contributions of the $n$ model hydrotopes with areas $a_i$:
$$ET_{SWIM} = \frac{1}{\sum_{i=1}^{n} a_i} \sum_{i=1}^{n} \frac{a_i}{\lambda} \left( R_{n,i} - \frac{\rho_a\,c_p\,\Delta T_i}{r_{ah,i}} \right). \tag{17}$$
In this equation, the aerodynamic resistances $r_{ah,i}$ still have individual values for each hydrotope.
The simplest solution would be assuming one common resistance for the entire basin. But the land use pattern will definitely be reflected in the atmospheric resistances through different surface structures. Therefore, the general approach taken here is to assume two different effective resistance values: $r_{ah,f}$ for the forested part of the basin and $r_{ah,n}$ for the rest of the domain, because forests differ most distinctively in their surface characteristics from the remaining landscape. Elevation effects, including the strongly correlated wind effects, are neglected, and wind speed is not considered by SWIM either. But possible elevation dependencies can and will be analysed from the results.
The respective double usage of Eq. (17) for the $m$ forested and the remaining $n - m$ non-forested hydrotopes allows for direct calculation of the two respective resistance values $r_{ah,f}$ (forested) and $r_{ah,n}$ (non-forested) in Eqs. (18) and (19). The modelled averages of the respective land cover evapotranspiration, $ET_{S,f} + ET_{S,n} = ET_{SWIM}$, are very reliable, because they represent large sub-areas of the model domain (cf. Fig. 1 and Table 1), and statistical analyses of the results showed no relationship between deviations of sub-basin evapotranspiration estimates and sub-basin forest shares.
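How one effective resistance per land cover group follows from a known group-average evapotranspiration can be sketched as below. The sketch assumes that the ET target of each group is given as an area-weighted mean latent heat flux in W m⁻² and that the sensible heat flux is $\rho_a c_p \Delta T / r_{ah}$; the function names and all hydrotope values are hypothetical and do not reproduce the study's data.

```python
RHO_CP = 1.2 * 1005.0  # volumetric heat capacity of air, J m^-3 K^-1 (typical)

def group_resistance(areas, r_n, delta_t, latent_mean):
    """Effective r_ah (s m^-1) for which the area-weighted latent heat flux
    of the group equals latent_mean (W m^-2)."""
    heat_term = sum(a * dt for a, dt in zip(areas, delta_t))
    energy_term = sum(a * (rn - latent_mean) for a, rn in zip(areas, r_n))
    return RHO_CP * heat_term / energy_term

def group_latent_mean(areas, r_n, delta_t, r_ah):
    """Forward calculation, used here to check the inversion."""
    total = sum(areas)
    return sum(a * (rn - RHO_CP * dt / r_ah)
               for a, rn, dt in zip(areas, r_n, delta_t)) / total

# Hypothetical hydrotopes: areas (km^2), R_n (W m^-2), temperature gradients (K).
forest = dict(areas=[4.0, 6.0], r_n=[95.0, 105.0], delta_t=[1.0, 1.5])
other = dict(areas=[10.0, 5.0, 5.0], r_n=[80.0, 90.0, 85.0],
             delta_t=[2.5, 3.0, 2.0])
r_ah_f = group_resistance(latent_mean=85.0, **forest)
r_ah_n = group_resistance(latent_mean=55.0, **other)
```

Feeding the two resistances back into the forward calculation returns the prescribed group averages, which is the property that keeps the total evapotranspiration at the level given by the hydrological model.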

From snapshots to annual values
Hitherto, nothing has been said about the time frame in which Eqs. (17)–(19) should be applied. Principally, a single day or several years make no difference, provided that effective temperature gradients for the entire period can be provided. Effective means that the difference between satellite-derived surface temperature and ground-measured air temperature must always be extrapolated from the snapshot time(s) of the actual measurements to a period average.
Using averaged $\Delta T$ values (denoted by an overline) as effective gradients, it is possible to calculate the individual evapotranspiration heights of hydrotopes for any longer period provided the total ET (of the entire model domain within that period) is known. Here, the index $k$ refers to the selected hydrotope:
$$ET_k = \frac{1}{\lambda} \left( R_{n,k} - \frac{\rho_a\,c_p\,\overline{\Delta T}_k}{r_{ah}} \right).$$
It makes hardly any difference whether the measurements were taken at noon or in late afternoon, as long as $R_n$ was positive and dominant compared to $G$. Note that the resistances $r_{ah}$ in this equation are also effective time averages which have to be estimated for the same time period, and $R_n$ accordingly means the radiation energy accumulated within that time.

The water balance method
The classical water balance equation reads
$$\mathrm{ET} = P - Q - \Delta S.$$
Evapotranspiration ET should theoretically equal precipitation $P$ minus discharge $Q$ for timescales of several years, because $\Delta S$, the change in water storage of the catchment, becomes negligible compared to the other variables within such a time span. Practically, this approach has to grapple with difficulties in measuring catchment precipitation and uncertainties about catchment boundaries; the latter includes unaccounted ground water exchanges with neighbouring areas. The measured discharge may even be influenced by anthropogenic management. But due to lack of better alternatives, the water balance approach is commonly accepted as the reference assessment of long-term mean evapotranspiration for river basins.
We calculated average annual balances from the three years of interpolated precipitation data that were used to drive SWIM and from the observations at 133 runoff gauges. For each gauge with a catchment area containing further gauged sub-catchments upstream, only the part below these upstream areas was considered by subtracting their precipitation and runoff contributions.
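The inter-gauge calculation described above can be sketched as follows. The data structure, the function name and all gauge values are hypothetical; the example is constructed to show how a small relative error at the downstream gauge is amplified when the upstream contribution is subtracted.

```python
def intergauge_et(gauge, upstream):
    """Water-balance ET (mm/a) for the area between a gauge and its upstream gauges.

    Each gauge is a dict with area (km^2), precip (mm/a, mean over its whole
    catchment) and discharge (m^3/s). All values used below are hypothetical.
    """
    seconds = 365.25 * 86400.0
    area = gauge["area"] - sum(u["area"] for u in upstream)
    # Precipitation volumes in mm * km^2, so dividing by area gives mm again.
    p_vol = gauge["area"] * gauge["precip"] - sum(
        u["area"] * u["precip"] for u in upstream)
    q = gauge["discharge"] - sum(u["discharge"] for u in upstream)
    q_mm = q * seconds / (area * 1.0e6) * 1000.0
    return p_vol / area - q_mm

up = {"area": 800.0, "precip": 720.0, "discharge": 5.0}
down = {"area": 1000.0, "precip": 700.0, "discharge": 6.0}
et_between = intergauge_et(down, [up])

# A 5 % overestimate at the downstream gauge alone (6.0 -> 6.3 m^3/s) ...
down_biased = dict(down, discharge=6.3)
shift = et_between - intergauge_et(down_biased, [up])
```

In this constructed case, a 5 % discharge error at the downstream gauge changes the inter-gauge discharge by 30 % and shifts the ET estimate for the 200 km² in between by about 47 mm a⁻¹.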

Results
The eco-hydrological model SWIM, only globally calibrated on the daily runoff values of the 1990s at the outlet gauge Neu Darchau, was run for three years: 2001-2003. During this period, runoff at Neu Darchau was slightly overestimated by 12 %. The resulting balance error for evapotranspiration remains below 5 % though, because the runoff coefficient is below 0.3 (cf. Conradt et al., 2012b). The Nash-Sutcliffe efficiency of the daily values was 0.87. Figure 4 shows the hydrographs. Using the simulated ET averages from forested and non-forested hydrotopes, 944 remotely sensed land surface temperature (LST) maps from this period were evaluated. The area-averaged general results of this calculation are summarised in Table 1.
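The two performance measures quoted above can be computed as in the following sketch. The function names and the short discharge series are hypothetical; the Nash-Sutcliffe efficiency follows its standard definition.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o - s)^2) / sum((o - mean(o))^2)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance

def relative_volume_error(observed, simulated):
    """Relative bias of the simulated volume, e.g. 0.12 for a 12 % overestimate."""
    return sum(simulated) / sum(observed) - 1.0

obs = [10.0, 12.0, 20.0, 15.0, 11.0]   # hypothetical daily discharges, m^3/s
sim = [11.0, 13.0, 19.0, 16.0, 12.0]
nse = nash_sutcliffe(obs, sim)
bias = relative_volume_error(obs, sim)
```

The two measures are complementary: the efficiency rewards a correct hydrograph shape, whereas the volume error isolates the bias that matters for the water balance.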

Application of the remote sensing method
The LST maps derived from thermal imagery of the "Advanced Very High Resolution Radiometer" (AVHRR) instruments operated by the US National Oceanic and Atmospheric Administration (NOAA) were readily provided by the German DLR (German Aerospace Center) Applied Remote Sensing Cluster and could be downloaded via its EOWEB portal (http://www.eoweb.de). These maps cover all of Europe at a resolution of approximately 1.1 km in the map centre. This study utilises all 944 available daytime LST maps of the years 2001-2003.
Detailed information on these data is given by Tungalagsaikhan and Guenther (2007), including cloud screening procedures and the algorithms applied for computing the LST values from the thermal radiances.The latter had originally been established by Becker and Li (1990) and van de Griend and Owe (1993), and they were proven to be superior to other methods for this part of the world.
The European LST maps were reprojected onto the hydrotope map of the SWIM model, and mean surface temperatures could be calculated for each hydrotope when completely free from cloud cover. Hence, a first problem arises: how to deal with spatio-temporally varying cloud coverage?
Figure 5 demonstrates that the scanning times of the LST maps vary heavily due to satellite orbit characteristics and an intermediate change of the platform. Regarding the ground-measured air temperatures, only three measurements per day were available from the climate stations: minimum, maximum and average temperature. The maximum values, interpolated to sub-basin resolution, had to serve as best estimate for $T_a$ at satellite overpass time.
Here, average temperature gradients had to be determined for the three calendar years 2001-2003. One possible approach could be averaging only the seven days within that period having LST maps with less than 1 % cloud cover. But 732 out of the 944 maps show surface temperatures for more than 1 % of the basin, and their information should not be discarded. The solution applied here is to produce a composite map of temperature gradients by averaging all available daily $\Delta T$ values for each hydrotope and correcting them for cloud cover frequencies as described below.
Figure 6 shows both the blue-sky fractions of the satellite maps (share of the non-cloud covered pixels in the model domain) and the simulated evapotranspiration for the model domain of SWIM. Fortunately, there is a correspondence: especially in wintertime, when the remote sensing information suffers from permanent cloud and snow coverage or the longer data gap occurs, there is only little evapotranspiration.
On the other hand, radiation and accordingly heat gradients and evapotranspiration rates are much lower under cloud cover compared to blue sky conditions, so cloudiness has to be considered.The assessment of longer time periods (full years or our three year period) with hundreds of LST maps minimises the error from specific cloud distributions at satellite overpass times; it allows for utilising average cloudiness maps as shown in Fig. 7.
The cloud screening procedure applied by the DLR prohibits LST calculations as soon as the respective pixel is cloud contaminated (Tungalagsaikhan and Guenther, 2007), i.e. is not totally cloud-free. Affected pixels are set to white. Such white pixels include all conditions from thin cirrus with hardly dimmed radiation to dense stratus. A "blue-sky gradient" $\Delta T$ was calculated for each hydrotope observation without any white pixels (i.e. for the shares shown in Fig. 7a). The effective temperature gradient $\overline{\Delta T}$ could then be estimated with the average blue-sky fraction of the hydrotope $\beta$, shown in Fig. 7b, assuming a mean attenuation factor of $\eta = 0.33$ of the cloud layer in white pixels. Although the value of $\eta$ plays an important role for the range of these gradients, the resulting ET heights are hardly sensitive to it; the relative pattern remains quite stable for different choices of $\eta$, and the total evapotranspiration sum is kept to the level obtained from the hydrological model by an appropriate adjustment of the aerodynamic resistances $r_{ah}$.
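The cloud correction can be illustrated with a minimal sketch. The weighting used here is an assumption about the form of the correction: the cloud-free share of the time counts fully, and the cloud-contaminated share contributes the blue-sky gradient damped by $\eta$. Function names and the scene data are hypothetical.

```python
def mean_blue_sky_gradient(lst_series, air_tmax_series):
    """Average T_s - T_a over the scenes in which the hydrotope was cloud-free.

    None marks a scene with cloud-contaminated (white) pixels.
    """
    diffs = [ts - ta for ts, ta in zip(lst_series, air_tmax_series)
             if ts is not None]
    return sum(diffs) / len(diffs)

def effective_gradient(blue_sky_gradient, blue_sky_fraction, eta=0.33):
    """Period-average gradient from the mean cloud-free gradient.

    Assumed form: weight = beta + eta * (1 - beta).
    """
    if not 0.0 <= blue_sky_fraction <= 1.0:
        raise ValueError("blue_sky_fraction must lie in [0, 1]")
    weight = blue_sky_fraction + eta * (1.0 - blue_sky_fraction)
    return blue_sky_gradient * weight

# Hypothetical hydrotope: four scenes (temperatures in K), two of them cloud-free.
lst = [298.0, None, 301.0, None]
tmax = [295.0, 290.0, 297.0, 288.0]
dt_blue = mean_blue_sky_gradient(lst, tmax)
dt_eff = effective_gradient(dt_blue, blue_sky_fraction=0.5)
```

Under this assumed form, a smaller $\eta$ lowers all effective gradients by a similar proportion, which is consistent with the statement that mainly the resistances, not the relative ET pattern, react to the choice of $\eta$.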
The resulting map of average temperature gradients is shown in Fig. 8. Mountainous regions, wetlands, or regions with many lakes (near the catchment boundary in the north) are clearly distinguishable by values close to zero. The most extreme gradients were determined for lowland areas in the north of the Czech Republic. This is most probably an artefact due to the sparseness of climate station data in that region.
In 2001 the German part of the Elbe Basin experienced an average year regarding radiation and precipitation, 2002 was warm and relatively wet (an extreme flood occurred in August), and in 2003 the vegetation period was exceptionally dry, sunny, and hot (Müller-Westermeier et al., 2002; Müller-Westermeier and Rieke, 2003, 2004). This sequence is confirmed by the ET simulations and average temperature gradients; cf. Table 1. The variations in the resistance values can be explained by respective subsequent increases in the numerators of Eqs. (18) and (19) combined with an increase in the denominators (more R_n, less ET) between 2002 and 2003. The resistance values are also sensitive to the adjustment of η: with η = 0.25 instead of 0.33, the time-averaged r_ah,f decreases from 99.2 to 87.3 s m⁻¹ and r_ah,n from 103.6 to 85.4 s m⁻¹. But in any case, the aerodynamic resistances are of the order of magnitude that many other authors have found for vegetated surfaces in temperate climates (e.g. Thom and Oliver, 1977; Lindroth, 1993; Ramakrishna and Running, 1989; Liu et al., 2007).

Comparison of the three methods' results
Figures 9 to 11 present the patterns of the three ET estimations for the 133 gauged sub-basins in three respective maps, and Fig. 12 shows the sub-basin estimations in three scatter plots for the possible pair combinations of the three methods.The first impression is that the variances of both water balance derived and remotely sensed ET clearly exceed those of the simulation results from the only globally calibrated hydrological model.
The second insight delivered from Fig. 12 is that there is a weak correlation between the model and the remote sensing approach, an even weaker agreement between model and water balance based validation, and, finally, practically no relationship between remote sensing and the water balance approach.
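The pairwise comparison behind such scatter plots can be sketched as follows. This is a minimal illustration that assumes an unweighted Pearson correlation over sub-basin values; whether the original analysis weighted sub-basins (e.g. by area) is not stated here. All names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Unweighted Pearson correlation coefficient of two equally long lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation undefined for constant input")
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(estimates):
    """Correlate every pair of methods.

    estimates -- dict mapping a method name to a list of sub-basin ET values,
                 with all lists in the same sub-basin order
    """
    return {
        (a, b): pearson_r(estimates[a], estimates[b])
        for a, b in combinations(estimates, 2)
    }
```

With three methods the function returns three coefficients, one per scatter plot.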
In order to shed light on the discrepancy between water balance and remote sensing estimations, we grouped those sub-basins which deviate most from being correlated (cf. Fig. 13). It turns out that all "deviating" sub-basins are located in the Czech part of the Elbe Basin. The cluster of sub-basins marked in red, which combine low remotely sensed ET with medium to high ET found by the water balance method, is concentrated in the lowlands of the northwestern part of the Czech Republic, while the opposite combination, coloured in blue, with high remotely sensed ET values was found at sub-basins distributed around the mountainous edge of the Czech area.
Subsetting the data to the 72 German sub-basins clearly increases all correlations, as shown in Fig. 14. The upper left panel in Fig. 14 shows a relatively good agreement between ET estimates of the remotely sensed approach and the globally calibrated model simulation. Outliers are dominated by smaller sub-basins, which could be expected (cf. the Introduction), and this clearly highlights the potential of spatial calibration by means of remote sensing to reduce the modelling errors, especially for smaller sub-basins.
The third evapotranspiration estimation from sub-basin water balances shows little correlation with SWIM and, at least in the German subset, also with the remotely sensed ET (compare the lower panels in Figs. 12 and 14). While restricting the data basis to the German sub-basins decreased the variance of the remotely sensed ET heights, the water balance based estimations still cover a comparably wide range. Again, systematic errors can be identified by mapping the most prominent outliers in the lower right panel in Fig. 14; this is done in Fig. 15.
It appears that two pairs of subsequent gauge areas at the lower Havel River (Ketzin and Rathenow) and at the Elbe River downstream of the Havel (Wittenberge and Neu Darchau) have both been assigned combinations of very low and high ET estimates from the water balance method; the systematic flaw behind these deviations is discussed below.
Another reason for ET deviations in the water balance estimations is the massive anthropogenic groundwater extraction from open-cast lignite mining areas, which peaked in the 1980s when more than 30 m³ s⁻¹ of excess flow was led into the Spree River (Grünewald, 2001). In Fig. 16, the sub-basins whose discharges were presumably elevated by pumped groundwater are coloured according to their river catchment affiliation.
One would expect too-low ET estimations for sub-basins affected by open-cast mining, which would (wrongly) explain their elevated discharge. Figure 16 shows that this holds true for some sub-basins contributing to the Spree River, drawn in red. For the Pleiße sub-basin (yellow/orange) the plot reveals no visible effect, and the blue-coloured sub-basins of the Schwarze Elster River catchment seem to have drifted towards ET overestimation. This can be explained by the groundwater pumping into the Schwarze Elster having reached its maximum rates already in the 1960s, before most sub-basin gauges went into operation. Reduced discharges due to the groundwater deficit already generated are likely to have dominated the observation period here.

Remote sensing estimations
To explain the heavy noise in the remote sensing estimations for Czech sub-basins, we take a look at the geospatial pattern of the outlier sub-basins in Fig. 13. These outliers match the most extreme temperature gradients in Fig. 8. Taking into account that the spatial density of climate stations for which data were provided was much lower in the Czech part than in the rest of the basin (cf. Fig. 3), it seems highly probable that the 2 m air temperature and hence the resulting temperature gradient were systematically biased, preventing the remote sensing approach from working properly in this region. Additionally, the southern exposure of the area at the foot of the Ore Mountains might have contributed to locally increased air temperatures not captured by the station network.
The remaining noise of the remote sensing estimations compared to the SWIM results in the upper left panel in Fig. 14 is indicated by a correlation coefficient of 0.613. This is clearly within the range observed by most recent studies evaluating remotely sensed ET by estimations from other methods, be it reference ET calculated from lysimeter measurements (Wloczyk, 2007; Sánchez et al., 2008), eddy flux or other micrometeorological tower measurements (Verstraeten et al., 2005; Patel et al., 2006; McCabe and Wood, 2006; Brunsell et al., 2008; de C. Teixeira et al., 2009), or hydrological model simulations (Boegh et al., 2004; Gao and Long, 2008; Galleguillos et al., 2011).

Water balance estimations
The reason for the outliers in the water balance estimation for subsequent gauges (cf. Fig. 15) is slightly biased discharge measurements along larger rivers, which cause large oscillating errors in the inter-gauge differences.
For example, the average runoff of the years 2001-2003 at gauge Wittenberge was 764 m³ s⁻¹, and at the outlet gauge Neu Darchau 789 m³ s⁻¹. For the catchment area of 8418 km² between them, the difference of 25 m³ s⁻¹ would theoretically equal a discharge contribution of 94 mm a⁻¹.
Given a relative gauging uncertainty of ±5 % (cf. Sauer and Meyer, 1992; Maniak, 2005), the real difference value can easily be zero as well as 50 m³ s⁻¹, an uncertainty range of 187 mm a⁻¹ that directly applies to the respective ET estimates.
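The arithmetic behind these figures can be retraced with a short sketch. It assumes a year of 365 days; the function name is hypothetical, while the discharge and area values are those quoted in the text.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: non-leap year


def discharge_to_depth_mm_per_year(q_m3_per_s, area_km2):
    """Convert a mean discharge into an areal water depth in mm per year."""
    volume_m3 = q_m3_per_s * SECONDS_PER_YEAR
    area_m2 = area_km2 * 1.0e6
    return volume_m3 / area_m2 * 1000.0  # m -> mm


q_wittenberge = 764.0   # m3/s, mean runoff 2001-2003
q_neu_darchau = 789.0   # m3/s, mean runoff 2001-2003
inter_gauge_area = 8418.0  # km2

difference = q_neu_darchau - q_wittenberge  # 25 m3/s
print(round(discharge_to_depth_mm_per_year(difference, inter_gauge_area)))  # 94
print(round(discharge_to_depth_mm_per_year(50.0, inter_gauge_area)))        # 187

# 5 % of the upstream discharge alone already exceeds the inter-gauge difference
gauge_error = 0.05 * q_wittenberge
print(round(gauge_error, 1))  # 38.2
```

A 5 % error at a single gauge (about 38 m³ s⁻¹) is larger than the whole 25 m³ s⁻¹ difference between the two gauges, which is why small relative gauging errors translate into large errors of the inter-gauge water balance.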
It seems that Wittenberge is indeed biased towards too-high runoff measurements of about that magnitude, resulting in the picture shown in Fig. 15 with largely underestimated ET for the area above Wittenberge and a comparable overestimation for the area between Wittenberge and Neu Darchau.
The case for Ketzin and Rathenow is very much the same. In general, measurement errors of subsequent gauges on the same river render reasonable water balancing impossible when the total runoff is large compared to the discharge from the intermediate area.
However, the deviations caused by different mining histories in some sub-basins are not at all an error, but a management effect that could be identified by comparing the water balance results with those from the other methods. With less noise in the estimations one could not only identify but also quantify such effects.

Eco-hydrological model simulations
The output of the SWIM simulations is of course also subject to errors. The model water balances of two groups of hydrotopes, forested and non-forested, were taken for adjusting the remote sensing based ET values, which might have added to the overall noise of the results. It has to be pointed out that the internally computed LAI values were left unmodified, although some standard parameterisations for land cover units are questionable for parts of the model domain, e.g. the Ore Mountains. There had been a severe forest dieback in the crest region in the 1980s, but an ideal forest was modelled.
The breakdown of the socialist economies in Eastern Europe around 1990 had global impacts on evapotranspiration via the phenomena of global dimming and brightening (Wild, 2012). This is relevant because the eco-hydrological model was calibrated on data from before the change (Conradt et al., 2012b), when global radiation and ET were generally lower, while the satellite scans were taken under brightening conditions. It remains unclear to what extent different land uses were affected differently, but individually changing Bowen ratios might also have contributed to the observed uncertainties.
Finally, it has to be noted that the modelling of lateral water exchanges between sub-basins was limited to stream runoff. Groundwater exchanges affecting plant water availability and thus ET were not considered.

Conclusions
The comparison of three independent estimations for the spatial evapotranspiration pattern within the Elbe River basin (the semi-distributed model SWIM, the remote sensing approach, and the water balance method) and the identification of major reasons for their deviating results deliver a valuable insight: systematic weaknesses of single methods can be detected much more easily by using several independent approaches and their differences. Natural phenomena can be separated from methodological artefacts with high confidence.
Without the relatively strong correlation between the modelled and the remotely sensed estimates, the water balance results, which are ground based and therefore commonly trusted most, would probably not have been questioned for methodological errors.
Concerning the recently published climate change impact study for the Elbe River basin (Wechsung et al., 2013), which relies on ground measurement-based spatial calibration (Conradt et al., 2012a,b, 2013a,b), the consequence of our findings has to be extra caution when interpreting the results; cf. the assessments of water management options (Kaltofen et al., 2013a,b; Koch et al., 2013a,b) and the related economic consequences (Grossmann et al., 2013).

Sources of uncertainty
Several factors have impaired the validity of the evapotranspiration estimations; two of them have been shown explicitly. Another likely source of uncertainty is hidden, unaccounted groundwater fluxes. These are not at all implausible for the lowlands with their dominating sandy sediments. Although significant effects are more likely for small areas, Schaller and Fan (2009) postulated groundwater export or import altering the water balances even for large basins (up to ≈ 50 000 km²) in the United States.
For lowland rivers in subcatchments of the Elbe River basin, Krause and Bronstert (2007) and Krause et al. (2007) investigated and modelled variable interactions between groundwater and surface water. Their findings directly question the credibility of both the SWIM model and the water balance approach for smaller sub-basins in this landscape. Additionally, many lowland areas of the Elbe River basin are covered with a network of ditches and canals, and their impact is poorly known.

Recommendations
Despite these challenges, incorporating additional information by means of remote sensing must be recommended for any distributed modelling project; it should always serve as an independent spatial basis of comparison.
Distinct perceptions of (hydrological) reality by modelling, remote sensing, and ground based observations are so widespread that great efforts have been made to merge these differing views into one consistent picture of reality, e.g. by data assimilation (Evensen, 2007; Liu and Gupta, 2007; Mathieu and O'Neill, 2008; Reichle, 2008). Practical examples for integrating evapotranspiration patterns retrieved by remote sensing into hydrological modelling are given by Pan et al. (2008), Qin et al. (2008), Long and Singh (2010), Schuurmans et al. (2011), and Liu et al. (2012).
But these techniques cannot avoid biased results and do not really help improving the models. McCabe et al. (2008) quite correspondingly conclude that there is currently no comprehensive and robust framework for integrating a multitude of observations; simply developing more efficient merging techniques would not be the key issue.
We therefore strongly recommend thoughtful comparison of remote sensing and other methods' results and careful investigation of the differences. Only by uncovering the individual reasons for observed differences can hydrological modelling be improved accordingly.
Regarding our approach of combining remotely sensed with ground measured data for estimating evapotranspiration, we can only recommend it for areas with a high density of meteorological stations. Otherwise, poor performance prevents any meaningful assessment, and an alternative method like SEBAL (Bastiaanssen et al., 1998a,b) should be used instead.
Meteorological and stream gauge measurements will of course remain the basis for driving and calibrating hydrological models. Especially if there are only few runoff data from interior stream gauges, a distributed hydrological model can be spatially calibrated on remotely sensed ET patterns. To achieve realistic discharge simulations in space, however, additional local knowledge, e.g. on groundwater exchange and water management effects, is essential. In any case, by comparing model outputs with remote sensing results, local peculiarities may be identified.
Finally, this endorses the case made by Beven (2001): the future of hydrologic science lies less in the development of new theories and models than in gathering knowledge and understanding about specific areas; it should rather be a "learning about places" (see also Beven, 2003, 2007).

Fig. 1. Dependency of model discharge deviation on sub-basin size. For compatibility of positive and negative deviations, the logarithm of the ratio of simulated to measured mean discharge has been used as error measure.

Fig. 2. The Elbe Basin in central Europe: (a) elevations and major tributary streams. (b) Land use according to the CORINE 2000 classification; saturated tints indicate the model domain.

Fig. 3. The model domain and the locations of climate stations.

Fig. 5. Overpass times of the NOAA AVHRR platforms utilised for the daily LST maps at the centre of the Elbe River basin. Local solar time is about UTC + 50 min. The switch from NOAA-14 to NOAA-16 clearly shifted the bandwidth of scan times from late afternoon towards noon. Calculated from satellite Equator crossing data available via URL http://www.noaasis.noaa.gov/NOAASIS/ml/navigation.html (last access: May 2012).

Fig. 6. Daily blue-sky fractions (light blue, left hand y axis) and average evapotranspiration rates (black dots, right hand y axis) of the modelled part of the Elbe River basin, average of 2001-2003. Cloud coverage was calculated from the available LST maps (grey colour indicates data gaps), and ET_a values were obtained from the globally pre-calibrated SWIM model.

Fig. 8. Average temperature gradients of 2001-2003: difference in K between surface and 2 m air temperatures.The originally observed differences between remotely sensed and ground measured temperature data have been corrected for cloud cover frequencies by Eq. (22).

Fig. 9. Evapotranspiration patterns in the Elbe River basin according to SWIM. Average values for 133 sub-basins for the years 2001-2003.

Fig. 10. Evapotranspiration patterns in the Elbe River basin according to remote sensing. Average values for 133 sub-basins for the years 2001-2003.

Fig. 11. Evapotranspiration patterns in the Elbe River basin according to the water balance. Average values for 133 sub-basins for the years 2001-2003.

Fig. 15. Extreme differences between water balance derived ET from neighbouring sub-basins. The sub-basin areas in the dotty plot (a) are named according to their outlet gauges drawn in the map cut-out (b) as black triangles.

Table 1. General results of the evapotranspiration calculation. The total area (134 890 km²) is the modelled part of the Elbe River basin as shown in Figs. 9-11. ET and R_n are SWIM calculations, T and r_ah are derived on that basis from the LST maps. "All" refers to the results for the full data set of 2001-2003.