Kalman filter approach for estimating water level time series over inland water using multi-mission satellite altimetry



Introduction
Over the last few decades, monitoring and modeling the water cycle of the Earth system have become very important tasks. In particular, knowledge of regional changes of water storage in rivers and lakes is fundamental for the risk assessment of natural disasters such as droughts and floods, which have been increasing over the last decades (Guha-Sapir and Vos, 2011). Despite the growing importance of such measurements, the number of in-situ stations monitoring river discharge is declining globally. The number of river discharge time series provided by the Global Runoff Data Center (GRDC) decreased from about 7300 to 1000 stations between 1978 and 2013 (Global Runoff Data Center, 2013). However, in recent years many remote sensing satellites have been launched that measure parameters relevant for the investigation of the water cycle, e.g. precipitation, water level, and gravity. One of these remote sensing techniques is satellite altimetry. Besides its main design goal of measuring water level heights of the ocean, satellite altimetry can also be used for deriving water level heights of inland water bodies, i.e. lakes, reservoirs, rivers, and wetlands (Birkett, 1995; Crétaux and Birkett, 2006; Crétaux et al., 2011). The advantage of satellite altimetry is its global availability, which allows for estimating water level time series even in remote areas without local infrastructure. By now, satellite altimetry can provide water level time series longer than two decades. However, because measurements are taken along separate ground tracks spaced between about 80 km (Envisat) and 300 km (Topex/Jason) apart at the equator, not all water bodies can be captured. In addition, due to the repeat orbit configuration, the temporal resolution is limited to 35 days (Envisat) or 10 days (Topex/Jason) when only a single altimeter mission is used. Thus, the combination of different altimeter systems plays a key role in increasing the temporal and spatial resolution as well as the length of the time series.
Satellite altimetry has to cope with different problems over inland water, which are mainly caused by the large footprint of radar altimeters. For altimeter missions using Ku-band such as Envisat, the footprint varies between 2 km over ocean and up to 16 km over land (Chelton et al., 2001). Even for SARAL/AltiKa, measuring in Ka-band, the footprint size is still about 8 km (Schwatke et al., 2015). The majority of problems in inland satellite altimetry are due to land contamination. This effect is twofold: on the one hand, the contamination of the radar echo leads to degraded range quality or even to unusable data sets; on the other hand, so-called "hooking" or "off-nadir" effects occur. The second effect arises from off-nadir radar returns when the satellite is still or already over land but receives the main reflection from off-nadir water areas. This leads to longer ranges, visible as a parabolic shape in the resulting height sequence. This effect can be corrected by fitting curves to the resulting water level heights (da Silva et al., 2010; Maillard et al., 2015). The first effect is more challenging: the contamination of the radar measurements by land leads to a degeneration of ocean-like waveform shapes. The affected waveforms are more peaky, and reliable heights cannot be derived using ocean waveform retrackers (MLE (Challenor and Srokosz, 1989), NASA β (Martin et al., 1983), etc.). Therefore, an additional retracking has to be applied using robust retracking algorithms (OCOG (Wingham et al., 1986), Improved Threshold (Hwang et al., 2006), etc.) in order to achieve reliable heights.
Despite the aforementioned challenges, satellite altimetry has been successfully used for the estimation of water levels of lakes and rivers by different groups in recent years. The potential of satellite altimetry for the estimation of water level time series and for understanding the terrestrial water cycle was already shown, e.g., in Birkett (1995), Crétaux and Birkett (2006), and Crétaux et al. (2011). In most studies, only single satellite tracks were used for the computation of water level time series. The most popular study areas were the Great Lakes (e.g. by Ponchaut and Cazenave, 1998, using Topex/Poseidon) and the Amazon Basin. For this basin, investigations based on different missions exist, e.g. using Topex/Poseidon (de Oliveira Campos et al., 2001; Zakharova et al., 2006), Topex/Jason-1/Jason-2 (Seyler et al., 2013), and ERS-2/Envisat (da Silva et al., 2010).
In addition to these individual investigations, four global databases have been developed that provide water level time series over inland water to the international community. The different processing strategies of these four databases are described as follows.
The Hydroweb database was developed by the Laboratoire d'Etudes en Géophysique et Océanographie Spatiales (LEGOS). For the estimation of water level time series over lakes and rivers, a multi-mission approach using satellite altimeter data of Topex/Poseidon, ERS-1, ERS-2, Envisat, Jason-1, and GFO is applied. The physical heights are estimated in a track-wise manner and are corrected for the slope of the geoid or mean lake level and for range biases with respect to Topex/Poseidon. The final time series are computed by merging the altimeter data on a monthly basis. The approaches used are published in Crétaux et al. (2011) and da Silva et al. (2010).
The River and Lakes database was developed by the European Space Agency and De Montfort University (ESA-DMU). It provides track-wise time series derived from Jason-2 and Envisat over a variety of inland waters. For each track crossing the water body of interest, a single time series is processed. The methodology for the estimation uses an expert system which is based on neural networks (Berry et al., 1997).
The Global Reservoir and Lake Monitor (GRLM) is maintained by the Foreign Agricultural Service of the United States Department of Agriculture (USDA). Time series of lakes and reservoirs are estimated by using a segment of one single altimeter track over the investigated target. The time series are composed of data from consecutive altimeter missions measuring along the same ground track. A combination of contemporaneous missions is not performed. The method for the estimation of water level time series is described in Birkett et al. (2011).
The Database for Hydrological Time Series over Inland Water (DAHITI) was launched by the Deutsches Geodätisches Forschungsinstitut (DGFI, now DGFI-TUM) in 2013. Currently, DAHITI provides about 250 time series of rivers, lakes, reservoirs, and wetlands. The methodology for the estimation of water level time series in DAHITI is based on a Kalman filter approach described in detail in the article at hand. In contrast to the methods already published in the literature, our approach is based on a rigorous multi-mission combination of a variety of altimeter missions. In addition, an extended outlier detection is applied and optional waveform retracking is implemented. Moreover, the processing contains a full error propagation and provides accuracies for each height measurement. This will be discussed in a follow-on paper in which uncertainties of the applied geophysical corrections and models will be taken into account. Furthermore, correlations between altimeter measurements will be considered to achieve more reliable errors for each water level height. The current paper provides detailed information on the estimation of water level time series and performs a comprehensive validation by comparing the results with in-situ gauging data and time series from other databases (LEGOS, ESA-DMU, and GRLM).
The article is structured as follows: in Sect. 2, the altimeter data that serve as input for the Kalman filter approach as well as the preprocessing of the data are described. In Sect. 3, the methodology for the estimation of water level time series from satellite altimeter data using a Kalman filter approach is explained. Section 4 starts with the introduction of the validation areas and data before the resulting water level time series and validation results are presented. The paper finishes with a conclusion.

Altimeter data and preprocessing
For more than two decades, satellite altimetry has been providing data for various applications over ocean and inland waters. The approach presented in this paper combines as many altimeter tracks as possible from different missions over an investigated water body in order to increase the temporal resolution of the final water level time series, to maximize the probability of covering smaller inland waters, and to increase the accuracy.
In this paper, altimeter measurements from Topex, Jason-1, Jason-2, ERS-2, Envisat, and SARAL/AltiKa are used, depending on the data coverage over the inland water body under investigation. In principle, data from Geosat, ERS-1, HY-2A, ICESat, and CryoSat-2 can be used as well. However, these missions are neglected in the current investigations for different reasons, i.e. lack of data over land, non-repeat or long-repeat cycles, bad data quality, or missing waveform information.
The applied missions can be separated into two groups according to their orbit characteristics. Topex/Poseidon was launched in 1992 into an orbit with a repeat cycle of 9.9156 days and a track separation at the equator of about 300 km. The mission was followed by its successors Jason-1 and Jason-2. These three altimeter satellites can be used for estimating continuous time series over more than two decades. The second group starts with ERS-2 (launched in 1995), followed by Envisat and SARAL/AltiKa. The orbit of these missions is defined by a repeat cycle of 35 days and a track separation of about 80 km. The data are available for almost two decades, with a data gap between 2010 and 2013 due to the shift of Envisat to a drifting orbit that lasted until the launch of SARAL/AltiKa. ERS-1 is not yet ready for use in DAHITI but shall be integrated in the near future. This will allow the time series to be extended back to 1991.
For the estimation of water level heights, Sensor Geophysical Data Records (SGDR) altimeter products are used, which provide 1 Hz and high-frequency ranges as well as the altimeter waveforms. The latter allow for an individual retracking in order to achieve more reliable altimeter ranges, especially for smaller inland water bodies. Table 1 shows a list of the altimeter missions used and provides information about the product, cycle length, frequency, cross-track distance between altimeter measurements on ground, time period, and mean range bias with respect to Topex.
Depending on the investigated inland water body, the original ocean ranges in the SGDR are very often corrupted. Especially over small lakes and rivers, the altimeter waveforms do not exhibit the typical ocean-like shape due to land contamination. Land-contaminated altimeter waveforms are usually more peaky and noisy. The quality of the ranges can be improved by retracking these waveforms. In this study, the "Improved Threshold Retracking" (Hwang et al., 2006) with a threshold of 10 % is applied if an additional retracking is necessary. This algorithm is very robust and delivers ranges for all surface types. They are more reliable than the original ranges over small inland waters. Over open water (i.e. larger lakes), however, the resulting ranges are less precise than ranges derived from retracking algorithms for ocean applications. Nevertheless, switching retracking algorithms along a single satellite track is known to lead to height offsets (Crétaux et al., 2009). To avoid those offsets, all altimeter measurements of an investigated inland water body are retracked with the same retracking algorithm.
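To illustrate what a 10 % threshold means for a waveform, the following minimal sketch implements a basic single-threshold retracker. It is not the Improved Threshold Retracker of Hwang et al. (2006), which additionally identifies sub-waveforms; the function name, the noise estimate from the first five gates, and the use of the waveform maximum as amplitude are assumptions made for illustration only.

```python
import numpy as np

def threshold_retrack(waveform, threshold=0.10, n_noise_gates=5):
    """Return the fractional gate at which the leading edge crosses the threshold level.

    Simplified single-threshold retracker (illustrative assumption, not the
    Improved Threshold algorithm used in the paper).
    """
    w = np.asarray(waveform, dtype=float)
    noise = w[:n_noise_gates].mean()      # thermal noise from the first gates (assumption)
    amplitude = w.max()                   # simple amplitude estimate (assumption)
    level = noise + threshold * (amplitude - noise)
    above = np.nonzero(w >= level)[0]
    if above.size == 0 or above[0] == 0:
        return None                       # no usable leading edge found
    i = above[0]
    # linear interpolation between the two gates bracketing the threshold level
    return (i - 1) + (level - w[i - 1]) / (w[i] - w[i - 1])

# Hypothetical peaky waveform (power per range gate)
waveform = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 20, 60, 100, 80, 40, 10]
gate = threshold_retrack(waveform)
```

The retracked gate would then be converted to a range correction using the gate spacing of the respective mission, which is not shown here.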
In order to convert the range measurements (original or retracked) to water level heights serving as input for our Kalman filter approach, numerous preprocessing steps are necessary. Equation (1) summarizes the height computation from altimeter products (orbit height $h_{sat}$ and altimeter range $r_{alt}$). These processing steps have to be performed for each individual altimeter measurement. The derived normal heights $h_{normal}$ serve as input for the Kalman filter approach described in Sect. 3.
First, the range has to be corrected for geophysical effects. For this purpose, the models and conventions given in [...], 2004). Finally, each altimeter measurement is corrected for its radial error $\Delta h_{rad}$ in order to account for inter-mission range biases. Radial errors are derived from a global multi-mission crossover analysis as described by Bosch et al. (2014). This range correction allows different altimeter missions to be used as a single virtual altimeter system. The mean range bias given in Table 1 shows the averaged radial errors for each altimeter mission. All data used in this study (the altimeter data as well as all corrections) are extracted from OpenADB, the open altimeter data base of DGFI-TUM. More information on OpenADB is given in Sect. 3.1. The quality of the extracted geophysical corrections is checked, and altimeter measurements are rejected if they do not comply with certain thresholds.
For the computation of water level time series within the Kalman filter approach, normal heights $h_{normal}$ are used as input data, whereas altimetry provides ellipsoidal heights. However, ellipsoidal heights are purely geometrical and do not allow predicting where the water will flow. We compute normal heights by subtracting a (quasi-)geoid model ($N$) from the ellipsoidal heights. For this purpose, the EIGEN6c3stat (Förste [...]
The derived water levels are assumed to be constant over lakes since, in general, the water is in balance with gravity and the hydrodynamics of lakes are small compared to open ocean conditions.
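The height computation described in this section can be sketched as follows. Since Eq. (1) is not reproduced here, the sign conventions are assumptions: the geophysical corrections are taken as values to be added to the range, and the radial error is subtracted from the height. The function name and all numerical values are hypothetical.

```python
def normal_height(h_sat, r_alt, range_corrections, geoid_height, radial_error=0.0):
    """Convert one altimeter measurement to a normal height (all values in metres).

    Assumed conventions (not confirmed by the paper): corrections are added to
    the range; geoid height and radial error are subtracted from the height.
    """
    corrected_range = r_alt + sum(range_corrections.values())
    h_ellipsoidal = h_sat - corrected_range          # purely geometrical height
    return h_ellipsoidal - geoid_height - radial_error

# Hypothetical single measurement
corrections = {
    "dry_troposphere": -2.30,
    "wet_troposphere": -0.15,
    "ionosphere": -0.05,
    "solid_earth_tide": 0.10,
}
h = normal_height(
    h_sat=1_340_000.0,
    r_alt=1_339_820.0,
    range_corrections=corrections,
    geoid_height=8.20,
    radial_error=0.02,
)
```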

Kalman filter approach
In order to use altimeter measurements from different tracks and missions, a consistent and reliable combination strategy is important. The irregularly spaced observations from different locations shall be merged into one time series per target, and the optimal combination of measurements with different uncertainties must be ensured. All three requirements can be fulfilled by a Kalman filter that updates a model with measurement data of different accuracies and predicts the current state to the next time epoch (Kalman, 1960). In contrast to the common least-squares adjustment, the Kalman filter works recursively, and the number of input observations per processing step is significantly reduced due to its sequential integration. This also enables real-time applicability in the future.
The processing strategy for the estimation of water level time series over inland waters using a Kalman filter approach is separated into three steps, which are "Preprocessing", "Kalman Filtering", and "Postprocessing" (cf. Fig. 1). At the beginning, the preprocessing step includes all necessary tasks for the preparation of the input altimeter heights, such as waveform retracking, applying range corrections, calculation of SDs of heights, and the rejection of outliers. The Kalman filtering step starts with the definition and creation of a hexagonal computation grid covering the inland water body. This is followed by the Kalman filtering itself, estimating water levels for each grid point. In the postprocessing step, all water level heights from the previous step are merged into a single water level time series referring to one reference location. Afterwards, an outlier detection is conducted. The final time series is stored in the "Database for Hydrological Time Series of Inland Water" (DAHITI).

Preprocessing
The Open Altimeter Database (OpenADB) holds satellite altimeter data and derived high-level products. In OpenADB, satellite altimeter data are stored in the "Multi-Version-Altimetry" structure, which is designed to allow fast parameter updates and data base extractions with user-defined formats and parameters. This data structure allows for an easy extraction of the required altimeter measurements for an inland water body of interest. Furthermore, the desired geophysical corrections can be selected individually. Users can choose between different geoid models, wet troposphere models, etc. according to their individual purpose. The data sets used for this study and the methodology to derive individual water level heights are described in Sect. 2.
In addition to the normal heights of the water levels, the Kalman filter requires information on the quality of each measurement. This information is used for the weighting of the individual data sets as well as for the error estimation of water level products. Due to lacking absolute accuracies, the precision of the heights is computed by analyzing the along-track scatter of the measurements. For this purpose, an SD is estimated for each water level height using a floating box of 5 data points along the altimeter track. Lower SDs imply higher accuracies of the water level heights and vice versa. This approach assumes a constant water level along the satellite track and is only valid for lakes and small river crossings without significant slopes. In general, the SDs increase when the measurement locations approach the shore.
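The floating-box SD can be sketched as below. The box is assumed to be centred on each measurement and truncated at the ends of the track; how the original processing treats track ends is not stated, so this is an illustrative choice, and the heights are hypothetical.

```python
import numpy as np

def along_track_sd(heights, box=5):
    """Standard deviation of the heights inside a floating box centred on each point.

    The box is truncated at the track ends (assumption for illustration).
    """
    h = np.asarray(heights, dtype=float)
    half = box // 2
    sd = np.empty(h.size)
    for i in range(h.size):
        window = h[max(0, i - half):min(h.size, i + half + 1)]
        sd[i] = window.std(ddof=1) if window.size > 1 else np.nan
    return sd

# Hypothetical track: flat water first, then land contamination near the shore
heights = [174.10, 174.12, 174.11, 174.09, 174.13, 174.60, 175.20]
sd = along_track_sd(heights)
```

In this example the SD stays below 2 cm over the flat part of the track and rises sharply for the last points, which mimics the behaviour near the shore described above.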
Each water level height has to pass an outlier test before it is included in the Kalman filter. Different user-defined criteria can be selected for track-wise outlier detection. Inaccurate water level heights are rejected before Kalman filtering; precise ones are used for the estimation of the resulting water level heights. Different thresholds for water level height, SD, or latitude can be selected. Moreover, an outlier detection using Support Vector Regression (SVR) (Smola and Schölkopf, 2004) is implemented. This method applies a linear regression to each altimeter track to reject altimeter measurements that do not represent the flat water level of the inland water target. SVR is similar to common regression but more flexible and robust. SVR is an advancement of the Support Vector Machine (SVM) (Boser et al., 1992), which is used as a classification algorithm for applications such as pattern recognition and machine learning. Depending on the mathematical problem, the kernel for the regression is variable. One can use linear, polynomial, or radial basis functions (Smola and Schölkopf, 2004). In our case, the SVR is applied to single altimeter tracks over an inland water body using a linear kernel and a zero-slope constraint. Based on the constant representing the flat water level, an interval is defined which separates valid from invalid data. Figure 2 shows an example of an altimeter track crossing a lake with an island in the middle. Blue dots indicate valid measurements, red dots indicate rejected data that exceed the SD threshold of 5 cm, and green dots mark outliers detected by SVR (with a rejection interval of 10 cm). One can see that all heights influenced by land contamination are detected as outliers and the remaining heights represent a flat surface.
It is important to note that the criteria for the outlier detection are very flexible and the optimal configuration strongly depends on the investigated water body. As a consequence, the parameters for outlier rejection vary with the study areas.
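A minimal sketch of the track-wise rejection is given below. With the slope fixed to zero, the SVR objective reduces to finding the constant that minimizes the epsilon-insensitive loss, which can be solved directly without an SVR library. The value of epsilon, the function names, and the sample heights are assumptions; the settings of the original implementation are not given in the text.

```python
import numpy as np

def flat_level_svr(heights, epsilon=0.05):
    """Zero-slope, epsilon-insensitive fit of a constant water level.

    Minimizes sum(max(0, |h_i - b| - epsilon)) over b. The loss is convex and
    piecewise linear with kinks at h_i - epsilon and h_i + epsilon, so a
    minimum lies at one of these candidates.
    """
    h = np.asarray(heights, dtype=float)
    candidates = np.concatenate([h - epsilon, h + epsilon])
    losses = [np.maximum(0.0, np.abs(h - b) - epsilon).sum() for b in candidates]
    return candidates[int(np.argmin(losses))]

def reject_outliers(heights, interval=0.10, epsilon=0.05):
    """Return the fitted level and a boolean mask of heights inside the interval."""
    h = np.asarray(heights, dtype=float)
    level = flat_level_svr(h, epsilon)
    return level, np.abs(h - level) <= interval

# Hypothetical track with one land-contaminated height
level, valid = reject_outliers([10.00, 10.02, 9.99, 10.01, 10.80, 10.01])
```

Because the loss grows only linearly with the residual, a single contaminated height shifts the fitted level by a few centimetres at most, whereas it would pull an ordinary mean by more than 10 cm in this example.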

Kalman filtering
The Kalman filtering is the most important step in the computation of water level time series and the heart of DAHITI. It describes the estimation of water level time series from the track-wise input heights. The combination of the time-dependent input data, available at irregular intervals and, in the case of larger lakes, at different locations, is realized by a Kalman filter approach (Kalman, 1960). Different modified Kalman filter approaches have already been used for geodetic applications (Yang and Gao, 2006; Eicker et al., 2014; Gruber et al., 2014). In principle, this algorithm realizes a sequential least-squares adjustment taking into account the accuracies of the input data as well as the deterministic and stochastic behavior of the system, and produces a statistically optimal estimate of the water level time series.
Our approach includes the location of each altimeter observation by performing all computations on a regular grid over the water body. A hexagonal grid is selected in order to ensure equidistant grid nodes. The grid is created automatically by a recursive algorithm using one initial node over water as reference point for each water body. A land-water mask provides information about the extent of the grid. For smaller lakes and rivers, the grid is extended by a transient region in order to take into account uncertainties of the land-water mask and occurring flood events. Figure 3 shows an example of a grid used for Lake Erie. The resolution of the grid and the number of nodes ($n$) can be defined individually depending on the extent of the inland water body. For rivers, a small grid with a very high spatial resolution is selected in order to avoid errors due to river slopes.
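The grid construction can be illustrated with the following sketch, which grows a hexagonal grid outwards from one seed node until the land-water mask is left. It uses an iterative breadth-first growth instead of recursion, works in planar coordinates, and represents the mask by a hypothetical callable `is_water`; all of these are simplifying assumptions.

```python
import math
from collections import deque

def hex_grid(seed_xy, spacing, is_water):
    """Grow a hexagonal grid from a seed node; returns the (x, y) nodes over water."""
    def to_xy(q, r):
        # axial hexagon coordinates (q, r) -> planar coordinates
        return (seed_xy[0] + spacing * (q + 0.5 * r),
                seed_xy[1] + spacing * (math.sqrt(3) / 2) * r)

    neighbours = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]
    seen = {(0, 0)}
    queue = deque([(0, 0)])
    nodes = []
    while queue:
        q, r = queue.popleft()
        nodes.append(to_xy(q, r))
        for dq, dr in neighbours:
            nxt = (q + dq, r + dr)
            if nxt not in seen and is_water(*to_xy(*nxt)):
                seen.add(nxt)
                queue.append(nxt)
    return nodes

def in_lake(x, y):
    """Hypothetical land-water mask: a circular lake of 26 km radius."""
    return x * x + y * y <= 26.0 ** 2

nodes = hex_grid((0.0, 0.0), 5.0, in_lake)  # 5 km node spacing
```

Each of the six neighbour offsets has length `spacing`, which is the equidistance property that motivates the hexagonal layout.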
The Kalman filter uses input observations to update the current state of the system and to predict the model of the following time epoch. This is performed in a continuous loop consisting of two steps (an update and a prediction step) running consecutively for every epoch $t_k$. At the beginning, an initialization is necessary in order to set the starting conditions. The work flow is illustrated in Fig. 4.
The time increment of the Kalman filter can be defined arbitrarily. In our case, an observation-based update interval is used instead of a constant one. That means that the system is updated each time a new altimeter track becomes available. Thus, the update interval strongly depends on the size and the data coverage of the investigated water body. It can vary between 35 days (in case only an Envisat track crosses the target area) and one hour (in case of large lakes covered by different altimeter missions). Shorter time intervals are precluded by assigning the individual measurements to full days. The use of an adaptive update interval avoids smoothing effects in case of data gaps that may occur when a fixed time increment is selected.
In the following, the basic equations of the Kalman filter are introduced. The algorithm consists of an observation model and a dynamic model.
The observations for each step $k$ corresponding to epoch $t_k$ are given in vector $l_k$ and their covariances in matrix $\Sigma_{ll,k}$.
The length of vector $l_k$ depends on the number of water level heights $m_k$ available at each epoch $t_k$. The unknown grid node heights are compiled in vector $x_k$. The design matrix $A_k$ is the core of the observation model and connects the water level heights with the computation grid consisting of $n$ grid points. $A_k$ has a dimension of $m_k \times n$ and contains ones for those grid nodes where water level heights are available. Each water level height is assigned to the nearest grid node. The vector $v_k$ absorbs the residuals of the observation model.
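The nearest-node assignment that defines the design matrix can be sketched as follows. Planar coordinates and the function name are assumptions for illustration; the sample grid and observation positions are hypothetical.

```python
import numpy as np

def design_matrix(obs_xy, grid_xy):
    """Build the m_k x n design matrix by assigning each observation to its nearest grid node."""
    obs = np.asarray(obs_xy, dtype=float)
    grid = np.asarray(grid_xy, dtype=float)
    # squared distances between every observation and every grid node
    d2 = ((obs[:, None, :] - grid[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    A = np.zeros((obs.shape[0], grid.shape[0]))
    A[np.arange(obs.shape[0]), nearest] = 1.0   # exactly one entry of 1 per observation
    return A, nearest

# Hypothetical grid of three nodes and two observations
grid_xy = [(0.0, 0.0), (5.0, 0.0), (2.5, 4.33)]
obs_xy = [(0.4, 0.2), (4.6, -0.3)]
A, nearest = design_matrix(obs_xy, grid_xy)
```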
The uncertainties of the water level heights are described in $\Sigma_{ll,k}$. Since there is no information on correlations between individual water level heights, the matrix is defined as a diagonal matrix with variances $\sigma_l^2$ (computed in the preprocessing step) on the main diagonal. These are collected in vector $s_{l,k}$.
The dynamic model of the Kalman filter approach describes the transition of the system state from epoch $t_k$ to $t_{k+1}$.
This includes the prediction step (cf. Fig. 4) for the parameter vector $x_k^+$ as well as for its covariance matrix $\Sigma_{xx,k}^+$. The prediction of the grid node heights is done by the transition matrix $\Phi_k$. In addition, system noise $q_k$ is taken into account and mapped to the grid node heights by $\Lambda_k$. The model uncertainties are predicted by Eq. (5), where the covariance matrix $Q_k$ contains the uncertainties of the system disturbance, i.e. the system noise. Since no information on the temporal evolution of the water level is known in advance, the prediction is purely based on stochastic information (transition matrices are identity matrices). Moreover, the (deterministic) system disturbances in $q_k$ are set to zero. The system noise variance $\sigma_q^2$ in matrix $Q_k$ is assumed to be 5 cm² for each grid node (without correlations).
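The numbered equations referred to above are not reproduced in this text. In standard Kalman filter notation, and using the symbols defined above, the observation model and the dynamic model would read as follows. This is the textbook form and is given here as an assumption about the omitted equations, not as a quotation.

```latex
\begin{align}
  l_k + v_k &= A_k \, x_k
    && \text{(observation model)} \\
  \Sigma_{ll,k} &= \operatorname{diag}\left(s_{l,k}\right)
    && \text{(observation covariance)} \\
  x_{k+1}^{-} &= \Phi_k \, x_k^{+} + \Lambda_k \, q_k
    && \text{(state prediction)} \\
  \Sigma_{xx,k+1}^{-} &= \Phi_k \, \Sigma_{xx,k}^{+} \, \Phi_k^{T}
      + \Lambda_k \, Q_k \, \Lambda_k^{T}
    && \text{(covariance prediction)}
\end{align}
```

With identity transition matrices and $q_k = 0$, the prediction reduces to $x_{k+1}^{-} = x_k^{+}$ and $\Sigma_{xx,k+1}^{-} = \Sigma_{xx,k}^{+} + Q_k$.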
In the following, the applied Kalman filter procedure is described in detail.

Initialization
The Kalman filter approach begins with an initialization step, which is necessary before starting the recursive loop. The initial state vector $x_k^-$ is filled by setting all elements to the observed water level with the smallest SD in the first epoch $t_k$. The covariance matrix $\Sigma_{xx,k}^-$ is initialized by an identity matrix of size $n \times n$.

Update
In the update step, new altimeter water level heights are introduced in order to update the parameters of the current state $x_k^-$ to a new state $x_k^+$. The update is done by comparing the estimated observations (based on the current model, cf. Eq. 2) with the water level heights. The weighting of this so-called innovation is described by matrix $K_k$. It can be computed based on the design matrix and the covariance matrices of observations and parameters using Eq. (6).
The parameter update of vector $x_k^+$ describes the updated water level heights for each grid node at the current epoch $t_k$.
In parallel, the corresponding covariance matrix $\Sigma_{xx,k}^+$ of the height estimates is updated using Eq. (8). The uncertainties of new altimeter data are taken into account by applying the Kalman matrix as weighting matrix. It can easily be seen that the parameter uncertainties are reduced within the updating step.
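The update equations themselves are not reproduced in this text. In the standard form of the Kalman filter, and with the symbols used above, they would read as follows; this is a sketch under the assumption that the omitted equations follow the textbook formulation.

```latex
\begin{align}
  K_k &= \Sigma_{xx,k}^{-} A_k^{T}
         \left( A_k \, \Sigma_{xx,k}^{-} A_k^{T} + \Sigma_{ll,k} \right)^{-1}
    && \text{(Kalman matrix)} \\
  x_k^{+} &= x_k^{-} + K_k \left( l_k - A_k \, x_k^{-} \right)
    && \text{(parameter update)} \\
  \Sigma_{xx,k}^{+} &= \left( I - K_k A_k \right) \Sigma_{xx,k}^{-}
    && \text{(covariance update)}
\end{align}
```

The term $l_k - A_k x_k^{-}$ is the innovation. Since $K_k A_k \Sigma_{xx,k}^{-}$ is positive semi-definite, the variances on the diagonal of $\Sigma_{xx,k}^{+}$ cannot exceed those of $\Sigma_{xx,k}^{-}$.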

Prediction
After updating the parameter vector and the covariance matrix of the current epoch $t_k$, the predictions of the parameter vector and its covariance matrix for the next epoch $t_{k+1}$ are computed. The predictions are used as start parameters for the next update step, and the computation loop continues until all water level heights have been processed. In our case, no additional information about the temporal propagation of the parameter vector and the covariance matrix is introduced. Therefore, no deterministic model is applied, and the transition matrices $\Phi_k$ for data and $\Lambda_k$ for disturbances in Eqs. (4) and (5) can be identity matrices. Furthermore, only system noise is taken into account by setting the disturbance value $q_k$ to zero and its uncertainties $Q_k$ to variances of 5 cm² for each grid node without any correlations. These assumptions lead to simplified equations for the prediction.
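The complete loop of initialization, update, and prediction can be sketched as below. The sketch uses the standard Kalman filter equations with identity transition matrices. Heights are assumed to be in metres, so the system noise of 5 cm² becomes 5e-4 m², and the identity initial covariance is interpreted in m²; the function name, the input layout, and the sample data are hypothetical.

```python
import numpy as np

def kalman_water_levels(epochs, n_nodes, system_noise_var=5e-4):
    """Run the filter over all epochs.

    epochs: list of (node_indices, heights, variances), one entry per epoch,
            where node_indices gives the nearest grid node of each height.
    Returns the updated state vectors and covariance matrices per epoch.
    """
    _, first_h, first_var = epochs[0]
    # initialization: all nodes set to the first-epoch height with the smallest SD
    x = np.full(n_nodes, first_h[int(np.argmin(first_var))], dtype=float)
    P = np.eye(n_nodes)
    Q = system_noise_var * np.eye(n_nodes)      # uncorrelated system noise
    states, covariances = [], []
    for node_indices, heights, variances in epochs:
        l = np.asarray(heights, dtype=float)
        A = np.zeros((l.size, n_nodes))
        A[np.arange(l.size), node_indices] = 1.0
        R = np.diag(variances)                  # diagonal observation covariance
        # update step
        K = P @ A.T @ np.linalg.inv(A @ P @ A.T + R)
        x = x + K @ (l - A @ x)
        P = (np.eye(n_nodes) - K @ A) @ P
        states.append(x.copy())
        covariances.append(P.copy())
        # prediction step with identity transition: state unchanged, noise added
        P = P + Q
    return states, covariances

# Hypothetical input: two passes over a three-node grid
epochs = [
    ([0, 1], [174.10, 174.12], [0.02 ** 2, 0.03 ** 2]),
    ([2], [174.30], [0.02 ** 2]),
]
states, covariances = kalman_water_levels(epochs, n_nodes=3)
```

With a diagonal initial covariance and no correlations in the system noise, a node is only changed by observations assigned to it, so node 0 keeps its first-epoch value in the second epoch of this example.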

Post-processing
The Kalman filter provides water heights $x_k$ and their formal errors $\Sigma_{xx,k}$ for each epoch $t_k$ and grid node.
Since we assume the water level to be constant over the grid area for each time step, the surface information is concentrated to one reference point. Thus, one "mean" one-dimensional time series is computed. Instead of simply averaging all grid node heights, only the best water levels per epoch are selected, i.e. only those water level heights that fulfill certain error criteria. In general, the limit for the SD is set to values between 5 and 10 cm. The remaining water level heights are averaged for each epoch using the formal errors for the weighting. Finally, a time series of water level heights and their formal errors over the entire period of time is obtained.
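The merging of the grid node heights of one epoch can be sketched as follows. Inverse-variance weighting and the error formula for uncorrelated heights are assumptions; the text only states that the formal errors are used for the weighting. The function name and sample values are hypothetical.

```python
import numpy as np

def merge_epoch(node_heights, node_sds, max_sd=0.10):
    """Weighted mean of the grid node heights whose formal error passes the SD limit.

    Returns (height, formal_error) or None if no node passes. Weights are
    1 / SD^2 (assumed); the formal error ignores correlations between nodes.
    """
    h = np.asarray(node_heights, dtype=float)
    sd = np.asarray(node_sds, dtype=float)
    keep = sd <= max_sd
    if not keep.any():
        return None
    w = 1.0 / sd[keep] ** 2
    height = (w * h[keep]).sum() / w.sum()
    error = np.sqrt(1.0 / w.sum())
    return height, error

# Hypothetical epoch: two well-determined nodes and one poorly determined node
result = merge_epoch([174.10, 174.20, 175.00], [0.05, 0.05, 0.50])
```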
In a last step, an outlier rejection is performed. The water level time series can still contain outliers due to bad quality of data, ice coverage, orbit maneuvers, etc. For the detection of those outliers, SVR can be applied again, now on the full time series. Complete tracks showing significant differences with respect to the other points of the water level time series can be rejected. This time, radial basis functions are used instead of a linear kernel to perform the regression, since a constant water level over time cannot be assumed. The radial basis function as kernel of the SVR allows fitting the time series including seasonal variations and trends. Figure 5 shows the results of an applied SVR on a 6-year subset of the time series of Lake Erie. The fitted model is plotted as a cyan line together with its manually defined confidence interval. Water level heights which fulfill the limit of the SVR are kept (blue), whereas outliers are rejected (red).

Results and validation
In this section, some resulting water level time series from the Kalman approach are presented and validated. Since it is not possible to show results for all inland water bodies, we focus on selected study areas introduced in Sect. 4.1. Three inland water targets will be described in more detail. They represent different target types, i.e. large lakes, small lakes, and rivers. Moreover, results from 16 lakes and 20 river crossings are validated by comparison with in-situ data and altimeter time series provided by other groups.

Study areas
For altimetry-derived water level time series, in-situ measurements from gauging stations are the most important validation data sets. In order to perform reliable comparisons, only those inland water bodies are selected as study areas for which in-situ data are available. Since we have access to many gauging stations in North and South America, we focus our study on these two continents.
Another criterion for the selection of inland water bodies is the availability of external altimetry-derived time series in order to demonstrate the performance of our Kalman filter method compared to other approaches. Each study case shall be observed by at least one other group (i.e. LEGOS, ESA-DMU, or GRLM). Thus, those targets in America are selected which are best represented by other inland altimetry databases for a time period as long as possible. Moreover, different water types should be covered, such as large lakes, small lakes, and rivers of different widths. We end up with 16 lakes and 20 river crossings, illustrated in Fig. 6. For all investigated inland water bodies, at least one in-situ gauging station and one external altimetry-derived time series are available.
In the following all investigated inland water bodies located in North America and South America and their corresponding in-situ data are introduced.
The first study areas are the Great Lakes of North America, comprising Lake Superior (82 000 km²), Lake Huron (59 000 km²), Lake Michigan (58 000 km²), Lake Erie (25 000 km²), and Lake Ontario (19 000 km²). The large extents of these lakes lead [...]rents" platform of the National Oceanic and Atmospheric Administration (NOAA). For the validation of Lake Superior, the in-situ stations of Duluth, Grand Marais, Marquette, Ontonagon, and Point Iroquois are used. Lake Huron has 5 stations for validation, which are Essexville, Harbor Beach, Lakeport, Mackinaw City, and de Tour Village. The stations Calumet Harbor, Holland, Kewaunee, Ludington, Milwaukee, and Port Inland are used for Lake Michigan. Lake Erie has 7 stations for validation, which are Buffalo, Cleveland, Fairport, Fermi Power Plant, Marblehead, Sturgeon Point, and Toledo. For the validation of Lake Ontario, the in-situ stations of Cape Vincent, Olcott, Oswego, and Rochester are used.
In addition to the Great Lakes, the Great Slave Lake (27 200 km²), Lake Winnipeg (24 000 km²), Lake Athabasca (7800 km²), Lake Winnipegosis (5100 km²), Lake Manitoba (4600 km²), Lake of the Woods (4300 km²), Great Salt Lake (4000 km²), Lake Claire (1400 km²), and Cedar Lake (1300 km²), which are located in Canada and the United States, are investigated. These lakes differ significantly in surface extent, by up to a factor of 20. The climatic conditions in winter make the estimation of water level time series difficult for the Canadian lakes: several lakes are frozen for several months, which makes the water level computation challenging. For the validation of the water level time series, in-situ data provided by the Government of Canada7 and the U.S.
In addition to the lakes in North America, two lakes in the very south of South America are selected for validating our approach. Lake Argentino (1466 km²) and Lake Buenos Aires (1850 km²) are located in Argentina next to the Andes. The lakes are partly surrounded by mountains, which can affect the altimeter measurements. Both lakes have similar shapes, with their largest extent in the across-track direction of the satellite's ground track. This leads to rather short track crossings varying between 10 and 15 km. Despite the location in a temperate climate zone near high mountains, the lakes are not frozen in winter. The seasonal variations of both lakes range between 2.5 and 3.5 m. For the validation of Lake Argentino and Lake Buenos Aires, in-situ data from the Ministerio de Planificación Federal, República Argentina9 are used.
For the analysis of rivers, the Amazon Basin is selected as study area. The Amazon Basin is the largest basin worldwide, covering about 7 000 000 km². The region is located in the tropics, and the climate is hot and wet during the whole year. Due to the strong precipitation, the resulting seasonal variations of the water level show amplitudes of up to 15 m. The Amazon Basin consists of countless rivers which differ in length, width, meanders, and seasonal variations. This diversity is very useful for the quality assessment of water level time series from altimetry. For example, the river widths vary between up to 10 km for the Amazon river and a few hundred meters for Rio Jiparaná. Moreover, the Amazon Basin is a well-observed area since the Agência Nacional de Águas (ANA)10 provides data of numerous in-situ gauging stations.
For the validation, water level time series of gauges at Rio Japurá, Rio Solimoes, Rio Negro, Rio Purus, Rio Jiparaná, Rio Paraguai, and Rio Cuiba are used. Another reason why we choose the Amazon Basin is that other groups such as LEGOS and ESA-DMU have also investigated this area.

Validation data sets
For the validation of the Kalman filter results, in-situ data of gauging stations provided by the different institutions named in Sect. 4.1 are used. Water level time series from gauges have a high relative accuracy, but some limitations have to be kept in mind when using in-situ data. The absolute comparison of heights from gauges and satellite altimetry is often very difficult since the location, reference height and vertical datum of gauges are not always precisely known or are even unknown. This leads to height offsets between water level time series from gauge and altimetry which must be considered in the validation step. In particular, the comparison between water level heights from altimetry and in-situ data over rivers shows remaining offsets in most cases. In general, almost no altimeter satellite track crosses the river at the location of a gauging station, which leads to additional offsets due to the slope of the river. To avoid handling the uncertainties of in-situ data, only relative comparisons with water level time series from altimetry are performed.
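The relative comparison described above can be sketched as follows. This is a minimal illustration and not the validation code used for the study: the function name `relative_comparison`, the day-indexed dictionaries used as input, and the removal of the constant offset as the mean difference over common days are all assumptions made for this example.

```python
import math


def relative_comparison(altimetry, gauge):
    """Compare two water level series on their common days.

    altimetry, gauge: dicts mapping a day index to a height in metres
    (hypothetical input layout). The constant offset between the two
    series is removed as the mean difference, so that only water level
    changes are compared.
    Returns (rms_difference, squared_correlation, number_of_points).
    """
    days = sorted(set(altimetry) & set(gauge))
    n = len(days)
    if n < 2:
        raise ValueError("need at least two common days")
    a = [altimetry[d] for d in days]
    g = [gauge[d] for d in days]
    mean_a = sum(a) / n
    mean_g = sum(g) / n
    # Constant offset, e.g. from an unknown gauge datum or river slope.
    offset = mean_a - mean_g
    rms = math.sqrt(sum((x - offset - y) ** 2 for x, y in zip(a, g)) / n)
    # Squared Pearson correlation of the two series.
    cov = sum((x - mean_a) * (y - mean_g) for x, y in zip(a, g))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_g = sum((y - mean_g) ** 2 for y in g)
    r2 = cov ** 2 / (var_a * var_g)
    return rms, r2, n
```

Because the mean difference is removed before the RMS is computed, a gauge with an unknown datum and an altimetry series referenced to a geoid can be compared without knowing the absolute height of either. Days present in only one of the two series are ignored.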
In order to rank our results with respect to other time series derived from altimeter data, we download water levels from three external inland altimeter databases, namely LEGOS, ESA-DMU, and GRLM. These results are based on different altimeter missions, and the groups apply different approaches to compute the water level time series. As a consequence, these external time series cover different time periods and feature different temporal resolutions. This has to be kept in mind when comparing the different time series.

Selected results
We choose three of the aforementioned water bodies in order to present detailed results of our Kalman filter approach. The targets are selected to represent three disparate inland water body types featuring different characteristics. Lake Superior (Fig. 7) is selected as a representative of larger lakes with ocean-like conditions. Lake Athabasca (Fig. 8) is a smaller lake which has to cope with ice coverage in winter, as is the case for most lakes in North America. Finally, Rio Madeira (Fig. 9) in the Amazon Basin is selected to show the potential of the Kalman filter approach for river monitoring. For these examples, the Kalman filter based time series from DAHITI is compared with in-situ data and results from LEGOS, ESA-DMU and GRLM.
Figure 7 shows the water level time series of Lake Superior between 1992 and 2014: the DAHITI result is plotted in blue, in-situ data of station Ontonagon in red, and external altimetry-derived water levels in green (LEGOS), light blue (ESA-DMU), and orange (GRLM). For a detailed view, results from the year 2004 are highlighted in the upper right corner. In order to neglect constant offsets between the different solutions, all time series are shifted to the level of DAHITI, and only water level changes are compared. The Kalman filter provides a continuous time series with an irregular near-daily resolution which shows neither outliers nor inter-mission inconsistencies.
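How a Kalman filter merges observations of different accuracy into one time series with irregular time steps can be illustrated with a deliberately simplified scalar example. This is a sketch under stated assumptions, not the DAHITI implementation: the random-walk model for the water level, the function name `kalman_water_level`, the `(day, height, variance)` tuple layout, and the process variance per day are all hypothetical choices made for illustration.

```python
def kalman_water_level(observations, process_var_per_day=1e-4):
    """Scalar Kalman filter with a random-walk water level model.

    observations: list of (day, height_m, variance_m2) tuples sorted by
    day, possibly from different missions with different accuracies
    (hypothetical layout). process_var_per_day is an assumed value.
    Returns a list of (day, estimate, estimate_variance).
    """
    if not observations:
        return []
    # Initialization: start from the first observation.
    t, x, p = observations[0]
    estimates = [(t, x, p)]
    for day, height, var in observations[1:]:
        # Prediction: the level is carried forward unchanged, while the
        # uncertainty grows with the time elapsed since the last epoch.
        p = p + process_var_per_day * (day - t)
        t = day
        # Update: the gain weights the new observation by its accuracy.
        gain = p / (p + var)
        x = x + gain * (height - x)
        p = (1.0 - gain) * p
        estimates.append((t, x, p))
    return estimates
```

In this sketch, two observations of equal variance on the same day are simply averaged, whereas a precise observation following a long data gap almost entirely replaces the predicted level. Observations from all missions can be fed in as one sorted list, which is why the resulting series has an irregular temporal resolution.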
For the computation, 1 Hz altimeter data of Topex, Jason-1, Jason-2, Envisat, ERS-2 and SARAL/AltiKa are used. Altogether, the time series is composed of 2779 single points, each representing one day with at least one altimeter track crossing the lake. The DAHITI water levels coincide very well with the daily in-situ data of Ontonagon: the correlation coefficient R² is 0.96 and the RMS difference is 4.1 cm. In comparison to the DAHITI time series, the other altimetry-derived water levels show significantly reduced temporal resolutions. In addition, the lengths of the time series differ, depending on the missions used by the different groups. In order to rank the DAHITI result compared to other altimetry-derived water levels, we also compare the three external time series with the in-situ gauging data within the corresponding time intervals. This gives smaller correlations and higher RMS differences for all three databases (LEGOS: RMS = 6.1 cm, R² = 0.94, 278 points; ESA-DMU: RMS = 8.2 cm, R² = 0.82, 82 points; GRLM: RMS = 12.1 cm, R² = 0.74, 760 points). The different performances are due to varying input data sets and the different approaches. LEGOS uses a multi-mission approach with a merged monthly resolution, whereas ESA-DMU relies purely on Envisat with a temporal resolution of 35 days. GRLM applies a multi-mission approach reaching a temporal resolution of about 10 days.
Figure 8 shows the same time series as Fig. 7 but for the smaller Lake Athabasca. Once again, water levels from DAHITI (blue), in-situ data of Crackingstone Point (red), LEGOS (green), ESA-DMU (light blue), and GRLM (orange) are plotted. Now, the year 2010 is highlighted. In principle, Lake Athabasca with its 7800 km² surface extent should be large enough to provide reliable altimetry-derived water level time series. However, due to regular freezing in winter, altimeter data processing is challenging for some months of the year. For the estimation of the water level time series in DAHITI, retracked altimeter data are used, applying a 10 % Improved Threshold retracker (Hwang et al., 2006). The DAHITI water level shows a very good agreement with in-situ data in summer, whereas a few outliers due to ice coverage are visible in winter. The overall consistency with the gauge data yields a correlation coefficient of 0.88 and an RMS difference of 17.0 cm using 1337 points in the period between 1992 and 2014. The differences between in-situ data and LEGOS (RMS = 33.7 cm, R² = 0.79, 272 points), ESA-DMU (RMS = 80.5 cm, R² = 0.30, 79 points) and GRLM (RMS = 55.7 cm, R² = 0.27, 76 points) show higher RMS values and smaller correlations. It is clearly visible that the problems of all altimeter time series occur mostly in winter due to ice coverage.
As the last example, we choose a river crossing in the Amazon Basin. Figure 9 shows the resulting water level derived from an Envisat/SARAL/AltiKa crossing over Rio Madeira. At this location, the Rio Madeira is about 2.5 km wide. The in-situ station Humaitá is located about 27.6 km upstream. All altimeter time series reach a temporal resolution of about 1 month since there is only one mission with a 35-day temporal resolution. Altimeter data are available between 2002 and 2014 with a data gap in 2011 and 2012. Gauge data are not available before 2007. Thus, the comparison with in-situ data only comprises a time period of about 3.5 years. For DAHITI we have another year of SARAL/AltiKa data available. The Kalman filter result (blue) shows an RMS difference of 21.6 cm and a correlation coefficient of 1.00 using 35 points. The RMS is comparable to the result of Lake Athabasca, which is even more satisfying when taking the seasonal variations of about 15 m of Rio Madeira into account. The high amplitude is also the reason for the extremely high correlation, which should not be overvalued. The RMS differences of LEGOS and ESA-DMU with respect to the gauge are about two times larger, with 45.1 cm (LEGOS, 29 points) and 53.2 cm (ESA-DMU, 28 points), respectively. GRLM does not provide information for this virtual station.

Quality assessment
The results of Lake Superior, Lake Athabasca, and Rio Madeira presented in Sect. 4.3 already show the suitability of the Kalman filter approach to provide reliable and highly accurate time series of inland water level heights. Since three results, even if they represent different inland water types, are not enough to perform a reliable quality assessment of the method, we extend the validation to a larger sample and include all study targets (16 lakes and 20 river crossings) described in Sect. 4.1 in the comparison. Table 3 summarizes the comparisons of lake level time series from DAHITI, LEGOS, ESA-DMU, and GRLM with in-situ gauge data. Additional information such as winter ice coverage (Ice) or data retracking (Retr., for DAHITI only) is provided. For each target, the RMS difference, the correlation coefficient and the number of used points (No) are provided. Depending on the availability of in-situ time series for the investigated water body, more than one comparison is performed for the larger lakes. The smallest RMS difference for each target is marked in green, the largest one in red.
DAHITI results show RMS differences with respect to the gauge data between 4 and 38 cm. The accuracy clearly declines with decreasing lake extent and with ice coverage. For some lakes, the differences between DAHITI and in-situ data vary by more than a factor of two when using different lake gauges. Especially for Lake Erie and Lake Winnipeg, the difference between the RMS values can reach up to 10 cm. Since only one DAHITI time series is computed per lake, these variations demonstrate uncertainties of the in-situ data sets. For most lakes, the relations between the different RMS values are similar for the different altimeter products.
For most lakes, the DAHITI water levels are more consistent with in-situ data than the results from external altimeter databases. In addition, the temporal resolutions of the time series are significantly higher, as indicated by the number of used points. Of course, the different time periods of the other altimeter data sets have to be taken into account, too. The most considerable improvements through the DAHITI Kalman approach with respect to the existing databases can be seen for smaller lakes. For example, for the Lake of the Woods, the DAHITI consistency with in-situ data is more than twice as good as for the other altimeter products, improving the RMS differences from about 40 cm to approximately 15 cm.
The validation results for different rivers in the Amazon Basin are summarized in Table 4. We study 8 different rivers with altogether 20 virtual stations. For the computation, data from Jason-1, Jason-2, Envisat, and SARAL/AltiKa are used. Most of the time series are based on only one altimeter track (sometimes from consecutive missions, e.g. Jason-1 and Jason-2). A few locations allow for using more than one track, in case of a crossover point between different altimeter tracks.
The table shows the comparison results of three altimeter products (DAHITI, LEGOS, and ESA-DMU) with different in-situ stations. GRLM does not provide river level time series and is excluded from this investigation. In addition to RMS differences with respect to the gauging time series and correlation coefficients, the number of used data points, the river width and the distance between altimeter crossing and gauge are given. Positive distances indicate downstream gauges, negative distances indicate upstream gauges. The RMS differences between altimeter time series and in-situ data vary between 12 and 139 cm in the case of DAHITI. For most virtual stations, the consistency with the gauge is considerably lower than for lakes. It is not possible to prove a dependency on river width or on distance to the gauge. Certainly, this is not only due to the altimeter time series but is also caused by the accuracies of the in-situ data.
Compared to time series from LEGOS and ESA-DMU, the new DAHITI approach can improve the consistency with the gauges for most of the targets. The improvement can reach several decimeters.
Many correlation coefficients in Table 4 are close to 1. This is not necessarily an indication of optimal consistency between altimeter water levels and gauging observations but is significantly influenced by the large absolute water level variations (more than 10 m).
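The influence of the signal amplitude on the correlation can be made concrete with a small synthetic experiment. The sketch below is purely illustrative and uses no data from this study: the sinusoidal annual signal, the noise level of 0.3 m, the two amplitudes, and the function names are assumptions chosen for the example.

```python
import math
import random


def squared_correlation(x, y):
    """Squared Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov ** 2 / (var_x * var_y)


def simulate(amplitude_m, noise_sd_m=0.3, n_days=365, seed=0):
    """Synthetic gauge (annual sinusoid) and noisy 'altimetry' series.

    All parameter values are assumed for illustration only.
    Returns (rms_difference, squared_correlation).
    """
    rng = random.Random(seed)
    gauge = [amplitude_m * math.sin(2.0 * math.pi * d / 365.0)
             for d in range(n_days)]
    altimetry = [g + rng.gauss(0.0, noise_sd_m) for g in gauge]
    rms = math.sqrt(sum((a - g) ** 2
                        for a, g in zip(altimetry, gauge)) / n_days)
    return rms, squared_correlation(gauge, altimetry)


# Same noise realisation, different seasonal amplitudes.
rms_lake, r2_lake = simulate(amplitude_m=0.5)    # about 1 m seasonal range
rms_river, r2_river = simulate(amplitude_m=7.5)  # about 15 m seasonal range
```

Both runs use the same noise realisation, so the RMS differences are practically identical, yet the squared correlation is far closer to 1 for the large-amplitude case. For targets with strong seasonal variations, the RMS difference is therefore the more informative measure of consistency.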

Conclusions
This paper presents a new method for estimating water level time series over inland water using multi-mission satellite altimetry data. It is based on careful data preprocessing (including waveform retracking), a Kalman filter approach, and a rigorous outlier detection. The introduced method is the basis of the "Database for Hydrological Time Series over Inland Water" (DAHITI), an online database for inland water level time series from satellite altimetry observations operated by the Deutsches Geodätisches Forschungsinstitut (DGFI-TUM).
The study demonstrates the performance of the new method for numerous lakes and rivers in North and South America. A comprehensive validation is performed by comparison with time series of water level variations from in-situ gauging stations. Moreover, a comparison with external altimetry-derived water level variations is presented based on data from Hydroweb (LEGOS), the River and Lakes database (ESA-DMU), and the Global Reservoir and Lake Monitor (GRLM).
The lake level data sets computed with the presented approach yield accuracies between 4 and 38 cm, depending on the surface extent of the lake and the climate conditions (i.e. ice coverage). For rivers, the performance is considerably lower, with RMS differences varying between 12 and 139 cm. Here the accuracy mainly depends on the crossing angle of the altimeter track and the surrounding conditions. River width only plays a minor role.
For most study cases, the new approach yields significant accuracy improvements compared to water level variations provided by established inland altimeter databases, especially for smaller lakes and rivers.
In addition, the temporal resolution of the DAHITI lake time series is significantly improved compared with other data sets, allowing for the detection of sub-monthly temporal changes.
The reasons for the improved performance of the presented approach are manifold. Firstly, a larger observation data set is used as input since a multi-mission concept is realized: all available altimeter missions are cross-calibrated and incorporated in the computations. Secondly, the applied preprocessing consists of careful retracking and robust outlier elimination, which ensures that only highly accurate data are used. Moreover, the Kalman filter approach permits the optimal combination of all data sets and also includes the accuracies of the input data for weighting. This also enables
et al., 2012) model is used which extends the EGM2008 geoid model with additional GOCE gravity data.

Figure 1. Processing strategy for the computation of water level time series for inland waters in DAHITI, separated into three main steps: preprocessing, Kalman filtering, and postprocessing.

Figure 2. Example of an outlier detection using an SD threshold and SVR along a single satellite track over a lake containing an island (between approx. 41°44′ and 41°47′). The result of the regression shows valid (blue) and rejected (red, green) altimeter heights. The SDs of the heights are plotted as gray bars. Thresholds for SDs and SVR are marked by dashed lines (black and cyan, respectively).

Figure 4. Procedure of the Kalman filtering, starting with an initialization step which is followed by a progressive loop containing one update and one prediction step.

Table 1. List of all altimeter missions used in this study together with their main characteristics.

Table 2. List of applied models and geophysical corrections.