Modeling 25 years of spatio-temporal surface water and inundation dynamics on large river basin scale using time series of Earth observation data

The usage of time series of Earth observation (EO) data for analyzing and modeling surface water extent (SWE) dynamics across broad geographic regions provides important information for sustainable management and restoration of terrestrial surface water resources, which suffered alarming declines and deterioration globally. The main objective of this research was to model SWE dynamics from a unique, statistically validated Landsat-based time series (1986–2011) continuously through cycles of flooding and drying across a large and heterogeneous river basin, the Murray–Darling Basin (MDB) in Australia. We used dynamic linear regression to model remotely sensed SWE as a function of river flow and spatially explicit time series of soil moisture (SM), evapotranspiration (ET), and rainfall (P ). To enable a consistent modeling approach across space, we modeled SWE dynamics separately for hydrologically distinct floodplain, floodplain-lake, and non-floodplain areas within ecohydrological zones and 10km× 10km grid cells. We applied this spatial modeling framework to three sub-regions of the MDB, for which we quantified independently validated lag times between river gauges and each individual grid cell and identified the local combinations of variables that drive SWE dynamics. Based on these automatically quantified flow lag times and variable combinations, SWE dynamics on 233 (64 %) out of 363 floodplain grid cells were modeled with a coefficient of determination (r2) greater than 0.6. The contribution of P , ET, and SM to the predictive performance of models differed among the three sub-regions, with the highest contributions in the least regulated and most arid sub-region. The spatial modeling framework presented here is suitable for modeling SWE dynamics on finer spatial entities compared to most existing studies and applicable to other large and heterogeneous river basins across the world.

Abstract.The usage of time series of Earth observation (EO) data for analyzing and modeling surface water extent (SWE) dynamics across broad geographic regions provides important information for sustainable management and restoration of terrestrial surface water resources, which suffered alarming declines and deterioration globally.The main objective of this research was to model SWE dynamics from a unique, statistically validated Landsat-based time series  continuously through cycles of flooding and drying across a large and heterogeneous river basin, the Murray-Darling Basin (MDB) in Australia.We used dynamic linear regression to model remotely sensed SWE as a function of river flow and spatially explicit time series of soil moisture (SM), evapotranspiration (ET), and rainfall (P ).To enable a consistent modeling approach across space, we modeled SWE dynamics separately for hydrologically distinct floodplain, floodplain-lake, and non-floodplain areas within ecohydrological zones and 10 km × 10 km grid cells.We applied this spatial modeling framework to three sub-regions of the MDB, for which we quantified independently validated lag times between river gauges and each individual grid cell and identified the local combinations of variables that drive SWE dynamics.Based on these automatically quantified flow lag times and variable combinations, SWE dynamics on 233 (64 %) out of 363 floodplain grid cells were modeled with a coefficient of determination (r 2 ) greater than 0.6.The contribution of P , ET, and SM to the predictive performance of models differed among the three sub-regions, with the highest contributions in the least regulated and most arid sub-region.The spatial modeling framework presented here is suitable for modeling SWE dynamics on finer spatial enti-ties compared to most existing studies and applicable to other large and heterogeneous river basins across the world.

Introduction
Periodically inundated areas such as floodplains play a major role in the healthy function of river systems and perform many ecosystem services of value to people such as the retention of flood water, nutrients, and sediment and the provision of food, clean water, and groundwater recharge (Hamilton, 2010;Lemly et al., 2000;Maltby and Acreman, 2011;Robertson et al., 1999;Tockner et al., 1999).Floodplains are particularly important within water stressed areas with high rainfall variability and semi-arid climate conditions as they help to sustain smaller discharges during the dry season, resulting in improved overall availability of water (Teferi et al., 2010).During the last century, increasing development of water resources, land use transformations, and agricultural intensification have led to an alarming disappearance and decline of terrestrial surface water resources (Finlayson and Spiers, 1999;Jones et al., 2009;Lemly et al., 2000).A recent study estimates that nearly two-thirds of all terrestrial freshwater wetlands were lost between 1997 and 2011, expressed as a reduction of the global area from 165 to 60 million ha (Costanza et al., 2014).Consequently, there is a pressing need for improved management and restoration of terrestrial surface water resources, which requires cost-effective methods for mapping and analyzing the distribution and dynamics of surface water across large spatial and temporal scales Published by Copernicus Publications on behalf of the European Geosciences Union.(Alsdorf et al., 2007;Bakker, 2012;Finlayson et al., 1999;Vörösmarty et al., 2015).
Recent advances in the availability and spatial and temporal resolution of geospatial data along with improved processing capabilities enabled the development of various continental-to global-scale hydrodynamic models with improved representation of channel and floodplain inundation dynamics (Cauduro et al., 2013;Getirana et al., 2012;Neal et al., 2015Neal et al., , 2012;;Sampson et al., 2015;Yamazaki et al., 2011).Although these models can provide information about the distribution of surface water across extended areas and periods of time, they require complex parameterization, are computationally intensive, and depend on the accuracy of digital elevation models (DEMs) with global coverage.As an alternative, Earth observation (EO) data, combined with statistical modeling techniques, represents a promising and costeffective approach for systematic observation and quantification of surface water (Alsdorf et al., 2007;Overton, 2005).New satellite, airborne, and ground-based remote sensing data with high spatial, temporal, and radiometric resolution are quickly becoming more accessible (Nativi et al., 2015).EO data enable analyses of changes in the availability and distribution of surface water at continental or sub-continental scales based on comparison of snapshots of the state of the system at two (Baker et al., 2007;Teferi et al., 2010) or multiple points in time (Huang et al., 2014b;Zhao et al., 2011).The opening of archives of continuous optical satellite data, such as Landsat and MODIS imagery, further increased the potential of performing time series analysis of remotely sensed surface water extent (SWE) and inundation dynamics over large areas and long periods of time (Klein et al., 2014;Kuenzer et al., 2015;Mccarthy et al., 2003;Sakamoto et al., 2007;Tulbure and Broich, 2013;Tulbure et al., 2016).Such EO-based analyses of SWE dynamics are to be distinguished from EO-based flood mapping, which focuses on flooding of areas that are not frequently inundated and large-scale damage assessment of floods (Kuenzer et al., 2015).In comparison to change analysis based on multiple observations, time series analysis refers to temporally dense monitoring of land surface dynamics over a defined period of time (Broich et al., 2011;Wagner et al., 2015).Synthetic aperture radar (SAR) has the advantage of not being affected by cloud cover for mapping surface water, but the availability of long-term SAR time series data for large areas is still limited (Yan et al., 2015).Accordingly, optical satellite data currently represent the main choice for time series analysis of SWE dynamics.
Empirical models of SWE on floodplains derived from optical satellite data as a function of discharge or water height in the adjacent river (Table 1) have been previously developed, with case studies including the Okavango Delta (∼ 15 000 km 2 ) (Gumbricht et al., 2004), the Waza-Logone floodplain in Cameron (∼ 3000 km 2 ) (Jung et al., 2011;Westra and De Wulf, 2009), the Tana River delta in Kenya (∼ 1300 km 2 , Leauthaud et al., 2013), and various flood-plains across the Murray-Darling Basin (MDB) in Australia (Table 1,study no. 4,5,6,7,8).Table 1 gives an overview of studies in which SWE derived from continuous optical satellite imagery was empirically modeled as a function of river flow and other driver variables for different floodplain sites.For example, the RIM-FIM (River Murray Floodplain Inundation Model) (Sims et al., 2014) used between four and seven manually selected Landsat images during the rising side of flood hydrographs in the period from 1984 to 2012 in combination with high-resolution DEMs to create empirical models of floodplain inundation as a function of river flow for 11 zones in the MDB.Huang et al. (2014a) used Moderate-resolution Imaging Spectroradiometer (MODIS) imagery during the biggest annual floods between 2001 and 2010 to develop a model that provides maximum SWEs for river-flow levels with a range of average return periods for 90 zones covering the entire MDB (∼ 1 million km 2 ).While such EO-based inundation models can provide cost-effective tools for sustainable management of water resources at subcontinental scales, there is currently still a gap between models of SWE at local scale and high spatial resolution (Table 1, study no.2, 4, 5, 6) and sub-continental scale and coarse resolution, which cover large areas but lack local detail (Table 1, study no.7).Using a unique Landsat-based time series  of validated surface water extent (Tulbure et al., 2016), the overall aim of this research was to model SWE dynamics at large river basin scale with locally relevant detail, focusing on the MDB of Australia as a case study.The Landsat-based SWE time series is unprecedented in its spatial (30 m × 30 m) and temporal (every 16 days) resolution and provides unique insights into 25 years of SWE dynamics across the entire MDB, including the Millennium Drought (1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009) (Leblanc et al., 2012), and major floods (e.g., 2010-2011 La Nina floods).
Despite the great value of large-scale inundation models for water resources management, there remains potential for improving the usage of time series of EO data for modeling SWE at sub-continental scale and multi-decadal time periods.One of the major limitations of existing approaches is that most of them are based on a small number of satellite images, which are typically acquired before, during, and after the occurrence of peak flow of manually selected floods (see event-based models in Table 1).The resulting models are event based, limiting them to forecasting a single maximum flood extent for a given peak flow.Considering the importance of flood propagation and duration for biodiversity as well as the increasing availability of time series of EO data, an event-based approach has drawbacks and a dynamic modeling approach, where each time step of the SWE time series is accounted for, is desirable (Chen et al., 2014;Hamilton, 2010;Shaikh et al., 2001).Therefore, this study aimed to model SWE dynamics continuously through cycles of flooding and drying, using all observations of the SWE time series along with a modeling approach suited for time series data.Even though the extent of floodplain inundation largely depends on the discharge and water level in the river, the hydrologic conditions of the floodplain as well as the local climate before, during, and after a flood play an important role in the flooding and drying behavior of floodplains.For many water bodies that are not connected to rivers, local rainfall (P ) is the main source of inundation (Kingsford et al., 2001).Increased soil moisture (SM) prior to flooding usually leads to reduced transmission losses and thus to a larger flood extent and longer flood duration for a given flow level compared to dry antecedent conditions of the floodplain (Overton, 2005).Additionally, evapotranspiration (ET) is a major component of the water balance of surface water bodies especially in semi-arid regions such as the MDB (Lamontagne and Herczeg, 2009;Sánchez-Carrillo et al., 2004).Besides river flow as the key driver for floodplain inundation, five of the studies that developed EO-based inundation models accounted for P and four of them also for ET during or before flooding, whereas only one study accounted for the antecedent SM condition of the floodplain (Table 1).Accounting for the local climate before and during flooding is most commonly done based on a modeling approach that requires definition of conceptual water balance and flow routing models (Table 1, study no.9, 10, 11).In order to understand the key factors that drive the dynamics of SWE over extended areas, we modeled SWE as a function of river flow and spatially explicit time series of P , SM, and ET and quantified the contribution of each of these variables in explaining the variability of SWE.
Even though some of the larger study sites are divided into smaller sub-regions for modeling (Table 1), only two studies accounted for lag times between discharge recorded at the gauge (Westra and De Wulf, 2009) or water surface elevation at key points (Jung et al., 2011) and the correlated SWE in different areas of the sub-region.The overall aim of this study was to develop a holistic and data-driven methodology for modeling SWE and its drivers through periods of flooding and drying across a large and heterogeneous river basin.Specific key objectives of this study were to 1. develop a transferable spatial modeling framework that allows for the application of a holistic modeling approach across the study area; 2. model lag times between remotely sensed SWE per modeling unit and recordings of discharge at available river gauges; 3. model SWE and quantify the role of drivers (i.e., river flow, ET, SM, P ) in explaining SWE variability across space and time. 2 Methods

Study area
The study area of this research was one of Australia's largest river system, the Murray-Darling Basin (more than 1 million km 2 ).The eastern and southern border of the MDB is marked by the Great Dividing Range (Fig. 1), which is also where most of the MDB's surface water runoff is generated (CSIRO, 2008).Outside of these partly humid highlands, the majority of the MDB is characterized by extensive and flat low-lying plains with arid and semi-arid climatic conditions and slowly meandering rivers with vast floodplains (MDBA, 2010).The long-term average annual rainfall of the MDB is 469 mm of which around 90 % evaporates or transpires back into the atmosphere.There is also a pronounced climate gradient across the MDB with average annual rainfall decreasing and climate variability increasing from southeast to northwest (Leblanc et al., 2012).
The MDB has almost 30 000 wetlands (MDBA, 2010) with 16 wetlands listed as "wetlands of international importance" (Ramsar Convention Secretariat, 2014) and around 200 specified in the Directory of Important Wetlands in Australia (Environment Australia, 2001).The majority of wetlands are floodplains, which cover around 6 % of the MDB's total area (Kingsford et al., 2004).In the second half of the 20th century, the MDB's water resources have been developed intensively with agriculture now taking up around 80 % of the MDB's area (CSIRO, 2008) and accounting for more than 80 % of the MDB's average annual surface water use of 11.3 km 3 year −1 (Leblanc et al., 2012).Reduction in both frequency and size of flooding events resulting from agricultural development within the MDB has led to deterioration in the health of many surface water ecosystems (Kingsford, 2000).On top of this human-induced stress, between 1997 and 2009 the MDB experienced the most severe drought since the records began, during which the floodplains and wetlands of the lower MDB received very little or no inflows with devastating environmental and ecological impacts (Leblanc et al., 2012).These key features illustrate that water requires extensive and well-planned management in the MDB for which a basin-wide empirical model of surface water and inundation dynamics can be a valuable tool.

Data
The dependent variable used in this study was based on a remotely sensed time series of validated, open-surface water and flooding extent derived from the seasonally continuous archive of over 25 000 Landsat TM and ETM+ images available for the entire MDB from 1986 to 2011 (Tulbure et al., 2016).This SWE time series was developed using random forest classification models for surface water and clouds (Tulbure et al., 2016).The overall classification accuracy was 99 %, with 87 % (±3 % standard error) and 96 % (±2 %) producer's and user's accuracy of surface water, respectively.SWE dynamics showed high inter-and intra-annual variability across the MDB, highlighting the vulnerability of SWE to hydroclimatic variability (Tulbure et al., 2016).The SWE time series used Landsat images with ≤ 50 % cloud cover (Tulbure et al., 2016), resulting in times between subsequent observations of SWE from 16 days (Landsat temporal resolution) to a multiple of 16 days.
In order to model SWE continuously through periods of flooding and drying, we used spatially explicit time series of rainfall, evapotranspiration, and near-surface soil moisture along with river flow as predictor variables (Table 2).The selection criteria for these variables were that the temporal extent of the data sets had to be spatially explicit and long enough to cover the entire period of the SWE time series.River-flow data were acquired for gauges that had complete records of daily discharge expanding over the entire time frame of the SWE time series and downloaded from respective state repositories (State Government Victoria, 2015;Queensland Government, 2015;Government of South Australia, 2015;New South Wales Government, 2015).

Spatial modeling framework
A schematic overview of the data processing, analysis and spatial modeling framework used in this analysis is provided in Fig. 2. Due to the large geographic extent of the study area and the related heterogeneity of SWE dynamics across the basin, the study area was split into suitable sub-units for modeling SWE.Overton et al. (2009) developed a segmentation of the MDB into zones with uniform ecological and hydrological characteristics (EH-zones), which is described as a trade-off between the finer resolution of river and floodplain behavior and available river gauges.This zonation was used for the development of the MDB-FIM (Chen et al., 2012) and subsequently adapted and improved by accounting for the hydrologic structure of the MDB (Huang et al., 2013) (Figs. 2a, 3).The zonation was specifically developed to enable hydrological and hydraulic modeling on a whole-ofbasin scale while preserving key ecologic and hydrologic entities, and served as the basic spatial segmentation in this analysis.For each EH-zone, the most suitable river gauge for modeling SWE was specified by this zonation and the size of the resulting 89 zones ranged from a maximum of 59 991 km 2 to a minimum of 541 km 2 with an average zone size of 11 935 km 2 .
The MDB contains numerous small and ephemeral rivers and other water bodies that are not connected to major river systems.As opposed to floodplains, we expected the SWE of these water bodies to be mainly driven by local rainfall and evapotranspiration.Furthermore, the SWE dynamics of floodplain lakes differ greatly from those of shallow floodplains, especially with respect to the retention of flood water after a flood has passed.Therefore, we used an existing static wetland layer (Kingsford et al., 2004) to categorize the entire study area into floodplain, floodplain-lake, and nonfloodplain areas, so that the heterogeneous dynamics of SWE on these different entities could be accounted for in the modeling process (Fig. 2b, c).The definition of the floodplain and floodplain-lake category was based on hydraulic connectivity of surface water bodies to river systems with available discharge data for modeling.We defined hydraulic connectivity based on the Geofabric Surface Network (Commonwealth of Australia (Bureau of Meteorology), 2012), a fully connected and directed spatial river network, which allowed for performing upstream and downstream routing operations through the river network based on the location of available river gauges (Fig. 2).As a result of this approach, several water bodies that were defined as floodplains by the static wetland layer were assigned to the non-floodplain instead of the floodplain category.Accordingly, the floodplain-lake category comprised all non-permanent lakes, for which the river network indicated hydraulic connectivity to a river gauge.Since SWE dynamics of reservoirs are mainly a function of respective management strategies, they were masked out for generating the surface water categories.
To enable similar modeling conditions across the study area and to identify local spatial patterns in the role of climate drivers (P , SM, ET) of SWE dynamics, we imposed a regular grid on top of the EH-zonation (Fig. 2c).Although the imposition of a regular grid at times led to a less than ideal spatial segmentation of the river and floodplain structure, it allowed us to quantify the relationship between SWE and hydrologic key parameters at a much finer scale compared to using only the regional EH-zonation.Another advantage of using a regular grid was that it was straight forward to implement at sub-continental or even global scales, and did not require a comprehensive analysis of the river structure.In re- gards to defining a suitable grid cell size, a finer grid leads to a decrease in the fraction of cells that contain any floodplain area and consequently also to an increase in cells that contain very small fractions of floodplain.We chose 10 km as the most suitable cell size for modeling as a trade-off between sufficient spatial detail for capturing the variability in SWE at a local scale and the suitability and spatial resolution of data for modeling the driver variables.

Statistical modeling of surface water extent dynamics 2.4.1 Data pre-processing
In order to prepare geospatial time series data sets for statistical analysis and modeling, we summarized all data sets based on the grid cells and surface water categories.For all spatially explicit driver variables (i.e., ET, SM, P ), we developed numeric daily time series per grid cell by computing spatial averages for each day.For the dependent variable, we derived a time series of surface water area for each surface water category in each grid cell from the SWE time series.Even though a cloud cover threshold of 50 % per Landsat scene was already applied for generating the SWE time series, there could still be excessive cloud cover of up to 100 % in individual 10 km × 10 km grid cells.To preserve a maximum number of valid surface water observations while maintaining acceptable levels of noise and uncertainty in the SWE time series, we therefore applied another cloud cover threshold of 40 % per individual grid cell.To account for scale effects resulting from different fractions of each surface water category in different grid cells, we used the fraction of SWE per cloud-free area of each category in a cell instead of actual areas (Table 3).For data management and modeling purposes, all data were stored and handled in time series format using the zoo infrastructure for regular and irregular time series (Zeileis and Grothendieck, 2005) in R (R Development Core Team, 2008).

Model development and specification
To better understand SWE dynamics and its drivers across the study area, we developed dynamic multiple linear regression models with surface water as the dependent variable and four predictor variables (P , SM, ET, and river-flow data; Table 3) for each surface water category per grid cell.One of the key objectives was to take advantage of each time step of the SWE time series and to model surface water extent continuously through periods of flooding and drying.We achieved this by including the previous SWE observation as an addi-  tional predictor variable into the model equation.Models in which a lagged-dependent variable is used as an additional predictor variable are referred to as dynamic linear regression models or lagged-dependent variable (LDV) models and are commonly used for the analysis and forecasting of time series data in economics (Keele and Kelly, 2006;Shumway and Stoffer, 2006).The lagged-dependent variable introduces a temporal component into the model, so that the SWE at a given time step is also a function of the SWE of the previous time step in the time series.The equation that includes all potential predictor variables used for modeling SWE on floodplains and floodplain lakes is shown in Eq. ( 1).
where SWE t is the surface water extent at time t, SWE (t−1) is the surface water extent of the previous available Landsat observation (t − 1), Lag(Q) is the discharge at the related gauge lagged by the time it takes for a flood to travel from the gauge to the respective modeling cell, P is local rainfall, ET is evapotranspiration, SM is soil moisture, and e is the error term.The equation used for modeling SWE dynamics on non-floodplain areas is the same as Eq. ( 1) but without discharge as a predictor variable, given that those areas are not connected to a river.We used the dynlm (Zeileis, 2014) R package for dynamic linear modeling and time series regression for implementation and analysis of dynamic models based on the numerical input time series.To account for the time that it takes for water to travel from the gauge where it is recorded to a modeling cell, discharge is incorporated into the equation by applying a lag time.This lag time for discharge (Q lag) is the modeled timing between daily flows measured at the gauge and the correlated satellite-observed inundation response of a downstream or upstream grid cell.After quantification of Q lags, we used a 10-day moving average of discharge instead of daily flows in order to match the inter-daily variability and dynamics of the discharge time series with the dynamics of the SWE time series (16-day time step).For local rainfall and actual ET we used the sum of 16 days before each Landsat observation (including the day the image was taken) because P and ET that occurred more than 16 days before were already accounted for by the previous SWE observation.Compared to ET and P , a previous 16-day moving average of SM before each Landsat observation was used, since SM characterizes the condition of the floodplain or land surface rather than input or output of water to the system such as ET and P .By transforming daily time series of the predictor variables into moving averages and sums, we smoothed out the high inter-daily variation of these variables, which is not reflected in the SWE time series and thus not suitable for explaining the variation of SWE in modeling.The final approach for including the driver variables into the models was based on our conceptual understanding of SWE dynamics, where the SWE on a given observation is a function of river flow during the observation, the SWE of the previous observation and changes in P , ET, and SM between the previous and current observation (Table 3).

Variable selection and model validation
We used the coefficient of determination (r 2 ) as a measure of how well the variability in SWE was explained by the models.For quantifying the relative importance of the predictor variables on SWE, we tested whether accounting for P , ET, and SM leads to an improvement in the predictive performance of models.We used the root mean squared error (RMSE) in 5-fold cross-validation (CV) (hereafter referred to as CV-RMSE) to quantify predictive performance of models.The RMSE is the root of the mean of the squared residuals of the prediction and is in the same unit as the dependent variable, which ranges between 0 and 1 (Table 3).For 5-fold CV, we split the data into five equally sized chrono-logical subsets.We then fitted the model to four subsets of training data and used it to predict the remaining subset as test data.We repeated this process for the other four constellations of training and test data and the CV-RMSE was calculated by averaging the RMSE of all five predictions.The variable selection process was implemented in R by using the cross-validation tools for regression models (cvTools) package (Alfons, 2012).
We used step-wise variable selection to find the variable combination that leads to the best predictive performance as measured by CV-RMSE.Since we considered Q and P as the key drivers of SWE dynamics in floodplains and nonfloodplains, respectively, the initial models included only Q as a predictor for floodplain and floodplain-lake models and only P for non-floodplain models.The base model is the initial model after adding the LDV as a predictor variable.Based on the order of predictor variables as given by Eq. ( 1), we then added one variable at a time to the base model and kept the variable in the model if it led to a reduction in CV-RMSE.Hereby, a very small improvement in CV-RMSE was sufficient for including a variable into the model.If adding a certain variable did not lead to an improvement in CV-RMSE, this variable was no longer considered in the variable selection process.The predictor variables that were selected based on this process are hereafter referred to as additional predictor variables (APV).The model that is obtained after adding the APV to the base model is hereafter referred to as the final model.

Quantification of flow lag times
Preliminary analysis of available gauges and their suitability for modeling showed that there are often multiple suitable gauges for grid cells that contain floodplain or floodplainlake areas and that large EH-zones with multiple major inflows and outflows require more than one gauge to model all floodplain cells.Therefore, we considered all available gauges with suitable discharge data and used the river network to select the most suitable gauges for modeling.We then quantified the most suitable lag time for discharge by iterating through a variety of positive and negative Q lags at 5-day intervals and selected the one that led to the highest correlation between discharge at the gauge and surface water extent on the respective cell.We selected 5-day intervals to account for the fact that there is no exact lag time for each cell because flow travel times are a function of discharge and overbank flow, so that elevated discharge values during times of flooding are likely to result in different flow travel times compared to low flows (Overton et al., 2006).For each possible lag time, we developed a simple linear regression model with surface water area as the dependent variable and lagged discharge as the predictor variable and used the adjusted r 2 as a measure for correlation.The lag that led to the highest r 2 was then assigned to each cell and used for modeling SWE on floodplains and floodplain lakes.For floodplain lakes, the estimation of discharge lag times was more difficult, since the SWE on these surface water bodies is only correlated to discharge on the rising side of the flood hydrograph, whereas the draining of these water bodies is primarily driven by infiltration and evapotranspiration.To account for this, we used only increasing SWE observations (observations of surface water that were higher than the previous observation) for quantifying Q lags for floodplain lakes.

Case study and experiment design
One of the main objectives of this study was to develop a spatial modeling framework that enables capturing SWE dynamics on a local scale, while being applicable to large (i.e., sub-continental) and heterogeneous areas.We selected three sub-regions (Fig. 3) across the study area for illustrating the modeling and analysis results.Based on the climate characteristics of the MDB (see Sect. 2.1), we expected SWE dynamics and the role of the predictor variables to differ substantially amongst the three sub-regions.
Each of the three sub-regions contains important floodplain wetland systems that are listed under the Directory of Important Wetlands in Australia (Environment Australia, 2001).The Paroo sub-region comprises large parts of the Paroo river system, which together with the neighboring Warregoo River is considered the last of 26 major rivers in the MDB without large dams and diversions and consequently little or no manipulation of the natural flow and inundation regimes (Kingsford et al., 2001).These river systems experience semi-arid to arid climate conditions and have a flow regime typical for dryland rivers, which is characterized by extreme variability with long dry spells, punctuated by large and unpredictable flood events and more frequent flow pulses that lead to in-channel flows without achieving floodplain inundation (Bunn et al., 2006).Another important feature of the Paroo river system is that it is predominantly a terminal river system that has only connected with the Darling River a few times in recorded history through the Paroo overflow (Fig. 5).
The Murray sub-region covers the lower Murray River and its adjacent floodplains along a ∼ 350 km stretch, starting from the location where the Darling River merges into the Murray River.This section of the Murray River is highly regulated by a sequence of locks, weirs and storage facilities.The Murrumbidgee sub-region covers a ∼ 300 km stretch of the Murrumbidgee River and its adjacent floodplains as well as parts of the mountainous runoff-generating catchment area.Both the Murray and the Murrumbidgee subregions are highly regulated and contain areas of irrigated agriculture, which is likely to have a pronounced impact on SWE dynamics.
In the southern part of the MDB including the Murray and the Murrumbidgee sub-regions, floods naturally occur in winter and spring as a result of reliable rainfalls and snowmelt and typically last for several months (Penton and Overton, 2007).In the northern part of the MDB including the Paroo, floods typically occur in the summer months as the result of increased rainfall activity in the corresponding catchment areas during this time of the year.Floodplain inundation in the Murray sub-region is largely driven by discharge generated by rainfall in distant upstream catchments.In comparison, the Paroo and the Murrumbidgee sub-regions are located close to runoff-generating catchment area; thus, local rainfall will likely have a more pronounced effect on SWE dynamics as compared to the Murray.The runoff and climate characteristics differed substantially among the three sub-regions (Table 4).

Flood propagation and flow lag times
There was more than one available river gauge in all three sub-regions (Table 5), which allowed us to validate Q lag estimates based on a comparison of flow data from two gauges in the same river reach.For each pair of validation gauges, we randomly selected four floods of different magnitudes that had a single pronounced flood peak and calculated the time difference between the day of occurrence of the flood peak at the upstream and downstream gauge (Table 5).
Q lags for the Murray sub-region were modeled for flow data of two different gauges (Fig. 4a, b).We found that despite the long length of the river reach, using one gauge for all floodplain modeling cells of the sub-region led to realistic estimates of Q lags.Both the most upstream (2-A) and most downstream located gauge (2-B) led to a gradual increase and decrease in Q lags, respectively, along the 350 km reach in accordance with a validated lag time of 12 days between gauge 2-A and 2-B (Table 5).Despite the overall realistic flow propagation pattern for both gauges, there were a few outlier cells, for which Q lags were different as compared to the average Q lag of neighboring cells.For floodplain lakes, we expected Q lags to be in the same range as the Q lags of the surrounding floodplains, with potential minor differences resulting from the larger volume and slower filling behavior of these water bodies compared to shallow floodplains.For both gauge 2-A and 2-B, Q lags were in good accordance with the Q lags of the surrounding floodplains for some floodplain lakes (e.g., lakes around Lake Limbra) but deviated substantially for others (e.g., lakes around Lake Woolpolool and Lake Wallawalla) (Fig. 4a, b).
Due to the complexity of the river and floodplain network in the Paroo, the usage of a single gauge was found to be insufficient for modeling all floodplains across this sub-region.The Paroo River receives lateral inflows from the Warregoo River, which first passes the large Yantabulla swamp before merging into the Paroo further downstream (Fig. 5).The Kerribree creek is a second lateral inflow into this sub-region from the Warregoo River but is likely not connected to the Paroo River (Timms, 2009).To account for these lateral inflows, Q lags of the Paroo sub-region were modeled using two gauges.The main branch of the Paroo River (west of line L-A in Fig. 5a) was modeled using gauge 1-C (Fig. 5a).Similar to the Murray, Q lags in the area of the gauge (1-C) were 0 days and showed a gradual increase and decrease in downstream and upstream directions, respectively, with the exception of a patch of connected floodplain area in the east of the southernmost floodplains.The validated flow lag time of 9 days between river gauge 1-B and 1-E (Table 5) was captured well by the model with Q lags consistently approaching −5 days in floodplain areas around gauge 1-B and 5 days around gauge 1-E.In the most downstream located floodplains of this sub-region, Q lags approached the upper limit of 60 days that was defined for modeling.The lateral inflows from the Warregoo River to the Yantabulla swamp and Kerribree creek (east of line L-A in Fig. 5a) were modeled using Gauge 1-B, for which records only date back to 1991.Despite the limited temporal coverage of this gauge, Q lags for the northern inflows represent a realistic pattern, with a noticeable abrupt increase of about 5-10 days along the passage through the Yantabulla swamp.The increase in Q lags from 0 to 5 days occurs in the area of validation gauge 1-D, which is in accordance with the validated flow lag time of 4 days between gauge 1-A and 1-D (Table 5).For floodplains and floodplain lakes of the Kerribree creek, automated quantification of Q lags did not lead to realistic results as indicated by negative Q lags.
In the Murrumbidgee sub-region, quantification of Q lags was based on two gauges along the reach (3-A and 3-B) and led to realistic flow propagation patterns up to a point where the floodplains divert into two branches (point A in Fig. 6).Before point A, Q-lags were in good accordance with the validated flow lag time of 5 days between gauges 3-A and 3-B and between 3-B and 3-C (Table 5).Downstream of point A, Q lags abruptly approached the pre-defined upper limit of possible lag times consistently for the remaining floodplain cells.

Model performance
In order to quantify the relative importance of the LDV and APV for predicting SWE, we compared the CV-RMSE of the initial, base and final models as defined in Sect.2.4.3 (Table 6).Since the majority of water bodies in all three sub-regions are floodplains, this section is mainly focused on this surface water category.For all three sub-regions, adding the LDV to the initial model of the floodplain category yielded large improvements in CV-RMSEs, with an average improvement of 81 % ( CV-RMSE = 0.19) for the Paroo, 81 % ( CV-RMSE = 0.16) for the Murray, and 87 % for the Murrumbidgee ( CV-RMSE = 0.10).In comparison to that, adding the APV (i.e., P , ET, SM) to the base model, only led to further CV-RMSE improvement of 5.2 % ( CV-RMSE = 0.0029) for Paroo, 0.3 % ( CV-RMSE =   0.0001) for the Murray and 0.8 % ( CV-RMSE = 0.0003) for the Murrumbidgee.Since the RMSE is in the same unit as the dependent variable, it partly depends on the average magnitude of the SWE ratio on local floodplain units and is consequently not suitable for comparing model performance among the three sub-regions but only within each sub-region.
In comparison to the RMSE, r 2 is independent of the magnitude of the dependent variable and thus suitable for comparing model performance between the sub-regions.The r 2 of initial floodplain models is a measure of how well SWE dynamics are explained by river flow as the only predictor variable after accounting for Q lags.The average r 2 of initial models was much higher in the Murray sub-region with a value of 0.58 as compared to the Paroo (0.33) and Murrumbidgee (0.25).Despite this large difference in r 2 of initial models, average r 2 of floodplain models in the Paroo increased to the same level as the Murray (0.67) after accounting for the LDV (0.62) and APV (0.67).For the Murrumbidgee, the performance of final floodplain models was much lower compared to the other two sub-regions with an average r 2 of 0.41.The average r 2 of all 363 floodplain models across the three sub-regions was 0.63.Similar to the CV-RMSE, accounting for the APV only led to small further improvements in model r 2 compared to the large improvements resulting from the LDV.Analysis of the spatial distribution of r 2 -based performance of final models showed that there were no distinct patterns for floodplain and floodplain-lake models of the Murray (Fig. 4c), Paroo (Fig. 5a), and Murrumbidgee sub-regions (results not shown).
The average r 2 of final models of the floodplain-lake category was 0.69 for the Paroo, 0.68 for the Murray, and 0.47 for the Murrumbidgee.For the non-floodplain category, average r 2 of final models was 0.42 for the Paroo (Fig. 5b), 0.27 for the Murray (results not shown) and 0.32 for the Murrumbidgee (results not shown).In comparison to floodplain and floodplain-lake models, r 2 -based model performance of non-floodplain models showed distinct spatial patterns in the Paroo with predominantly high r 2 (> 0.6) for all modeling cells that contained surface water bodies that are not hydraulically connected to the modeling gauges (see nonfloodplain waterbodies in Fig. 5b).
The large differences in performance of initial and final models of the floodplain category were mainly a result of the distinct relationships between remotely sensed SWE on local floodplain units and river flow in each of the three subregions (Fig. 7).In the Paroo, most floodplain areas showed SWE time series with similar temporal patterns to example cells Ex-A and Ex-B (locations shown in Fig. 3), which were characterized by long dry periods and short but intense flood pulses.Both example cells are located in the area of overlapping Landsat satellite paths resulting in approximately double the number of SWE observations and a shorter modeling time step compared to the Murray and Murrumbidgee example cells.Most flood pulses were captured well by the SWE time series in both cells but example cell Ex-B (Q lag = 25 days) illustrates that the river-flow data recorded at the gauge becomes less representative for SWE dynamics with increasing distance to the gauge.For cell Ex-B, the SWE Table 6.Average CV-RMSE and r 2 of initial, base, and final models of the floodplain category of the three sub-regions (definition of initial, base, and final models is given in Sect.2.4.3).

Sub-region
Paroo Murray Murrumbidgee time series showed prolonged inundation of floodplain areas for several months after river flow at the gauge has returned to dry conditions resulting in a much lower r 2 of the initial model of example cell Ex-B (0.45) as compared to example cell Ex-A (0.83).After accounting for the LDV, the r 2 of example cell Ex-B increased to 0.79, which illustrates the importance of the LDV in the final models.Due to the large size of the Paroo sub-region, SWE dynamics on many of the floodplain areas differed from the dynamics of river flow at the modeling gauge as in example cell Ex-B, which explains the low average r 2 of initial floodplain models in this sub-region.In the Murray, most of the floodplain areas showed similar SWE dynamics to the example cell for which the SWE time series closely resembles the dynamics of river flow whenever a certain minimum flow threshold was exceeded in the river.Inundated area reduces quickly when river flow recedes to pre-flood levels, which explains the comparatively high initial r 2 of the example floodplain model (0.74).Out of the three sub-regions, the Mur-rumbidgee was most affected by cloud cover during times of flooding due to its proximity to mountainous headwater catchments.Consequently, SWEs during times of flooding were not represented well in the SWE time series (i.e., missing time steps) for the majority of floods resulting in a low initial r 2 of 0.39.In comparison to the Paroo, accounting for the LDV did not lead to large improvements in model performance, with r 2 of the final model of 0.79 (7 %) in the Murray and 0.43 (10 %) in the Murrumbidgee example cells.

Predictor variable combinations and spatial patterns
Based on the results of the variable selection process, we calculated the percentage of models in each sub-region for which inclusion of APV led to improvement of CV-RMSE (Table 7).On average, P and SM were more important for explaining SWE dynamics on floodplains in the Paroo as compared to the Murray and Murrumbidgee sub-regions, for which P and ET were selected for about 50 % of all floodplain models.In general, local rainfall was the most influential APV and helped to explain SWE dynamics in 241 (66 %) out of 363 floodplain models and 41 (89 %) out of 46 floodplain-lake models.ET was selected for approximately half of all floodplain models in the Murrumbidgee and Murray sub-regions and 36 % of models in the Paroo.For floodplain-lake models, local rainfall and evapotranspiration were more important in the Paroo compared to the Murray sub-region.There were only four modeling cells that contained floodplain-lake area in the Murrumbidgee sub-region so that a comparison with the other two zones was of limited value for this category.For non-floodplain models, we found a similar pattern for actual ET and SM for all three sub-regions with both variables being selected for one-third to one-half of all models with an exception in the Murray, where SM was only selected for 11 % of the models.
In comparison to r 2 -based model performance, there were distinct spatial patterns for the contribution of the LDV and the APV to predictive performance as measured by CV-RMSE of floodplain and floodplain-lake models.For in- stance in the Paroo, the LDV led to the largest improvement in CV-RMSE in areas with poor predictive performance (high CV-RMSE) of the initial models such as the area of the Yantabulla swamp, the most downstream-located dead-end floodplain area and the floodplains of the Kerribree creek (Fig. 8a, b).The Kerribree creek along with the Yantabulla swamp also showed the highest contribution of the APV for explaining SWE dynamics (Fig. 8c).The floodplains of the Kerribree creek are a good example showing that previous floodplain conditions (i.e., LDV) and local climate drivers (i.e., APV) explain the variability in SWE (i.e., models with r 2 > 0.6, Fig. 5) for floodplains on which SWE dynamics are poorly correlated with river flow from available gauges as indicated by initially high CV-RMSE (Fig. 8a).In general, the magnitude of the improvements in CV-RMSE resulting from the APV was small compared to the improvements resulting from including the LDV.
The complex nature of the relationship between SWE and its drivers over time was identified by analyzing crosscorrelations and lag effects of the time series used for modeling.The correlation coefficient indicates how variation of one variable over time is explained by a second variable.We computed this coefficient for a range of positive and negative lag times between different pairs of input time series.For the floodplain area in the Paroo example cell Ex-A (location shown in Fig. 3), SM was partly driven by P (maximum correlation coefficient 0.43 at a lag of 3 days), indicated by short-term peaks in the otherwise strongly seasonal SM signal during larger rainfall events (Fig. 9a).We found the highest cross-correlation (maximum correlation coeffi-cient of 0.84 at a lag of 3 days) between river flow and SWE (Fig. 9b).
This floodplain example illustrates the importance of the hydrologic condition of the floodplain as well as the local climate conditions before, during and after flooding.The three consecutive floods starting in 2000 (referred to as the "2000 floods" in Fig. 9) led to prolonged inundation of parts of the floodplain for an entire year until another minor flooding occurred shortly before the end of 2000 (referred to as the "end of 2000 flood" in Fig. 9).In comparison to that, the flood that occurred after 3 dry years in 2004 (referred to as the "2004 flood" in Fig. 9) had a higher peak flow in the river and a larger maximum surface water extent but only caused short-term flooding of the floodplain, which rapidly returned to dry conditions after the flood had passed.Comparison of SM and ET and their respective trend components (Fig. 9c) showed that both parameters were significantly higher during the time period of the 2000 floods than in the time period of the 2004 flood (see trend components in Fig. 9c).ET had a distinct seasonal component with a peak during the summer months but it is also a function of water available for evapotranspiration, which is mainly governed by P and SM.The maximum correlation coefficient between ET and SM was 0.44 at a lag of 24 days and 0.53 at a lag of 26 days between ET and P .

Discussion
Globally, terrestrial surface water resources continue to suffer degradation and decline (Costanza et al., 2014;Jones et al., 2009) and there is currently a new boom in hydro power development, expected to reduce the number of remaining free-flowing large rivers by 21 % (Zarfl et al., 2015).As a result of this trend, there is a need for balancing the socioeconomic demands on water resources the requirements of the environment for maintaining ecological health of terrestrial freshwater ecosystems.In absence of dense in situ monitoring networks, time series of Earth observation data combined with adequate modeling techniques represent a promising tool for quantifying water resources across large and heterogeneous river basins, which is needed for balancing these competing demands.In this study, we used dynamic linear regression to model SWE through cycles of flooding and drying as a function of key hydrological drivers.We developed a spatial modeling framework that allowed us to account for the unique SWE dynamics of hydrologically distinct floodplain, floodplain-lake, and non-floodplain surface water bodies.Our empirical inundation models are seasonally continuous and suitable for predicting surface water extent and retention on each modeling cell based on locally quantified combinations of predictor variables.

Model selection
The appropriate form of a model depends on its specific objectives (Bennett et al., 2013).The main objective for developing SWE models was to identify distinctive patterns in the role of lagged surface water observations and predictor variables for SWE dynamics across space and time.
We chose a dynamic modeling approach because including the previous SWE observation as an explanatory variable allowed us to account for the complex and spatially varying flooding and drying behavior of different water bodies.Slow flooding and drying behavior of periodically inundated waterbodies leads to an increase in autocorrelation in the SWE time series, resulting in a higher probability of a large inundated area at time t, if there was a large inundated area on the previous time step (t − 1).By including the LDV into the models, we were able to overcome the limitations of existing event-based approaches for modeling SWE, in which observations during the falling limb of floods, which likely contain water from the previous observation, were not used for model development (see event-based models in Table 1).Additionally, Keele and Kelly (2006) suggested that in most cases, LDV models are more appropriate than static models if a process is known to be dynamic, meaning that the process at time t is a function of history as modified by new information, which applies to SWE dynamics.Furthermore, our findings indicate that the degree of improvement in CV-RMSE resulting from accounting for the LDV yields information about the flooding and especially drying behavior of waterbodies.Large improvements in predictive performance as measured by CV-RMSE resulting from accounting for the LDV were commonly linked with poor predictive performance of initial models and surface water bodies that tend to retain floodwater for prolonged periods of time.In both cases, including the LDV played an essential role in achieving good model performance across each sub-region.Based on these considerations, this study illustrates that a dynamic modeling framework has a variety of advantages compared to a static modeling approach for modeling SWE through flooding and drying cycles and over long periods of time.
One of the drawbacks of using a LDV model is that the resulting models are limited to forecasting SWE for one Landsat time step (commonly ≥ 16 days), since the previous SWE observation needs to be specified as part of the predictive model equation.Nevertheless, the resulting models are still of high practical relevance, considering that a 16-day forecast of SWE across the Murray-Darling Basin is valuable for many water management applications.Additionally, the models can be used for estimating missing time steps in the SWE time series resulting from discarding images with excessive cloud cover.

Quantification of flow lag times
In previous studies, one gauge was commonly used for synchronous inundation modeling over comparatively large floodplain units without applying lag times (Table 1).Our results illustrate that the continuous Landsat-based SWE time series yields sufficient information for estimating Qlag times and consequently the spatio-temporal dynamics of SWE on a finer than sub-basin scale.By classifying water bodies into three categories and using a regular grid of 10 km × 10 km cells, we were able to quantify the local combination of predictor variables on individual grid cell level and to apply a tailored modeling approach to surface water bodies with distinctive SWE dynamics.
Our results show that for floodplain and floodplain-lake areas, accurate quantification of Q lags is an essential step for developing SWE models with good (r 2 > 0.6) predictive performance.For all three sub-regions, automated Q-lag estimation led to realistic lag time patterns, as confirmed by validation at different gauging stations, with gradual increase or decrease of Q lags in downstream and upstream direction away from the gauge.There were, however, a number of scattered outlier cells, for which Q lags deviated from the average Q lag of neighboring cells.These outliers were likely a result of the large differences in SWE dynamics from cell to cell as a result of river regulation and variable floodplain geometries.Furthermore, the available data sets did not allow for an exact distinction between permanent and nonpermanent floodplain lakes.Permanent and semi-permanent lakes that are connected to a river have a less pronounced discharge to SWE relationship and slow draining behavior compared to other floodplain areas.The Q-lag relationship was therefore more difficult to model as reflected in more frequent occurrence of outliers for floodplain lake compared to floodplain Q lags.For future applications, these outliers could be eliminated by only allowing increasing lag times for modeling cells along the reach (in downstream direction) or by averaging the Q lag for each cell based on the Q lags of the surrounding cells.
In the most southerly located floodplain areas of the Paroo sub-region, Q lags increased abruptly and eventually approached the upper limit of 60 days that was defined for modeling.This is likely a result of the hydrologic characteristics of this terminal river system.Due to the large extent of this sub-region and the abundance of vast and shallow floodplains, the flow pattern recorded in the Darling River in the north of the sub-region changes significantly when flowing downstream and filling up vast floodplains.As a result of this, flow travel times from gauge 1-C (most southerly gauge with flow data covering the analysis period) to far downstream located floodplain cells are assumed to be strongly dependent on the magnitude and duration of floods and thus difficult to quantify from the data.The unrealistic Q lags of the Kerribree creek floodplains (Fig. 5a) as well as a patch of connected floodplain area in the east of the southernmost floodplains are most likely a result of the fact that due to the abovementioned drastic changes to the flow regime, the river-flow data of the modeling gauge were not representative of SWE dynamics on these areas.There were also few surface water observations of large magnitude in the east of the southernmost floodplains during the analysis period, which imposed difficulties for quantifying flow travel times from the gauge to these areas.
The abrupt increase in Q-lags downstream of point A in the Murrumbidgee sub-region (Fig. 6) was not reflected in the flow data (see validation in Table 6), indicating that our approach for automated estimation of Q lags did not work in this area.This was likely a result of the high level of river regulation in this area as illustrated by the large areas of irrigated agriculture in the close proximity of this reach and the resulting drastic changes to the flow regime in downstream direction from the gauge.

Advances in modeling surface water extent dynamics from space
One of the key objectives of this study was to develop a highly automated, data-driven top-down approach for modeling SWE dynamics.Most existing studies of satellite-based empirical inundation modeling (Table 1) required extensive site analysis, manual image and data selection and tailoring of the modeling approach to local site characteristics.While these approaches are likely to result in improved inundation models for the local floodplain system for which they were developed, they are not applicable across large river basins with highly variable SWE dynamics across space and time.Consequently, they would not be suitable for modeling SWE on sub-continental scales, where complex parameterization of site characteristics is not feasible.Existing studies on large-river basin or sub-continental scale provide empirical inundation models linked to higher levels of uncertainty because variables are often averaged for comparatively large, yet potentially hydrologically complex spatial entities.Such models may potentially neglect fine-scale variations in SWE, which could lead to high levels of uncertainty in the results, poor model performance or unexpected model behavior.Nevertheless, modeling SWE dynamics on sub-continental scale requires a simplification of the process to a certain degree and segmentation of the study site into adequate spatial units.By providing a novel approach for balancing the local complexity of SWE dynamics with the requirements and limitations of modeling surface water processes on large river basin scale, this study is filling the gap between existing approaches of large-scale and high level of uncertainty and finer-scale and complex site analysis.The key feature of this approach was the spatial modeling framework, which allowed us to model cell based on automatically quantified flow travel times and to capture SWE dynamics on a local scale across a large and heterogeneous river basin.Additionally, the spatial modeling framework allowed us to apply a tailored modeling approach to surface water bodies that are not hydraulically connected to river systems.This surface water category was modeled as a function of local rainfall as the key driver rather than river flow.Average r 2 of final models of this surface water category was much lower for all three sub-regions compared to the other two categories.Similar to the floodplain and floodplain-lake categories, the LDV played an important role in achieving uniform model performance in non-floodplain models.One of the reasons for the low average r 2 of non-floodplain models was our definition of this category, which comprised all remaining land surface areas that were not assigned to the floodplain and floodplain-lake categories.We used this definition to capture inundation of land surface areas that are not defined as water bodies by the static wetland layer, which could be caused by intense local rainfall events.As a result of this definition of the non-floodplain category, many of the non-floodplain modeling cells did not have sufficient surface water observations for fitting a model equation.These findings indicate that a definition, in which the non-floodplain category is limited to dynamic surface water areas rather than the entire remaining land surface, could be more suitable for modeling this category.The SWE time series used in this study quantifies the spatial and temporal distribution of surface water over 25 years at 30 m pixel size and can thus be used to characterize the non-floodplain category.
The large size of the study area introduces some limitations for modeling SWE with Landsat-based time series.
Overlapping satellite paths and discarding of images with excessive cloud cover led to varying sample densities across the study area and sometimes across single EH-zones.The MDB expands across eight Landsat paths from east to west, which resulted in seven overlapping paths with a width of approximately 40 km.In these areas of overlapping satellite paths, the number of observations was approximately twice as high as in the remaining areas resulting in variable temporal density of the SWE time series used for modeling across the study area.Discarding images or parts of images with excessive cloud cover introduced gaps into the otherwise regular time series of SWE (every 16 days).Since these missing time steps in the time series were not explicitly accounted for during the fitting of the models, these gaps led to inclusion of SWE observations into the models that were sometimes several 16-day time steps apart from the current observation.

Role of the local climate for surface water extent dynamics
Another key objective of this research was to analyze the role of APV (i.e., P , ET and SM) in SWE dynamics, based on integration of spatial time series of rainfall, near-surface soil moisture and actual evapotranspiration.We used dynamic linear regression and 5-fold cross-validation for performing a data-driven analysis of the role of these driver variables across space and time.Since we developed a highly automated and data-driven approach for modeling SWE dynamics, we did not consider modeling SWE processes based on simplified water balance models as done in other studies on a smaller-scale (Table 1, study no.9, 10, 11).Instead, we performed a variety of exploratory analysis (e.g., Figs. 7, 9) to get a better understanding of the relationships between variables and to find the most suitable representation of each variable in the models (Table 4).Even though analysis of cross-correlations indicated that for some variables the correlation with SWE is the highest after applying relatively large lag times, we did not apply lag times for any of the APV for modeling and variable selection.Instead, the final modeling approach was based on our conceptual understanding of SWE dynamics, for which the SWE at time t, should be a function of river flow at time t, the SWE at time (t − 1) and the changes in P , ET, and SM between (t − 1) and t.Using a 10-day moving average for river flow helped to account for the fact that especially for modeling cells located far away from a gauge, Q lags are an approximation and partly depend on the magnitude of floods so that actual daily values become less suitable for modeling SWE dynamics.
Cross-validation is commonly used for variable selection and the high value of predictive modeling for explanatory purposes and capturing patterns and relationships in rich data sets is widely recognized (Arlot and Celisse, 2010;Bennett et al., 2013;Shmueli, 2010).We found that local rainfall was the most important additional driver variable across all three www.hydrol-earth-syst-sci.net/20/2227/2016/ Hydrol.Earth Syst.Sci., 20, 2227-2250, 2016 sub-regions and that the APV were more important for modeling and predicting SWE in the Paroo sub-region compared to the other two sub-regions.The Paroo is the largest, most arid and least regulated sub-region, which, in combination with its extensive shallow floodplains, might be the reason for the increased importance of the APV in this area.Based on our understanding of SWE dynamics, we expected actual evapotranspiration to help explain the variability in SWE particularly on slow draining surface water areas such as the floodplains of the Paroo sub-region since these waterbodies mainly drain as a result of infiltration and ET.Despite these assumptions, ET was the least influential local climate variable in the Paroo.One of the reasons for this may be that the ET data are modeled output of a land surface and water balance model (Table 2).Even though this model is calibrated against streamflow and partly accounts for surface water body dynamics, including inflows from runoff and discharge (Viney et al., 2014), model estimates of actual evapotranspiration might be of limited accuracy over extended dynamic surface water areas such as large floodplains.
In comparison to ET, the P and SM data sets are of observational nature.The rainfall data set was generated based on spatial interpolation of point measurements and the SM time series was generated based on remotely sensed data sets.The active and passive microwave data used for generating the SM time series is expected to be largely influenced by extended open water bodies, including inundated floodplains as well as by dense vegetation cover (Liu et al., 2012;Ye et al., 2015).Consequently, this data set may be biased for floodplain areas during flooding, where large-scale inundation is likely to drive soil moisture estimates towards saturation so that the SM data are partly a direct observation of SWE.
Overall, the improvements in CV-RMSE after accounting for the APV were small compared to the improvements after accounting for the LDV in all sub-regions, which is likely a result of the complex relationship between SWE dynamics and the local climate drivers.This relationship is non-trivial to quantify due to the absence of spatially continuous information on inundation volumes and infiltration rates (missing variables) as well as resolution limits of the data sets used here.A conceptual water balance model (i.e., a dynamic hydrologic model), as an alternative modeling framework, capable of accounting for all fluxes of water into and out of each spatial modeling unit, would likely be limited by the same factors.Such a modeling approach might, however, be more suitable for capturing the variable effect of P and ET on SWE over time, resulting from the fact that a certain strong rainfall event is more likely to influence the SWE during times of inundation as compared to dry conditions.Alternatively, in an event-based modeling approach, the APV could be used to characterize the antecedent conditions of floodplains for each modeled flood, which are known to have a strong influence on the magnitude and duration of inundation on floodplains.Such an event-based modeling approach may be useful for certain applications but is not suitable for modeling SWE dynamics continuously through cycles of flooding and drying as done in this study.

Conclusions
In this study, we statistically modeled SWE over a period of 25 years across three large and hydrologically distinct sub-regions of the MDB.We developed a spatial modeling framework that allowed us to apply a tailored modeling approach to hydrologically distinct floodplain, floodplain-lake and non-floodplain areas and to quantify local driver combinations on the level of 10 km × 10 km grid cells.Based on this spatial modeling framework, we modeled SWE continuously through cycles of flooding and drying on 946 modeling cells across the three sub-regions.Automated quantification of flow lag times was a key step for modeling floodplains and floodplain lakes on the level of individual grid cells.Accounting for the LDV was important for achieving uniformly good model performance across the three sub-regions.Freely available spatial time series of P , ET, and SM allowed us to analyze the role of local hydrological and climatic conditions in explaining variability in SWE.Even though the contribution of local rainfall, evapotranspiration and soil moisture to the predictive performance of SWE models were small compared to the large improvements resulting from the LDV, these variables were important for achieving good model performance in a variety of hydrologically distinctive areas.Additionally, local rainfall was the main driver for modeling SWE on all surface water bodies that are not connected to river systems with available flow data.The models developed in this study provide unique insights into the inundation and retention behavior of surface water bodies and local driver combinations and can provide a valuable tool for improving water resource management in the MDB.Compared to the more complex framework of continental-or globalscale hydrodynamic models our modeling approach provides a data-driven and highly automated alternative for analyzing SWE dynamics.Future work will focus on applying the presented methodology to the entire MDB, providing a first basin wide and seasonally continuous inundation model.The spatial modeling framework is transferable to other large and heterogeneous river basins across the world provided that hydraulic connectivity of surface water bodies and river systems can be determined.

Data availability
The Landsat-based time series of surface water extent will be made freely available as part of the Australian Research Data Storage Infrastructure in accordance with the rules of the funding agency and embargo regulations of UNSW.The rainfall data product can be purchased from the Australian Bureau of Meteorology.The soil moisture time series Hydrol.Earth Syst. Sci., 20, 2227-2250, 2016 www.hydrol-earth-syst-sci.net/20/2227/2016/ -earth-syst-sci.net/20/2227/2016/Hydrol.Earth Syst.Sci., 20, 2227-2250, 2016

Figure 2 .
Figure 2. Overview of the analysis design and spatial modeling framework.Sources: river network (Commonwealth of Australia (Bureau of Meteorology), 2012), surface water categories derived from Kingsford et al. (2004), and eco-hydrological zonation(Huang et al., 2013).

Figure 3 .
Figure 3. (a) Eco-hydrological zonation, irrigation areas, and wetlands of the Murray-Darling Basin and surface water categories of the three sub-regions -(b) Paroo, (c) Murray and (d) Murrumbidgee -used for illustrating the abilities of the spatial modeling framework for modeling SWE dynamics on a local scale (per grid cell) across large river basins.Sources: surface water categories derived from Kingsford et al. (2004), eco-hydrological zonation (Huang et al., 2013), and irrigation areas (Bureau of Rural Sciences, 2008).

Figure 4 .
Figure 4. Model results of the Murray sub-region: (a) Q lags for river gauge 2-A, (b) Q lags for river gauge 2-B, and (c) r 2 of floodplain and floodplain lakes (final models).

Figure 5 .
Figure 5. Model results of the Paroo sub-region: (a) Q lags and (b) r 2 for the floodplain, and floodplain-lake surface water category (final models) based on river gauge 1-A (all cells east of line L-A) and 1-C (all cells west of line L-A), (c) r 2 of non-floodplain category (final models).

Figure 6 .
Figure 6.Model results of the Murrumbidgee sub-region: Q lags for river gauge 3-A (used for modeling the reach between gauge 3-A and 3-B) and 3-B (used for modeling all areas downstream of gauge 3-B).

Figure 7 .
Figure 7. Time series of surface water extent ratio on floodplains and discharge in the corresponding river gauge for four example grid cells for the period from 1992 to 1999.(Vertical green bars indicate time steps of the SWE time series; location of example grid cells is shown in Fig. 3.)

Figure 8 .
Figure 8.(a) CV-RMSE of the initial models of the floodplain and floodplain-lake surface water category in the Paroo sub-region and improvement in CV-RMSE after accounting for the (b) LDV (base models) and (c) APV (final models).

Figure 9 .
Figure 9. Relationship of variables used for modeling the floodplain area of the Paroo example cell Ex-A (location shown in Fig. 3) for the period from 1999 to 2006: (a) soil moisture and local rainfall, (b) discharge and surface water extent ratio, and (c) soil moisture and evapotranspiration with respective trend components (dashed lines).

Table 1 .
Overview of studies that modeled optical satellite-based observations of surface water as a function of river flow and other driver variables.

Table 2 .
Overview of spatial time series used for modeling SWE dynamics.

Table 3 .
Definition of model parameters per grid cell.
SWESurface water extent Fraction of SWE on cloud-free surface water category area km 2 km −2 Q Discharge 10-day moving average m 3 s −1 P Rainfall Previous 16-day sum mm 16 days −1 ET Evapotranspiration Previous 16-day sum (actual) mm 16 days −1 SM Soil moisture Previous 16-day average m 3 m −3

Table 4 .
Long-term average climate and runoff characteristics along with total, floodplain, and floodplain-lake area of the three sub-regions used for illustration of the spatial modeling framework.

Table 5 .
Official names, ID, and start dates of river gauges and data used for this analysis along with estimated average flow travel times between gauges used for validation of Q lags.

Table 7 .
The role of local climate variables for modeling SWE dynamics on floodplains, showing the percentage of grid cells in each sub-region where automated variable selection led to improvements in CV-RMSE and thus inclusion of each P , ET, and SM into the final models.