Introduction
Understanding the behaviour of extreme hydrological events and the ability of
hydrological modellers to improve the forecast skill are distinct challenges
of applied hydrology. Hydrological forecasts can be made more reliable and
less uncertain by recursively improving initial conditions. A common way of
improving the initial conditions is to make use of data assimilation (DA),
a feedback mechanism or update methodology which merges model estimates with
available real-world observations.
Data assimilation methods can be classified from different perspectives.
Traditionally, we distinguish between sequential and variational methods. The
sequential methods are used to correct model state estimates by assimilating
observations, when they become available. Examples of sequential methods are
the popular Kalman and particle filters.
The variational methods, on the other hand, minimize a cost function over
a simulation period, which incorporates the mismatch between the model and
observations.
A next distinction can be made between synchronous and asynchronous methods.
Synchronous methods, also called three-dimensional (3-D), assimilate
observations which correspond to the time of update. The ensemble Kalman
filter (EnKF) is a popular synchronous approach,
which propagates an ensemble of model realizations over time and estimates
the background error covariance matrix from the ensemble statistics.
Asynchronous methods, also called four-dimensional (4-D), refer to an
updating methodology in which observations being assimilated into the model
originate from times different to the time of update. The ensemble Kalman
smoother (EnKS) is a common example of an asynchronous method. The EnKS
extends the EnKF by introducing additional information by propagating the
contribution of future measurements backward in time. For the past, the EnKS
reduces the error variance compared to the EnKF. The EnKS
and EnKF are identical for forecasting (including nowcasting).
The essential difference between a smoother and a filter is that a smoother
assimilates “future observations”, while a filter assimilates “past
observations”. This implies that for operational forecasting purposes, we
need a filter rather than a smoother. A smoother can help improve the model
accuracy in the past (e.g. for re-analysis), but it does not help improve
forecast accuracy . Therefore,
introduced the asynchronous ensemble Kalman filter (AEnKF), which requires
forward integration of the model to obtain simulated results necessary for
the analysis and model updating at the analysis step using past observations
over a time window. The difference among the EnKF, EnKS and AEnKF is
schematized in Fig. .
showed that the formulation of the EnKS provides a method
for asynchronous filtering, i.e. assimilating past data at once, and that the
AEnKF is a generalization of the ensemble-based data assimilation technique.
Moreover, unlike the 4-D variational assimilation methods, the AEnKF does not
require any adjoint model. The AEnKF is particularly
attractive from an operational forecasting perspective as more observations
can be used with hardly any additional computational time.
Additionally, such an approach can potentially account for a better
representation of the time lag between the internal model states and the
catchment response in terms of the discharge.
Discharge represents a widely used observation for assimilation into
hydrological models, because it provides integrated catchment wetness
estimates and is often available at high temporal resolution. Therefore,
discharge is a popular variable in data assimilation studies used for model
state updating or dual state-parameter updating.
The Kalman type of assimilation methods were developed for an idealized
modelling framework with perfect linear problems with Gaussian statistics;
however, they have been demonstrated to work well for a large number of
different non-linear dynamical models . It remains
interesting to evaluate whether elimination of the non-linear nature of the
model updating can be beneficial. For example, introduced
the idea of a partitioned update scheme to reduce the degrees of freedom of
the high-dimensional state-parameter estimation of a distributed hydrological
model. In their study, the partitioned update scheme enabled them to better
capture covariances between states and parameters, which prevented spurious
correlations of the non-linear relations in the catchment response.
Similarly, decreasing the number of model states being perturbed and updated
was suggested by to increase the efficiency of the
filtering algorithm while conserving the forecast quality. Such an approach
was proposed especially for states with small innovations, which in their case
was mainly the soil moisture storage.
Illustration of the model updating procedure for the ensemble Kalman
filter (EnKF), the ensemble Kalman smoother (EnKS), and the asynchronous
ensemble Kalman filter (AEnKF). The horizontal axis stands for time,
observations (d1, d2, d3, d4) are given at regular intervals. The
blue arrows represent forward model integration, the red arrows denote
introduction of observations and green arrows indicate model update. The
magenta arrows represent the model updates for the EnKS and therefore go
backward in time, as they are computed following the EnKF update every time
observations become available. The green dotted arrows denote past
observations being assimilated using the AEnKF. The schemes for the EnKF and
the EnKS are after .
In this study we present a follow-up of the work of
, in which discharge observations were assimilated
into a grid-based hydrological model for the Upper Ourthe catchment in the
Belgian Ardennes by using the EnKF. Here we scrutinize the applicability of
the AEnKF using the same updating frequency (i.e. the same computational
costs) as in the previous study. To our knowledge this is the first
application of the AEnKF in a flood forecasting context. Firstly, the effect
of assimilating past asynchronous observations on the forecast accuracy is
analysed. Secondly, the effect of a partitioned updating scheme is scrutinized.
Material and methods
Data and hydrological model
We carried out the analyses for the Upper Ourthe catchment upstream of
Tabreux (area ∼ 1600 km², Fig. ), which is
located in the hilly region of the Belgian Ardennes, western Europe.
We employed a grid-based spatially distributed HBV-96
(Hydrologiska Byråns Vattenbalansavdelning) model with a spatial resolution of
1 km × 1 km and an hourly temporal resolution. The model
is forced using deterministic spatially distributed rainfall fields, which
were obtained by inverse distance interpolation from about 40 rain gauges
measuring at an hourly time step. Evaluation of the benefits of different
rainfall interpolation techniques was deemed beyond the scope of the study.
We used a method used in operational practice as this study is also oriented
towards operational benefits of asynchronous filtering. Additionally, there
are six discharge gauges (hourly time step) situated within the catchment,
some of which are used for discharge assimilation and some for independent validation.
For a more detailed description of the catchment and model structure and
definition of the hydrological states and fluxes we refer to
and to Fig. . Briefly, for each grid
cell the model considers the following model states: (1) snow (SN), (2) soil
moisture (SM), (3) upper zone storage (UZ) and (4) lower zone storage (LZ).
The dynamics of the model states are governed by the following model fluxes:
rainfall, snowfall, snowmelt, actual evaporation, seepage, capillary rise,
direct runoff, percolation, quick flow and base flow. The latter two fluxes
force the kinematic wave model. This routing
scheme calculates the overland flow using two additional model states, the
water level (H) and discharge (Q) accumulation over the drainage network.
Model parameterization is based on the work of and .
Topographic map of the Upper Ourthe (black line) including the river
network (blue lines), rain gauges (crosses), six river gauges (white circles
labelled with numbers: 1 – Tabreux, 2 – Durbuy, 3 – Hotton, 4 – Nisramont,
5 – Mabompré, and 6 – Ortho). Projection is in the Universal Transverse Mercator
(UTM) 31N coordinate system. After .
In contrast to , in the current study we employed
the HBV-96 model built within a recently developed open-source modelling
environment, , which is suitable for integrated
hydrological modelling based on the Python programming language with the
PCRaster spatial processing engine . The
advantage of using is that it enables direct
communication with , an open-source data assimilation toolbox.
provides a number of algorithms for model calibration and
assimilation and is suitable to be connected to any kind of environmental
model e.g..
The import and export of hydrological and meteorological data to the system
is done using the Delft Flood Early Warning System
(Delft-FEWS), an open-shell system for managing
forecasting processes and/or handling time series data. Delft-FEWS is
a modular and highly configurable system, which is used by the Dutch
authorities for the flood forecasting for the River Meuse basin (called RWsOS
Rivers), in which the Upper Ourthe is located. The current configuration is
a stand-alone version of RWsOS Rivers; however, it can be easily switched
into a configuration with real-time data import.
Data assimilation for model initialization
As stated in the introduction, we investigate the potential added value of
the asynchronous EnKF (AEnKF) as compared to the
traditional (synchronous) EnKF for operational flood forecasting. The
derivation of the AEnKF (Sect. ) is based on the equations
using the same updating frequency (i.e. same computational costs, different
number of observations) as for the EnKF (Sect. ), as among
others presented by .
Ensemble Kalman filter (EnKF)
First, we define a dynamic state space system as
$$ x_k = f\left(x_{k-1}, \theta, u_{k-1}\right) + \omega_k, $$
where $x_k$ is a state vector at time $k$, $f$ is an operator
(hydrological model) expressing the model state transition from time step
$k-1$ to $k$ in response to the model input $u_{k-1}$ and
time-invariant model parameters $\theta$. The noise term
$\omega_k$ is assumed to be Gaussian white noise
(i.e. independent of time). It incorporates the overall uncertainties in model
structure, parameters and model inputs.
Left: catchment discretization using a grid-based approach including
the channel delineation. Arrows indicate flow direction. Right: schematic
structure of the HBV-96 model for each grid cell. Model states are in bold
and model fluxes in italics after.
Second, we define an observation process as
$$ y_k = h\left(x_k\right) + \nu_k, $$
where $y_k$ is an observation vector derived from the model state
$x_k$ and the model parameters through the $h$ operator (in our case
the kinematic wave routing model generating discharge). The noise term
$\nu_k$ is additive observational Gaussian white noise with
covariance $R_k$. For spatially independent measurement errors,
$R_k$ is diagonal. Note that both the kinematic wave routing
model $h(\cdot)$ and the hydrological model $f(\cdot)$ exhibit non-linear behaviour.
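As a rough illustration of the state transition and observation equations above, the following sketch propagates a small ensemble over one time step and maps it to simulated observations. The linear reservoir `f`, the outflow operator `h` and all numerical values are hypothetical stand-ins chosen for brevity; they are not the HBV-96 or kinematic wave models used in this study, both of which are non-linear.

```python
import numpy as np

def f(x, theta, u):
    """Hypothetical state transition: linear reservoir with storage x [mm],
    recession constant theta [1/h] and rainfall input u [mm/h]."""
    return x + u - theta * x

def h(x, theta):
    """Hypothetical observation operator: reservoir outflow [mm/h]."""
    return theta * x

rng = np.random.default_rng(42)
N = 36                                    # ensemble size
theta, u = 0.1, 2.0                       # assumed parameter and forcing
x_prev = rng.normal(10.0, 1.0, size=N)    # ensemble of states at time k-1

omega = rng.normal(0.0, 1.0, size=N)      # model noise omega_k
x_k = f(x_prev, theta, u) + omega         # x_k = f(x_{k-1}, theta, u_{k-1}) + omega_k

sigma_obs = 0.1                           # assumed observation error std
nu = rng.normal(0.0, sigma_obs, size=N)   # observation noise nu_k
y_k = h(x_k, theta) + nu                  # y_k = h(x_k) + nu_k
```

In the actual setup, `x_k` would hold all grid-based model states and `h` would return discharge at the gauge locations.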
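After the model update at time $k-1$, the model is used to forecast model
states at time $k$ (Eq. ). The grid-based model states form
a matrix, which consists of $N$ state vectors $x_k$ corresponding to $N$
ensemble members:
$$ X_k = \left(x_k^1, x_k^2, \ldots, x_k^N\right), $$
where
$$ x_k^i = \left(\mathrm{SN}_{1:m}^i, \mathrm{SM}_{1:m}^i, \mathrm{UZ}_{1:m}^i, \mathrm{LZ}_{1:m}^i, H_{1:m}^i, Q_{1:m}^i\right)_k^{\mathrm{T}}. $$
$\mathrm{SN}^i$, $\mathrm{SM}^i$, $\mathrm{UZ}^i$, $\mathrm{LZ}^i$, $H^i$ and $Q^i$ are the HBV-96 model states
of the $i$th ensemble member (Sect. ), $m$ gives the number of grid
cells and $\mathrm{T}$ is the transpose operator. The ensemble mean
$$ \bar{x}_k = \frac{1}{N} \sum_{i=1}^{N} x_k^i $$
is used to approximate the forecast error for each ensemble member:
$$ E_k = \left(x_k^1 - \bar{x}_k, x_k^2 - \bar{x}_k, \ldots, x_k^N - \bar{x}_k\right). $$
The ensemble-estimated model covariance matrix $P_k$ is defined as
$$ P_k = \frac{1}{N-1} E_k E_k^{\mathrm{T}}. $$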
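The ensemble mean, the deviation matrix and the ensemble-estimated covariance defined above can be sketched in a few lines of NumPy. The array layout (one column per ensemble member) and the toy dimensions are assumptions made for this illustration only.

```python
import numpy as np

def ensemble_statistics(X):
    """Return the ensemble mean, the deviation matrix E and the covariance P.

    X has shape (n_state, N), with one column per ensemble member.
    """
    N = X.shape[1]
    x_mean = X.mean(axis=1, keepdims=True)   # ensemble mean
    E = X - x_mean                           # deviation of each member from the mean
    P = E @ E.T / (N - 1)                    # ensemble-estimated covariance
    return x_mean, E, P

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 36))                 # hypothetical: 6 state entries, 36 members
x_mean, E, P = ensemble_statistics(X)
```

For a distributed model the full matrix `P` is rarely formed explicitly, because only its products with the observation operator are needed in the update.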
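When observations become available, the model states of the $i$th ensemble
member are updated as follows:
$$ x_k^{i,+} = x_k^{i,-} + K_k \left(y_k - h\left(x_k^{i,-}\right) + \nu_k^i\right), $$
where $x_k^{i,+}$ is the analysis (posterior, or update) model state
matrix and $x_k^{i,-}$ is the forecast (prior) model state matrix.
$K_k$ is the Kalman gain, a weighting factor of the errors in model
and observations:
$$ K_k = P_k H_k^{\mathrm{T}} \left(H_k P_k H_k^{\mathrm{T}} + R_k\right)^{-1}, $$
where $P_k H_k^{\mathrm{T}}$ is approximated by the forecasted
covariance between the model states and the forecasted discharge at the
observing locations, and $H_k P_k H_k^{\mathrm{T}}$ is
approximated by the covariance of forecasted discharge at the observing
locations:
$$
\begin{aligned}
P_k H_k^{\mathrm{T}} &= \frac{1}{N-1} \sum_{i=1}^{N} \left(x_k^i - \bar{x}_k\right) \left(h\left(x_k^i\right) - \overline{h\left(x_k\right)}\right)^{\mathrm{T}}, \\
H_k P_k H_k^{\mathrm{T}} &= \frac{1}{N-1} \sum_{i=1}^{N} \left(h\left(x_k^i\right) - \overline{h\left(x_k\right)}\right) \left(h\left(x_k^i\right) - \overline{h\left(x_k\right)}\right)^{\mathrm{T}},
\end{aligned}
$$
where
$$ \overline{h\left(x_k\right)} = \frac{1}{N} \sum_{i=1}^{N} h\left(x_k^i\right). $$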
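A minimal sketch of this analysis step is given below, with the gain built from the ensemble-based approximations of $P_k H_k^{\mathrm{T}}$ and $H_k P_k H_k^{\mathrm{T}}$. The function name `enkf_update`, the two-row toy state and all numbers are hypothetical; this is not the toolbox implementation used in the study.

```python
import numpy as np

def enkf_update(X, HX, y, R, rng):
    """EnKF analysis with perturbed observations.

    X  : (n_state, N) forecast states x^{i,-}
    HX : (n_obs, N)   forecasted observations h(x^{i,-})
    y  : (n_obs,)     observations
    R  : (n_obs, n_obs) observation error covariance
    """
    N = X.shape[1]
    Xa = X - X.mean(axis=1, keepdims=True)       # state deviations
    HXa = HX - HX.mean(axis=1, keepdims=True)    # deviations of forecasted observations
    PHt = Xa @ HXa.T / (N - 1)                   # approximates P H^T
    HPHt = HXa @ HXa.T / (N - 1)                 # approximates H P H^T
    K = PHt @ np.linalg.inv(HPHt + R)            # Kalman gain
    noise = rng.multivariate_normal(np.zeros(len(y)), R, size=N).T   # nu^i, one column per member
    return X + K @ (y[:, None] - HX + noise), K

rng = np.random.default_rng(1)
N = 36
q = rng.normal(100.0, 10.0, size=N)              # hypothetical discharge at the gauge
s = 0.5 * q + rng.normal(0.0, 2.0, size=N)       # hypothetical storage correlated with q
X = np.vstack([q, s])
HX = X[[0], :]                                   # here h simply selects the discharge row
y = np.array([130.0])                            # observed discharge
R = np.diag((0.1 * y) ** 2)                      # variance (0.1 Q_obs)^2
X_plus, K = enkf_update(X, HX, y, R, rng)
```

Because the storage row is correlated with the simulated discharge, it receives a non-zero gain and is updated even though it is not observed directly.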
Asynchronous Ensemble Kalman Filter (AEnKF)
The AEnKF should not be considered as a new method but rather a simple
modification of the (synchronous) EnKF (Sect. ) using
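a state augmentation approach. This means that the $i$th vector of model
states ($x_k^i$) at time $k$ (see Eq. ) is augmented with the
past forecasted observations $h(x_{k-1}^i), \ldots, h(x_{k-W}^i)$
(i.e. model outputs corresponding to the observation locations) from $W$
previous time steps, which yields
$$
\tilde{x}_k^i =
\begin{pmatrix}
x_k^i \\ h\left(x_{k-1}^i\right) \\ h\left(x_{k-2}^i\right) \\ \vdots \\ h\left(x_{k-W}^i\right)
\end{pmatrix}.
$$
Remember that the size of $x_k^i$ and $h(x_{k-1}^i), \ldots, h(x_{k-W}^i)$
can significantly differ: $x_k^i$ contains the
complete set of model states, while $h(x_{k-1}^i), \ldots, h(x_{k-W}^i)$
contains only the forecasted observations. Additionally,
with the new state definition comes a new augmented observer operator
$\tilde{h}_k$ (in which $I$, with the corresponding subscript,
stands for identity elements on the diagonal, matching the dimensions in
Eq. ), a new augmented observation vector
$\tilde{y}_k$ and its corresponding observation covariance matrix
$\tilde{R}_k$:
$$
\tilde{h}_k =
\begin{pmatrix}
h_k & & & & 0 \\
& I_{k-1} & & & \\
& & I_{k-2} & & \\
& & & \ddots & \\
0 & & & & I_{k-W}
\end{pmatrix},
\quad
\tilde{y}_k =
\begin{pmatrix}
y_k \\ y_{k-1} \\ y_{k-2} \\ \vdots \\ y_{k-W}
\end{pmatrix},
\quad
\tilde{R}_k =
\begin{pmatrix}
R_k & & & & 0 \\
& R_{k-1} & & & \\
& & R_{k-2} & & \\
& & & \ddots & \\
0 & & & & R_{k-W}
\end{pmatrix}.
$$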
Having these augmented equations for $\tilde{x}_k^i$,
$\tilde{h}_k$, $\tilde{y}_k$ and $\tilde{R}_k$, it is
straightforward to carry out the assimilation in the same manner as presented
in Sect. . Note that although current and past observations
are used to construct the augmented state vector in
Eq. (), in practice Eq. () is solved only for
the current state within $\tilde{x}_k^i$ (i.e. the indices that correspond to
$x_k^i$) and the rest is ignored. The presence of past observation
terms increases the dimension of $\tilde{P}_k$ and
$\tilde{K}_k$ (see Eqs. and ) in
both directions (rows and columns). Each column of $\tilde{K}_k$
corresponds to an observation. The extra columns of $\tilde{K}_k$
correspond to the past observations. Hence, it is possible to simply solve
the equations for the first rows, which correspond only to $x_k^i$.
Note that the first rows of $\tilde{K}_k$ also contain the
contributions of the past observations to the current state. These
contributions arise from the off-diagonal terms of the augmented covariance
$\tilde{P}_k$. Finally, if the time window equals the current single
time step, then $W = 0$ and the AEnKF problem reduces to the traditional EnKF.
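The state augmentation can be sketched as follows. Since only the first rows of the augmented gain are needed, the sketch stacks the forecasted observations of the whole window and computes the gain rows for the current state directly. The function name `aenkf_update`, the input layout (lists ordered from time $k$ back to $k-W$) and the random toy data are assumptions for illustration, not the implementation used in this study.

```python
import numpy as np

def aenkf_update(X, HX_window, y_window, R_window, rng):
    """AEnKF analysis for the current state x_k only.

    X         : (n_state, N) forecast states at time k
    HX_window : list of (n_obs, N) forecasted observations at k, k-1, ..., k-W
    y_window  : list of (n_obs,) observations at the same times
    R_window  : list of (n_obs, n_obs) observation error covariances
    """
    N = X.shape[1]
    HX = np.vstack(HX_window)                    # augmented forecasted observations
    y = np.concatenate(y_window)                 # augmented observation vector
    n_total = HX.shape[0]
    R = np.zeros((n_total, n_total))             # block-diagonal augmented R
    row = 0
    for R_j in R_window:
        n_j = R_j.shape[0]
        R[row:row + n_j, row:row + n_j] = R_j
        row += n_j
    Xa = X - X.mean(axis=1, keepdims=True)
    HXa = HX - HX.mean(axis=1, keepdims=True)
    PHt = Xa @ HXa.T / (N - 1)                   # first rows of the augmented P H^T
    HPHt = HXa @ HXa.T / (N - 1)
    K = PHt @ np.linalg.inv(HPHt + R)            # first rows of the augmented gain
    noise = rng.multivariate_normal(np.zeros(n_total), R, size=N).T
    return X + K @ (y[:, None] - HX + noise), K

rng = np.random.default_rng(2)
N, W, n_gauges = 36, 11, 4
X = rng.normal(50.0, 5.0, size=(5, N))           # hypothetical states
HX_window = [rng.normal(100.0, 10.0, size=(n_gauges, N)) for _ in range(W + 1)]
y_window = [np.full(n_gauges, 110.0) for _ in range(W + 1)]
R_window = [np.diag((0.1 * y_j) ** 2) for y_j in y_window]
X_plus, K = aenkf_update(X, HX_window, y_window, R_window, rng)
```

Passing only the first element of each list (W = 0) gives the synchronous EnKF update, in line with the remark above.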
From the operational point of view, it is preferable to have a longer
assimilation window, because less frequent assimilation eliminates
a disruption of the ensemble integration by an update and a restart. When
assimilation is done more frequently, it will cause considerably higher
calculation costs, which can often be a burden for real-time operational
settings. The AEnKF uses a longer assimilation window and
assimilates all observations in a single update. This makes the AEnKF
attractive for operational use. The added value of a longer assimilation window is
investigated in this work. In particular, it can provide an
improved representation of the time lag between the internal model states and
the catchment response in terms of the discharge. Such an idea was
investigated for example by , who compared the effect of
time-lag representation using the EnKF and EnKS.
Overview of the periods used in this study.

| Period | Number of events | Maximum observed discharge [m³ s⁻¹] |
|---|---|---|
| 23 Oct 1998–15 Nov 1998 | 1 | 210 |
| 15 Feb 1999–05 Mar 1999 | 2 | 195 |
| 15 Jan 2002–06 Mar 2002 | 4 | 340 |
| 21 Dec 2002–07 Jan 2003 | 1 | 380 |

Model uncertainty
In this study, we assume the source of model uncertainty to be the HBV soil
moisture, which provides boundary conditions for surface runoff and
represents interaction from interception, evapotranspiration, infiltration
and input uncertainty by rainfall. The uncertainty is represented as a noise
term ω as in Eq. (). Based on expert
knowledge, the noise is modelled as an autoregressive process of order 1 with
a de-correlation time length of 4 h. The noise process is further assumed
spatially isotropic with a spatial de-correlation length of 30 km. The noise
is assumed to have a spatially uniform standard deviation of 1 mm. The 2-D noise fields with
such statistics were generated by using the toolbox. This
parameterization of the noise model ensures that the ensemble spread in the
simulated discharge corresponds well with the control simulations as
presented by (not shown). Ideally, all sources of
uncertainty should be accounted for in a DA scheme. However, this is not yet
a common approach in operational hydrologic data assimilation. Moreover, as
the objective of the current manuscript is to compare the operational
benefits of application of the AEnKF, we kept the noise model relatively
simple. For more work on the effect of noise specification on DA using
complex spatially distributed hydrological models see .
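A noise model with these statistics could be generated roughly as sketched below. The source specifies only the AR(1) structure, the 4 h and 30 km de-correlation lengths and the 1 mm standard deviation; the exponential correlation functions, the Cholesky-based sampling, the coarse 8 × 8 grid and all function names are assumptions of this sketch and do not describe the toolbox that was actually used.

```python
import numpy as np

def ar1_coefficient(dt_h, tau_h):
    """Lag-one autocorrelation for time step dt_h and de-correlation time tau_h,
    assuming an exponential temporal correlation function."""
    return float(np.exp(-dt_h / tau_h))

def spatial_factor(coords_km, length_km):
    """Cholesky factor of an assumed exponential spatial correlation matrix."""
    d = np.linalg.norm(coords_km[:, None, :] - coords_km[None, :, :], axis=-1)
    C = np.exp(-d / length_km)
    return np.linalg.cholesky(C + 1e-10 * np.eye(len(C)))

def next_noise(w_prev, L, alpha, sigma, rng):
    """One AR(1) step that keeps the stationary standard deviation equal to sigma."""
    eps = L @ rng.standard_normal(L.shape[0])    # spatially correlated, unit variance
    return alpha * w_prev + np.sqrt(1.0 - alpha ** 2) * sigma * eps

# Hypothetical coarse 8 x 8 grid with 5 km spacing (the study uses 1 km cells)
xx, yy = np.meshgrid(np.arange(8) * 5.0, np.arange(8) * 5.0)
coords = np.column_stack([xx.ravel(), yy.ravel()])

rng = np.random.default_rng(0)
alpha = ar1_coefficient(dt_h=1.0, tau_h=4.0)     # hourly step, 4 h de-correlation time
L = spatial_factor(coords, length_km=30.0)       # 30 km de-correlation length
sigma = 1.0                                      # standard deviation [mm]

w = sigma * (L @ rng.standard_normal(len(coords)))   # initial noise field
fields = [w]
for _ in range(23):                              # one day of hourly fields
    w = next_noise(w, L, alpha, sigma, rng)
    fields.append(w)
noise_fields = np.array(fields).reshape(24, 8, 8)
```

Each ensemble member would receive its own independent sequence of such fields as the perturbation of the SM state.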
Experimental setup
This section provides a configuration setup of the filtering methods
(Sect. and ) to assimilate discharge
observations into a spatially distributed hydrological model of the Upper
Ourthe catchment. The objective is to improve the hydrological forecast at
the catchment outlet (at Tabreux, gauge 1 in Fig. ) by
assimilating up to four discharge gauges, numbered as 1, 3, 5, 6 in
Fig. . Note that discharge data from multiple gauges are
assimilated simultaneously and no localization is employed in this study.
Additionally, validation at an independent location is also performed. The
discharge assimilation is performed every 24 h; however, the forecasts are issued
every 6 h, i.e. 4 times a day, with different independent starting points at 00:00,
06:00, 12:00, and 18:00 UTC, which is the same implementation as used by
. This study analyses the eight largest flood peaks
observed within the catchment since 1998. An overview is provided in Table .
The ensemble of uncertain model simulations is obtained by perturbing the
SM state with the spatio-temporally correlated error model
(Sect. ). With this approach we ensured that the error
model produced reasonable results in the open loop and did not lead to any
numerical instability. More complex ways of perturbing the model and their
effects on forecast accuracy were studied
before see and were deemed beyond
the scope of this manuscript. The ensemble size in this study was defined to
be 36 realizations (for computational reasons). Note that increased ensemble
sizes of 72 and 144 realizations did not influence the results (not shown).
Nevertheless, such a small ensemble size as presented in the manuscript would
not be possible if parameter estimation were involved or if more complex
error models were employed. The error in the discharge observations is
considered to be a normally distributed observation error with a variance of
$(0.1\,Q_{\mathrm{obs},k})^2$.
Four partitioned state updating schemes (indicated in the first
column) for six model states (indicated in the first row) being updated and thus
included in the model analysis. Model states are described in Sect.
and Fig. and have the following acronyms: discharge (Q), water
level (H), soil moisture storage (SM), snow storage (SN), upper zone storage (UZ),
and lower zone storage (LZ).

| Name | Q | H | SM | SN | UZ | LZ |
|---|---|---|---|---|---|---|
| No update | | | | | | |
| all | √ | √ | √ | √ | √ | √ |
| noSM | √ | √ | | √ | √ | √ |
| HQ | √ | √ | | | | |

The experimental setup scrutinizes the problem of asynchronous filtering from
two perspectives. First, we investigate the effect of state augmentation
using the past observations and assimilation of distributed observations on
the state innovation (Sect. ). Recall that the number of
observations being assimilated into the model depends on the magnitude of W.
Furthermore, the choice of which model states are included in the analysis
step to be updated is analysed (Sects. ,
and ). This means that besides updating all of the model states,
we will test two other alternatives. The first alternative will leave out
from the model analysis the soil moisture state (noSM), which is known to
exhibit the most non-linear relation to Q. The second alternative will
eliminate all the model states except for the two routing ones (HQ). The
scenarios of the partitioned state updating schemes are shown in
Table , including the control run without state updating (no update).
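One simple way to realize such partitioned updating schemes is to apply the analysis increment only to the rows of the state vector that belong to the selected model states. The sketch below assumes the state ordering SN, SM, UZ, LZ, H, Q with $m$ grid cells each; the function names and random toy inputs are hypothetical, and the source does not describe how the partitioning is implemented in the toolbox.

```python
import numpy as np

STATE_ORDER = ("SN", "SM", "UZ", "LZ", "H", "Q")
SCHEMES = {
    "no update": (),
    "all": ("SN", "SM", "UZ", "LZ", "H", "Q"),
    "noSM": ("SN", "UZ", "LZ", "H", "Q"),
    "HQ": ("H", "Q"),
}

def update_mask(scheme, m):
    """Boolean mask over the stacked state vector (6 * m rows): True where updated."""
    updated = SCHEMES[scheme]
    return np.repeat([name in updated for name in STATE_ORDER], m)

def partitioned_update(X, K, innovation, mask):
    """Apply the Kalman increment only to the rows selected by mask."""
    X_new = X.copy()
    X_new[mask] += K[mask] @ innovation
    return X_new

rng = np.random.default_rng(3)
m, N, n_obs = 3, 4, 2                            # hypothetical toy dimensions
X = rng.normal(size=(6 * m, N))                  # forecast ensemble
K = rng.normal(size=(6 * m, n_obs))              # stand-in for a Kalman gain
innovation = rng.normal(size=(n_obs, N))         # stand-in for y - h(x) + nu
X_noSM = partitioned_update(X, K, innovation, update_mask("noSM", m))
```

With the `noSM` mask the soil moisture rows keep their forecast values, while all other rows receive the increment.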
Discharge ensemble forecasts (grey lines) and observations (points)
at four locations (gauges 1, 3, 5, 6; see Fig. ).
Observations being assimilated using the AEnKF are schematized according to
the state augmentation size for two scenarios: assimilation of data from the
current time step W = 0 (open circle, traditional EnKF approach) and
assimilation of data including the previous 11 time steps, W = 11 (black
dots). The observations are assimilated into the model states on 31 December 2002,
00:00 UTC.
The performance of the data assimilation procedure regarding discharge
forecasting is evaluated using the Ensemble Verification System (EVS):
a software tool for verifying ensemble forecasts of hydrometeorological and
hydrological variables at discrete locations , which
provides a number of probabilistic verification measures. In this study we
used three popular measures: the root-mean-square error (RMSE), the relative
operating characteristic (ROC) score and the Brier skill score (BSS). We refer
to e.g. , , , and
for exact definitions of these measures. In summary, the perfect forecast in
terms of the RMSE has a value of 0, while positive values indicate errors in
the same units as the variable. The perfect forecast in terms of the ROC score
and the BSS has a value of 1 and values smaller than 1 indicate forecast deterioration.
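For orientation, textbook forms of two of these measures are sketched below: the RMSE and the Brier skill score with the sample climatology as reference forecast. The exact definitions used by EVS may differ in detail, the ROC score is omitted for brevity, and the function names and numbers are hypothetical.

```python
import math

def rmse(forecast, observed):
    """Root-mean-square error in the units of the variable."""
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(observed))

def exceedance_probability(members, threshold):
    """Fraction of ensemble members exceeding a discharge threshold."""
    return sum(m > threshold for m in members) / len(members)

def brier_score(prob, event):
    """Mean squared difference between forecast probability and outcome (0 or 1)."""
    return sum((p - e) ** 2 for p, e in zip(prob, event)) / len(event)

def brier_skill_score(prob, event):
    """Skill relative to the sample climatology; 1 is perfect, 0 means no skill."""
    climatology = sum(event) / len(event)
    bs_ref = brier_score([climatology] * len(event), event)
    return 1.0 - brier_score(prob, event) / bs_ref

# Hypothetical ensemble-mean forecasts and observations [m3 s-1]
error = rmse([100.0, 200.0, 300.0], [110.0, 190.0, 300.0])

# Hypothetical exceedance probabilities and observed outcomes
p = exceedance_probability([90.0, 110.0, 120.0, 130.0], threshold=100.0)
bss = brier_skill_score([0.9, 0.1, 0.8, 0.3], [1, 0, 1, 0])
```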
Results
The effect of state augmentation and distributed observations on state innovation
To investigate and understand the effect of augmented operators
(Eqs. , , and ) on the
innovation of spatially distributed model states, we present the following
example. Figure shows discharge simulations and
corresponding discharge observations at four locations within the catchment on
31 December 2002, 00:00 UTC. Note that the magnitude of the discharge
observations is a function of the location within the catchment; for
downstream gauges the magnitude is larger than for the more upstream gauges.
The discharge observations are further distinguished according to the time-window length of the state augmentation, which is set to W = 0 and W = 11.
The first example represents the traditional EnKF algorithm, while the latter
assimilates observations from a 12 h time window (i.e. 1 current
observation and 11 past observations), which is arbitrarily defined as half
the 24 h assimilation time window. For some cases alternative
assimilation windows were tested; these, however, did not lead to noticeable
differences (not shown). Note that the amount of information being
assimilated into the model differs for different values of W.
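To make the difference in information content concrete, the small sketch below counts the observations entering a single analysis. It assumes one observation per gauge per hourly time step without gaps; the function name is hypothetical.

```python
def n_assimilated(n_gauges, W):
    """Observations per analysis: each gauge contributes the current value plus W past values."""
    return n_gauges * (W + 1)

for n_gauges in (1, 4):
    for W in (0, 11):
        print(f"gauges={n_gauges}, W={W}: {n_assimilated(n_gauges, W)} observations")
```

Under these assumptions, a single gauge with W = 0 contributes 1 observation per analysis, whereas four gauges with W = 11 contribute 48.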
The mean difference between the forecasted and updated model states for the
whole ensemble is illustrated in Fig. for four scenarios.
These examples improve our understanding about the behaviour of the updated
model states in relation to the information content of the observations from
two perspectives: (1) the effect of assimilating also past observations in
addition to observations at the current (analysis) time, and (2) the effect
of assimilating spatially distributed observations into a grid-based
hydrological model.
Mean difference between the forecasted (X-) and updated
(X+) model states on 31 December 2002 at 00:00 UTC for different
scenarios (shown in vertical panels). We show only four sensitive model states:
discharge (Q), water level (H), soil moisture (SM) and upper zone (UZ). We
excluded the insensitive lower zone (LZ). Notations W = 0 and W = 11
indicate the size of the state augmentation. Notation up.all indicates that
all of the model states are updated. Notation as “xx” indicates the gauges
which are assimilated; see Fig. for their locations. The
corresponding ensemble of model forecasts and observations being assimilated
are shown in Fig. .
Let us first consider the traditional EnKF (i.e. no state augmentation with
W = 0) to update all the grid-based model states by assimilating the
observation at the catchment outlet (gauge 1). We observe that the single
observation is measured approximately in the middle of the simulated ensemble
(see the open circle for gauge 1 in Fig. ). Therefore,
there is hardly any difference between the forecasted and updated model
states as we show in Fig. a. In the second scenario, we
still assimilate only one gauge at the outlet; however, we use the augmented
operators with W = 11. Because the mean of the ensemble simulations is
predominantly underestimated as compared to the assimilated observations (see
black dots in Fig. for gauge 1), after the update more
water is added spatially equally into the system, as shown in
Fig. b. In the third scenario, we include all four gauges
being assimilated into the model without any augmentation. Because the model
simulations at the interior gauges are mostly overestimating the
observations, water is removed from the catchment during the update.
Moreover, since the model overestimation is largest at gauges 3 and 6, we can
also observe in Fig. c how well the EnKF is capable of
identifying corresponding regions in a spatial manner. In the fourth scenario
(Fig. d) we still assimilate all four gauges; however, we
augment the state with W = 11. We can observe that the innovation of the
model states gets even more spatially differentiated; the updated SM and UZ
model states in the downstream part of the catchment increase the amount of
water in the system, while the updated SM and UZ model states in the upstream
part decrease the amount of water in the system.
The presented educational examples show an update for several scenarios
starting from the same initial conditions. This enables a fair comparison
between scenarios; however, the sensitivity of state augmentation needs to be
further scrutinized in terms of its cumulative effect over time.
Ensemble of discharge forecasts for a typical event at the catchment
outlet (Tabreux, gauge 1) for three updating scenarios: all, noSM, and HQ (see
Table for definition). The combined effect of the model
states being updated (three scenarios shown in rows) and the length of the state
augmentation vector (W) of past observations being assimilated (two scenarios
in columns) is presented. Gauges 1, 3, 5, and 6 are assimilated. The control
run (with no update) is shown in the left panel. The observations are shown
in black.
(a) Root-mean-square error (RMSE), (b) relative
operating characteristic (ROC), and (c) Brier skill score (BSS) at
Tabreux (gauge 1) for different discharge observation vectors for which
different model states are updated and with different lengths of the state
augmentation vector (W) of past observations being assimilated. The results
incorporate a set of eight flood events shown in Table . Gauges 1,
3, 5, and 6 are assimilated. For BSS, the reference forecast is the sample
climatology and only values larger than the 25th percentile of the whole
sample are considered. (d) Same as (a) but the results are
presented for Durbuy (gauge 2), a validation location which is not assimilated.
Scaled difference between the ensemble mean for the three partitioned
update schemes and the control run without data assimilation at four gauging
locations (shown by different colours) within the Upper Ourthe catchment using
the AEnKF with (a) W = 0 and (b) W = 11. We excluded the
insensitive lower zone (LZ). Gauges 1, 3, 5, and 6 are assimilated. The
results correspond to the same period as presented in Fig. .
The effect of the four partitioned update schemes and asynchronous assimilation on forecast accuracy
We present a qualitative interpretation of the hydrological forecasts with
a lead time of 48 h in Fig. for different partitioned
state updating schemes as defined in Table , including both
a non-augmented state (W = 0) and an augmented state (W = 11). This analysis
focuses on a characteristic winter flood event (December 2002–January 2003)
being typical for a moderate temperate climate caused by a fast-moving
frontal stratiform system . We observe that the
ensemble of the control runs (top panel of Fig. )
simulates the major flood peak reasonably well, including the timing and the
magnitude; however, it has a larger spread with respect to the assimilation
scenarios. Additionally, when we consider the ensemble mean of the no-update
scenario with respect to the assimilation scenarios, the accuracy
deteriorates. When discharge assimilation is employed, an overall reduction
of the uncertainty in the forecasted ensemble is observed. Nevertheless, the
forecasted flood peak becomes underestimated and the forecasted recession
remains overestimated, which is acceptable because of the defined uncertainty
in the observed discharge. This happens in particular for the scenario in
which all states are updated; there are marginal differences between the
non-augmented and augmented model states. Furthermore, when we leave out SM
from the state update (noSM), we can observe that the major flood peak is
forecasted more accurately, including the rising limb around
31 December 2002. Moreover, for the augmented state with W = 11, the ensemble
spread becomes somewhat wider for lead times exceeding 12 h than for the
non-augmented state. Nevertheless, the observations correspond approximately
with the ensemble mean. Finally, we present the effect of the scenario in
which only the two routing states are updated. The results suggest that the
flood peak is captured most accurately of all scenarios, however with
somewhat wider uncertainty bands. Therefore, it seems more appropriate to
include the UZ storage, which represents water storage available for quick
catchment response in the concept of the HBV model, in the model state
updating (as in the noSM scenario).
Besides a qualitative interpretation of the forecasted hydrographs presented
in Fig. for one particular event, we summarize these
results in a more quantitative manner for the whole set of eight flood events
(see Table ) using three statistical measures with respect to
the lead time. Figure shows the average behaviour (over many
forecasts) of an improved initial state on the forecast accuracy for the
different filter settings, although individual partial updates may vary in
time. In general, the improvements in forecast accuracy decay with lead time
in a systematic fashion as is to be expected.
Figure a shows the RMSE as a function of
lead time for different partitioned state updating schemes and for three
scenarios for the state augmentation at the catchment outlet (Tabreux). The
control model run with no update has a constant RMSE of about
32 m³ s⁻¹ and an improved hydrological forecast has an RMSE
lower than the control run. The results suggest that all assimilation
scenarios improve the hydrological forecast, however, with marked differences
between the scenarios. Figure a also clearly shows that the
differences in the forecast improvement of these various setups are purely
due to using multiple data points in the past at the analysis step. We can
further observe that updating all model states except for SM (noSM scenario)
consistently leads to the most accurate forecasts across the whole range of
lead times. Additionally, state augmentation using W = 5 and W = 11 indicates
improvements compared to the case without augmentation (W = 0). However, for
lead times longer than the travel time from the most upstream gauges to the
outlet (i.e. exceeding 20 h), the difference between state augmentations
W = 5 and W = 11 diminishes. Moreover, when only the two routing states (HQ
scenario) are updated, the RMSE is lowered for short lead times, but the
improvement does not last as long as for the noSM scenario. The smallest
improvement at shorter lead times is achieved when all model states are
updated (scenario all). This is due to the strongly non-linear relation
between the assimilated observations and the SM storage, which is further
accentuated by the time lag between the state and the catchment response.
Nevertheless, for longer lead times it seems slightly better to update all
states rather than only the routing states. Discharge is related to the SM
and UZ storages through the Kalman gain. When the correlation is lower, the
update will be smaller. The AEnKF exploits the correlation between the present
discharge state and the discharge state not only at the previous time step
but also further in the past. It may be possible to use the correlation
between discharge at the present time and UZ/SM in the past for data
assimilation; however, this is deemed beyond the scope of this study.
Nevertheless, we speculate that this will only be useful in a smoothing
context (i.e. the present discharge may bring information on UZ/SM in the
past), not in a filtering context as in the present study.
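The role of the Kalman gain and of the partitioned update can be sketched as follows. This is an illustrative perturbed-observation ensemble update, not the implementation used in the study: the function name, the row order of the states (SM, UZ, H, Q), the way the partitioning is applied (zeroing the increment of excluded states) and all numbers are assumptions. The asynchronous aspect is represented by stacking the simulated discharge at the present and at W past time steps in `sim_obs`, so that the ensemble covariance links the present states to past observations.

```python
import numpy as np

def aenkf_analysis(states, sim_obs, obs, obs_var, update_mask, rng):
    """Ensemble analysis step with an augmented observation vector.

    states      : (n_state, n_ens) model states at the analysis time
    sim_obs     : (n_obs, n_ens) simulated observations at the present and
                  past time steps (the augmented part)
    obs         : (n_obs,) observed values at those same times
    obs_var     : (n_obs,) observation error variances
    update_mask : (n_state,) True for states that may be updated
    """
    states = np.asarray(states, dtype=float)
    sim_obs = np.asarray(sim_obs, dtype=float)
    obs = np.asarray(obs, dtype=float)
    obs_var = np.asarray(obs_var, dtype=float)
    update_mask = np.asarray(update_mask, dtype=bool)
    n_ens = states.shape[1]

    # Ensemble anomalies and sample covariances.
    state_anom = states - states.mean(axis=1, keepdims=True)
    obs_anom = sim_obs - sim_obs.mean(axis=1, keepdims=True)
    cov_xy = state_anom @ obs_anom.T / (n_ens - 1)
    cov_yy = obs_anom @ obs_anom.T / (n_ens - 1)

    # Kalman gain K = cov_xy (cov_yy + R)^-1, with diagonal R.
    innovation_cov = cov_yy + np.diag(obs_var)
    gain = np.linalg.solve(innovation_cov, cov_xy.T).T

    # Perturbed observations, one realization per ensemble member.
    noise = rng.normal(size=sim_obs.shape) * np.sqrt(obs_var)[:, None]
    increment = gain @ (obs[:, None] + noise - sim_obs)

    # Partitioned update: excluded states keep their forecast values.
    increment[~update_mask, :] = 0.0
    return states + increment

# Hypothetical toy setup: 4 states (SM, UZ, H, Q), 50 members, W = 5.
rng = np.random.default_rng(42)
n_ens, window = 50, 5
states = rng.normal(loc=[[100.0], [20.0], [1.0], [30.0]],
                    scale=[[10.0], [4.0], [0.1], [5.0]],
                    size=(4, n_ens))
# Simulated discharge at the present and W past steps, correlated with Q.
sim_obs = states[3] + rng.normal(scale=1.0, size=(window + 1, n_ens))
obs = np.full(window + 1, 35.0)
obs_var = np.full(window + 1, 4.0)

no_sm = [False, True, True, True]   # noSM scenario: SM is left out
updated = aenkf_analysis(states, sim_obs, obs, obs_var, no_sm, rng)
```

In this sketch a state that is only weakly correlated with the simulated discharge receives a small gain and hence a small update, which mirrors the statement above about lower correlation.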
Validation of the model setup in terms of the RMSE is presented in
Fig. d for an independent evaluation of the forecasting results
at Durbuy, an interior location which was not used for assimilation. These
results show that discharge assimilation also improves the forecast at the
validation location and that the pattern corresponds well to the results
presented in Fig. a. Such an analysis indicates that there is no
spurious update of the model states.
To present the results in a more robust way, we also analysed them (at
Tabreux) in terms of other probabilistic verification measures: the ROC score and the BS score (see
Fig. b and c). Recall that values of 1 represent a perfect forecast,
while values smaller than 1 indicate forecast deterioration. Similar to the
RMSE results, updating only the two routing states (HQ) is most efficient for
short lead times, but this skill disappears quickly for longer lead times. In
terms of the ROC and BS scores, for a given augmentation size, there are
only marginal differences between the scenario that updates all states (all)
and the one that leaves the soil moisture out (noSM). However, it is notable that the
state augmentation case (W = 11) improves the forecast performance as
compared to the no augmentation case (W = 0). Note that the state
augmentation of W = 5 was not carried out.
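The exact definitions of the scores are not repeated in this section. Purely as an illustration, and assuming that the BS is expressed as a skill relative to a reference forecast (so that 1 is a perfect forecast), a probabilistic score of this kind can be sketched from an ensemble as below. The flood threshold, the reference probability of 0.5 and the toy ensemble are hypothetical.

```python
import numpy as np

def exceedance_probability(ensemble, threshold):
    """Fraction of members above the threshold, per forecast (row)."""
    return (np.asarray(ensemble, dtype=float) > threshold).mean(axis=1)

def brier_score(prob, occurred):
    """Mean squared difference between forecast probability and outcome."""
    prob = np.asarray(prob, dtype=float)
    occurred = np.asarray(occurred, dtype=float)
    return float(np.mean((prob - occurred) ** 2))

def brier_skill_score(prob, occurred, prob_ref):
    """1 is perfect; 0 means no better than the reference forecast."""
    return 1.0 - brier_score(prob, occurred) / brier_score(prob_ref, occurred)

# Hypothetical toy data: 4 forecasts, 5 members, threshold of 30 m3 s-1.
threshold = 30.0
ensemble = np.array([[31.0, 32.0, 35.0, 29.0, 40.0],
                     [10.0, 12.0, 11.0, 9.0, 31.0],
                     [33.0, 34.0, 36.0, 31.0, 32.0],
                     [5.0, 6.0, 7.0, 8.0, 9.0]])
observed = np.array([32.0, 12.0, 34.0, 8.0])

prob = exceedance_probability(ensemble, threshold)
occurred = observed > threshold
prob_ref = np.full(prob.shape, 0.5)   # assumed reference forecast
bss = brier_skill_score(prob, occurred, prob_ref)
```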
Temporal nature of model state innovations
To reveal the temporal nature of the model state updates made by the AEnKF
with W = 0 and W = 11, we present in Fig. a and b time
series of normalized differences between the ensemble means for the three
partitioned update schemes and the ensemble mean for the no-update scenario.
The normalization is achieved by dividing the aforementioned difference by
the no-update scenario mean. In such a way we obtain the relative change in
each of the model states. For the AEnKF using W = 0
(Fig. a), we can observe that for the scenario “all”,
which updates all the model states, the magnitude of the percentage change is
approximately the same for all four model states and ranges up to 25 %. When
all model states except for the SM are updated, no changes in the SM storage
occur and the overall magnitude of the changes in the other states is
slightly decreased and smoothed. Furthermore, when only the two routing
states are updated (HQ), the SM and UZ storages remain constant over time and
we observe a different temporal behaviour of the routing states in comparison
with the previous cases. For the HQ scenario, the updated time series have
a clear zigzag shape, which indicates that the effect of updating diminishes
faster because only the river channel is updated. In contrast, the routing
states for the other cases show a more stable behaviour over time,
illustrated by the stepwise shape. These more persistent results correspond
to the updates in the UZ storage, which is used for a quick catchment
response and has an impact for a longer time. The benefits of including the
UZ storage in the update and leaving the SM storage out were already presented
from a different point of view in Fig. a for longer lead times.
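The normalization described above can be written compactly. The sketch below assumes ensembles stored as arrays of shape (n_times, n_ens) for one model state; the function name and the toy values are hypothetical.

```python
import numpy as np

def relative_state_change(updated_ens, control_ens):
    """Percentage change of the ensemble mean relative to the no-update run.

    Both inputs have shape (n_times, n_ens) and describe one model state.
    """
    mean_updated = np.asarray(updated_ens, dtype=float).mean(axis=1)
    mean_control = np.asarray(control_ens, dtype=float).mean(axis=1)
    # Difference of ensemble means, divided by the no-update mean.
    return 100.0 * (mean_updated - mean_control) / mean_control

# Hypothetical toy data: 2 time steps, 2 ensemble members.
control = np.array([[10.0, 10.0], [20.0, 20.0]])
updated = np.array([[12.0, 13.0], [18.0, 17.0]])
change = relative_state_change(updated, control)
```

For a state that is excluded from the update (for example SM in the noSM scenario), the updated and no-update means remain close and the relative change stays near zero.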
For the AEnKF using W = 11 (Fig. b), we can observe that the
overall pattern of the temporal changes in the model states is similar to that
for W = 0, but the states show somewhat larger variability with W = 11.
By assimilating more observations (W = 11), we expect an even larger update,
assuming that more observations contain more information about the unknown
truth. Assuming the underlying forecast model has a significant error, by
assimilating more observations the Kalman filter will pull the model even
closer to the truth, yielding a larger abrupt update.
Conclusions
We applied the asynchronous ensemble Kalman filter (AEnKF)
and identified the effect of augmenting the state vector
with past simulations and observations. To our knowledge this is the first
application of the AEnKF in flood forecasting. We showed that augmenting the
assimilation vector improves the flood forecasts, but that the
contribution gets smaller for longer lead times. Overall, the AEnKF can be
considered an effective method for model state updating that takes into
account more (e.g. all) observations at hardly any additional computational
burden. This makes it very suitable for operational hydrological forecasting.
When compared to the standard EnKF, the AEnKF allows for the choice of a certain
assimilation window length, which adds a degree of freedom to the data
assimilation scheme. The optimal window is very likely related to the
catchment size (i.e. concentration time). It was noted (not shown) that for
the smaller upstream catchments the optimal window was smaller than for the
complete Upper Ourthe catchment, although there was no negative effect of
a longer assimilation window (W = 5 vs. W = 11). For the high flows analysed
in this study, the AEnKF with a longer time window W is able to make
corrections that last longer on average than with the shorter time window W.
Characterization of the statistical properties of the temporal flow
dynamics (i.e. typical timescales of flood peaks as compared to low flows)
is however a relevant issue. The length of the time window W has to be seen
relative to the timescale of the river flow dynamics. We assume that for low
flow conditions, the improved skill of longer W with respect to shorter W
will become negligible, as low flows exhibit less temporal dynamics than high
flows. We refer to for an analysis about explicit handling
of lags in space and time, which uses a state augmentation approach for a
linear inverse streamflow routing model. Note that it was not the objective
of this study to determine the optimal assimilation window for the AEnKF
given various river flow dynamics. Another limitation of this study is the
relatively simple error model for perturbing only soil moisture states. More
complex ways of perturbing the model and their effects on forecast accuracy
deserve more attention in future studies.
We investigated the effect of a partitioned update scheme recently suggested
by . We showed that for the Upper Ourthe catchment, reducing
the number of model states updated by the AEnKF in a grid-based HBV model can
lead to better forecasts of the discharge. In terms of the root-mean-square error,
the largest improvements in the forecast accuracy were observed for the
scenario where the soil moisture was left out of the analysis,
similar to the PDM updating scheme presented by. This
indicates that elimination of the strongly non-linear relation between the
soil moisture storage (SM) and assimilated discharge observations can become
beneficial for an improved forecast when soil moisture observations are not
considered. On the other hand, it was recently demonstrated that a
rainfall–runoff model can be improved when constrained by remotely sensed
soil moisture e.g.
or in situ soil moisture e.g.. Moreover, we showed that
keeping the quick catchment response storage (upper zone; UZ) in the model
analysis is important, especially for longer lead times, when compared to the
scenario in which only two routing storages were updated. The UZ seems to
compensate for the effect of SM on discharge. The fact that excluding SM extends
the improvements suggests that in our case the discharge forecasts with
a lead time of 2 days (and for major flood events) are less dependent on
SM. A possible alternative to excluding the SM storage from the analysis
would be to investigate the use of other algorithms, for example the maximum
likelihood ensemble filter (MLEF) ,
which is more suited for use with highly non-linear observation operators.