The assimilation of satellite-based water level observations (WLOs) into 2-D hydrodynamic models can keep flood forecasts on track or be used for reanalysis to obtain improved assessments of previous flood footprints. In either case, satellites provide spatially dense observation fields, but with spatially correlated errors. To date, assimilation methods in flood forecasting either incorrectly neglect the spatial correlation in the observation errors or, in the best of cases, deal with it by thinning methods. These thinning methods result in a sparse set of observations whose error correlations are assumed to be negligible. Here, with a case study, we show that the assimilation diagnostics that make use of statistical averages of observation-minus-background and observation-minus-analysis residuals are useful to estimate error correlations in WLOs. The average estimated correlation length scale of 7 km is longer than the expected value of 250 m. Furthermore, the correlations do not decrease monotonically; this unexpected behaviour is shown to be the result of assimilating some anomalous observations. Accurate estimates of the observation error statistics can be used to support quality control protocols and provide insight into which observations it is most beneficial to assimilate. Therefore, the understanding gained in this paper will contribute towards the correct assimilation of denser datasets.

In data assimilation (DA), observations are combined with numerical model
output, known as the background, to provide an accurate description of the
current state, known as the analysis. In DA the contributions from the
background and observations are weighted according to their relative
uncertainty. The observation error statistics are the sum of the instrument
error and representation error

The development of DA systems has largely been driven by their use in
numerical weather prediction (NWP), but the methodologies are applicable to
any system that can be modelled and
observed. There have been recent advances in real-time 2-D hydrodynamic
modelling and the acquisition and processing of relevant remote sensing
observations (earth observations, EOs)

A predominant EO technique to obtain water level observations (WLOs) is
synthetic-aperture radar (SAR). SAR provides high-resolution observations of
radar backscatter which, after processing, serve to delineate the flood
extent. Then, the intersection of the flood extent with a high-resolution
lidar digital terrain model is used to obtain the WLOs. The resulting WLOs
are discontinuous but locally dense in space; consequently, the errors in
the observations may be highly correlated. However, the current practice when
assimilating WLOs is to neglect the error correlations. To make the
assumption of uncorrelated errors valid, the current approach is to thin the
data. Hence, in hydrology, one scenario that would benefit from an improved
understanding of the observation uncertainties is the assimilation of
satellite-derived WLOs for either operational flood forecasting or hindcast
analyses.

In this article we use the diagnostic of

Data assimilation is a technique used to provide the best estimate, the
analysis, of the current state of a dynamical system. The analysis is denoted

In

The observation error covariance matrix can be estimated using the
observation-minus-background,

The form of the diagnostic in Eq. (

The diagnostic in Eqs. (

One further issue is that the standard diagnostic is derived assuming that
the analysis is calculated using minimum variance linear statistical
estimation. If local ensemble DA is used to determine the analysis, the
diagnostic does not result in a correct estimate of the observation
uncertainties. However, by using a modified version of the diagnostic some of
the observation error statistics may be estimated. It is possible to estimate
the error correlations between two observations if the observation operator
that determines the model equivalent of observation

In this article we estimate the observation error statistics for SAR WLOs
that are assimilated using a local ensemble transform Kalman filter (LETKF)
into the LISFLOOD-FP 2-D hydrodynamic
model. This study makes use of the observation, model and assimilation system
described in

The original observations used in the derivation of WLOs are obtained using SAR, which observes the surface backscatter. In a SAR image, flood water appears dark so long as the surface water turbulence is insignificant. Therefore, to obtain the flood extent, the pixels in a SAR image are grouped into homogeneous regions. A mean backscatter value is calculated for each region and, if this value is below a given threshold, the region is classified as flooded. The threshold is determined by using training data from “flood” and “non-flood” regions. This initial estimate of flood extent is then refined by, for example, (1) correcting for any high backscatter that is a result of vegetation either within the flooded region or at the flood edge; (2) correcting for high backscatter near flooded areas that is a result of water with a rough surface; (3) performing a “nearest neighbour” check, where any local flood height that is significantly larger than those nearby is reclassified as non-flooded.
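The region-based thresholding step described above can be illustrated with a minimal sketch. It assumes the segmentation into homogeneous regions has already been done and is supplied as an integer label image; the function name, the toy backscatter values and the threshold of -12 dB are hypothetical and are not taken from the study. The refinement steps (1) to (3) are not included.

```python
import numpy as np

def classify_flooded_regions(backscatter, region_labels, threshold_db):
    """Mark every pixel of a region as flooded if the region's mean
    backscatter lies below the threshold (hypothetical helper)."""
    backscatter = np.asarray(backscatter, dtype=float)
    region_labels = np.asarray(region_labels)
    flooded = np.zeros(backscatter.shape, dtype=bool)
    for label in np.unique(region_labels):
        in_region = region_labels == label
        # One mean backscatter value per homogeneous region.
        if backscatter[in_region].mean() < threshold_db:
            flooded[in_region] = True
    return flooded

# Toy image: region 0 is dark (smooth water), region 1 is bright (land).
backscatter = np.array([[-18.0, -17.0, -6.0, -5.0],
                        [-19.0, -16.0, -7.0, -4.0]])
region_labels = np.array([[0, 0, 1, 1],
                          [0, 0, 1, 1]])
# The threshold is an assumed value; in practice it would come from
# training data for "flood" and "non-flood" regions.
flood_mask = classify_flooded_regions(backscatter, region_labels, threshold_db=-12.0)
```

Classifying whole regions rather than single pixels means that speckle in individual pixels does not flip the classification, at the cost of depending on the quality of the segmentation.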

To provide the WLOs, the refined flood extent is intersected with a high-resolution digital elevation model (DEM). In order to improve the accuracy of the WLOs, they are only calculated if the slope in the DEM is sufficiently shallow. A further refinement takes into account, for example, the emergent vegetation at the flood edge.
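As a rough illustration of this intersection step, the sketch below reads the DEM height at flood-edge pixels and discards pixels where the local DEM slope exceeds a limit. It is a simplified stand-in under stated assumptions, not the processing chain used in the study: the edge definition (a flooded pixel with at least one dry 4-neighbour), the finite-difference slope, the function name and the values of `pixel_size_m` and `max_slope` are all hypothetical.

```python
import numpy as np

def water_levels_at_flood_edge(flood_mask, dem, pixel_size_m, max_slope):
    """Return (rows, cols, levels) for flood-edge pixels on shallow slopes."""
    flood_mask = np.asarray(flood_mask, dtype=bool)
    dem = np.asarray(dem, dtype=float)
    # Pad with "flooded" so the image border is not mistaken for a flood edge.
    padded = np.pad(flood_mask, 1, constant_values=True)
    has_dry_neighbour = (~padded[:-2, 1:-1] | ~padded[2:, 1:-1]
                         | ~padded[1:-1, :-2] | ~padded[1:-1, 2:])
    edge = flood_mask & has_dry_neighbour
    # Slope magnitude (rise over run) from finite differences of the DEM.
    dz_dy, dz_dx = np.gradient(dem, pixel_size_m)
    slope = np.hypot(dz_dx, dz_dy)
    keep = edge & (slope <= max_slope)
    rows, cols = np.nonzero(keep)
    return rows, cols, dem[rows, cols]

# Toy case: columns 0-2 are flooded; the bottom row has a steep bank.
flood_mask = np.array([[1, 1, 1, 0, 0]] * 3, dtype=bool)
dem = np.array([[10.0, 10.1, 10.2, 10.3, 10.4],
                [10.0, 10.1, 10.2, 10.3, 10.4],
                [10.0, 10.1, 10.2, 15.0, 20.0]])
rows, cols, levels = water_levels_at_flood_edge(
    flood_mask, dem, pixel_size_m=1.0, max_slope=0.5)
```

The slope filter matters because a small horizontal error in the flood edge translates into a large height error where the terrain is steep, so only edge pixels on gentle slopes give usable water levels.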

The WLO derivation process results in a large number of WLOs that exist in
clusters. It is expected that many of the observations in a cluster will be
highly correlated and hence not contribute independent information. At this
stage in the processing,

The observations are assimilated into a 75 m resolution LISFLOOD-FP flood
simulation model

To compare the modelled field with the observed quantity it is necessary to
define an observation operator that maps from model to observation space. In
this study we use the “nearest wet pixel” approach described in

Data assimilation techniques can lose accuracy if presented with an
observation that is grossly inconsistent with the model state

In data assimilation the observation uncertainty has contributions from both
measurement errors and representation errors. The representation error arises
due to the difference between an actual observation and the modelled
representation of an observation; this difference can be a result of the following:

For the WLOs it is clear that a pre-processing error will exist as there is
potential for errors to be introduced in the derivation of the WLOs. For
example, if the water surface is rough, the pixel may be assumed to be
dry; as a result the flood extent would be incorrect and hence an error would
be introduced in the WLO. For nearby pixels it is possible that there will be
similar errors in the derivation process, thereby introducing correlated
observation errors. The procedures in

A potential source of correlated error for WLOs is the observation operator
error. As described in Sect.

The error due to unresolved scales and processes is also a possible source of
observation error correlations. Although in this case the model is of
relatively high resolution compared to the observation resolution, there are
still scales that are unresolved. Previous studies that have considered these
scale mismatch errors have found that they are typically correlated

We estimate observation uncertainties for observations from a real flood
event that occurred in the west of England, over an area of the lower
Severn and Avon rivers, in November 2012 (Fig.

Before being assimilated, the WLOs are subject to the QC and thinning
procedures described in Sect.

We apply the diagnostic of

For this assimilation system we assume that the ensemble background error
covariance matrix gives a reasonable estimate of the true background error
statistics. The assumed standard deviation for the WLOs is 59 cm; this is
calculated as described in Sect.

As is typical for most DA systems, the observation errors are assumed
uncorrelated. With these assumed error statistics the theoretical work of

an underestimated standard deviation

an underestimated correlation length scale.

We first estimate average horizontal error covariances across the entire
domain for the duration of the flood event. We plot in Fig.

Estimated SAR WLO error correlations (black line) and number of samples (bars) used for the calculation. Estimated error standard deviation is 54 cm.
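To make the construction of such a curve concrete, the following sketch averages products of observation-minus-analysis and observation-minus-background residuals over all observation pairs and groups the pairs by separation distance. It assumes the estimate takes the standard form of a mean residual product and that the residuals are unbiased; it does not include the modification needed for local ensemble DA. The function name, the sample layout and the 250 m bin width in the toy example are hypothetical.

```python
import numpy as np

def binned_error_correlations(samples, bin_width_m, n_bins):
    """Estimate error correlations as a function of separation distance.

    samples: iterable of (d_ob, d_oa, xy), one entry per assimilation time,
    where d_ob = y - H(x_b), d_oa = y - H(x_a) and xy has shape (n_obs, 2).
    Returns (correlations, pair counts per bin, standard deviation).
    """
    cov_sum = np.zeros(n_bins)
    counts = np.zeros(n_bins, dtype=int)
    var_sum, var_count = 0.0, 0
    for d_ob, d_oa, xy in samples:
        d_ob = np.asarray(d_ob, dtype=float)
        d_oa = np.asarray(d_oa, dtype=float)
        xy = np.asarray(xy, dtype=float)
        products = np.outer(d_oa, d_ob)  # products[i, j] = d_oa[i] * d_ob[j]
        dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
        diag = np.eye(len(d_ob), dtype=bool)
        # Same-observation products estimate the error variance.
        var_sum += products[diag].sum()
        var_count += len(d_ob)
        # Products for distinct pairs are accumulated per distance bin.
        bins = (dist // bin_width_m).astype(int)
        valid = ~diag & (bins < n_bins)
        np.add.at(cov_sum, bins[valid], products[valid])
        np.add.at(counts, bins[valid], 1)
    variance = var_sum / var_count
    with np.errstate(invalid="ignore", divide="ignore"):
        covariance = np.where(counts > 0, cov_sum / counts, np.nan)
    return covariance / variance, counts, np.sqrt(variance)

# Toy example: two observations 100 m apart with identical residuals.
samples = [(np.array([1.0, 1.0]), np.array([0.5, 0.5]),
            np.array([[0.0, 0.0], [100.0, 0.0]]))]
corr, counts, sd = binned_error_correlations(samples, bin_width_m=250.0, n_bins=2)
```

Returning the pair counts alongside the correlations mirrors the bars in the figure: bins with few samples give noisy estimates and should be interpreted with caution.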

The estimated statistics give a standard deviation of 54 cm. This is slightly
lower than the assumed error standard deviation of 59 cm. Following the theory
of

Our results show that the correlations become insignificant (

It is possible that the local maximum in the correlations is a result of
observations on different tributaries of the river. To test this hypothesis
we split the domain in two (as shown in Fig.

Estimated SAR WLO error correlations (black line) and number of
samples (bars) used (bin width

Estimated SAR WLO error correlations (black line) and number of
samples (bars) used (bin width

From Figs.

We next consider if the correlation structure changes over time. We plot in
Figs.

Estimated SAR WLO error correlations (black line) and number of
samples (bars) used (bin width

Estimated SAR WLO error correlations (black line) and number of
samples (bars) used (bin width

During the middle of the flood event the observation error standard deviation decreases and the correlation length scale increases slightly. For the final two days the river is back in bank; for this period the standard deviation is largest, as is the correlation length scale, which is approximately 8 km. It is also in this final period that the “local maximum” appears in the correlations.

Figure

Estimated SAR WLO error correlations (black line) and number of
samples (bars) used (bin width

We have shown that the

The data used in this study are available in

JW, JG-P and DM prepared the data and ran the experiments. JW and JG-P analysed the results and drafted the manuscript. DM, SD and NN contributed to the discussion and manuscript editing.

The authors declare that they have no conflict of interest.

Joanne A. Waller, Nancy K. Nichols and Sarah L. Dance were supported in part by UK NERC grants NE/K008900/1 (FRANC), NE/N006682/1 (OSCA). Joanne A. Waller and Sarah L. Dance received additional support from UK EPSRC grant EP/P002331/1 (DARE). Nancy K. Nichols was also supported by the UK NERC National Centre for Earth Observation (NCEO). Javier García-Pintado, David C. Mason and Sarah L. Dance were supported by UK NERC grants NE/1005242/1 (DEMON) and NE/K00896X/1 (SINATRA). The data used in this study may be obtained on request, subject to licensing conditions, by contacting the corresponding author.

Edited by: Florian Pappenberger

Reviewed by: two anonymous referees