Discharge hydrograph estimation at upstream-ungauged sections by coupling a Bayesian methodology and a 2-D GPU shallow water model

Ferrari, Alessia; D'Oria, Marco; Vacondio, Renato; Dal Palù, Alessandro; Mignosa, Paolo; Tanda, Maria Giovanna

doi:https://doi.org/10.5194/hess-22-5299-2018

Articles | Volume 22, issue 10


Research article

16 Oct 2018


Abstract

This paper presents a novel methodology for estimating the unknown discharge hydrograph at the entrance of a river reach when no information is available. The methodology couples an optimization procedure based on the Bayesian geostatistical approach (BGA) with a forward self-developed 2-D hydraulic model. In order to accurately describe the flow propagation in real rivers characterized by large floodable areas, the forward model solves the 2-D shallow water equations (SWEs) by means of a finite volume explicit shock-capturing algorithm. The two-dimensional SWE code exploits the computational power of graphics processing units (GPUs), achieving a ratio of physical to computational time of up to 1000. With the aim of enhancing the computational efficiency of the inverse estimation, the Bayesian technique is parallelized, developing a procedure based on the Secure Shell (SSH) protocol that allows one to take advantage of remote high-performance computing clusters (including those available on the Cloud) equipped with GPUs. The capability of the methodology is assessed by estimating irregular and synthetic inflow hydrographs in real river reaches, also taking into account the presence of downstream corrupted observations. Finally, the procedure is applied to reconstruct a real flood wave in a river reach located in northern Italy.


Dates

Received: 08 Mar 2018 – Discussion started: 27 Apr 2018 – Revised: 02 Aug 2018 – Accepted: 20 Sep 2018 – Published: 16 Oct 2018

1 Introduction

The definition of discharge hydrographs in specific river sections is still a relevant hydraulic problem not only for flood modelling purposes but also for more practical issues related to flood-protection measures, hydropower plants, water resource management, the design of new structures, etc. Flood-routing techniques, either hydrological or hydraulic, are extensively studied and are widely used to estimate discharge hydrographs in downstream ungauged sites based on data available at upstream gauged stations (forward propagation). However, the flow hydrograph is often required in a river section that is completely ungauged and does not have useful upstream information for its definition. In these cases, discharge hydrographs at specific sites can be estimated by coupling rainfall-runoff and forward flood-propagation models. However, rainfall-runoff models (Beven, 2011) present several uncertainties associated, for example, with the choice of the model for the basin schematization, the evaluation of the effective rainfall, and the calibration procedure. An alternative approach is to assess the upstream unknown flow hydrograph using only the discharge values or water levels available downstream of the selected site and possibly the characteristics of the river reach. In the literature, this approach is known as reverse flow routing (D'Oria and Tanda, 2012), an ill-posed inverse problem that presents two main challenges: the solution may be non-unique, and instabilities may arise during the inversion. Traditional attempts to solve the reverse flow routing problem are based on two main approaches: the solution of a reverse form of the Saint Venant equations (e.g. Bruen and Dooge, 2007; Dooge and Bruen, 2005; Eli et al., 1974; Szymkiewicz, 1993) and the back-oriented application of hydrological routing schemes (e.g. Das, 2009; Koussis and Mazi, 2016; Koussis et al., 2012). 
Beyond the approximations introduced by the hydrological routing schemes, the aforementioned procedures were applied to simplified reach geometries and flow conditions. In almost all cases, especially considering downstream information affected by errors, instabilities and spurious oscillations appeared; low-pass filters with subjective parameters were sometimes used to dampen the estimated inflow fluctuations. D'Oria and Tanda (2012) and Zucco et al. (2015) provide additional references and details on the reverse flow routing problem.

In addition to the above procedures, the estimation of an unknown upstream flow hydrograph based only on downstream information (observations) can be performed via optimization methods. These techniques aim at finding the upstream flow hydrograph that, routed downstream, best matches the available observations. D'Oria and Tanda (2012) solved the reverse flow routing problem by adopting a novel Bayesian geostatistical approach (BGA) as an optimization procedure that considers the flow hydrograph as a continuous random function that presents autocorrelation. The authors showed the capability of the BGA methodology, in combination with a forward hydraulic model, to estimate the discharges in an upstream-ungauged section based only on an available downstream flow hydrograph: the solution remained stable even in the presence of corrupted downstream flow values. The forward model, which solves the 1-D Saint Venant equations, was considered already implemented and calibrated and was able to describe the hydraulic routing process with sufficient accuracy. The BGA method was further extended in order to adopt stage hydrographs instead of discharge ones as downstream observations (D'Oria et al., 2014). Saghafian et al. (2015) identified the upstream hydrograph of a river reach given the downstream one by using a genetic algorithm coupled with a forward hydraulic model that solves the 1-D Saint Venant equations under the kinematic wave approximation. Only some minor oscillations and instabilities occurred during the inversion, but the authors applied the procedure to a rectangular prismatic channel, and no errors were added to the downstream observations. Zucco et al. (2015) investigated the reverse flow routing process in natural channels and estimated the discharge hydrograph in ungauged sections by means of a genetic algorithm coupled with a simplified routing model. 
The parametric forward model was based on the continuity equation written in a characteristic form, lumped over the entire river reach, and on simplified rating curves at the channel ends. In addition, the unknown inflow hydrograph was assumed to be distributed in time as a Pearson type III function with three parameters, thus preventing the possibility of estimating real flood waves with irregular shapes (e.g. multi-peak hydrographs).

All the previously cited works adopted 1-D hydraulic models or simplified hydrological routing schemes in combination with different optimization procedures. Nevertheless, in many real cases, the complex hydrodynamic field generated by the flood propagation cannot be accurately described under 1-D assumptions, and it is necessary to adopt schemes based on the 2-D shallow water equations, even if this poses the drawback of the computational burden and requires a detailed terrain survey. However, nowadays, bathymetric data can be easily obtained from high-resolution digital terrain models (DTM), and fast 2-D numerical models have been developed. With the purpose of estimating the discharge hydrograph in an upstream-ungauged river section, having water level information only in a downstream observation site, this paper extends the BGA methodology for reverse flow routing from D'Oria and Tanda (2012) and D'Oria et al. (2014) to a 2-D forward algorithm in order to model natural rivers with complex geometry, including flood plains and floodable areas. With this aim, the stable, accurate and fast PARFLOOD graphics processing unit (GPU) code (Vacondio et al., 2014, 2016, 2017), which solves the conservative form of the 2-D shallow water equations on a finite volume scheme, is adopted as forward model and is coupled with the inverse estimation procedure. In order to reduce the computational time, the Jacobian matrix estimation procedure, which is the key point of the BGA method, has been parallelized. Additionally, a host–server data management procedure has been implemented to exploit the computational power of remote large modern supercomputer and/or cloud HPC resources. The capability of the optimization procedure has been tested by estimating real or pseudo-real inflow hydrographs in natural river reaches, where 1-D models cannot accurately describe the flood propagation. 
Moreover, during the discharge estimation, the presence of downstream corrupted observations has also been taken into account, since registered data at gauging stations are quite often affected by instrumental errors.

The paper is organized as follows: in Sect. 2 the theory of the Bayesian geostatistical approach is illustrated. A step-by-step description of the inverse procedure is provided in Sect. 3: the parallel implemented scheme, the forward model optimization for reducing the run times, and the iteration management between the local host and the remote server are described in detail. Section 4 is dedicated to the application of the procedure to synthetic test cases concerning the estimation of inflow hydrographs with different shapes in two rivers in northern Italy. The practicability of the inverse procedure for reconstructing a historical flooding event is presented in Sect. 5. Some concluding remarks are finally outlined in Sect. 6.

2 Theory of the Bayesian geostatistical approach

The optimization software adopted to solve the reverse flow routing problem is bgaPEST (Fienen et al., 2013), which implements the Bayesian geostatistical approach of Kitanidis (1995) and is developed in accordance with the PEST (model-independent parameter estimation) software (Doherty, 2016). bgaPEST is suited to highly parameterized inverse problems in which the unknown parameters are correlated with one another in space or time, as is the case for the discharge values of a flow hydrograph. The first applications of the inverse methodology were related to the estimation of spatial parameter fields in a groundwater context (Hoeksema and Kitanidis, 1984; Kitanidis and Vomvoris, 1983, among others), but later the method was adopted to evaluate unknown time functions in different areas (e.g. Butera et al., 2013; D'Oria and Tanda, 2012; D'Oria et al., 2015; Leonhardt et al., 2014; Michalak et al., 2004; Snodgrass and Kitanidis, 1997).

2.1 Bayes' theorem

The crux of the adopted bgaPEST, as well as other methods based on the Bayesian approach, is Bayes' theorem, which reads

\begin{matrix} (1) & p (s | y) \propto L (y | s) p (s), \end{matrix}

where s is the vector of the unknown parameters, y is the vector of the measured data, p(s|y) is the posterior probability density function (pdf) of s given y, L(y|s) is the likelihood function, and p(s) is the prior probability density function of s. Since the present work aims at estimating an upstream hydrograph in an ungauged section, assuming the knowledge of downstream water levels, s represents the discharge values over time of the unknown inflow hydrograph, whereas y denotes the downstream water level observations. Following Eq. (1), the posterior pdf can be seen as a combination of a priori knowledge on the parameters (prior pdf), where a priori means that the observed data are not yet considered, and the information about the parameters contained in the measured data (likelihood function) (Glickman and Van Dyk, 2007). In the BGA method proposed by Kitanidis (1995), the prior pdf and the likelihood function are described by means of Gaussian distributions, and the best set of parameters s is obtained by maximizing the posterior pdf.

2.1.1 The likelihood function

The likelihood function L(y|s) in Eq. (1) characterizes the deviation between observed data and model results (Fienen et al., 2013). Starting from the results of the forward model, L(y|s) delineates how well a particular set of parameters s is able to reproduce the observations y in space and/or time, thus accounting for the epistemic errors. The investigated inverse problem presents different sources of errors that are related to the conceptual schematization of the inverse procedure, the numerical forward model, and the data measurement. In the likelihood function, the errors are assumed to be independent and identically distributed, with zero mean and covariance matrix expressed as follows:

\begin{matrix} (2) & R = σ_{R}^{2} I, \end{matrix}

where $σ_{R}^{2}$ denotes the variance that expresses the misfit between observed and modelled data, and I is the identity matrix.

2.1.2 The prior probability density function

The prior knowledge about s (p(s) in Eq. 1) is limited to the definition of a mean value (unknown and estimated during the procedure) and of a property describing the continuity and/or smoothness of the parameter field, expressed through a covariance function. It provides soft knowledge about the structure/shape of the unknowns and regularizes the solution; the prior pdf can also be used to enforce non-negativity on the parameters (D'Oria and Tanda, 2012). The prior mean is defined as:

\begin{matrix} (3) & E [s] = X β, \end{matrix}

where E is the expected value, β is the vector of drift coefficients, and X is a known matrix of basis functions. In our case, β is a single unknown scalar, but different drift coefficients can be used to introduce discontinuities in the stochastic function to be estimated (e.g. when the unknown parameters are likely to form distinct populations). For example, in the context of reverse flow routing problems, multiple values of β are adopted if more than one inflow hydrograph must be estimated at the same time (e.g. the inflow on both the upstream branches of a river confluence). The matrix of basis functions, X, links each unknown parameter with the corresponding element of β and, at the same time, specifies the model of the mean (e.g. constant mean, mean with a trend, etc.); in our case the mean is constant and therefore X is a single vector of ones (Fienen et al., 2008).
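
As an illustration of how X links the parameters to the drift coefficients, consider a hypothetical case with two inflow hydrographs, each discretized into only two discharge values (sizes chosen purely to keep the example small; the symbols β_1 and β_2 are introduced here for illustration). Under the constant-mean model described above, Eq. (3) would then read

\begin{matrix} E [s] = X β = \begin{bmatrix} 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} β_{1} \\ β_{2} \end{bmatrix} = \begin{bmatrix} β_{1} \\ β_{1} \\ β_{2} \\ β_{2} \end{bmatrix}, \end{matrix}

so that the first two parameters share the mean β_1 and the last two share the mean β_2. With a single hydrograph, X reduces to one column of ones and β to a scalar.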

The prior covariance matrix of the unknown parameters Q_ss is then defined as

\begin{matrix} (4) & Q_{ss} = E [(s - X β) {(s - X β)}^{T}] . \end{matrix}

In the context of geostatistics, the covariance matrix Q_ss is a function of the separation distance (in time in this case) between the parameters and describes their deviations from the mean behaviour. Different functions can be adopted to describe the covariance. For example, it can be assumed to be a linear function, represented through a limiting case of the exponential covariance function (Fienen et al., 2008) according to the following relation:

\begin{matrix} (5) & Q_{ss} (θ) = θ l \exp (- \frac{|d|}{l}), \end{matrix}

where d represents the vector of the separation times between all the parameter pairs ( $d_{i, j} = t_{i} - t_{j}$ , with i, j=1, …, N_p, t denoting the time associated with each parameter and N_p the total number of unknowns), l a fixed integral scale $(l = 10 \max d)$ , and θ the slope (structural parameter) that governs the correlation between the discharge values of the unknown hydrograph. A different formulation (D'Oria et al., 2014) defines the prior covariance matrix Q_ss by means of a Gaussian function characterized by two structural parameters ( $σ_{s}^{2}$ and l):

\begin{matrix} (6) & Q_{ss} (σ_{s}^{2}, l) = σ_{s}^{2} \exp (- \frac{|d^{2}|}{l^{2}}), \end{matrix}

where $σ_{s}^{2}$ denotes the variance. The linear function (Eq. 5) enforces only continuity to the solution, whereas the Gaussian function (Eq. 6) also adds a degree of smoothness, but the final solution is still driven by the observations.
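
The two covariance models of Eqs. (5) and (6) can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the function names, the 600 s time step, and the values of θ, $σ_{s}^{2}$ and l are hypothetical and are not taken from bgaPEST or from the test cases. The sketch also hints at why Eq. (5) is called linear: for |d| much smaller than l, θ l exp(−|d|/l) ≈ θ (l − |d|).

```python
import math

def separation_matrix(times):
    """Return d[i][j] = t_i - t_j for all parameter pairs."""
    return [[ti - tj for tj in times] for ti in times]

def linear_covariance(times, theta):
    """Eq. (5): theta * l * exp(-|d| / l), with the fixed scale l = 10 max|d|.

    Assumes at least two distinct times, so that l > 0.
    """
    d = separation_matrix(times)
    l = 10.0 * max(abs(v) for row in d for v in row)
    return [[theta * l * math.exp(-abs(v) / l) for v in row] for row in d], l

def gaussian_covariance(times, var_s, l):
    """Eq. (6): var_s * exp(-d^2 / l^2)."""
    d = separation_matrix(times)
    return [[var_s * math.exp(-(v * v) / (l * l)) for v in row] for row in d]

# Hypothetical example: four discharge values spaced 600 s apart.
times = [0.0, 600.0, 1200.0, 1800.0]
theta = 1.0  # assumed slope, arbitrary units
Q_lin, l_lin = linear_covariance(times, theta)
Q_gau = gaussian_covariance(times, var_s=100.0, l=900.0)

# For neighbouring parameters the exponential form is close to theta * (l - |d|).
print(Q_lin[0][1], theta * (l_lin - 600.0))
# The Gaussian covariance decays with the separation time.
print(Q_gau[0])
```

Both matrices are symmetric with a constant diagonal (θ l and $σ_{s}^{2}$, respectively); only the rate at which the off-diagonal terms decay differs between the two models.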

2.1.3 The posterior probability density function

With the assumptions made, the likelihood and prior terms that compose the posterior pdf of Eq. (1) can be rewritten as follows (D'Oria and Tanda, 2012; D'Oria et al., 2014; Fienen et al., 2009):

\begin{matrix} (7) & L (y | s) = \exp (- \frac{1}{2} {(y - h (s))}^{T} R^{- 1} (y - h (s))), \\ (8) & p (s) = \exp (- \frac{1}{2} {(s - X β)}^{T} Q_{ss}^{- 1} (s - X β)) . \end{matrix}

In the likelihood function, the term h(s) represents the modelled values in the same place and time as the available observations y. Therefore, to evaluate h(s), a forward model of the considered river reach that is able to describe the hydraulic routing process is required in order to provide the corresponding downstream water levels for a given set of parameters s.

Recalling that the aim of the inverse procedure is to obtain the vector of the unknown parameters s, as well as to quantify the uncertainty in the estimation, the solution is found by maximizing the posterior pdf or, more conveniently, minimizing its negative logarithm (objective function) (Fienen et al., 2013).
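
Written out from Eqs. (7) and (8), and up to additive constants that do not depend on s or β, the negative logarithm of the posterior pdf would take the form below (the symbol J for the objective function is introduced here only for illustration):

\begin{matrix} J (s, β) = \frac{1}{2} {(y - h (s))}^{T} R^{- 1} (y - h (s)) + \frac{1}{2} {(s - X β)}^{T} Q_{ss}^{- 1} (s - X β) . \end{matrix}

With $R = σ_{R}^{2} I$ from Eq. (2), the first term reduces to $\frac{1}{2 σ_{R}^{2}} {(y - h (s))}^{T} (y - h (s))$ , that is, a sum of squared misfits weighted by the inverse of the error variance, while the second term penalizes departures of s from the prior mean according to the covariance structure Q_ss.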

If a linear relationship between parameters and observations (linear forward model) holds, a computationally efficient method to find the best estimate $\hat{s}$ of vector s (and $\hat{β}$ of β) is obtained by introducing the vector $ξ = (H Q_{ss} H^{T} + R)^{- 1} (y - HX \hat{β})$ and solving the following linear system of equations (Fienen et al., 2009):

\begin{matrix} (9) & \begin{cases} \hat{s} = X \hat{β} + Q_{ss} H^{T} ξ \\ \begin{bmatrix} H Q_{ss} H^{T} + R & HX \\ X^{T} H^{T} & 0 \end{bmatrix} \begin{bmatrix} ξ \\ \hat{β} \end{bmatrix} = \begin{bmatrix} y \\ 0 \end{bmatrix}, \end{cases} \end{matrix}

where H is the sensitivity (Jacobian) matrix, representing how the observations y are influenced by each unknown parameter s_i (D'Oria et al., 2015). However, for the particular problem under investigation, h(s) is non-linear and matrix H therefore depends on s. Following the quasi-linear geostatistical approach (Kitanidis, 1995), the relationship between observations and parameters can be successively linearized about a candidate solution s_k, where k is the iteration index and $\tilde{H}_{k}$ denotes the sensitivity matrix evaluated at s_k:

\begin{matrix} (10) & h (s) \approx h (s_{k}) + {\tilde{H}}_{k} (s - s_{k}) . \end{matrix}

Then, a correction to the measurements is applied according to the following relation:

\begin{matrix} (11) & y_{k} = y - h (s_{k}) + {\tilde{H}}_{k} s_{k} . \end{matrix}
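
The linear system of Eq. (9) can be sketched for a toy problem as follows. This is a minimal, standard-library illustration and not the bgaPEST implementation: the function names, the small Gaussian-elimination solver, the two-point averaging matrix used as a stand-in for a linear forward model, and all numerical values are assumptions made for the example.

```python
import math

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def bga_linear_estimate(H, Q, X, sigma_r2, y):
    """Solve Eq. (9) for xi and beta_hat, then return (s_hat, beta_hat, xi)."""
    n_obs, n_beta = len(H), len(X[0])
    HQ = matmul(H, Q)
    A = matmul(HQ, transpose(H))           # H Q_ss H^T
    for i in range(n_obs):
        A[i][i] += sigma_r2                # + R, with R = sigma_R^2 I (Eq. 2)
    HX = matmul(H, X)
    HXt = transpose(HX)
    # Assemble the block matrix and right-hand side of Eq. (9).
    K = [A[i] + HX[i] for i in range(n_obs)]
    K += [HXt[j] + [0.0] * n_beta for j in range(n_beta)]
    sol = solve(K, list(y) + [0.0] * n_beta)
    xi, beta = sol[:n_obs], sol[n_obs:]
    QHt = transpose(HQ)                    # Q_ss H^T, since Q_ss is symmetric
    s_hat = [sum(X[i][j] * beta[j] for j in range(n_beta))
             + sum(QHt[i][m] * xi[m] for m in range(n_obs))
             for i in range(len(Q))]
    return s_hat, beta, xi

# Toy set-up (all values hypothetical): five unknown discharges, four observations.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
theta, l = 2.0, 10.0 * 4.0                 # Eq. (5) with l = 10 max d
Q = [[theta * l * math.exp(-abs(ti - tj) / l) for tj in times] for ti in times]
X = [[1.0] for _ in times]                 # constant mean: one column of ones
# Stand-in linear forward model: each observation averages two consecutive inflows.
H = [[0.5 if j in (i, i + 1) else 0.0 for j in range(5)] for i in range(4)]
s_true = [10.0, 30.0, 50.0, 30.0, 10.0]
y = [sum(h * s for h, s in zip(row, s_true)) for row in H]
sigma_r2 = 1e-4

s_hat, beta_hat, xi = bga_linear_estimate(H, Q, X, sigma_r2, y)
print(s_hat, beta_hat)
```

By construction, the estimate reproduces the data up to the term R ξ, since the first block row of Eq. (9) gives H ŝ = y − R ξ. In the quasi-linear setting, the same routine would presumably be called at every iteration with $\tilde{H}_{k}$ in place of H and the corrected measurements y_k of Eq. (11) in place of y, until the estimate stops changing.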

Figure 1. Definition of the reverse flood routing problem (a) and of the unknown parameters (b).
