HESS

Hydrology and Earth System Sciences

Hydrol. Earth Syst. Sci.

1607-7938

Copernicus Publications

Göttingen, Germany

10.5194/hess-21-2701-2017

Multivariate statistical modelling of compound events via pair-copula constructions: analysis of floods in Ravenna (Italy)

Bevacqua

Emanuele

emanuele.bevacqua@uni-graz.at

https://orcid.org/0000-0003-0472-5183

Maraun

Douglas

Hobæk Haff

Ingrid

Widmann

Martin

https://orcid.org/0000-0001-5447-5763

Vrac

Mathieu

1 Wegener Center for Climate and Global Change, University of Graz, Graz, Austria
2 Department of Mathematics, University of Oslo, Oslo, Norway
3 School of Geography, Earth and Environmental Sciences, University of Birmingham, Birmingham, UK
4 Laboratoire des Sciences du Climat et de l'Environnement, CNRS/IPSL, Gif-sur-Yvette, France

Correspondence: Emanuele Bevacqua (emanuele.bevacqua@uni-graz.at)

8 June 2017

Volume 21, issue 6, pages 2701–2723; 9 December 2016, 2 January 2017, 6 April 2017, 1 May 2017

This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this licence, visit https://creativecommons.org/licenses/by/3.0/

This article is available from https://hess.copernicus.org/articles/21/2701/2017/hess-21-2701-2017.html

The full text article is available as a PDF file from https://hess.copernicus.org/articles/21/2701/2017/hess-21-2701-2017.pdf

Compound events (CEs) are multivariate extreme events in which the individual contributing variables may not be extreme themselves, but their joint – dependent – occurrence causes an extreme impact. Conventional univariate statistical analysis cannot give accurate information regarding the multivariate nature of these events. We develop a conceptual model, implemented via pair-copula constructions, which allows for the quantification of the risk associated with compound events in present-day and future climate, as well as the uncertainty estimates around such risk. The model includes predictors, which could represent for instance meteorological processes that provide insight into both the involved physical mechanisms and the temporal variability of compound events. Moreover, this model enables multivariate statistical downscaling of compound events. Downscaling is required to extend the compound events' risk assessment to the past or future climate, where climate models either do not simulate realistic values of the local variables driving the events or do not simulate them at all. Based on the developed model, we study compound floods, i.e. joint storm surge and high river runoff, in Ravenna (Italy). To explicitly quantify the risk, we define the impact of compound floods as a function of sea and river levels. We use meteorological predictors to extend the analysis to the past, and get a more robust risk analysis. We quantify the uncertainties of the risk analysis, observing that they are very large due to the shortness of the available data, though this may also be the case in other studies where they have not been estimated. Ignoring the dependence between sea and river levels would result in an underestimation of risk; in particular, the expected return period of the highest compound flood observed increases from about 20 to 32 years when switching from the dependent to the independent case.

Introduction

On 6 February 2015, a low-pressure system that developed over the north of Spain moved across the island of Corsica into Italy. The low pressure itself (Fig. ) and the associated south-easterly winds drove a storm surge to the Adriatic coast at Ravenna (Italy). Alongside the storm surge, large amounts of precipitation fell in the surrounding area, causing high values of discharge in small rivers near the coast. These river discharges were partially obstructed from draining into the sea by the storm surge, which then contributed to major flooding along the coast.

Sea level pressure and total precipitation on 6 February 2015, when the coastal area of Ravenna (indicated by the yellow dot) was hit by a compound flood.

Such a compound flood is a typical example of a compound event (CE). CEs are multivariate extreme events in which the individual contributing variables may not be extreme themselves, but their joint – dependent – occurrence causes an extreme impact. The impact of CEs may be a climatic variable such as the gauge level (e.g. for compound floods), or other relevant variables such as fatalities or economic losses. CEs have received little attention so far, as underlined in the report of the Intergovernmental Panel on Climate Change on extreme events .

CEs are responsible for a very broad class of impacts on society. For example, heatwaves amplified by the lack of soil moisture, which reduces the latent cooling, may be classified as CEs. The impact of drought cannot be fully described by a single variable: analyses have been carried out which consider drought severity, duration, maximum deficit, as well as the affected area. Another example of a CE is a fluvial flood resulting from extreme rainfall occurring on a wet catchment.

In the recent literature, more attention has been given to the study of CEs through multivariate statistical methods, which can offer more in-depth information on the multivariate nature of CEs than conventional univariate analysis. Combinations of univariate analyses are sufficient for studying CEs only when no dependence exists among the compound variables. However, this is not usually the case, so relying on such combinations would lead to misleading conclusions about the risk associated with CEs.

Modelling CEs is a complex undertaking, and methods to adequately study them are required. Parametric multivariate statistical models allow one to constrain the dependencies between the contributing variables of CEs, as well as their marginal distributions. The parametric structure reduces the uncertainties of the statistical properties we want to estimate from the data, compared to empirical estimates. However, such a reduction of the uncertainties depends on the choice of a proper parametric model. As observed data are often limited, the uncertainties might be substantial and should thus be quantified.

Due to the complex dependence structure between the contributing variables, advanced multivariate statistical models are necessary to model CEs. For example, modelling the multivariate probability distribution of the contributing variables with multivariate Gaussian distributions would usually not produce satisfactory results. A multivariate Gaussian distribution assumes that the dependencies between all the pairs are of the same type (homogeneity of the pair dependencies), with no dependence between extreme events, also called tail dependence. Furthermore, a multivariate Gaussian distribution assumes that all of the marginal distributions are Gaussian. To overcome these limitations, the use of copulas has been introduced in geophysics and climate science. Through copulas, it is possible to model the dependence structure of variables separately from their marginal distributions. However, multivariate parametric copulas lack flexibility when modelling systems with high dimensionality, where heterogeneous dependencies exist among the different pairs. This lack of flexibility would therefore be a limitation for many types of compound events. Pair-copula constructions (PCCs) decompose the dependence structure into bivariate copulas (some of which are conditional) and give greater flexibility in modelling generic high-dimensional systems compared to multivariate parametric copulas.

Here we develop a multivariate statistical model, based on PCCs, which allows for an adequate description of the dependencies between the contributing variables. The model provides a straightforward quantification of risk uncertainty, which is reduced with respect to the uncertainties obtained when computing the risk directly on the observed data of the impact. We extend the multivariate statistical model by including predictors for the contributing variables. Such predictors could represent for instance meteorological processes driving the contributing variables. This increase in complexity of the model due to additional variables is accommodated for through the use of PCCs. The predictors allow us to (1) gain insight into the physical processes underlying CEs, as well as into the temporal variability of CEs, and (2) to statistically downscale CEs and their impacts. Downscaling may be used to statistically extend the risk assessment back in time to periods where observations of the predictors but not of the contributing variables and impacts are available, or to assess potential future changes in CEs based on climate models. Based on this model, we study compound flooding in Ravenna.

In the context of compound floods, the dependence between rainfall and sea level has previously been studied for other regions e.g.. Among these studies, observed an increase in the risk of compound flooding in major US cities driven by an increasing dependence between storm surges and extreme rainfall. The impact of compound floods can be described as the gauge level in a river near the coast, which is driven both by the river discharge upstream and the sea level. Only a few studies have explicitly quantified the impact of compound floods and the associated risks . The reason might be difficulties in quantifying the impact due to a lack of data. For the Rotterdam case study, the impact has been explicitly quantified . However, there is still debate as to whether the floods in this case are actually CEs, i.e. if surges and discharges can be treated independently or not when assessing the risk of flooding. As discussed in , a significant dependence is more likely in small catchments, such as those in mountainous areas by the coast, which have a quick response time to rainfall that may favour the coincidence of high river flows and storm surges driven by the same synoptic weather system.

Here, we explicitly define the impact of compound floods as a function of sea and river levels in order to quantify the flood risk and its related uncertainties. Moreover, we quantify the risk underestimation that occurs when the dependence among sea and river levels is not considered. We identify the meteorological predictors driving the river and sea levels. By incorporating such predictors into the statistical model, we extend the analysis of compound floods into the past, where data are available for predictors but not for the river and sea level stations.

The paper is organized as follows. The Ravenna case study is discussed in Sect. . We introduce the conceptual model for compound events in Sect. . Pair-copula constructions, i.e. the mathematical method we use to implement the model, are introduced in Sect. . Based on the presented conceptual model for compound events, in Sect. we develop the model for compound floods in Ravenna. Results are presented in Sect. ; discussion and conclusions are provided in Sect. . More technical details can be found in the Appendices.

Compound flooding in the coastal area of Ravenna

In this study, we focus on the risk of compound floods in the coastal area of Ravenna. The choice of the case study was motivated by the extreme event that happened on 6 February 2015, as presented in the Introduction. On the day prior to the event, values of up to approximately 80 mm of rain were recorded in the surrounding area of Ravenna, and around 90 mm on the day of the event itself. The sea level recorded was the highest observed in the last 18 years . The high risk of flooding to the population in the Ravenna region has been underlined by the LIFE PRIMES project , recently financed by the European Commission, whose target is “to reduce the damages caused to the territory and population by events such as floods and storm surges” in Ravenna and its surrounding areas. As pointed out by , natural and anthropogenic subsidences represent a threat to the coastal area of Ravenna, characterized by land elevations which are in many places below 2 m above mean sea level . The sea level inundation risk along the coast of Ravenna has recently been studied by , who considered the joint effect of seawater level and significant wave height.

A schematic representation of the catchment on which we focus is shown in the black rectangle of Fig. . The Y variables, river and sea levels, represent the contributing variables, and the water level h is the impact of the compound flood. The X variables are meteorological predictors of the contributing variables Y, which will be discussed in more detail later.

Hydraulic system for the Ravenna catchment. The area affected by compound floods is marked by the red point. The impact is the water level h, which is influenced by the contributing variables Y, i.e. sea and river levels. The variables inside the black rectangle are used to develop the three-dimensional (unconditional) model. The X are the meteorological predictors driving the contributing variables Y, which are incorporated into the five-dimensional (conditional) model.

We develop a multivariate statistical model able to assess the risk of compound floods in Ravenna. Our research objectives are the following.

Develop a statistical model to represent the dependencies between the contributing variables of the compound floods, via pair-copula constructions.

Explicitly define the impact of compound floods as a function of the contributing variables. This allows us to estimate the risk and the related uncertainty.

Identify the meteorological predictors for the contributing variables Y. Incorporate the meteorological predictors into the model to gain insight into the physical mechanisms driving the compound floods and into their temporal variability.

Extend the analysis into the past (where data are available for the predictors but not for the contributing variables Y).

Dataset

The data used here for the contributing variables Y and the impact h are water levels at a daily resolution (daily averages of hourly measurements). We use data for the extended winter season (November–March) of the period 2009–2015. Data sources are the Italian National Institute for Environmental Protection and Research (ISPRA) for the sea, and Arpae Emilia-Romagna for rivers and impact. River data were processed in order to mask periods of low quality, i.e. those suspected of being influenced by human activities such as the use of a dam. Moreover, we applied a procedure to homogenize the data of the rivers; details are given in Appendix . We do not filter out the astronomical tide component of the sea level, considering that the range of variation of the daily average of sea level is about 1 m, while that of the astronomical tide is about 9 cm. To check this, we used the astronomical tide obtained through FES2012, which is software produced by Noveltis, Legos and the CLS Space Oceanography Division and distributed by Aviso, with support from Cnes (http://www.aviso.altimetry.fr/). Meteorological predictors were obtained from the ECMWF ERA-Interim reanalysis dataset (covering the period 1979–2015, with a resolution of 0.75° × 0.75°). Specifically, for the river predictors we use daily data (sum of 12-hourly values) of total precipitation, evaporation, snowmelt and snowfall, while for the sea level predictor we use daily data (average of 6-hourly values) of sea level pressure.

Conceptual conditional model for compound events

define a CE as “an extreme impact that depends on multiple statistically dependent variables or events”. This definition stresses the extremeness of the impact rather than that of the individual contributing variables, which may not be extreme themselves, and the importance of the dependence between these contributing variables. The physical reasons for the dependence among the contributing variables can be different. There can be a mutual reinforcement of one variable by the other and vice versa due to system feedbacks, e.g. the mutual enhancement of droughts and heatwaves in transitional regions between dry and wet climates . Or the probability of occurrence of the contributing variables can be influenced by a large-scale weather condition, as has occurred in Ravenna (Fig. ), where the low-pressure system caused coinciding extremes of river runoff and sea level. It is clear then that the dependence among the contributing variables represents a fundamental aspect of compound events, and so it must be properly modelled to represent these extreme events well.

Our statistical conditional model consists of three components: the contributing variables $Y_i$, including a model of their dependence structure, the impact h, and predictors $X_j$ of the contributing variables. The contributing variables $Y_i$ and their multivariate dependence structure drive the CE. For instance, in the case of compound floods, the contributing variables are runoff and sea level. The impact h of a CE can be formalized via an impact function $h = h(Y_1, \ldots, Y_n)$. In the case of compound flooding, we define the river gauge level in Ravenna as the impact, but in principle it can be any measurable variable such as agricultural yield or economic loss. The predictors $X_j$ provide insight into the physical processes underlying CEs, including the temporal variability of CEs, and can be used to statistically downscale CEs when the variables Y and the impact h are available.

The downscaling feature is particularly useful for compound events, which are not realistically simulated or may not even be simulated at all by available climate models. For instance, standard global and regional climate models do not simulate realistic runoff , and do not simulate sea surges. Here, our model can be used to downscale these contributing variables, e.g. from simulated large-scale meteorological predictors. In particular, the model provides a simultaneous, i.e. multivariate, downscaling of the contributing variables Yi, which allows for a realistic representation both of the dependencies between the Yi and of their marginal distributions. This is relevant because a separate downscaling of the contributing variables Yi may lead to unrealistic representations of the dependencies between the Yi, which in turn would cause a poor estimation of the impact h. The downscaling feature can be useful for extending the risk analysis into the past, where observations of the predictors but not of the contributing variables and impacts are available.

More specifically, the conceptual conditional model consists of the following.

1. An impact function to quantify the impact h: $h = h(Y_1, \ldots, Y_n)$.

2. Predictors X for the contributing variables Y.

3. A conditional joint probability density function (pdf) $f_{Y|X}(Y|X)$ of the contributing variables Y, given the predictors X (which we describe through a parametric model, via pair-copula constructions). In particular, both the contributing variables Y and predictors X are time dependent, i.e. $Y = Y(t)$ and $X = X(t)$.

A particular type of such a model is obtained when the predictors are not considered in the joint pdf, i.e. when considering $f_Y(Y)$. This unconditional model does not allow for changes in the contributing variables Y and in the impact due to variations of the predictors X. In general, formalizing the impact h of a CE as in step 1 – to then assess the risk of CE based on values of h – corresponds to the structural approach , which has recently been formalized in . Here, the advantage of the general model we propose is that it allows for taking into account variations of the impact h driven by temporal changes in the predictors X. Through the conditional pdf, the model allows for a realistic representation both of the dependencies between the $Y_i$ and of their marginal distributions.

When the variables Y are available but not the impact h, the model can still be used to only estimate the variables Y. This may be useful when assessing the risk of CEs through e.g. multivariate return periods of the contributing variables Y. Moreover, it may happen that the impact h is available but the variables Y are not. In this case the model may still be used in the form $f_{h|X}(h|X)$ to directly estimate the impact h, based on the conditional joint pdf of the impact h, given the predictors X. In this case, depending on the physical system, it may be more or less complicated to calibrate the predictors. Also, we observe that Eq. (1) is general, and a possibility for estimating the impact would be to use the conditional joint pdf $f_{h|Y}(h|Y)$. Such an approach may be useful for cases where complex relations exist between the impact h and the variables Y, and therefore it may be difficult to implement e.g. a proper regression model to describe the impact h.

An advantage of using a parametric statistical model is that this constrains the dependencies between the contributing variables, as well as their marginal distributions, and thereby reduces their uncertainties with respect to empirical estimates . Such a reduction in turn reduces the uncertainty in the estimated physical quantity of interest, like the impact of the CE. However, the uncertainty reduction depends on the choice of a proper parametric model, in particular when modelling the tail of a univariate or multivariate distribution.

Statistical method

Pair-copula constructions (PCCs) are mathematical decompositions of multivariate pdfs proposed by , which allow for the modelling of multivariate dependencies with high flexibility. We start by presenting the concept of copulas, and then we introduce PCCs. More technical details can be found in the Appendices.

Copulas

Consider a vector $Y = (Y_1, \ldots, Y_n)$ of random variables, with marginal pdfs $f_1(y_1), \ldots, f_n(y_n)$ and marginal cumulative distribution functions (CDFs) $F_1(y_1), \ldots, F_n(y_n)$, defined on $\mathbb{R} \cup \{-\infty, \infty\}$. We use the recurring definition $u_i := F_i(y_i)$, where the name u indicates that these variables are uniformly distributed by construction. According to Sklar's theorem the joint CDF $F(y_1, \ldots, y_n)$ can be written as

$$F(y_1, \ldots, y_n) = C(u_1, \ldots, u_n),$$

where C is an n-dimensional copula. C is a copula if $C: [0,1]^n \to [0,1]$ is a joint CDF of an n-dimensional random vector on the unit cube $[0,1]^n$ with uniform marginals.

Under the assumption that the marginal distributions $F_i$ are continuous, the copula C is unique and the multivariate pdf can be decomposed as

$$f(y_1, \ldots, y_n) = f_1(y_1) \cdot \ldots \cdot f_n(y_n) \cdot c(u_1, \ldots, u_n),$$

where c is the copula density. This decomposition explicitly represents the pdf as a product of the marginal pdfs and the copula density, which describes the dependence among the variables independently of their marginals. It has an important practical consequence: it allows us to generate a large number of joint pdfs. In fact, by inserting any existing family for the marginal pdfs and for the copula density into the decomposition, it is possible to construct a valid joint pdf, provided that suitable constraints are satisfied. The group of the existing parametric families of multivariate distributions (e.g. the multivariate normal distribution, which has normal marginals and a normal copula) is only a part of the realizations which are possible in this way. Copulas therefore make it easy to construct a wide range of multivariate parametric distributions.
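As a minimal illustration of this decomposition, the following Python sketch builds a bivariate joint pdf from two different marginals and a Gaussian copula density. The choice of marginals (exponential and normal), the Gaussian copula and all function names are assumptions made only for this example; they are not the families fitted in this study.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used by the Gaussian copula


def gaussian_copula_density(u1, u2, rho):
    """Density c(u1, u2) of the bivariate Gaussian copula (0 < u1, u2 < 1)."""
    z1 = STD_NORMAL.inv_cdf(u1)
    z2 = STD_NORMAL.inv_cdf(u2)
    det = 1.0 - rho ** 2
    exponent = -(rho ** 2 * (z1 ** 2 + z2 ** 2) - 2.0 * rho * z1 * z2) / (2.0 * det)
    return math.exp(exponent) / math.sqrt(det)


def exponential_pdf(y, rate):
    return rate * math.exp(-rate * y) if y >= 0 else 0.0


def exponential_cdf(y, rate):
    return 1.0 - math.exp(-rate * y) if y >= 0 else 0.0


def joint_pdf(y1, y2, rho, rate=1.0, marginal2=NormalDist(0.0, 1.0)):
    """f(y1, y2) = f1(y1) * f2(y2) * c(u1, u2), with u_i = F_i(y_i).

    Illustrative marginals: Y1 exponential (requires y1 > 0), Y2 normal.
    """
    u1 = exponential_cdf(y1, rate)
    u2 = marginal2.cdf(y2)
    return (exponential_pdf(y1, rate) * marginal2.pdf(y2)
            * gaussian_copula_density(u1, u2, rho))
```

With rho = 0 the copula density equals 1 everywhere, so the joint pdf reduces to the product of the marginals, i.e. the independent case.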

Tail dependence

The dependence of extreme events cannot be measured by overall correlation coefficients such as Pearson, Spearman or Kendall. Given two random variables which are uncorrelated according to such overall dependence coefficients, there can be a significant probability of getting concurrent extremes of both variables, i.e. a tail dependence . On the contrary, two random variables which are correlated according to an overall dependence coefficient may not necessarily be tail dependent.

Mathematically, two random variables $Y_1$ and $Y_2$ with marginal CDFs $F_1$ and $F_2$ respectively are upper tail dependent if the following limit exists and is non-zero:

$$\lambda_U(Y_1, Y_2) = \lim_{u \to 1} P\left(Y_2 > F_2^{-1}(u) \mid Y_1 > F_1^{-1}(u)\right),$$

where $P(A \mid B)$ indicates the generic conditional probability of occurrence of the event A given the event B. Similarly, the two variables are lower tail dependent if

$$\lambda_L(Y_1, Y_2) = \lim_{u \to 0} P\left(Y_2 < F_2^{-1}(u) \mid Y_1 < F_1^{-1}(u)\right)$$

exists and is non-zero.
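The limit above cannot be evaluated on a finite sample, but the conditional probability inside it can be estimated at a fixed level u below 1. The sketch below is one simple way to do so in Python, using rank-based pseudo-observations in place of the unknown marginal CDFs. The function names are hypothetical, ties are not treated specially, and this is not presented as the estimator used in this study.

```python
def pseudo_observations(values):
    """Map a sample to (0, 1) through its ranks: u_i = rank_i / (n + 1)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u


def empirical_upper_tail(y1, y2, u):
    """Estimate P(Y2 > F2^{-1}(u) | Y1 > F1^{-1}(u)) at a finite level u."""
    u1 = pseudo_observations(y1)
    u2 = pseudo_observations(y2)
    exceed_1 = [i for i in range(len(u1)) if u1[i] > u]
    if not exceed_1:
        raise ValueError("no observations of Y1 above the chosen level u")
    joint = sum(1 for i in exceed_1 if u2[i] > u)
    return joint / len(exceed_1)
```

In practice one would inspect how this estimate behaves as u approaches 1; with short records the number of joint exceedances becomes very small, so the estimate is highly uncertain.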

Pair-copula constructions (PCCs)

While the number of bivariate copula families is very large , building higher-dimensional copulas is generally recognized as a difficult problem . As a consequence, the set of copula families having a dimension greater than or equal to 3 is rather limited, and they lack flexibility in modelling multivariate pdfs where heterogeneous dependencies exist among different pairs. For instance, they usually prescribe that all the pairs have the same type of dependence, e.g. they are either all tail dependent or all not tail dependent. Under the assumption that the joint CDF is absolutely continuous, with strictly increasing marginal CDFs, PCCs allow us to mathematically decompose an n-dimensional copula density into the product of n(n-1)/2 bivariate copulas, some of which are conditional. In practice, this provides high flexibility in building high-dimensional copulas. PCCs allow for the independent selection of the pair-copulas among the large set of families, providing higher flexibility in building high-dimensional joint pdfs with respect to using the existing multivariate parametric copulas .

When the dimension of the pdf is large, there can be many possible, mathematically equally valid decompositions of the copula density into a PCC. For example, for a five-dimensional system there are 480 possible different decompositions. For this reason, have introduced the regular vine, a graphical model which helps to organize the possible decompositions. This is helpful for choosing which PCC to use to decompose the multivariate copula. In this study we concentrate on the subcategories canonical (also known as C-vine) and D-vine of regular vines. Out of the 480 possible decompositions for a five-dimensional copula density, 240 are regular vines (60 C-vines, 60 D-vines and 120 other types of vines) . The decomposition we selected for the conditional model is the following D-vine:

$$
\begin{aligned}
f_{12345}(y_1, y_2, y_3, y_4, y_5) ={}& f_4(y_4) \cdot f_5(y_5) \cdot f_3(y_3) \cdot f_1(y_1) \cdot f_2(y_2) \\
&\cdot c_{45}(u_4, u_5) \cdot c_{53}(u_5, u_3) \cdot c_{31}(u_3, u_1) \cdot c_{12}(u_1, u_2) \\
&\cdot c_{43|5}(u_{4|5}, u_{3|5}) \cdot c_{51|3}(u_{5|3}, u_{1|3}) \cdot c_{32|1}(u_{3|1}, u_{2|1}) \\
&\cdot c_{41|35}(u_{4|53}, u_{1|53}) \cdot c_{52|13}(u_{5|31}, u_{2|31}) \\
&\cdot c_{42|135}(u_{4|513}, u_{2|513}),
\end{aligned}
$$

where $(Y_1, Y_2, Y_3)$ are the variables $(Y_1^{\mathrm{Sea}}, Y_2^{\mathrm{River}}, Y_3^{\mathrm{River}})$, and $(Y_4, Y_5)$ are the predictors $(X_1^{\mathrm{Sea}}, X_{23}^{\mathrm{Rivers}})$ (details about the predictors are given in the next section). Details about the selection procedure of the vine (Eq. ) are given in Appendices and , while the graphical representation of this vine is shown in Fig. A (Appendix ).
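The structure of a D-vine follows mechanically from the ordering of the variables: in the first tree, neighbouring variables are paired, and in each subsequent tree the paired variables are one position further apart and conditioned on the variables lying between them. The following Python sketch (function names are ours, for illustration only) enumerates the pair-copulas for the ordering 4, 5, 3, 1, 2 used above, and counts the distinct D-vines as n!/2, since an ordering and its reverse define the same D-vine.

```python
from math import factorial


def d_vine_pair_copulas(order):
    """List (a, b, conditioning) for every pair-copula of a D-vine.

    In tree j, the i-th edge couples order[i] and order[i + j], conditional
    on the variables lying between them in the ordering.
    """
    n = len(order)
    pairs = []
    for tree in range(1, n):
        for i in range(n - tree):
            a, b = order[i], order[i + tree]
            conditioning = tuple(order[i + 1:i + tree])
            pairs.append((a, b, conditioning))
    return pairs


def label(a, b, conditioning):
    """Compact label such as 'c43|5' for a pair-copula."""
    cond = "|" + "".join(str(v) for v in conditioning) if conditioning else ""
    return f"c{a}{b}{cond}"


def count_d_vines(n):
    """Number of distinct D-vines on n variables (orderings up to reversal)."""
    return factorial(n) // 2


pairs = d_vine_pair_copulas([4, 5, 3, 1, 2])
print([label(*p) for p in pairs])
```

The ten labels produced correspond to the ten pair-copulas of the decomposition above, i.e. n(n-1)/2 with n = 5; the conditioning variables are the same sets, listed in the order in which they appear in the vine ordering.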

As described in Sect. , the conditional model is based on the conditional joint pdf $f_{Y|X}(Y|X)$, which is decomposed via PCC. Details regarding conditional joint pdfs decomposed as C- or D-vines (including the developed algorithms for sampling from such vines) are presented in Appendix . Moreover, the developed routines for working with conditional vines are publicly available via the CDVineCopulaConditional R package . More details about vines and the decompositions used for the unconditional model are given in Appendix . Details regarding the statistical inference of the joint pdf can be found in Appendix .

Model development

The extreme impact of compound events may be driven by the joint occurrence of non-extreme contributing variables . This is the case for compound floods in Ravenna, where not all extreme values of the impact would be considered when selecting only extreme values of the contributing variables. Therefore we model the contributing variables, without focusing only on their extreme values. Below we show the steps we follow to study compound floods in Ravenna, based on the conceptual model described in Sect. . We will go through these steps in detail in the next sections.

1. Define the impact function: $h = h(Y_1^{\mathrm{Sea}}, Y_2^{\mathrm{River}}, Y_3^{\mathrm{River}})$. The contributing variables Y (sea and river levels) and the impact are shown in the black rectangle of Fig. .

2. Find the meteorological predictors of the contributing variables Y. For each variable $Y_i$ we found more than one meteorological predictor, which we aggregated into a single variable $X_i$. We refer to this variable as the predictor $X_i$ of the variable $Y_i$ from now on. Moreover, we use the same predictor for the two river levels because they are driven by a similar meteorological influence. The predictors are graphically shown in Fig. , where we introduce $X_1^{\mathrm{Sea}}$ (the predictor of $Y_1^{\mathrm{Sea}}$) and $X_{23}^{\mathrm{Rivers}}$ (the predictor of $Y_2^{\mathrm{River}}$ and $Y_3^{\mathrm{River}}$).

3. Fit the five-dimensional conditional joint pdf $f_{Y|X}(Y_1^{\mathrm{Sea}}, Y_2^{\mathrm{River}}, Y_3^{\mathrm{River}} \mid X_1^{\mathrm{Sea}}, X_{23}^{\mathrm{Rivers}})$ of the conditional model (modelled via PCC). To develop the unconditional model, we fit the three-dimensional pdf $f_Y(Y_1^{\mathrm{Sea}}, Y_2^{\mathrm{River}}, Y_3^{\mathrm{River}})$, which includes only the contributing variables Y inside the black rectangle of Fig. . The time series of the contributing variables have significant serial correlations, and this should be considered in order to avoid underestimating the risk uncertainties (see Appendix and Fig. ). Only for the unconditional model did we explicitly model such serial correlations by combining the PCC with autoregressive AR(1) models (see Appendix ).

4. Given the complexity of the problem, an analytical derivation of the statistical properties of the impact is impracticable. Therefore, we apply a Monte Carlo procedure. Specifically, we simulate the contributing variables Y from the fitted models, and then we define the simulated values of h via the impact function as $h_{\mathrm{sim}} := h(Y_{1,\mathrm{sim}}^{\mathrm{Sea}}, Y_{2,\mathrm{sim}}^{\mathrm{River}}, Y_{3,\mathrm{sim}}^{\mathrm{River}})$, where $Y_{\mathrm{sim}}$ are the simulated values of Y.

5. Perform a statistical analysis of the values $h_{\mathrm{sim}}$. To assess the risk associated with the events, we compute the return levels of h by fitting a generalized extreme value (GEV) distribution to annual maximum values (defined over the period November–March). We compute the model uncertainties, which is straightforward through such models. In practice, such uncertainties propagate through to the risk assessment, and so they must be considered (details about model-based return level uncertainty are given in Appendix ).

To make the Monte Carlo uncertainties, i.e. the sampling uncertainties due to the model simulations, negligible, we produce long simulations. For example, to obtain the model-based return level curve, we simulate a time series $h_{\mathrm{sim}}(t)$ of length equal to 200 times the length of the observed data (6 years). From this we get a time series of 1200 annual maximum values, to which we fit the GEV distribution to get the return level. Observation-based return levels are obtained by fitting a GEV to annual maximum values of $h_{\mathrm{obs}}$. The relative uncertainties are computed by propagating the parameter uncertainties of the fitted GEV distribution (more details are given at the end of Appendix ).
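The last two steps of this procedure, extracting seasonal maxima from a long simulated series and converting fitted GEV parameters into a return level, can be sketched in Python as follows. The sketch assumes the standard GEV parameterization with location mu, scale sigma and shape xi; the fitting of these parameters is not shown, and the function names and the season length argument are illustrative assumptions rather than the routines used in this study.

```python
import math


def annual_maxima(series, season_length):
    """Split a simulated series into consecutive seasons and keep each maximum."""
    n_seasons = len(series) // season_length
    return [max(series[k * season_length:(k + 1) * season_length])
            for k in range(n_seasons)]


def gev_return_level(mu, sigma, xi, return_period):
    """Level exceeded by the annual maximum with probability 1/return_period.

    Obtained by inverting the GEV CDF at probability 1 - 1/return_period.
    """
    y = -math.log(1.0 - 1.0 / return_period)
    if abs(xi) < 1e-12:  # Gumbel limit of the GEV family
        return mu - sigma * math.log(y)
    return mu - (sigma / xi) * (1.0 - y ** (-xi))
```

Applied to a simulation 200 times longer than the 6-year record, `annual_maxima` would return the 1200 seasonal maxima mentioned above, to which the GEV would then be fitted.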

Impact function

The water level h is influenced by river ($Y_2^{\mathrm{River}}$ and $Y_3^{\mathrm{River}}$) and sea ($Y_1^{\mathrm{Sea}}$) levels (Fig. ). We describe this influence through the following multiple regression model:

$$h = a_1 Y_1^{\mathrm{Sea}} + a_{21} Y_2^{\mathrm{River}} + a_{22} \left(Y_2^{\mathrm{River}}\right)^2 + a_{31} Y_3^{\mathrm{River}} + a_{32} \left(Y_3^{\mathrm{River}}\right)^2 + c + \eta_h(0, \sigma_h),$$

where $\eta_h(0, \sigma_h)$ is a Gaussian distributed noise having a standard deviation equal to $\sigma_h$. The contribution of the rivers to the impact h is expressed via quadratic polynomials, which guarantees a better fit of the model according to the Akaike information criterion (AIC). In particular, we defined the regression model as the best output of both a forward and backward selection procedure, considering linear and quadratic terms for all of the Y as candidate variables. The Q–Q plot of the model, i.e. the plot of the quantiles of observed values against those of the mean predicted values from the model, is shown in Fig. . The points are located along the line y=x, which indicates that the model is satisfactory. Omitting one of the variables as a predictor reduces the model performance, underlining the compound nature of the impact h. The sum of the relative contributions of the rivers is very similar to that of the sea. The parameters of this model (and of those in Sect. ) were estimated according to the maximum likelihood approach, solved by QR decomposition (via the lm function of the stats R package – ).
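To make the structure of the regression model concrete, the sketch below evaluates it in Python for given sea and river levels, optionally adding the Gaussian noise term. The coefficient values are purely hypothetical placeholders chosen for the example; they are not the fitted values of this study, which were estimated in R.

```python
import random

# Hypothetical coefficients, for illustration only (not the fitted values).
HYPOTHETICAL_COEF = {"a1": 0.5, "a21": 0.2, "a22": 0.1,
                     "a31": 0.3, "a32": 0.05, "c": 0.1}


def impact(y_sea, y_river2, y_river3, coef, sigma_h=0.0, rng=None):
    """Water level h: linear in the sea level, quadratic in each river level.

    With sigma_h = 0 the mean prediction is returned; otherwise Gaussian
    noise with standard deviation sigma_h is added.
    """
    mean = (coef["a1"] * y_sea
            + coef["a21"] * y_river2 + coef["a22"] * y_river2 ** 2
            + coef["a31"] * y_river3 + coef["a32"] * y_river3 ** 2
            + coef["c"])
    if sigma_h == 0.0:
        return mean
    rng = rng or random
    return mean + rng.gauss(0.0, sigma_h)
```

In a Monte Carlo setting, this function would be applied to each simulated triple of contributing variables to obtain the simulated impact; passing a seeded `random.Random` instance as `rng` makes the noise reproducible.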

Q–Q plot between the observed impact (x-axis) and the modelled impact (y-axis) from the regression model (Eq. ).

Meteorological predictor selection

Figure shows the resulting scatter plots of observed predictands (Yobs) and selected observed predictors (Xobs). To fit the joint pdf of the conditional model, we use all time steps where data for all of the X and Y variables have been recorded. However, we calibrate the predictors of rivers and sea separately, so we use all available data for each Y variable (during the period November–March). The procedure we use to identify the meteorological predictors is shown below.

Scatter plots of predictands Yobs and predictors Xobs. The numbers are Spearman correlation coefficients. The red lines (computed via LOWESS, i.e. locally weighted scatter-plot smoothing) are shown to better visualize the relationship between pairs.

River levels

The meteorological influence on the two rivers Y2River and Y3River is very similar because their catchments are small and close by (as a consequence the Spearman correlation between the rivers is high, i.e. 0.79). Therefore we use the same predictor for the two river levels.

The river levels are influenced by the total input of water over the catchments, which is given by the positive contribution of precipitation and snowmelt, and by evaporation which results in a reduction of the river runoff. Specifically, we compute the input of water w on the day $t^*$ over the river catchments (one grid point) as

$$w(t^*) = P_{\mathrm{total}}(t^*) - E(t^*) + S_{\mathrm{melt}}(t^*) - S_{\mathrm{fall}}(t^*),$$

where $P_{\mathrm{total}}$ is the total precipitation, $E$ is the evaporation, $S_{\mathrm{melt}}$ is the snowmelt and $S_{\mathrm{fall}}$ is the snowfall. The snowfall accounts for the fraction of precipitation which does not immediately contribute to the input of water over the catchments because of its solid state. While a fraction of the water input over the catchment rapidly reaches the rivers as surface runoff, another fraction infiltrates the ground and contributes only later to the river discharge. Compared with the first fraction, the second has a slower response to precipitation and changes more gradually over time. This double effect underlines the compound nature of river runoff, whose response to precipitation falling at a given time is higher if in the previous period additional precipitation fell in the river catchment. To consider both of these effects, we define the river predictor as

$$X_{23\mathrm{Rivers}}(t) = a_R \sum_{t^*=t-1}^{t} w(t^*) + b_R \sum_{t^*=t-10}^{t} w(t^*) + c_R,$$

where $c_R$ is a constant. We choose the parameters of Eq. () by fitting the right-hand side of this equation to the river contributions to the impact, i.e. $Y_{23\mathrm{Rivers}} := a_{21} Y_{2\mathrm{River}} + a_{22} Y_{2\mathrm{River}}^2 + a_{31} Y_{3\mathrm{River}} + a_{32} Y_{3\mathrm{River}}^2$ (see Eq. ). The lags n=1 and n=10 days are those which maximize respectively the upper tail dependence and the Spearman correlation between $Y_{23\mathrm{Rivers}}(t)$ and the cumulated w over the previous n days, i.e. $\sum_{t^*=t-n}^{t} w(t^*)$. Here, we use the upper tail dependence to get the typical river response time to the fraction of water which directly flows into the rivers as surface runoff. Similarly, the Spearman correlation is used to get the typical time required for the infiltrated water in the ground to flow into the rivers.
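The construction of the river predictor can be sketched in a few lines of Python. This is only an illustration of the two cumulated sums (lags of 1 and 10 days) described above; the coefficient values passed to the function are placeholders to be obtained from the regression, and the choice of returning `None` for days without a full 10-day history is an assumption of this sketch.

```python
def water_input(p_total, evap, s_melt, s_fall):
    """Daily water input w = P_total - E + S_melt - S_fall (one value per day)."""
    return [p - e + m - f for p, e, m, f in zip(p_total, evap, s_melt, s_fall)]


def river_predictor(w, a_r, b_r, c_r, short_lag=1, long_lag=10):
    """X(t) = a_r * sum_{t-short_lag..t} w + b_r * sum_{t-long_lag..t} w + c_r.

    Days without a complete long_lag history are returned as None
    (an assumption of this sketch, not a choice documented in the text).
    """
    predictor = []
    for t in range(len(w)):
        if t < long_lag:
            predictor.append(None)
            continue
        short_sum = sum(w[t - short_lag:t + 1])  # fast response (surface runoff)
        long_sum = sum(w[t - long_lag:t + 1])    # slow response (infiltrated water)
        predictor.append(a_r * short_sum + b_r * long_sum + c_r)
    return predictor
```

With a constant daily input of 1 and coefficients a_r = b_r = 1, c_r = 0, the predictor on day 10 equals 2 + 11 = 13, since both sums include the current day.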

By defining the river predictor as in Eq. (), we aggregate the different meteorological drivers of the rivers in the single predictor $X_{23\mathrm{Rivers}}(t)$. Such aggregation allows for a simplification of the system describing the compound floods, due to a reduction of the involved variables. Furthermore, this reduces the number of variables described by the joint pdf $f_{Y,X}(Y,X)$, whose numerical implementation errors can potentially increase with higher dimensionality.

All of the terms involved in the multiple regression model (Eq. ) are statistically significant at level $\alpha = 2\times10^{-16}$. Moreover, the quality of the river predictor $X_{23\mathrm{Rivers}}$ improves (according to the likelihood and to the Spearman correlation between $X_{23\mathrm{Rivers}}$ and $Y_{23\mathrm{Rivers}}$) when we use all of the terms in Eq. (), instead of only $P_{\mathrm{total}}(t^*)$. The presence of more terms in Eq. () does not increase the number of model parameters.

Sea level

Sea level can be modelled as the superposition of the barometric pressure effect, i.e. the pressure exerted by the atmospheric weight on the water, the wind-induced surge, and an overall annual cycle. As for the river predictor, we aggregate the different physical contributions in a single predictor. We define the sea level predictor on day t as

$$X_{1\mathrm{Sea}}(t) = a_S\,\mathrm{SLP}_{\mathrm{Ravenna}}(t) + b_S\,\mathbf{SLP}(t)\cdot\mathbf{R}_{\mathrm{MAP}} + c_S \sin(\omega_{\mathrm{Year}} t + \phi) + d_S,$$

where $\mathrm{SLP}_{\mathrm{Ravenna}}$ is the sea level pressure in Ravenna, $\mathbf{SLP}\cdot\mathbf{R}_{\mathrm{MAP}}$ is the wind contribution due to the sea level pressure field $\mathbf{SLP}$, the harmonic term is the annual cycle and $d_S$ is a constant term. In Eq. (12) the SLP field and the regression map are represented as column vectors. We choose the parameters of Eq. () by regressing the sea level $Y_{1\mathrm{Sea}}(t)$ on the right-hand side of this equation. A more detailed physical interpretation of the terms is given in the following.

$a_S\,\mathrm{SLP}_{\mathrm{Ravenna}}$ accounts for the barometric pressure effect. The regression map $\mathbf{R}_{\mathrm{MAP}}$ indicates which anomalies of the SLP field are associated with high values of the residual of the barometric pressure effect (see Fig. , where more details are also given). In particular, according to the geostrophic equation for wind, these pressure anomalies induce wind in the Adriatic Sea towards Ravenna's coast. Therefore, the projection of the SLP field onto this regression map, i.e. the term $\mathbf{SLP}(t)\cdot\mathbf{R}_{\mathrm{MAP}}$, describes the wind-induced change in sea level at time t.

$c_S \sin(\omega_{\mathrm{Year}} t + \phi)$ describes the remaining annual cycle of the sea level which is not described by the barometric pressure effect and the wind contribution. This harmonic term could be driven by the annual hydrological cycle, i.e. due to cyclic runoff of rivers which flow into the Adriatic Sea, or due to density variations of the seawater (caused by the annual cycle of water temperatures). Astronomical tide may explain a minor fraction of this term. The range of variation of $c_S \sin(\omega_{\mathrm{Year}} t + \phi)$ is about 10 % of that of the sea level. When we use the predictor to extend the analysis to the period 1979–2015, this term will be kept constant assuming that the annual cycle has not drastically changed in past years. Moreover, we will not consider long-term sea level rise because its influence on both sea and impact h level variations is negligible over the considered period (the observed rate of sea level rise in the northern Adriatic Sea has been ∼0.8 mm yr$^{-1}$). Also, the relative sea level rise has been negligible over the considered period.
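The three contributions to the sea level predictor (barometric pressure effect, wind contribution as a projection of the SLP field onto the regression map, and annual cycle) can be combined as in the following Python sketch. The argument names, the flattened-list representation of the SLP field and the year length of 365.25 days are assumptions made for illustration; the coefficients would come from the regression described above.

```python
import math


def sea_predictor(slp_ravenna, slp_field, r_map, t_days,
                  a_s, b_s, c_s, phi, d_s, year_length=365.25):
    """Evaluate the sea level predictor for one day.

    slp_field and r_map are flattened (column-vector) versions of the SLP
    field and of the regression map, so their dot product is the projection.
    """
    if len(slp_field) != len(r_map):
        raise ValueError("SLP field and regression map must have equal length")
    omega_year = 2.0 * math.pi / year_length
    wind_term = sum(s * r for s, r in zip(slp_field, r_map))
    annual_cycle = math.sin(omega_year * t_days + phi)
    return a_s * slp_ravenna + b_s * wind_term + c_s * annual_cycle + d_s
```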

Regression map $\hat{R}_{\mathrm{MAP}}(i,j)$ in matrix notation. The value of the regression map in the location (i,j) is given by $\hat{R}_{\mathrm{MAP}}(i,j) = \mathrm{var}(R_0)^{-1} \times \mathrm{cov}(R_0, \mathrm{SLP}_{i,j})$, where $R_0(t)$ is the residual of the barometric pressure effect obtained from the fit of the linear model $a_0\,\mathrm{SLP}_{\mathrm{Ravenna}}(t) + d_0$ to $Y_{1\mathrm{Sea}}(t)$. The regression map is equivalent to a one-dimensional maximum covariance analysis. The red dot indicates Ravenna.
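The regression map defined in the caption is the grid-point-wise regression slope of the SLP field on the residual R0. A minimal Python sketch, assuming the SLP field at each time step is stored as a flattened list of grid-point values (this data layout and the function name are illustrative assumptions):

```python
def regression_map(r0, slp):
    """Regression map: cov(R0, SLP_g) / var(R0) for every grid point g.

    r0  : residuals of the barometric pressure fit, one value per time step
    slp : one flattened SLP field (list of grid-point values) per time step
    """
    n = len(r0)
    if n < 2 or len(slp) != n:
        raise ValueError("need at least two time steps and matching lengths")
    r_mean = sum(r0) / n
    var_r = sum((r - r_mean) ** 2 for r in r0) / (n - 1)
    result = []
    for g in range(len(slp[0])):
        column = [field[g] for field in slp]  # time series at grid point g
        c_mean = sum(column) / n
        cov = sum((r - r_mean) * (c - c_mean)
                  for r, c in zip(r0, column)) / (n - 1)
        result.append(cov / var_r)
    return result
```

For a grid point whose pressure is an exact linear function of R0, the map value equals the slope of that linear relation.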

All the terms involved in the multiple regression model are statistically significant at level $\alpha = 2\times10^{-16}$.

Results

The results of the unconditional and conditional models are presented in the following sections.

Unconditional (three-dimensional) model

The unconditional model reproduces the joint pdf of the contributing variables (Y1Sea,Y2River,Y3River), and, in conjunction with the autoregressive models, also the serial correlations. The model is used to simulate values of the impact h and assess the risk of compound floods, with related uncertainties. The selected pair-copula constructions and fitted pair-copula families are shown in Appendices and .

Scatter plots of observed (grey) against simulated (black) contributing variables Y. The simulated series are obtained via the three-dimensional model (including the serial correlation), and have the same length as the observed series.

Figure shows, qualitatively, a good agreement between simulated and observed contributing variables Y. In Fig. we show the return levels of the impact h. There is good agreement between the model- and observation-based expected return levels, even for return periods larger than 6 years (the length of the observed data). For return periods larger than shown in Fig. , the agreement slowly decreases. The model-based expected return period of the highest compound flood observed (3.19 m) is 18 years (the 95 % confidence interval is [2.5,∞] years, where ∞ indicates a value larger than 1050 in this context from now on). The reason for such large uncertainty in the return period is the shortness of available data. However, the model-based uncertainties are large but still smaller, up to return periods of about 60 years, than those obtained when computing the return level directly (based on the GEV) on the observed data of the impact (Fig. ). Moreover, when considering a model which does not take the serial correlation of the contributing variables Y into account, we get an underestimation of the risk uncertainties. For example, the amplitude of the 95 % confidence interval of the 20-year return level is underestimated by about 50 % (not shown).

Unconditional model. Return levels of the impact h with associated 95 % uncertainty intervals. The return level computed on hobs is shown in red (uncertainty shown in light red). The model-based return level is shown in black (uncertainty is in grey).

Conditional (five-dimensional) model

This model allows for assessment of the change in the risk of compound floods due to temporal variations of the meteorological predictors of the contributing variables Y. We calibrate the model to the period 2009–2015. After validating the model for the period 2009–2015, we use predictors of the period 1979–2015 to extend the analysis of compound flood risk to the past. The selected pair-copula construction and fitted pair-copula families are shown in Appendices and . We assess the quality of the model by comparing predictions with observations. Specifically we look at its overall accuracy by considering the root-mean-square error between model predictions and observed data. Moreover, we look at the accuracy of the model when predicting extreme values of the impact h (defined as values of h larger than the 95th percentile of hobs), using the Brier score (see Appendix ). To assess the quality of the model, avoiding overfitting, we perform a 6-fold cross-validation (see Appendix ).

Validation time series of the conditional model obtained by 6-fold cross-validation. hobs is shown in red. The average and 95 % prediction intervals of $10^4$ simulated time series are respectively shown in black and grey.

The cross-validation time series of the impact h is visually compared with hobs in Fig. . The average of the simulated cross-validation time series in general follows the temporal progression of hobs (Fig. ), and about 94 % of the observed impact values lie within the 95 % prediction interval. In particular, the highest flood observed is well predicted and lies inside the prediction interval. The Brier score based on the cross-validation time series is BSCV=0.029, while that relative to the reference model, i.e. the climatology (see Appendix ), is BSCL=0.046. The resulting Brier skill score is BSS=1-BSCV/BSCL=0.38, which indicates that the model is more accurate than the reference model in predicting extreme values of the impact h. In general, the skill of the model, both in terms of root-mean-square error and Brier score, does not change much when the cross-validation is not performed. This underlines that no artificial skill is present in the model. These positive results provide good confidence for extending the impact time series to the period 1979–2015. It also makes the model potentially interesting for flood forecasting and warning.
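The Brier score and Brier skill score used above can be computed as in the following Python sketch. The definition of the forecast probability as the fraction of simulated values exceeding the threshold is an assumption of this sketch (the exact definition is given in the appendix, which is not reproduced here); the function names are illustrative.

```python
def exceedance_probability(simulated_values, threshold):
    """Forecast probability of an extreme: fraction of simulations above threshold."""
    return sum(1 for v in simulated_values if v > threshold) / len(simulated_values)


def brier_score(prob_forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(prob_forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must have equal length")
    return sum((p - o) ** 2 for p, o in zip(prob_forecasts, outcomes)) / len(outcomes)


def brier_skill_score(bs_model, bs_reference):
    """BSS = 1 - BS_model / BS_reference; positive values beat the reference."""
    return 1.0 - bs_model / bs_reference
```

Evaluating `brier_skill_score(0.029, 0.046)` with the rounded scores quoted above gives about 0.37; the small difference from the reported 0.38 presumably reflects rounding of the two input scores.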

In Fig. a we show the return levels of the impact h. As in the unconditional model, return levels are stationary, i.e. estimated by fitting a stationary GEV distribution to annual maximum values. The discrepancy between model- and observation-based return levels for the conditional model is smaller than for the unconditional one, in particular for high return periods. Some flood risk analyses do not consider the dependencies between river and sea levels. For example, in Rotterdam, which is affected by floods driven by both surge and river discharge, the boundary conditions used to build the protection barrier were determined assuming independence between sea level and river discharge. Here we observe that ignoring such a dependence may result in an underestimation of the estimated risk. The expected return period of the highest compound flood observed (3.19 m), computed over the period 2009–2015, is 20 years (the 95 % confidence interval is [4.9,∞] years). When not considering the dependencies between river and sea levels, the expected return period of the highest compound flood observed increases to 32 years (the 95 % confidence interval is [6.7,∞] years). Figure b shows that the return level estimates are reduced by about 0.2 m when not considering such dependencies between sea and river levels. In particular, at the 95 % confidence level, the return levels are underestimated when not considering these dependencies for return periods smaller than about 40 years. The same, however, cannot be clearly concluded for return periods larger than 40 years because of the large uncertainties (Fig. b). A similar result is obtained from the unconditional model (not shown). Therefore, although there is not a large difference in the return levels when treating sea and rivers independently or not, in Ravenna it may be relevant to incorporate their dependencies into the flood risk estimation. An imprecise risk assessment may bring negative societal consequences due to inadequate information provided for infrastructural adaptation.

To estimate the risk based on predicted values of the impact during the past, we run the simulations by conditioning on predictors of the period 1979–2015. This allows us to get a more robust estimation of the risk compared to that obtained considering only the period 2009–2015. The return levels in Fig. a (dashed line) are similar to that estimated when analysing the period 2009–2015. Although this result suggests a stationarity of the risk during the period 1979–2015, we investigate whether there has been any trend in the risk during the recent past. To do this, we computed time-dependent return levels. Specifically, we computed stationary return levels on moving temporal windows of 6 years during the period 1979–2015, based on hsim values obtained by conditioning on predictors belonging to these windows. However, we did not observe any long-term trend in the risk. Moreover, analysing the return levels computed on moving temporal windows during the period 1979–2015, we did not observe any long-term trend, either in the risk of storm surge or in that of river floods (not shown).

During the period 1979–2015, there has not been any long-term trend in the risk due to a variation of the marginal distributions of the predictors or in their dependence. To study this, we computed the return levels on moving temporal windows in the cases described below. First, we simulated the impact by conditioning the Ysim variables on predictors having the observed marginal distributions of the period 1979–2015, but fixing the dependence to that observed during 2009–2015. Secondly, we simulated the impact by conditioning on predictors having the observed dependence of the period 1979–2015, and fixed marginal distributions to the ones observed during 2009–2015. In both cases we did not find any long-term trend in the return levels (not shown).

Conditional model. (a) Return levels of the impact h with associated 95 % uncertainty intervals. The return level computed on hobs is shown in red (uncertainty shown in light red). The model-based return level computed for the period 2009–2015 (black) is based on hsim values simulated for days where the observed data were available (uncertainty is shown in grey). The model-based return level computed for the period 1979–2015 (black dashed) has an uncertainty of similar amplitude to that of period 2009–2015 (not shown). (b) Difference between the model-based return level obtained when considering the realistic dependence between sea and river levels, and when assuming that they are independent. To make the sea and river levels independent while keeping the dependence between the two rivers, we shuffled the sea level data after each simulation, which guarantees random association between sea data and each of the rivers. The black line represents the median of the bootstrap samples.
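The shuffling step described in the caption can be sketched as follows in Python. Only the sea level series is permuted, so that each river value keeps its original partner from the other river; the function name and the optional seed argument are illustrative assumptions.

```python
import random


def break_sea_river_dependence(sea, river2, river3, seed=None):
    """Return series in which the sea level is independent of both rivers.

    The sea level values are randomly permuted in time, while the two river
    series are left untouched, which preserves the river-river dependence.
    """
    rng = random.Random(seed)
    shuffled_sea = list(sea)
    rng.shuffle(shuffled_sea)
    return shuffled_sea, list(river2), list(river3)
```

The permutation preserves the marginal distribution of the sea level exactly, so any change in the resulting return levels of the impact can be attributed to the removed dependence.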

Discussion and conclusions

Compound events (CEs) are multivariate extreme events in which the contributing variables may not be extreme themselves, but their joint – dependent – occurrence causes an extreme impact. Conventional univariate statistical analysis cannot give accurate information regarding the multivariate nature of CEs and therefore the risk associated with these events.

We develop a conceptual model, implemented via pair-copula constructions (PCCs), to quantify the risk of CEs as well as the associated sampling uncertainty. This model includes predictors, which could represent for instance meteorological processes. The inclusion of predictors in the model (1) provides insight into the physical processes underlying CEs, as well as into the temporal variability of CEs, and (2) allows for statistical downscaling of CEs and their impacts. The model is in principle extendable to any number of contributing variables and predictors, given a large enough sample of data for calibration.

Downscaling may be used to statistically extend the risk assessment back in time to periods where observations of the predictors are available but not of the contributing variables and impacts, or to assess potential future changes in CEs based on climate models. The conceptual model is particularly useful for downscaling large-scale predictors from climate models in cases where the local contributing variables driving the impacts of CEs are either not realistically simulated or not simulated at all by the available climate models. As such, the model can straightforwardly be used to assess future risk of CEs based on multi-model ensembles as available from the CMIP and CORDEX archives.

The model makes use of PCCs, a very powerful statistical method to model multivariate dependencies. PCCs are particularly useful for modelling CEs when the contributing variable pairs have different dependence structures, e.g. when only some of them are characterized by tail dependence. To model such types of structures, even multivariate parametric copulas, which have been introduced in climate science to overcome some difficulties in modelling multivariate density distributions, lack flexibility. PCCs are more convenient: by decomposing the dependence structure into bivariate copulas, they give high flexibility in modelling generic high-dimensional systems. We suggest considering the use of PCCs for modelling compound events which involve more than two contributing variables, or when predictors are included in the system as additional variables.

The model allows for a straightforward quantification of sampling uncertainties. In many cases, such risk uncertainties might be substantial as observed data are often limited, and should thus be quantified. In fact, uncertainty estimates are essential to avoid drawing conclusions that may be misleading when uncertainties are large (as also recently discussed by ).

We adapt the developed conceptual model to study compound floods in Ravenna, which are floods driven by the joint occurrence of storm surge and high river level. In other words, the contributing variables of the compound floods are the river and sea levels, whose combination drives the impact, i.e. the water level in between the river and the sea.

We used the specific adaptation of the model to statistically downscale the river and sea level from meteorological predictors, and therefore estimate the impact of the compound floods as a function of the downscaled sea and river levels. The accuracy of the estimated impact appears satisfactory, such that the model is potentially interesting for use in both flood forecasting and warning. Also, the model-based expected return levels of the impact are about the same as those directly computed on observed data of the impact. Although the model-based uncertainty in these return levels is very large (due to the shortness of the available data), for return periods smaller than about 60 years it is smaller than that obtained by computing the risk directly on the observed data of the impact.

We calibrate the model over the period 2009–2015, and by including meteorological predictors obtained from the ECMWF ERA-Interim reanalysis dataset, we extend the analysis of compound flooding to the full period of 1979–2015, to obtain a more robust estimation of the risk. The expected return period of the highest compound flood observed, computed over the period 1979–2015, is 19 years (the 95 % confidence interval is [3.7,∞] years). Moreover, we did not observe any long-term trend in risk during the period 1979–2015.

Ignoring the estimated dependence between sea and river levels may lead to an underestimation of risk. Specifically, assuming independence between sea and river levels, the expected return period of the highest compound flood observed – computed over the period 2009–2015 – is 32 years (the 95 % confidence interval is [6.7,∞] years). When assuming the estimated dependence between sea and river levels, it decreases to 20 years (the 95 % confidence interval is [4.9,∞] years). In other cities affected by sea surges and river flooding, e.g. in Rotterdam, protection barriers were designed assuming independence between sea level and river discharge, a decision which is still debated. In Ravenna, it may be relevant to incorporate these dependencies into the flood risk estimation. An imprecise risk assessment may harm the population at risk due to inadequate information provided for infrastructural adaptation. In general, when considering generic CEs, their associated risk may be substantially influenced by the dependence between the contributing variables, and so this dependence should be considered.

In the context of compound floods, only a few studies have explicitly quantified the impact and the associated risks. This might be due to the practical difficulties in quantifying the impact. For example, to quantify the impact of compound floods in the river mouth, it is necessary to have water level data at a station where both the influence of sea and river are seen. However, we have found few locations where these stations exist, perhaps partly because stakeholders are usually interested in data where only the influence of the river or the sea is seen. Also, for places where data show both the influence of sea and river, the measurements can be affected by human influences such as pumping stations between river and sea stations. Moreover, while compound floods involve a dependence between sea and river levels, places where there are stations detecting both the influence of sea and river may not present such dependence. Therefore, we argue that to obtain more in-depth knowledge of these events, it may be very useful to create an archive containing data for locations where compound floods have been recorded and eventually increase the effective number of measurements in places which are supposed to be at risk of compound floods.

The developed routines for working with conditional joint probability density functions decomposed as D- or C-vines are publicly available via the CDVineCopulaConditional R package (more details are given in Appendix ). Other routines from this study are available from the authors upon request.

Sea level data of the Ravenna-Porto Corsini station were downloaded from the Italian National Institute for Environmental Protection and Research (ISPRA), and are available under the link www.mareografico.it. River data can be downloaded from Arpae Emilia-Romagna, via the link www.arpae.it/dettaglio_generale.asp?id=3284&idlivello=1625 (the names of the used stations are S. Marco, S. Bartolo and Rasponi, where the latter is that used for the impact). Meteorological predictors were obtained from the ECMWF ERA-Interim reanalysis dataset, which is available via the link http://apps.ecmwf.int/datasets/data/interim-full-daily/levtype=sfc/.

Homogenization of river level data

The zero reference level of river measurements is the water level in the river defined as zero in the measurements. In general, such a zero reference level may change during different periods of observation, for technical reasons. As the zero reference level of rivers $Y_{2\mathrm{River}}$ and $Y_{3\mathrm{River}}$ varied in the first 3 years but remained constant in the second 3, we homogenized the former with respect to the latter at both rivers. We performed such homogenization assuming that the precipitation falling into the catchment during 1 year is responsible for the average river level in the same year. For each river $Y_{i\mathrm{River}}$, we fitted the linear model $Y_{i\mathrm{River}}^{\mathrm{annual}} = a_i P_i^{\mathrm{annual}} + b_i$ in the last 3 years (those having a constant zero reference level), where $Y_{i\mathrm{River}}^{\mathrm{annual}}$ is the annual average of $Y_{i\mathrm{River}}$ and $P_i^{\mathrm{annual}}$ is the annual cumulated precipitation over the river basin (data from the ECMWF ERA-Interim reanalysis dataset). Finally, for each river, we translated the zero reference level of the first 3 years, such that the linear model was valid in these years as well.
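One possible reading of this homogenization is sketched below in Python: a straight line is fitted to the annual means of the reference years, and each earlier year is shifted by the difference between the level predicted from its precipitation and its observed annual mean. Applying one shift per year (rather than one per period) is an assumption of this sketch, and the function names are illustrative.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def reference_shifts(precip_ref, level_ref, precip_early, level_early):
    """Shift to add to the river level of each early year.

    precip_ref, level_ref     : annual precipitation and mean river level in
                                the years with a constant zero reference level
    precip_early, level_early : the same quantities for the years to be
                                homogenized
    """
    slope, intercept = fit_line(precip_ref, level_ref)
    return [slope * p + intercept - y
            for p, y in zip(precip_early, level_early)]
```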

Vines and sampling procedure

In this appendix we show more details about vines, focusing on C- and D-vines. Moreover, we discuss the sampling procedure, showing the algorithms to perform the conditional sampling from C- and D-vines.

Vines

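Shown below are the general expressions to decompose an n-dimensional pdf via a PCC as a D-vine (Eq. ) or C-vine (Eq. ). For a D-vine:

$$f_{Y_1,\dots,Y_n}(y_1,\dots,y_n) = \prod_{k=1}^{n} f(y_k) \prod_{j=1}^{n-1} \prod_{i=1}^{n-j} c_{i,i+j|i+1,\dots,i+j-1}\{F(y_i|y_{i+1},\dots,y_{i+j-1}), F(y_{i+j}|y_{i+1},\dots,y_{i+j-1})\},$$

and for a C-vine:

$$f_{Y_1,\dots,Y_n}(y_1,\dots,y_n) = \prod_{k=1}^{n} f(y_k) \prod_{j=1}^{n-1} \prod_{i=1}^{n-j} c_{j,j+i|1,\dots,j-1}\{F(y_j|y_1,\dots,y_{j-1}), F(y_{j+i}|y_1,\dots,y_{j-1})\}.$$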

The five-dimensional vine that we use for the conditional model is shown in Eq. (). The graphical representation of that decomposition is shown in Fig. A, where the concept of a tree is introduced. We show below the vines that we use for the unconditional model.

Three-dimensional vine

In total, a three-dimensional copula density can be decomposed in three different ways, and each of these vines is both a D-vine and a C-vine. For this application we use the following vine:

$$f_{123}(y_1,y_2,y_3) = f_1(y_1)\cdot f_2(y_2)\cdot f_3(y_3)\cdot c_{12}(u_1,u_2)\cdot c_{23}(u_2,u_3)\cdot c_{13|2}(u_{1|2},u_{3|2}).$$

This decomposition is represented graphically in Fig. b. We underline that, in Eq. (), the rigorous expression of the conditional copula density $c_{13|2}$, of the pair $(U_1,U_3)$, given $U_2=u_2$, would be $c_{13|2}(u_{1|2},u_{3|2};u_2)$. In Eq. (), $c_{13|2}$ is written under the assumption of a simplified PCC; i.e. the parameters of $c_{13|2}$ are the same for all values of $u_2\in(0,1)$. The simplified PCC may be a rather good approximation, even when the simplifying assumption is far from being fulfilled by the actual model. Copula parameters that are functions of the conditioning variables, and thus violate the simplifying assumption, are approximated by the average over all values of the conditioning variables. The effect of this approximation on the estimated impact is likely to be small.
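To make the structure of this three-dimensional decomposition concrete, the following Python sketch evaluates its copula part (the product of the three pair-copula densities) under the simplifying assumption. Gaussian pair-copulas are used purely as a stand-in, because their density and h-function (conditional CDF) have simple closed forms; the families actually fitted in the study are those listed in the appendices. All function and parameter names are illustrative.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def gauss_copula_density(u, v, rho):
    """Density of the bivariate Gaussian copula with correlation rho."""
    x, y = _STD_NORMAL.inv_cdf(u), _STD_NORMAL.inv_cdf(v)
    r2 = rho * rho
    exponent = -(r2 * (x * x + y * y) - 2.0 * rho * x * y) / (2.0 * (1.0 - r2))
    return math.exp(exponent) / math.sqrt(1.0 - r2)


def gauss_h(u, v, rho):
    """h-function of the Gaussian copula: conditional CDF F(u | v)."""
    z = (_STD_NORMAL.inv_cdf(u) - rho * _STD_NORMAL.inv_cdf(v)) / math.sqrt(1.0 - rho * rho)
    return _STD_NORMAL.cdf(z)


def vine3_copula_density(u1, u2, u3, rho12, rho23, rho13_2):
    """c12(u1,u2) * c23(u2,u3) * c13|2(u1|2, u3|2) with constant rho13_2."""
    u1_given_2 = gauss_h(u1, u2, rho12)
    u3_given_2 = gauss_h(u3, u2, rho23)
    return (gauss_copula_density(u1, u2, rho12)
            * gauss_copula_density(u2, u3, rho23)
            * gauss_copula_density(u1_given_2, u3_given_2, rho13_2))
```

Multiplying the returned value by the three marginal densities gives the joint pdf of the decomposition; with all three correlations set to zero the copula part reduces to 1, i.e. to independence.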

In this study of compound floods, the variables $(Y_1,Y_2,Y_3)$ of Eq. () are the $(\varepsilon_{1\mathrm{Sea}},\varepsilon_{2\mathrm{River}},\varepsilon_{3\mathrm{River}})$ introduced in Appendix . Specifically, the vine of Eq. () represents that used at the first step of the procedure in Appendix . The vine that we use at the third step of the procedure in Appendix is

$$f_{123}(y_1,y_2,y_3) = f_3(y_3)\cdot f_1(y_1)\cdot f_2(y_2)\cdot c_{31}(u_3,u_1)\cdot c_{12}(u_1,u_2)\cdot c_{32|1}(u_{3|1},u_{2|1}),$$

where $(Y_1,Y_2,Y_3)=(Y_{1\mathrm{Sea}},Y_{2\mathrm{River}},Y_{3\mathrm{River}})$.

(a) Representation of the five-dimensional D-vine in Eq. (). There are 4 trees (T1,T2,T3,T4) and 10 edges. Each edge represents a pair-copula density, and the label indicates the subscript of the corresponding copula. For example, the edge 43|5 represents the copula density c43|5. The decomposition of the joint pdf related to the represented vine is obtained by multiplying all the represented pair-copula densities (10 in this case) and the marginal pdfs of each variable. For more details, see . (b) Representation of the three-dimensional vine in Eq. (). There are two trees (T1 and T2) and three edges.

Sampling procedure

To simulate a vector $Y=(Y_1,\dots,Y_n)$ of random variables, with marginal CDFs $F_1(y_1),\dots,F_n(y_n)$, whose joint pdf is modelled via a copula, we first simulate from the copula the uniform variables $U_i$ for $i=1,\dots,n$ ($u_i := F_i(y_i)$), and then transform them into $Y_i$ for $i=1,\dots,n$ ($y_i := F_i^{-1}(u_i)$).
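The two-step procedure (simulate uniforms from the copula, then apply the inverse marginal CDFs) is illustrated by the following Python sketch for a bivariate case. A Gaussian copula and exponential marginals are chosen here only because both are easy to write down with the standard library; they are assumptions of the sketch, not the copulas or marginals used in the study.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def sample_pair(rho, rate1, rate2, rng):
    """Draw one pair (Y1, Y2) with a Gaussian copula and exponential marginals.

    Returns both the copula-scale sample (u1, u2) and the transformed (y1, y2).
    """
    # Step 1: simulate uniforms from the copula via correlated normals.
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    u1, u2 = _STD_NORMAL.cdf(z1), _STD_NORMAL.cdf(z2)
    # Step 2: apply the inverse marginal CDFs, y = F^{-1}(u) = -ln(1 - u) / rate.
    y1 = -math.log(1.0 - u1) / rate1
    y2 = -math.log(1.0 - u2) / rate2
    return (u1, u2), (y1, y2)
```

Because the transformation acts on each variable separately and is monotonic, the dependence structure is determined entirely by the copula, while the marginal distributions are determined by the inverse CDFs.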

Sampling and conditional sampling from vines

The simulation of the uniform variables from vines is discussed in and . show the algorithms to sample uniform variables from C- and D-vines. Due to the nature of PCCs, the sampling procedure works as a cascade. Once the first variable is simulated from a uniform distribution, each following variable is simulated as conditioned on the previous group of simulated variables.

It is clear then that to sample from the conditional distribution of $U_{N_{\mathrm{cond}}+1},\dots,U_n$ given values for $U_1,\dots,U_{N_{\mathrm{cond}}}$ (i.e. $f_{U_{N_{\mathrm{cond}}+1},\dots,U_n|U_1,\dots,U_{N_{\mathrm{cond}}}}$), it is possible to follow this procedure by simply fixing the first $N_{\mathrm{cond}}$ variables at the conditioning values. The approach used here to execute such a procedure is to select vines from which the conditioning variables would be sampled first when following the sampling algorithms from . For example, using the D-vine represented in Fig. a (or in Eq. ), we could simulate by fixing the pairs $(U_4,U_5)$ or $(U_2,U_1)$ in case we are interested in conditioning the simulation on two variables.

Following this approach, for D-vines the number of n-dimensional decompositions which allow for conditioning on $N_{\mathrm{cond}}$ variables is $N_{\mathrm{cond}}!\times(n-N_{\mathrm{cond}})!$. For C-vines the number of decompositions which allow for such a conditioning is $N_{\mathrm{cond}}!\times(n-N_{\mathrm{cond}})!/2$ for $n-N_{\mathrm{cond}}>1$, and $N_{\mathrm{cond}}!$ for $n-N_{\mathrm{cond}}=1$. For example, in this study we model a five-dimensional system with two conditioning variables (the meteorological predictors), that is, $n=5$ and $N_{\mathrm{cond}}=2$. Considering that there are no five-dimensional vines which belong to both the C-vine and D-vine categories, the choice of the vine used for the model is made among $(2!\times(5-2)!/2)+(2!\times(5-2)!)=6+12=18$ vines. Furthermore, we need to condition on values $(y_4,y_5)$; therefore, we simulate from the copula by conditioning on $(u_4=F_4(y_4),u_5=F_5(y_5))$, where $F_4$ and $F_5$ are the fitted marginals in the calibration period, while $(y_4,y_5)$ could theoretically be any value.
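The counting formulas above are easy to check numerically. The following short Python sketch implements them directly (the function names are illustrative) and reproduces the total of 18 candidate vines for n=5 and two conditioning variables.

```python
from math import factorial


def n_dvines(n, n_cond):
    """Number of D-vines allowing conditioning on the first n_cond variables."""
    if not 0 <= n_cond < n:
        raise ValueError("need 0 <= n_cond < n")
    return factorial(n_cond) * factorial(n - n_cond)


def n_cvines(n, n_cond):
    """Number of C-vines allowing conditioning on the first n_cond variables."""
    if not 0 <= n_cond < n:
        raise ValueError("need 0 <= n_cond < n")
    n_free = n - n_cond
    if n_free == 1:
        return factorial(n_cond)
    # For n_free > 1 the factorial is even, so the integer division is exact.
    return factorial(n_cond) * factorial(n_free) // 2


total_candidates = n_cvines(5, 2) + n_dvines(5, 2)
```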

To apply such a sampling procedure, we developed Algorithms and , which are modified versions of Algorithms 1 and 2 shown in . The developed algorithms allow for conditional sampling from a C- or D-vine from which the conditioning variables would be sampled first when following the sampling algorithms from . Specifically, given a C- or D-vine of the variables $(X_1,\dots,X_{N_{\mathrm{cond}}},X_{N_{\mathrm{cond}}+1},\dots,X_n)$, Algorithms and allow for the conditional sampling of $(X_{N_{\mathrm{cond}}+1},\dots,X_n)$ given $(X_1=x_1^{\mathrm{cond}},\dots,X_{N_{\mathrm{cond}}}=x_{N_{\mathrm{cond}}}^{\mathrm{cond}})$, where $N_{\mathrm{cond}}$ is the number of conditioning variables. When conditioning variables are not given ($N_{\mathrm{cond}}=0$), Algorithms and reduce to the special cases of Algorithms 1 and 2 shown in . Both routines relative to Algorithms and are publicly available via the CDVineCopulaConditional R package. CDVineCopulaConditional includes tools to select the best vine (based on information criteria) among those which allow for such conditional sampling, and therefore to fit the pair-copula families.

Algorithm to simulate uniform variables X=(X_1,…,X_Ncond,X_{Ncond+1},…,X_n) from a C-vine. Generates one sample x_{Ncond+1},…,x_n conditioned on given values x_1^cond,…,x_Ncond^cond. The h-function is defined as in . Θ_{j,i} is the set of parameters of the copula density c_{j,j+i|1,…,j-1}.

Sample w_{Ncond+1},…,w_n independent uniform on [0,1].
if Ncond ≠ 0 then
    for i in (1,…,Ncond) do
        w_i = x_i^cond
    end for
end if
x_1 = v_{1,1} = w_1
for i in (2,…,n) do
    v_{i,1} = w_i
    if i > Ncond then
        for k in (i-1,i-2,…,1) do
            v_{i,1} = h^{-1}(v_{i,1}, v_{k,k}, Θ_{k,i-k})
        end for
    end if
    x_i = v_{i,1}
    if i == n then
        Stop
    end if
    for j in (1,…,i-1) do
        v_{i,j+1} = h(v_{i,j}, v_{j,j}, Θ_{j,i-j})
    end for
end for
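The C-vine algorithm above can be sketched in Python as follows. This is only an illustration under strong assumptions: every pair-copula is taken to be Gaussian, so that the h-function and its inverse have closed forms, and each Θ reduces to a single correlation stored in a dictionary keyed by (j, i). The function names are hypothetical; the routines actually used in this study are those of the CDVineCopulaConditional R package.

```python
import random
from math import sqrt
from statistics import NormalDist

_N = NormalDist()

def _z(u):
    # Standard normal quantile, clamped away from 0 and 1 for numerical safety.
    return _N.inv_cdf(min(max(u, 1e-12), 1.0 - 1e-12))

def h(u, v, rho):
    """Gaussian-copula h-function: conditional cdf of U given V = v (assumed family)."""
    return _N.cdf((_z(u) - rho * _z(v)) / sqrt(1.0 - rho * rho))

def h_inv(w, v, rho):
    """Inverse of h with respect to its first argument."""
    return _N.cdf(_z(w) * sqrt(1.0 - rho * rho) + rho * _z(v))

def sample_cvine_conditional(theta, n, x_cond=(), rng=random):
    """One draw (x_1, ..., x_n) from a C-vine; the first len(x_cond) values are fixed.

    theta[(j, i)] is the correlation of the pair-copula c_{j,j+i|1,...,j-1}.
    """
    n_cond = len(x_cond)
    # 1-based w: conditioning values first, then independent uniforms.
    w = [None] + list(x_cond) + [rng.random() for _ in range(n - n_cond)]
    v = [[None] * (n + 2) for _ in range(n + 1)]  # 1-based v[i][j]
    x = [None] * (n + 1)
    x[1] = v[1][1] = w[1]
    for i in range(2, n + 1):
        v[i][1] = w[i]
        if i > n_cond:  # conditioning variables are not transformed
            for k in range(i - 1, 0, -1):
                v[i][1] = h_inv(v[i][1], v[k][k], theta[(k, i - k)])
        x[i] = v[i][1]
        if i == n:
            break
        for j in range(1, i):
            v[i][j + 1] = h(v[i][j], v[j][j], theta[(j, i - j)])
    return x[1:]

# Example: three variables, conditioning on x_1 = 0.9 (correlations are made up).
theta = {(1, 1): 0.6, (1, 2): 0.4, (2, 1): 0.3}
print(sample_cvine_conditional(theta, 3, x_cond=(0.9,), rng=random.Random(42)))
```

With an empty `x_cond` the sketch reduces to unconditional C-vine sampling, mirroring the Ncond=0 case of the algorithm.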

Algorithm to simulate uniform variables X=(X_1,…,X_Ncond,X_{Ncond+1},…,X_n) from a D-vine. Generates one sample x_{Ncond+1},…,x_n conditioned on given values x_1^cond,…,x_Ncond^cond. The h-function is defined as in . Θ_{j,i} is the set of parameters of the copula density c_{i,i+j|i+1,…,i+j-1}.

Sample w_{Ncond+1},…,w_n independent uniform on [0,1].
if Ncond ≠ 0 then
    for i in (1,…,Ncond) do
        w_i = x_i^cond
    end for
end if
x_1 = v_{1,1} = w_1
if Ncond < 2 then
    x_2 = v_{2,1} = h^{-1}(w_2, v_{1,1}, Θ_{1,1})
else
    x_2 = v_{2,1} = w_2
end if
v_{2,2} = h(v_{1,1}, v_{2,1}, Θ_{1,1})
for i in (3,…,n) do
    v_{i,1} = w_i
    if i > Ncond then
        for k in (i-1,i-2,…,2) do
            v_{i,1} = h^{-1}(v_{i,1}, v_{i-1,2k-2}, Θ_{k,i-k})
        end for
        v_{i,1} = h^{-1}(v_{i,1}, v_{i-1,1}, Θ_{1,i-1})
    end if
    x_i = v_{i,1}
    if i == n then
        Stop
    end if
    v_{i,2} = h(v_{i-1,1}, v_{i,1}, Θ_{1,i-1})
    v_{i,3} = h(v_{i,1}, v_{i-1,1}, Θ_{1,i-1})
    if i > 3 then
        for j in (2,…,i-2) do
            v_{i,2j} = h(v_{i-1,2j-2}, v_{i,2j-1}, Θ_{j,i-j})
            v_{i,2j+1} = h(v_{i,2j-1}, v_{i-1,2j-2}, Θ_{j,i-j})
        end for
    end if
    v_{i,2i-2} = h(v_{i-1,2i-4}, v_{i,2i-3}, Θ_{i-1,1})
end for
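A corresponding Python sketch of the D-vine algorithm is given below, again assuming (for illustration only) that all pair-copulas are Gaussian with a single correlation parameter each. The names are hypothetical, and theta[(j, i)] stands for the parameter of c_{i,i+j|i+1,…,i+j-1}, following the indexing used in the algorithm.

```python
import random
from math import sqrt
from statistics import NormalDist

_N = NormalDist()

def _z(u):
    # Standard normal quantile, clamped away from 0 and 1 for numerical safety.
    return _N.inv_cdf(min(max(u, 1e-12), 1.0 - 1e-12))

def h(u, v, rho):
    """Gaussian-copula h-function (assumed family): cdf of U given V = v."""
    return _N.cdf((_z(u) - rho * _z(v)) / sqrt(1.0 - rho * rho))

def h_inv(w, v, rho):
    """Inverse of h with respect to its first argument."""
    return _N.cdf(_z(w) * sqrt(1.0 - rho * rho) + rho * _z(v))

def sample_dvine_conditional(theta, n, x_cond=(), rng=random):
    """One draw (x_1, ..., x_n) from a D-vine with n >= 2; first len(x_cond) values fixed."""
    n_cond = len(x_cond)
    w = [None] + list(x_cond) + [rng.random() for _ in range(n - n_cond)]
    v = [[None] * (2 * n) for _ in range(n + 1)]  # 1-based v[i][j], j <= 2i-2
    x = [None] * (n + 1)
    x[1] = v[1][1] = w[1]
    if n_cond < 2:
        x[2] = v[2][1] = h_inv(w[2], v[1][1], theta[(1, 1)])
    else:
        x[2] = v[2][1] = w[2]  # second conditioning variable is kept as given
    v[2][2] = h(v[1][1], v[2][1], theta[(1, 1)])
    for i in range(3, n + 1):
        v[i][1] = w[i]
        if i > n_cond:
            for k in range(i - 1, 1, -1):
                v[i][1] = h_inv(v[i][1], v[i - 1][2 * k - 2], theta[(k, i - k)])
            v[i][1] = h_inv(v[i][1], v[i - 1][1], theta[(1, i - 1)])
        x[i] = v[i][1]
        if i == n:
            break
        v[i][2] = h(v[i - 1][1], v[i][1], theta[(1, i - 1)])
        v[i][3] = h(v[i][1], v[i - 1][1], theta[(1, i - 1)])
        if i > 3:
            for j in range(2, i - 1):
                v[i][2 * j] = h(v[i - 1][2 * j - 2], v[i][2 * j - 1], theta[(j, i - j)])
                v[i][2 * j + 1] = h(v[i][2 * j - 1], v[i - 1][2 * j - 2], theta[(j, i - j)])
        v[i][2 * i - 2] = h(v[i - 1][2 * i - 4], v[i][2 * i - 3], theta[(i - 1, 1)])
    return x[1:]

# Example: four variables, conditioning on the first two (correlations are made up).
theta = {(1, 1): 0.5, (1, 2): 0.4, (1, 3): 0.3, (2, 1): 0.2, (2, 2): 0.1, (3, 1): 0.1}
print(sample_dvine_conditional(theta, 4, x_cond=(0.8, 0.7), rng=random.Random(7)))
```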

Finally, we underline that this is not the only way to proceed for the conditional simulation , but although the best vine is selected among only a fraction of all possible vines, the approach can provide very satisfactory results, as we show in this study. Also, we refer to and as other works where conditional joint pdfs decomposed as C-vines were used for statistical modelling.

Statistical inference of the joint pdf

Statistical inference on a pdf decomposed via a PCC is in principle computationally very demanding. As can be seen from Eq. (), the arguments of the copulas are influenced by the choice of the marginals (because of ui=Fi(xi)), and the argument of the copula in each level is also influenced by the fit of the copulas in the previous levels. Consequently, the parameters of the full pdf (marginals and pair-copulas) should be estimated jointly. Moreover, the structure of the vine has to be chosen, which further increases the computational demands.

To overcome these obstacles, some techniques have been developed. The dependence of the copula parameter estimates on the estimation of the marginals can be removed by using empirical marginals. This allows for the estimation of copula parameters without the need to consider the marginals. However, to take into account that the estimation of the parameters of each pair-copula depends on those of the upper levels, the estimation of the parameters of all the pairs should be performed at the same time. This way of estimating the parameters is called semiparametric (SP). The estimator we use here is the stepwise semiparametric (SSP) estimator. It was proposed by and then , and despite being asymptotically less efficient than the SP estimator, it produces very satisfactory results and speeds up the procedure considerably. As in SP, the PCC parameters are estimated independently of the marginals, but the estimation of the PCC parameters is performed level by level, plugging in the parameters from previous levels at each step.
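To make the role of the empirical marginals concrete, the sketch below transforms a sample into uniform pseudo-observations by ranking. The rescaling by n+1 (rather than n) is a common convention assumed here so that the values stay strictly inside (0,1); the function name is hypothetical, and ties are not treated specially.

```python
def pseudo_observations(x):
    """Map a sample to (0, 1) via its ranks: u_i = rank(x_i) / (n + 1).

    Assumes no ties; tied values would receive arbitrary consecutive ranks.
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u

print(pseudo_observations([3.0, 1.0, 2.0]))  # [0.75, 0.25, 0.5]
```

The copula parameters can then be estimated on these pseudo-observations alone, which is what decouples the copula fit from any parametric choice for the marginals.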

In this study of compound floods, for each marginal pdf we use a mixture distribution composed of the empirical distribution and a generalized Pareto distribution (GPD) for the extremes. For each predictor X, the GPD is fitted to data above a threshold defined here as the 95th percentile of that predictor. For each of the contributing variables Y, the threshold was chosen by requiring that the mean of the extreme values simulated from the joint pdf was as close as possible to the maximum observed value of the variable Y being fitted. Adding the GPD to the empirical marginal for the extremes is necessary so as not to constrain the model to simulate values of the variables Y that never exceed the maximum observed during the calibration period.

We use the AIC to select the best vine structure among C- and D-vines (those selected are shown in Sects. and ). In particular, for every possible C- and D-vine, we fit all possible families through maximum likelihood estimation, and then we select the best family according to the AIC. Then, we select the best vine according to the AIC of the full model. The pair-copula families are chosen among those available in the VineCopula R package . In particular, for the unconditional model, all of the available families are considered during the selection, while for the conditional model we restricted the choice to the first 31 families listed in the documentation file of the package. This is because of technical issues regarding the simulation of data from the conditional pdf of the conditional model. Once the vine is selected, to better assess the quality of the fit of each pair-copula, we use the K-plot (Fig. ). This is a plot of the Kendall function K(w)=P(C_{i,j}(U_i,U_j)≤w) computed with the fitted copula against K(w) computed with the empirical copula obtained from the observed uniform data. This diagnostic plot indicates a good quality of the fit when the points follow the diagonal. We note that the K(w) of the fitted copula is computed using Monte Carlo methods (the simulations are long enough for the associated sampling error to be neglected). In Fig. we show the resulting K-plots and the selected copulas with their respective parameters for the five-dimensional PCC (K-plots for the three-dimensional PCC are not shown). The families chosen according to the AIC for copulas c43|5(u4|5,u3|5) and c42|135(u4|513,u2|513) described slightly negative dependencies (<0.1), but for physical reasons we expect these copulas to describe slightly positive dependencies. We argue that this result is due to uncertainties of the model. Therefore we choose independent copulas for these pairs, which is a compromise between the expert knowledge we have about the data and the result of the fit. When assuming independent copulas for these two pairs, the corresponding K-plots show only a small deviation from the diagonal (right side of Fig. ). Moreover, these K-plots are mostly inside the 95 % confidence interval of the K-plots, which confirms that choosing these two independent copulas is reasonable.
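The empirical side of a K-plot can be sketched as follows. The estimator below is the standard rank-based empirical Kendall function, which we assume here for illustration; for the independence copula the theoretical Kendall function is K(w)=w-w·log(w), so no Monte Carlo step is needed in that special case. Function names are hypothetical.

```python
from math import log

def empirical_kendall(u, v):
    """Return the empirical Kendall function w -> K_n(w) of paired data (u_i, v_i)."""
    n = len(u)
    # W_i: share of the other points that are strictly dominated by point i.
    w_vals = [
        sum(1 for j in range(n) if u[j] < u[i] and v[j] < v[i]) / (n - 1)
        for i in range(n)
    ]

    def k_n(w):
        return sum(1 for wi in w_vals if wi <= w) / n

    return k_n

def kendall_independence(w):
    """Kendall function of the independence copula."""
    if w <= 0.0:
        return 0.0
    return w - w * log(w)

# Toy example with perfectly dependent pairs, compared against independence.
u = [k / 10 for k in range(1, 10)]
k_n = empirical_kendall(u, u)
for w in (0.25, 0.5, 0.75):
    print(w, k_n(w), kendall_independence(w))
```

Plotting K_n(w) against the Kendall function of the candidate copula over a grid of w values gives the K-plot; points close to the diagonal indicate a good fit.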

K-plots of the pair-copula families selected for the five-dimensional model (names of the families and parameters are shown in the top left of each plot). The abscissa shows the empirical K-function and the ordinate the K-function based on the fitted copula. The 95 % confidence interval (shown in light red) is obtained from 10^4 K-plots computed on simulated pairs (with the same length as the observed data) from the selected pair-copula families.

The CDVineCopulaConditional and VineCopula R packages were used to work with copulas. The GPDs for the marginal distributions were fitted through the gpd.fit function of the ismev R package .

Selected pair-copula families

In the case of the unconditional model, the pair-copula families fitted to the observed contributing variables Y – relative to the vine of Eq. () – are Survival BB1 (parameters: 0.49, 1.15) for c31(u3,u1), BB8 (parameters: 4.01, 0.6) for c12(u1,u2), and Tawn type 1 (parameters: 2.59, 0.73) for c32|1(u3|1,u2|1). The selected families relative to the vine of Eq. (), i.e. the one fitted to (ε1Sea,ε2River,ε3River) introduced in Appendix , are t-copula (parameters: 0.15, 3.44) for c12(u1,u2), Tawn type 2 (parameters: 2.85, 0.71) for c23(u2,u3), and Survival Gumbel (parameter: 1.13) for c13|2(u1|2,u3|2). In the case of the conditional model, the selected pair-copula families and their parameters, fitted to the observed data of contributing variables Y and predictors X, are shown in Figure .

Model and risk uncertainty estimation via parametric bootstrap

The flexibility of copula theory in modelling multivariate distributions has driven its spread in the literature, and more recently in climate science. However, we stress that once the model is fitted to observed data, procedures to estimate the uncertainties, both in the parameter estimates and in the choice of the model, should be considered. This is particularly important because the limited sample size of the available data often makes these uncertainties large, so they cannot be neglected. In practice, they directly influence the uncertainties of the risk analysis. In particular, we observed that the uncertainties are also controlled by the dependence values between the modelled pairs (not shown).

In this study, we find model uncertainties in the joint pdf which propagate into large uncertainties when assessing the risk of compound floods. This does not mean that such models are not useful, but rather that the results should be interpreted with these uncertainties in mind. Also, even if large, the obtained uncertainties in the risk are smaller than those obtained when the risk analysis is computed directly on the observed data of the impact, underlining another advantage of applying such procedures.

For both the unconditional and conditional models, we use a parametric bootstrap to assess the model and subsequent risk uncertainty, as follows.

1. Select and fit a model that can reproduce the statistical characteristics of Yobs ((Yobs,Xobs) for the conditional model), i.e. the dependence among the variables and their marginal distributions. For the unconditional model we also include the serial correlation as described in Appendix .

2. Simulate B=2.5×10^3 samples of the contributing variables Y (as well as predictors X for the conditional model) with the same length as the observed data.

3. On each of the B=2.5×10^3 samples, fit a joint pdf via PCCs (the structure of the PCC is the same as that fitted on the observed data, while the pair-copula families are re-selected for each sample).

4. From each of these B=2.5×10^3 models, simulate a sample of contributing variables Y of length equal to 200 times that of the observed data (for the conditional model the contributing variables Y are simulated as conditioned on the predictors X).

5. For each sample, compute the simulated impact sequence as hsim=h(Y1Seasim,Y2Riversim,Y3Riversim) and estimate the corresponding return level curves. Return levels are estimated by fitting the generalized extreme value (GEV) distribution to annual maximum values. We simulated samples of length 200 times the length of the observed data (6 years) to get, for each sample, 1200 annual maximum values to which we fit the GEV distribution. This allows us to neglect the uncertainty of the return levels driven by the sampling because the uncertainties of the GEV distribution parameters are negligible.

6. Estimate the uncertainties in the return levels by identifying the 95 % confidence interval (i.e. the range 2.5–97.5 %) of the B=2.5×10^3 return level curves.
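The last step of the procedure above, i.e. turning a set of bootstrap return level curves into a pointwise 2.5–97.5 % band, can be sketched as follows. The percentile definition (linear interpolation between order statistics) and the function names are assumptions made for this illustration; the fitting and simulation steps are not reproduced here.

```python
def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def confidence_band(curves, lower=2.5, upper=97.5):
    """Pointwise band across bootstrap curves.

    `curves` holds one list of return levels per bootstrap replicate, all
    evaluated at the same return periods.
    """
    n_points = len(curves[0])
    band = []
    for k in range(n_points):
        column = [curve[k] for curve in curves]
        band.append((percentile(column, lower), percentile(column, upper)))
    return band

# Toy input: 101 fake "curves", each evaluated at two return periods.
curves = [[float(b), 2.0 * b] for b in range(101)]
print(confidence_band(curves))
```

In the setting of this study, `curves` would contain B=2.5×10^3 entries, one per bootstrap replicate.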

As underlined in step 1, for the unconditional model, we explicitly model the serial correlations of the contributing variables Y when computing the uncertainties. This was done to avoid an underestimation of the risk uncertainties (see Appendix ). For the conditional model, step 3 is a rigorous bootstrap procedure, while for the unconditional model this step is an approximation. In fact, for the unconditional model, at step 3 we should have fitted the same type of model as fitted in step 1, i.e. that could include the serial correlations. Unfortunately, such a procedure was not feasible because of computational limitations, and we had to proceed by approximation, i.e. fitting a pdf via a PCC without considering the autoregressive processes. In particular, the computational limitations were due to the tuning procedure explained in Appendix . Therefore we used the best method possible to avoid underestimation of the risk uncertainties, but we underline that we used such an approximation.

The uncertainty in the return levels obtained via the observed data hobs is computed by propagating the parameter uncertainties of the GEV distribution fitted to the annual maxima of hobs (Fig. ). In particular, the fitted GEV distribution is a function of the parameters μ (location), σ (scale) and η (shape). The GEV-based return level RL_t associated with the return period t is a function of the three parameters (μ,σ,η). We obtained the standard deviations of the three parameters (μ,σ,η), respectively s_μ, s_σ, and s_η, via the gev.fit function of the ismev R package. To estimate the standard deviation of the return level RL_t, we propagated the standard deviations of the three parameters (μ,σ,η) using the formula s_{RL_t} = \sqrt{(\partial RL_t/\partial\mu)^2 s_\mu^2 + (\partial RL_t/\partial\sigma)^2 s_\sigma^2 + (\partial RL_t/\partial\eta)^2 s_\eta^2}, where s_{RL_t} is the standard deviation of the return level RL_t. The final 95 % interval of uncertainty of the return level RL_t is obtained as RL_t ± 2 s_{RL_t}.
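The propagation formula can be illustrated with the following sketch. It assumes the usual GEV return level expression for a non-zero shape parameter, RL_t = μ - (σ/η)[1 - y^(-η)] with y = -log(1 - 1/t), and, as in the formula above, it ignores covariances between the parameter estimates. Function names and numerical values are hypothetical.

```python
from math import log, sqrt

def gev_return_level(t, mu, sigma, eta):
    """GEV return level for return period t (in years), assuming eta != 0."""
    y = -log(1.0 - 1.0 / t)
    return mu - sigma / eta * (1.0 - y ** (-eta))

def return_level_sd(t, mu, sigma, eta, s_mu, s_sigma, s_eta):
    """Standard deviation of RL_t from the parameter standard deviations.

    Uses the partial derivatives of RL_t and neglects parameter covariances.
    """
    y = -log(1.0 - 1.0 / t)
    d_mu = 1.0
    d_sigma = -(1.0 - y ** (-eta)) / eta
    d_eta = (sigma * (1.0 - y ** (-eta)) / eta ** 2
             - sigma / eta * y ** (-eta) * log(y))
    return sqrt((d_mu * s_mu) ** 2 + (d_sigma * s_sigma) ** 2 + (d_eta * s_eta) ** 2)

# Made-up parameter values, only to show the call pattern.
t, mu, sigma, eta = 20.0, 1.0, 0.3, 0.1
rl = gev_return_level(t, mu, sigma, eta)
sd = return_level_sd(t, mu, sigma, eta, s_mu=0.05, s_sigma=0.04, s_eta=0.1)
print(rl, (rl - 2.0 * sd, rl + 2.0 * sd))
```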

Incorporation of the AR(1) into the unconditional model

ACF of the observed time series (shown in red) against the ACF 95 % confidence interval (grey) of the model (obtained through the Monte Carlo procedure). The dashed lines contain the 95 % confidence interval defined by the ACF of a white noise process; i.e. outside this interval the ACF of the contributing variables Y is significant.

Given a statistical model describing time series with serial correlations, to avoid an underestimation of the model uncertainties computed via the bootstrap procedure, it is necessary to use a model which can reproduce the serial correlation. During the bootstrap procedure, simulating samples without serial correlation, and then re-fitting the model to each of them, would mean assuming that the data carry more information than they actually do. In fact, the effective sample size of serially correlated data is smaller than that of data without serial correlation. Here we introduce the procedure we used to build a multivariate statistical model that can represent the serial correlation and the marginal pdf of the variables, and the statistical dependencies between them. The steps are as follows.

1. Fit a linear Gaussian autoregressive model of order 1, AR(1), Yi(t)=c+φYi(t-1)+εi(t), to the ith marginal time series (i=1,2,3), i.e. (Y1Sea,Y2River,Y3River). The chosen AR(1) requires that the modelled variable is Gaussian distributed; so, before the fit, we transformed the river variables via the natural logarithm (log_e), which makes their behaviour closer to Gaussian. The observed sea variable was not transformed because its behaviour was already similar to Gaussian.

2. After verifying via the autocorrelation function (ACF) that εi(t) no longer has a significant serial correlation, fit the joint pdf via PCCs to the residual variables (ε1,ε2,ε3). We observe that the dependencies modelled via PCCs for these pairs are not the usual physical dependencies between the contributing variables (i.e. sea and river levels), but those between their residuals with respect to the AR(1) models.

3. Simulate the residuals (ε1sim,ε2sim,ε3sim) and plug each into the corresponding ith autoregressive model. Finally, to get the simulated contributing variables Y, the river variables are back-transformed via the exp function.
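The univariate part of these steps can be sketched as below for a single river series. The AR(1) is fitted by conditional least squares, which we assume here as a simple stand-in for the fitting method, and the series is taken to be gap-free; the function names and data are hypothetical. In the actual procedure the residuals would be simulated jointly from the fitted PCC rather than reused.

```python
from math import exp, log

def fit_ar1(y):
    """Least-squares fit of y(t) = c + phi * y(t-1) + eps(t).

    Returns (c, phi, residuals); assumes a gap-free series.
    """
    prev, nxt = y[:-1], y[1:]
    m = len(prev)
    mean_p = sum(prev) / m
    mean_n = sum(nxt) / m
    sxx = sum((a - mean_p) ** 2 for a in prev)
    sxy = sum((a - mean_p) * (b - mean_n) for a, b in zip(prev, nxt))
    phi = sxy / sxx
    c = mean_n - phi * mean_p
    residuals = [b - c - phi * a for a, b in zip(prev, nxt)]
    return c, phi, residuals

def simulate_ar1(c, phi, residuals, y0):
    """Rebuild a series by plugging residuals into the fitted AR(1) model."""
    y = [y0]
    for e in residuals:
        y.append(c + phi * y[-1] + e)
    return y

# Toy river levels: log-transform, fit, then back-transform with exp.
river = [2.0, 2.6, 2.2, 3.1, 2.9, 2.4]
log_river = [log(val) for val in river]
c, phi, eps = fit_ar1(log_river)
rebuilt = [exp(val) for val in simulate_ar1(c, phi, eps, log_river[0])]
print(c, phi, rebuilt)
```

Plugging the fitted residuals back in reproduces the original series, which is a convenient sanity check before replacing them with residuals simulated from the joint pdf.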

We observe here that when selecting the fitted pair-copulas and parameters for the residuals via the AIC, the simulated contributing variables Y had a smaller dependence with respect to the observed variables. We therefore proceeded through a tuning procedure; i.e. we built a routine to automatically tune the parameters of the fitted families, requiring that the Kendall rank correlation coefficients among the contributing variables Y were well simulated.

In Fig. , the autocorrelation functions of the Yobs variables are compared with those of Ysim simulated from the fitted model. Because of the gaps in the Yobs time series, not all the observations are usable for computing the ACF (in particular, the percentage of usable data decreases as the lag at which the ACF is computed increases). We therefore computed the ACF up to a lag of about 25 days, which guarantees the use of at least 35 % of the data from the observed time series. Up to a lag of about 15 days, except for very few cases with the variable Y3River, the ACFs of the observed data are always inside the 95 % interval of the ACFs obtained from the fitted model.

We consider this result satisfactory because our target is to include the serial correlation of the contributing variables Y in the model, and we can see that even for the variable Y3River, the values of the ACFs indicate a significant serial correlation. Also, considering that the ACF is only slightly misrepresented for just one of the three time series, we argue that this is likely to have only a small effect on the final assessment of the model uncertainties.

Brier score for extreme values

We employ the Brier score to assess the accuracy of the probabilistic predictions of the conditional model when predicting extreme values of the impact h. We defined an extreme of h as a value larger than the 95th percentile of hobs. The Brier score is BS = \frac{1}{N}\sum_{t=1}^{N}(p_t - o_t)^2, where p_t is the probability of getting an extreme value of h from the model at time t, while o_t is 1 if hobs(t) is extreme and 0 otherwise. We get the value of p_t through a Monte Carlo procedure.

The Brier skill score (BSS) measures the relative accuracy of the model under validation over a reference model, and is defined as BSS = 1 - BS/BS_{ref}, where BS_{ref} is the Brier score of the reference model. Here we consider the climatology of h as the reference model, i.e. an empirical model such that p_t = 0.05 ∀t. A significant positive value of BSS indicates that when predicting extreme values, the model under validation is more accurate – according to the BS – than the reference model.
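Both scores are straightforward to compute once the predicted probabilities and the observed binary outcomes are available. The sketch below uses made-up numbers, and the function names are hypothetical; the climatological reference probability of 0.05 follows the definition given above.

```python
def brier_score(p, o):
    """Mean squared difference between predicted probabilities p_t and outcomes o_t."""
    return sum((pt - ot) ** 2 for pt, ot in zip(p, o)) / len(p)

def brier_skill_score(p, o, p_ref=0.05):
    """BSS relative to a constant climatological forecast p_t = p_ref."""
    reference = [p_ref] * len(o)
    return 1.0 - brier_score(p, o) / brier_score(reference, o)

# Toy example: four time steps, one observed extreme.
p = [0.9, 0.1, 0.0, 0.0]   # model probabilities of an extreme
o = [1, 0, 0, 0]           # 1 if the observed impact was extreme
print(brier_score(p, o), brier_skill_score(p, o))
```

A BSS close to 1 means the model is much more accurate than climatology for extremes, while a value near or below 0 means no improvement over the reference.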

Cross-validation procedure

To assess the quality of the conditional model, avoiding overfitting, we perform a 6-fold cross-validation. To this end, the original sample of data (X,Y) is randomly partitioned into six equally sized subsamples. Of the six subsamples, five subsamples (the training data) are used in fitting the model that is then validated against the remaining subsample. For each training subsample we fit (1) new predictors X for the contributing variables Y, (2) a new joint pdf fY|X(Y|X) and (3) a new h-function. For each validation subsample, we simulated 10^4 realizations of the Y values by conditioning on the concurrent predictors. Finally, by combining the simulations of each validation subsample, 10^4 cross-validation time series of the contributing variables Y and the impact h are obtained.
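The partitioning step can be sketched as follows. The generator below only produces the index sets; fitting the predictors, the joint pdf and the h-function on each training set is not reproduced. The function name and the seed are hypothetical, and when the sample size is not a multiple of six the folds differ in size by at most one.

```python
import random

def k_fold_indices(n, k=6, seed=0):
    """Yield (training_indices, validation_indices) for a random k-fold partition."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

# Example with 20 time steps.
for train, validation in k_fold_indices(20, k=6, seed=1):
    print(len(train), sorted(validation))
```

Each time step appears in exactly one validation fold, so concatenating the simulations over the six validation folds yields a full-length cross-validated series.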

Douglas Maraun had the initial idea for the study. Emanuele Bevacqua and Douglas Maraun jointly developed the study with contributions from Martin Widmann. Emanuele Bevacqua developed the statistical model with contributions from Ingrid Hobæk Haff, Douglas Maraun and Mathieu Vrac. Emanuele Bevacqua carried out the analysis with contributions from Douglas Maraun and Ingrid Hobæk Haff. Emanuele Bevacqua, Douglas Maraun and Martin Widmann interpreted the results. Emanuele Bevacqua wrote the paper with contributions from all other authors.

The authors declare that they have no conflict of interest.

Acknowledgements

Emanuele Bevacqua received funding from the Volkswagen Foundation's CE:LLO project (Az.: 88468), which also supported project meetings. The authors would like to thank Arnoldo Frigessi for hosting them, and for fruitful discussions at the Norwegian Computing Center. Emanuele Bevacqua would like to thank Colin Manning for the productive discussions, and contributions during the writing process. The authors would like to thank the anonymous reviewers for their valuable comments and suggestions which contributed to improving the quality of the paper. The data used for sea and river levels have been provided by the Italian National Institute for Environmental Protection and Research (ISPRA) and Arpae Emilia-Romagna.

Edited by: D. Koutsoyiannis

Reviewed by: J. Zscheischler and two anonymous referees

References

Aas et al.(2009)

Aas, K., Czado, C., Frigessi, A., and Bakken, H.: Pair-copula constructions of multiple dependence, Insurance: Mathematics and Economics, 44, 182–198, 10.1016/j.insmatheco.2007.02.001, 2009.

Acar et al.(2012)

Acar, E. F., Genest, C., and Nešlehová, J.: Beyond simplified pair-copula constructions, J. Multivariate Anal., 110, 74–90, 10.1016/j.jmva.2012.02.001, 2012.

Aghakouchak et al.(2014)

Aghakouchak, A., Cheng, L., Mazdiyasni, O., and Farahmand, A.: Global warming and changes in risk of concurrent climate extremes: Insights from the 2014 California drought, Geophys. Res. Lett., 41, 8847–8852, 10.1002/2014gl062308, 2014.

Arpa Emilia-Romagna(2015)

Arpa Emilia-Romagna: Servizio IdroMeteoClima, Unità Radarmeteorologia, Radarpluviometria, Nowcasting e Reti non convenzionali, Area Centro Funzionale e Sala Operativa Previsioni, Unità gestione Rete idrometeorologica RIRER, Area Modellistica Meteo: Rapporto dell'evento meteorologico del 5 e 6 febbraio 2015, Bologna, Italy, 2015.

Bedford and Cooke(2001a)

Bedford, T. and Cooke, R. M.: Monte Carlo simulation of vine dependent random variables for applications in uncertainty analysis, Proceedings of the European Conference on Safety and Reliability 2001, Turin, Italy, 2001a.

Bedford and Cooke(2001b)

Bedford, T. and Cooke, R. M.: Probability density decomposition for conditionally dependent random variables modeled by vines, Ann. Math. Artif. Intel., 32, 245–268, 10.1023/A:1016725902970, 2001b.

Bedford and Cooke(2002)

Bedford, T. and Cooke, R. M.: Vines–a new graphical model for dependent random variables, Ann. Stat., 30, 1031–1068, 10.1214/aos/1031689016, 2002.

Bevacqua(2017)

Bevacqua, E.: CDVineCopulaConditional: Sampling from Conditional C- and D-Vine Copulas, R package version 0.1.0, https://CRAN.R-project.org/package=CDVineCopulaConditional, last access: 1 June 2017.

Brechmann et al.(2013)

Brechmann, E. C., Hendrich, K., and Czado, C.: Conditional copula simulation for systemic risk stress testing, Insurance: Mathematics and Economics, 53, 722–732, 10.1016/j.insmatheco.2013.09.009, 2013.

Carbognin et al.(2011)

Carbognin, L., Teatini, P., Tosi, L., Strozzi, T., and Tomasin, A.: Present Relative Sea Level Rise in the Northern Adriatic Coastal Area, In: Coastal and marine spatial planning. Marine Research at CNR, DTA/06, CNR – Dipartimento Scienze del Sistema Terra e Tecnologie, Roma, 1147–1162, 2011.

Coles(2001)

Coles, S.: An introduction to statistical modeling of extreme values, Springer, London, 47–49, 2001.

Dee et al.(2011)

Dee, D. P., Uppala, S. M., Simmons, A. J., Berrisford, P., Poli, P., Kobayashi, S., Andrae, U., Balmaseda, M. A., Balsamo, G., Bauer, P., Bechtold, P., Beljaars, A. C. M., Berg, L. V. D., Bidlot, J., Bormann, N., Delsol, C., Dragani, R., Fuentes, M., Geer, A. J., Haimberger, L., Healy, S. B., Hersbach, H., Hólm, E. V., Isaksen, L., Kållberg, P., Köhler, M., Matricardi, M., Mcnally, A. P., Monge-Sanz, B. M., Morcrette, J.-J., Park, B.-K., Peubey, C., Rosnay, P. D., Tavolato, C., Thépaut, J.-N., and Vitart, F.: The ERA-Interim reanalysis: configuration and performance of the data assimilation system, Q. J. Roy. Meteor. Soc., 137, 553–597, 10.1002/qj.828, 2011.

De Michele et al.(2013)

De Michele, C., Salvadori, G., Vezzoli, R., and Pecora, S.: Multivariate assessment of droughts: Frequency analysis and dynamic return period, Water Resour. Res., 49, 6985–6994, 10.1002/wrcr.20551, 2013.

Fischer et al.(2007)

Fischer, E. M., Seneviratne, S. I., Vidale, P. L., Lüthi, D., and Schär, C.: Soil Moisture-Atmosphere Interactions during the 2003 European Summer Heat Wave, J. Climate, 20, 5081–5099, 10.1175/jcli4288.1, 2007.

Flato et al.(2013)

Flato, G., Marotzke, J., Abiodun, B., Braconnot, P., Chou, S. C., Collins, W., Cox, P., Driouech, F., Emori, S., Eyring, V., Forest, C., Gleckler, P., Guilyardi, E., Jakob, C., Kattsov, V., Reason, C., and Rummukainen, M.: Evaluation of Climate Models, in: Climate Change 2013: The Physical Science Basis, Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, edited by: Stocker, T. F., Qin, D., Plattner, G.-K., Tignor, M., Allen, S. K., Boschung, J., Nauels, A., Xia, Y., Bex, V., and Midgley, P. M., Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, 790–791, 2013.

Gambolati et al.(2002)

Gambolati, G., Teatini, P., and Gonella, M.: GIS simulations of the inundation risk in the coastal lowlands of the Northern Adriatic Sea, Math. Comput. Model., 35, 963–972, 10.1016/s0895-7177(02)00063-8, 2002.

Genest and Favre(2007)

Genest, C. and Favre, A.-C.: Everything You Always Wanted to Know about Copula Modeling but Were Afraid to Ask, J. Hydrol. Eng., 12, 347–368, 10.1061/(asce)1084-0699(2007)12:4(347), 2007.

Genest et al.(1995)

Genest, C., Ghoudi, K., and Rivest, L.-P.: A semiparametric estimation procedure of dependence parameters in multivariate families of distributions, Biometrika, 82, 543–552, 10.1093/biomet/82.3.543, 1995.

Giorgi et al.(2009)

Giorgi, F., Jones, C., and Asrar, G. R.: Addressing climate information needs at the regional level: the CORDEX framework, WMO Bulletin, 58, 175–183, 2009.

Gräler et al.(2013)

Gräler, B., van den Berg, M. J., Vandenberghe, S., Petroselli, A., Grimaldi, S., De Baets, B., and Verhoest, N. E. C.: Multivariate return periods in hydrology: a critical and practical review focusing on synthetic design hydrograph estimation, Hydrol. Earth Syst. Sci., 17, 1281–1296, 10.5194/hess-17-1281-2013, 2013.

Gräler et al.(2016)

Gräler, B., Petroselli, A., Grimaldi, S., De Baets, B., and Verhoest, N.: An update on multivariate return periods in hydrology, Proc. IAHS, 373, 175–178, 10.5194/piahs-373-175-2016, 2016.

Heffernan and Stephenson(2016)

Heffernan, J. E. and Stephenson, A. G.: ismev: An Introduction to Statistical Modeling of Extreme Values. R package version 1.41, https://CRAN.R-project.org/package=ismev (last access: 1 June 2017), 2016.

Hobæk Haff(2012)

Hobæk Haff, I.: Comparison of estimators for pair-copula constructions, J. Multivariate Anal., 110, 91–105, 10.1016/j.jmva.2011.08.013, 2012.

Hobæk Haff(2013)

Hobæk Haff, I.: Parameter estimation for pair-copula constructions, Bernoulli, 19, 462–491, 10.3150/12-bej413, 2013.

Hobæk Haff et al.(2010)

Hobæk Haff, I., Aas, K., and Frigessi, A.: On the simplified pair-copula construction – Simply useful or too simplistic?, J. Multivariate Anal., 101, 1296–1310, 10.1016/j.jmva.2009.12.001, 2010.

Hobæk Haff et al.(2015)

Hobæk Haff, I., Frigessi, A., and Maraun, D.: How well do regional climate models simulate the spatial dependence of precipitation? An application of pair-copula constructions, J. Geophys. Res.-Atmos., 120, 2624–2646, 10.1002/2014jd022748, 2015.

Joe(1996)

Joe, H.: Families of m-variate distributions with given margins and m(m-1)/2 bivariate dependence parameters, Institute of Mathematical Statistics Lecture Notes – Monograph Series Distributions with fixed marginals and related topics, 120–141, 10.1214/lnms/1215452614, 1996.

Joe(2014)

Joe, H.: Multivariate Models and Multivariate Dependence Concepts, Taylor & Francis Ltd, United States, 2014.

Kew et al.(2013)

Kew, S. F., Selten, F. M., Lenderink, G., and Hazeleger, W.: The simultaneous occurrence of surge and discharge extremes for the Rhine delta, Nat. Hazards Earth Syst. Sci., 13, 2017–2029, 10.5194/nhess-13-2017-2013, 2013.

Klerk et al.(2015)

Klerk, W. J., Winsemius, H. C., Verseveld, W. J. V., Bakker, A. M. R., and Diermanse, F. L. M.: The co-incidence of storm surges and extreme discharges within the Rhine-Meuse Delta, Environ. Res. Lett., 10, 035005, 10.1088/1748-9326/10/3/035005, 2015.

Kurowicka and Cooke(2005)

Kurowicka, D. and Cooke, R. M.: Sampling algorithms for generating joint uniform distributions using the vine – copula method, 3rd IASC world conference on Computational Statistics and Data Analysis, Limassol, Cyprus, 2005.

Leonard et al.(2014)

Leonard, M., Westra, S., Phatak, A., Lambert, M., Van den Hurk, B., Mcinnes, K., Risbey, J., Schuster, S., Jakob, D., and Stafford-Smith, M.: A compound event framework for understanding extreme impacts, WIREs Clim. Change, 5, 113–128, 10.1002/wcc.252, 2014.

Lian et al.(2013)

Lian, J. J., Xu, K., and Ma, C.: Joint impact of rainfall and tidal level on flood risk in a coastal city with a complex river network: a case study of Fuzhou City, China, Hydrol. Earth Syst. Sci., 17, 679–689, 10.5194/hess-17-679-2013, 2013.

Life Primes(2016a)

LIFE PRIMES: Preventing flooding RIsks by Making resilient communitiES: http://ec.europa.eu/environment/life/project/Projects/index.cfm?fuseaction=search.dspPage&n_proj_id=5247, last access: 6 December 2016a.

Life Primes(2016b)

LIFE PRIMES: Il progetto LIFE PRIMES, http://protezionecivile.regione.emilia-romagna.it/life-primes/progetto/progetto-life-primes/il-progetto-life-primes, last access: 6 December 2016b.

Liu et al.(2015)

Liu, Z., Zhou, P., Chen, X., and Guan, Y.: A multivariate conditional model for streamflow prediction and spatial precipitation refinement, J. Geophys. Res.-Atmos., 120, 10116–10129, 10.1002/2015jd023787, 2015.

Maraun et al.(2010)

Maraun, D., Wetterhall, F., Ireson, A. M., Chandler, R. E., Kendon, E. J., Widmann, M., Brienen, S., Rust, H. W., Sauter, T., Themeßl, M., Venema, V. K. C., Chun, K. P., Goodess, C. M., Jones, R. G., Onof, C., Vrac, M., and Thiele-Eich, I.: Precipitation downscaling under climate change: Recent developments to bridge the gap between dynamical models and the end user, Rev. Geophys., 48, RG3003, 10.1029/2009rg000314, 2010.

Maraun(2014)

Maraun, D.: Reply to “Comment on “Bias Correction, Quantile Mapping, and Downscaling: Revisiting the Inflation Issue””, J. Climate, 27, 1821–1825, 10.1175/jcli-d-13-00307.1, 2014.

Masina et al.(2015)

Masina, M., Lamberti, A., and Archetti, R.: Coastal flooding: A copula based approach for estimating the joint probability of water levels and waves, Coast. Eng., 97, 37–52, 10.1016/j.coastaleng.2014.12.010, 2015.

Materia et al.(2010)

Materia, S., Dirmeyer, P. A., Guo, Z., Alessandri, A., and Navarra, A.: The Sensitivity of Simulated River Discharge to Land Surface Representation and Meteorological Forcings, J. Hydrometeorol., 11, 334–351, 10.1175/2009jhm1162.1, 2010.

Nelsen(2006)

Nelsen, R. B.: An introduction to copulas, Springer, New York, 2006.

NOAA, Tides and Currents(2017)

NOAA, Tides and Currents: https://tidesandcurrents.noaa.gov/, last access: 14 March 2017.

Pathiraja et al.(2012)

Pathiraja, S., Westra, S., and Sharma, A.: Why continuous simulation? The role of antecedent moisture in design flood estimation, Water Resour. Res., 48, W06534, 10.1029/2011wr010997, 2012.

R Core Team(2016)

R Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, https://www.R-project.org/ (last access: 1 June 2017), 2016.

Saghafian and Mehdikhani(2013)

Saghafian, B. and Mehdikhani, H.: Drought characterization using a new copula-based trivariate approach, Nat. Hazards, 72, 1391–1407, 10.1007/s11069-013-0921-6, 2013.

Salvadori and De Michele(2007)

Salvadori, G. and De Michele, C.: On the Use of Copulas in Hydrology: Theory and Practice, J. Hydrol. Eng., 12, 369–380, 10.1061/(asce)1084-0699(2007)12:4(369), 2007.

Salvadori et al.(2007)

Salvadori, G., De Michele, C., Kottegoda, N. T., and Rosso, R.: Extremes in nature: an approach using Copulas, Springer, Dordrecht, the Netherlands, 2007.

Salvadori et al.(2011)

Salvadori, G., De Michele, C., and Durante, F.: On the return period and design in a multivariate framework, Hydrol. Earth Syst. Sci., 15, 3293–3305, 10.5194/hess-15-3293-2011, 2011.

Salvadori et al.(2015)

Salvadori, G., Durante, F., Tomasicchio, G., and D'alessandro, F.: Practical guidelines for the multivariate assessment of the structural risk in coastal and off-shore engineering, Coast. Eng., 95, 77–83, 10.1016/j.coastaleng.2014.09.007, 2015.

Salvadori et al.(2016)

Salvadori, G., Durante, F., De Michele, C., Bernardi, M., and Petrella, L.: A multivariate copula-based framework for dealing with hazard scenarios and failure probabilities, Water Resour. Res., 52, 3701–3721, 10.1002/2015wr017225, 2016.

Schepsmeier et al.(2016)

Schepsmeier, U., Stoeber, J., Brechmann, E. C., Graeler, B., Nagler, T., and Erhardt, T.: VineCopula: Statistical Inference of Vine Copulas, R package version 2.0.5, https://CRAN.R-project.org/package=VineCopula (last access: 1 June 2017), 2016.

Schölzel and Friederichs(2008)

Schölzel, C. and Friederichs, P.: Multivariate non-normally distributed random variables in climate research – introduction to the copula approach, Nonlin. Processes Geophys., 15, 761–772, 10.5194/npg-15-761-2008, 2008.

Seneviratne et al.(2010)

Seneviratne, S. I., Corti, T., Davin, E. L., Hirschi, M., Jaeger, E. B., Lehner, I., Orlowsky, B., and Teuling, A. J.: Investigating soil moisture-climate interactions in a changing climate: A review, Earth-Sci. Rev., 99, 125–161, 10.1016/j.earscirev.2010.02.004, 2010.

Seneviratne et al.(2012)

Seneviratne, S. I., Nicholls, N., Easterling, D., Goodess, C. M., Kanae, S., Kossin, J., Luo, Y., Marengo, J., McInnes, K., Rahimi, M., Reichstein, M., Sorteberg, A., Vera, C., Zhang, X., Rusticucci, M., Semenov, V., Alexander, L. V., Allen, S., Benito, G., Cavazos, T., Clague, J., Conway, D., Della-Marta, P. M., Gerber, M., Gong, S., Goswami, B. N., Hemer, M., Huggel, C., Van den Hurk, B., Kharin, V. V., Kitoh, A., Tank, A. M. K., Li, G., Mason, S., McGuire, W., Oldenborgh, G. J. V., Orlowsky, B., Smith, S., Thiaw, W., Velegrakis, A., Yiou, P., Zhang, T., Zhou, T., and Zwiers, F. W.: Changes in Climate Extremes and their Impacts on the Natural Physical Environment, Managing the Risks of Extreme Events and Disasters to Advance Climate Change Adaptation Special Report of the Intergovernmental Panel on Climate Change, 109–230, 10.1017/cbo9781139177245.006, 2012.

Serinaldi(2015)

Serinaldi, F.: Can we tell more than we can know? The limits of bivariate drought analyses in the United States, Stoch. Env. Res. Risk A., 30, 1691, 10.1007/s00477-015-1124-3, 2015.

Serinaldi et al.(2009)

Serinaldi, F., Bonaccorso, B., Cancelliere, A., and Grimaldi, S.: Probabilistic characterization of drought properties through copulas, Phys. Chem. Earth, 34, 596–605, 10.1016/j.pce.2008.09.004, 2009.

Shiau(2003)

Shiau, J. T.: Return period of bivariate distributed extreme hydrological events, Stoch. Env. Res. Risk A., 17, 42–57, 10.1007/s00477-003-0125-9, 2003.

Shiau et al.(2007)

Shiau, J.-T., Feng, S., and Nadarajah, S.: Assessment of hydrological droughts for the Yellow River, China, using copulas, Hydrol. Process., 21, 2157–2163, 10.1002/hyp.6400, 2007.

Sklar(1959)

Sklar, A.: Fonctions de Répartition à n Dimensions et Leurs marges, 8, Publications de l'Institut de Statistique de L'Université de Paris, Paris, France, 1959.

Stöber et al.(2013)

Stöber, J., Joe, H., and Czado, C.: Simplified pair copula constructions – Limitations and extensions, J. Multivariate Anal., 119, 101–118, 10.1016/j.jmva.2013.04.014, 2013.

Svensson and Jones(2002)

Svensson, C. and Jones, D. A.: Dependence between extreme sea surge, river flow and precipitation in eastern Britain, Int. J. Climatol., 22, 1149–1168, 10.1002/joc.794, 2002.

Taylor et al.(2012)

Taylor, K. E., Stouffer, R. J., and Meehl, G. A.: An Overview of CMIP5 and the Experiment Design, B. Am. Meteorol. Soc., 93, 485–498, 10.1175/bams-d-11-00094.1, 2012.

Tisseuil et al.(2010)

Tisseuil, C., Vrac, M., Lek, S., and Wade, A. J.: Statistical downscaling of river flows, J. Hydrol., 385, 279–291, 10.1016/j.jhydrol.2010.02.030, 2010.

Tsimplis and Woodworth(1994)

Tsimplis, M. N. and Woodworth, P. L.: The global distribution of the seasonal sea level cycle calculated from coastal tide gauge data, J. Geophys. Res., 99, 16031, 10.1029/94jc01115, 1994.

Van Den Brink et al.(2004)

Van Den Brink, H. W., Können, G. P., Opsteegh, J. D., Oldenborgh, G. J. V., and Burgers, G.: Improving 10^4-year surge level estimates using data of the ECMWF seasonal prediction system, Geophys. Res. Lett., 31, L17210, 10.1029/2004gl020610, 2004.

Van Den Brink et al.(2005)

Van Den Brink, H. W., Können, G. P., Opsteegh, J. D., Oldenborgh, G. J. V., and Burgers, G.: Estimating return periods of extreme events from ECMWF seasonal forecast ensembles, Int. J. Climatol., 25, 1345–1354, 10.1002/joc.1155, 2005.

Van den Hurk et al.(2015)

Van den Hurk, B., Meijgaard, E. V., Valk, P. D., Heeringen, K.-J. V., and Gooijer, J.: Analysis of a compounding surge and precipitation event in the Netherlands, Environ. Res. Lett., 10, 035001, 10.1088/1748-9326/10/3/035001, 2015.

Wahl et al.(2015)

Wahl, T., Jain, S., Bender, J., Meyers, S. D., and Luther, M. E.: Increasing risk of compound flooding from storm surge and rainfall for major US cities, Nature Climate Change, 5, 1093–1097, 10.1038/nclimate2736, 2015.

Widmann(2005)

Widmann, M.: One-Dimensional CCA and SVD, and Their Relationship to Regression Maps, J. Climate, 18, 2785–2792, 10.1175/jcli3424.1, 2005.

Volpi and Fiori(2014)

Volpi, E. and Fiori, A.: Hydraulic structures subject to bivariate hydrological loads: Return period, design, and risk assessment, Water Resour. Res., 50, 885–897, 10.1002/2013wr014214, 2014.

Zheng et al.(2013)

Zheng, F., Westra, S., and Sisson, S. A.: Quantifying the dependence between extreme rainfall and storm surge in the coastal zone, J. Hydrol., 505, 172–187, 10.1016/j.jhydrol.2013.09.054, 2013.

Zheng et al.(2014)

Zheng, F., Westra, S., Leonard, M., and Sisson, S. A.: Modeling dependence between extreme rainfall and storm surge to estimate coastal flooding risk, Water Resour. Res., 50, 2050–2071, 10.1002/2013wr014616, 2014.

Zheng et al.(2015)

Zheng, F., Leonard, M., and Westra, S.: Efficient joint probability analysis of flood risk, J. Hydroinform., 17, 584–597, 10.2166/hydro.2015.052, 2015.

</app></app-group></back> </article>