Hydrology and Earth System Sciences (HESS), Hydrol. Earth Syst. Sci., ISSN 1607-7938, Copernicus Publications, Göttingen, Germany
Hydrol. Earth Syst. Sci., 21, 2701–2723, 2017, doi:10.5194/hess-21-2701-2017
Multivariate statistical modelling of compound events via pair-copula constructions: analysis of floods in Ravenna (Italy)
Emanuele Bevacqua (https://orcid.org/0000-0003-0472-5183), Douglas Maraun, Ingrid Hobæk Haff, Martin Widmann (https://orcid.org/0000-0001-5447-5763), Mathieu Vrac
Affiliations: Wegener Center for Climate and Global Change, University of Graz, Graz, Austria; Department of Mathematics, University of Oslo, Oslo, Norway; School of Geography, Earth and Environmental Sciences, University of Birmingham, Birmingham, UK; Laboratoire des Sciences du Climat et de l'Environnement, CNRS/IPSL, Gif-sur-Yvette, France
Correspondence: Emanuele Bevacqua (emanuele.bevacqua@uni-graz.at)
Published: 8 June 2017 (volume 21, issue 6, pages 2701–2723). Further dates in the article history: 9 December 2016, 2 January 2017, 6 April 2017, 1 May 2017
This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this licence, visit https://creativecommons.org/licenses/by/3.0/
This article is available from https://hess.copernicus.org/articles/21/2701/2017/hess-21-2701-2017.html
The full text article is available as a PDF file from https://hess.copernicus.org/articles/21/2701/2017/hess-21-2701-2017.pdf
Abstract
Compound events (CEs) are multivariate extreme events in which the individual
contributing variables may not be extreme themselves, but their joint –
dependent – occurrence causes an extreme impact. Conventional univariate
statistical analysis cannot give accurate information regarding the
multivariate nature of these events. We develop a conceptual model,
implemented via pair-copula constructions, which allows for the
quantification of the risk associated with compound events in present-day and
future climate, as well as the uncertainty estimates around such risk. The
model includes predictors, which could represent for instance meteorological
processes that provide insight into both the involved physical mechanisms and
the temporal variability of compound events. Moreover, this model enables
multivariate statistical downscaling of compound events. Downscaling is
required to extend the compound events' risk assessment to the past or future
climate, where climate models either do not simulate realistic values of the
local variables driving the events or do not simulate them at all. Based on
the developed model, we study compound floods, i.e. joint storm surge and
high river runoff, in Ravenna (Italy). To explicitly quantify the risk, we
define the impact of compound floods as a function of sea and river levels.
We use meteorological predictors to extend the analysis to the past, and get
a more robust risk analysis. We quantify the uncertainties of the risk
analysis, observing that they are very large due to the shortness of the
available data, though this may also be the case in other studies where they
have not been estimated. Ignoring the dependence between sea and river levels
would result in an underestimation of risk; in particular, the expected
return period of the highest compound flood observed increases from about 20
to 32 years when switching from the dependent to the independent case.
Introduction
On 6 February 2015, a low-pressure system that developed over the north of
Spain moved across the island of Corsica into Italy. The low pressure itself
(Fig. ) and the associated south-easterly winds
drove a storm surge to the Adriatic coast at Ravenna (Italy). Alongside the
storm surge, large amounts of precipitation fell in the surrounding area,
causing high values of discharge in small rivers near the coast. These river
discharges were partially obstructed from draining into the sea by the storm
surge, which then contributed to major flooding along the coast.
Figure: Sea level pressure and total precipitation on 6 February 2015,
when the coastal area of Ravenna (indicated by the yellow dot) was hit
by a compound flood.
Such a compound flood is a typical example of a compound
event (CE). CEs are multivariate extreme events in which the individual
contributing variables may not be extreme themselves, but their
joint – dependent – occurrence causes an extreme impact. The impact
of CEs may be a climatic variable such as the gauge level (e.g. for compound
floods), or other relevant variables such as fatalities or economic losses.
CEs have received little attention so far, as underlined in the report of the
Intergovernmental Panel on Climate Change on extreme events.
CEs are responsible for a very broad class of impacts on society. For
example, heatwaves amplified by the lack of soil moisture, which reduces the
latent cooling, may be classified as CEs. The
impact of drought cannot be fully described by a single variable:
analyses have been carried out which
consider drought severity, duration, maximum deficit,
as well as the affected area.
Another example of a CE is a fluvial flood resulting from extreme rainfall
occurring on a wet catchment.
In the recent literature, more attention has been given to the study of CEs
through multivariate statistical methods, which can offer more
in-depth information regarding the multivariate nature of CEs than
conventional univariate analysis. Combinations of univariate analyses for
studying CEs are only sufficient when no dependence exists among the compound
variables. However, this is not usually the case, and ignoring the dependence
would lead to misleading conclusions about the risk associated with CEs.
Modelling CEs is a complex undertaking, and methods to
adequately study them are required. Parametric multivariate statistical
models allow one to constrain the dependencies between the contributing
variables of CEs, as well as their marginal distributions.
The parametric structure reduces the uncertainties of the statistical
properties we want to estimate from the data, compared to empirical
estimates. However, such a reduction of the uncertainties depends on the
choice of a proper parametric model. As observed data are often limited, the
uncertainties might be substantial and should thus be quantified.
Due to the complex dependence structure between the contributing variables,
advanced multivariate statistical models are necessary to model CEs. For
example, modelling the multivariate probability distribution of the
contributing variables with multivariate Gaussian distributions would usually
not produce satisfactory results. A multivariate Gaussian distribution would
assume that the dependencies between all the pairs are of the same type
(homogeneity of the pair dependencies), and that there is no dependence
between the extremes (no tail dependence). Furthermore, a multivariate
Gaussian distribution would assume that all of the marginal distributions
are Gaussian. To solve these problems, the use of copulas has been
introduced in geophysics and climate science. Through copulas, it is possible to
model the dependence structure of variables separately from their marginal
distributions. However, multivariate parametric copulas lack flexibility when
modelling systems with high dimensionality, where heterogeneous dependencies
exist among the different pairs. This lack of
flexibility would be a limitation for many types of compound
events. Pair-copula constructions (PCCs) decompose the dependence structure
into bivariate copulas (some of which are conditional) and give greater
flexibility in modelling generic high-dimensional systems compared to
multivariate parametric copulas.
Here we develop a multivariate statistical model, based on PCCs, which allows
for an adequate description of the dependencies between the contributing
variables. The model provides a straightforward quantification of risk
uncertainty, which is reduced with respect to the uncertainties obtained when
computing the risk directly on the observed data of the impact. We extend the
multivariate statistical model by including predictors for the contributing
variables. Such predictors could represent for instance meteorological
processes driving the contributing variables. This increase in complexity of
the model due to additional variables is accommodated for through the use of
PCCs. The predictors allow us to (1) gain insight into the physical processes
underlying CEs, as well as into the temporal variability of CEs, and (2) to
statistically downscale CEs and their impacts. Downscaling may be used to
statistically extend the risk assessment back in time to periods where
observations of the predictors but not of the contributing variables and
impacts are available, or to assess potential future changes in CEs based on
climate models. Based on this model, we study compound flooding in Ravenna.
In the context of compound floods, the dependence between rainfall and sea
level has previously been studied for other regions. Among these
studies, an increase in the risk of compound
flooding in major US cities, driven by an increasing dependence between storm
surges and extreme rainfall, has been observed. The impact of compound floods can be described
as the gauge level in a river near the coast, which is driven both by the
river discharge upstream and the sea level. Only a few studies have
explicitly quantified the impact of compound floods and the associated risks.
The reason
might be difficulties in quantifying the impact due to a lack of data. For
the Rotterdam case study, the impact has been explicitly quantified.
However, there is still debate as
to whether the floods in this case are actually CEs, i.e. if surges and
discharges can be treated independently or not when assessing the risk of
flooding. As previously discussed in the literature, a significant dependence is more
likely in small catchments, such as those in mountainous areas by the coast,
which have a quick response time to rainfall that may favour the coincidence
of high river flows and storm surges driven by the same synoptic weather
system.
Here, we explicitly define the impact of compound floods as a function of sea
and river levels in order to quantify the flood risk and its related
uncertainties. Moreover, we quantify the risk underestimation that occurs
when the dependence among sea and river levels is not considered. We identify
the meteorological predictors driving the river and sea levels. By
incorporating such predictors into the statistical model, we extend the
analysis of compound floods into the past, where data are available for
predictors but not for the river and sea level stations.
The paper is organized as follows. The Ravenna case study is discussed in
the next section.
We then introduce the conceptual model for compound events, followed by
pair-copula
constructions, i.e. the mathematical method we use to implement the model.
Based on the presented
conceptual model for compound events, we then
develop the model for compound floods in Ravenna. Results are presented
afterwards, followed by discussion and conclusions.
More technical details can be found in the
Appendices.
Compound flooding in the coastal area of Ravenna
In this study, we focus on the risk of compound floods in the coastal area of
Ravenna. The choice of the case study was motivated by the extreme event that
happened on 6 February 2015, as presented in the Introduction. On the day
prior to the event, values of up to approximately 80 mm of rain were
recorded in the surrounding area of Ravenna, and around 90 mm on the day
of the event itself. The sea level recorded was the highest observed in the
last 18 years . The high risk of flooding to the
population in the Ravenna region has been underlined by the LIFE PRIMES
project , recently financed by the European Commission,
whose target is “to reduce the damages caused to the territory and
population by events such as floods and storm surges” in
Ravenna and its surrounding areas. Natural and anthropogenic subsidence
represents a threat to the coastal area
of Ravenna, characterized by land elevations which are in many places below
2 m above mean sea level. The sea level inundation
risk along the coast of Ravenna has recently been studied
by considering the joint effect of seawater level and
significant wave height.
A schematic representation of the catchment on which we focus is shown in the
black rectangle of Fig. . The Y variables, river
and sea levels, represent the contributing variables, and the water level
h is the impact of the compound flood. The X variables are meteorological
predictors of the contributing variables Y, which will be discussed in more
detail later.
Figure: Hydraulic system for the Ravenna catchment. The area affected by
compound floods is marked by the red point. The impact is the water level
$h$, which is influenced by the contributing variables $Y$, i.e. sea and
river levels. The variables inside the black rectangle are used to develop
the three-dimensional (unconditional) model. The $X$ are the meteorological
predictors driving the contributing variables $Y$, which are incorporated
into the five-dimensional (conditional) model.
We develop a multivariate statistical model able to assess the risk of
compound floods in Ravenna. Our research objectives are the following.
1. Develop a statistical model to represent the dependencies between the contributing variables of the compound floods, via pair-copula constructions.
2. Explicitly define the impact of compound floods as a function of the contributing variables. This allows us to estimate the risk and the related uncertainty.
3. Identify the meteorological predictors for the contributing variables $Y$. Incorporate the meteorological predictors into the model to gain insight
into the physical mechanisms driving the compound floods and into their temporal variability.
4. Extend the analysis into the past (where data are available for the predictors but not for the contributing variables $Y$).
Dataset
The data used here for the contributing variables Y and the impact h are
water levels at a daily resolution (daily averages of hourly measurements).
We use data for the extended winter season (November–March) of the period
2009–2015. Data sources are the Italian National Institute for
Environmental Protection and Research (ISPRA) for the sea, and Arpae
Emilia-Romagna for rivers and impact. River data were processed in order to
mask periods of low quality, i.e. those suspected of being influenced by
human activities such as the use of a dam. Moreover, we applied a procedure
to homogenize the data of the rivers; details are given in
Appendix . We do not filter out the astronomical
tide component of the sea level, considering that the range of variation of
the daily average of sea level is about 1 m, while that of the astronomical
tide is about 9 cm. To check the above, we used the astronomical tide
obtained through FES2012, which is a software produced by Noveltis, Legos
and the CLS Space Oceanography Division and distributed by Aviso, with
support from Cnes (http://www.aviso.altimetry.fr/). Meteorological
predictors were obtained from the ECMWF ERA-Interim reanalysis dataset
(covering the period 1979–2015, with a resolution of
0.75° × 0.75°). Specifically, for the river predictors we use
daily data (sum of 12-hourly values) of total precipitation, evaporation,
snowmelt and snowfall, while for the sea level predictor we use daily data
(average of 6-hourly values) of sea level pressure.
Conceptual conditional model for compound events
A CE has been defined as “an extreme impact that depends on multiple
statistically dependent variables or events”. This definition stresses the
extremeness of the impact rather than that of the individual contributing
variables, which may not be extreme themselves, and the importance of the
dependence between these contributing variables. The physical reasons for the
dependence among the contributing variables can be different. There can be a
mutual reinforcement of one variable by the other and vice versa due to
system feedbacks, e.g. the mutual enhancement of droughts and heatwaves in
transitional regions between dry and wet climates . Or the
probability of occurrence of the contributing variables can be influenced by
a large-scale weather condition, as has occurred in Ravenna
(Fig. ), where the low-pressure system caused
coinciding extremes of river runoff and sea level. It is clear then that the
dependence among the contributing variables represents a fundamental aspect
of compound events, and so it must be properly modelled to represent these
extreme events well.
Our statistical conditional model consists of three components: the
contributing variables $Y_i$, including a model of their dependence
structure, the impact $h$, and predictors $X_j$ of the contributing
variables. The contributing variables $Y_i$ and their multivariate dependence
structure drive the CE. For instance, in the case of compound floods, the
contributing variables are runoff and sea level. The impact $h$ of a CE can
be formalized via an impact function $h = h(Y_1, \dots, Y_n)$. In the case of
compound flooding, we define the river gauge level in Ravenna as the impact,
but in principle it can be any measurable variable such as agricultural yield
or economic loss. The predictors $X_j$ provide insight into the physical
processes underlying CEs, including the temporal variability of CEs, and can
be used to statistically downscale CEs when the variables $Y$ and the impact
$h$ are not available.
The downscaling feature is particularly useful for compound events, which are
not realistically simulated or may not even be simulated at all by available
climate models. For instance, standard global and regional climate models do
not simulate realistic runoff , and
do not simulate sea surges. Here, our model can be used to downscale these
contributing variables, e.g. from simulated large-scale meteorological
predictors. In particular, the model provides a simultaneous, i.e.
multivariate, downscaling of the contributing variables $Y_i$, which allows
for a realistic representation both of the dependencies between the $Y_i$ and
of their marginal distributions. This is relevant because a separate
downscaling of the contributing variables $Y_i$ may lead to unrealistic
representations of the dependencies between the $Y_i$, which in turn would
cause a poor estimation of the impact $h$. The downscaling feature can be
useful for extending the risk analysis into the past, where observations of
the predictors but not of the contributing variables and impacts are
available.
More specifically, the conceptual conditional model consists of the
following.
1. An impact function to quantify the impact $h$:
$$h = h(Y_1, \dots, Y_n). \tag{1}$$
2. Predictors $X$ for the contributing variables $Y$.
3. A conditional joint probability density function (pdf) $f_{Y|X}(Y|X)$ of
the contributing variables $Y$, given the predictors $X$ (which we describe through a parametric model,
via pair-copula constructions). In particular, both the contributing variables $Y$ and predictors $X$
are time dependent, i.e. $Y = Y(t)$ and $X = X(t)$.
A particular type of such a model is obtained when the predictors are not
considered in the joint pdf, i.e. when considering $f_Y(Y)$.
This unconditional model does not allow for changes in the contributing
variables Y and in the impact due to variations of the predictors X. In
general, formalizing the impact h of a CE as in step 1 – to then assess
the risk of CE based on values of h – corresponds to the
structural approach,
which has recently been formalized in . Here, the
advantage of the general model we propose is that it allows for taking into
account variations of the impact h driven by temporal changes in the
predictors X. Through the conditional pdf, the model allows for a realistic
representation both of the dependencies between the Yi and of their
marginal distributions.
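The role of the conditional pdf can be illustrated with a minimal sketch. It does not reproduce the vine-based model of this study: it assumes a single predictor and a single contributing variable linked by a bivariate Gaussian copula, a standard normal predictor and a Gumbel marginal for the contributing variable. The function names, the correlation value and the marginals are hypothetical choices made only for illustration.

```python
import math
import random
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()


def gumbel_quantile(p, loc=0.0, scale=1.0):
    """Inverse CDF of a Gumbel distribution (hypothetical marginal of Y)."""
    return loc - scale * math.log(-math.log(p))


def sample_y_given_x(x, rho, n, rng):
    """Draw n values of Y conditional on the predictor value X = x.

    Illustrative assumptions: X is standard normal, Y is Gumbel(0, 1), and
    their dependence is a Gaussian copula with correlation rho.
    """
    z_x = STD_NORMAL.inv_cdf(STD_NORMAL.cdf(x))  # predictor on the normal scale
    draws = []
    for _ in range(n):
        # Conditional law under a Gaussian copula: z_y | z_x ~ N(rho*z_x, 1 - rho^2)
        z_y = rho * z_x + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        u_y = STD_NORMAL.cdf(z_y)            # back to the uniform (copula) scale
        draws.append(gumbel_quantile(u_y))   # apply the marginal of Y
    return draws


rng = random.Random(0)
y_calm = sample_y_given_x(x=-1.0, rho=0.7, n=5000, rng=rng)
y_storm = sample_y_given_x(x=2.0, rho=0.7, n=5000, rng=rng)
mean_calm, mean_storm = mean(y_calm), mean(y_storm)
print(f"mean of Y given X=-1: {mean_calm:.2f}; given X=2: {mean_storm:.2f}")
```

In this toy setting a change in the predictor shifts the whole conditional distribution of the contributing variable, which is the mechanism that lets the conditional model follow temporal changes in the predictors.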
When the variables $Y$ are available but not the impact $h$, the model can
still be used to estimate only the variables $Y$. This may be useful when
assessing the risk of CEs through e.g. multivariate return periods of the
contributing variables $Y$.
Moreover, it may happen that the impact $h$ is available but the variables
$Y$ are not. In this case the model may still be used in the form
$f_{h|X}(h|X)$ to directly estimate the impact $h$, based on the
conditional joint pdf of the impact $h$, given the predictors $X$. In this
case, depending on the physical system, it may be more or less complicated to
calibrate the predictors. Also, we observe that Eq. (1) is general, and a
possibility for estimating the impact would be to use the conditional joint
pdf $f_{h|Y}(h|Y)$. Such an approach may be useful for cases
where complex relations exist between the impact $h$ and the variables $Y$,
and therefore it may be difficult to implement e.g. a proper regression model
to describe the impact $h$.
An advantage of using a parametric statistical model is that this constrains
the dependencies between the contributing variables, as well as their
marginal distributions, and thereby reduces their uncertainties with respect
to empirical estimates . Such a reduction in turn reduces
the uncertainty in the estimated physical quantity of interest, like the
impact of the CE. However, the uncertainty reduction depends on the choice of
a proper parametric model, in particular when modelling the tail of a
univariate or multivariate distribution.
Statistical method
Pair-copula constructions (PCCs) are mathematical decompositions of
multivariate pdfs proposed by , which allow for the modelling
of multivariate dependencies with high flexibility. We start by presenting
the concept of copulas, and then we introduce PCCs. More technical details
can be found in the Appendices.
Copulas
Consider a vector $Y = (Y_1, \dots, Y_n)$ of random variables, with
marginal pdfs $f_1(y_1), \dots, f_n(y_n)$, and marginal cumulative
distribution functions (CDFs) $F_1(y_1), \dots, F_n(y_n)$, defined on
$\mathbb{R} \cup \{-\infty, \infty\}$. We use the recurring definition
$u_i := F_i(y_i)$, where the name $u$ indicates that these variables are
uniformly distributed by construction. According to Sklar's theorem,
the joint CDF $F(y_1, \dots, y_n)$ can be written as
$$F(y_1, \dots, y_n) = C(u_1, \dots, u_n),$$
where $C$ is an $n$-dimensional copula. $C$ is a copula if $C: [0,1]^n \to [0,1]$ is a joint CDF of an $n$-dimensional random vector on the
unit cube $[0,1]^n$ with uniform marginals.
Under the assumption that the marginal distributions $F_i$ are continuous,
the copula $C$ is unique and the multivariate pdf can be decomposed as
$$f(y_1, \dots, y_n) = f_1(y_1) \cdot \ldots \cdot f_n(y_n) \cdot c(u_1, \dots, u_n),$$
where $c$ is the copula density. This decomposition explicitly
represents the pdf as a product of the marginal
distributions and the copula density, which describes the dependence among
the variables independently of their marginals. It
has some important practical consequences: it allows us to generate a large
number of joint pdfs. In fact, inserting any existing family for the marginal
pdfs and copula density into the decomposition, it is possible to
construct a valid joint pdf, provided that suitable constraints are
satisfied. The group of the existing parametric families of multivariate
distributions (e.g. the multivariate normal distribution, which has normal
marginals and copula) is only a part of the realizations which are possible
via this decomposition. Copulas therefore make it easy to construct a
wide range of multivariate parametric distributions.
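The separation between dependence and marginals can be checked numerically. The following is a minimal sketch under assumed ingredients that are not taken from this study: a bivariate Clayton copula with parameter theta = 2, a Gumbel marginal and an exponential marginal. Because Kendall's tau depends only on the copula, it should be the same on the uniform scale and after applying the marginals, and close to the Clayton value theta / (theta + 2).

```python
import math
import random


def uniform_open(rng):
    """Uniform draw on the open interval (0, 1)."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def sample_clayton(theta, n, rng):
    """Sample n pairs (u1, u2) from a Clayton copula by conditional inversion."""
    pairs = []
    for _ in range(n):
        u1, w = uniform_open(rng), uniform_open(rng)
        u2 = ((w ** (-theta / (1.0 + theta)) - 1.0) * u1 ** (-theta) + 1.0) ** (-1.0 / theta)
        pairs.append((u1, u2))
    return pairs


def gumbel_quantile(u, loc, scale):
    return loc - scale * math.log(-math.log(u))


def exponential_quantile(u, rate):
    return -math.log(1.0 - u) / rate


def kendall_tau(pairs):
    """Kendall's tau from concordant minus discordant pairs (ties ignored)."""
    n = len(pairs)
    score = 0
    for i in range(n):
        xi, yi = pairs[i]
        for j in range(i + 1, n):
            xj, yj = pairs[j]
            score += 1 if (xi - xj) * (yi - yj) > 0 else -1
    return 2.0 * score / (n * (n - 1))


rng = random.Random(42)
theta = 2.0  # hypothetical copula parameter
u_pairs = sample_clayton(theta, 800, rng)
# Apply arbitrary (hypothetical) marginals to the copula sample: y_i = F_i^{-1}(u_i)
y_pairs = [(gumbel_quantile(u1, 0.0, 0.2), exponential_quantile(u2, 0.5))
           for u1, u2 in u_pairs]
tau_u = kendall_tau(u_pairs)
tau_y = kendall_tau(y_pairs)
tau_theory = theta / (theta + 2.0)
print(f"tau on copula scale: {tau_u:.3f}, after marginals: {tau_y:.3f}, theory: {tau_theory:.3f}")
```

Swapping the marginal quantile functions changes the joint distribution of the sample but leaves the rank-based dependence measure unchanged, which is the practical content of the decomposition.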
Tail dependence
The dependence of extreme events cannot be measured by overall correlation
coefficients such as Pearson, Spearman or Kendall. Given two random variables
which are uncorrelated according to such overall dependence coefficients,
there can be a significant probability of getting concurrent extremes of both
variables, i.e. a tail dependence . On the contrary, two
random variables which are correlated according to an overall dependence
coefficient may not necessarily be tail dependent.
Mathematically, given two random variables $Y_1$ and $Y_2$ with marginal CDFs
$F_1$ and $F_2$ respectively, they are upper tail dependent if the
following limit exists and is non-zero:
$$\lambda_U(Y_1, Y_2) = \lim_{u \to 1} P\left(Y_2 > F_2^{-1}(u) \mid Y_1 > F_1^{-1}(u)\right),$$
where $P(A|B)$ indicates the generic conditional probability of occurrence of
the event $A$ given the event $B$. Similarly, the two variables are lower
tail dependent if
$$\lambda_L(Y_1, Y_2) = \lim_{u \to 0} P\left(Y_2 < F_2^{-1}(u) \mid Y_1 < F_1^{-1}(u)\right)$$
exists and is non-zero.
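As a hedged numerical illustration of the lower tail dependence limit, the sketch below uses a Clayton copula, which is not necessarily a family selected in this study. For this copula the conditional probability P(U2 < q | U1 < q) equals C(q, q) / q and tends to 2^(-1/theta) as q tends to 0, whereas for independent variables it equals q and vanishes. The parameter value, sample size and threshold q are arbitrary assumptions.

```python
import random


def clayton_cdf(u, v, theta):
    """Clayton copula C(u, v) for theta > 0."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def lower_tail_ratio(q, theta):
    """P(U2 < q | U1 < q) = C(q, q) / q for a Clayton copula."""
    return clayton_cdf(q, q, theta) / q


def sample_clayton(theta, n, rng):
    """Sample n pairs from a Clayton copula by conditional inversion."""
    pairs = []
    for _ in range(n):
        u1 = 1.0 - rng.random()  # in (0, 1], avoids division by zero below
        w = 1.0 - rng.random()
        u2 = ((w ** (-theta / (1.0 + theta)) - 1.0) * u1 ** (-theta) + 1.0) ** (-1.0 / theta)
        pairs.append((u1, u2))
    return pairs


def empirical_lower_tail_ratio(pairs, q):
    """Fraction of pairs with second component below q, among those with first below q."""
    below_first = [p for p in pairs if p[0] < q]
    return sum(1 for p in below_first if p[1] < q) / len(below_first)


rng = random.Random(3)
theta, q, n = 2.0, 0.05, 20000  # hypothetical settings
clayton_pairs = sample_clayton(theta, n, rng)
independent_pairs = [(rng.random(), rng.random()) for _ in range(n)]

limit = 2.0 ** (-1.0 / theta)                  # lambda_L of the Clayton copula
analytic_at_q = lower_tail_ratio(q, theta)     # exact ratio at the finite threshold q
emp_clayton = empirical_lower_tail_ratio(clayton_pairs, q)
emp_independent = empirical_lower_tail_ratio(independent_pairs, q)
print(f"limit {limit:.3f}, analytic at q {analytic_at_q:.3f}, "
      f"empirical Clayton {emp_clayton:.3f}, empirical independent {emp_independent:.3f}")
```

The empirical ratio is only a finite-threshold approximation of the limit, so its value depends on the choice of q and on the sample size.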
Pair-copula constructions (PCCs)
While the number of bivariate copula families is very large
, building higher-dimensional copulas is generally
recognized as a difficult problem
. As a consequence, the set of copula families having a
dimension greater than or equal to 3 is rather limited, and they lack
flexibility in modelling multivariate pdfs where heterogeneous dependencies
exist among different pairs. For instance, they usually prescribe that all
the pairs have the same type of dependence, e.g. they are either all tail
dependent or all not tail dependent. Under the assumption that the joint CDF
is absolutely continuous, with strictly increasing marginal CDFs, PCCs allow
us to mathematically decompose an n-dimensional copula density into the
product of n(n-1)/2 bivariate copulas, some of which are conditional. In
practice, this provides high flexibility in building high-dimensional
copulas. PCCs allow for the independent selection of the pair-copulas among
the large set of families, providing higher flexibility in building
high-dimensional joint pdfs with respect to using the existing multivariate
parametric copulas .
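As a minimal worked case (a standard three-dimensional example, not a model used in this study), a PCC for n = 3 contains n(n-1)/2 = 3 pair-copulas: two unconditional ones and one conditional on the middle variable. Here the notation $u_{i|2} := F_{i|2}(y_i \mid y_2)$ for the conditional CDFs is an assumption chosen to match the notation of the five-dimensional vine used in this paper.

```latex
f(y_1, y_2, y_3) = f_1(y_1) \cdot f_2(y_2) \cdot f_3(y_3)
  \cdot c_{12}(u_1, u_2) \cdot c_{23}(u_2, u_3)
  \cdot c_{13|2}(u_{1|2}, u_{3|2}),
\qquad u_{i|2} := F_{i|2}(y_i \mid y_2), \quad i \in \{1, 3\}.
```

Each of the three pair-copulas can be chosen from a different bivariate family, which is the source of the flexibility described above.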
When the dimension of the pdf is large, there can be many possible,
mathematically equally valid decompositions of the copula density into a PCC.
For example, for a five-dimensional system there are 480 possible different
decompositions. For this reason, the regular vine has been
introduced, a graphical model which helps to organize
the possible decompositions. This is helpful for choosing which PCC to use to
decompose the multivariate copula. In this study we concentrate on the
subcategories canonical (also known as C-vine) and D-vine of regular
vines. Out of the 480 possible decompositions for a five-dimensional copula
density, 240 are regular vines (60 C-vines, 60 D-vines and 120 other types of
vines) . The decomposition we selected for the conditional
model is the following D-vine:
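$$
\begin{aligned}
f_{12345}(y_1, y_2, y_3, y_4, y_5) ={}& f_4(y_4) \cdot f_5(y_5) \cdot f_3(y_3) \cdot f_1(y_1) \cdot f_2(y_2) \\
&\cdot c_{45}(u_4, u_5) \cdot c_{53}(u_5, u_3) \cdot c_{31}(u_3, u_1) \cdot c_{12}(u_1, u_2) \\
&\cdot c_{43|5}(u_{4|5}, u_{3|5}) \cdot c_{51|3}(u_{5|3}, u_{1|3}) \cdot c_{32|1}(u_{3|1}, u_{2|1}) \\
&\cdot c_{41|35}(u_{4|53}, u_{1|53}) \cdot c_{52|13}(u_{5|31}, u_{2|31}) \\
&\cdot c_{42|135}(u_{4|513}, u_{2|513}),
\end{aligned}
$$
where $(Y_1, Y_2, Y_3)$ are the variables
$(Y_1^{\mathrm{Sea}}, Y_2^{\mathrm{River}}, Y_3^{\mathrm{River}})$,
and $(Y_4, Y_5)$ are the predictors
$(X_1^{\mathrm{Sea}}, X_{23}^{\mathrm{Rivers}})$ (details about the
predictors are given in the next section). Details about the selection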
procedure of the vine (Eq. ) are given in Appendices
and , while the graphical representation of
this vine is shown in Fig. A (Appendix ).
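The structure of the D-vine above follows mechanically from the ordering of the variables (4, 5, 3, 1, 2). The following minimal sketch, with a hypothetical helper name, lists the pair-copulas of a D-vine for a given ordering: in tree k, variables that are k positions apart are paired, conditional on the variables lying between them.

```python
def dvine_pairs(order):
    """List the pair-copulas of a D-vine as (a, b, conditioning_set) tuples.

    In tree k (k = 1, ..., n-1) the variables order[i] and order[i + k] are
    paired, conditional on the k - 1 variables between them in the ordering.
    """
    n = len(order)
    pairs = []
    for k in range(1, n):
        for i in range(n - k):
            pairs.append((order[i], order[i + k], tuple(order[i + 1:i + k])))
    return pairs


def label(pair):
    """Format a pair-copula as e.g. 'c43|5' (no bar when unconditional)."""
    a, b, cond = pair
    suffix = "|" + "".join(str(v) for v in cond) if cond else ""
    return f"c{a}{b}{suffix}"


order = [4, 5, 3, 1, 2]  # ordering read off the first tree of the D-vine above
pairs = dvine_pairs(order)
labels = [label(p) for p in pairs]
print(labels)
```

For five variables this yields 4 + 3 + 2 + 1 = 10 pair-copulas, i.e. n(n-1)/2, matching the number of copula terms in the decomposition above (the order of the indices inside a conditioning set is immaterial).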
As described in
Sect. , the
conditional model is based on the conditional joint pdf
fY|X(Y|X), which is decomposed via PCC. Details
regarding conditional joint pdfs decomposed as C- or D-vines (including the
developed algorithms for sampling from such vines) are presented in
Appendix . Moreover, the developed routines for working with
conditional vines are publicly available via the CDVineCopulaConditional R
package . More details about vines and the decompositions
used for the unconditional model are given in Appendix .
Details regarding the statistical inference of the joint pdf can be found in
Appendix .
Model development
The extreme impact of compound events may be driven by the joint occurrence
of non-extreme contributing variables . This is the case
for compound floods in Ravenna, where not all extreme values of the impact
would be considered when selecting only extreme values of the contributing
variables. Therefore we model the contributing variables, without focusing
only on their extreme values. Below we show the steps we follow to study
compound floods in Ravenna, based on the conceptual model described in
Sect. . We will go
through these steps in detail in the next sections.
1. Define the impact function:
$$h = h\left(Y_1^{\mathrm{Sea}}, Y_2^{\mathrm{River}}, Y_3^{\mathrm{River}}\right).$$
The contributing variables $Y$ (sea and river levels) and the impact are shown in the black rectangle of the hydraulic system figure.
2. Find the meteorological predictors of the contributing variables $Y$. For each variable $Y_i$ we found more
than one meteorological predictor, which we aggregated into a single variable $X_i$. We refer to this variable as
the predictor $X_i$ of the variable $Y_i$ from now on.
Moreover, we use the same predictor for the two river levels because they are
driven by a similar meteorological
influence. The predictors are shown graphically in the hydraulic system figure, where we introduce $X_1^{\mathrm{Sea}}$
(the predictor of $Y_1^{\mathrm{Sea}}$) and $X_{23}^{\mathrm{Rivers}}$ (the predictor of $Y_2^{\mathrm{River}}$
and $Y_3^{\mathrm{River}}$).
3. Fit the five-dimensional conditional joint pdf $f_{Y|X}\left(Y_1^{\mathrm{Sea}}, Y_2^{\mathrm{River}}, Y_3^{\mathrm{River}} \mid X_1^{\mathrm{Sea}}, X_{23}^{\mathrm{Rivers}}\right)$ of the conditional model (modelled via PCC). To develop the unconditional model,
we fit the three-dimensional pdf
$f_Y\left(Y_1^{\mathrm{Sea}}, Y_2^{\mathrm{River}}, Y_3^{\mathrm{River}}\right)$,
which includes only
the contributing variables $Y$ inside the black rectangle of the hydraulic system figure. The time series of the contributing
variables have significant serial correlations, and this should be considered in order to avoid underestimating the risk
uncertainties (see the Appendices). Only for the unconditional model
did we explicitly model such serial correlations, by combining
the PCC with autoregressive AR(1) models (see the Appendices).
4. Given the complexity of the problem, an analytical derivation of the statistical properties of the impact is
impracticable. Therefore, we apply a Monte Carlo procedure. Specifically, we simulate the contributing variables
$Y$ from the fitted models, and then we define the simulated values of $h$
via the impact function as
$$h_{\mathrm{sim}} := h\left(Y_{1\,\mathrm{sim}}^{\mathrm{Sea}}, Y_{2\,\mathrm{sim}}^{\mathrm{River}}, Y_{3\,\mathrm{sim}}^{\mathrm{River}}\right),$$
where $Y_{\mathrm{sim}}$ are the simulated values of $Y$.
5. Perform a statistical analysis of the values $h_{\mathrm{sim}}$. To assess the risk associated with the
events, we compute the return levels of $h$ by fitting a generalized extreme
value (GEV) distribution
to annual maximum values (defined over the period November–March). Computing the model uncertainties
is straightforward with such models. These uncertainties propagate through to the risk assessment,
and so they must be considered (details about model-based return level
uncertainty are given in the Appendices).
To make the Monte Carlo uncertainties, i.e. the sampling uncertainties due
to the model simulations, negligible, we produce long simulations. For example, to obtain
the model-based return level curve, we simulate a time series
$h_{\mathrm{sim}}(t)$ of length equal to 200 times the length of the observed
data (6 years). From this we get a time series of 1200 annual maximum
values, to which we fit the GEV distribution to get the return level.
Observation-based return levels are obtained by fitting a GEV to annual
maximum values of $h_{\mathrm{obs}}$. The corresponding uncertainties are computed
by propagating the parameter uncertainties of the fitted GEV distribution
(more details are given in the Appendices).
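The Monte Carlo steps above can be sketched with a deliberately simplified toy model. Everything in it is an assumption made for illustration: two standard normal contributing variables linked by a Gaussian copula with correlation rho, a linear impact with unit coefficients, no serial correlation, and an empirical quantile of the simulated annual maxima used in place of the GEV fit of this study. It only shows the workflow (simulate, apply the impact function, take seasonal maxima, read off a return level) and the qualitative effect of ignoring dependence.

```python
import math
import random

DAYS_PER_SEASON = 151  # November to March in a non-leap year


def simulate_annual_maxima(rho, n_years, rng, a_sea=1.0, a_river=1.0):
    """Seasonal maxima of a toy impact h = a_sea * y_sea + a_river * y_river."""
    maxima = []
    for _ in range(n_years):
        season_max = -math.inf
        for _ in range(DAYS_PER_SEASON):
            y_sea = rng.gauss(0.0, 1.0)
            # Correlated second variable (Gaussian copula with correlation rho)
            y_river = rho * y_sea + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
            season_max = max(season_max, a_sea * y_sea + a_river * y_river)
        maxima.append(season_max)
    return maxima


def empirical_return_level(annual_maxima, return_period):
    """Empirical (1 - 1/T) quantile of the annual maxima (stand-in for a GEV fit)."""
    ordered = sorted(annual_maxima)
    p = 1.0 - 1.0 / return_period
    index = min(len(ordered) - 1, int(math.floor(p * len(ordered))))
    return ordered[index]


rng = random.Random(1)
n_years = 1200  # 200 times a 6-year record, as in the text
maxima_dep = simulate_annual_maxima(rho=0.6, n_years=n_years, rng=rng)
maxima_ind = simulate_annual_maxima(rho=0.0, n_years=n_years, rng=rng)
level_dep = empirical_return_level(maxima_dep, return_period=20)
level_ind = empirical_return_level(maxima_ind, return_period=20)
print(f"20-year level with dependence: {level_dep:.2f}, without: {level_ind:.2f}")
```

In this toy setting the return level for a given return period is lower when the dependence is ignored, which corresponds to an underestimation of the risk.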
Impact function
The water level $h$ is influenced by river ($Y_2^{\mathrm{River}}$ and
$Y_3^{\mathrm{River}}$) and sea ($Y_1^{\mathrm{Sea}}$) levels
(see the hydraulic system figure). We describe this influence through the
following multiple regression model:
$$h = a_1 Y_1^{\mathrm{Sea}} + a_{21} Y_2^{\mathrm{River}} + a_{22} \left(Y_2^{\mathrm{River}}\right)^2 + a_{31} Y_3^{\mathrm{River}} + a_{32} \left(Y_3^{\mathrm{River}}\right)^2 + c + \eta_h(0, \sigma_h),$$
where $\eta_h(0, \sigma_h)$ is Gaussian distributed noise with a
standard deviation equal to $\sigma_h$. The contribution of the rivers to
the impact $h$ is expressed via quadratic polynomials, which gives a
better fit of the model according to the Akaike information criterion (AIC).
In particular, we defined the regression model as the best output of both a
forward and backward selection procedure, considering linear and quadratic
terms for all of the $Y$ as candidate variables. The Q–Q plot of the
model, i.e. the plot of the quantiles of observed values against those of the
mean predicted values from the model, shows
points located along the line $y = x$, which indicates that the model is
satisfactory. Omitting one of the variables as a predictor reduces the model
performance, underlining the compound nature of the impact $h$. The sum of
the relative contributions of the rivers is very similar to that of the sea.
The parameters of this model (and of those in
the section on meteorological predictor selection) were estimated according to
the maximum likelihood approach, which for Gaussian noise coincides with
least squares, solved by QR decomposition (via the lm
function of the stats R package).
Q–Q plot between the observed impact (x-axis) and the
modelled impact (y-axis) from the regression model
(Eq. ).
Meteorological predictor selection
Figure
shows the resulting scatter plots of observed predictands (Yobs)
and selected observed predictors (Xobs). To fit the joint pdf of
the conditional model, we use all time steps where data for all of the X
and Y variables have been recorded. However, we calibrate the predictors of
rivers and sea separately, so we use all available data for each Y variable
(during the period November–March). The procedure we use to identify the
meteorological predictors is shown below.
Scatter plots of predictands Yobs and predictors
Xobs. The numbers are Spearman correlation coefficients. The red
lines (computed via LOWESS, i.e. locally weighted scatter-plot smoothing)
are shown to better visualize the relationship between pairs .
River levels
The meteorological influence on the two rivers Y2River
and Y3River is very similar because their catchments are
small and close by (as a consequence the Spearman correlation between the
rivers is high, i.e. 0.79). Therefore we use the same predictor for the two
river levels.
The river levels are influenced by the total input of water over the
catchments, which is given by the positive contribution of precipitation and
snowmelt, and by evaporation which results in a reduction of the river
runoff. Specifically, we compute the input of water w on the day t*
over the river catchments (one grid point) as
$$w(t^*) = P_{\mathrm{total}}(t^*) - E(t^*) + S_{\mathrm{melt}}(t^*) - S_{\mathrm{fall}}(t^*),$$
where Ptotal is the total precipitation, E is the
evaporation, Smelt is the snowmelt and Sfall is
the snowfall. The snowfall accounts for the fraction of precipitation which
does not immediately contribute to the input of water over the catchments
because of its solid state. While a fraction of the water input over the
catchment rapidly reaches the rivers as surface runoff, another fraction
infiltrates the ground and contributes only later to the river discharge.
Compared with the first fraction, the second has a slower response to
precipitation and changes more gradually over time. This double effect
underlines the compound nature of river runoff whose response to
precipitation falling at a given time is higher if in the previous period
additional precipitation fell in the river catchment. To consider both of
these effects, we define the river predictor as
$$X_{23\mathrm{Rivers}}(t) = a_{\mathrm{R}} \sum_{t^*=t-1}^{t} w(t^*) + b_{\mathrm{R}} \sum_{t^*=t-10}^{t} w(t^*) + c_{\mathrm{R}},$$
where cR is a constant. We choose the parameters of
Eq. () by fitting the right-hand side of this equation
to the river contributions to the impact, i.e.
$Y_{23\mathrm{Rivers}} := a_{21} Y_{2\mathrm{River}} + a_{22} Y_{2\mathrm{River}}^{2} + a_{31} Y_{3\mathrm{River}} + a_{32} Y_{3\mathrm{River}}^{2}$ (see Eq. ). The lags n=1 and
n=10 days are those which maximize respectively the upper tail dependence
and the Spearman correlation between Y23Rivers(t) and the
cumulated w over the previous n days, i.e. $\sum_{t^*=t-n}^{t} w(t^*)$. Here, we use the upper tail dependence to get the typical river
response time to the fraction of water which directly flows into the rivers
as surface runoff. Similarly, the Spearman correlation is used to get the
typical time required for the infiltrated water in the ground to flow into
the rivers.
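The construction of the river predictor from the daily water input can be sketched as below. The function names are hypothetical and the coefficient values in the usage line are placeholders, not the fitted values of the study. Note that the sum from t-1 to t covers 2 days and the sum from t-10 to t covers 11 days, as written in the predictor equation.

```python
def water_input(p_total, evaporation, snowmelt, snowfall):
    """Daily water input w = P_total - E + S_melt - S_fall."""
    return p_total - evaporation + snowmelt - snowfall

def cumulated(w, t, n):
    """Sum of w over days t-n, ..., t (inclusive on both ends)."""
    if t - n < 0:
        raise ValueError("not enough history before day t")
    return sum(w[t - n:t + 1])

def river_predictor(w, t, a_r, b_r, c_r):
    """X_23Rivers(t): fast (lag 1) plus slow (lag 10) response.

    ASSUMPTION: a_r, b_r, c_r are regression coefficients fitted
    elsewhere against the river contribution to the impact.
    """
    return a_r * cumulated(w, t, 1) + b_r * cumulated(w, t, 10) + c_r

# Placeholder coefficients, for illustration only.
example = river_predictor([1.0] * 20, t=10, a_r=1.0, b_r=0.5, c_r=0.1)
```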
By defining the river predictor as in Eq. (), we
aggregate the different meteorological drivers of the rivers in the single
predictor X23Rivers(t). Such aggregation allows for a
simplification of the system describing the compound floods, due to a
reduction of the involved variables. Furthermore this reduces the variables
described by the joint pdf fY,X(Y,X), whose
numerical implementation errors can potentially increase with higher
dimensionality .
All of the terms involved in the multiple regression model
(Eq. ) are statistically significant at level $\alpha = 2\times10^{-16}$. Moreover, the quality of the river predictor
X23Rivers improves (according to the likelihood and
to the Spearman correlation between X23Rivers and
Y23Rivers) when we use all of the terms in
Eq. (), instead of only Ptotal(t*). The
presence of more terms in Eq. () does not increase the number
of model parameters.
Sea level
Sea level can be modelled as the
superposition of the barometric pressure effect, i.e. the pressure exerted by
the atmospheric weight on the water, the wind-induced surge, and an overall
annual cycle. As for the river predictor, we aggregate the different physical
contributions in a single predictor. We define the sea level predictor on day
t as
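$$X_{1\mathrm{Sea}}(t) = a_{\mathrm{S}}\,\mathrm{SLP}_{\mathrm{Ravenna}}(t) + b_{\mathrm{S}}\,\mathbf{SLP}(t)\cdot\mathbf{RMAP} + c_{\mathrm{S}}\sin(\omega_{\mathrm{Year}}\,t + \phi) + d_{\mathrm{S}},$$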
where SLPRavenna is the sea level pressure in Ravenna,
SLP⋅RMAP is the wind contribution due to the
sea level pressure field SLP, the harmonic term is the annual cycle and
dS is a constant term. In Eq. (12) the SLP field and the
regression map are represented as column vectors. We choose the parameters of
Eq. () by regressing the sea level
Y1Sea(t) on the right-hand side of this equation. A more
detailed physical interpretation of the terms is given in the following.
aSSLPRavenna accounts for the barometric pressure effect
. The regression map RMAP indicates
which anomalies of the SLP field are associated with high values of the
residual of the barometric pressure effect (see Fig. ,
where more details are also given). Particularly, according to the
geostrophic equation for wind, these pressure anomalies induce wind in the
Adriatic Sea towards Ravenna's coast. Therefore, the projection of the SLP
field onto this regression map, i.e. the term SLP(t)⋅RMAP, describes the wind-induced change in sea level at time
t.
cSsin(ωYeart+ϕ) describes the remaining annual
cycle of the sea level which is not described by barometric pressure effect
and wind contribution. This harmonic term could be driven by the annual
hydrological cycle , i.e.
due to cyclic runoff of rivers which flow into the Adriatic Sea, or due to density variations
of the seawater (caused by the annual cycle of water temperatures).
Astronomical tide may
explain a minor fraction of this term. The range of variation of cSsin(ωYeart+ϕ)
is about 10 % of that of the sea level. When we use the predictor to
extend the analysis to the period 1979–2015, this
term will be kept constant assuming that the annual cycle has not drastically changed in past years. Moreover, we will not
consider long-term sea level rise because its influence on both sea and impact h level variations is negligible over the
considered period (the observed rate of sea level rise in the northern
Adriatic Sea has been ∼0.8 mm yr$^{-1}$). Also, the
relative sea level rise has been negligible over the considered period
.
Regression map $\widehat{\mathrm{RMAP}}(i,j)$ in matrix
notation. The value of the
regression map in the location (i,j) is given by
$\widehat{\mathrm{RMAP}}(i,j) = \mathrm{var}(R_0)^{-1} \times \mathrm{cov}(R_0, \mathrm{SLP}_{i,j})$, where $R_0(t)$ is the residual of the
barometric pressure effect obtained from the fit of the linear model
$a_0\,\mathrm{SLP}_{\mathrm{Ravenna}}(t) + d_0$ to
Y1Sea(t). The regression map is equivalent to a
one-dimensional maximum covariance analysis .
The red dot indicates Ravenna.
All the terms involved in the multiple regression model are statistically
significant at level $\alpha = 2\times10^{-16}$.
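A minimal sketch of how the regression map and the sea level predictor fit together is given below. It assumes that the SLP field at each time step is stored as a flat list of grid-point values and that the residual R0 of the barometric pressure fit is already available; the function names, the default year length and all coefficient values are illustrative assumptions.

```python
import math

def regression_map(r0, slp_fields):
    """Map value per grid point: cov(R0, SLP_g) / var(R0).

    r0[t] is the residual at time t; slp_fields[t][g] is the SLP at
    time t and (flattened) grid point g.
    """
    n = len(r0)
    r_mean = sum(r0) / n
    var_r = sum((r - r_mean) ** 2 for r in r0) / n
    rmap = []
    for g in range(len(slp_fields[0])):
        col = [slp_fields[t][g] for t in range(n)]
        c_mean = sum(col) / n
        cov = sum((r0[t] - r_mean) * (col[t] - c_mean) for t in range(n)) / n
        rmap.append(cov / var_r)
    return rmap

def sea_predictor(slp_ravenna, slp_field, rmap, day,
                  a_s, b_s, c_s, phi, d_s, days_per_year=365.25):
    """Barometric term + projection on the map + annual cycle + constant.

    ASSUMPTION: coefficients come from a regression done elsewhere.
    """
    wind = sum(s * m for s, m in zip(slp_field, rmap))  # SLP(t) . RMAP
    omega = 2.0 * math.pi / days_per_year
    return a_s * slp_ravenna + b_s * wind + c_s * math.sin(omega * day + phi) + d_s
```

Treating the field and the map as vectors makes the wind term a plain dot product, which is what allows several pressure anomalies to be collapsed into a single predictor.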
Results
The results of the unconditional and conditional models are
presented in the following sections.
Unconditional (three-dimensional) model
The unconditional model reproduces the joint pdf of the contributing
variables (Y1Sea,Y2River,Y3River),
and, in conjunction with the autoregressive models, also the serial
correlations. The model is used to simulate values of the impact h and
assess the risk of compound floods, with related uncertainties. The selected
pair-copula constructions and fitted pair-copula families are shown in
Appendices and .
Scatter plots of observed (grey) against simulated (black)
contributing variables Y. The simulated series are obtained via the
three-dimensional model (including the serial correlation), and have the same
length as the observed series.
Figure shows, qualitatively, a good agreement between
simulated and observed contributing variables Y. In
Fig. we show the return levels of the impact h.
There is good agreement between the model- and observation-based expected
return levels, even for return periods larger than 6 years (the length of the
observed data). For return periods larger than shown in
Fig. , the agreement slowly decreases. The
model-based expected return period of the highest compound flood observed
(3.19 m) is 18 years (the 95 % confidence interval is [2.5,∞]
years, where ∞ indicates a value larger than $10^{50}$ in this context
from now on). The reason for such large uncertainty in the return period is
the shortness of available data. However, the model-based uncertainties are
large but still smaller, up to return periods of about 60 years, than those
obtained when computing the return level directly (based on the GEV) on the
observed data of the impact (Fig. ). Moreover, when
considering a model which does not take the serial correlation of the
contributing variables Y into account, we get an underestimation of the
risk uncertainties. For example, the amplitude of the 95 % confidence
interval of the 20-year return level is underestimated by about 50 %
(not shown).
Unconditional model. Return levels of the impact h with associated
95 % uncertainty intervals. The return level computed on
hobs is shown in red (uncertainty shown in light red). The
model-based return level is shown in black (uncertainty is in grey).
Conditional (five-dimensional) model
This model allows for assessment of the change in the risk of compound floods
due to temporal variations of the meteorological predictors of the
contributing variables Y. We calibrate the model to the period 2009–2015.
After validating the model for the period 2009–2015, we use predictors of
the period 1979–2015 to extend the analysis of compound flood risk to the
past. The selected pair-copula construction and fitted pair-copula families
are shown in Appendices and . We assess the
quality of the model by comparing predictions with observations. Specifically
we look at its overall accuracy by considering the root-mean-square error
between model predictions and observed data. Moreover, we look at the
accuracy of the model when predicting extreme values of the impact h
(defined as values of h larger than the 95th percentile of
hobs), using the Brier score (see Appendix ). To
assess the quality of the model, avoiding overfitting, we perform a 6-fold
cross-validation (see Appendix ).
Validation time series of the conditional model obtained by 6-fold
cross-validation. hobs is shown in red. The average and
95 % prediction intervals of $10^{4}$ simulated time series are
respectively shown in black and grey.
The cross-validation time series of the impact h is visually compared with
hobs in Fig. . The average of the
simulated cross-validation time series in general follows the temporal
progression of hobs (Fig. ), and
about 94 % of the observed impact values lie within the 95 %
prediction interval. In particular, the highest flood observed is well
predicted and lies inside the prediction interval. The Brier score based on
the cross-validation time series is $\mathrm{BS}_{\mathrm{CV}} = 0.029$, while that
relative to the reference model, i.e. the climatology (see
Appendix ), is $\mathrm{BS}_{\mathrm{CL}} = 0.046$. The resulting Brier
skill score is $\mathrm{BSS} = 1 - \mathrm{BS}_{\mathrm{CV}}/\mathrm{BS}_{\mathrm{CL}} = 0.38$, which indicates that the model is more accurate than the reference
model in predicting extreme values of the impact h. In general, the skill
of the model, both in terms of root-mean-square error and Brier score, does
not change much when the cross-validation is not performed. This underlines
that no artificial skill is present in the model. These positive results
provide good confidence for extending the impact time series to the period
1979–2015. It also makes the model potentially interesting for flood
forecasting and warning.
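The skill measures used above can be sketched as follows, assuming the standard definition of the Brier score (mean squared difference between the forecast probability of an event and its 0/1 outcome) and a climatological reference that always forecasts the observed base rate. The function names are hypothetical, and the definition in the appendix of the study may differ in detail.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error of event probabilities against 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)

def climatology_forecast(outcomes):
    """Reference forecast: the observed event frequency at every step."""
    base_rate = sum(outcomes) / len(outcomes)
    return [base_rate] * len(outcomes)

def brier_skill_score(bs_model, bs_reference):
    """BSS = 1 - BS_model / BS_reference; positive means added skill."""
    return 1.0 - bs_model / bs_reference
```

Plugging in the rounded scores quoted in the text, `brier_skill_score(0.029, 0.046)` gives about 0.37, close to the reported 0.38; the small gap is presumably due to rounding of the inputs.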
In Fig. a we show the return levels of the impact
h. As in the unconditional model, return levels are stationary, i.e.
estimated by fitting a stationary GEV distribution to annual maximum values.
The discrepancy between model- and observation-based return levels for the
conditional model is smaller than for the unconditional one, in particular
for high return periods. It may happen that the dependencies between river
and sea levels are not considered in some analyses when assessing the risk of
flooding. show in Rotterdam, which is affected by floods
driven both from surge and river discharges, that the boundary conditions
used to build the protection barrier were determined assuming independence
between sea level and river discharge. Here we observe that ignoring such a
dependence may result in an underestimation of the estimated risk. The
expected return period of the highest compound flood observed (3.19 m),
computed over the period 2009–2015, is 20 years (the 95 % confidence
interval is [4.9,∞] years). When not considering the dependencies
between river and sea levels, the expected return period of the highest
compound flood observed increases to 32 years (the 95 % confidence
interval is [6.7,∞] years). Figure b shows
that the return level estimates are reduced by about 0.2 m when not
considering such dependencies between sea and river levels. In particular, at
the 95 % confidence level, the return levels are underestimated when
not considering these dependencies for return periods smaller than about
40 years. The same, however, cannot be clearly concluded for return periods
larger than 40 years because of the large uncertainties
(Fig. b). A similar result is obtained from the
unconditional model (not shown). Therefore, although there is not a large
difference in the return levels when treating sea and rivers independently or
not, in Ravenna it may be relevant to incorporate their dependencies into the
flood risk estimation. An imprecise risk assessment may bring negative
societal consequences due to inadequate information provided for
infrastructural adaptation.
To estimate the risk based on predicted values of the impact during the past,
we run the simulations by conditioning on predictors of the period
1979–2015. This allows us to get a more robust estimation of the risk
compared to that obtained considering only the period 2009–2015. The return
levels in Fig. a (dashed line) are similar to those
estimated when analysing the period 2009–2015. Although this result suggests
a stationarity of the risk during the period 1979–2015, we investigate
whether there has been any trend in the risk during the recent past. To do
this, we computed time-dependent return levels. Specifically, we computed
stationary return levels on moving temporal windows of 6 years during the
period 1979–2015, based on hsim values obtained by conditioning
on predictors belonging to these windows. However, we did not observe any
long-term trend in the risk. Moreover, analysing the return levels computed
on moving temporal windows during the period 1979–2015, we did not observe
any long-term trend, either in the risk of storm surge or in that of river
floods (not shown).
During the period 1979–2015, there has not been any long-term trend in the
risk due to a variation of the marginal distributions of the predictors or in
their dependence. To study this, we computed the return levels on moving
temporal windows in the cases described below. First, we simulated the impact
by conditioning the Ysim variables on predictors having the
observed marginal distributions of the period 1979–2015, but fixing the
dependence to that observed during 2009–2015. Secondly, we simulated the
impact by conditioning on predictors having the observed dependence of the
period 1979–2015, and fixed marginal distributions to the ones observed
during 2009–2015. In both cases we did not find any long-term trend in the
return levels (not shown).
Conditional model. (a) Return levels of the impact h with
associated 95 % uncertainty intervals. The return level computed on
hobs is shown in red (uncertainty shown in light red). The
model-based return level computed for the period 2009–2015 (black) is based
on hsim values simulated for days where the observed data were
available (uncertainty is shown in grey). The model-based return level
computed for the period 1979–2015 (black dashed) has an uncertainty of
similar amplitude to that of period 2009–2015 (not shown).
(b) Difference between the model-based return level obtained when
considering the realistic dependence between sea and river levels, and when
assuming that they are independent. To make the sea and river levels
independent while keeping the dependence between the two
rivers, we shuffled the sea level data after each simulation, which
guarantees random association between sea data and each of the rivers
e.g.. The black line represents the median of
the bootstrap samples.
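The shuffling used to remove the dependence between sea and rivers can be sketched as below. The function name and the toy data are hypothetical; only the sea level column is permuted, so the pairing between the two rivers is left untouched.

```python
import random

def shuffle_sea(sea, river2, river3, rng):
    """Return (sea, river2, river3) triples with the sea column permuted.

    The marginal distribution of the sea level is unchanged, its
    association with both rivers becomes random, and each
    (river2, river3) pair is preserved.
    """
    sea_shuffled = list(sea)
    rng.shuffle(sea_shuffled)
    return list(zip(sea_shuffled, river2, river3))

# Toy simulated values, for illustration only.
sea = [0.1, 0.4, 0.2, 0.9, 0.5]
river2 = [1.0, 1.2, 0.8, 2.0, 1.1]
river3 = [0.9, 1.3, 0.7, 2.1, 1.0]
independent = shuffle_sea(sea, river2, river3, random.Random(1))
```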
Discussion and conclusions
Compound events (CEs) are multivariate extreme events in which the
contributing variables may not be extreme themselves, but their joint –
dependent – occurrence causes an extreme impact. Conventional univariate
statistical analysis cannot give accurate information regarding the
multivariate nature of CEs and therefore the risk associated with these
events.
We develop a conceptual model, implemented via pair-copula constructions
(PCCs), to quantify the risk of CEs as well as the associated sampling
uncertainty. This model includes predictors, which could represent for
instance meteorological processes. The inclusion of predictors in the model
(1) provides insight into the physical processes underlying CEs, as well as
into the temporal variability of CEs, and (2) allows for statistical
downscaling of CEs and their impacts. The model is in principle extendable to any
number of contributing variables and predictors, given a large enough sample
of data for calibration.
Downscaling may be used to statistically extend the risk assessment back in
time to periods where observations of the predictors are available but not of
the contributing variables and impacts, or to assess potential future changes
in CEs based on climate models. The conceptual model is particularly useful
for downscaling large-scale predictors from climate models in cases where the
local contributing variables driving the impacts of CEs are either not
realistically simulated or not simulated at all by the available climate
models. As such, the model can straightforwardly be used to assess future
risk of CEs based on multi-model ensembles as available from the CMIP
and CORDEX archives.
The model makes use of PCCs, a very powerful statistical method to model
multivariate dependencies. PCCs are particularly useful for modelling CEs,
when the contributing variable pairs have different dependence structures,
e.g. when only some of them are characterized by tail dependence. To model
such types of structures, even multivariate parametric copulas, which have
been introduced in climate science to overcome some difficulties in modelling
multivariate density distributions e.g., lack
flexibility. PCCs are more convenient: by decomposing the dependence
structure into bivariate copulas, they give high flexibility in modelling
generic high-dimensional systems. We suggest considering the use of PCCs for
modelling compound events which involve more than two contributing variables,
or when predictors are included in the system as additional variables.
The model allows for a straightforward quantification of sampling
uncertainties. In many cases, such risk uncertainties might be substantial as
observed data are often limited, and should thus be quantified. In fact,
uncertainty estimates are essential to avoid drawing conclusions that may be
misleading when uncertainties are large (as also recently discussed by
).
We adapt the developed conceptual model to study compound floods in Ravenna,
which are floods driven by the joint occurrence of storm surge and high river
level. In other words, the contributing variables of the compound floods are
the river and sea levels, whose combination drives the impact,
i.e. the water level in
between the river and the sea.
We used the specific adaptation of the model to statistically downscale the
river and sea level from meteorological predictors, and therefore estimate
the impact of the compound floods as a function of the downscaled sea and
river levels. The accuracy of the estimated impact appears satisfactory, such
that the model is potentially interesting for use in both flood forecasting
and warning. Also, the model-based expected return levels of the impact are
about the same as those directly computed on observed data of the impact.
Although the model-based uncertainty in these return levels is very large
(due to the shortness of the available data), for return periods smaller than
about 60 years it is smaller than that obtained by computing the risk
directly on the observed data of the impact.
We calibrate the model over the period 2009–2015, and by including
meteorological predictors obtained from the ECMWF ERA-Interim reanalysis
dataset, we extend the analysis of compound flooding to the full period of
1979–2015, to obtain a more robust estimation of the risk. The expected
return period of the highest compound flood observed, computed over the
period 1979–2015, is 19 years (the 95 % confidence interval is
[3.7,∞] years). Moreover, we did not observe any long-term trend in
risk during the period 1979–2015.
Ignoring the estimated dependence between sea and river levels may lead to an
underestimation of risk. Specifically, assuming independence between sea and
river levels, the expected return period of the highest compound flood
observed – computed over the period 2009–2015 – is 32 years (the
95 % confidence interval is [6.7,∞] years). When assuming the
estimated dependence between sea and river levels, it decreases to 20 years
(the 95 % confidence interval is [4.9,∞] years). In other
cities affected by sea surges and river flooding, e.g. in Rotterdam,
protection barriers were designed assuming independence between sea level and
river discharge , a decision which is still debated
. In Ravenna, it may be relevant to
incorporate these dependencies into the flood risk estimation. An imprecise
risk assessment may harm the population at risk due to inadequate information
provided for infrastructural adaptation. In general, when considering generic
CEs, their associated risk may be substantially influenced by the dependence
between the contributing variables, and so this dependence should be
considered.
In the context of compound floods, only a few studies have explicitly
quantified the impact and the associated risks
. This might be due to the
practical difficulties in quantifying the impact. For example, to quantify
the impact of compound floods in the river mouth, it is necessary to have
water level data at a station where both the influence of sea and river are
seen. However, we have found few locations where such stations exist,
perhaps in part because stakeholders are usually interested in data where only the
influence of the river or the sea is seen. Also, for places where data show
both the influence of sea and river, the measurements can be affected by
human influences such as pumping stations between river and sea stations.
Moreover, while compound floods involve a dependence between sea and river
levels , places where there are stations detecting both the
influence of sea and river may not present such dependence. Therefore, we
argue that to obtain more in-depth knowledge of these events, it may be very
useful to create an archive containing data for locations where compound
floods have been recorded and eventually increase the effective number of
measurements in places which are supposed to be at risk of compound floods.
The developed routines for working with conditional joint
probability density functions decomposed as D- or C-vines are publicly
available via the CDVineCopulaConditional R package
(more details are given in Appendix ). Other routines from
this study are available from the authors upon
request.
Sea level data of the Ravenna-Porto Corsini station were
downloaded from the Italian National Institute for Environmental Protection
and Research (ISPRA), and are available under the link
www.mareografico.it. River data can be downloaded from Arpae
Emilia-Romagna, via the link
www.arpae.it/dettaglio_generale.asp?id=3284&idlivello=1625 (the names
of the used stations are S. Marco, S. Bartolo and Rasponi, where the latter
is that used for the impact). Meteorological predictors were obtained from
the ECMWF ERA-Interim reanalysis dataset, which is available via the link
http://apps.ecmwf.int/datasets/data/interim-full-daily/levtype=sfc/.
Homogenization of river level data
The zero reference level of river measurements is the water level in the
river defined as zero in the measurements. In general, such a zero reference
level may change during different periods of observation, for technical
reasons. As the zero reference level of rivers Y2River and
Y3River varied in the first 3 years but remained constant in
the second 3, we homogenized the former with respect to the latter at both
rivers. We performed such homogenization assuming that the precipitation
falling into the catchment during 1 year is responsible for the average river
level in the same year. For each river YiRiver, we fitted
the linear model $Y_{i\mathrm{River}}^{\mathrm{annual}} = a_i P_i^{\mathrm{annual}} + b_i$ in the last 3 years (those having a constant
zero reference level), where $Y_{i\mathrm{River}}^{\mathrm{annual}}$ is the
annual average of YiRiver and $P_i^{\mathrm{annual}}$ is the
annual cumulated precipitation over the river basin (data from the ECMWF
ERA-Interim reanalysis dataset). Finally, for each river, we translated the
zero reference level of the first 3 years, such that the linear model was
valid in these years as well.
Vines and sampling procedure
In this appendix we show more details about vines, focusing on C- and
D-vines. Moreover, we discuss the sampling procedure, showing the algorithms
to perform the conditional sampling from C- and D-vines.
Vines
Shown below are the general expressions to decompose an n-dimensional pdf
via a PCC as a C-vine (Eq. ) or D-vine
(Eq. ) :
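$$f_{Y_1,\ldots,Y_n}(y_1,\ldots,y_n) = \prod_{k=1}^{n} f(y_k) \prod_{j=1}^{n-1}\prod_{i=1}^{n-j} c_{i,i+j|i+1,\ldots,i+j-1}\{F(y_i|y_{i+1},\ldots,y_{i+j-1}),\, F(y_{i+j}|y_{i+1},\ldots,y_{i+j-1})\} \quad \text{(D-vine)},$$
$$f_{Y_1,\ldots,Y_n}(y_1,\ldots,y_n) = \prod_{k=1}^{n} f(y_k) \prod_{j=1}^{n-1}\prod_{i=1}^{n-j} c_{j,j+i|1,\ldots,j-1}\{F(y_j|y_1,\ldots,y_{j-1}),\, F(y_{j+i}|y_1,\ldots,y_{j-1})\} \quad \text{(C-vine)}.$$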
The five-dimensional vine that we use for the conditional model is shown in
Eq. (). The graphical representation of that decomposition is
shown in Fig. A, where the concept of a tree is
introduced. We show below the vines that we use for the unconditional model.
Three-dimensional vine
In total, a three-dimensional copula density can be decomposed in three
different ways, and each of these vines is both a D-vine and a C-vine. For
this application we use the following vine.
$$f_{123}(y_1,y_2,y_3) = f_1(y_1)\cdot f_2(y_2)\cdot f_3(y_3)\cdot c_{12}(u_1,u_2)\cdot c_{23}(u_2,u_3)\cdot c_{13|2}(u_{1|2},u_{3|2}).$$
This decomposition is represented graphically in
Fig. b. We underline that, in Eq. (),
the rigorous expression of the conditional copula density c13|2, of the
pair (U1,U3), given U2=u2, would be
c13|2(u1|2,u3|2;u2). In Eq. (), c13|2 is
written under the assumption of a simplified PCC; i.e. the parameters of
c13|2 are the same for all values of u2∈(0,1). The simplified PCC
may be a rather good approximation, even when the simplifying assumption is
far from being fulfilled by the actual model .
Copula parameters that are functions of the conditioning variables, and thus
violate the simplifying assumption, are approximated by the average over all
values of the conditioning variables. The effect of this approximation on the
estimated impact is likely to be small .
In this study of compound floods, the variables (Y1,Y2,Y3) of
Eq. () are the
(ε1Sea,ε2River,ε3River)
introduced in Appendix . Specifically, the vine of
Eq. () represents that used at the first step of the procedure
in Appendix . The vine that we use at the third step of the
procedure in Appendix is
$$f_{123}(y_1,y_2,y_3) = f_3(y_3)\cdot f_1(y_1)\cdot f_2(y_2)\cdot c_{31}(u_3,u_1)\cdot c_{12}(u_1,u_2)\cdot c_{32|1}(u_{3|1},u_{2|1}),$$
where
(Y1,Y2,Y3)=(Y1Sea,Y2River,Y3River).
(a) Representation of the five-dimensional D-vine in
Eq. (). There are 4 trees (T1,T2,T3,T4) and 10 edges.
Each edge represents a pair-copula density, and the label indicates the
subscript of the corresponding copula. For example, the edge 43|5
represents the copula density c43|5. The decomposition of the joint pdf
related to the represented vine is obtained by multiplying all the
represented pair-copula densities (10 in this case) and the marginal pdfs of
each variable. For more details, see .
(b) Representation of the three-dimensional vine in
Eq. (). There are two trees (T1 and T2) and three
edges.
Sampling procedure
To simulate a vector Y=(Y1,…,Yn) of random variables, with
marginal CDFs F1(y1),…,Fn(yn), whose joint pdf is modelled via a
copula, we first simulate from the copula the uniform variables Ui for
i=1,…,n (ui:=Fi(yi)), and then transform them into Yi for
i=1,…,n (yi:=Fi-1(ui)).
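The two-step procedure (simulate uniforms from the copula, then apply the inverse marginal CDFs) can be sketched for the simplest case of a bivariate Gaussian copula. The copula family, the correlation value and the two marginals in the usage lines are illustrative assumptions and are not those selected in the study.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()
_EPS = 1e-12

def sample_gaussian_copula_pair(rho, rng):
    """Draw (u1, u2) from a bivariate Gaussian copula with parameter rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    # Clip so that the inverse marginal CDFs never see exactly 0 or 1.
    u1 = min(max(_STD_NORMAL.cdf(z1), _EPS), 1.0 - _EPS)
    u2 = min(max(_STD_NORMAL.cdf(z2), _EPS), 1.0 - _EPS)
    return u1, u2

def to_marginals(u, inverse_cdfs):
    """y_i = F_i^{-1}(u_i) for each component."""
    return [f_inv(u_i) for u_i, f_inv in zip(u, inverse_cdfs)]

# Hypothetical marginals: exponential (rate 2) and normal (mean 1, sd 2).
inverse_cdfs = [lambda u: -math.log(1.0 - u) / 2.0,
                NormalDist(1.0, 2.0).inv_cdf]
y = to_marginals(sample_gaussian_copula_pair(0.8, random.Random(0)), inverse_cdfs)
```

Because the dependence lives entirely in the uniforms, the marginals can be swapped without touching the copula step.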
Sampling and conditional sampling from vines
The simulation of the uniform variables from vines is discussed in
and .
show the algorithms to sample uniform variables from C-
and D-vines. Due to the nature of PCCs, the sampling procedure works as a
cascade. Once the first variable is simulated from a uniform distribution,
each following variable is simulated as conditioned on the previous group of
simulated variables.
It is clear then that to sample from the conditional distribution of
$U_{N_{\mathrm{cond}}+1},\ldots,U_n$ given values for $U_1,\ldots,U_{N_{\mathrm{cond}}}$ (i.e. $f_{U_{N_{\mathrm{cond}}+1},\ldots,U_n|U_1,\ldots,U_{N_{\mathrm{cond}}}}$), it is possible to follow this procedure by simply
fixing the first Ncond variables at the conditioning values. The
approach used here to execute such a procedure is to select vines from which
the conditioning variables would be sampled first when following the
sampling algorithms from . For example, using the D-vine
represented in Fig. a (or in Eq. ), we
could simulate by fixing the pairs (U4,U5) or (U2,U1) in case we are
interested in conditioning the simulation on two variables.
Following this approach, for D-vines the number of n-dimensional
decompositions which allow for conditioning on Ncond variables is
Ncond!×(n-Ncond)!. For C-vines the number of
decompositions which allow for such a conditioning is Ncond!×(n-Ncond)!/2 for n-Ncond>1, and
Ncond! for n-Ncond=1. For example, in this study we
model a five-dimensional system with two conditioning variables (the
meteorological predictors), that is, n=5 and Ncond=2.
Considering that there are no five-dimensional vines which belong to both the
C-vine and D-vine categories , the choice of the vine used
for the model is done among (2!/2×(5-2)!)+(2!×(5-2)!)=18
vines. Furthermore, we need to condition on values (y4,y5); therefore, we
simulate from the copula by conditioning on (u4=F4(y4),u5=F5(y5)),
where F4 and F5 are the fitted marginals in the calibration period,
while (y4,y5) could theoretically be any value.
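The counting rule above can be checked with a few lines of code. The function names are hypothetical, and the formulas are simply those stated in the text, assumed valid for n - Ncond >= 1.

```python
from math import factorial

def n_dvines(n, n_cond):
    """D-vine decompositions allowing conditioning on n_cond variables."""
    return factorial(n_cond) * factorial(n - n_cond)

def n_cvines(n, n_cond):
    """C-vine decompositions allowing conditioning on n_cond variables."""
    if n - n_cond == 1:
        return factorial(n_cond)
    # For n - n_cond > 1 the product is even, so the division is exact.
    return factorial(n_cond) * factorial(n - n_cond) // 2

# Five variables, two conditioning predictors (the case of this study).
total = n_cvines(5, 2) + n_dvines(5, 2)
```

For n=5 and Ncond=2 this gives 6 C-vines and 12 D-vines, i.e. the 18 candidate vines mentioned in the text.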
To apply such a sampling procedure, we developed
Algorithms and ,
which are modified versions of Algorithms 1 and 2 shown in
. The developed algorithms allow for conditional sampling
from a C- or D-vine from which the conditioning variables would be sampled
first when following the sampling algorithms from .
Specifically, given a C- or D-vine of the variables $(X_1,\ldots,X_{N_{\mathrm{cond}}},X_{N_{\mathrm{cond}}+1},\ldots,X_n)$, Algorithms
and allow for the
conditional sampling of $(X_{N_{\mathrm{cond}}+1},\ldots,X_n)$ given
$(X_1=x_1^{\mathrm{cond}},\ldots,X_{N_{\mathrm{cond}}}=x_{N_{\mathrm{cond}}}^{\mathrm{cond}})$,
where Ncond is the number of conditioning variables. When
conditioning variables are not given (Ncond=0),
Algorithms and
reduce to the special cases of Algorithms 1 and 2 shown in
. Both routines relative to
Algorithms and
are publicly available via the CDVineCopulaConditional R package
. CDVineCopulaConditional includes tools to select the
best vine (based on information criteria) among those which allow for such
conditional sampling, and to fit the pair-copula families.
Algorithm to simulate uniform variables X = (X_1, …, X_{Ncond}, X_{Ncond+1}, …, X_n) from a C-vine.
Generates one sample x_{Ncond+1}, …, x_n conditioned on
given values x_1^{cond}, …, x_{Ncond}^{cond}.
The h-function is defined as in . Θ_{j,i} is the
set of parameters of the copula density c_{j,j+i|1,…,j-1}.
Sample w_{Ncond+1}, …, w_n independent uniform on [0,1].
if Ncond ≠ 0 then
  for i in (1, …, Ncond) do
    w_i = x_i^{cond}
  end for
end if
x_1 = v_{1,1} = w_1
for i in (2, …, n) do
  v_{i,1} = w_i
  if i > Ncond then
    for k in (i-1, i-2, …, 1) do
      v_{i,1} = h^{-1}(v_{i,1}, v_{k,k}, Θ_{k,i-k})
    end for
  end if
  x_i = v_{i,1}
  if i == n then
    Stop
  end if
  for j in (1, …, i-1) do
    v_{i,j+1} = h(v_{i,j}, v_{j,j}, Θ_{j,i-j})
  end for
end for
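The C-vine algorithm can be sketched in Python as follows. This is an illustration under strong assumptions, not the code of the CDVineCopulaConditional package: every pair-copula is taken to be Gaussian, so that the h-function and its inverse have closed forms, and `theta[(j, i)]` holds the single correlation parameter playing the role of Θ_{j,i}. The names `h`, `h_inv` and `sample_cvine_conditional` are ours.

```python
import random
from statistics import NormalDist

_STD = NormalDist()
_EPS = 1e-12


def _clip(u):
    """Keep u strictly inside (0, 1) so that the normal quantile is finite."""
    return min(max(u, _EPS), 1.0 - _EPS)


def h(x, v, rho):
    """Gaussian h-function: conditional cdf of x given v."""
    z = (_STD.inv_cdf(_clip(x)) - rho * _STD.inv_cdf(_clip(v))) / (1.0 - rho**2) ** 0.5
    return _STD.cdf(z)


def h_inv(u, v, rho):
    """Inverse of the Gaussian h-function with respect to its first argument."""
    z = _STD.inv_cdf(_clip(u)) * (1.0 - rho**2) ** 0.5 + rho * _STD.inv_cdf(_clip(v))
    return _STD.cdf(z)


def sample_cvine_conditional(theta, n, x_cond, rng=random):
    """One sample (x_1, ..., x_n) with x_1..x_Ncond fixed to x_cond (1-based maths)."""
    n_cond = len(x_cond)
    # w_i: conditioning value for i <= Ncond, otherwise an independent uniform.
    w = [None] + [x_cond[i - 1] if i <= n_cond else rng.random() for i in range(1, n + 1)]
    v = [[None] * (n + 1) for _ in range(n + 1)]
    x = [None] * (n + 1)
    x[1] = v[1][1] = w[1]
    for i in range(2, n + 1):
        v[i][1] = w[i]
        if i > n_cond:  # only the non-conditioning variables are transformed
            for k in range(i - 1, 0, -1):
                v[i][1] = h_inv(v[i][1], v[k][k], theta[(k, i - k)])
        x[i] = v[i][1]
        if i == n:
            break
        for j in range(1, i):
            v[i][j + 1] = h(v[i][j], v[j][j], theta[(j, i - j)])
    return x[1:]


# Three variables, conditioning on X_1 = 0.3 (parameter values are arbitrary).
theta = {(1, 1): 0.5, (1, 2): 0.3, (2, 1): 0.2}
sample = sample_cvine_conditional(theta, 3, [0.3], random.Random(0))
```

With an empty `x_cond` the function reduces to the unconditional sampler, mirroring the Ncond=0 case described in the text.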
Algorithm to simulate uniform variables X = (X_1, …, X_{Ncond}, X_{Ncond+1}, …, X_n) from a D-vine.
Generates one sample x_{Ncond+1}, …, x_n conditioned on
given values x_1^{cond}, …, x_{Ncond}^{cond}.
The h-function is defined as in . Θ_{j,i} is the
set of parameters of the copula density c_{i,i+j|i+1,…,i+j-1}.
Sample w_{Ncond+1}, …, w_n independent uniform on [0,1].
if Ncond ≠ 0 then
  for i in (1, …, Ncond) do
    w_i = x_i^{cond}
  end for
end if
x_1 = v_{1,1} = w_1
if Ncond < 2 then
  x_2 = v_{2,1} = h^{-1}(w_2, v_{1,1}, Θ_{1,1})
else
  x_2 = v_{2,1} = w_2
end if
v_{2,2} = h(v_{1,1}, v_{2,1}, Θ_{1,1})
for i in (3, …, n) do
  v_{i,1} = w_i
  if i > Ncond then
    for k in (i-1, i-2, …, 2) do
      v_{i,1} = h^{-1}(v_{i,1}, v_{i-1,2k-2}, Θ_{k,i-k})
    end for
    v_{i,1} = h^{-1}(v_{i,1}, v_{i-1,1}, Θ_{1,i-1})
  end if
  x_i = v_{i,1}
  if i == n then
    Stop
  end if
  v_{i,2} = h(v_{i-1,1}, v_{i,1}, Θ_{1,i-1})
  v_{i,3} = h(v_{i,1}, v_{i-1,1}, Θ_{1,i-1})
  if i > 3 then
    for j in (2, …, i-2) do
      v_{i,2j} = h(v_{i-1,2j-2}, v_{i,2j-1}, Θ_{j,i-j})
      v_{i,2j+1} = h(v_{i,2j-1}, v_{i-1,2j-2}, Θ_{j,i-j})
    end for
  end if
  v_{i,2i-2} = h(v_{i-1,2i-4}, v_{i,2i-3}, Θ_{i-1,1})
end for
Finally, we underline that this is not the only way to proceed for the
conditional simulation . However, although
the best vine is selected from only a fraction of all possible vines, the
approach can provide very satisfactory results, as we show in this study. Also, we refer to
and as other works where conditional
joint pdfs decomposed as C-vines were used for statistical modelling.
Statistical inference of the joint pdf
Statistical inference on a pdf decomposed via a PCC is in principle
computationally very demanding. As can be seen from Eq. (),
the arguments of the copulas are influenced by the choice of the marginals
(because of ui=Fi(xi)), and the argument of the copula in each level is
influenced by the fit of the copulas in the previous levels too. As a
consequence of this, the estimation of the parameters of the full pdf
(marginals and pair-copulas) should be performed together. Moreover, the
structure of the vine has to be chosen, which further increases the
computational demands.
To overcome these obstacles, some techniques have been developed. The
dependence of the copula parameter estimates on the estimation of the
marginals can be removed by using empirical marginals
. This allows for the estimation of copula parameters
without the need to consider the marginals. However, to take into account
that the estimation of the parameters of each pair copula depends on those of
the upper levels, the estimation of the parameters of all the pairs should be
performed at the same time. This way of estimating the parameters is called
semiparametric (SP). The estimator we use here is the stepwise semiparametric
(SSP). It was proposed by and then ,
and despite being asymptotically less efficient than the SP
, it produces very satisfactory results and speeds up the
procedure considerably . As in SP, the PCC parameters are
estimated independently of the marginals, but the estimation of the PCC
parameters is performed level by level, plugging in the parameters from
previous levels at each step .
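The role of empirical marginals in the semiparametric estimators can be illustrated with rank-based pseudo-observations. The sketch below is generic and is not taken from the packages used in this study; the rescaling by n+1 is a common convention that we assume here, ties are not handled, and the function name is ours.

```python
def pseudo_observations(x):
    """Map data to (0, 1) through rescaled ranks r_i / (n + 1).

    No parametric marginal is fitted, so copula parameters estimated from
    these values do not depend on any marginal model.
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / (n + 1)
    return u


u = pseudo_observations([3.2, 1.1, 2.5])
print(u)  # [0.75, 0.25, 0.5]
```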
In this study of compound floods, for each marginal pdf we use a mixture
distribution composed of the empirical distribution and a generalized Pareto
distribution (GPD) for the extremes. For each predictor X, the GPD is fitted to data
above a threshold defined here as their respective 95th percentile. For each
of the contributing variables Y, this threshold was chosen requiring that
the mean of the simulated extreme values from the joint pdf was as near as
possible to the maximum observed value of the variable Y we were fitting.
Adding the GPD to the empirical marginal for the extremes is necessary so as
not to constrain the model to simulate values of the variables Y with
maximum values that never exceed those observed during the calibration
period.
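One possible way to combine an empirical distribution with a GPD tail is sketched below. The exact construction used in the study is not spelled out in the text, so this is an assumption: below the threshold the empirical cdf is used, and above it the exceedance probability is distributed according to a GPD with scale `sigma` and shape `xi`. The parameter values in the example are arbitrary and not fitted.

```python
import math


def mixture_cdf(x, data, threshold, sigma, xi):
    """Empirical cdf up to the threshold, GPD tail above it."""
    n = len(data)
    p_u = sum(1 for d in data if d <= threshold) / n  # mass below the threshold
    if x <= threshold:
        return sum(1 for d in data if d <= x) / n
    excess = x - threshold
    if xi == 0.0:
        gpd = 1.0 - math.exp(-excess / sigma)
    else:
        z = 1.0 + xi * excess / sigma
        if z <= 0.0:  # beyond the upper end point, only possible for xi < 0
            return 1.0
        gpd = 1.0 - z ** (-1.0 / xi)
    return p_u + (1.0 - p_u) * gpd


data = [float(k) for k in range(1, 101)]
# Values beyond the observed maximum (100) still get probability below 1.
p_at_max = mixture_cdf(100.0, data, 95.0, 2.0, 0.1)
p_beyond = mixture_cdf(200.0, data, 95.0, 2.0, 0.1)
```

Because the tail is parametric, simulated values can exceed the largest observation, which is the property motivated in the paragraph above.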
We use the AIC to select the best vine structure among C- and D-vines (those
selected are shown in Sects. and
). In particular, for every possible C-
and D-vine, we fit all possible families through the maximum likelihood
estimation, and then we select the best family according to the AIC. Then, we
select the best vine according to the AIC for the full model. The pair-copula
families are chosen among those available in the
VineCopula R package
. In particular, for the unconditional model, all of the
available families are considered during the selection, while for the
conditional model we restricted the choice to the first 31 families listed in
the documentation file of the package. This is because of technical issues
regarding the simulation of data from the conditional pdf of the conditional
model. Once the vine is selected, to better assess the quality of the fit of
each pair-copula, we use the K-plot (Fig. ). This is a plot of
the Kendall function K(w)=P(Ci,j(Ui,Uj)≤w) computed with the
fitted copula against K(w) computed with the empirical copula obtained from
the observed uniform data. This diagnostic plot indicates a good quality of
the fit when the points follow the diagonal . We
note that the K(w) of the fitted copula is computed using Monte Carlo
methods (the simulations are long enough for the associated sampling
error to be neglected). In Fig. we show the resulting K-plots and the selected
copulas with their respective parameters for the five-dimensional PCC
(K-plots for the three-dimensional PCC are not shown).
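For reference, the empirical side of a K-plot can be sketched as follows. This is the standard rank-based estimator of the Kendall function for a bivariate sample; it is an illustration and not necessarily the routine used to produce the figures, and the function name is ours.

```python
def empirical_kendall(u, v):
    """Return the empirical Kendall function K_n for paired data (u, v)."""
    n = len(u)
    # w_i: proportion of the other points that lie strictly below-left of point i.
    w = [
        sum(1 for j in range(n) if u[j] < u[i] and v[j] < v[i]) / (n - 1)
        for i in range(n)
    ]

    def K(t):
        return sum(1 for wi in w if wi <= t) / n

    return K


# Perfectly increasing dependence: K_n(t) grows like t.
K_como = empirical_kendall([0.1, 0.3, 0.5, 0.7, 0.9], [0.1, 0.3, 0.5, 0.7, 0.9])
print(K_como(0.5))  # 0.6
```

Evaluating the same estimator on a long simulation from a fitted pair-copula gives the model-based K(w) against which the empirical one is plotted.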
The families chosen for copulas c43|5(u4|5,u3|5) and
c42|135(u4|513,u2|513) according to the AIC were describing
slightly negative dependencies (<0.1), but for physical reasons we expect
these copulas to describe slightly positive dependencies. We argue that this
result is due to uncertainties of the model. Therefore we choose independent
copulas for these pairs, which is a compromise between the expert knowledge
we have about the data and the result of the fit. When assuming independent
copulas for these two pairs, the corresponding K-plots show only a small
deviation from the diagonal (right side of Fig. ). Moreover,
these K-plots are mostly inside the 95 % confidence interval of the
K-plots, which supports the choice of independent copulas for these two
pairs.
K-plots of the pair-copula families selected for the
five-dimensional model (names of the families and parameters are shown in the
top left of each plot). In abscissa the empirical K-function and in ordinate
the K-function based on fitted copula. The 95 % confidence interval
(shown in light red) is obtained from 10⁴ K-plots computed on simulated
pairs (with the same length as the observed data) from the selected
pair-copula families.
The CDVineCopulaConditional and VineCopula
R packages were used to work with copulas. The GPDs for
the marginal distributions were fitted through the gpd.fit function of the
ismev R package .
Selected pair-copula families
In the case of the unconditional model, the fitted pair-copula families to
the observed contributing variables Y – relative to the vine of
Eq. () – are Survival BB1 (parameters: 0.49, 1.15) for
c31(u3,u1), BB8 (parameters: 4.01, 0.6) for c12(u1,u2), and
Tawn type 1 (parameters: 2.59, 0.73) for c32|1(u3|1,u2|1). The
selected families relative to the vine of Eq. (), i.e. the one
fitted to
(ε1Sea,ε2River,ε3River)
introduced in Appendix , are t-copula (parameters: 0.15,
3.44) for c12(u1,u2), Tawn type 2 (parameters: 2.85, 0.71) for
c23(u2,u3), and Survival Gumbel (parameter: 1.13) for
c13|2(u1|2,u3|2). In the case of the conditional model, the
selected pair-copula families with relative parameters, fitted to the
observed data of contributing variables Y and predictors X, are shown in
Figure .
Model and risk uncertainty estimation via parametric bootstrap
The flexibility of copula theory in modelling multivariate distributions has
led to its widespread use in the literature, and more recently in climate
science. However, once the model is fitted to observed data, we stress that
procedures to get an estimate of the uncertainties, both in the parameter
estimates and the choice of the model, should be considered. This is
particularly important, as it often happens that because of the limited
sample size of the available data, these uncertainties are large and so
cannot be neglected . Practically they have a direct
influence on the uncertainties of risk analysis. In particular, we observed
that the uncertainties are also controlled by the dependence values between
the modelled pairs (not shown).
In this study, we find model uncertainties in the joint pdf which propagate
into large uncertainties when assessing the risk of compound floods. This
does not mean that such models are not useful, but instead that the results
should be interpreted being aware of these existing uncertainties. Also, even
if large, the obtained uncertainties in the risk are smaller than those
obtained computing the risk analysis directly on the observed data of the
impact, underlining another advantage of applying such procedures.
For both the unconditional and conditional models, we use a parametric
bootstrap to assess the model and subsequent risk uncertainty, as follows.
1. Select and fit a model that can reproduce the statistical characteristics of
Yobs ((Yobs,Xobs) for the conditional model), i.e. the dependence among the
variables and their marginal distributions. For the unconditional model we
also include the serial correlation as described in Appendix .
2. Simulate B=2.5×10³ samples of the contributing variables Y (as well as
predictors X for the conditional model) with the same length as the observed data.
3. On each of the B=2.5×10³ samples, fit a joint pdf via PCCs (the structure of
the PCC is the same as that fitted on the observed data, while the
pair-copula families are re-selected for each sample).
4. From each of these B=2.5×10³ models, simulate a sample of contributing
variables Y whose length is 200 times that of the observed data (for the
conditional model the contributing variables Y are simulated conditioned on
the predictors X).
5. For each sample, compute the simulated impact sequence as
hsim=h(Y1Seasim,Y2Riversim,Y3Riversim) and estimate the corresponding return
level curves. Return levels are estimated by fitting the generalized extreme
value (GEV) distribution to annual maximum values. We simulated samples of
length 200 times the length of the observed data (6 years), to get, for
each sample, 1200 annual maximum values to which we fit the GEV
distribution. This allows us to neglect the uncertainty of the return levels
driven by the sampling because the uncertainties of the GEV distribution
parameters are negligible.
6. Estimate the uncertainties in the return levels by identifying the 95 %
confidence interval (i.e. the range 2.5–97.5 %) of
the B=2.5×10³ return level curves.
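The last step of the procedure, the pointwise 2.5–97.5 % range across the bootstrap return level curves, can be sketched as follows. The percentile definition (linear interpolation between order statistics) and the function names are our assumptions; the toy curves only stand in for the B curves obtained in step 5.

```python
def percentile(values, q):
    """Percentile for q in [0, 1], linear interpolation between order statistics."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def confidence_band(curves, lower=0.025, upper=0.975):
    """Pointwise (lower, upper) bounds over curves sharing the same return periods."""
    return [
        (percentile(levels, lower), percentile(levels, upper))
        for levels in zip(*curves)
    ]


# Toy example: 101 "curves", each evaluated at two return periods.
curves = [[float(b), 2.0 * b] for b in range(101)]
band = confidence_band(curves)
```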
As underlined in step 1, for the unconditional model, we explicitly model the
serial correlations of the contributing variables Y when computing the
uncertainties. This was done to avoid an underestimation of the risk
uncertainties (see Appendix ). For the conditional model, step
3 is a rigorous bootstrap procedure, while for the unconditional model this
step is an approximation. In fact, for the unconditional model, at step 3 we
should have fitted the same type of model as fitted in step 1, i.e. that
could include the serial correlations. Unfortunately, such a procedure was
not feasible because of computational limitations, and we had to proceed by
approximation, i.e. fitting a pdf via a PCC without considering the
autoregressive processes. In particular, the computational limitations were
due to the tuning procedure explained in Appendix .
Therefore we used the best method possible to avoid underestimation of the
risk uncertainties, but we underline that we used such an approximation.
The uncertainty in the return levels obtained via the observed data
hobs is computed by propagating the parameter uncertainties of
the GEV distribution fitted to the annual maxima of hobs
(Fig. ). In particular, the fitted GEV distribution
is a function of the parameters μ (location), σ (scale) and η
(shape) . The GEV-based return level RLt associated
with the return period t is a function of the three parameters
(μ,σ,η). We obtained the standard deviations
of the three parameters (μ,σ,η), respectively sμ,
sσ, and sη, via the gev.fit function of the ismev R
package . To estimate the standard deviation of the return level
RLt, we propagated the standard deviations of the three parameters
(μ,σ,η) using the formula
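$$ s_{RL_t} = \sqrt{\left(\frac{\partial RL_t}{\partial \mu}\right)^{2} s_{\mu}^{2} + \left(\frac{\partial RL_t}{\partial \sigma}\right)^{2} s_{\sigma}^{2} + \left(\frac{\partial RL_t}{\partial \eta}\right)^{2} s_{\eta}^{2}}, $$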
where sRLt is the standard deviation of the return level
RLt. The final 95 % interval of uncertainty of the return level
RLt is obtained as RLt±2sRLt.
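The propagation can be sketched numerically as follows. We assume the usual GEV parameterization in which the return level for return period t is μ − (σ/η)[1 − y^(−η)] with y = −log(1 − 1/t) and η ≠ 0; the partial derivatives follow from this expression. As in the formula above, covariances between the parameter estimates are not included. Function names and the numbers in the example are ours, not fitted values.

```python
import math


def gev_return_level(mu, sigma, eta, t):
    """Return level for return period t (in years) of a GEV(mu, sigma, eta), eta != 0."""
    y = -math.log(1.0 - 1.0 / t)
    return mu - sigma / eta * (1.0 - y ** (-eta))


def return_level_sd(mu, sigma, eta, t, s_mu, s_sigma, s_eta):
    """Standard deviation of the return level by first-order error propagation."""
    y = -math.log(1.0 - 1.0 / t)
    d_mu = 1.0
    d_sigma = -(1.0 - y ** (-eta)) / eta
    d_eta = (sigma / eta**2) * (1.0 - y ** (-eta)) - (sigma / eta) * y ** (-eta) * math.log(y)
    return math.sqrt((d_mu * s_mu) ** 2 + (d_sigma * s_sigma) ** 2 + (d_eta * s_eta) ** 2)


# Illustrative values only.
mu, sigma, eta, t = 1.0, 0.5, 0.1, 100.0
rl = gev_return_level(mu, sigma, eta, t)
s_rl = return_level_sd(mu, sigma, eta, t, s_mu=0.05, s_sigma=0.04, s_eta=0.08)
interval = (rl - 2.0 * s_rl, rl + 2.0 * s_rl)  # approximate 95 % interval
```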
Incorporation of the AR(1) into the unconditional model
ACF of the observed time series (shown in red) against the ACF 95 % confidence interval (grey) of the model (obtained through the Monte
Carlo procedure). The dashed lines contain the 95 % confidence interval
defined by the ACF of a white noise process; i.e. outside this interval the
ACF of the contributing variables Y is significant.
Given a statistical model describing time series with serial correlations, to
avoid an underestimation of the model uncertainties computed via the
bootstrap procedure, it is necessary to use a model which can reproduce the
serial correlation. During the bootstrap procedure, simulating samples
without serial correlation, and then re-fitting the model to each of them,
would mean assuming that the data carry more information than they actually
do. In fact, the effective sample size of data with serial
correlation is smaller than that of data without it . Here we
introduce the procedure we used to build a multivariate statistical model
that can represent the serial correlation and the marginal pdf of the
variables, and the statistical dependencies between them. The steps taken
follow below.
1. Fit a linear Gaussian autoregressive model of order 1, AR(1),
$$ Y_i(t) = c + \varphi\, Y_i(t-1) + \varepsilon_i(t) $$
to the ith marginal time series (i=1,2,3), i.e.
(Y1Sea,Y2River,Y3River). The chosen
AR(1) requires that the modelled variable is Gaussian distributed; so, before
the fit, we transformed the river variables via the natural logarithm, which
brings their behaviour closer to Gaussian. The observed sea variable
was not transformed because its behaviour was already similar to Gaussian.
2. Having verified via the autocorrelation function (ACF) that εi(t) no longer has a
significant serial correlation, fit the joint pdf via PCCs to the residual
variables (ε1,ε2,ε3). We observe
that the dependencies modelled via these PCCs are not the usual
physical dependencies between the contributing variables (i.e. sea and river
levels), but those between their residuals with respect to the AR(1) models.
3. Simulate the residuals (ε1sim,ε2sim,ε3sim) and
plug them into the ith autoregressive model. Finally, to obtain the simulated
contributing variables Y, the river variables are transformed back via the exp
function.
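The AR(1) part of these steps can be sketched as follows. We assume an ordinary least-squares fit on a gap-free series, which is a simplification (the observed series contain gaps), and the function names are ours. The joint simulation of the residuals via the PCC is not reproduced here; the example simply feeds residuals back into the fitted recursion.

```python
import random


def fit_ar1(y):
    """Least-squares estimates (c, phi) of Y(t) = c + phi * Y(t-1) + eps(t)."""
    prev, curr = y[:-1], y[1:]
    m_prev = sum(prev) / len(prev)
    m_curr = sum(curr) / len(curr)
    cov = sum((a - m_prev) * (b - m_curr) for a, b in zip(prev, curr))
    var = sum((a - m_prev) ** 2 for a in prev)
    phi = cov / var
    return m_curr - phi * m_prev, phi


def ar1_residuals(y, c, phi):
    """Residuals eps(t) = Y(t) - c - phi * Y(t-1) for t = 1, ..., len(y) - 1."""
    return [y[t] - c - phi * y[t - 1] for t in range(1, len(y))]


def ar1_simulate(eps, c, phi, y0):
    """Plug a residual sequence into the AR(1) recursion, starting from y0."""
    y = [y0]
    for e in eps:
        y.append(c + phi * y[-1] + e)
    return y


# Synthetic series with known parameters (c = 0.5, phi = 0.7).
rng = random.Random(1)
y_obs = ar1_simulate([rng.gauss(0.0, 1.0) for _ in range(2000)], 0.5, 0.7, 0.0)
c_hat, phi_hat = fit_ar1(y_obs)
eps_hat = ar1_residuals(y_obs, c_hat, phi_hat)
y_back = ar1_simulate(eps_hat, c_hat, phi_hat, y_obs[0])
```

For a log-transformed river variable, the simulated series would be passed through `math.exp` at the end, as described in step 3.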
We observe here that when selecting the fitted pair-copulas and parameters
for the residuals via the AIC, the simulated contributing variables Y had a
smaller dependence with respect to the observed variables. We therefore
proceeded through a tuning procedure; i.e. we built a routine to
automatically tune the parameters of the fitted families, requiring that the
Kendall rank correlation coefficients among the contributing variables Y
were well simulated.
In Fig. , the autocorrelation functions of the
Yobs variables are compared with those of Ysim
simulated from the fitted model. Because of the gaps in the Yobs
time series, not all the observations are usable for computing the ACF (in
particular, the percentage of usable data decreases when increasing the lag
at which the ACF is computed). We therefore computed the ACF up to a lag of
about 25 days, which guarantees the use of at least 35 % of the data
from the observed time series. Up to a lag of about 15 days, except for
very few cases with the variable Y3River, the ACFs of the
observed data are always inside the 95 % interval of the ACFs obtained
from the fitted model.
We consider this result satisfactory because our target is to include the
serial correlation of the contributing variables Y in the model, and we can
see that even for the variable Y3River, the values of the ACFs
have a significant serial correlation. Also, considering that the ACF is only
slightly misrepresented for just one of the three time series, we argue that
this is likely to have only a small effect on the final assessment of the
model uncertainties.
Brier score for extreme values
We employ the Brier score to assess the accuracy of the probabilistic
predictions of the conditional model when predicting extreme values of the
impact h. We defined an extreme of h as a
value larger than the 95th percentile of hobs. The Brier score is
$$ \mathrm{BS} = \frac{1}{N} \sum_{t=1}^{N} (p_t - o_t)^{2}, $$
where pt is the probability of getting an extreme value h from the
model at time t, while ot is 1 if hobs(t) is extreme and 0
otherwise. We get the value of pt through a Monte Carlo procedure.
The Brier skill score (BSS) measures the relative accuracy of the model under
validation over a reference model, and is defined as
$$ \mathrm{BSS} = 1 - \frac{\mathrm{BS}}{\mathrm{BS}_{\mathrm{ref}}}, $$
where BSref is the Brier score of the reference model. Here we
consider the climatology of h as the reference model, i.e. an empirical
model such that pt=0.05∀t. A significant positive value of BSS
indicates that when predicting extreme values, the model under validation is
more accurate – according to the BS – than the reference model.
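Both scores are straightforward to compute once the probabilities pt are available. The sketch below uses the climatological reference pt=0.05 described above; the function names and the toy numbers are ours.

```python
def brier_score(p, o):
    """Mean squared difference between forecast probabilities p and outcomes o."""
    return sum((pt - ot) ** 2 for pt, ot in zip(p, o)) / len(p)


def brier_skill_score(p, o, p_ref=0.05):
    """Skill relative to a constant reference forecast p_ref (the climatology)."""
    bs = brier_score(p, o)
    bs_ref = brier_score([p_ref] * len(o), o)
    return 1.0 - bs / bs_ref


# Toy example: four time steps, one observed extreme.
o = [0, 0, 1, 0]
p = [0.1, 0.0, 0.8, 0.1]
bs = brier_score(p, o)
bss = brier_skill_score(p, o)
```

In the study pt comes from a Monte Carlo procedure; in such a setting it would be the fraction of simulated impacts at time t that exceed the extreme threshold.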
Cross-validation procedure
To assess the quality of the conditional model, avoiding overfitting, we
perform a 6-fold cross-validation. Therefore, the original sample of data
(X,Y) is randomly partitioned into six equally sized
subsamples. Of the six subsamples, five subsamples (the training data) are
used in fitting the model that is then validated against the remaining
subsample. For each training subsample we fit (1) new predictors X for the
contributing variables Y, (2) a new joint pdf
fY|X(Y|X) and (3) a new h-function. For each
validation subsample, we simulated 10⁴ realizations of the Y
values by conditioning on the concurrent predictors. Finally, by combining
the simulations of each validation subsample, 10⁴ cross-validation time
series of the contributing variables Y and the impact h are obtained.
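The random partition underlying the 6-fold cross-validation can be sketched as follows. Only the index bookkeeping is shown; fitting the predictors, the joint pdf and the h-function on each training set is not reproduced. The function name and the seed are ours, and when the sample size is not a multiple of six the fold sizes differ by at most one.

```python
import random


def k_fold_indices(n, k=6, seed=0):
    """Randomly split indices 0..n-1 into k folds; return (train, validation) pairs."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [(sorted(set(idx) - set(fold)), sorted(fold)) for fold in folds]


splits = k_fold_indices(12)
for train, valid in splits:
    # Here a model would be fitted on `train` and simulated conditionally on
    # the predictors of `valid`.
    pass
```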
Douglas Maraun had the initial idea for the study. Emanuele Bevacqua and Douglas Maraun jointly
developed the study with contributions from Martin Widmann. Emanuele Bevacqua
developed the statistical model with contributions from Ingrid
Hobæk Haff, Douglas Maraun and Mathieu Vrac. Emanuele Bevacqua carried
out the analysis with contributions from Douglas Maraun and
Ingrid Hobæk Haff. Emanuele Bevacqua, Douglas Maraun and Martin Widmann
interpreted the results. Emanuele Bevacqua wrote the paper with contributions
from all other authors.
The authors declare that they have no conflict of
interest.
Acknowledgements
Emanuele Bevacqua received funding from the Volkswagen Foundation's CE:LLO
project (Az.: 88468), which also supported project meetings. The authors
would like to thank Arnoldo Frigessi for hosting them, and for fruitful
discussions at the Norwegian Computing Center. Emanuele Bevacqua would like
to thank Colin Manning for the productive discussions, and contributions
during the writing process. The authors would like to thank the anonymous
reviewers for their valuable comments and suggestions which contributed to
improving the quality of the paper. The data used for sea and river levels
have been provided by the Italian National Institute for Environmental
Protection and Research (ISPRA) and Arpae Emilia-Romagna.
Edited by: D. Koutsoyiannis
Reviewed by: J. Zscheischler and two anonymous referees
References
Aas, K., Czado, C., Frigessi, A., and Bakken, H.: Pair-copula constructions
of multiple dependence, Insurance: Mathematics and Economics, 44, 182–198,
10.1016/j.insmatheco.2007.02.001, 2009.
Acar, E. F., Genest, C., and Nešlehová, J.: Beyond simplified
pair-copula constructions, J. Multivariate Anal., 110, 74–90,
10.1016/j.jmva.2012.02.001, 2012.
Aghakouchak, A., Cheng, L., Mazdiyasni, O., and Farahmand, A.: Global warming
and changes in risk of concurrent climate extremes: Insights from the 2014
California drought, Geophys. Res. Lett., 41, 8847–8852,
10.1002/2014gl062308, 2014.
Arpa Emilia-Romagna: Servizio IdroMeteoClima, Unità Radarmeteorologia,
Radarpluviometria, Nowcasting e Reti non convenzionali, Area Centro
Funzionale e Sala Operativa Previsioni, Unità gestione Rete
idrometeorologica RIRER, Area Modellistica Meteo: Rapporto dell'evento
meteorologico del 5 e 6 febbraio 2015, Bologna, Italy, 2015.
Bedford, T. and Cooke, R. M.: Monte Carlo simulation of vine dependent random
variables for applications in uncertainty analysis, Proceedings of the
European Conference on Safety and Reliability 2001, Turin, Italy, 2001a.
Bedford, T. and Cooke, R. M.: Probability density decomposition for
conditionally dependent random variables modeled by vines, Ann. Math. Artif.
Intel., 32, 245–268, 10.1023/A:1016725902970, 2001b.
Bedford, T. and Cooke, R. M.: Vines–a new graphical model for dependent
random variables, Ann. Stat., 30, 1031–1068, 10.1214/aos/1031689016,
2002.
Bevacqua, E.: CDVineCopulaConditional: Sampling from Conditional C- and
D-Vine Copulas, R package version 0.1.0,
https://CRAN.R-project.org/package=CDVineCopulaConditional, last
access: 1 June 2017.
Brechmann, E. C., Hendrich, K., and Czado, C.: Conditional copula simulation
for systemic risk stress testing, Insurance: Mathematics and Economics, 53,
722–732, 10.1016/j.insmatheco.2013.09.009, 2013.
Carbognin, L., Teatini, P., Tosi, L., Strozzi, T., and Tomasin, A.: Present
Relative Sea Level Rise in the Northern Adriatic Coastal Area, In: Coastal
and marine spatial planning. Marine Research at CNR, DTA/06, CNR –
Dipartimento Scienze del Sistema Terra e Tecnologie, Roma, 1147–1162, 2011.
Coles, S.: An introduction to statistical modeling of extreme values,
Springer, London, 47–49, 2001.
Dee, D. P., Uppala, S. M., Simmons, A. J., Berrisford, P., Poli, P.,
Kobayashi, S., Andrae, U., Balmaseda, M. A., Balsamo, G., Bauer, P.,
Bechtold, P., Beljaars, A. C. M., Berg, L. V. D., Bidlot, J., Bormann, N.,
Delsol, C., Dragani, R., Fuentes, M., Geer, A. J., Haimberger, L., Healy, S.
B., Hersbach, H., Hólm, E. V., Isaksen, L., Kållberg, P., Köhler,
M., Matricardi, M., Mcnally, A. P., Monge-Sanz, B. M., Morcrette, J.-J.,
Park, B.-K., Peubey, C., Rosnay, P. D., Tavolato, C., Thépaut, J.-N., and
Vitart, F.: The ERA-Interim reanalysis: configuration and performance of the
data assimilation system, Q. J. Roy. Meteor. Soc., 137, 553–597,
10.1002/qj.828, 2011.
De Michele, C., Salvadori, G., Vezzoli, R., and Pecora, S.: Multivariate
assessment of droughts: Frequency analysis and dynamic return period, Water
Resour. Res., 49, 6985–6994, 10.1002/wrcr.20551, 2013.
Fischer, E. M., Seneviratne, S. I., Vidale, P. L., Lüthi, D., and
Schär, C.: Soil Moisture-Atmosphere Interactions during the 2003 European
Summer Heat Wave, J. Climate, 20, 5081–5099, 10.1175/jcli4288.1, 2007.
Flato, G., Marotzke, J., Abiodun, B., Braconnot, P., Chou, S. C., Collins,
W., Cox, P., Driouech, F., Emori, S., Eyring, V., Forest, C., Gleckler, P.,
Guilyardi, E., Jakob, C., Kattsov, V., Reason, C., and Rummukainen, M.:
Evaluation of Climate Models, in: Climate Change 2013: The Physical Science
Basis, Contribution of Working Group I to the Fifth Assessment Report of
the Intergovernmental Panel on Climate Change, edited by: Stocker, T. F.,
Qin, D., Plattner, G.-K., Tignor, M., Allen, S. K., Boschung, J., Nauels, A.,
Xia, Y., Bex, V., and Midgley, P. M., Cambridge University Press,
Cambridge, United Kingdom and New York, NY, USA, 790–791, 2013.
Gambolati, G., Teatini, P., and Gonella, M.: GIS simulations of the
inundation risk in the coastal lowlands of the Northern Adriatic Sea, Math.
Comput. Model., 35, 963–972, 10.1016/s0895-7177(02)00063-8, 2002.
Genest, C. and Favre, A.-C.: Everything You Always Wanted to Know about
Copula Modeling but Were Afraid to Ask, J. Hydrol. Eng., 12, 347–368,
10.1061/(asce)1084-0699(2007)12:4(347), 2007.
Genest, C., Ghoudi, K., and Rivest, L.-P.: A semiparametric estimation
procedure of dependence parameters in multivariate families of distributions,
Biometrika, 82, 543–552, 10.1093/biomet/82.3.543, 1995.
Giorgi, F., Jones, C., and Asrar, G. R.: Addressing climate information needs
at the regional level: the CORDEX framework, WMO Bulletin, 58, 175–183,
2009.
Gräler, B., van den Berg, M. J., Vandenberghe, S., Petroselli, A.,
Grimaldi, S., De Baets, B., and Verhoest, N. E. C.: Multivariate return
periods in hydrology: a critical and practical review focusing on synthetic
design hydrograph estimation, Hydrol. Earth Syst. Sci., 17, 1281–1296,
10.5194/hess-17-1281-2013, 2013.
Gräler, B., Petroselli, A., Grimaldi, S., De Baets, B., and Verhoest, N.:
An update on multivariate return periods in hydrology, Proc. IAHS, 373,
175–178, 10.5194/piahs-373-175-2016, 2016.
Heffernan, J. E. and Stephenson, A. G.: ismev: An Introduction to Statistical
Modeling of Extreme Values. R package version 1.41,
https://CRAN.R-project.org/package=ismev (last access: 1 June 2017),
2016.
Hobæk Haff, I.: Comparison of estimators for pair-copula constructions,
J. Multivariate Anal., 110, 91–105, 10.1016/j.jmva.2011.08.013, 2012.
Hobæk Haff, I.: Parameter estimation for pair-copula constructions,
Bernoulli, 19, 462–491, 10.3150/12-bej413, 2013.
Hobæk Haff, I., Aas, K., and Frigessi, A.: On the simplified pair-copula
construction – Simply useful or too simplistic?, J. Multivariate Anal., 101,
1296–1310, 10.1016/j.jmva.2009.12.001, 2010.
Hobæk Haff, I., Frigessi, A., and Maraun, D.: How well do regional
climate models simulate the spatial dependence of precipitation? An
application of pair-copula constructions, J. Geophys. Res.-Atmos., 120,
2624–2646, 10.1002/2014jd022748, 2015.
Joe, H.: Families of m-variate distributions with given margins and
m(m-1)/2 bivariate dependence parameters, Institute of Mathematical
Statistics Lecture Notes – Monograph Series Distributions with fixed
marginals and related topics, 120–141, 10.1214/lnms/1215452614, 1996.
Joe, H.: Multivariate Models and Multivariate Dependence Concepts, Taylor
& Francis Ltd, United States, 2014.
Kew, S. F., Selten, F. M., Lenderink, G., and Hazeleger, W.: The simultaneous
occurrence of surge and discharge extremes for the Rhine delta, Nat. Hazards
Earth Syst. Sci., 13, 2017–2029, 10.5194/nhess-13-2017-2013, 2013.
Klerk, W. J., Winsemius, H. C., Verseveld, W. J. V., Bakker, A. M. R., and
Diermanse, F. L. M.: The co-incidence of storm surges and extreme discharges
within the Rhine-Meuse Delta, Environ. Res. Lett., 10, 035005,
10.1088/1748-9326/10/3/035005, 2015.
Kurowicka, D. and Cooke, R. M.: Sampling algorithms for generating joint
uniform distributions using the vine – copula method, 3rd IASC world
conference on Computational Statistics and Data Analysis, Limassol, Cyprus,
2005.
Leonard, M., Westra, S., Phatak, A., Lambert, M., Van den Hurk, B., Mcinnes,
K., Risbey, J., Schuster, S., Jakob, D., and Stafford-Smith, M.: A compound
event framework for understanding extreme impacts, WIREs Clim Change, 5,
113–128, 10.1002/wcc.252, 2014.
Lian, J. J., Xu, K., and Ma, C.: Joint impact of rainfall and tidal level on
flood risk in a coastal city with a complex river network: a case study of
Fuzhou City, China, Hydrol. Earth Syst. Sci., 17, 679–689,
10.5194/hess-17-679-2013, 2013.
LIFE PRIMES: Preventing flooding RIsks by Making resilient communitiES:
http://ec.europa.eu/environment/life/project/Projects/index.cfm?fuseaction=search.dspPage&n_proj_id=5247,
last access: 6 December 2016a.
LIFE PRIMES: Il progetto LIFE PRIMES,
http://protezionecivile.regione.emilia-romagna.it/life-primes/progetto/progetto-life-primes/il-progetto-life-primes,
last access: 6 December 2016b.
Liu, Z., Zhou, P., Chen, X., and Guan, Y.: A multivariate conditional model
for streamflow prediction and spatial precipitation refinement, J. Geophys.
Res.-Atmos., 120, 10116–10129, 10.1002/2015jd023787, 2015.
Maraun, D., Wetterhall, F., Ireson, A. M., Chandler, R. E., Kendon, E. J.,
Widmann, M., Brienen, S., Rust, H. W., Sauter, T., Themeßl, M., Venema,
V. K. C., Chun, K. P., Goodess, C. M., Jones, R. G., Onof, C., Vrac, M., and
Thiele-Eich, I.: Precipitation downscaling under climate change: Recent
developments to bridge the gap between dynamical models and the end user,
Rev. Geophys., 48, RG3003, 10.1029/2009rg000314, 2010.
Maraun, D.: Reply to “Comment on “Bias Correction, Quantile Mapping, and
Downscaling: Revisiting the Inflation Issue””, J. Climate, 27, 1821–1825,
10.1175/jcli-d-13-00307.1, 2014.
Masina, M., Lamberti, A., and Archetti, R.: Coastal flooding: A copula based
approach for estimating the joint probability of water levels and waves,
Coast. Eng., 97, 37–52, 10.1016/j.coastaleng.2014.12.010, 2015.
Materia, S., Dirmeyer, P. A., Guo, Z., Alessandri, A., and Navarra, A.: The
Sensitivity of Simulated River Discharge to Land Surface Representation and
Meteorological Forcings, J. Hydrometeorol., 11, 334–351,
10.1175/2009jhm1162.1, 2010.
Nelsen, R. B.: An introduction to copulas, Springer, New York, 2006.
NOAA, Tides and Currents: https://tidesandcurrents.noaa.gov/, last
access: 14 March 2017.
Pathiraja, S., Westra, S., and Sharma, A.: Why continuous simulation? The
role of antecedent moisture in design flood estimation, Water Resour. Res.,
48, W06534, 10.1029/2011wr010997, 2012.
R Core Team: R: A language and environment for statistical computing. R
Foundation for Statistical Computing, Vienna, Austria,
https://www.R-project.org/ (last access: 1 June 2017), 2016.
Saghafian, B. and Mehdikhani, H.: Drought characterization using a new
copula-based trivariate approach, Nat. Hazards, 72, 1391–1407,
10.1007/s11069-013-0921-6, 2013.
Salvadori, G. and De Michele, C.: On the Use of Copulas in Hydrology: Theory
and Practice, J. Hydrol. Eng., 12, 369–380,
10.1061/(asce)1084-0699(2007)12:4(369), 2007.
Salvadori, G., De Michele, C., Kottegoda, N. T., and Rosso, R.: Extremes in
nature: an approach using Copulas, Springer, Dordrecht, the Netherlands,
2007.
Salvadori, G., De Michele, C., and Durante, F.: On the return period and
design in a multivariate framework, Hydrol. Earth Syst. Sci., 15, 3293–3305,
10.5194/hess-15-3293-2011, 2011.
Salvadori, G., Durante, F., Tomasicchio, G., and D'alessandro, F.: Practical
guidelines for the multivariate assessment of the structural risk in coastal
and off-shore engineering, Coast. Eng., 95, 77–83,
10.1016/j.coastaleng.2014.09.007, 2015.
Salvadori, G., Durante, F., Michele, C. D., Bernardi, M., and Petrella, L.: A
multivariate copula-based framework for dealing with hazard scenarios and
failure probabilities, Water Resour. Res., 52, 3701–3721,
10.1002/2015wr017225, 2016.
Schepsmeier, U., Stoeber, J., Brechmann, E. C., Graeler, B., Nagler, T., and
Erhardt, T.: VineCopula: Statistical Inference of Vine Copulas, R
package version 2.0.5, https://CRAN.R-project.org/package=VineCopula
(last access: 1 June 2017), 2016.
Schölzel, C. and Friederichs, P.: Multivariate non-normally distributed
random variables in climate research – introduction to the copula approach,
Nonlin. Processes Geophys., 15, 761–772, 10.5194/npg-15-761-2008, 2008.
Seneviratne, S. I., Corti, T., Davin, E. L., Hirschi, M., Jaeger, E. B.,
Lehner, I., Orlowsky, B., and Teuling, A. J.: Investigating soil
moisture-climate interactions in a changing climate: A review, Earth-Sci.
Rev., 99, 125–161, 10.1016/j.earscirev.2010.02.004, 2010.
Seneviratne, S. I., Nicholls, N., Easterling, D., Goodess, C. M., Kanae, S.,
Kossin, J., Luo, Y., Marengo, J., Mcinnes, K., Rahimi, M., Reichstein, M.,
Sorteberg, A., Vera, C., Zhang, X., Rusticucci, M., Semenov, V., Alexander,
L. V., Allen, S., Benito, G., Cavazos, T., Clague, J., Conway, D.,
Della-Marta, P. M., Gerber, M., Gong, S., Goswami, B. N., Hemer, M., Huggel,
C., Van den Hurk, B., Kharin, V. V., Kitoh, A., Tank, A. M. K., Li, G.,
Mason, S., Mcguire, W., Oldenborgh, G. J. V., Orlowsky, B., Smith, S., Thiaw,
W., Velegrakis, A., Yiou, P., Zhang, T., Zhou, T., and Zwiers, F. W.: Changes
in Climate Extremes and their Impacts on the Natural Physical Environment,
Managing the Risks of Extreme Events and Disasters to Advance Climate Change
Adaptation Special Report of the Intergovernmental Panel on Climate Change,
109–230, 10.1017/cbo9781139177245.006, 2012.
Serinaldi, F.: Can we tell more than we can know? The limits of bivariate
drought analyses in the United States, Stoch. Env. Res. Risk A., 30, 1691,
10.1007/s00477-015-1124-3, 2015.
Serinaldi, F., Bonaccorso, B., Cancelliere, A., and Grimaldi, S.:
Probabilistic characterization of drought properties through copulas, Phys.
Chem. Earth, 34, 596–605,
10.1016/j.pce.2008.09.004, 2009.
Shiau, J. T.: Return period of bivariate distributed extreme hydrological
events, Stoch. Env. Res. Risk A., 17, 42–57, 10.1007/s00477-003-0125-9,
2003.
Shiau, J.-T., Feng, S., and Nadarajah, S.: Assessment of hydrological
droughts for the Yellow River, China, using copulas, Hydrol. Process., 21,
2157–2163, 10.1002/hyp.6400, 2007.
Sklar, A.: Fonctions de Répartition à Dimensions et Leurs marges, 8,
Publications de l'Institut de Statistique de L'Université de Paris,
Paris, France, 1959.Stöber, J., Joe, H., and Czado, C.: Simplified pair copula constructions
– Limitations and extensions, J. Multivariate Anal., 119, 101–118,
10.1016/j.jmva.2013.04.014, 2013.Svensson, C. and Jones, D. A.: Dependence between extreme sea surge, river
flow and precipitation in eastern Britain, Int. J. Climatol., 22, 1149–1168,
10.1002/joc.794, 2002.Taylor, K. E., Stouffer, R. J., and Meehl, G. A.: An Overview of CMIP5 and
the Experiment Design, B. Am. Meteorol. Soc., 93, 485–498,
10.1175/bams-d-11-00094.1, 2012.Tisseuil, C., Vrac, M., Lek, S., and Wade, A. J.: Statistical downscaling of
river flows, J. Hydrol., 385, 279–291, 10.1016/j.jhydrol.2010.02.030,
2010.Tsimplis, M. N. and Woodworth, P. L.: The global distribution of the seasonal
sea level cycle calculated from coastal tide gauge data, J. Geophys. Res.,
99, 16031, 10.1029/94jc01115, 1994.Van Den Brink, H. W., Können, G. P., Opsteegh, J. D., Oldenborgh, G. J.
V., and Burgers, G.: Improving 104-year surge level estimates using data
of the ECMWF seasonal prediction system, Geophys. Res. Lett., 31, L17210,
10.1029/2004gl020610, 2004.Van Den Brink, H. W., Können, G. P., Opsteegh, J. D., Oldenborgh, G. J.
V., and Burgers, G.: Estimating return periods of extreme events from ECMWF
seasonal forecast ensembles, Int. J. Climatol., 25, 1345–1354,
10.1002/joc.1155, 2005.Van den Hurk, B., Meijgaard, E. V., Valk, P. D., Heeringen, K.-J. V., and
Gooijer, J.: Analysis of a compounding surge and precipitation event in the
Netherlands, Environ. Res. Lett., 10, 035001,
10.1088/1748-9326/10/3/035001, 2015.Wahl, T., Jain, S., Bender, J., Meyers, S. D., and Luther, M. E.: Increasing
risk of compound flooding from storm surge and rainfall for major US cities,
Nature Climate Change, 5, 1093–1097, 10.1038/nclimate2736, 2015.Widmann, M.: One-Dimensional CCA and SVD, and Their Relationship to
Regression Maps, J. Climate, 18, 2785–2792, 10.1175/jcli3424.1, 2005.Volpi, E. and Fiori, A.: Hydraulic structures subject to bivariate
hydrological loads: Return period, design, and risk assessment, Water Resour.
Res., 50, 885–897, 10.1002/2013wr014214, 2014.Zheng, F., Westra, S., and Sisson, S. A.: Quantifying the dependence between
extreme rainfall and storm surge in the coastal zone, J. Hydrol., 505,
172–187, 10.1016/j.jhydrol.2013.09.054, 2013.Zheng, F., Westra, S., Leonard, M., and Sisson, S. A.: Modeling dependence
between extreme rainfall and storm surge to estimate coastal flooding risk,
Water Resour. Res., 50, 2050–2071,
10.1002/2013wr014616, 2014.Zheng, F., Leonard, M., and Westra, S.: Efficient joint probability analysis
of flood risk, J. Hydroinform., 17, 584, 584–597,
10.2166/hydro.2015.052, 2015.