Joint return periods in hydrology : a critical and practical review focusing on synthetic design hydrograph estimation

Introduction Conclusions References


Introduction
One of the most important aims of hydrological studies is to provide design variables for diverse engineering projects.Recently, there is an increasing interest in, and need for, simultaneously considering multiple design variables, which are likely to be associated with each other.In hydrology and hydraulics, several applications including sewer systems, dams and flood risk mapping require the selection of storm or hydrograph attributes with a predefined return period.Standard hydrological design methods are mostly based on well-established univariate frequency analysis methods.Notwithstanding this, methods to describe hydrological phenomena involving multiple variables have recently been proposed, aiding the practitioners to estimate multivariate return periods.In literature, as will be described later on, several methods have evolved over the years.However, it is not clear how these methods compare to each other and which one is the most appropriate.
Recent developments in statistical hydrology have shown the great potential of copulas for the construction of multivariate cumulative distribution functions (CDFs) and for carrying out a multivariate frequency analysis (Favre et al., 2004;Salvadori, 2004;Salvadori andDe Michele, 2004, 2007;Salvadori et al., 2007Salvadori et al., , 2011;;Genest and Favre, 2007;Vandenberghe et al., 2011).Copulas are functions that combine several univariate marginal cumulative distribution functions into their joint cumulative distribution function.Copulas describe the dependence structure between random variables and allow for the calculation of joint probabilities, independently of the marginal behaviour of the involved variables.For more theoretical details, we refer to Sklar (1959) and Nelsen (2006).Several studies have been dedicated to the frequency analysis of multivariate hydrological phenomena such as storms and floods, often within the context of design.However, limited applications have been developed with more than two variables (e.g., Vandenberghe et al., 2010;Pinya et al., 2009;Kao andGovindaraju, 2008, 2007;Genest et al., 2007;Serinaldi and Grimaldi, 2007;Zhang and Singh, 2007;Grimaldi and Serinaldi, 2006a,b).For a complete and continuously updated list of papers about copula applications in hydrology see the website of the International Commission on Statistical Hydrology of International Association of Hydrological Sciences.
Multivariate frequency analysis is becoming more and more widespread and several papers provide insight into generalizations of the univariate case and into new definitions of the multivariate return period (see e.g., Salvadori et al., 2011;Salvadori and De Michele, 2004;Shiau, 2003;Yue and Rasmussen, 2002).Since some of the proposed approaches are in contradiction and others are introduced within specific Figures contexts, there exists a need to clarify the definitions provided so far and to highlight their differences.In this paper, the construction of multivariate distribution functions based on paircopulas (Aas et al., 2009) is first briefly introduced (Sect.2.2) followed by an overview of several methods for defining joint return periods (JRPs,Sect. 3).Subsequently, a case study focusing on the selection of a design hydrograph is presented, which will serve as a test case for evaluating the differences between the joint return period definitions.
Section 4 provides all details on the practical context of this case study.Then, in Sect.5, extreme discharge events are selected and their most important variables such as peak discharge, volume and duration are analyzed, as they form the basis of the analysis.
Finally, Sect.6 deals with evaluating the performance and differences between several JRP methods in quantifying design hydrograph characteristics and highlights important issues for practitioners concerned with multivariate frequency analyses in hydrology.

Choice of construction method
Most of the copula-based research in hydrology addresses the application of twodimensional copulas, for which several fitting and evaluation criteria are becoming more and more widespread.In contrast, the use of multi-dimensional copulas remains a more challenging task.Only a few hydrological studies address this issue and almost always face severe (practical) drawbacks of the available high-dimensional copula families.Most work has been done in the trivariate analysis of rainfall (e.g., Zhang and Singh, 2007;Kao and Govindaraju, 2008;Salvadori and De Michele, 2006;Grimaldi and Serinaldi, 2006b), floods (e.g., Serinaldi and Grimaldi, 2007;Genest et al., 2007) and droughts (e.g., Song and Singh, 2010;Wong et al., 2010).Recently, a flexible construction method for high-dimensional copulas, based on the mixing of (conditional) two-dimensional copulas, has been introduced and has been shown to have a large Figures potential for hydrological applications.In literature, this construction is known as the pair-copula or vine-copula construction (see e.g., Kurowicka and Cooke, 2007;Aas et al., 2009;Aas and Berg, 2009;Hobaek Haff et al., 2010).The underlying theory for the vine-copula construction is described in Bedford andCooke (2001, 2002).This construction method originates from work presented by Joe (1997) on which also the method of "conditional mixtures", as applied by De Michele et al. (2007), is based.In this paper, the vine-copula method will be used to construct the three-dimensional copula for peak discharge Q p , volume V p and duration D. The construction and fitting is discussed in the next section.

Construction of a 3-D copula
In this paper, the focus will be on a three-dimensional vine-copula joining the three marginal distributions of three random variables X , Y and Z.In general, the approach can be extended to any number of dimensions, although limitations may be introduced by the computational power and data available.In the following, we assume that the samples of all three variables have each been transformed using the following rankorder-transformation S in order to obtain the marginal empirical distribution functions: where n denotes the number of observations for the given variable.We denote the transformed variables by U, V and W so that all three variables are now approximately uniformly distributed on [0, 1].
The basic idea of vine-copulas is to construct high-dimensional copulas based on a stagewise mixing of (conditional) bivariate copulas.This corresponds to decomposing the full density function into a product of low-dimensional density functions.At the base of the construction all relevant pairwise dependences are modelled with bivariate cop- is called a canonical or C-vine.If all mutual dependences are considered one after the other, i.e. the first with the second one, the third with the fourth one, etc., this is called a D-vine.C-and D-vines are special cases of regular vines, the latter being all possible pairwise decompositions.In the three-dimensional case there is no difference between a C-or a D-vine, only the ordering of variables can be changed.
Figure 1 illustrates the concept for constructing a three-dimensional vine-copula.In the first tree, three variables U, V , W are given, and their pairwise dependences are captured by the bivariate copulas C UV and C V W .These bivariate copulas can be conditioned for a specific value of the second variable through partial differentiation (Aas et al., 2009).This conditioning is indicated by dashed arrows and results in the conditional cumulative distribution functions F U|V and F W |V (see Eq. 2).
In the second tree, the two conditional CDF values are calculated for all triplets (u i , v i , w i ).These "conditioned observations", which are again approximately uniformly distributed on [0, 1], are then used to fit another bivariate copula C UW |V to.This copula can also be conditioned through partial differentiation to F U|V W .This conditional CDF will be of use for simulation purposes (Aas et al., 2009).The full density function c UV W of the three-dimensional copula is thus given by: It should be noted that the choice of the conditioning variable (i.e.V ) is not unique and different choices might lead to different results.In this paper, the ordering of variables is based on the two bivariate copulas C UV and C V W that fitted best considering the investigated copula families.Thus, in order to derive the building blocks of the three-dimensional copula, three bivariate copulas C UV , C V W and C UW |V need to be fitted.This is done stagewise and one can choose any of the available methods in literature.Here, each bivariate copula is fitted by means of the maximum likelihood method, considering different copula families.The best fit is determined by the highest log-likelihood value (see Sect. 5.3).
Several goodness-of-fit tests can be considered to validate the fitted bivariate copulas.In this paper, the chosen goodness-of-fit test is the A7 approach appearing in Berg (2009) and originating from Panchenko (2005).The advantage of this approach is that it estimates the distance between the two multivariate distribution functions without the need of any explicit dimension reduction, i.e. it is directly based on a comparison of observed pseudo-observations and simulated pseudo-observations under the null hypothesis.A bootstrap approach is taken to obtain the distribution of this test statistic under the null hypothesis.The original procedure as proposed by Berg (2009) is slightly altered in this paper as the test statistic of the hypothesis is averaged over the same number of simulations that are conducted during the bootstrap.A p-value estimate is derived from the fraction of test statistics exceeding this mean test statistic.
Combining the bivariate copulas as in Eq. (3) and substituting the marginal distribution functions F X , F Y and F Z yields the three-dimensional distribution function of (X , Y , Z).Let f X , f Y and f Z denote the marginal density functions and define The full density function f X Y Z of the distribution for any triplet (x, y, z) is then given by: The estimations in this paper have been done using R (R Development Core Team, 2011), a free software environment for statistical computing, and the package spcopula1 building on the packages copula (Kojadinovic and Yan, 2010) and CDVine (Schepsmeier and Brechmann, 2011).The R-scripts are available upon request from the authors.A demo related to this paper will be available in the spcopula package.In literature, different ways for obtaining design events for a given design return period exist.The following sections provide a short overview of the most popular ways to define joint return periods, focusing on how the design event for a given return period should be calculated.An ensemble-based design approach, in contrast to a single design event, will also be introduced.

JRP based on a regression analysis
A first method is based on a univariate frequency analysis.First, the driving variable X , i.e. the variable with a prominent role in the design, is chosen.Then a design return period T is fixed, and given the marginal cumulative distribution of the design variable F X (x) the corresponding design quantile x T is sought, based on Eq. ( 5), with µ T the mean interarrival time (yr).In the case of yearly maxima, µ T equals 1 yr.Then, based on a linear regression of X with the other design variable Y , the second design value y T is obtained (see Eq. 6).This approach is taken by e.g., Grimaldi et al. (2012c): and with a and b constants.

JRP based on a bivariate conditional distribution
A second method consists in conditioning the bivariate cumulative distribution function (CDF) F X Y (x, y) for the design quantile x T corresponding to the chosen univariate design return period T (see Eq. F Y |X (y|x = x T ) can then be used to calculate, similarly as in Eq. ( 5), the value y T for the same univariate design return period T .Advantage will be taken of the bivariate copula C X Y (x, y) to perform the calculation.With u T = F X (x T ) and v T = F Y (y T ) the procedure can be expressed as follows.We can rewrite the initial definition in terms of a copula with U := F X (X ) and Inverse transformation yields: It should be noted that this method does not result in a real bivariate design event having a joint return period in the strict sense.The bivariate distribution is conditioned for the quantile of interest to the practitioner (corresponding with a univariate return period).This conditioned distribution is then used to obtain the other quantile, again based on the principles of a univariate return period.Therefore, the two obtained design quantiles x T and y T should not be considered as a real joint design event.

JRP based on a bivariate joint distribution
Instead of using a conditional CDF, a widely used method to calculate a bivariate return period can be followed which exploits the full bivariate CDF F X Y (x, y).This can easily Figures be expressed by means of a bivariate copula C UV (u, v) with U := F X (X ) and V := F Y (Y ) as before: This method is in fact an intuitive extension of the definition of a univariate return period.All couples (u, v) that are at the same probability level t = C UV (u, v) of the copula will have the same bivariate return period.For a given design return period, the corresponding level t can easily be calculated, whereafter one design point (u T , v T ) at this level can be obtained by selecting the point with the largest joint probability: The corresponding design values x T and y T are easily calculated through the inverse CDFs:

JRP based on a copula's Kendall distribution function
Another definition of the bivariate return period is given by Salvadori and De Michele (2004); Salvadori (2004); Salvadori et al. (2007).Recently, the concept of this bivariate secondary return period was extended to a complete multidimensional setting by Salvadori et al. (2011), called "Kendall return period".This return period corresponds to the mean interarrival time of events more critical than the design event, the so-called "super-critical" or "dangerous" events.This partitioning of the probability distribution into a super-critical and non-critical region is based on the Kendall distribution function K C .This function is a univariate representation of multivariate information as it is the CDF of the copula values: K C (t) = P{C(u, v) ≤ t}.It allows for the calculation of the probability that a random point (u, v) in the unit square has a smaller (or larger) copula value than a given critical probability level t.Introduction

Conclusions References
Tables Figures

Back Close
Full The use of the Kendall distribution function to define the probability measure for calculating a JRP is advocated by Salvadori et al. (2011) as it is a theoretically sound multivariate approach sharing the notion of a critical layer, defined through the cumulative distribution function, with the univariate approach.The definition of the return period in both the univariate and in the multivariate Kendall approach is characterized by making a distinction between super-critical and non-critical events based on a critical cumulative probability level.The only way to extend this to a multivariate context is by using the Kendall distribution function.Probability measures that are constructed differently always entail events that will have a joint cumulative distribution function value that is larger or smaller than the critical probability level, and thus fail in subdividing the space between super-critical and non-critical events with respect to the joint cumulative distribution function.Following this avenue, any critical probability level t uniquely corresponds to a subdivision of the space into super-critical and non-critical regions.This is different from the copula approach mentioned before, where in general different choices of critical events from the same critical probability level t subdivide the space differently.
For any given copula of any dimension, the Kendall distribution function can be calculated either analytically (e.g., for Archimedean copulas) or estimated numerically, and can thus be used to calculate the Kendall return period.Until now, only a very limited number of studies actually applied this kind of return period (e.g., Vandenberghe et al., 2010).In the following sections, the procedure for the two-and three-dimensional cases is outlined.

Two-dimensional case
After choosing the design return period T , the corresponding probability level t of the copula can be calculated by means of the inverse of the two-dimensional Kendall distribution function (Eq.11).In 2-D this corresponds to finding an isoline on the copula.Introduction

Conclusions References
Tables Figures

Back Close
Full When no analytical expression for K C is available, the inverse can be calculated numerically based on an extensive simulation algorithm, described in Salvadori et al. (2011).
Once t is known, the most likely design event in the unit square (u T , v T ) is selected on the corresponding isoline in the same way as described by Eq. ( 9).Through the use of the inverse of the marginal CDFs the corresponding design event (x T , y T ) is then found.

Three-dimensional case
In three dimensions, the corresponding probability level t should be found again in the same way as in Eq. ( 11).To calculate the inverse of the function K C , one needs to rely on a numerical method as for instance described by Salvadori et al. (2011).However, in contrast to the two-dimensional case, the probability level t corresponds to an isosurface, i.e. all triplets (u, v, w) on this surface have the same copula value t.
Generally, for an n-dimensional copula an n−1 dimensional isohypersurface exists that contains all n-dimensional points with the same copula level t.A single design event (u T , v T , w T ) should again be selected on this isosurface.Therefore the point (u, v, w) with the highest likelihood is selected.In fact this is the three-dimensional extension of the approach given in Eq. ( 9), i.e.:

Conclusions References
Tables Figures

Back Close
Full

JRP and ensembles of design events
From Sects.3.3 and 3.4 it should be clear that for a design event characterized by several variables, one has to select an event out of a range of events which all share the same JRP.The selection of merely one event sensibly reduces the amount of information that can be obtained by the multivariate approach chosen.The importance of an ensemble-based approach has already been stressed by Salvadori et al. (2011).Vandenberghe et al. (2010) provided a first attempt to benefit from the richness of an ensemble of critical values in a practical context.In this section, a new approach is introduced to obtain an ensemble of statistically similar design events, i.e. that have the same JRP, but that could affect the design in different ways.
Consider first the bivariate case, in which the JRP methods based on copulas (Sect.3.3) and based on the Kendall distribution function (Sect.3.4.1)result in the finding of a contour level t = C(u, v) on which all pairs (u, v) have the same JRP.
Instead of using Eq. ( 9) to select the most likely point, the full likelihood function f X Y over the t-isoline could be seen as a univariate density function (PDF) out of which an ensemble of pairs can be sampled.Generally, not all pairs (u, v) on the t-isoline have the same likelihood, i.e. pairs on the edges are less likely than pairs closer to the center of the isoline.In this way, sampling according to f X Y makes more sense from a practical point of view than uniformly sampling over the isoline (as done by Vandenberghe et al., 2010).
Eventually, one will end up with an ensemble of (u i , v i )-pairs (with i ranging from 1 to n, the ensemble size).By means of the inverse marginal CDFs, these pairs are easily transformed to real values.This ensemble could then be used to run simulations from which detailed information on the uncertainty of specific design parameters can be assessed.As an example, one could route an ensemble of 1000 pairs of peak discharge and volume through a dam model and consider the water height in the reservoir.Using just one design event, only one single water height is obtained.However, using the ensemble, information on the range and likelihood of possible water heights for the Figures given design JRP is obtained, making it possible to incorporate the uncertainty in the design.
In the trivariate case (see Sect. 3.4.2) no isoline is obtained but an isosurface.Similar to the two-dimensional case, the full likelihood over this isosurface could be seen as a bivariate density function out of which an ensemble of triplets can be sampled.The higher the dimensionality of the design problem, the more advantageous the ensemble approach becomes: in three dimensions more information is lost than in two dimensions by selecting just one design event.The drawback of the ensemble approach is the increasing need for run time when higher dimensionalities are considered.

Experimental set-up
In order to provide an exhaustive comparison of the joint return period estimation methods described in the previous sections, a simulation experiment is set up and analyzed with respect to the synthetic design hydrograph (SDH) attributes.
The SDH is defined as a hydrograph with an assigned return period, which can be characterized by random variables such as the flood peak Q p , the volume V p and the duration D. Specifically, given an observed or simulated runoff time series from which a set of extreme hydrographs is selected, one can determine the SDH shape in several ways (see Serinaldi and Grimaldi (2011) and references therein).In a two-dimensional setup, two hydrograph parameters (peak-volume, peak-duration or volume-duration) should be fixed, while the third one is obtained from the chosen hydrograph shape distribution.In a three-dimensional set-up, the three characteristic parameters are obtained jointly.
In most common hydrological applications the interest is in the flood peak (Q p ) and volume (V p ). Consequently, the two-dimensional analyses in this paper focus on these variables.Serinaldi and Grimaldi (2011)  values for Q p and V p .They apply a univariate statistical analysis on Q p to obtain the design value for the peak discharge with a given return period, and then define the corresponding design volume through a linear regression analysis between Q p and V p .This approach is equivalent to the method described in Sect.3.1.However, as described in Sect.3, there are several other approaches that lead to the design values for Q p and V p , including a three-dimensional approach.
The case study proposed in this paper consists of applying a continuous simulation model on a small, ungauged basin for which 500 yr of synthetic direct runoff time series at a 5 min resolution are simulated.From this series, the 500 maximum annual peaks are selected together with their corresponding hydrograph (identified as the continuous sequence of non-zero direct discharge values including the annual peak).Note that as direct discharge is considered, a zero discharge value does not imply a dry river.Consequently, 500 (Q p , D, V p ) triplets are available to which the described JRP methods are applied.By considering a real case study, the obtained differences can be evaluated in a practical context.In order to simulate the 500 yr runoff time series, the COSMO4SUB model, described in the following section, is applied.

The COSMO4SUB framework
The synthetic data set on which the previously described JRP estimation methods are applied is obtained through the use of the COSMO4SUB framework (Grimaldi et al., 2012d,c).COSMO4SUB is a continuous model which allows the simulation of synthetic direct runoff time series using minimal input information from rainfall data and digital terrain support.Specifically, the watershed digital elevation model (DEM) with a standard resolution used in hydrological modelling, the soil use and type, daily (preferably at least 30 yr long) and sub-daily (preferably at least 5 yr long) rainfall observations are the only data necessary to run the model.COSMO4SUB includes three modules: a rainfall time series simulator, a rainfall excess scheme, and a geomorphological rainfall-runoff model.Next, the general principles are explained and in Sect.The first module is based on a single-site copula-based daily rainfall generator (Serinaldi, 2009) and on the continuous-in-scale universal multifractal model (Schertzer and Lovejoy, 1987) for disaggregating the daily rainfall to the desired time scale (up to 5 min).The parameters included in this first module (six for each month for the daily rainfall simulator and three for the disaggregation model) are calibrated on the basis of the available rainfall observations (at two different scales).
The second module is related to the rainfall excess step.A new mixed Green Ampt-Curve Number (CN4GA Curve Number for Green Ampt) procedure was recently proposed (Grimaldi et al., 2012b) and included in the present version of the COSMO4SUB framework.The key concept is to use the initial abstraction (i.e.all the losses due to initial saturation, filling terrain gaps, interception, etc.) and the total SCS-CN excess rainfall volume to estimate the effective saturated hydraulic conductivity and the ponding time of the Green-Ampt model.Consequently, the CN4GA approach tries to appropriately distribute the volume estimated by the SCS-CN method over time.This module is characterized by five parameters (specified in Sect.5.1) which are empirically assigned using the soil use and soil type map information.In addition, the event separation time (T s ) is included in this module since the continuous implementation of the SCS-CN method requires to fix a no-rain time interval for which the cumulative gross and excess precipitation can be reset to zero.As shown in Grimaldi et al. (2012d,c), this parameter has a limited influence on the final results and the value can be arbitrarily assigned in the range of 12-36 h.
The third module allows to carry out a continuous convolution of the rainfall excess for obtaining the direct runoff time series through an advanced version of the Width Function Instantaneous Unit Hydrograph (WFIUH).The adopted model, named WFIUH-1par (Grimaldi et al., 2010(Grimaldi et al., , 2012a)), identifies the watershed IUH through the topographic information and needs only one parameter that can be quantified referring to the watershed concentration time (T c ), estimated using empirical equations.Following the application of the three described modules a continuous runoff scenario is obtained from which maximum annual hydrographs are selected.Introduction

Conclusions References
Tables Figures

Back Close
Full

Model set-up
In order to provide a realistic scenario that can be used to evaluate the different JRP methods, the COSMO4SUB model was applied on the Torbido River, a small tributary of the Tiber River located in Central Italy (watershed area: 61.67 km 2 ).Basin elevations range from 85 to 625 m, the average slope is 22 % and the maximum distance between divide and outlet is 25.8 km.The watershed DEM at a 20-m spatial resolution was provided by the Italian Geographic Military Institute (IGMI, 2003), while land cover was extracted from the CORINE database (EEA, 2000).Observed rainfall data, useful for calibrating the two-stage rainfall simulator parameters, are available from the Castel Cellesi rain gauge station for a period of 49 yr at a daily time scale and for a period of 10 yr at a 5-min resolution (Serinaldi, 2009(Serinaldi, , 2010)).
For a description and evaluation of the 500-yr rainfall synthetic time series, we refer to Grimaldi et al. (2012c).
As mentioned in Sect.4.2, the excess rainfall time series was obtained by applying the CN4GA method and setting the percentage of potential retention that forms the initial abstraction equal to 20 % (λ = 0.2) and the Curve Number (CN) equal to 80 (estimated from NRCS (2008); Chapter 9, 2-3 pp.).Concerning Green-Ampt parameters, the following values were adopted based on the sandy loam average soil condition (EEA, 2000): 110 mm for the moving wetting front (h f ), 0.1 for the initial value of soil-water content (θ i ) and 0.5 for the field saturated soil-water content (θ s ); for the saturated hydraulic conductivity (K s ), an initial value of 10 mm h −1 was chosen and later automatically calibrated by the model (Rawls et al., 1982(Rawls et al., , 1983)).
The event separation time T s is set to 36 h based on the physical hypothesis that in this region, the initial abstraction phenomena return to the pre-event condition in 1.5 days.The WFIUH-1par and the convolution of the excess rainfall series were computed as described by Grimaldi et al. (2010Grimaldi et al. ( , 2012c,b),b).The only calibration parameter (the Introduction

Conclusions References
Tables Figures

Back Close
Full channel velocity) was estimated based on the concentration time T c = 4.5 h, which was quantified through an in Italy widely used empirical formula (Giandotti, 1934).

Annual extreme discharge events
Once the 500-yr synthetic direct runoff time series is determined, as described in Sect.4.1, the 500 maximum annual events are selected and characterized by their peak discharge Q p , volume V p and duration D. For only six years the model provides a zero direct runoff, which is reasonable considering the limited size of the watershed.These values are excluded in the following analyses.
In first instance, the marginal distribution functions of Q p , V p and D need to be estimated.As these variables are annual extreme values selected from the 500 yr discharge series, the fit of several extreme value distributions is considered, i.e. the exponential, the Weibull and the Generalised Extreme Value (GEV) distribution functions.These distributions are, respectively, a one, two and three parameter distribution, allowing for various degrees of model complexity.Furthermore, the GEV distribution generally encompasses three different distributions, namely the Fr échet, the (reversed) Weibull and Gumbel distributions either directly, or through a transformation as in the case of the Weibull distribution which corresponds to a reversed Weibull distribution.These different distribution types each represent a different kind of tail behaviour, namely a light tail (Gumbel), a heavy tail with infinite variance (Fr échet) and a bounded upper tail (Weibull).These behaviours can be separated based on the shape parameter ξ of the GEV.Furthermore, the Weibull distribution is fitted separately as well, as it only corresponds to a GEV distribution after transformation.Finally, most extreme value distributions are of the exponential type, and cannot deal with an offset, i.e. when the smallest value of the variable in the CDF is larger than zero.However, as a result of censoring the zeros, the smallest value of the variables tends to be significantly higher than zero, leading to poor fits of the CDF.Therefore, the distributions are fitted to the data after substraction of the minimum value to ensure a proper fit in the tails.Introduction

Conclusions References
Tables Figures

Back Close
Full A first test to ascertain the appropriate distribution for the three marginal variables is to display the empirical CDFs together with the directly fitted distribution.This is shown in Fig. 2, in which only the upper tail of the CDF is shown, i.e. the interval [0.80, 1] as the focus is on the extremes.It can immediately be seen that not all the distributions fit these tails equally well.This is corroborated by the Akaike Information Criterion (AIC) computed for all different models, shown in Table 1, as well as the log-likelihood of each model (not shown).Based on these criteria, we would select the Weibull distribution for Q p , the exponential distribution for V p and the GEV for D. Despite this, the GEV does not fit D well, even though various statistics showing it as the best fit.Seemingly, the lower portion of the curve is reasonably well approximated by a GEV, resulting in the significance of its fit, despite the poor representation of the tail.As the focal point of this study is the tail, we chose to select the exponential distribution because of its better fit in this region.More in-depth testing through Q-Q plots (not shown here) indicates that this is indeed a better approximation of the distribution, despite its non-significance.Hence, the following models are selected: Here, the Anderson-Darling test was used to determine whether the samples were significantly different from the fitted distributions.As mentioned earlier, only the duration was significantly different from its distribution.It should be understood that a consistency in marginal distribution functions across the different JRP estimation methods is far more important for comparison reasons than a perfect fit, considering the underlying data are simulations.
To analyse the association between the variables, which will be modelled by means of copulas, Kendall's tau is calculated and normalized rank scatterplots are evaluated for each pair of variables (Fig. 3).Evidently, there are strong positive associations.Also, some ties are present, especially for D, but they will be neglected in the further analyses.The next section deals with the modelling of these associations.

Fitting of the 2-D and 3-D copulas
As described in Sect.2.2, we use maximum likelihood estimation to fit a copula from each investigated family for every pair of variables and selected the best fitting one by the highest log-likelihood value.The copula families investigated include Normal, Student, Gumbel, Frank, Clayton, BB1, BB6, BB7, BB8 and the survival copulas of the 4 latter ones (details on all these families can be found in Nelsen, 2006;Joe, 1997).
Table 2 gives an overview of the parameters and goodness-of-fit results.The p-values are estimated from 1000 iterations each.
The following JRP methods in the two-dimensional case make use of the fitted BB7 copula C 13 which models the dependence between Q p and V p .It should be noted that this copula is not able to represent the boundary effect present in the rank-scatter plot (Fig. 3).As the BB7 copula family belongs to the class of Archimedean copulas, its Kendall distribution function can easily be obtained analytically.For the JRP method in the three-dimensional case, the three fitted bivariate copulas C 12 , C 23 and C 13|2 are then composed into the three-dimensional vine-copula as given in Eq. ( 3).For comparison purposes, three-dimensional copula fits for the 3 parameter Gaussian copula and the one parameter Clayton, Frank and Gumbel copulas have also been performed.The log-likelihoods show a 10 % increase for the fitted vine-copula (1047) with respect to the Gaussian one (935), while the three one-parameter Archimedean copulas have far smaller values (432-532).Thus, the vine-copula yields the best fit within this set of copula families, i.e. a Gaussian distribution is outperformed by our approach.As no closed form exists for the cumulative distribution function of this vine-copula, a numerical evaluation based on a sample of 100 000 points was carried out in order to be able to calculate the (inverse of the) Kendall distribution function.6 Results and discussion

Calculation of single design events
In this section, the design values for the SDH with a design return period of T = 100 yr are calculated based on the 2-D and 3-D JRP methods presented in Sect.3. The triplet (Q p , D, V p ) is considered for which the following transformations hold: As a reference, the univariate case is first analyzed.Figure 4 shows the derivation of the design quantiles for Q p , D and V p based on the inverse of the CDFs F Q p , F D and 99. Design values q p,T = 302.9m 3 s −1 , d T = 28.37 h and v p,T = 4.42 × 10 6 m 3 are obtained.In the following, Table 3 and Fig. 9 provide a way to compare these and all further derived design events.In order to be able to compare design events with the data, the simulated pairs (Q p , V p ) are visualized as gray dots in Fig. 9 that summarizes all described methods.First the two-dimensional case is considered, in which the focus is on the couple (Q p , V p ).In the regression-based JRP method (Sect.3.1) the starting point is the univariately derived quantile q p,T , being usually the driving variable in many hydrological applications (see Eq. 5).Based on a linear regression between Q p and V p , as shown in Fig. 5, the design volume v p,T is easily estimated as 3.89 × 10 6 m 3 .This volume is lower than the one obtained by a purely univariate analysis.
The second two-dimensional method is the JRP method based on the conditional copula (Sect.(w), the design volume v p,T is calculated as 6.13 × 10 6 m 3 , which is considerably larger than the former design volumes.
The JRP method based on the bivariate copula C UW (Sect.3.3) is the third 2-D approach.For T = 100 yr, the corresponding copula level t equals 0.99 and corresponds with an isoline.Using the marginal CDFs for Q p and V p Eq. ( 9) can be solved to find the point (u T , w T ) with the highest likelihood, i.e. (0.9927,0.9926).Using the inverse CDFs the design event is obtained: (q p,T , v p,T ) = (319.4m 3 s −1 , 4.71×10 6 m 3 ).Both the design peak discharge and the volume are larger than what is obtained in the univariate case.
The last two-dimensional approach is the one in which the JRP is calculated using the Kendall distribution function (Sect.3.4.1).Here, the focus is on the inverse of K C for a probability level of 0.99: t = K −1 c (0.99). Figure 7 shows the Kendall distribution function of the bivariate copula C UW , which allows to calculate the t-level corresponding to a cumulative probability of 0.99, i.e. t = 0.9823.This level is smaller than the one obtained in the former copula-based JRP method.Again, Eq. ( 9) can be solved to obtain the most likely design event (u T , w T ) = (0.9871, 0.9870).Transformation to the real domain by means of the inverse CDFs results in the design event (q p,T , v p,T ) = (290 m 3 s −1 , 4.16 × 10 6 m 3 ).
Besides the estimation of two-dimensional design events, also one method for estimating a three-dimensional design event is available.To do so, one should rely on the JRP method presented in Sect.3.4.2together with the fitted three-dimensional vine-copula (see Sect. 5.3).The three-dimensional vine-copula is used for simulating 100 000 triplets (u, v, w) as a basis for the numerical inversion of the Kendall distribution function.The latter is shown in Fig. 8: the probability level of 0.99 corresponds to a tlevel of 0.9509 on the three-dimensional vine-copula.In contrast to the two-dimensional methods, this t-level corresponds to a surface.Using the marginal CDFs in combination with Eq. ( 12), the most likely point on this surface having a joint cumulative probability density of 0.9509 is found as (u T , v T , w T ) = (0.9752, 0.9692, 0.9834).Using the inverse CDFs this results in the design event (q p,T , d T , v p,T ) = (254 m 3 s −1 , 22.4 h, 3.93×10 6 m 3 ).Introduction

Conclusions References
Tables Figures

Back Close
Full Note that the Kendall distribution function is a univariate representation of multivariate information and that its form is different in the two-dimensional and three-dimensional cases (Fig. 7 vs. 8).

Obtaining an ensemble of design events
The preceding analyses resulted in a single design event, however, as stated in Sect.3.5 the generation of an ensemble would be preferable.For example, consider the JRP method based on the Kendall distribution function in the two-dimensional case.The t-level was found to be 0.9823 for a JRP of 100 yr (see Table 3).Figure 10 shows this t-level, together with the earlier identified most likely design event (u T , w T ).However, along this contour the occurrence of several other events is possible.
A sampling across this contour according to the full likelihood function results in an ensemble of events all having a JRP equal to 100 yr.An ensemble size of 500 is chosen here.By means of the inverse marginal CDFs, this ensemble of (u, w) pairs can be transformed into (q p , v p ) pairs.The result is shown in Fig. 11.All sampled events clearly lie on a contour, corresponding with the t-contour.According to the greyscale, the highest density of design events is sampled around the most likely realization whereas less design events are sampled on the two outer limits of the contour.
The density of the ensemble across this contour could be projected (and normalized) on both the Q p and V p axis, resulting in univariate PDFs for Q p and V p in the ensemble.These are shown in Figs. 12 and 13.The most likely design event (q p,T , v p,T ) = (290 m 3 s −1 , 4.16 × 10 6 m 3 ) (see Table 3) is naturally situated at the maximum of these PDFs.In general, these conditional distributions do not have to be bounded and extremely large events might possess a positive likelihood.These PDFs hold a lot of information on the design events.For example, 90 % of all design events with JRP equal to 100 yr have a volume in the range of calculate another design variable, such as the water height in a reservoir, for which again a PDF of possible design values can easily be obtained.However, this exercise is beyond the scope of this paper.

Some practical considerations
Table 3 and Fig. 9 clearly demonstrate that the choice of the JRP method influences the parameters of the design events.This evidently is something the practitioner should consider when designing a structure, e.g., a dam, based on a specific design hydrograph, as it directly influences the safety and the cost of the structure to be built.With respect to the univariate design quantile q p T , only the copula-based JRP method provides a larger quantile, whereas for the JRP methods based on the Kendall distribution function a smaller quantile is found.The other JRP methods use the univariate quantile as a starting point, so these quantiles are the same.Considering the design quantile v p,T , all JRP methods give a smaller quantile than the univariate quantile, except for the methods based on the (conditional) copula.
In fact, the presented JRP methods include the dependence between the design variables in two different ways: both by means of a linear regression, and by means of copulas.Nowadays, the use of copulas, which are able to model non-linear dependences, is preferred above the more simplistic linear regression, so practitioners should rely less on the latter method for JRP calculations.The remaining JRP methods all use copula-based multivariate information.The method using the conditional copula is in on the copula solely.For a full discussion, please refer to Salvadori et al. (2011).In this respect, it is good to rely on the Kendall-based methods from a theoretical point of view, as they are closest related to the well-known univariate methods.However, at this stage of research no clear practical guidelines on which method to use can be made and one may also prefer the copula-based method.The most important is to be aware of the practical implications of the method chosen (outlined in this paper).
It is also evident that the more variables are included (2-D vs. 3-D), the smaller the design quantiles become as more complexity of the process of generating extremes is captured.
Furthermore, the issue of selecting just one design event out of a range of events all having the same joint return period (i.e. on an isoline or isosurface of the copula) could be seen as a drawback of the multivariate JRP methods available in literature, as the most likely event does not necessarily correspond to the most severe one for a given structure.However, there is the full potential to set a step aside from this "one-eventdesign" approach to a full ensemble-based design approach.Therefore this paper proposed a method for the generation of a design ensemble.Assuming the 2-D Kendall approach as most appropriate for defining the JRP, it is clear that the ensemble approach provides a lot more information on the possible outcome of design events.The proposed ensemble-based approach entails the most likely design event, but furthermore provides a clear idea on the probability that other events (but all having the same JRP) will occur.Checking these ensembles against the desired design of the dam will illustrate the real threat to the structure.It therefore provides a way of assessing uncertainty related to designing for a specific JRP.It should also be noted that the fitting of the copulas (bivariate, trivariate or multivariate) is a very important part of the JRP estimation.If the practitioner is not acquainted with this initial aspect of design studies, it is very easy to make wrong choices.The authors of this work believe that the vine-copula approach is the way to go for constructing flexible multivariate distribution functions, as it enables to use more widely spread bivariate copulas as building blocks for more complex multivariate distribution Introduction

Conclusions References
Tables Figures

Back Close
Full functions.Of course, a good balance between the number of variables considered and the (numerical) complexity of the vine-copula should be sought, keeping in mind that all this also affects the eventual design.Finally, a practitioner should also be very aware of the fact that the JRP methods based on the Kendall distribution function as presented here are only valid for variables that are positively associated, and with a focus on extremes in terms of large values.In all other cases, adaptations should be made in order to operate in the right "area" of the copula.Further applications of this JRP method in other case studies should provide more insight on this in the near future.

Conclusions
The aim of this study was to provide an overview of state-of-the-art methods for joint return period (JRP) estimation and to discuss their differences in a practical context.Therefore, an in-depth case study focusing on the estimation of design parameters for a synthetic design hydrograph (SDH) was considered.As they are the most important SDH variables, the peak discharge Q p , the duration D and the volume V p were chosen.
In first instance, a review of several JRP methods available in recent literature was provided focusing on how to apply these methods.As multiple variables were considered in the JRP methods, an important aspect is (the modelling of) the dependence between variables.In this context, the potential and use of copulas for the construction of multivariate distribution functions was stressed and illustrated.On the one hand a bivariate copula of (Q p , V p ) was fitted.On the other hand, also the fitting of the trivariate copula of (Q p , D, V p ) was elaborated in a comprehensive way by means of the promising pair-copula or vine-copula method.
Eventually, design events for a 100-yr joint return period were obtained considering a 2-D linear regression based, a 2-D conditional copula-based, a 2-D copula-based, purposes.Differences in design quantiles were discussed while also the theoretical appropriateness was explained.This paper warns practitioners for blind use of just one available JRP estimation method, and stresses the importance of good copula fitting and the effect on the eventual design event outcome.Based on the available literature and the case study in this paper, the JRP method based on the Kendall distribution function is probably the most valuable in a multivariate context, when applied correctly.
For constructing multivariate copulas, the vine-copula method is advised.Further (joint) research efforts should focus on a shift from one-design-event methods to ensemble-design-event methods, enabling to incorporate uncertainty in the design.A first valuable approach to this ensemble-based design was provided in this paper.The ultimate goal should be the elaboration of a useful and understandable framework for multivariate frequency analyses, with clear guidelines to practitioners.Full  Full  Full  The values are rounded to address the limited numerical precision and ease comparison.Full umber of dimensions, although limitations may be introduced by the computational power and data vailable.In the following, we assume that the samples of all three variables have each been transrmed using the following rank-order-transformation S in order to obtain the marginal empirical istribution functions: Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | ulas.If all mutual dependences are with respect to the same variable, the construction Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | 5).The resulting (univariate) conditional CDF 6789 Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | T Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | illustrate a simple approach to obtain design Figures Back Close Full Screen / Esc Printer-friendly Version Interactive Discussion Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | 5.1 specific details of the calibration are presented.Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Weibull (Anderson-Darling p = 0.562), -V p : Exponential (Anderson-Darling p = 0.295), -D: Exponential (Anderson-Darling p = 0.0443).
Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | 3.2).The conditioning of the bivariate copula C UW (denoted as C 13 in Sect.5.3) for u = 0.99 results in the function C W |U (w, u = 0.99) as shown in Fig.6.The value of w = 0.9983 corresponds with a probability level of 0.99.By means of Discussion Paper | Discussion Paper | Discussion Paper | [3.98 × 10 6 m 3 , 5.45 × 10 6 m 3 ] and a peak discharge in [279 m 3 s −1 , 358 m 3 s −1 ].Note from Fig. 11 that lower volumes occur together with higher peak discharge values and vice versa.As briefly mentioned in Sect.3.5, the ensemble of design events can also be used to Introduction Discussion Paper | Discussion Paper | Discussion Paper | principle different from the other three methods: the bivariate distribution function is conditioned for a univariate quantile which results in a univariate distribution function of the other design variable.Therefore a relatively larger design volume is found, compared to the standard univariate method.The three other JRP methods use information of the full bivariate (copula and 2-D Kendall method) or trivariate (3-D Kendall method) distribution function.The Kendall function based methods have the advantage of using a mathematically consistent way of defining the probability of extremes or dangerous events relying on the CDF as in the univariate approach, unlike the JRP method based Introduction Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | a 2-D Kendall distribution function-based and a 3-D Kendall distribution function-based method.The traditional 1-D method is considered as a reference for comparison Introduction Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | Discussion Paper | ig. 1.The concept of the hierarchical nesting of bivariate copulas in the construction of a 3D copula.

Fig. 9 .
Fig. 9.An overview of the different design values for T = 100 yr obtained with the different methods.

Fig. 10 .Fig. 11 .Fig. 12 .Fig. 13 .
Fig. 10.t-level of the copula C UW which corresponds to a JRP of 100 yr when considering the 2-D Kendall distribution function method, with indication of the most likely event.

Table 1 .
The values of the AIC for the various distributions of the respective variables.

Table 2 .
An overview of the fitted bivariate copulas in the 2-D copula and 3-D vine-copula approach.

Table 3 .
Overview of the calculated design event for T = 100 yr, based on several methods.