Non-stationary Extreme Value Analysis : a simplified approach for Earth science applications

Statistical approaches to study extreme events require by definition long time series of data. The climate is subject to natural and anthropogenic variations at different temporal scales, leaving their footprint on the frequency and intensity of 10 climatic and hydrological extremes, therefore assumption of stationarity is violated and alternative methods to conventional stationary Extreme Value Analysis (EVA) need to be adopted. In this study we introduce the Transformed-Stationary (TS) methodology for non-stationary EVA. This approach consists in (i) transforming a non-stationary time series into a stationary one to which the stationary EVA theory can be applied; and (ii) reverse-transforming the result into a nonstationary extreme value distribution. As a transformation we propose and discuss a simple time-varying normalization of 15 the signal and show that it allows a comprehensive formulation of non stationary GEV/GPD models with constant shape parameter. A validation of the methodology is carried out on time series of significant wave height, residual water level, and river discharge, which show varying degrees of long-term and seasonal variability. The results from the proposed approach are comparable with the ones from (a) a stationary EVA on quasi-stationary slices of non stationary series and (b) the previously applied non stationary EVA approach. However, the proposed technique comes with advantages in both cases, as 20 in contrast to (a) it uses the whole time horizon of the series for the estimation of the extremes, allowing for a more accurate estimation of large return levels; and with respect to (b) it decouples the detection of non-stationary patterns from the fitting of the extreme values distribution. As a result the steps of the analysis are simplified and intermediate diagnostics are possible. In particular the transformation can be carried out by means of simple statistical techniques such as low-pass filters based on running mean and standard deviation, and the fitting procedure is a stationary one with a few degrees of freedom 25 and easy to implement and control. An open-source MATLAB toolbox has been developed to cover this methodology, available at https://bitbucket.org/menta78/tseva.

2 are usually associated to disasters and damages with relevant social and economic cost. A correct statistical evaluation of the strength of extreme events related to their average return period is crucial for impact assessment, for the evaluation of the risks affecting human lives and activities, and for planning actions connected to risk management and prevention (Jongman et al., 2014).
Often it is required to apply EVA to non-stationary time series, i.e. series with statistical properties varying in time due to 5 changes in the dynamic system. In particular, relevant climate changes are usually associated to variations in the statistical properties of time series of climatic variables. For example an intensification of the meridional thermal gradient at middle latitudes on global scale would lead to an increase of the climatic variability (e.g. Brierley and Fedorov, 2010) which would involve a reduction of the average return period of storms with a given strength. Consequently in the study of climate changes an accurate statistical estimation of middle-long term extremes is inherently connected to the application of non-10 stationary methodologies.
While a general theory about non stationary EVA has not yet been formulated (Coles, 2001) there are several studies describing methodologies for the estimation of time-varying extreme value distributions on non stationary time series, which rely on the pragmatic approach of using the standard extreme value theory as a basic model that can be enhanced by means of further statistical techniques (e.g. Coles, 2001;Davison and Smith, 1990;Husler, 1984;Leadbetter, 1983;Méndez et al., 15 2006).
An established technique consists in expressing the parameters of an extreme value distribution as time-varying parametric functions M of time, for some custom parameters (α i , β i , γ i …). By means of a fitting process such as the Maximum Likelihood Estimator (MLE) it is then possible to fit the values of (α i , β i , γ i …) to model the extremes of the non-stationary series. Appropriate implementations of such a methodology, hereinafter referred to as "established approach" and 20 abbreviated as EA, produce meaningful results, as proved by a number of contributions (e.g. Cheng et al., 2014;Gilleland and Katz, 2015;Izaguirre et al., 2011;Méndez et al., 2006;Menéndez et al., 2009;Mudersbach and Jensen, 2010;Russo et al., 2014;Sartini et al., 2015;Serafin and Ruggiero, 2014).
A drawback of this approach is that there is no general indication on how to formulate the function M. As a rule the model should be parsimonious, i.e. simpler models should be preferred. For this reason often several test formulations of M are 25 used together, and then the best model is chosen through a balance between high likelihood and low degrees of freedom, for example by means of the Akaike criterion (Akaike, 1973). Furthermore the choice of M depends on the statistical model we choose for the extreme value analysis: for example for the same series the M used for the Generalized Extreme Value (GEV) model is different from the M used for the Generalized Pareto Distribution (GPD) model. As in this approach the estimation of the time-varying properties of the series is incorporated into the fitting of the extreme value distribution, non-stationary 30 fitting methods are required despite being relatively complex to implement and control.
Another widespread approach to deal with non-stationary series is dividing them into quasi-stationary slices and applying the stationary theory to each slice (e.g. Vousdoukas et al., 2016). This technique will be hereinafter referred to as "stationary on slice" and abbreviated as SS. Although this technique allows to detect meaningful trends for short return periods, its use has the drawback of reducing the size of the sample used for the EVA, implying larger uncertainty in the estimation of long return periods.
This research aims to contribute to the field of non-stationary EVA by introducing the Transformed-Stationary extreme value methodology, hereinafter referred to as TS, which allows to decouple the study of the non stationary behavior of the series from the fit of the extreme value distribution. To this purpose it introduces a standard methodology to model the variations 5 of the statistical properties of the series.
In section 2.1 the TS methodology is discussed and outlined in a general and theoretic way, while section 2.2 describes its implementation. Section 3 is dedicated to the validation of the methodology, and section 4 illustrates a comparison with other widespread approaches for the EVA of non stationary series such as EA and SS for modeling time series characterized by seasonal cycles and time series showing long term trends. In section 5 the results are discussed and in section 6 some 10 conclusions are drawn.

2
Methods and data

Theoretical background
The TS methodology consists in three steps: transforming of a non-stationary time series y(t) into a stationary series x(t), performing a stationary Extreme Value Analysis (EVA), and back-transforming the resulting extreme value distribution into 15 a time dependent one.
The transformation we propose is: where (t) tr y is the trend of the series, i.e. a curve representing the long term, slowly varying tendency of the series, and (t) ca y is the long term, slowly varying amplitude of a confidence interval which represents the amplitude of the distribution of y(t) . In particular, if ) (t ca y is equal to the long term varying standard deviation ) (t std y of the series y(t) , Eq. (1) 20 reduces to a simple time-varying renormalization of the signal: In the rest of the manuscript for simplicity we will limit our analysis to Eq. (2), knowing that all the considerations can be easily extended to any time varying confidence interval ) (t ca y . Transformation (2) where ) , ( t y f is the transformation from y to x given by Eq. (1), and It is always possible to compute is a monotonically increasing function of y for every time t, being the standard deviation ) (t std y always positive.
Using Eqs. (3) and (5) in Eq. (4) we find . ) ( shape, scale and location parameters given by 15 Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-65, 2016 It can be shown that the time-dependent GEV parameters given by Eq. (7) which involves, considering for example the scale parameter x where the last step is possible because (t) std y does not depend on x  .
This means that if x is stationary, when the likelihood is maximum for it is maximum also for ) , ( t y p GY , and that applying an MLE to best fit the stationary parameters   x x   , coincides to applying a non-stationary MLE to best fit the parameters (a, b) of the parametric expression (8). The equivalence between the two methodologies suggests that the TS 5 approach is dual to EA, meaning that any implementation of EA is equivalent to an implementation of the TS approach for some transformation (see appendix A for a more detailed discussion). One can also prove that Eq. (1) allows a general TS formulation with constant shape parameter, i.e. all the TS models with a constant ε y can be connected to Eq. (1) (see appendix A). This last result is remarkable, because it shows that Eq. (1) is exhaustive for all the TS models with constant shape parameter. 10 The findings drawn above are general and can be applied also to Peak Over Threshold (POT) methodologies, because the GPD is formally derived from the GEV as the conditional probability that an observation beyond a given threshold u is greater than x. In particular, the POT/GPD parameters are given by can be derived.
It worth noting that the TS methodology is "neutral" for a stationary series, i.e., the application of this methodology to a stationary series leads to the same results as a stationary EVA with the same underlying statistical model. That is because in such case y tr and y std are constant, and transformation (2) reduces to a constant translation and scaling.

Modelling the seasonality 20
In general we would like to model the fact that extreme events vary with season, with a typical size of local winter extremes different from that of local summer. A simple way to add the seasonal cycle to formulation (7-15) is expressing the trend The time varying GEV parameters can be expressed as and the time varying POT/GPD parameters can be expressed as 5

Implementation
The implementation of the TS methodology is illustrated in Figure 1. The fundamental input is represented by the series itself, and the core of the implementation consists of a set of algorithms for the elaboration of the time varying trend ) ( because this is the generally accepted time horizon for observing significant variations in the climate (e.g. Arguez and Vose, 2011;Hirabayashi et al., 2013). It is worth stressing that the chosen value of T should be verified a-posteriori to ensure that the transformed series is stationary. The time window T sn is used to estimate the intra-annual variability of the standard deviation (see section 2.2.1). In Figure 1 the input corresponding to the seasonal time window T sn is drawn in a dashed box because its value is relatively easier to choose than that of T. For the examined case studies a value of two months for T sn always resulted in a satisfactory estimation of the seasonal cycle.
In this implementation of the TS methodology the estimation of the long term statistics is separated from the estimation of the seasonality. This separation allows both the study of the sole long term variability of the extreme values, which is the usual approach studying the extremes on an annual basis, and of the combination of long term and seasonal variability, 5 which is the usual approach studying the extremes on a monthly basis.
we can apply Eq. (2) and perform a stationary EVA on the transformed series. It is important to stress that the stationary EVA is performed on the whole time horizon. The stationarity of the transformed signal allows us to apply different techniques for the EVA. In this study we illustrate the GEV and GPD approaches, but an interesting development would be the elaboration of non-stationary techniques for other approaches such 10 as (Goda, 1988) or (Boccotti, 2000) based on the TS methodology.
The final step of the implementation is the back-transformation of the fitted extreme value distribution into a non stationary one as given by Eq. (8) and (18) for GEV and by Eq. (15) and (19) for GPD.

Estimation of trend, standard deviation and seasonality
There are several possible ways of estimating the slowly varying trend and standard deviation and their seasonality. We 15 propose here a simple methodology based on running mean and standard deviation. We formulate the trend ) ( 0 t tr y as a running mean of the signal y(t) on a multi-yearly time window T, (20) where t N is the number of observations available during the time interval . The seasonality of the trend can be estimated as the monthly mean of the de-trended series. For a given month of the year the seasonality is then indicates that the averaging operation is limited to time intervals within each considered 20 month of the year. For example the seasonality of January is computed as the average on all the Januaries of the detrended signal. To estimate the slowly varying standard deviation we execute a running standard deviation with the same time window used to estimate ) ( 0 t tr y : . Where the subscript "rough" stresses the fact that this expression is sensitive to outliers and that its direct employment leads to a relevant statistical error, as it will be explained in session 2.2.2. To overcome this problem we smooth ROUGH oy t std ) ( with a moving average on a time window smaller than T, for example T/S with S=2: It is worth stressing that in general a further smoothing of the results of running means and standard deviations is licit if it reduces the error and improves the detection of the slowly varying statistical behavior of the time series. This is because the 5 estimation of ) ( 0 t tr y and ) ( 0 t std y consists in a low-pass filter which result should be smooth on time scales lower than T and affected by low relative error.
To estimate the seasonality we perform another running standard deviation ) (t std sn on a time window sn T much shorter than one year, in the order of the month, The seasonality of the standard deviation can be then computed as the monthly average of the ratio between ) (t std sn and 10 ) ( 0 t std y : The estimated seasonality terms tr sn and std sn are periodic with a period of one year. In order to smooth them and remove any possible noise in the signal, we take into account only their first three Fourier components computed in a period of one year, corresponding to components with a periodicity of one year, six months and three months.

Statistical error 15
Since there is an inherent error in the estimation of trend, standard deviation and seasonality given by Eqs. (21-25) we need to estimate it and propagate it to the statistical error of the parameters of the non-stationary GEV and GPD distributions. In general, given a sample s of data with size N, average s , variance ) var(s and standard deviation ) (s std we have 1 : 1 We can evaluate the error on the average by propagating to expression Using Eqs. (26) and (27) The error on std sn can be estimated as the error of the average ratio Using Eqs. (29), (34) and (36) we can estimate the error on the time varying GEV parameters as and the error on the time varying GPD parameters as

Data and validation 5
To assess the generality of the approach the TS methodology has been validated on time series of different variables, from different sources and with different statistical properties.
The analysis of annual and monthly maxima has been carried out on time series of significant wave height at two locations, the first located in the Atlantic Ocean, West of Ireland (coordinates -10.533°E, 55.366°N) the second close to Cape Horn will be hereinafter referred to as GWWIII. Here the TS methodology is applied to examine its applicability to climate change studies.
The annual and monthly analysis have been repeated on a series of water level residuals offshore of the Hebrides Islands 15 (Scotland, coordinates -7.9E, 57.3N) obtained from a 35 years hindcast of storm surges at European scale (Vousdoukas et al., Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-65, 2016 Manuscript under review for journal Hydrol. Earth Syst. Sci.  (Dee et al., 2011). This dataset will be hereinafter referenced as JRCSURGES.
For annual maxima we furthermore compare the TS methodology with the SS technique as, for example, implemented by (Alfieri et al., 2015) and (Vousdoukas et al., 2016). To this purpose we extracted time series from projections of streamflow in the Rhine and Po rivers covering a time horizon from 1970 to 2100 (Alfieri et al., 2015) hereinafter referred to as 5 JRCRIVER. Also the two series of significant wave height of West Ireland and Cape Horn extracted from the GWWIII dataset have been employed in this comparison.
Finally we compare the TS methodology and the EA for monthly maxima using time series of significant wave height extracted from a 35-years wave hindcast database (Mentaschi et al., 2015) in proximity of the locations of La Spezia and Ortona. The analysis of this dataset, hereinafter referred to as WWIII_MED, focuses on a comparison between seasonal 10 cycles modeled by the two approaches.

Waves: annual extremes
The validation of the TS methodology was performed first on the time series of significant wave height of West Ireland and Cape Horn from the GWWIII dataset. We verified first the non seasonal transformation given by Eq.
(2) and the time 15 dependent GEV/GPD given by Eqs. (7) and (15). By neglecting the seasonality, this formulation is suitable to find extremes and peaks on an annual basis. For technical reasons the two series do not have data in two time intervals, from 2005 to 2010 and from 2092 to 2095, but the impact of the missing data on the analysis is small specially if we choose a time window T large enough for the estimation of the trend and of the standard deviation using Eqs. (20) and (22). In particular for this analysis we chose a time window of 20 years, which is long enough to ensure the accuracy of the results and short enough to 20 include the multidecadal variability of a 130 years long time series.
The results of the analysis for the two time series are illustrated respectively in Figure 2 and GPD are also reported. Inter-decadal oscillations in the annual maxima are modeled for both of the series, though they are more pronounced for the West Ireland time series. Moreover, for both the series there is a tendency of the annual maxima to increase, more pronounced for the series of Cape Horn, where the increase in the annual maxima of significant wave height estimated by GWWIII is of about 2 m.
It is worth noting that for both the considered series the statistical mode of GEV and GPD grows faster in time than the 5 slowly varying trend ) (t tr y . This is due to the fact that the growth of the location parameter ) (t y  of the non stationary GEV (expression 7), and of the threshold ) (t u y of the non stationary GPD (Eq. 15) are related not only to the growth of ) (t tr y but also to the growth of ) (t std y . The high tail of the distributions grows even faster because also the scale parameter is proportional to ) (t std y . The impact of the statistical error of the slowly varying trend and standard deviation on the uncertainty of the distribution 10 parameters have been examined using expressions (37) and (38), which for the non seasonal analysis reduce to , for the GPD. The result is that for the non seasonal analysis the error due to the estimation of trend and standard deviation is negligible with respect to the error associated to the stationary MLE. In Table 1 the values of the different components of the error compared in Eqs. (39) and (40) are reported together with the total error estimated for each parameter of the non 15 stationary GEV and GPD. Since the threshold u x of the stationary GPD was selected to have on average 5 events per year, the error has been computed as the uncertainty related to this definition. The percentage contribution to the squared error is also reported in Table 1, in a single column because the percentages estimated for the two series are roughly equal. The error for both GEV and GPD and for both of the series is clearly dominated by the error associated to the estimation of the

Waves: monthly extremes
The seasonal formulation of the approach is suitable to estimate extreme value distributions on a monthly basis. Hence, we applied Eq. (17) to estimate the normalized series, fitted a stationary GEV of monthly maxima by means of a MLE and backtransformed into a non stationary GEV through Eq. (18). It is worth stressing that for the stationary MLE the entire normalized series was used, covering a time horizon of 130 years. For the GPD we selected the threshold in order to have on 5 average 12 events per year, corresponding to the 93 th percentile for both of the series. Results are displayed in Figure 4 for the location of West Ireland and in Figure 5 for Cape Horn. To make the seasonal cycle distinguishable in the figures we plotted only a slice of 5 years from 2085 to 2090. The meaning of the four panels in Figure 4 and Figure 5 are the same as in Figure 2 and 3. The non stationary extreme value distribution estimated for the location of West Ireland presents a strong seasonal cycle with extremes higher and more broad-banded during winter. For Cape Horn the seasonal cycle is weaker, with 10 extremes of significant wave height slightly lower during the local summer. The estimated seasonal GEV and GPD are significantly lower than those estimated for the non-seasonal analysis because in the seasonal analysis we consider monthly extremes, while in the non-seasonal one we consider annual extremes.
It is worth stressing that in the study of the monthly maxima the long term trend is also estimated, even if it cannot be appreciated in Figure 4 and Figure 5 due to the short time horizon represented. 15 Table 2 reports the components of the statistical error due to the uncertainty in the estimation of the seasonality together with the components due to the stationary MLE. The components of the error due to the uncertainty in the estimation of y tr 0 and y std 0 were omitted as they are negligible as compared with the error associated to the fitting of the stationary extreme value distribution (see section 3.1). In Table 2 we can see that, as for the non-seasonal analysis, the error for both GEV and GPD and for both series is clearly dominated by the uncertainty associated to the estimation of the parameters of the stationary 20 distributions, though in this case the error related to the stationary MLE is significantly smaller than that found for the nonseasonal analysis, due to the larger sample of data.

Residual water levels
To verify the performance of the TS methodology on a series from a different source, of a different quantity and with different statistical characteristics, we tested it on a series of water level residuals extracted from the JRCSURGES dataset 25 for a location offshore of the Hebrides Islands, Scotland, with coordinates (-7.9E, 57.3N). This series is characterized by a flat trend ) (t tr y because the model results are approximately constant-averaged. Therefore almost all the variability is modeled by the TS methodology in the standard deviation ) (t std y . Since the time horizon of this series is shorter than that of the GWWIII projections we chose a time window for the computation of the trend of 6 years to better identify its interannual variability. The results of the TS analysis of the yearly maxima are shown in Figure 6. The series displays also a strong seasonal behavior with annual maxima usually occurring during the local winter (for brevity the seasonal analysis is not illustrated).
An interesting aspect is that the estimated standard deviation ) (t std y presents a strong correlation (ρ=0.79) with the annual means of the North Atlantic Oscillation (NAO) index. This is illustrated in Figure 7, where the scatter plot of ) (t std y versus the annual means of the NAO index (panel a) and the two time series (panel b) are represented. As a consequence the 5 estimated annual maxima are also correlated with the NAO index.

4
Comparison with other approaches

Stationary methodology on time slices for long trend estimation
A comparison was carried out between the TS methodology and the SS technique, which consists in performing a stationary analysis on quasi-stationary slices of data. This analysis was carried out on river discharge projections for the Po and the 10 Rhine river extracted from the JRCRIVER dataset and on the projections of significant wave height extracted from the Results are illustrated in Figure 8, where the return levels of the projected discharge of the Rhine river are illustrated for three time slices. In the figure, the continuous black line and the green band represent the return levels and the 95% confidence interval estimated by the TS methodology, the dashed black line represents the return levels estimated by the stationary EVA on the considered slice (labeled in the legend as SS). As expected the return levels estimated for short return periods by the two methodologies are close, while they tend to spread for high return periods. This fact is also evident from 20 Figure 9, where the return levels estimated by the two methodologies are plotted one versus the other for the river discharge of the Rhine and the Po and for the significant wave height of West Ireland and Cape Horn. We can see that the two methodologies for the analyzed time series are in good agreement for return periods below 30 while they spread for larger return levels. Some quantitative figures about this fact are reported in Table 3 where RL TS and RL cmp are the return levels returned respectively by the TS and the SS methodologies. Table 3 also includes the maximum deviation between the return levels estimated by the TS and by the SS methodology. The NBI and the maximum deviations were obtained comparing results of the two techniques on the three 30-year time windows. From Table   3 we can see that the maximum deviation for return periods up to 30 years is always below 6%, while for higher return period it increases up to 13% for the discharge of the Po river. This is mainly due to the fact that for the stationary analysis 30 Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-65, 2016 Manuscript under review for journal Hydrol. Earth Syst. Sci. on the quasi-stationary time slices we consider a sample of only 30 years, which leads to large uncertainty ranges in the estimation of large return periods such as 100 and 300 years. This also explains the sharp variations of high return levels that we find between the three time windows using the SS approach. These variations are likely more related to the uncertainty in estimating the levels associated to long return periods rather than to climatic changes. The TS methodology allows a more accurate estimation of high return levels because it uses the whole sample of 130 years, and this represents one of the 5 strengths of using the TS methodology instead of SS.

Established non-stationary approach for seasonal variability
Section 3 shows that the TS methodology is mathematically equivalent to a particular implementation of the EA methodology as described for example by (Coles, 2001;Izaguirre et al., 2011;Menéndez et al., 2009;Sartini et al., 2015).
For the sake of completeness in this paragraph we show the results of a comparison between the performances of a different 10 formulation of the EA methodology. In its formulation the parameters of the non stationary GEV of the monthly maxima are expressed as where β 0 , α 0 and γ 0 are the stationary components, β i , α i and γ i are the harmonics amplitudes, ω = 2πT -1 is the angular frequency, with T corresponding to one year, N μ , N σ and N ε are the number of harmonics and t is expressed in years. The parameters β i , α i and γ i have been therefore optimized through a non-stationary MLE in order to fit the monthly maxima of 15 the non-stationary series. Different combinations of N μ , N ψ and N ε have been tested and the best model was chosen as the one presenting the lowest value of the Akaike criterion (Akaike, 1973) given by where k is the number of degree of freedoms of the model, L is the likelihood. In particular the maximum value tested for N μ , and N ψ is 3 while the maximum considered value of N ε is 2. In general this model can be extended to incorporate long term trends, but the two series examined in this test display flat trends, hence Eq. (42) is adequate to model them. 20 In the comparison, the EA and the seasonal TS methodology (GEV only) were applied to the same series of significant wave heights relative to the WWIII_MED dataset described in section 2.3. For the transformed-stationary approach a 10-year time window was used for the computation of the long-term trend. The results of the two methodologies are similar, with a roughly flat trend and a strong seasonal pattern. The comparison of the seasonal cycles estimated by the two techniques is represented in Figure 10 for the two series. In the figures the continuous red and green lines are the location and scale 25 parameters (μ and σ respectively) as estimated by the TS approach. The dashed red and green lines are the location and scale Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-65, 2016 Manuscript under review for journal Hydrol. Earth Syst. Sci. parameters estimated through the EA. The blue dots represent the monthly maxima, while the color scale represents the time varying probability density estimated by the transformed-stationary methodology. Since for both of the series the Akaike criterion selected models with a constant shape parameter ε, these are reported in the figure for both of the series together with those estimated by the TS methodology.
The GEV parameters estimated by the two approaches are in good agreement, and the small differences have relatively small 5 impact on the return levels, as one can see in Figure 11 where the return levels estimated by the two methodologies for the month of January are plotted. For both of the series the return levels estimated by EA lie within the 95% confidence interval estimated by the TS methodology. Table 4 reports the values of normalized bias NBI between the return levels estimated by the TS and the EA methodologies, defined as in Eq. (41). In the table the values of NBI are reported for the four seasons for return periods of 5, 10, 30, 50 and 100 years, and for both La Spezia and Ortona. In the used definition of seasons, Winter 10 starts on December 1 st , Spring on March 1 st , Summer on June 1 st and Autumn on September 1 st . We did not report return levels of periods greater than 100 years because the extension of the data covers only 35 years, and the estimates for such periods are inaccurate for both the methodologies. The average deviation between RL TS and RL cmp for the considered time series are rather small, below 7% for all of the seasons.

Discussion 15
Extreme Value Analysis is a subject of broad interest not only for Earth Science, but also for other disciplines such as Economy and Finance (e.g. Gençay and Selçuk, 2004;Russo et al., 2015), Sociology (e.g. Feuerverger and Hall, 1999), Geology (e.g. Caers et al. 1996), and Biology (e.g. Williams, 1995), among others. As a consequence non-stationarity of signals is a common problem (e.g. Gilleland and Ribatet, 2014). In this respect it is important to stress that the TS methodology is general, and its applicability does not require a time series for any specific property but the stationarity of the 20 transformed signal. Therefore even if in this study the technique was applied only to series related to Earth Science, it can be employed in all the disciplines dealing with extremes.
Given that the extreme value statistical model is an important component of applications like the ones presently discussed (e.g. Coles, 2001;Hamdi et al., 2013), it is important to stress that the theory was formulated in a way that is not restricted to GEV and GPD, but can be extended to any other statistical model for extreme values. In particular, since the GEV 25 distribution is a generalization of the Gumbel, Frechet and Weibull statistics, TS can be reformulated separately for these three distributions; as well as for the r-largest approach statistics which have been also commonly used (e.g. Coles, 2001;Hamdi et al., 2013). Finally an extension of TS to statistical models not based on the GEV theory (e.g. Boccotti, 2000;Goda, 1988) may open the way to their non-stationary generalization and could be an interesting direction for future research.
The presently discussed approach was presented using the trend, standard deviation and seasonality to perform a simple, 30 time-varying normalization of the signal, allowing different types of analysis. The first product of the methodology is related to the estimation the extreme values of the signal. In addition, the TS approach allows the analysis of the long term Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-65, 2016 Manuscript under review for journal Hydrol. Earth Syst. Sci. variability; and as an example it was shown to be useful in relating the long term trend of the signal with the NAO climatic index (see section 3.3). Finding correlations of natural parameters with climatic indices is a theme of common interest in Earth Science, especially in view of climate change (e.g. Barnard et al., 2015;Dodet et al., 2010;Plomaritis et al., 2015). If a time series is long-term correlated to a climatic index, an advantage of the TS methodology is that it is able to model extremes correlated to the index without considering explicitly it in the computation. Finally, the TS methodology was also 5 extended to describe the seasonal variability of the extremes which is also critical for climate studies (e.g. Sartini et al. 2015;Menendez et al. 2009;Méndez et al. 2006).
As shown in section 4 the TS methodology comes with advantages over both the SS methodology (e.g. Vousdoukas et al. 2016) and the EA (e.g. Cheng et al., 2014;Gilleland and Katz, 2015;Izaguirre et al., 2011;Méndez et al., 2006;Menéndez et al., 2009;Mudersbach and Jensen, 2010;Russo et al., 2014;Sartini et al., 2015) in terms of accuracy of the results and of 10 conceptual and implementation simplicity. In particular in the comparison with the SS methodology for long term variability the return levels estimated by the two techniques are similar for return periods for which the SS is accurate. The use of the whole time horizon of the series represents a major advantage of the TS over the SS methodology, because it allows more accurate estimations of the return levels associated to long return periods. A conceptual advantage of the TS methodology over the EA is that it decouples the detection of the non-stationary behavior of the series from the best fit of the extreme 15 value distribution: the goal of estimating the time-varying statistical features of the series is delegated to the transformation.
This fact provides a simple diagnostic tool to evaluate the validity of the model applied to a particular series: the model is valid if the transformed series is stationary. This simple diagnostic is useful to validate the output of the approach. Moreover this decoupling simplifies both the detection of non stationary patterns and the fitting of the extreme values distribution. In particular the detection of non stationary patterns can be accomplished by means of simple statistical techniques such as low-20 pass filters based on running mean and standard deviation, and the fit of the extreme value distribution can be obtained through a stationary MLE with a small number of degrees of freedom, easy to implement and control. Moreover, unlike many implementation of the EA (e.g. Cheng et al., 2014;Gilleland and Katz, 2015;Izaguirre et al., 2011;Méndez et al., 2006;Menéndez et al., 2009;Sartini et al., 2015;Serafin and Ruggiero, 2014)  It is worth remarking that the EA implemented for example using Eq. (42) is able to model a shape parameter varying in time, while the TS methodology using transformation (1) is not. While in principle this is a weak point of the TS methodology described in this manuscript, assuming a constant shape parameter is most of cases a reasonable assumption, 30 because in general simple models should be preferred to complex ones (e.g. Coles, 2001). In particular using the EA the Akaike criterion (Akaike, 1973), which favors simple models with fewer degrees of freedoms, often selects models with fixed shape parameter (e.g. Sartini et al. 2015;Menendez et al. 2009). Moreover, the finding that a non stationary GEV Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2016-65, 2016 Manuscript under review for journal Hydrol. Earth Syst. Sci. always corresponds to a transformation of the non stationary time series into a stationary one, shown in appendix A, suggests that a generalization of the TS methodology is possible in order to include models with time varying shape parameters.

Conclusions
In this manuscript the TS methodology for non-stationary extreme value analysis is described. The main assumption underlying this approach is that if a non stationary time series can be transformed into a stationary one to which the 5 stationary EVA theory can be applied, then the result can be back-transformed into a non-stationary extreme value distribution through the inverse transformation. The proposed methodology is general, and even if in this study we applied it only to series related to Earth Science, it can be employed in all the sciences dealing with EVA. Moreover, though we discussed it only for GEV and GPD, it can be extended to any other statistical model for extremes.
As a transformation we proposed a simple time-varying normalization of the signal, estimated by means of time-varying 10 mean and standard deviation. This simple transformation was also adapted to describe the seasonal variability of the extremes. In addition it was proved to provide a comprehensive model for non stationary GEV and GPD with constant shape parameter, meaning that it can be applied to wide range of non-stationary processes. The formal duality between the TS approach and the established one has also been proved, suggesting that a complete generalization of the TS approach is possible to include models with time-varying shape parameter. 15 The methodology was tested on time series of different sources, sizes and statistical properties. An evaluation of the statistical error associated to the transformation led to the conclusion that for the examined series it is negligible (the squared error is 2 orders of magnitude smaller) with respect to the error associated to the stationary MLE and, for GPD, to the estimation of the threshold.
The TS methodology was compared with the technique of performing a stationary EVA on quasi-stationary slices of non 20 stationary series (SS methodology) for the estimation of the long term variability of the extremes, and with the established approach (EA) to non stationary EVA, showing that the return levels estimated by TS are comparable to the ones obtained by these two methodologies. However, the TS approach comes with advantages on both SS and EA. With respect to SS the TS methodology uses the whole time series for the fit of the extreme value distribution, guaranteeing a more accurate estimation of large return levels. With respect to EA it decouples the detection of the non stationarity of the series from the 25 fit of the extreme value distribution, involving a simplification of both the steps of the analysis. In particular the fit of the distribution can be accomplished by means of a simple MLE with a few degrees of freedom, simple to implement and to control. The detection of the non stationarity can be performed by means of easy-to-implement and fast-to-run low-pass filters which do not require as an input any parametric function for the variability, making this methodology well suited for massive applications where the simultaneous evaluation several time series is required. 30 An implementation of the TS methodology has been developed in an open-source matlab toolbox (tsEva), which is available at https://bitbucket.org/menta78/tseva.

Duality between the established approach and the TS methodology
In this appendix we show that if the extremes of a time series y(t) are fitted by a non-stationary GEV is a stationary GEV fitting the extremes of a supposed stationary series x(t). 5 To prove this we expand relationship )] , Equation (45)  are regarded as known variables, which is a wrong assumption in practical applications.
But it is enough to say that any implementation of the non-stationary established approach is equivalent to a transformation into a supposed stationary series x(t). Therefore Eq. (45) could be used as a diagnostic tool for implementations of the 20 established approach: a condition for the validity of the non stationary model is that the transformed x(t) series is stationary.