Like other Mediterranean areas, Italy is prone to the development of events with significant rainfall intensity, lasting for several hours. The main triggering mechanisms of these events are quite well known, but the aim of developing rainstorm hazard maps compatible with their actual probability of occurrence is still far from being reached. A systematic frequency analysis of these occasional highly intense events would require a complete countrywide dataset of sub-daily rainfall records, but this kind of information was still lacking for the Italian territory. In this work several sources of data are gathered, for assembling the first comprehensive and updated dataset of extreme rainfall of short duration in Italy. The resulting dataset, referred to as the Italian Rainfall Extreme Dataset (I-RED), includes the annual maximum rainfalls recorded in 1 to 24 consecutive hours from more than 4500 stations across the country, spanning the period between 1916 and 2014. A detailed description of the spatial and temporal coverage of the I-RED is presented, together with an exploratory statistical analysis aimed at providing preliminary information on the climatology of extreme rainfall at the national scale. Due to some legal restrictions, the database can be provided only under certain conditions. Taking into account the potentialities emerging from the analysis, a description of the ongoing and planned future work activities on the database is provided.
Italy can boast of a role at the highest level in the
development of meteorological observations
In spite of the huge heritage of data, only a small fraction of the Italian
rainfall data is available in a computer-readable format. Moreover, the
dismantling of the National Service led to a lack of updates for the national
database of extreme rainfall, which is still stuck, for some regions, at the
beginning of the 1990s. This has led to a very fragmented framework: updated
rainstorm hazard assessments are available only for some regions and
only at the regional scale (see, e.g.
To assemble the first comprehensive dataset of extreme rainfall of short duration in Italy, several major sources of data have been analysed. The resulting dataset, referred to as the Italian Rainfall Extremes Database (I-RED), includes data from more than 4500 stations across the country, spanning the period between 1916 and 2014, and refers to annual maximum rainfall recorded in 1 to 24 consecutive hours (the exact durations available are 1, 3, 6, 12 and 24 h).
The following sections describe the sources of the data, the work carried out to merge them into the database and the operations that are still required to make it suitable for robust nationwide rainfall frequency analyses. A preliminary analysis of the extreme rainfall regime at the national scale is also presented.
As a follow-up to the activities of the Italian National Group for the
Prevention of the Hydrogeological Disasters (GNDCI), a comprehensive
nationwide hydrological information system has been set up within the
“CUBIST project”, funded by the Italian Ministry of Education and
Research under the PRIN 2005 programme (Italian Research Projects of
National Relevance). The database includes about 6000 pluviographs and
pluviometers, 700 temperature stations and about 400 river basins
After the late 1980s, the local environmental agencies started to
support the SIMN in its work. Gradually, the 21 regional
hydrological services took over the networks and the tasks of the national
one. In this period most of the old manual tipping-bucket rain gauges were
replaced with automatic stations, similar to the one described in
Names of the Italian regions and type of datasets provided by the
regional authorities. The cases refer to the bullet list of Sect.
Regions of Italy with the assigned code and the related local Operational Center with references to the availability of digitized data.
Merging and harmonizing the different datasets is a long and difficult operation that is still ongoing. The different operational centres provided different types of datasets, with different temporal coverages and spatial reference systems. Duplicate stations are often present in the databases of neighbouring regions.
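As an illustration of how duplicate stations shared by neighbouring regions might be flagged, the following minimal sketch pairs stations from different regional datasets whose coordinates fall within a small distance of each other. It is not the procedure used by the authors: the record layout, the station identifiers, the function names and the 0.5 km tolerance are all assumptions made for the example, and the coordinates are assumed to have already been converted to a common latitude/longitude reference system.

```python
import math
from itertools import combinations


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def candidate_duplicates(stations, max_km=0.5):
    """Return pairs of station ids from different regions lying within max_km.

    `stations` is a list of dicts with hypothetical keys: id, region, lat, lon.
    The tolerance max_km is an arbitrary illustrative value.
    """
    pairs = []
    for a, b in combinations(stations, 2):
        if a["region"] == b["region"]:
            continue  # only cross-region pairs are of interest here
        if haversine_km(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_km:
            pairs.append((a["id"], b["id"]))
    return pairs


# Toy records (invented for the example, not taken from the I-RED).
stations = [
    {"id": "PIE_001", "region": "Piemonte", "lat": 45.0000, "lon": 7.0000},
    {"id": "LIG_417", "region": "Liguria", "lat": 45.0010, "lon": 7.0010},
    {"id": "LIG_002", "region": "Liguria", "lat": 44.4000, "lon": 8.9000},
]
print(candidate_duplicates(stations))
```

Pairs flagged in this way are only candidates: a manual comparison of the overlapping records would still be needed before removing either station.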
The first steps of this work have been carried out at the regional scale. For
each region all the data falling inside the regional boundaries have been
considered. These data, according to the setting of the databases of the
local operational centres, could belong to one of these three categories:
(a) data from the CUBIST database for the 1900–2001 period, already available from the former national service;
(b) data provided by the regional authority;
(c) data provided by the regional authorities of the neighbouring regions that extend beyond their regional borders.
Observations dating from before 1916 have been discarded, as they are considered not
significant and too unevenly distributed. Since most of the
provided data have been validated by the relevant authorities, they are
considered reliable and have initially been included directly in the I-RED.
For information on the validation procedures, please refer to the Appendix
Once the type (b) and type (c) datasets had been merged for each region, the resulting
dataset had to be merged with the type (a) dataset. This operation was
quite complex, as the overlapping period between the different datasets was
different for each region and because most of the authorities did not track
changes in the names and codes of the stations. The different procedures
performed, according to the type of the dataset that the region has provided
(as reported in Fig.
With the application of the above described rules, 20 complete regional
datasets have been obtained. The regional datasets were finally merged
together to generate the I-RED. After the merging phase some
reliability checks were performed in order to detect any problematic or
incorrect information. These include the identification and removal of
duplicate data or stations, and reliability checks on the largest values of the
dataset, comparing them with the absolute record-breaking events for all the
durations (see
Due to the complexity of the check operations, further efforts and collaborations with the regional authorities are still ongoing to increase the consistency of the database. Nevertheless, to date (October 2017) the I-RED includes more than 4500 stations nationwide and constitutes the largest updated dataset of annual maxima for Italy.
Since most of the regional authorities supervise the use and widespread dissemination of their datasets in order to prevent improper use, a detailed description of how to access the I-RED is reported in the Data Availability section.
In the following, the spatio-temporal distribution of the assembled data will be described.
The number of data available per year in the I-RED is reported in
Fig.
The smaller size of the I-RED compared with the CUBIST database
in some years can be due to the following:
- the presence, before 1945, in the CUBIST database of data from territories lost by Italy after World War II (e.g. Istria) or from neighbouring countries, which are not included in the I-RED;
- the fact that some regional agencies may have decided, for different reasons, not to include data or stations from the SIMN in their datasets. Since these data are contained only in the CUBIST dataset, they are not included in the I-RED for the regions where procedures (1) and (3) described in Section 2.2 are applied.
Considering the limited significance of the information loss, further efforts
for including these data will be planned only in a future stage of the
development of the database.
For a descriptive analysis of the rainfall data, all the assembled time
series are classified according to their length. Results are shown in Fig.
The spatial distribution of the stations is shown in Fig.
Total number of data per cell over a 50 km grid.
A preliminary descriptive analysis of the characteristics of extreme
rainfalls at the national scale has been carried out on the newly developed
I-RED database. Series with a minimum length of 20 years of data
have been considered in this analysis. This length constraint leads to a
subset of 1974 series available for the analysis, out of the original 4686.
For each duration, the median of the series is depicted in Fig.
Median values of the I-RED series from 1
Some geographical areas are characterized by clusters of large median values and these clusters appear consistent across the different durations. Furthermore, at the country-wide scale we observe that the coefficient of variation of the medians increases for increasing durations, suggesting a wider range of variability of the corresponding median values.
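The statistics discussed above can be sketched as follows: series shorter than the minimum length are dropped, the median of each remaining series is computed, and the coefficient of variation of those medians across stations is evaluated separately for each duration. This is only an illustrative sketch; the data layout and function names are assumptions, the numbers are invented, and the sample standard deviation is assumed because the text does not state which estimator was used.

```python
import statistics

MIN_YEARS = 20  # minimum series length adopted in the text


def median_by_station(series, min_years=MIN_YEARS):
    """Median annual maximum for each station with at least min_years values.

    `series` maps a (hypothetical) station id to its list of annual maxima (mm)
    for a single duration.
    """
    return {
        station: statistics.median(values)
        for station, values in series.items()
        if len(values) >= min_years
    }


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed definition)."""
    values = list(values)
    return statistics.stdev(values) / statistics.mean(values)


# Toy data: min_years is lowered to 3 only to keep the example short.
toy_1h = {"A": [20, 30, 40], "B": [10, 20, 30], "C": [50]}
medians_1h = median_by_station(toy_1h, min_years=3)
print(medians_1h)                                      # station C is excluded
print(coefficient_of_variation(medians_1h.values()))
```

Repeating the calculation for the 1, 3, 6, 12 and 24 h series would give one coefficient of variation per duration, which is the quantity compared in the paragraph above.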
For each series, the sample L-moments
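For reference, sample L-moments are commonly estimated from unbiased probability-weighted moments of the ordered sample. The sketch below follows that standard formulation; it is not taken from the paper, and the function name and the returned keys are assumptions made for the example.

```python
def sample_l_moments(data):
    """Sample L-moments and L-moment ratios via unbiased probability-weighted moments.

    Returns l1 (mean), l2 (L-scale), and the ratios L-CV = l2 / l1,
    L-skewness = l3 / l2 and L-kurtosis = l4 / l2.
    """
    x = sorted(data)
    n = len(x)
    if n < 4:
        raise ValueError("at least 4 values are needed for the first four L-moments")

    # Unbiased estimators b_r; with a 0-based index i the rank is i + 1,
    # so (rank - 1) = i, (rank - 2) = i - 1 and (rank - 3) = i - 2.
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    b3 = sum(i * (i - 1) * (i - 2) * x[i] for i in range(n)) / (
        n * (n - 1) * (n - 2) * (n - 3)
    )

    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0

    return {"l1": l1, "l2": l2, "l_cv": l2 / l1, "l_skew": l3 / l2, "l_kurt": l4 / l2}


# A symmetric toy sample has zero L-skewness.
print(sample_l_moments([1, 2, 3, 4, 5]))
```

The ratio form is convenient because L-CV, L-skewness and L-kurtosis are dimensionless and can be compared across stations with very different rainfall magnitudes.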
The significance of the developed dataset also allows preliminary exploration
of the rainfall events sometimes referred to as “black swans”
The first comprehensive dataset of extreme rainfall in Italy,
called I-RED, has been presented here. It is a significant source of
information, able to provide unprecedented knowledge on the characteristics
of heavy precipitation in Italy and on the possible rainfall regime changes
in the last century. Further efforts will be devoted to increasing the
spatial data homogeneity and coverage in time, by including the data of the
most recent years and, possibly, by contacting the local authorities to
request assistance in the merging of the series. The final aim is to make
the update of the database systematic and unsupervised. This can be done
by strengthening the collaboration with the data providers, in the framework
of joint projects such as the one that led to the development of the
ArCIS
The original data can be requested from the authorities
reported in Table
- research individuals or groups in the framework of the authors' project;
- research individuals or groups not collaborating with the authors' project, upon evidence of permission received from the involved regional agencies, reported in Table
For further details and queries, please contact the corresponding author.
The first level of data validation is performed on the raw data (or gross
data), i.e. the data at the original temporal resolution at which they are
transmitted or detected at the measuring station, and consists of the
application of basic procedures for verifying the validity of the data. These
checks aim at detecting malfunctions, instability or interference. In the
case of data coming from automatic measuring stations, the validity checks are
applied to the “meteorological message” coming from the station during the
transcoding phase of the message, which must comply with
certain rules for transmission. The checks carried out are therefore related to the
expected formats within a given message, to the date and time stamps, to the
location of the measuring station, to the codes of stations and sensors and to
the presence of duplicate elements. This category of checks includes syntax
controls (e.g. alphabetic characters appearing in a field that should be
numeric) which, if failed, can undermine the transcoding process; logical
controls that refer to both the intrinsic characteristics of the magnitude
The supplement related to this article is available online at:
The authors declare that they have no conflict of interest.
The authors thank Enrica Caporali and Valentina Chiarello for their
assistance in preparing and screening the Tuscany regional dataset, and Stefano
Macchia for his contribution in collecting and cleaning the data. The
insightful comments of Alberto Montanari, three anonymous reviewers and the
handling editor allowed the quality of the
original manuscript to be significantly improved. Data providers reported in Table