Examination of homogeneity of selected Irish pooling groups

Flood frequency analysis is a necessary and important part of flood risk assessment and management studies. Regional flood frequency methods, in which flood data from groups of catchments are pooled together in order to enhance the precision of flood estimates at project locations, are an accepted part of such studies. This enhancement of precision is based on the assumption that catchments so pooled together are homogeneous in their flood producing properties. If homogeneity is assured then a homogeneous pooling group of sites leads to a reduction in the error of quantile estimates, relative to estimators based on single at-site data series alone. Homogeneous pooling groups are selected by using a previously nominated rule and this paper examines how effective one such rule is in selecting homogeneous groups. In this paper a study, based on annual maximum series obtained from 85 Irish gauging stations, examines how successful a common method of identifying pooling group membership is in selecting groups that actually are homogeneous. Each station has its own unique pooling group selected by use of a Euclidean distance measure in catchment descriptor space, commonly denoted d ij , with a minimum of 500 station years of data in the pooling group. It was found that d ij could be effectively defined in terms of catchment area, mean rainfall and baseflow index. The study then investigated how effective this selected method is in selecting groups of catchments that are actually homogeneous as indicated by their L-CV values. The sampling distribution of L-CV ( t 2 ) in each pooling group and the 95% confidence limits about the pooled estimate of t 2 are obtained by simulation. The t 2 values of the selected group members are compared with these confidence limits both graphically and numerically. Of the 85 stations, only 1 station’s pooling group members have all their t 2 values within the confidence


Introduction
It is widely accepted that a short annual maximum (AM) flood series is inadequate for the estimation of design floods of large return periods. Regionalization (FSR, 1975), i.e. pooling analysis (FEH, 1999), is one of the possible methods used to provide a framework for design floods. In pooling analysis flood data are pooled from other gauging stations that possess similar hydrological behaviour to the at-site station. A very common way to implement regional/pooling analysis is the index flood method proposed by Dalrymple (1960). The estimation of Q T , the T-year flood, based on this approach involves derivation of a growth curve which shows the relation between X T and the return period T , where X T = Q T /Q I and Q I is the index flood at the site of interest. Generally the mean (FSR, 1975) or median (FEH, 1999) of the at-site AM flood series is taken as the index flood. It is assumed that the X T − T relation is the same at all sites in a homogeneous pooling group. The identification of a homogeneous pooling group is therefore important in pooling analysis. Lettenmaier et al. (1987), Stedinger and Lu (1995) and Hosking and Wallis (1997), among other researchers, have demonstrated that a successful pooling analysis requires a homogeneity criterion to be satisfied. However, very recently Kjeldsen and Jones (2009) have approached this in a different way.
An examination of homogeneity is normally used to assess whether a proposed group of sites is homogeneous or not. Examination of the homogeneity of regions/pooling groups is usually based on a statistic that relates to the formulation of a frequency distribution model, e.g. the coefficient of variation, CV (Wiltshire, 1986; Fill and Stedinger, 1995) and/or skew coefficient, g, their L-moment equivalents (Chowdhury et al., 1991; Hosking and Wallis, 1997) or dimensionless quantiles such as the 10-year event (Dalrymple, 1960; Lu and Stedinger, 1992). Hosking and Wallis (1993, 1997) proposed homogeneity tests based on L-moment ratios such as L-CV alone (H1) and L-CV and L-skewness jointly (H2), which are widely used in flood frequency analysis, although the former is recommended by these authors for having better power to discriminate between homogeneous and heterogeneous regions. Very recently, a similar conclusion has been drawn by Viglione et al. (2007) when they compared several homogeneity tests. They stated that the H1 test is ahead of all others when the L-skewness is lower than 0.23. They further concluded that H2 as a homogeneity test lacks power. These findings certainly indicate that the heterogeneity among the sites in a group is mainly due to variations in the sample L-CVs. However, one of the main assumptions of these tests is that the true regional distribution is kappa. For that reason and others Hosking and Wallis (1997) recommended that though the heterogeneity statistic is constructed like a significance test it should not be used in that way. Hosking and Wallis (1997, p. 70) further stated that "... a significance test is of doubtful utility anyway, because even a moderately heterogeneous region can provide quantile estimates of sufficient accuracy for practical purposes. Thus a test of exact homogeneity is of little interest."
In this paper a graphical way of examining the homogeneity of a pooling group is presented which is based on L-CV, i.e. t 2 . The main idea behind the approach is the comparison of the variability of t 2 from each site in the pooling group with that expected (un-weighted average pooled t 2 ) supposing the differences between sites to be due to sampling error. The pooling groups are identified by the Region of Influence (ROI) approach. The population distribution is taken to be GEV (with k = −0.05, k = 0.0, k = +0.03), rather than Kappa as suggested by Hosking and Wallis (1997); this choice was based on the GEV's descriptive ability for the annual maximum data series of Ireland.
The outline of the paper is structured as follows: the next section describes the procedure used to obtain growth factors and flood quantiles in the context of flood frequency pooling analysis. This is followed by a description of procedures to select pooling variables for similarity distance measure (d ij ) in the context of formation of pooling groups using the ROI approach. A graphical way of examining homogeneity of pooling groups obtained by the ROI approach is then presented. Then the analysis of the examination procedure is summarised and finally a selected number of heterogeneous pooling groups are reviewed with the help of Box-plots of catchment descriptors.

Estimation of pooled growth factors and flood quantiles
The growth factor X T is the factor which, when multiplied by the index flood Q I , gives the flood magnitude of return period T , Q T , as in Eq. (1):
$$Q_T = Q_I \, X_T \tag{1}$$
The relationship between X T and T is often referred to as the growth curve. When a growth curve is obtained by pooling the information from sites of a pooling group, it is called the pooled growth curve. Qmed is used as the index flood in this study, where Qmed is the median of the annual maximum series.
In this study the pooled growth curve is obtained using the approach based on the method of L-moments. The L-moments developed by Hosking (1990) are based on probability weighted moments (PWMs) introduced by Greenwood et al. (1979). With this approach the derivation of a growth curve in a pooling group involves the following key steps:
1. computation of at-site and pooled L-moment ratios;
2. selection of a suitable form of distribution and estimation of its parameters by the method of L-moments.
L-moments are calculated and then the dimensionless L-moment ratios t 2 and t 3 are calculated for each site. Pooled L-moment ratios for the target site, i, are then computed using the following equation:
$$t^{R} = \frac{\sum_{j=1}^{M} w_{ij}\, t^{(j)}}{\sum_{j=1}^{M} w_{ij}} \tag{2}$$
where t R is the pooled L-moment ratio, t (j ) is the L-moment ratio (either t 2 or t 3 ) for the j -th most similar site, w ij is a weighting term and M is the number of sites in the pooling group. Weights can be related to a site's record length and/or a site's d ij values. Recently a more complex way of assigning weights was proposed by Kjeldsen and Jones (2009), although they state that only a little has been gained in the flood estimation procedure using the new approach. In this study w ij is taken as 1. The choice of unweighted averages was guided by the observations made by Hosking and Wallis (1997, p.90), namely "The calculation of regional averages by weighting the sites proportionally to their record lengths is not essential. If the region is exactly homogeneous, then a good approximation of the variance of t (i) is proportional to $n_i^{-1}$, and in this case weighting the sites proportionally to their record lengths minimizes the variance of the regional average t R . If the region is heterogeneous, it is possible that weighting proportionally to record length may give undue influence to sites that have frequency distributions markedly different from the region as a whole and that also have long records".
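The computation of at-site L-moment ratios and their pooled average can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the unbiased probability weighted moment estimators used here are the standard textbook ones, which are assumed (not confirmed by the text) to match the estimators used in the study.

```python
def sample_l_moment_ratios(data):
    """Return (t2, t3), i.e. sample L-CV and L-skewness, via unbiased PWMs."""
    x = sorted(data)
    n = len(x)
    if n < 3:
        raise ValueError("at least three observations are needed")
    # Unbiased probability weighted moments b0, b1, b2 (j is the 0-based rank).
    b0 = sum(x) / n
    b1 = sum(j * x[j] for j in range(n)) / (n * (n - 1))
    b2 = sum(j * (j - 1) * x[j] for j in range(n)) / (n * (n - 1) * (n - 2))
    # First three sample L-moments.
    l1 = b0
    l2 = 2.0 * b1 - b0
    l3 = 6.0 * b2 - 6.0 * b1 + b0
    return l2 / l1, l3 / l2


def pooled_ratio(ratios, weights=None):
    """Weighted average of at-site ratios; unit weights give the plain mean."""
    if weights is None:
        weights = [1.0] * len(ratios)
    return sum(w * t for w, t in zip(weights, ratios)) / sum(weights)
```

With `weights=None` the second function reduces to the unweighted average adopted in this study (w ij = 1); passing record lengths as weights would give the record-length weighting discussed in the quotation above.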
The Generalised Extreme Value (GEV) distribution has been selected as the pooled distribution function. The selection of the GEV distribution is explained in Sect. 4. The values t R 2 , t R 3 are equated to expressions for these quantities written in terms of the distribution's unknown parameters (expressed in dimensionless form) and the resulting equations are solved for the unknown parameter values. The dimensionless GEV growth curve (X T ) is defined by two parameters k and β:
$$X_T = 1 + \frac{\beta}{k}\left[(\ln 2)^{k} - \left(\ln\frac{T}{T-1}\right)^{k}\right] \tag{3}$$
where T is the return period.
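The two parameters k and β are estimated from the sample L-CV, t 2 , and sample L-skewness, t 3 , as follows (Hosking and Wallis, 1997):
$$k = 7.8590\,c + 2.9554\,c^{2} \tag{4}$$
$$c = \frac{2}{3+t_3} - \frac{\ln 2}{\ln 3} \tag{5}$$
$$\beta = \frac{k\,t_2}{t_2\left[\Gamma(1+k) - (\ln 2)^{k}\right] + \Gamma(1+k)\left(1-2^{-k}\right)} \tag{6}$$
where Γ denotes the complete gamma function.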
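As an illustration of how a growth factor follows from a pair of L-moment ratios, the sketch below uses the widely quoted L-moment approximation for the GEV shape parameter and the median-scaled form of the GEV growth curve. The function names are hypothetical, the formulas are assumed to correspond to the paper's Eqs. (3) to (6), and the k → 0 (EV1) branch is the analytical limit of those expressions, added here only to avoid division by zero.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
LN2 = math.log(2.0)


def gev_scale(t2, k):
    """Dimensionless scale beta of a median-scaled GEV, given L-CV and shape k."""
    if abs(k) < 1e-8:  # EV1 limit as k -> 0
        return t2 / (LN2 - t2 * (EULER_GAMMA + math.log(LN2)))
    g = math.gamma(1.0 + k)
    return k * t2 / (t2 * (g - LN2 ** k) + g * (1.0 - 2.0 ** (-k)))


def gev_shape_and_scale(t2, t3):
    """Estimate (k, beta) from L-CV t2 and L-skewness t3."""
    c = 2.0 / (3.0 + t3) - LN2 / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c
    return k, gev_scale(t2, k)


def growth_factor(T, k, beta):
    """Growth factor X_T for return period T (index flood = median)."""
    y = math.log(T / (T - 1.0))  # equals -ln(1 - 1/T)
    if abs(k) < 1e-8:  # EV1 limit
        return 1.0 + beta * (math.log(LN2) - math.log(y))
    return 1.0 + (beta / k) * (LN2 ** k - y ** k)
```

Because the median is the index flood, `growth_factor(2, k, beta)` is 1 for any parameter values, which provides a quick sanity check on an implementation.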

Formation of pooling groups using Region Of Influence (ROI) approach
The Region of Influence (ROI) approach to forming a pooling group is considered to be the most appropriate and meaningful way of delineating a pooling group. The technique, developed by Burn (1990), involves the identification of a region of influence, i.e. a separate pooling group for each gauging station in a region. The identification of a pooling group consists of selecting stations that are hydrologically similar to the site of interest. Similarity is generally measured by a Euclidean distance measure in catchment descriptor space.
The effective identification of a pooling group in a ROI approach is governed by two important criteria: the choice of appropriate site descriptors as pooling variables and the size of a group in terms of number of sites and station years included. Burn (1990) investigated a number of options to determine a threshold value based on the d ij values to define a cut-off for the inclusion of stations in the ROI method for a target site. However, a more practical way of choosing an appropriate size of a pooling group was presented by FEH (1999). They investigated a range of pooling group sizes and decided on adoption of the 5T rule, namely that the total number of station years of data to be included when estimating the T year flood should be at least 5T . The adoption of such a rule was a compromise. If too few stations are included the precision of the Q T estimate is sacrificed whereas if far too many stations are included then the assumption of homogeneity may be compromised. Hosking and Wallis (1997) however show that a small departure from homogeneity can be tolerated so that having too few stations included may be less desirable than having slightly too many. They also suggested not to use more than 20 sites in a group as little gain in the accuracy of quantile estimates is obtained by using more than about 20 sites in a group. Recently, Kjeldsen and Jones (2009) found that a fixed pooling group consisting of 500 station years performed well for a range of return periods. In relation to identifying site descriptors as pooling variables, careful consideration is necessary as to which form of catchment descriptors are to be used in a ROI method of pooling analysis. In the next subsection an investigation of selecting pooling variable for the Irish case is described in detail.
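The station-year rule described above lends itself to a short sketch: candidate sites are ranked by their distance from the subject site and added until the accumulated record length reaches the target (5T, or a fixed value such as 500). The function name and the tuple layout of `candidates` are hypothetical choices made for illustration, and the treatment of the subject site itself (included here with distance 0 if it appears in the list) is an assumption.

```python
def select_pooling_group(candidates, return_period=100, min_station_years=None):
    """Pick the most similar sites until enough station years are pooled.

    candidates: iterable of (site_id, d_ij, record_length_years) tuples.
    The target defaults to the 5T rule; pass min_station_years to override it.
    """
    target = 5 * return_period if min_station_years is None else min_station_years
    group, station_years = [], 0
    # Walk through the sites from most to least similar.
    for site_id, _distance, n_years in sorted(candidates, key=lambda c: c[1]):
        if station_years >= target:
            break
        group.append(site_id)
        station_years += n_years
    return group, station_years
```

If the candidate list cannot supply the target number of station years, the sketch simply returns every site; a real application would need to decide how to handle that case.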

Choice of catchment descriptors on effectiveness of ROI distance measures
The general form of the similarity measure used for selecting members of a pooling group is defined by
$$d_{ij} = \sqrt{\sum_{k=1}^{n} W_k \left(X_{k,i} - X_{k,j}\right)^{2}} \tag{7}$$
where d ij is the weighted Euclidean distance from site j to site i; n is the number of attribute variables; X k,i is the value of the k-th variable at the i-th site and W k is the weight applied to attribute k, reflecting its relative importance. The subscript i denotes the subject site and the subscript j denotes the j -th pooled site.
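A weighted Euclidean distance of the kind described above can be computed as in the following sketch. The function name is hypothetical, and whether the descriptor values are standardised before differencing is not stated in the text, so the inputs are simply used as supplied.

```python
import math


def roi_distance(x_i, x_j, weights):
    """Weighted Euclidean distance between two sites in descriptor space.

    x_i, x_j: descriptor values for the subject site and a candidate site.
    weights:  one weight W_k per descriptor.
    """
    if not (len(x_i) == len(x_j) == len(weights)):
        raise ValueError("descriptor and weight vectors must have equal length")
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x_i, x_j)))
```

For example, with unit weights the call `roi_distance([0.0, 0.0], [3.0, 4.0], [1.0, 1.0])` returns 5.0; setting a weight to zero removes the corresponding descriptor from the measure.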
In choosing a distance measure d ij a decision has to be made about which catchment descriptors are to be included in the distance measure and what weightings are to be applied to them and whether logarithms or other transformations are to be used. The FEH (1999) provided a number of useful maxims for choosing a distance measure. It recommended not to use at-site flood statistics (e.g. CV, g) as pooling variables because this might well result in groups consisting of sites that have experienced similar floods in recent history. Neither could such site flood statistics be used for ungauged catchments. Seasonality of the flood response (e.g. timing and regularity of flood events) has also been considered (Burn, 1997;Cunderlik and Burn, 2006) as a similarity measure. Seasonality statistics are obtained from observed flood series. Therefore, a similarity measure based on these could not be used for ungauged sites, without additional assumptions.
For Irish conditions two sets of catchment descriptors have been selected as potential pooling variables:
- variables similar to those used in FEH, i.e. AREA (catchment area), SAAR (standard average annual rainfall), BFI (baseflow index) and FARL (index of flow attenuation by reservoir and lake);
- on the assumption that homogeneity is strongly dependent on CV or L-CV, those catchment descriptors that could predict L-CV best were identified and a selection of these was used to form d ij . This approach is along the lines outlined by Kjeldsen and Jones (2009).
For selecting the final set of pooling variables, FEH used the pooled uncertainty measure (PUM), which is a weighted average of the squared differences between each at-site growth factor and the pooled growth factor measured on a logarithmic scale. In this part of the study a simulation procedure is used for this purpose because far fewer stations (85) were available than the 602 stations used for the UK study. The first objective is to find which combinations of FEH descriptors, listed in Table 2, lead to pooling groups which are most effective at exploiting the information about the flood distribution contained in the pooling groups.
The simulation procedure uses the GEV distribution for data generation, which is considered to be representative of what is appropriate in Irish conditions. Hosking and Wallis (1997, p.93) suggested not to use the observed sample L-moment ratios as the population L-moment ratios of the simulated region because this would yield a simulated region that has much more heterogeneity than the actual data. Castellarin et al. (2001) addressed the issue by using a region of influence approach to estimate the at-site population values of t 2 and t 3 . A similarity measure based on at-site flood statistics is used to form a group of sites for a subject site and its population values of t 2 and t 3 are taken to be the corresponding pooled estimates of t 2 and t 3 for the group. Later, Gaál et al. (2008) adopted this approach in their study. A similar kind of approach is used here with a similarity measure, Eq. (8), which is independent of the descriptor variables being considered in Table 2. A pooling group is formed for each site using Eq. (8) and the pooled t 2 and t 3 are estimated using Eq. (2). The estimated pooled values of t 2 and t 3 are then used as population values for each site in step 2 of the simulation procedure. The simulation procedure does not consider the implications of intersite correlation among sites in a pooling group because it was found by Hosking and Wallis (1997, p.127) to be of very little consequence. The steps of the simulation procedure for selecting variables are described as follows.
1. The gauging stations in the subject site's pooling group are identified using the d ij values of Eq. (7) for a set of catchment descriptors having a minimum of 5T station years of data in the pooling group.
2. Random samples are drawn from GEV populations for the subject site and for each site in the pooling group. For each site the sample size is taken as being equal to the length of the observed historical record at the site and the parameters are estimated from the site t 2 and t 3 values obtained using the procedure described above, as in Castellarin et al. (2001) and Gaál et al. (2008).
3. The t 2 and t 3 values are obtained for each sample in the pooling group and the average of these is calculated to represent the pooled t 2 and t 3 values.
4. The pooled t 2 and t 3 values are then used to determine the pooling group's GEV growth curve parameters k and β using Eqs. (4) and (6).
5. The subject site's X T value is calculated for T = 50 and 100 years respectively using Eq. (3).
6. Steps 2 to 5 are repeated 10 000 times to provide 10 000 values of X T and the RMSE T and BIAS T are calculated for the subject site by the following equations: where X T i,s is the estimated T-year growth factor at site i at the s-th repetition; X T i is the assumed true T-year growth factor at site i; M is the number of sites in the pooling group and S is the number of repetitions.
RMSE T and BIAS T , as defined in the simulation procedure, have been evaluated at the 50 and 100-year return periods for each site. The eight combinations of the four variables listed in Table 2 have been tested, based primarily on RMSE T . In all, 85 stations have been considered for the study. The data sets that have been used in the study are summarized in Table 1. For each of these sites, a pooling group was selected from the 85 stations. Initially in the simulation procedure all weights W k in Eq. (7) were set to unity.
Figures 1 and 2 show, in box-plot form, the variation in the 100-year RMSE and BIAS values respectively for different sets of catchment descriptors used in Eq. (7). In Table 2, the corresponding mean variation of RMSE 100 and RMSE 50 values, for different sets of pooling variables, is summarised. It shows that the numerical measures of effectiveness vary by very little between rows. The set of two variables, lnAREA and lnSAAR, and the set of the single variable lnAREA performed best in terms of providing the lowest RMSE 100 values. In terms of RMSE 50 , the set consisting of lnAREA and lnSAAR comes second best to the set consisting of lnAREA on its own. Overall, the set of variables comprising lnAREA and lnSAAR may be considered the most suitable set of pooling variables for Irish conditions. However, if there is also a desire to incorporate another physical catchment effect then the BFI could be included with these two.
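Steps 2 to 6 of the simulation procedure can be sketched as a small Monte Carlo loop. This is a simplified illustration under several assumptions: the names are hypothetical, the pooling group is taken as given (step 1 is skipped), the GEV formulas are the standard median-scaled ones, and the error measures are computed as percentage relative RMSE and bias of the subject site's growth factor, which may differ in detail from the definitions used in the study.

```python
import math
import random

LN2 = math.log(2.0)
EULER_GAMMA = 0.5772156649015329


def l_ratios(data):
    """Sample (t2, t3) from unbiased probability weighted moments."""
    x = sorted(data)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(j * x[j] for j in range(n)) / (n * (n - 1))
    b2 = sum(j * (j - 1) * x[j] for j in range(n)) / (n * (n - 1) * (n - 2))
    l2 = 2.0 * b1 - b0
    return l2 / b0, (6.0 * b2 - 6.0 * b1 + b0) / l2


def gev_params(t2, t3):
    """GEV shape k and dimensionless scale beta from L-CV and L-skewness."""
    c = 2.0 / (3.0 + t3) - LN2 / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c
    if abs(k) < 1e-8:  # EV1 limit
        return k, t2 / (LN2 - t2 * (EULER_GAMMA + math.log(LN2)))
    g = math.gamma(1.0 + k)
    return k, k * t2 / (t2 * (g - LN2 ** k) + g * (1.0 - 2.0 ** (-k)))


def growth(y, k, beta):
    """Median-scaled GEV quantile at reduced variate y = -ln(F)."""
    if abs(k) < 1e-8:
        return 1.0 + beta * (math.log(LN2) - math.log(y))
    return 1.0 + (beta / k) * (LN2 ** k - y ** k)


def simulate_error(populations, record_lengths, subject=0, T=100, reps=500, seed=1):
    """Return (RMSE %, BIAS %) of the pooled T-year growth factor.

    populations: assumed population (t2, t3) for each site in the group.
    """
    rng = random.Random(seed)
    y_T = math.log(T / (T - 1.0))
    params = [gev_params(t2, t3) for t2, t3 in populations]
    true_x = growth(y_T, *params[subject])
    rel_errors = []
    for _ in range(reps):
        ratios = []
        for (k, beta), n in zip(params, record_lengths):
            # Step 2: synthetic record of the same length as the observed one.
            sample = [growth(-math.log(rng.uniform(1e-12, 1.0 - 1e-12)), k, beta)
                      for _ in range(n)]
            ratios.append(l_ratios(sample))  # step 3 (at-site ratios)
        t2_pool = sum(r[0] for r in ratios) / len(ratios)  # unweighted pooling
        t3_pool = sum(r[1] for r in ratios) / len(ratios)
        est = growth(y_T, *gev_params(t2_pool, t3_pool))  # steps 4 and 5
        rel_errors.append((est - true_x) / true_x)
    rmse = 100.0 * math.sqrt(sum(e * e for e in rel_errors) / reps)
    bias = 100.0 * sum(rel_errors) / reps
    return rmse, bias
```

A call such as `simulate_error([(0.2, 0.17)] * 5, [30] * 5, reps=200)` runs the loop for a hypothetical homogeneous group of five sites with 30-year records; the study itself used 10 000 repetitions and the observed record lengths.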
While inclusion of just one or two catchment descriptors may indeed be best, there is an intuitive attraction in also representing some descriptor of catchment response even at the cost of a small apparent loss in effectiveness. This could be of relevance in engineering investigations where differences in catchment behaviour are considered of importance by the investigator. An extension to this investigation with varying values of weights W k in Eq. (7) was also done, particularly for the set of variables of lnAREA, lnSAAR and BFI but the results of all variations examined are not reported in detail here. An automatic search procedure was not used but it was found, by trial and error, that the weights 1.5, 1.0 and 0.1 for lnAREA, lnSAAR and BFI respectively gave RMSE 100 = 15.22 and RMSE 50 = 12.81 which offer small improvements on the W k = 1.0 values used in the calculations for the set of variables of lnAREA, lnSAAR and BFI. The trial and error approach involved assigning a selection of weights, varying from 0 to 3, to each of the quantities, i.e. lnAREA, lnSAAR and BFI. In the second approach a set of catchment descriptors were identified through the use of regression models of L-CV on the catchment descriptors. These descriptors were then also used as potential pooling variables. In the search for a best regression model both log-transformed and nontransformed variants of the catchment descriptors and L-CV were used. The best regression model for L-CV containing three catchment descriptors was found to be based on MSL, FORMWET and ARTDRAIN, where MSL is the mean stream length, FORMWET is a form of catchment wetness index analogous to PROPWET in FEH (1999) and ARTDRAIN is an arterial drainage index which is defined as % of catchment area affected by arterial drainage improvements. These descriptors were identified from a pool of twenty five catchment descriptors made available by the Irish Office of Public Works. 
The R 2 value of the best available model is a modest 29%. These identified catchment descriptors were also assessed by the above simulation procedure. The RMSE T values for T = 50, 100 are listed in Table 3 for six combinations of the three variables. The set of two variables, lnMSL and ARTDRAIN, and the set of the single variable lnMSL performed best in terms of providing the lowest RMSE 100 % values.
Both approaches described above provide similar outcomes in terms of RMSE 100 %. This may be partly due to the relatively weak relations identified for predicting L-CV (R 2 = 0.29). A regression of L-CV on the other set's catchment descriptors (AREA, SAAR, BFI, FARL) also yields a weak relation for predicting L-CV (R 2 = 0.21). Since both sets of catchment descriptors can predict L-CV only in a weak manner, and both approaches are similar in RMSE it is concluded that neither approach is clearly superior to the other.

Procedure for examination of homogeneity
A homogeneity test is used to assess whether a proposed group of sites is homogeneous or not. A homogeneous group of sites leads to a reduction in the error of quantile estimators relative to estimators based on single at-site data series alone, which is the main goal of a regional flood frequency analysis. A homogeneity test was introduced by Dalrymple (1960). Other tests were introduced by Wiltshire (1986), Lu and Stedinger (1992), Fill and Stedinger (1995) and Hosking and Wallis (1993, 1997).
A simulation procedure, using graphical presentation of key results, is applied in this study to examine homogeneity of pooling groups that were formed using the ROI technique. GEV distributions with 3 different shape parameter values (k = −0.05, k = 0.0 (EV1), k = +0.03) are used in the simulation procedure. The GEV, and its special case the EV1, have a history of usage in Ireland since publication of the Flood Studies Report (FSR, 1975, p.173-174, Table 2.38, Fig. 2.14, Vol. I). More recently, a national study sponsored by the Office of Public Works, Dublin, based on annual maximum flood data of 110 stations, with average length of record 37 years and with a quarter of them between 50 and 55 years, has indicated that GEV and EV1 distributions are suitable parents for the majority of Irish flood series (Das, 2010, Ch. 3). This conclusion is based on visual examination of probability plots and numerical scores assigned to them, on classical goodness of fit tests and on L-Moment Ratio diagrams, such as Fig. 10, which shows that GEV/EV1 looks more suitable as a parent than the other 3-parameter distributions tested, such as Generalised logistic and Lognormal 3. While the 4 parameter Kappa distribution has been recommended by Hosking and Wallis (1997) as a parent for simulation studies, this choice was sometimes found to be problematical because of numerical difficulties and estimation failures during the parameter estimation process and as a result GEV was selected as parent distribution in this study.
The steps of the simulation procedure are as follows:
1. The gauging stations in the subject site's pooling group are identified using d ij values obtained from Eq. (7) with the weights 1.5, 1.0 and 0.1 for lnAREA, lnSAAR and BFI respectively, as reported in Sect. 3 above. The pooling group has a minimum of 500 station years of data, which satisfies the 5T rule for the 100 year quantile.
2. The t 2 is obtained for each site in the pooling group and the average, without weights, of these is calculated to represent the pooled average t 2 (t R 2 ).
3. Random samples are drawn from GEV distributions with 3 different shape parameter values (k = −0.05, k = 0.0 (EV1), k = +0.03) using the t R 2 as the population value to construct a 95% confidence interval for t R 2 . These population shape parameters, k = −0.05, k = 0.0 (EV1) and k = +0.03, correspond to L-skewness ≈ 0.21, 0.17 and 0.15 respectively, this being the range relevant for Ireland. The sample size is taken as being equal to the average record length of the observed historical record at the gauging sites and the parameter values are estimated from the value of the t R 2 . The 95% confidence interval is constructed assuming that the sample t R 2 values are normally distributed. While the L-CV values may not be perfectly normally distributed, the results of Viglione (2010) show that the departure from normality is not severe for the range of L-CV and L-skewness values that are observed in Irish conditions. Hence the normality assumption was made in the calculation of confidence intervals.
4. The number of stations in the selected pooling group whose t 2 values fall outside the confidence interval (the attribute termed here as m) is counted and reported. It is also noted whether the t 2 of the subject site is outside the confidence limits (CL).
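Steps 2 to 4 above can be sketched as follows. The sketch is illustrative only: the function names are hypothetical, the GEV sampling uses the standard median-scaled quantile function with the k → 0 limit for EV1, and the normal-theory limits are taken as the pooled t 2 plus or minus 1.96 simulated standard deviations, which is one plausible reading of the description rather than a confirmed detail of the study.

```python
import math
import random
import statistics

LN2 = math.log(2.0)
EULER_GAMMA = 0.5772156649015329


def gev_scale(t2, k):
    """Dimensionless GEV scale beta for a given L-CV and shape k."""
    if abs(k) < 1e-8:  # EV1 limit
        return t2 / (LN2 - t2 * (EULER_GAMMA + math.log(LN2)))
    g = math.gamma(1.0 + k)
    return k * t2 / (t2 * (g - LN2 ** k) + g * (1.0 - 2.0 ** (-k)))


def sample_t2(data):
    """Sample L-CV from unbiased probability weighted moments."""
    x = sorted(data)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(j * x[j] for j in range(n)) / (n * (n - 1))
    return (2.0 * b1 - b0) / b0


def t2_confidence_limits(t2_pooled, k, n, reps=2000, seed=0):
    """Approximate 95% limits for a site's t2 under a homogeneous GEV parent."""
    rng = random.Random(seed)
    beta = gev_scale(t2_pooled, k)
    sims = []
    for _ in range(reps):
        sample = []
        for _ in range(n):
            y = -math.log(rng.uniform(1e-12, 1.0 - 1e-12))
            if abs(k) < 1e-8:
                sample.append(1.0 + beta * (math.log(LN2) - math.log(y)))
            else:
                sample.append(1.0 + (beta / k) * (LN2 ** k - y ** k))
        sims.append(sample_t2(sample))
    sd = statistics.stdev(sims)  # normality assumed for the interval
    return t2_pooled - 1.96 * sd, t2_pooled + 1.96 * sd


def count_outside(site_t2, limits):
    """Number of group members (m) whose t2 lies outside the limits."""
    lower, upper = limits
    return sum(1 for t in site_t2 if t < lower or t > upper)
```

In use, the pooled value would be the unweighted mean of the at-site t 2 values, `n` the average record length of the group, and `count_outside` would be applied to the same at-site values to obtain m.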

Analysis
The procedure described above is applied for each of the 85 stations. Each station had its own unique pooling group.
The sample values of t 2 for the stations in the group, t R 2 and the CL about t R 2 are displayed in Fig. 3 for five stations. The summary statistics of the procedure are given in tabular form in Table 4. In addition, the heterogeneity measures H1 and H2, described in Appendix A, are calculated for each group and a summary of these measures is reported in Table 5.
The following observations and findings are obtained from the analysis.
1. Table 4 lists how many stations fall into each category of m: no t 2 values outside the CL, one value outside the CL, 2 values outside the CL, 3 values outside the CL or more than 3 outside the CL. In all, for the case of EV1, only one station (1%) was in the first category while 52% of stations were in the last category. The values for GEV (k = −0.05) and GEV (k = +0.03) are broadly similar.
2. From Table 4, it is seen that as the shape parameter increases from k = −0.05 to +0.03 the number of cases where m > 3 increases from 33 to 47.
3. In 27 groups (32% of groups) the t 2 of the subject site was outside the CL for the case of EV1. The corresponding numbers for the case of negative shaped GEV and for the case of positive shaped GEV are 27 and 28 respectively. All the 27 stations of the EV1 case were also in the latter cases.
4. Table 5 summarises the results of H1 and H2 for the 85 pooling groups. 22% of groups have an H1 value lower than 4.0. The percentage increases to 86% when the same criterion is set for H2, which is very similar to what was found for the UK pooling groups (FEH, 1999, p. 176).
5. The range of t 2 values, max t 2 − min t 2 , was calculated for the 85 pooling groups. The average range of t 2 for the 85 pooling groups was 0.11. Figure 4 shows a plot of H1 values against the ranges of L-CV values for the 85 groups. The plot shows an upward trend, implying that a high H1 value can be expected when the t 2 values in a pooling group have a large range, which can be expected in the absence of homogeneity. A similar plot is drawn for H2 in Fig. 5, showing no obvious trend, implying that a low H2 value may be obtained for a pooling group which is in fact a heterogeneous group.
6. Figure 6 shows a plot of H1 against m. Different values of H1 occur for a particular m value and that is reasonable as the memberships of the groups in those cases are different even though they may have some overlap. However, the average values, marked by triangles in the plot, show an increase of H1 with m, i.e. the higher the number of t 2 values of group members outside the CL, the higher the value of H1 that can be expected. If an H1 value less than 4.0 is considered as a good criterion for testing homogeneity, then in this approach it is required that fewer than m = 2 values fall outside the confidence limit, i.e. m/N ≤ 0.15.
7. Figure 7 shows a plot of H1 against d ij,max of the pooling groups. The d ij,max is defined here as the distance associated with the group member which just qualified as a member of the pooling group. The plot shows an upward trend to some extent, implying that a low H1 value can be expected for a low d ij,max value, which is an implicit assumption of a ROI pooling scheme. However, in many cases low d ij,max values, even those below 1.0, can lead to a high value of H1, suggesting that the assumption may not always be true, particularly for Irish conditions. A similar plot is drawn in Fig. 8 between d ij,max and m. The plot leads to a similar conclusion to that for Fig. 7. While a low value of d ij,max is desirable, it is noted that even low values of d ij,max can occur where a significant number of group members' t 2 values fall outside the CL.

Investigation of selected heterogeneous pooling groups
The investigation has been carried out on those 27 cases where the pooling groups are heterogeneous and in which the t 2 of the subject site lies outside the confidence limits. The investigation mainly focuses on identifying any inappropriateness among group members that would cause the pooling groups to be heterogeneous. In this context, FEH (1999, 3, Fig. 16.9) documented a detailed review system, providing an example. That system mainly considers two attributes: (1) whether the subject site has any special qualities that need to be taken into account and (2) whether any of the pooled sites has catchment descriptors that are particularly different from those of the subject site. Sites in the pooling group can be investigated using several characteristics including at-site flood statistics and catchment descriptors. Statistics in a pooling group such as the discordancy measure (Hosking and Wallis, 1997) and the distance measure (d ij ) can also be used to investigate sites in the pooling group. In this part of the study, four catchment descriptors, namely AREA, SAAR, BFI and FARL, and the distance measure (d ij ) are taken into account in the investigation process. The first three of the catchment descriptors, AREA, SAAR and BFI, were already used for initial selection of sites for a pooling group. In the investigation procedure, sites are reviewed with the help of Box-plots and a summary table and, in some cases, with the help of the 'examination of homogeneity' chart described in Sect. 4. Four Box-plots of catchment descriptors, namely AREA, SAAR, BFI and FARL, are constructed to show the subject site in the context of the pooling group. For each of these catchment descriptors, the placement of numerical values for sites in the pooling group is displayed against a backdrop of the relative frequency of the 85 sites considered in this study. This facilitates the identification of any particularly inappropriate sites.
In the summary table, statistical properties such as t 2 , t 3 and d ij values of sites in a pooling group are listed as shown in Fig. 9. The investigation procedure for the pooling group of station no 6031 is described in detail as an example.

An example: station no 6031 on the River Flurry
There are 17 sites in the pooling group, of which eight, including the subject site, have t_2 values that fall outside the CL, indicating a strongly heterogeneous group. The heterogeneity measures H1 and H2 for the group are 7.66 and 2.82 respectively. Examination of the Box-plots in Fig. 9 reveals that the catchment area of the subject site is small (46.2 km²) and lies very near the 5 percentile mark on the Box-plot of AREA. The site is not positioned at the centre of the group of gauged catchments in the pooling group: on the AREA axis there are 5 sites to the left of the subject site and as many as 11 to the right. The group therefore includes some sites whose catchment areas are large compared with that of the subject site. This may lead to d_ij values exceeding 1.0 in several cases. The d_ij values for the last three sites are around 1.3, and these sites are among the seven other sites that fall outside the CL. The summary table on the right-hand side of Fig. 9 shows that the subject site has large values of both t_2 and t_3 and that these are the largest among the group members. Hence it can be concluded that the pooling group in its present structure may not be ideal for subject site 6031. Leaving out some of the sites at the bottom of the table might be considered in this context. The large number of sites, 17, in the pooling group is also a possible contributor to heterogeneity.
The remaining 26 of the 27 heterogeneous pooling groups were also investigated. In 8 cases the heterogeneity was due to an exceptionally large or small catchment area relative to the other group members. Likewise, 3 cases were caused by exceptionally large or small SAAR values and 3 cases by exceptionally large or small BFI values. A further 5 cases were caused by extremely low FARL values relative to the other pooling group members. In 7 cases there was no obvious single cause of heterogeneity.
Fig. 9. Four Box-plots and a summary table for investigating a pooling group. The subject site is marked with a ×. Small dots denote sites included in the pooling group. The underlying distribution of each catchment descriptor is shown in the Box-plots. Each Box-plot gives the minimum and the maximum value (+) and the percentiles for the frequencies 0.05, 0.25, 0.5, 0.75 and 0.95. The summary table lists record length, t_2, t_3 and d_ij values for a 100-yr pooling group for subject station 6031.
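To make the role of d_ij in the example concrete, the following is a minimal sketch of a distance measure in catchment descriptor space and of the selection of a pooling group by increasing distance. It rests on assumptions that should be checked against Eq. (1): the descriptors are taken to be lnAREA, lnSAAR and BFI, each standardised by its standard deviation over all sites and given equal weight. The function names and the descriptor values are hypothetical; only the 500 station-year target comes from the paper.

```python
import numpy as np

def distance_measure(descriptors, subject_index):
    """Euclidean distance from the subject site to every site.

    descriptors: array of shape (n_sites, n_descriptors), assumed here to hold
    lnAREA, lnSAAR and BFI. Each column is divided by its sample standard
    deviation so that the descriptors contribute on a comparable scale
    (equal weights are an assumption of this sketch).
    """
    x = np.asarray(descriptors, dtype=float)
    z = x / x.std(axis=0, ddof=1)
    diff = z - z[subject_index]
    return np.sqrt((diff ** 2).sum(axis=1))

def select_pooling_group(d, record_lengths, target_years=500):
    """Add sites in order of increasing d until the station-year target is met."""
    order = np.argsort(d, kind="stable")
    group, total = [], 0
    for idx in order:
        group.append(int(idx))
        total += int(record_lengths[idx])
        if total >= target_years:
            break
    return group, total

# Hypothetical descriptors for five sites: AREA (km^2), SAAR (mm), BFI.
area = np.array([46.2, 120.0, 310.0, 75.0, 980.0])
saar = np.array([950.0, 1100.0, 1250.0, 900.0, 1400.0])
bfi = np.array([0.45, 0.52, 0.61, 0.40, 0.70])
descriptors = np.column_stack([np.log(area), np.log(saar), bfi])

d = distance_measure(descriptors, subject_index=0)
group, years = select_pooling_group(d, [30, 42, 51, 38, 47], target_years=100)
print("d_ij:", np.round(d, 2))
print("pooling group (site indices):", group, "station-years:", years)
```

In a listing of this kind the subject site always comes first with d_ij = 0, and sites near the bottom of the ordered list, such as those with d_ij around 1.3 in the example, are the natural candidates to review first.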

Conclusions
In the context of the ROI pooling-group-based flood frequency estimation procedure, the most suitable form of the distance measure d_ij for Irish conditions was sought. The ROI method with the distance measure so identified, Eq. (1), was used to form pooling groups for the subject sites. A simple graphical approach to examining the homogeneity of the pooling groups was presented. It compares the sampling variability of the pooled estimate of L-CV with the L-CV values of the pooling group members. It also allows the L-CV of the subject site to be viewed in the context of the pooling group members, which is important for a site-specific pooling group. Most of the Irish pooling groups exhibited a degree of heterogeneity among their members. A graphical approach to reviewing a heterogeneous pooling group was also presented. The following conclusions were obtained from the above studies:
1. It was found that the distance measure d_ij could be satisfactorily defined in terms of lnAREA and lnSAAR, but if there is a desire to incorporate another physical catchment effect then BFI could be included with these two. The d_ij can also be defined in terms of lnMSL and ARTDRAIN.
2. A visual approach for identifying the homogeneity of ROI pooling groups has been presented, and its results were compared with the heterogeneity measures H1 and H2 obtained for those groups. Overall, the results show that even with a carefully considered ROI procedure, such as one using the distance measure of Eq. (1), there is no certainty that perfectly homogeneous pooling groups will be identified. As a compromise, it is recommended that a group with more than 2 values of L-CV outside the 95% confidence limits of that variable, i.e. m/N > 0.15, should not be considered homogeneous.
3. A thorough investigation of the 27 heterogeneous pooling groups has been carried out. In many cases, special attributes of the subject site, such as extremely large or small values of AREA, SAAR or BFI, or exceptionally low values of FARL, contributed to the observed heterogeneity of the pooling groups. It is deemed necessary that the subject site be positioned near the centre, on the respective catchment descriptor axes, of the group of gauging sites to which it is hydrologically similar; in some cases, however, fulfilling that condition does not guarantee that the pooling group is homogeneous.
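The compromise rule in conclusion 2 reduces to a simple ratio test, sketched below. Here m is read as the number of group members whose L-CV lies outside the 95% confidence limits and N as the number of sites in the group; that reading, and the function names, are assumptions of this sketch. The numbers in the usage example (8 of 17 sites outside the limits) are those quoted for station 6031.

```python
def count_outside(t2_values, lower, upper):
    """Count group members whose L-CV (t_2) lies outside the confidence limits.

    Returns (m, N): the number of sites outside [lower, upper] and the group size.
    """
    m = sum(1 for t in t2_values if t < lower or t > upper)
    return m, len(t2_values)

def fails_homogeneity_rule(m, n, threshold=0.15):
    """Apply the recommended rule: do not treat the group as homogeneous if m/N > 0.15."""
    return m / n > threshold

# Station 6031: 8 of the 17 group members fall outside the confidence limits.
m, n = 8, 17
print(f"m/N = {m / n:.2f}")
print("not homogeneous:", fails_homogeneity_rule(m, n))
```

For this group m/N is about 0.47, well above 0.15, which agrees with the strongly heterogeneous verdict given by H1 and H2.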