Extending global irrigation maps – going beyond statistics

Agriculture is the largest global consumer of water. Irrigated areas contribute 40% of agricultural production. Information on their spatial distribution is highly relevant for regional water management and food security, and for policy and decision makers who are facing the transition towards a more efficient and sustainable agriculture. However, existing global datasets are coarse and the mapping of irrigated areas still represents a challenge for land classifications. The following study extends existing irrigation maps that are based on statistics. The approach processes and analyses spatial data incrementally in a multiple decision tree by using multi-temporal NDVI SPOT-VGT data and agricultural suitability data, both at a spatial resolution of 30 arc seconds. It covers the period from 1999 to 2012. The result exceeds the resolution of existing global studies and is not restricted to official reports made by surveys. Irrigated areas that were not yet included in the reports could be identified. Globally, the results show 22% more irrigated areas than existing approaches and statistical data. The largest differences compared to existing data are found in Asia, particularly in China and India. The additional areas are mainly identified within already known irrigated regions, where irrigation is denser than previously estimated.


Introduction
One of the major challenges for the 21st century will be the nourishment of the rising world population (Foley et al., 2011).
Increasing meat consumption and the growing use of biofuel and bio-based materials lead to estimates that global agricultural production will double by 2050 (Alexandratos and Bruinsma, 2012; Godfray et al., 2010; Tilman et al., 2011). Separated by sector, agriculture is the largest consumer of water. 69% of the global water withdrawal from rivers, lakes and groundwater (blue water) is used for agriculture; in some regions, such as South Asia or the Middle East, the share can exceed 90% (FAO, 2014b). The regional limitation of fresh water availability plays a crucial role for global agricultural production, considering that 40% of the global yields are harvested on irrigated fields (FAO, 2014a). Irrigated areas almost doubled over the last 50 years and contribute to 20% of the global harvested area today (FAO, 2016b).
A future expansion of irrigated area and a related increase in water consumption is expected (Neumann et al., 2011). Due to climate change, agricultural water availability is expected to decrease in some regions (Strzepek and Boehlert, 2010). The low water-use efficiency of common irrigation techniques such as sprinkler and flood irrigation (Evans and Sadler, 2008), the unsustainable use of limited sources like groundwater (Wada et al., 2014), the changing river regimes (Döll and Schmied, 2012) and the changing supply by snow melt (Justin et al., 2015; Prasch et al., 2013) underline the need for a transition towards a more sustainable and efficient use of water. The SDGs clearly reflect this need in achieving food security and a sustainable development of land use (UNO, 2016). For a better inventory and investigation of global and regional water cycles, and as input for crop models, detailed global information on irrigated areas at a high resolution is needed.
Hydrol. Earth Syst. Sci. Discuss., doi:10.5194/hess-2017-156, 2017. Manuscript under review for journal Hydrol. Earth Syst. Sci. Discussion started: 5 April 2017. © Author(s) 2017. CC-BY 3.0 License.
Attempts to identify irrigated areas that do not rely on surveys and are independent from statistics already exist (Ozdogan et al., 2010). Remote sensing can be an alternative approach for mapping irrigated areas. Previous studies showed that remote sensing data can be used to detect irrigated areas for small and medium scaled analyses (Abuzar et al., 2015; Jin et al., 2016; Ozdogan and Gutman, 2008). Vegetation indices (Ozdogan and Gutman, 2008) or climate elements, such as evapotranspiration (Abuzar et al., 2015), derived from satellite information and combined with meteorological data were used to determine irrigated area. Ozdogan et al. (2010) summarised different approaches for mapping irrigated areas from local to global scale.
There are only a few studies which identify irrigated areas globally (Salmon et al., 2015; Siebert et al., 2005; Thenkabail et al., 2009a). Land use classification data sets often neglect irrigated area. Some classify irrigated area as a separate class (ESA, 2015; USGS, 2000), but do not focus on irrigated areas.
A common approach to the specific mapping of irrigated area, such as provided by the Global Map of Irrigation Areas (GMIA) (Siebert et al., 2005), distributes statistical data of national and subnational surveys like AQUASTAT (FAO, 2016a) to agricultural and other areas from land use classifications. However, approaches that are restricted to statistics are hard to verify, since statistics may include errors. For instance, in some countries in West Africa the informal irrigated areas in urban and peri-urban areas are twice the size of the official irrigated areas for the whole country (Drechsel et al., 2006). This informal irrigation may increase due to economic growth and a dietary shift from staple crops towards more vegetables and fruits (Molden, 2007).
Already 15 years ago the official FAO statistics drew criticism after national statistics were compared with remote-sensing-based data (Vorosmarty and Sahagian, 2000). The study of Thenkabail et al. (2009a) globally identified 43% more irrigated areas than reported in official FAO statistics. The discrepancies between those data were explained by the politicized nature of the FAO data reports and different definitions of irrigated area (Vörösmarty, 2002). The Global Irrigated Area Map (GIAM) of Thenkabail et al. (2009a) is a pure remote sensing dataset using multiple satellite sensors. It is validated using ground truth data and Google Earth images. Thenkabail et al. (2009a) showed that the global irrigated areas might be underestimated by the official statistics. The latest attempt to map global irrigated areas was developed by Salmon et al. (2015). They combine statistics with climate and remote sensing data. The study also shows an underestimation by the national and subnational statistics, although a small one. Thenkabail et al. (2009b) conclude that 'both remote sensing and national statistical approaches require further refinement'.
The aim of this study is to develop a new global approach that is not restricted to irrigated areas known by official reports and allows for extending these predetermined areas. We do this by combining statistical data from official reports with multi-temporal remote sensing data and data on agricultural suitability, following a decision tree to determine irrigated areas. We compare our results with existing approaches and ground truth data and investigate regional differences to statistics.

Data and Method
The basic idea of the approach is to combine different datasets with different information. The applied datasets have different spatial resolutions (Tab. 1), which are homogenized in a first step. We decided on a high spatial resolution of 30 arc seconds (approx. 1 km² at the equator), since the demand for high resolution global data is increasing in different applications (Deryng et al., 2016; Jägermeyr et al., 2015; Liu et al., 2007; Mauser et al., 2015; Rosenzweig et al., 2014) and the pixel size of approximately 1 km² is already close to large fields (depending on the region) or an agglomeration of smaller irrigated fields. Therefore, the resulting data at 30 arc seconds only distinguish between irrigated and rainfed and do not contain percentage shares. We assume that irrigated areas are well covered at a spatial resolution of 30 arc seconds, since irrigation is usually not practiced on a single field, due to high investment and installation costs of irrigation systems.
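The relation between a 30 arc second grid cell and its ground area can be illustrated with a short calculation. The following is a minimal sketch, assuming a spherical Earth with a mean radius of 6371 km; the function name `pixel_area_km2` is hypothetical and not part of the study.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth with mean radius


def pixel_area_km2(lat_south_deg, cell_size_arcsec=30.0):
    """Area of a lat/lon grid cell whose southern edge lies at lat_south_deg."""
    cell_deg = cell_size_arcsec / 3600.0
    lat1 = math.radians(lat_south_deg)
    lat2 = math.radians(lat_south_deg + cell_deg)
    dlon = math.radians(cell_deg)
    # Area of a spherical quadrangle: R^2 * dlon * (sin(lat2) - sin(lat1))
    return EARTH_RADIUS_KM ** 2 * dlon * (math.sin(lat2) - math.sin(lat1))


for lat in (0.0, 30.0, 60.0):
    print(f"{lat:4.0f} deg: {pixel_area_km2(lat):.3f} km^2")
```

Under these assumptions a cell at the equator covers somewhat less than 1 km², and the area shrinks roughly with the cosine of latitude, which matters when pixel counts are converted to irrigated area.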

2.1 The downscaling of the statistically based dataset
Siebert et al. (2005) distribute statistical data to the Global Map of Irrigated Areas (GMIA). The dataset has a resolution of 5 arc minutes and is available in several versions; we applied version 5.0 (Siebert et al., 2013). To combine the different data sets into a final irrigation map at a resolution of 30 arc seconds, the resolution of GMIA has to be increased. For the downscaling process we use global bimonthly maximum MERIS NDVI data (ESA, 2007) at a spatial resolution of 10 arc seconds and calculate the yearly maximum NDVI (Fig. 1). After aggregating the yearly maximum NDVI to 30 arc seconds, the GMIA data are distributed to the areas with the highest NDVI within the corresponding coarse pixel. To avoid distributions to dense woodlands (closed tree cover > 40%), cities and open water, these areas are excluded from the distribution, based on the ESA-CCI-LC dataset (ESA, 2015). Pixels with a percentage share of irrigated area below 1% are not considered. The downscaled dataset of Siebert et al. (2013) shows the irrigated area at a high spatial resolution of 30 arc seconds and will in the next steps be extended by irrigated areas which are not part of the statistics yet. In the following, the downscaled dataset of Siebert et al. (2013) is referred to as "downscaled GMIA".
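The distribution step described above can be sketched for a single coarse pixel. This is an illustrative reading of the text, not the authors' implementation: the function name `downscale_coarse_pixel`, the 10 x 10 block layout (5 arc minutes divided by 30 arc seconds) and the synthetic inputs are assumptions.

```python
import numpy as np


def downscale_coarse_pixel(irrigated_share, ndvi_max, excluded):
    """Distribute the irrigated share of one coarse pixel to its fine pixels.

    irrigated_share: fraction (0..1) of the coarse pixel that is irrigated
    ndvi_max: 2-D array of yearly maximum NDVI for the fine pixels
    excluded: boolean 2-D array, True for dense woodland, urban or water pixels
    Returns a boolean 2-D array marking fine pixels as irrigated.
    """
    irrigated = np.zeros(ndvi_max.shape, dtype=bool)
    if irrigated_share < 0.01:  # shares below 1% are not considered
        return irrigated
    n_target = int(round(irrigated_share * ndvi_max.size))
    candidates = np.flatnonzero(~excluded.ravel())
    # Rank the allowed fine pixels by NDVI, highest first
    order = candidates[np.argsort(ndvi_max.ravel()[candidates])[::-1]]
    # If fewer candidates than n_target exist, the remainder is not allocated
    irrigated.flat[order[:n_target]] = True
    return irrigated


rng = np.random.default_rng(0)
ndvi = rng.random((10, 10))               # synthetic yearly maximum NDVI
excluded = np.zeros((10, 10), dtype=bool)
excluded[0, :] = True                     # pretend the first row is forest or water
result = downscale_coarse_pixel(0.30, ndvi, excluded)
print(result.sum(), "of", result.size, "fine pixels marked as irrigated")
```

In this sketch, area is lost in the same three ways the text mentions: shares below 1% are skipped, the share is rounded to whole fine pixels, and excluded land can leave too few candidate pixels.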

Remote sensing data
For the detection of the actual active vegetation we used the NDVI product of ESA-CCI (ESA, 2015). The data provide 7-daily NDVI means and cover the time period from 1999 to 2012. From these data, we calculated the annual course of NDVI, averaged over the time period. From this we derived the number of NDVI peaks. In order to increase the precision of detecting active vegetation, each pixel is analysed according to a number of criteria that need to be fulfilled (Fig. 2). The chosen criteria are robust given that we used 7-daily NDVI means averaged over 14 years.
The minimum NDVI has to be below 0.4, while the maximum NDVI has to be over 0.4. Since the NDVI product is a 7-daily mean over 14 years, it is very likely that fields lie fallow within the time period, resulting in lower mean values. Therefore, an NDVI of 0.4 turned out to be a suitable threshold. This guarantees a clear distinction between non-vegetated and vegetated pixels and eliminates evergreen vegetation, such as forests and pasture. Minimum and maximum NDVI must differ by at least 0.2 to identify only pixels with a dynamic annual course, as is assumed for agricultural areas. NDVI peaks must be at least 12 weeks apart to assign a peak to a specific growing period, assuming that the length of a growing period is at least 12 weeks (Sys et al., 1993). Additionally, this allows for separating multiple growing periods within a year. Often, a slight greening right after harvest was observed. This can be explained, for example, by the seeding of legumes for soil treatment or the development of natural vegetation after harvest, which results in an increase of NDVI. In order to avoid classifying such secondary peaks as a regular harvest, it turned out that two consecutive peaks must not differ by more than 25%. The described criteria of minimum, maximum and yearly course of NDVI and the length of the growing period turned out to be robust for determining the number of crop cycles globally.
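The NDVI criteria above can be sketched as a small peak-counting routine. The thresholds are taken from the text; everything else is an assumption made for illustration: the function name `count_crop_cycles`, the simple local-maximum search without wrap-around at the year boundary, and the reading of the 25% rule as "a peak must reach at least 75% of the highest NDVI value".

```python
import numpy as np

NDVI_THRESHOLD = 0.4        # minimum must lie below, maximum above (from the text)
MIN_AMPLITUDE = 0.2         # required difference between maximum and minimum
MIN_PEAK_DISTANCE = 12      # weeks between two peaks
MAX_PEAK_DIFFERENCE = 0.25  # peaks may differ by at most 25%


def count_crop_cycles(ndvi):
    """Count NDVI peaks in a mean annual course of weekly values."""
    ndvi = np.asarray(ndvi, dtype=float)
    lo, hi = ndvi.min(), ndvi.max()
    if not (lo < NDVI_THRESHOLD < hi and hi - lo >= MIN_AMPLITUDE):
        return 0  # no dynamic vegetation course (e.g. bare soil or evergreen)
    # Interior local maxima (assumption: no wrap-around at the year boundary)
    idx = [i for i in range(1, len(ndvi) - 1)
           if ndvi[i] > ndvi[i - 1] and ndvi[i] >= ndvi[i + 1]]
    accepted = []
    # Visit candidates from highest to lowest and keep those passing both checks
    for i in sorted(idx, key=lambda k: ndvi[k], reverse=True):
        far_enough = all(abs(i - j) >= MIN_PEAK_DISTANCE for j in accepted)
        similar = ndvi[i] >= (1.0 - MAX_PEAK_DIFFERENCE) * hi
        if far_enough and similar:
            accepted.append(i)
    return len(accepted)


weeks = np.arange(52)
single = 0.2 + 0.5 * np.exp(-((weeks - 26) / 6.0) ** 2)
double = (0.2 + 0.5 * np.exp(-((weeks - 14) / 4.0) ** 2)
          + 0.45 * np.exp(-((weeks - 38) / 4.0) ** 2))
regrowth = (0.2 + 0.5 * np.exp(-((weeks - 20) / 5.0) ** 2)
            + 0.1 * np.exp(-((weeks - 36) / 3.0) ** 2))
evergreen = 0.7 + 0.05 * np.sin(2 * np.pi * weeks / 52)
for name, course in [("single", single), ("double", double),
                     ("regrowth", regrowth), ("evergreen", evergreen)]:
    print(name, count_crop_cycles(course))
```

The synthetic "regrowth" course mimics the slight greening after harvest: its second bump is rejected by the 25% rule, so only one crop cycle is counted.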

Land use classification products
The extension of irrigation is restricted to agricultural areas. The information on agricultural areas is taken from the ESA-CCI-LC product (cropland rainfed, cropland irrigated, mosaic cropland > 50%) (ESA, 2015) and from its predecessor GlobCover (ESA, 2010) (post-flooding or irrigated croplands, rainfed croplands, mosaic cropland (50-70%)). According to the authors, the 'accuracy associated with the cropland and forest classes' is high 'and therefore a quite good result' (ESA, 2015). However, the distinction between cropland and pasture in particular is not completely clear. GlobCover more often allocates the class 'grassland' where the ESA-CCI product classifies 'cropland'. By using both datasets, we classify irrigation only on areas where we can be sure that the land is used as cropland and bypass the problem of pasture classified as cropland.
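Requiring agreement between both land cover products amounts to a logical AND of two class masks. The sketch below uses made-up integer class codes as placeholders for the class names quoted in the text; the actual legend values would have to be taken from the product documentation.

```python
import numpy as np

# Hypothetical placeholder codes, NOT the real legend values of either product
CCI_CROPLAND = (1, 2, 3)           # rainfed, irrigated, mosaic cropland > 50%
GLOBCOVER_CROPLAND = (11, 12, 13)  # post-flooding/irrigated, rainfed, mosaic 50-70%


def cropland_mask(cci_lc, globcover):
    """True only where both products assign one of their cropland classes."""
    return np.isin(cci_lc, CCI_CROPLAND) & np.isin(globcover, GLOBCOVER_CROPLAND)


cci = np.array([[1, 2], [3, 9]])             # 9: some non-cropland class
globcover = np.array([[11, 20], [13, 12]])   # 20: e.g. a grassland class
print(cropland_mask(cci, globcover))
```

A pixel that one product labels as cropland and the other as grassland is dropped, which is the conservative behaviour the text describes.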

Agricultural suitability data
Agricultural suitability data are taken from Zabel et al. (2014). The data describe the suitability for 16 different crops according to climate, soil and topography conditions at a spatial resolution of 30 arc seconds and are available for past and future climate periods. The considered crops include the most important staple, energy and forage crops. Crop specific requirements for climate, soil and topography are used to determine membership functions in a fuzzy logic approach to calculate suitability (values between 0 and 1). The data set also includes information on the potential number of crop cycles per year. The data are available for rainfed and irrigated conditions separately. The applied data determine the agricultural suitability for crop cultivation and the potential number of crop cycles per year, both for rainfed conditions and under the climate for 1981-2010 (Zabel et al., 2014). Soil properties are not considered in this approach, because human activities may alter soil properties, e.g. by fertilizer and manure application and soil tillage.
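To illustrate how membership functions can turn crop requirements into a suitability value between 0 and 1, the following sketch uses a trapezoidal membership function and combines factors with the minimum operator. Both choices, as well as the requirement values in the example, are assumptions for illustration and are not taken from Zabel et al. (2014).

```python
def trapezoid(x, a, b, c, d):
    """Membership rising from a to b, equal to 1 between b and c, falling to d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def suitability(memberships):
    """Combine membership values with the minimum operator (assumption)."""
    return min(memberships)


# Hypothetical crop requirements: temperature in deg C, precipitation in mm
temp_membership = trapezoid(20.0, 5.0, 15.0, 25.0, 35.0)
precip_membership = trapezoid(350.0, 200.0, 500.0, 1500.0, 2500.0)
print(suitability([temp_membership, precip_membership]))
```

With the minimum operator the most limiting factor determines the result, so a site with ideal temperature but scarce rainfall receives a low rainfed suitability.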

High resolution mapping of irrigated areas
The downscaled GMIA data serve as a basis, providing a proven global distribution of irrigated areas. The irrigated areas which are already part of the statistics are extended by additional irrigated areas that have not been captured until now. The identification of the additional irrigated areas in the new irrigation map is accomplished using the specific criteria described above and relationships of plant physiological indices to the agricultural suitability (criteria A, B and C). If one of the criteria is true, we assume the full area of the 30 arc second pixel as being irrigated. The decision tree in Fig. 3 gives an overview of the full processing and analysis of the applied data. As a result, the combination of A, B, and C identifies the irrigated pixels which were not assigned to irrigation areas in the downscaled GMIA irrigation map. The new map now includes irrigated areas that were not identified before and are not captured quantitatively in the statistics.

Results
Before we compare the results of the new global irrigation map with the existing products, we analyse the difference between the downscaled GMIA and the GMIA in the original resolution, to test the quality of the downscaled GMIA.

Differences between the downscaled GMIA and the original GMIA
The downscaling process leads to differences between the downscaled and the original GMIA data. Since irrigated areas < 1% are not allocated to the finer resolution, they are neglected within the downscaling process. This leads to a global loss of irrigated area of 46,329 km². If there are no pixels available for distribution, e.g. due to excluded land such as forests, water bodies or urban areas, the irrigated area may not be allocated, which results in a global reduction of 19,780 km².

Globally, we additionally lose 2,442 km² through rounding the floating point numbers of the percentage share of the irrigated areas. Overall, we do not distribute 68,551 km² of irrigated areas, which are 2.28% of the GMIA dataset in its original resolution. Due to this small difference, we compared the new irrigation map with the downscaled GMIA, which results from the procedure described above, to use the same spatial resolution for a pixel-wise comparison.
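The total of undistributed area follows directly from the three loss terms reported above. A short check, using only the figures given in the text (the dictionary keys are descriptive labels chosen here):

```python
# Losses of irrigated area during downscaling, in km² (values from the text)
losses_km2 = {
    "share below 1%": 46_329,
    "no pixel available for distribution": 19_780,
    "rounding of percentage shares": 2_442,
}
total_loss_km2 = sum(losses_km2.values())
print(total_loss_km2)  # 68551

for reason, area in losses_km2.items():
    print(f"{reason}: {100 * area / total_loss_km2:.1f}% of the total loss")
```

The sum reproduces the 68,551 km² stated in the text, with the 1% cut-off accounting for about two thirds of the loss.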

Global analysis
The new global irrigation map shows 22% more irrigated areas than the downscaled GMIA (Fig. 4). Overall, 3,680,760 km² of irrigated areas have been identified, which is an increase of 667,440 km² compared to the downscaled GMIA (Fig. 5). The global result confirms the underestimation of irrigated areas found by Thenkabail et al. (2009a), who globally identified 3,985,270 km² of irrigated areas with a remote-sensing-based approach, and is significantly higher than the results of Salmon et al. (2015) with 3,14,100 km² and the global estimates of the FAO or of Siebert et al. (2005).
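The 22% figure can be reproduced from the two areas given above, taking the downscaled GMIA as the reference. The variable names are chosen here for illustration.

```python
new_map_km2 = 3_680_760   # irrigated area in the new map (from the text)
additional_km2 = 667_440  # increase over the downscaled GMIA (from the text)

downscaled_gmia_km2 = new_map_km2 - additional_km2
relative_increase = additional_km2 / downscaled_gmia_km2

print(downscaled_gmia_km2)               # 3013320
print(f"{100 * relative_increase:.1f}%")
```

The implied downscaled GMIA total is 3,013,320 km², and the relative increase rounds to the reported 22%.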
Figure 5 shows the additional allocated area according to each of the indicators A, B, and C as described in section 2.5. The largest amount of additional irrigated area is identified by the consideration of multiple cropping (B). In this case, 493,123 km² are not part of the downscaled GMIA. These areas are mainly found in Asia (Fig. 4), where according to our results, irrigation is often required to allow for multiple cropping. 100,069 km² are additionally identified, because they are not suitable for crop cultivation but are classified as cropland (indicator C). By the use of indicator A, 76,054 km² are additionally allocated.
The indicators A, B and C show different amounts of additional irrigated area for different regions. Indicators A and C identified irrigated areas mostly in arid and semi-arid regions, by comparing low or no suitability versus high NDVI. Figure 6 shows that additional irrigated areas identified by A and C are mainly found in regions with annual precipitation < 500 mm, according to the WorldClim dataset for 1961-1990 (Hijmans et al., 2005).
In humid regions, criteria A and C are not sensitive, because agricultural suitability values in humid regions are high since precipitation is not limiting. We found that B extends irrigated areas in regions with low as well as high annual precipitation (Fig. 6), where irrigation is often used to allow for a second harvest. In total, Figure 6 demonstrates that irrigation decreases with increasing precipitation, but irrigation does not only take place in dry regions.

Asia
The newly identified irrigated areas are mainly found in Asia, particularly in Central and South East Asia. The countries with the largest amount of additionally identified area are India (+267,283 km²) and China (+149,871 km²). In these countries, irrigation plays a dominant role in agriculture, where 40% (India) and 57% (China) of the total cropland is irrigated according to statistics (FAO, 2016b). Nevertheless, statistics seem to largely underestimate irrigated areas, particularly in India. Here, we found on the one hand considerable additional irrigated areas compared to GMIA within regions that are sparsely irrigated, such as the state of Madhya Pradesh (Fig. 7). On the other hand, irrigated areas are additionally identified within regions that already show a high density of irrigated areas, such as Uttar Pradesh along the foothills of the Himalayan Mountains, where the density of irrigated areas even increases in our results (Fig. 7). Particularly in these regions the irrigated areas were detected by comparing the potential vegetation cycles to the actual yearly NDVI course. Due to the seasonality of the precipitation only one harvest is possible; the second has to be achieved by irrigation. Even legumes grown as nitrogen fertilizers have to be irrigated. Within Asia, the developed method unveils large previously unknown irrigated areas in Kazakhstan (+30,661 km²), Pakistan (+26,667 km²), Myanmar (+25,212 km²), Uzbekistan (+17,454 km²) and Turkmenistan (+13,483 km²). In Central Asia, particularly the irrigated areas along the rivers are larger than reported. The Asian countries with the largest percentage mismatch when compared to FAOSTAT statistical data (averaged from 1999-2012) are Mongolia (+4,211%), Kazakhstan (+376%), Oman (+331%) and Turkmenistan (+198%) (FAO, 2016b).

Africa
Irrigation plays a minor role in the tropical regions of Africa, while there are contiguous irrigated regions along the Nile in Egypt and Sudan, some smaller irrigated areas within the Mediterranean countries and some irrigated areas within Southern Africa. The largest amounts of additional irrigated area are found in Somalia (+6,427 km²), Egypt (3,867  2). Countries with the highest percentage difference to statistics are Botswana (+5,718%), Kenya (+3,221%), Chad (+2,040%), and Congo (+1,812%) (FAO, 2016b).

Europe
The discrepancy between the downscaled GMIA and the new irrigation map in Europe is smaller than in the regions mentioned above. The highest differences exist in Italy (+11,059 km²), Spain (+5,270 km²) and Greece (+3,922 km²). While the Po valley, the largest contiguous irrigated region within Europe, does not show remarkable differences between the downscaled GMIA and the new irrigation map, many additional areas on Sardinia and Sicily are detected. In Spain, the known irrigated areas near the Pyrenees are well captured by GMIA, but especially the intensely used agricultural area around Valladolid in the North West of Spain shows additional irrigated areas according to our results. The highest percentage mismatch to FAOSTAT is found for Bosnia-Herzegovina (+912%), Montenegro (+574%), Croatia (+384%) and some other countries in Eastern Europe. The comparison of FAOSTAT to GMIA in these regions results in similarly high differences, since the FAOSTAT data were obviously not used in the GMIA data. The highest percentage mismatch to FAOSTAT in Western Europe is found in Portugal (+47%), Italy (+31%), France (+28%) and Great Britain (+27%) (FAO, 2016b).

America
The large irrigated areas in North America are very consistent with the distributed statistics of the downscaled GMIA. Only in the North-Western part of the USA are the irrigated areas underestimated by GMIA. It is notable that additionally identified irrigated areas are found next to already detected irrigated areas in California, the North West and the Middle West of the USA. Thus, density increases within irrigated agglomeration regions. The percentage mismatch to FAOSTAT is relatively low compared to the other continents (Table 2). The highest percentage difference is found for Chile (+77%), Canada (+41%), Brazil (+35%) and Mexico (+27%) (FAO, 2016b).
To demonstrate the effect of the high spatial resolution of the results, Fig. 8 shows the results for a specific extent in the North West of the USA (Oregon). The comparison of the new irrigation map at 30 arc seconds resolution with the GMIA at 5 arc minutes resolution demonstrates the improvement of the data (Fig. 8). The higher resolution allows for a more precise identification of irrigated fields. Further, the additionally recognized irrigated areas that are not included in the GMIA dataset match well with the underlying true colour satellite image. In this case it also shows that the resolution of 30 arc seconds is more suitable for field scale and fits quite well to the size of irrigated fields.

Validation
The new irrigation map partially shows significant differences to the results of existing studies. The official national and subnational statistics are considered in the irrigation map. Therefore, using the statistics to validate the new irrigation map is not an appropriate way to assess the accuracy of the results. The comparison of ground truth data with the new irrigation map can be a way to outline the accuracy. There are some ground truth data available that provide point specific land use information. The European Union started the LUCAS photo viewer (EUROSTAT, 2012) to validate their land use classification CORINE (Agency, 2014). It covers more than 270,000 sample points in Europe. For each sample, the database includes a photo and metadata which classify the land use. The systematic comparison of the 6,328 samples on irrigated land results in an accuracy of 72%, which is 2% higher than the accuracy of the downscaled GMIA. The better performance of the method in dry areas or regions with a hot summer implies a higher accuracy in the other irrigated regions. The accuracy of the new irrigation map in drier and warmer parts of southern Europe (derived from the Köppen-Geiger climate classification (Kottek et al., 2006)) is 87% and thereby higher than in the more humid and colder parts.
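A point-based comparison of this kind can be sketched as follows. The text does not spell out how the accuracy is computed, so this assumes the simplest reading: the share of reference points on irrigated land that the map also marks as irrigated. The function name and the toy data are hypothetical.

```python
def detection_accuracy(mapped_irrigated):
    """Share of reference points on irrigated land that the map marks as irrigated.

    mapped_irrigated: iterable of booleans, one per reference sample, giving
    the map value at the location of that sample.
    """
    mapped = list(mapped_irrigated)
    if not mapped:
        raise ValueError("no reference samples given")
    return sum(bool(m) for m in mapped) / len(mapped)


# Toy example: 25 reference points, 18 of them hit by the map
toy_samples = [True] * 18 + [False] * 7
print(f"{100 * detection_accuracy(toy_samples):.0f}%")
```

Restricting the sample list to points within a given climate class would give regional values in the same way, such as the figure reported for the drier parts of southern Europe.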

Discussion and Conclusion
The results give a different and more realistic view on the global distribution of irrigated areas at a high spatial resolution.
The analysis indicates an increase of irrigated land by 22% compared to the reported statistics. The validation of the results shows the reliability of the irrigation mapping method, especially in drier regions. Nevertheless, uncertainties within the input data are of course included in the results. Despite their high accuracy, the ESA-CCI-LC and GlobCover land use classifications include uncertainties, which lead to errors in mapping irrigated areas. For example, cropland is often classified as grassland, pastures or meadows. Furthermore, the use of the agricultural suitability may lead to errors because it covers only 16 different crops. Accordingly, it does not consider, for example, drought resistant varieties or other species that are adapted to regional climatic conditions. Further, through high groundwater levels or the proximity to open water, plants could reach water sources through capillary rise or directly tap the groundwater. This creates alternative water availability for the plants and can mimic irrigation in otherwise unsuitable locations.
Most of the additional areas are found in India and China and the highest discrepancies to the statistics are generally found in developing countries. Possible reasons are inadequate statistics that are often a result of political interests (Thenkabail et al., 2009b). General uncertainties or inadequacies of agricultural statistics are well known in many developing countries and are discussed, for example, in Young (1999) and Thenkabail et al. (2009b). It seems that not all irrigated areas are correctly reported in the official statistics. This indicates the existence of illegal or unregistered irrigation activities. The results are also in line with former analyses that showed a large underestimation of irrigated areas in the statistical data, especially for India (Thenkabail et al., 2009b) and West Africa (Drechsel et al., 2006).
The results change the view on global agriculture. Irrigation increases agricultural production (Smith, 2012), reduces vulnerability to crop failures, and increases food security and income (Bhattarai et al., 2002). At the same time, more irrigated areas need more water that is mainly taken from surface runoff and groundwater storage. This may lead to an overuse of regionally available water resources and may threaten future agricultural activities (Du et al., 2014). Illegal wells and withdrawals are not reported in the official statistics and are part of the problem of overusing water resources. The quantification of unreported or illegal water withdrawal for irrigation is highly important for inventorying agricultural water use and for a better estimation of the available water resources. Moreover, the results demonstrate the need to use further independent survey techniques besides official statistics and reports to monitor the implementation of the SDGs. Mapping irrigated areas still represents a challenge. Remote sensing is an appropriate tool to permanently monitor the globally irrigated area at high spatial resolution. Recent progress in the availability of remote sensing instruments through the Copernicus system of the EU (European Commission, 2017) now delivers weekly global high resolution (10-20 m) coverage. As a next step, SDG monitoring systems have to be developed which combine these data streams with sophisticated crop growth simulations to determine the efficiency with which irrigation water is used on each irrigated field on the globe.
The identification of additional irrigated areas uses relationships of plant physiological indices to the agricultural suitability. The general criterion for the identification of unknown irrigated areas is that the land use is already cropland according to ESA-CCI-LC and GlobCover. The restriction to cropland avoids the classification of irrigated areas in other land uses or covers in dry areas with high NDVI values due to lichens or weeds. A low agricultural suitability does not exclude plant growth at all. Irrigation is assumed on areas classified as cropland if at least one of the following three criteria is fulfilled:
A. The annual NDVI course clearly suggests a dynamic vegetation growth while the agricultural suitability shows a low value.
B. The number of NDVI peaks is higher than the potential number of crop cycles per year under rainfed conditions.
C. Land is not suitable but classified as cropland while at the same time NDVI values and yearly courses indicate vegetation.
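The three criteria can be expressed as boolean operations on co-registered rasters. The sketch below is one possible reading: the cut-off `LOW_SUITABILITY` for a "low" suitability value is a hypothetical number, since the text gives none, and the function and argument names are chosen for illustration.

```python
import numpy as np

LOW_SUITABILITY = 0.1  # hypothetical cut-off for a "low" suitability value


def extend_irrigation(downscaled_gmia, cropland, active_vegetation,
                      suitability, ndvi_peaks, rainfed_cycles):
    """Apply criteria A, B and C to arrays of equal shape.

    Returns (irrigated, additional): the extended irrigation mask and the
    pixels that were added on top of the downscaled GMIA.
    """
    # A: dynamic vegetation growth although rainfed suitability is low
    crit_a = active_vegetation & (suitability > 0) & (suitability < LOW_SUITABILITY)
    # B: more NDVI peaks than crop cycles possible under rainfed conditions
    crit_b = ndvi_peaks > rainfed_cycles
    # C: land not suitable at all, yet vegetation is observed
    crit_c = active_vegetation & (suitability == 0)
    additional = cropland & (crit_a | crit_b | crit_c) & ~downscaled_gmia
    return downscaled_gmia | additional, additional


gmia = np.array([True, False, False, False, False, False])
cropland = np.array([True, True, True, True, False, True])
active = np.array([True, True, True, True, True, True])
suit = np.array([0.8, 0.05, 0.8, 0.0, 0.0, 0.8])
peaks = np.array([1, 1, 2, 1, 1, 1])
cycles = np.array([1, 1, 1, 0, 0, 1])
irrigated, additional = extend_irrigation(gmia, cropland, active, suit, peaks, cycles)
print(irrigated)
print(additional)
```

In the toy example the fifth pixel fulfils criterion C but is not cropland, so it stays unclassified, which mirrors the restriction to cropland described above.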

Figure 1: Yearly maximum NDVI derived from maximum bimonthly NDVI data of the EnviSAT MERIS instrument.

Figure 2: Idealized NDVI course of single- and multi-cropping and the conditions which must be fulfilled.

Figure 4: Irrigated areas identified by different approaches.

Figure 5: Results of the new irrigation map compared to the downscaled GMIA.

Figure 6: Yearly precipitation within the irrigated areas. Criteria A and C are suitable in dry regions, while criterion B identifies irrigated areas in humid regions as well. Further, irrigation decreases with increasing precipitation, but is also used in regions with high yearly precipitation.

Figure 7: The Indian subcontinent and its identified irrigated areas. The blue areas show the information of the downscaled GMIA. Irrigation is denser than expected in already irrigated regions and new areas appear in the state of Madhya Pradesh.

Figure 8: Small scaled analysis of the new irrigation map (lower left) and GMIA (upper right).