Water Vapor Mapping by Fusing Insar and Gnss Remote Sensing Data and Atmospheric Simulations

Data fusion aims at integrating multiple data sources that can be redundant or complementary to produce complete, accurate information of the parameter of interest. In this work, data fusion of precipitable water vapor (PWV) estimated from remote sensing observations and data from the Weather Research and Forecasting (WRF) modeling system are applied to provide complete grids of PWV with high quality. Our goal is to correctly infer PWV at spatially continuous , highly resolved grids from heterogeneous data sets. This is done by a geostatistical data fusion approach based on the method of fixed-rank kriging. The first data set contains absolute maps of atmospheric PWV produced by combining observations from the Global Navigation Satellite Systems (GNSS) and Interferometric Synthetic Aperture Radar (InSAR). These PWV maps have a high spatial density and a millimeter accuracy; however, the data are missing in regions of low coherence (e.g., forests and vegetated areas). The PWV maps simulated by the WRF model represent the second data set. The model maps are available for wide areas, but they have a coarse spatial resolution and a still limited accuracy. The PWV maps inferred by the data fusion at any spatial resolution show better qualities than those inferred from single data sets. In addition, by using the fixed-rank kriging method, the computational burden is significantly lower than that for ordinary kriging.


Introduction
Water vapor is a vital constituent of the Earth's electrically neutral atmosphere (neutrosphere).Although the ratio of water vapor partial to total atmospheric pressure is typically below 4 %, it is an important constituent in many respects.Due to the dynamic nature of the neutrosphere and the complex energy exchange with the Earth's surface, the spatiotemporal distribution of water vapor can be highly variable.Accurate information about its content and tendency is the main prerequisite for the prediction of clouds and precipitation.Water vapor is important for studies of climate and natural disasters such as floods, droughts or glacier melting.On the other hand, radio signals transmitted from spaceborne sensors are refracted when traversing the Earth's neutrosphere.The neutrospheric water vapor contributes less than 10 % of the signal path delay; however, this error source is not easily eliminated.Accurate information about the water vapor concentration along the signal path is required, which is not always obtainable.Although many efforts have been made to produce accurate information about water vapor using ground-based, space-based or numerical methods, the available information is often limited in the temporal resolution, spatial resolution or accuracy (Bevis et al., 1992).Numerical atmospheric prediction models are increasingly used to provide simulations of the atmospheric parameters.Various studies suggested the assimilation of atmospheric parameters, such as water vapor, estimated from the Global Positioning System (GPS) or Interferometric Synthetic Aperture Radar (InSAR), into these models to improve the quality of F. Alshawaf et al.: Atmospheric water vapor by data fusion the simulated parameters (Pichelli et al., 2010;Bennitt and Jupp, 2008).We want to comprehend whether the model simulations of water vapor, in their current quality, can be used to even out the deficits of the measurement-based estimates, particularly in regions with no measurements.To achieve this purpose, a statistical data fusion approach is applied.The output water vapor maps can be used in tomographic approaches to provide 3-D water vapor grids and to adjust the parameters of numerical atmospheric prediction models.The remainder of this section presents the recent related research on water vapor using remote sensing data and atmospheric models.
The amount of remote sensing data available for monitoring the Earth and its atmosphere is growing in a rapid, continuous way.InSAR has proved its capability for detecting surface deformation, landslides, and tectonic movements (Massonnet et al., 1993;Zebker et al., 1994), and for deriving digital elevation models (Zebker and Goldstein, 1986).The influence of water vapor in the observations can be reduced by averaging a large number of interferograms (Zebker et al., 1997) or by time series analysis that indicates the stable persistent scatterers (Ferretti et al., 2001;Hooper et al., 2007).Besides, InSAR has recently been used to derive the phase shift caused due to the propagation in the Earth's atmosphere from the interferograms or by time series analysis (Hanssen, 2001;Meyer et al., 2008;Pichelli et al., 2010;Alshawaf et al., 2012).Global Navigation Satellite Systems (GNSS), however, have been considered since the 1990s as an efficient microwave-based tool for atmospheric sounding (Bevis et al., 1992;Rocken et al., 1995).Since then, numerous methods have exploited the GNSS observations to produce estimates of the integrated atmospheric water vapor and to generate water vapor maps (Luo et al., 2008;Jade and Vijayan, 2008;Karabatić et al., 2011).InSAR and GNSS, signals are affected in a similar way by the atmosphere (Onn and Zebker, 2006).Therefore, Alshawaf et al. (2015b) presented a new approach to deriving absolute, high-resolution maps of precipitable water vapor (PWV) by combining data from InSAR and GNSS.The SAR systems acquire the images at repeat cycles of multiples of days.Enivsat images, which are used in this work, are available in multiples of 35 days.The availability of the data over time can be increased by processing data from ascending and descending modes.In addition, new SAR missions have shorter repeat cycles, 11 days for TerraSAR-X and 6 days for Sentinel-1.The InSAR-based PWV estimates cannot be used to observe the variability of water vapor over a short time, but they are important in different aspects.This geodetic-based method produces maps of the PWV at a high spatial resolution without additional costs.These data can be exploited, first, to model the spatial variations of atmospheric turbulent and non-turbulent effects.Second, they can be used to observe the variation of water content over long time periods to detect, for example, unusual trends.Third, they can be used to adjust/readjust the initial and boundary conditions in atmospheric prediction models.
Atmospheric modeling systems are standard approaches to simulate 3-D distributions of the neutrospheric water vapor at various temporal and spatial samplings.Dynamic local area models (LAMs) are common tools for scaling down the coarse grids of global circulation models to meso-scale applicability.Several studies employed the Weather Research and Forecasting modeling system (WRF, Skamarock and Klemp, 2008) to compare the LAM simulations of PWV with GNSS point estimates (Mateus et al., 2010;Bender et al., 2008;Cimini et al., 2012) and PWV maps from MERIS (MEdium Resolution Imaging Spectrometer) (Alshawaf et al., 2012).These studies conclude that the mediumto long-scale (greater than 20 km) water vapor signals can be well predicted, whereas short-scale fluctuations are often hardly captured in a realistic way.
Despite manifold improvements over the last years, considerable uncertainties are still connected with the parameterization of physical processes in mesoscale-atmospheric models and biases of the driving model (Prein et al., 2015).This, in addition to the configuration of the model domains, can significantly impact the simulation output (Gong et al., 2010) as well as the model intrinsic water balance (Awan et al., 2011;Fersch et al., 2012;Fersch and Kunstmann, 2014).Therefore, the setup of the local area model is crucial, and it has to be proper for the study region and the research objectives.
Due to the availability of various data sources, which can be complementary or redundant, data fusion has received increasing attention in the Earth observation studies.The focus is put on the combination of multiple sources, which may be spatially, temporally, or spectrally inhomogeneous, to produce a more complete representation of a geophysical process.In this work, we use remote sensing data and numerical atmospheric models through a data fusion approach to provide improved information about the distribution of atmospheric water vapor.This information is important not only for weather forecasting and climate research, but also for better understanding how the InSAR interferograms are affected by water vapor, and for selecting the most appropriate method for reducing this noise.In turn, reliable local water vapor maps can support adaptation of the WRF model configurations and, hence, may improve the model performance.
In the following, we present water vapor maps derived from microwave remote sensing data and numerical atmospheric models.Since the available data have different spatial levels of aggregation, it is important to discuss the change of support problem.Then, we present the data fusion approach based on the kriging or fixed-rank kriging techniques.We first describe the ordinary kriging and how it can be extended for fusing multiple data sets.Then, we present the reasons behind using the fixed-rank kriging.We use the data fusion approach for predicting maps of the atmospheric PWV from remote sensing data and atmospheric models.

Atmospheric water vapor
Several observation systems are commonly used to continuously monitor the vertical and horizontal distributions of water vapor in the atmosphere.These devices are used either from the ground, such as radiosondes and ground-based water vapor radiometers, or from space, such as space-based water vapor radiometers and infrared sensors.In this work, we employ microwave remote sensing systems as well as numerical atmospheric models to provide accurate maps of the atmospheric water vapor at a high spatial resolution.Alshawaf et al. (2015b) presented a new approach to derive absolute, high-resolution maps of PWV by combining data from InSAR and GNSS.The data are collected in the region of Upper Rhine Graben in Germany and France over the period 2003-2008.Persistent scatterer InSAR (PSI) using the Stanford Method for Persistent Scatterers (StaMPS, Hooper et al., 2007) was applied to derive PWV maps from the In-SAR interferograms.These maps contain the water vapor signal of short-scale spatial variations, while the elevationdependent and long wavelength water vapor components are eliminated when forming the interferograms or by phase filtering.Therefore, GNSS-based PWV estimates were used to reconstruct the missing components.The approach for combining InSAR and GNSS data is presented in detail in Alshawaf et al. (2015a) and Alshawaf et al. (2015b).Figure 1 shows a map of PWV derived by combining PSI and GNSS data and the corresponding map extracted from MERIS observations.MERIS is a passive imaging spectrometer located on board the Envisat platform.It measures the solar radiation reflected from the Earth's surface or clouds.The ratio of the radiance values measured at channels 14 and 15, located respectively at 885 and 900 nm, are used to determine the vertical PWV content in the neutrosphere (Fischer and Bennartz, 1997).MERIS provides maps of the PWV at a spatial reso-lution of 260 m × 290 m (full-resolution mode).Under cloud weather conditions MERIS measurements are highly underestimated since the measured PWV represents only the water vapor existing between the sensor and cloud top; therefore, only five MERIS PWV images were available for this study.

Water vapor from remote sensing data
The PSI method produces information where stable persistent scatterers are identified, which requires a high coherence between the SAR images.In forests and vegetated areas, the probability of identifying persistent scatterers is low; therefore, in these regions, only sparse points are found.The white areas within the left figure indicate regions of low coherence and the corresponding data from MERIS are masked out.The spatial correlation between the maps is 95 % and the root mean square (rms) value of the differences is 0.68 mm.We can observe that the persistent scatterers are dense in the urban areas, while they almost disappear in the low coherence regions.Since PWV data are spatial, their covariance function is exploited by geostatistical techniques to reasonably infer the PWV at regular grids.In order to improve the inferred PWV maps, especially in the areas where the PWV estimates are sparse, we apply data fusion of the remotely sensed PWV maps with maps produced by the WRF model.

Water vapor from regional atmospheric models
As depicted in Fig. 2, the WRF model (version 3.1.1, Skamarock et al., 2008) was set up with a parent domain of 27 km × 27 km resolution and two nests with 9 km × 9 km and 3 km × 3 km, respectively.Feedback from the nests to their parent domain was not activated.Vertically, the model is divided into 42 layers with variable distance.The resolution is increased for the lower troposphere where most of the atmospheric vapor resides.The model top is defined at 50 hPa.The selection of the physical modules is based on the study of Berg et al. (2013); accordingly, the WRF singlemoment (WSM) 5-class scheme (Hong et al., 2004) was selected for microphysics.Shortwave and longwave radia- tion was computed with the community atmospheric model (CAM) scheme (Collins et al., 2004).The processes in the planetary boundary layer were represented by the Yonsai University scheme (Hong et al., 2006).The surface layer was simulated with the Monin-Obukhov scheme, and the Noah land-surface model (Chen and Dudhia, 2001) was applied for the surface physics.Sub-grid convective processes were included with the Kain-Fritsch parametrization (Kain, 2004).The global dynamic boundary conditions were ingested from the European Center for Medium-Range Weather Forecasts (ECMWF) ERA-INTERIM reanalysis at a 6 h interval (Uppala et al., 2008).In ERA-INTERIM, a broad range of different data sources is assimilated.For the atmospheric moisture analysis, ground-based station observations, radiosonde profiles, and GPS radio occultation are exploited.Additionally, total column water vapor information from the Special Sensor Microwave/Imager (SSM/I) and the Advanced Microwave Scanning Radiometer for the Earth Observing System (AMSR-E) is assimilated (Dee et al., 2011).MERIS retrievals of column water vapor are not ingested into ERA-INTERIM, and thus they depict an independent data set for our approach.
The WRF simulations cover the period between July 2004 and September 2005, such that the first 5 months were considered as spin-up.The PWV content was determined at every output time step (10 min) by a vertical integration of all moisture fields from the land surface to the model top.Two output time slices were compared with the simultaneous MERIS observations.The long-scale signal is modeled by a linear trend and subtracted from the maps; hence, negative values are observed on the color bars.From the compared maps shown in Fig. 3, we observe that the spatial atmospheric patterns are not always correctly resembled by the model.On 27 June 2005 (09:51 UTC), WRF and MERIS PWV maps are strongly correlated with a coefficient of 0.8, whereas the analysis of 5 September 2005 (09:51 UTC) shows a lower spatial correlation (0.71).While the patterns east of the Upper Rhine valley are reasonably resembled, an unexpected discontinuity exists in the area around 7.7 • E, 48.7 • N.
At the lateral boundaries, WRF ingests the mixing ratio concentration from the global model.Thus, for the presented simulation, the global climate model lateral boundary conditions were applied to the first (outer) domain.Neither gridded nor spectral nudging was activated in order to conserve the model's internal water balance.Hence the GCM boundary fluxes and the local area model physics solely determine the propagation of moisture through the respective domains.For the analysis of 27 June 2005, the atmospheric conditions were rather unexcited and varied slowly, resulting in a good agreement between MERIS and WRF data.On 5 September, a quickly moving frontal system with a strong west-to-east gradient and a notch in the atmospheric vapor over the Upper Rhine Graben characterized the study region.It is not clearly distinguishable whether the structure and dynamics of the ERA-INTERIM boundaries or the WRF model configurations are responsible for the discontinuity in PWV.

Change of support problem
Spatial data, for which close observations correlate more than distant ones, can be collected at points or areal units.The former are called point-level data or simply point data and the latter are areal-level or block data (Gelfand et al., 2001).In geostatistics, this defines the spatial support of the data.When both data types are available, data fusion can be applied to infer the underlying process at any level of support.The change of support problem is concerned with the inference of the underlying process at point levels or block levels different from those at which the data are available.This also includes fusing data at different support levels.Based on the available input data and the desired output grid, there are four prediction possibilities: points to points, points to blocks, blocks to points, or blocks to blocks.These prediction possibilities may be collected under the umbrella of kriging (Cressie, 1993).
For block data that can be expressed as an average of point data as if it is collected within the block, such as rainfall, temperature, surface elevation, and atmospheric water vapor, the following model is appropriate: where Y (B i ) and Y (s) define the block and point data, respectively (Fig. 4).B i refers to the block over which the data are aggregated and |B i | is the volume (or cardinality) of the data.The block-level covariance can then be related to the point-level covariance as follows: where C(B i , B j ) is the block-to-block or block covariance function and C(u, v) is the point covariance function.
4 Spatial data fusion using kriging methods

Ordinary kriging
In geostatistics, a spatial process can be inferred over a continuous spatial domain by exploiting the covariance function as an important source of information.Predictions are obtained based either on single or multiple sets.Kriging is a geostatistical interpolation technique that infers values at new locations by considering spatial correlations (Cressie, 1993).The spatial density of the data points has to be enough to capture the covariance structure of the process.This information is represented by a variogram or covariance function, which is used to determine the predictions.If the considered spatial data set is denoted by Z, then the kriging estimator

F. Alshawaf et al.: Atmospheric water vapor by data fusion
Ŷ (s 0 ) at the location s 0 is determined as follows: where the vector a contains the kriging weighting coefficients and Z is the centered data set (see Eq. 7).The best linear unbiased estimator is found by solving the following constrained minimization problem: The constraint is added to guarantee that the estimator is unbiased with respect to the true process Y (s).A semivariogram function that reflects the spatial correlations is required to solve the minimization problem, which is determined from the detrended data in Eq.A6.
The kriging method extends the spatial process using the following linear model: where (s) is an independent error term, which is assumed to be a white noise process with a mean zero and variance σ 2 .T(s) • α defines a deterministic linear trend, T has a size of N ×3 and each row has the following entries: [1 longitude(s) latitude(s)].N is the number of observations and α is a vector of the least squares regression coefficient.ν(s) captures the spatial covariance structure of the process, and it is assumed to have a mean zero and generally a non-stationary covariance function.Before inferring the signal at a new location, it is required to center the data by estimating and subtracting the linear trend, i.e., The detrended signal Z is used to determine the predictions in Eq. ( 4) and the deterministic signal is calculated from T (s 0 ) α.The sum of the two terms gives the total estimated value of Y (s 0 ).In the next section, a similar strategy is followed to solve for the best unbiased estimator using two data sets as presented in Braverman et al. (2009).

Spatial statistical data fusion
Spatial statistical data fusion (SSDF) is a method that statistically combines two data sets to optimally infer the quantity of interest and calculate the corresponding uncertainties at any predefined grid (Nguyen, 2009;Braverman et al., 2009).This method extends the kriging technique described above to find the optimal estimator using multiple data sets.Let the underlying process Y (s) to be estimated at the location s from the data in Z 1 and Z 2 with the sizes N 1 and N 2 , respectively.The estimator Ŷ (s) at the location s is obtained from the two data sets as follows: where a 1 and a 2 are the fusion weighting coefficients, and Z 1 and Z 2 are detrended data sets of Z 1 and Z 2 , respectively.Following Eq. ( 5) and Eq. ( 8), the Lagrangian function L for the minimization problem under the unbiasedness constraint is where ii = cov( Z i ), ij = cov( Z i , Z j ), and c i = cov( Z i , Y (s)) are the covariance functions.1 N i is a vector with all entries 1 and a length N i , and m denotes the Lagrange multiplier.The last term of L accounts for the unbiasedness constraint.By differentiating L with respect to a 1 , a 2 , m and assigning the results to zero, we get, in the following system of equations, and hence There are several important discussion points for the solution in Eq. ( 11).The covariance matrices ij should be determined without assuming that the underlying process is isotropic or stationary.This is important for atmospheric parameters, particularly the atmospheric water vapor that shows spatial anisotropy as observed from the spatial autocorrelation function in Fig. 5.The covariance function c i should account for the change in the support between the input and the output data.For massive data sets, the size of the covariance matrix is huge and the solution in Eq. ( 11) is not feasible anymore.Also, the covariance matrices should be modeled such that they would allow data prediction to any level of aggregation.The fixed-rank kriging covariance model suggested by Cressie and Johannesson (2008) provides a comprehensive solution for these problems for single data sets and the generalized model for fusing multiple data sets was presented by Nguyen (2009) and Braverman et al. (2009).In the next section, we describe the fixed-rank kriging method and the associated covariance model.Then, we describe how the data fusion approach is applied to our data sets.

Fixed-rank kriging
The fixed-rank kriging (FRK) approach splits the spatial process into two or three components depending on the spatial wavelength, i.e,The vector η contains r hidden spatial random effects, which are estimated from the data at predefined nodes.Therefore, we should be able to estimate η regardless of the aggregation level of the input data.When neglecting the last term in Eq. ( 12), the weighted sum r j =1 S j (s)η j should give the detrended value of Y at the location s.
The weights stored in the matrix S for each location s depend on the distance between s and each node.The weighting function S(s) has the following form: m i is the node location and r i is a predefined effective radius.The formula in Eq. ( 13) represents a bi-square bellshaped function that has its maximum value at m i and decreases smoothly until it reaches zero outside the circle.To demonstrate this, a schematic diagram for the node setup is shown in Fig. 6.Within the domain of the data, four nodes, m 1 , • • •, m 4 , are defined with a corresponding radius.In Fig. 6, if s is located within the radius of a certain node, it gets a positive weight; otherwise, the weight is zero.Hence, S(s) = [0, 0, 0, S(s)].
The last component in Eq. ( 12) accounts for the variations of the process that has not been captured so far (Kang and  Cressie, 2011).The component ζ is assumed to be an uncorrelated Gaussian process with a mean zero and a variance σ 2 ζ .
Based on the model in Eq. ( 12), the FRK estimator is found when η and ζ are determined; i.e., where S p (s o ) is the weighting matrix for the prediction location and is the covariance matrix of the input data.The matrix E in Eq. ( 14) has a value of one if s = s o and zero elsewhere.Ŷ represents the detrended estimator.η and ζ are the optimal a posteriori estimates of η and ζ , respectively (Braverman et al., 2011).In order to get the total value of Ŷt , we calculate The steps followed to obtain the predictions based on the FRK method are summarized in Fig. 7.The methods to estimate the noise variance σ 2 , the covariance matrix K, and the variance of the fine-scale signal σ 2 ζ are shown in Appendix A. We classify the spatial variations of the atmospheric water vapor signal into three components: long wavelength, medium to short wavelength, and uncorrelated fine scale.Therefore, we split the water vapor signal using the linear model in Eq. ( 12) and use the FRK method for prediction.
We applied the OK and FRK to estimate the zenithdirected wet delay derived from remote sensing data.For the  FRK, the matrix S is constructed using the node setup shown in Fig. 8.The nodes or center locations of 93 basis functions are established at three spatial resolutions: the first resolution is 40 km, the second resolution is 20 km, and the third resolution is 10 km.The semivariogram and the fitted spherical variogram model are shown in Fig. 9a, while the covariance matrix determined using the FRK method is shown in Fig. 9b.The predicted maps with 3 km × 3 km resolution are shown in Fig. 10.Due to the lack of ground truth data that should be used to estimate the bias in the model data, we do not add the long-wavelength component into the figures to enable unbiased comparison.We observe similar results from both ordinary kriging and fixed-rank kriging that agree with the original WRF map.The spatial correlation coefficients with the corresponding WRF data are approximately 85 and 83 % for FRK and OK, respectively.When using OK, we assumed the signal to be spatially isotropic to ease the computations; therefore, the OK prediction map shows results sightly different from the FRK.The most impressive point here is the computational time reported for both algorithms.The FRK algorithm is fast, so that it requires significantly shorter time to produce the predictions.Most of the time is invested in the calculations of the covariance model parameters and constructing the matrices S and .We implemented the OK algorithm such that the predictions are found iteratively.Also, to estimate a value at location s, we do not use the entire data, but only those that exist within a predefined radius around the prediction location.Nevertheless, the OK algorithm requires computational time with an order of magnitude higher than that required by the FRK method, on the same machine.
In the next section, we describe the extension of the FRK method for predicting the atmospheric PWV by fusing remote sensing data and the WRF model.

Data fusion for water vapor estimation
In this section, we fuse the PWV maps derived from the remote sensing data and WRF model.Since we classify the spatial variations of the atmospheric water vapor signal into long wavelength, medium to short wavelength, and uncorrelated fine-scale components, we use the following model setup for prediction.

Model setup
PWV maps will be derived from the remote sensing data, denoted Z 1 , and those from the WRF model denoted Z 2 with the sizes N 1 and N 2 , respectively.Z 1 contains the point PWV estimates from remote sensing data and Z 2 contains the block WRF data.Following the SME model in Eq. ( 12), the two data sets can be expressed as The regression coefficient α should be estimated jointly from both data sets.However, we do not have a priori information about the biases; therefore, we estimate α in this contribution independently for each data set.The matrices S 1 and S 2 contain the weights of each location for each data set.To distinguish between point and block data, we used the notation S 2 for block-level data.The model components for point and block data are given in Table 1.The WRF data are available  at a resolution of 3 km × 3 km; therefore, the highly variable signal of water vapor is smoothed.Hence, we do not add the component ζ for the model data.
To solve the system in Eq. ( 11), we determine the covariance structure associated with each SRE model in Eq. ( 16), i.e., where σ 2 ζ V ζ and σ 2 V are diagonal covariance matrices for ζ and , respectively.Note that when computing the crosscovariance functions 12 and 21 , the only part of the signals that is assumed correlated is η.In order to solve Eq. ( 11), we need not only to specify the covariance matrices of the input data, but also to find the covariance between the observations and the spatial process at the prediction locations.
The covariance terms are obtained from The matrix E in Eq. ( 20) has a value of one if s = s o and zero elsewhere.By solving for a 1 and a 2 in Eq. ( 11) and substituting the results in Eq. ( 8), the estimator Ŷ (s o ) becomes The mean squared prediction error (MSPE) corresponding to Ŷ can be obtained from Using the FRK covariance model in Eq. ( 19) makes the matrix inversion of Eq. ( 22) scalable.That is, the matrix inversion can be achieved by applying a recursive block-wise inversion as follows: where and A, B, C, and D are matrices of any size, and A and D must be square.The inversion of individual matrices in Eq. ( 24) is achieved by applying the formula of Sherman-Morrison-Woodbury, which is made possible due to the FRK covariance structure, The computations require the inversion of the matrices K and , where each of them has the size r ×r with r significantly smaller than the data size.Note that D i is a diagonal matrix, for which the inversion is achieved by inverting the diagonal elements.Using the FRK covariance model makes the computational burden for the matrix inversion linear with the data size (Cressie and Johannesson, 2008).

Application to the data
In this section, we build PWV maps from remote sensing and WRF model data using a spatial statistical data fusion method.The first PWV map, derived by combining GNSS and PSI, has 169 688 data points.The WRF model provides a block-level map of 1296 cells of the size 3 km × 3 km.The data to be fused have different qualities, a huge size, different spatial supports, and gaps in the remote sensing data.The output grid is defined at 3 km × 3 km (block-level support) and MERIS PWV maps are used for evaluation.
Following the work flow in Fig. 7, we first estimate the long wavelength trends and remove them from the data using Eq. ( 7).By comparing the PWV from the WRF model and remote sensing data, we found it is most likely that the model data have a bias.Due to the lack of a priori information about the bias and the absence of accurate ground truth data to estimate it, we estimated α independently for each data set.The centered maps are shown in Fig. 11.
Second, the matrices S 1 and S 2 are constructed for the first data set (remote sensing data) and the second data set (model data).The node setup is shown in Fig. 8.The number of nodes must be the same for both data sets, and they are selected such that S does not contain columns of zeros; otherwise, the corresponding node has to be removed.If PWV data are available at point level, a weighting value is calculated for each point with respect to all nodes.However, the WRF model simulates data at block level; hence, we superimpose the model grid with a lattice of regular points such that each cell in the WRF grid contains nine points.A weighting value is calculated for each point; these values are averaged to get one weighing value for each WRF cell to form the matrix S 2 .Building the matrix S p for the prediction loca- tions is done in a similar way, either at point level or block level, depending on the output grid.
In the third step, the covariance parameters (K, σ 2 ζ , σ 2 ) are estimated from the centered data Z 1 and Z 2 .The error variances for both data sets, K and σ 2 ζ , are estimated as described in Appendix A. Note that when the two data sets are combined to infer a single process, i.e., PWV, one K is estimated for all data sets.

Results
So far, all components required to produce the predictions using Eq. ( 22) have been obtained.In Fig. 12, we show the prediction maps obtained by applying FRK to individual data sets as well as the map obtained by data fusion.The figure also shows the MSPE maps associated with each prediction map.We compare the interpolations obtained by applying FRK to single data sets with those obtained by SSDF, and we compare both with the MERIS data.The results show that the map obtained by data fusion correlates more consistently with the map predicted only from PSI + GNSS (Table 2).In the PWV map generated by WRF, shown in Fig. 11, the area in the lower left corner shows artifacts that do not reflect the correct values of PWV as observed from the MERIS PWV map (Fig. 3c and d).Applying FRK to the WRF data does not remove these artifacts from the prediction map.However, in the map obtained by the fusion of both data sets, the artifacts in the lower corner disappeared, but the corresponding MSPE values are large for this region.The MSPE values corresponding to the SSDF predictions are generally smaller, and we should note that in the regions of sparse observations, the corresponding MSPE values tend to increase.For regions of sparse observations in the PWV map (Fig. 11), i.e., the areas in the west of the Rhine valley or in the lower right corner, the map from the WRF model contributes to improving the estimation of the PWV values in the prediction map.The region in the lower right corner has a higher topography and the wet delay values are expected to decrease, as we observe from the map of WRF.In the prediction map obtained by applying FRK to PWV from PSI and GNSS, the predicted values tend to increase since the data in this area are sparse and partially biased.By applying the SSDF approach, the data available from WRF influence the predictions such that the PWV values in this area are more reasonable, and they decrease by moving to the lower right corner.In a sim-ilar way, the data from WRF improve the predictions in the region around 7.8 • E, 49.25 • N, where only sparse PWV data exist.The data from the model, however, affect the prediction in the lower left corner such that they are smaller than those observed in the MERIS map.
In addition, we show the PWV profiles over the line drawn horizontally at the latitude 49.37 • N in Fig. 12h.It is observed from the plots that the predictions made by data fusion are affected more by the data from WRF in region A, where the remote sensing data are sparse.However, in region B, the WRF data are significantly overestimated.In the prediction map made by data fusion, these data have a lower effect in than those received from the remote sensing data.The map received by applying the data fusion shows the best spatial correlation with the data from MERIS and the smallest rms value (see Table 2).
In the above example, the data from remote sensing have a more significant influence on the output.In Fig. 13, we show another example where the model highly affects the predicted map.The predicted map based on model data shows a better spatial correlation and a lower uncertainty value compared to the map predicted using remote sensing data.In this case, the fusion map is more affected by the model data.The spatial correlation coefficients and the values of uncertainty are given in Table 2.In the first example (Fig. 12), the effect of the remote sensing data on the prediction map is significant.The other examples in Fig. 13 and Table 2 show that the model has a larger effect on the output map.

Conclusions and outlook
We presented a method to obtain the atmospheric PWV over any aggregation level by the fusion of remote sensing data and atmospheric models.The PWV maps derived by combining data from PSI and GNSS are available at discrete points that are absent in regions of low coherence.On the other hand, the WRF model provides simulations of PWV in the atmosphere on regular grids at a coarse spatial resolution.Both the quality of the model data and the model skills for representing mesoscale atmospheric structures should be improved.The quality of the prediction maps should be improved by data fusion.For data fusion, the method of spatial statistical data fusion, first presented in (Nguyen, 2009), was employed.This method is based on the fixed-rank kriging ap-  proach that attempts to solve the problems of computational complexity of huge data sets, change of support, and bias.We inferred PWV data on a grid of 3 km × 3 km and compared the results with PWV maps inferred from MERIS data on the same grid.The results show a strong correlation between data fusion maps and those maps from MERIS.The difference between both maps shows uncertainty values of less than 1 mm, which is lower than that obtained from inferring data based on single sets.
To further improve the results, we suggest the following.The matrix S i has so far been constructed for each data source by defining a set of spatial nodes.The number of the nodes is empirically adjusted such that the covariance function computed for the data set based on the estimated F. Alshawaf et al.: Atmospheric water vapor by data fusion matrix K approximates the empirical covariance.In future work, the size and the locations of nodes have to be optimized by minimizing the difference between the empirical and the estimated covariance functions.We should also estimate the biases for each data set (if they exist), so that they can be accounted for in the fusion approach.The data fusion approach can be extended such that more than two data sets are used, for example, by including the MERIS maps in the fusion.With the increasing number of satellite missions and improved atmospheric models, we are able to produce complete, accurate information about the Earth's atmosphere based on data fusion approaches.Moreover, the improved PWV maps can be iteratively assimilated into the local area atmospheric model to generate more accurate 3-D water vapor fields.Also, testing other combinations of physical schemes within the WRF model can further improve the resulting water vapor maps.In this paper, we compared the prediction maps with the data from MERIS; however, in future work, the results should be validated using bootstrapping or jackknifing techniques.

Appendix A: Estimation covariance parameters
Predicting the stochastic component of the atmospheric signal using kriging requires obtaining the covariance function and fitting a covariance model.Using the FRK covariance model, we need to estimate the matrix K, the noise variance σ 2 , and the variance of the fine-scale signal σ 2 ζ .The first method proposed to estimate K is called the binned method of moments (MM) (Cressie and Johannesson, 2008;Nguyen, 2009).This approach derives the empirical estimator for and obtains K such that || ˆ − || F is minimum, where ||•|| F refers to the Frobenius norm.
Another approach proposed by Katzfuss and Cressie (2009) targets determination of the covariance parameters using the algorithm of maximum likelihood estimation (MLE).Furthermore, they estimated the covariance parameters using the expectation-maximization (E-M) algorithm (Dempster et al., 1977) to reduce the computational burden.This algorithm provides estimates not only of K, but also of σ 2 ζ , where the solution for the MLEs is found iteratively.Within each iteration the algorithm performs two steps, the expectation and the maximization.In the following, we present a description of how to obtain the maximum likelihood estimates of the covariance model parameters via the E-M algorithm.
Assume that the observations in Z follow a multivariate Gaussian distribution; that is, Z ∼ N (0, ).Let the parameters of interest K and σ 2 ζ be summarized in the vector ; then, the likelihood function L( ) is (Katzfuss and Cressie, 2009) where c = (N/2) log 2π is a constant independent of , and hence it cancels out in the maximization step.tr(•) denotes the trace operator of a square matrix, with tr (A) = n i=1 a ii .In the expectation step of the algorithm, we calculate Then Eq. (A2) becomes We should remind the reader that the parameters to be estimated here are K and σ 2 ζ , while σ 2 is estimated from the robust semivariogram, as described later.To proceed with the solution, we are required to quantify the conditional expectations in Eq. (A3).Using the standard formula required for calculating conditional expectations for multivariate normal distribution, the expectations will have the following form (Katzfuss and Cressie, 2009): − K [t] S [t] −1 SK [t] , and After the expectation step, we perform a maximization step.
The parameters K and σ 2 ζ in Eq. (A3) should be selected such that Q(•) is maximized.The partial derivative is taken with respect to both parameters and the result is assigned to zero.Finding the derivative here is rather simple since η and ζ do not show dependency on each other, as observed from Eq. (A3).The updating scheme of the E-M algorithm in each iteration is K [t+1]  = K [t]  + K [t] S [t] −1 Z Z [t] −1 − I N S K [t]   ; (A4) (A5) We keep updating the solution until the algorithm converges.One criterion to monitor convergence is to calculate the norm of the difference between the current and last update of the vector (which is of size r 2 + 1).That means || < b should hold for a small enough and positive value of b.Following Katzfuss and Cressie (2009), b is assigned a value of 10 −6 r 2 .The starting choice of K and σ 2 ζ should be valid; strictly speaking, K [0] must be symmetric and positivedefinite and σ 2 ζ [0] must be positive; i.e., K [0]  = 0.9 • var( Z)I r and σ 2 The measurement error variance σ 2 is estimated separately from the empirical semivariogram of the data.Estimating both σ 2 and σ 2 ζ from the data is not a trivial task.That is because the nugget effect in the semivariogram reflects not only the error variance, but may also be affected by the fine-scale variance.Therefore, having information about the error distribution and variance is worthwhile.In our case we estimate σ 2 using the method of a robust semivariogram (Cressie, 1993), 2γ (h) = where h is the separation distance, assuming the signal is spatially isotropic.To obtain an estimate of σ 2 , a straight line is fitted to the estimated semivariogram at short h.Since the slope of the structure function (variogram) describing atmospheric turbulence is expected to vary with h, we made the line fitting based on the estimates of the first 3 km (empirically defined).Let the line fit be γ (h) = γ (0+) + bh; then, the estimate of σ 2 is Should γ (0+) have a negative value, σ 2 is set to zero.
The estimate of K using the detrended PWV maps estimated from the PSI + GNSS and model data on 5 September 2005 is shown in Fig. A1.The corresponding covariance function is also shown.The matrix S is constructed as described in Sect.4.3 using the nodes setup in Fig. 8.The K EM has a maximum value for the element (29,29), which is equivalent to estimating at the node in the lower right corner at the location (8.524 • E, 48.69 • N); see Fig. 8.This can be explained by the sparseness of PWV estimates close to this node, and the PWV values from PSI and GNSS are significantly higher than those from the model.The covariance matrix is computed for the observations binned into 7 km × 7 km blocks to demonstrate covariance structure.We observe from the covariance matrices that the variances, on the main diagonal, increase in areas of sparse observations.The reader should note that the observations do not exist on a regular grid (due to the spatial distribution of PS points); hence, the covariance values in the off-diagonal cells can be negative and then again positive.

Figure 1 .
Figure 1.Maps of the absolute atmospheric PWV derived by combining PSI and GNSS data and the corresponding map from MERIS.The spatial correlation is 95 % and the rms value of the differences is 0.68 mm.

Figure 2 .
Figure 2. WRF model set up with a parent domain of resolution 27 km × 27 km and two nests of 9 km × 9 km and 3 km × 3 km, respectively.

Figure 3 .
Figure 3. Maps of PWV content as received from MERIS and WRF, where a linear trend is subtracted from each map.The upper data are received on 27 June 2005 (09:51 UTC), the lower data on 5 September 2005 (09:51 UTC).Gaussian averaging is applied to scale the MERIS data at WRF resolution, 3 km × 3 km.The spatial correlation coefficient between the upper maps is 0.8 and 0.71 for the lower maps.

Figure 4 .
Figure 4. Point and block data, such that for spatial data, Y (B i ) represents the average of the point data within the block.

Figure 5 .
Figure 5. Spatial autocorrelation function for a PWV map, with the long-wavelength component removed, computed from remote sensing data acquired on 5 September 2005, 10:51 UTC.

Figure 6 .
Figure 6.The observation domain with the black dots defines the locations at which the data are available.The black little squares indicate the nodes.The weights for each location s are related to the distances d i .The dashed circles define the radius for each node.

Figure 8 .
Figure 8. FRK nodes or center locations of 93 basis functions at three spatial resolutions.The first resolution is 40 km, the second resolution is 20 km, and the third resolution is 10 km.
Figure 9. (a) The experimental semivariogram and the fitted spherical variogram model.(b) Covariance matrix used to predict the wet delay maps in Fig. 10.

Figure 10 .
Figure 10.Wet delay prediction map using block OK and FRK.The resolution of the grid is 3 km × 3 km.A point-level wet delay map, on 23 May 2005 at 09:51 UTC, is used as input to the algorithms.

Figure 11 .
Figure 11.PWV maps from the PSI + GNSS combination and WRF on 5 September 2005, with a linear trend subtracted from each map.PSI + GNSS provide point-level observations, while WRF generates block data with a block size of 3 km × 3 km.The predictions will be obtained within the area indicated by the black box.

Figure 12 .
Figure 12.PWV prediction and MSPE maps obtained by data fusion of PWV estimates from PSI and GNSS and maps from WRF as well as predictions obtained by applying FRK to individual data sets.The data are available on 5 September 2005 at 09:51 UTC.The output grid has a block size of 3 km × 3 km.The label A defines a region of sparse remote sensing data and the model data in region B are highly overestimated.

Figure 13 .
Figure 13.PWV maps from remote sensing (PSI+GNSS) and WRF model data on 27 June 2005 at 09:51 UTC as well as prediction maps obtained by data fusion and individual data sets.The output grid has a block size of 3 km × 3 km over the area indicated by the black box in (a) and (b).

1Figure A1 .
Figure A1.Estimate of the covariance matrix K using the E-M algorithm and the corresponding covariance matrix for the wet delay map from PSI + GNSS.The wet delay observations are aggregated into maps of 7 × 7 km 2 cells before their covariance matrices are computed.

F. Alshawaf et al.: Atmospheric water vapor by data fusion
Figure 7. Obtaining predictions via the FRK method.

Table 1 .
Model components from point-level and areal-level data.

Table 2 .
Spatial correlation coefficients (CC) and rms values when comparing the prediction maps with MERIS PWV maps.