Simultaneous estimation of model state variables and observation and forecast biases using a two-stage hybrid Kalman filter

In this paper, we present a two-stage hybrid Kalman filter to estimate both
observation and forecast bias in hydrologic models, in addition to state
variables. The biases are estimated using the discrete Kalman filter, and the
state variables using the ensemble Kalman filter. A key issue in this
multi-component assimilation scheme is the exact partitioning of the
difference between observation and forecasts into state, forecast bias and
observation bias updates. Here, the error covariances of the forecast bias
and the unbiased states are calculated as constant fractions of the biased
state error covariance, and the observation bias error covariance is a
function of the observation prediction error covariance. In a series of
synthetic experiments, focusing on the assimilation of discharge into a
rainfall-runoff model, it is shown that both static and dynamic observation
and forecast biases can be successfully estimated. The results indicate a
strong improvement in the estimation of the state variables and resulting
discharge as opposed to the use of a bias-unaware ensemble Kalman filter.
Furthermore, minimal code modification in existing data assimilation software
is needed to implement the method. The results suggest that a better
performance of data assimilation methods should be possible if both forecast
and observation biases are taken into account.


Introduction
During the last decade, data assimilation has been frequently applied for the correction of errors in hydrologic model results. These errors originate from uncertainties in meteorological forcing data, model parameters, formulation of the model physics, and initial conditions. A number of methods are available for this purpose, of which the most commonly used are Newtonian nudging (Stauffer and Seaman, 1990), the extended Kalman filter (Welch and Bishop, 1995), the ensemble Kalman filter (Evensen, 1994), variational assimilation (Rabier et al., 2000), and the particle filter (Gordon et al., 1993). These methods have been applied for the assimilation of various variables. Examples of these variables and studies that focus on their assimilation are surface soil moisture values (Crow and van den Berg, 2010), surface temperatures (Meng et al., 2009), brightness temperatures (Seuffert et al., 2004), radar backscatter values (Hoeben and Troch, 2000), snow water equivalent (De Lannoy et al., 2012), snow cover fraction (Su et al., 2010), piezometric head data (Chen and Zhang, 2006), chemical tracer data (Ng et al., 2009), and discharge values (Pauwels and De Lannoy, 2009).
In many studies, observations are used that contain both random error and significant bias (Torres et al., 2012). Furthermore, hydrologic model results do not only contain random errors, but in many cases are also prone to bias (Ashfaq et al., 2010). Typically, the above mentioned methods only function optimally when the assimilated data and the model are free of bias. In order to bypass this inconsistency, a number of studies have focused on the removal of systematic differences between the assimilated data and the model through rescaling the data to the model climatology (Reichle and Koster, 2004;Slater and Clark, 2006;De Lannoy et al., 2012). Other studies have focused on the estimation of the forecast bias in addition to the model state variables, using the discrete (Kalman, 1960) and the ensemble Kalman filter for both linear and nonlinear systems, in a 3500 V. R. N. Pauwels et al.: Simultaneous bias estimations using a two-stage hybrid Kalman filter wide range of applications ranging from groundwater modeling to soil moisture and temperature assimilation (Dee and Da Silva, 1998;Dee and Todling, 2000;De Lannoy et al., 2007;Drécourt et al., 2006;Bosilovich et al., 2007;Reichle et al., 2010). Dee (2005) further explains how forecast bias can be taken into account in a data assimilation system using the Kalman filter or variational assimilation as assimilation algorithm. The estimation of observation biases through data assimilation has been investigated as well. Derber and Wu (1998) present a simple observation bias estimation scheme for the assimilation of radiance data into an atmospheric model. Auligné et al. (2007) and Dee and Uppsala (2009) used a variational approach to estimate satellite data biases, while Montzka et al. (2013) used the particle filter for the retrieval of remotely sensed soil moisture biases. Another approach, as opposed to state updating and online bias estimation, is to update model parameters in addition to state variables, under the assumption that all forecast bias is caused by the model parameters (Moradkhani et al., 2005).
There have been two major practical approaches for forecast bias estimation with a Kalman filter: state augmentation (Drécourt et al., 2006;Kollat et al., 2011) and separate state and bias estimation (Friedland, 1969). Drécourt et al. (2006) compared both approaches using a linear groundwater model, and concluded that both methodologies outperformed the Kalman Filter without bias estimation. Two-stage state and bias estimation, referred to as bias-aware Kalman filtering by Drécourt et al. (2006), is an attractive approach where the state and the forecast bias are estimated individually. Although it is clearly demonstrated that, in the presence of forecast bias, this methodology outperforms the estimation of the model state alone (Drécourt et al., 2006;De Lannoy et al., 2007), observations are assumed to be unbiased in these studies. Furthermore, we are not aware of assimilation approaches in hydrological studies that estimate both observation and forecast bias, in addition to state variables. The objective of this paper is therefore to develop a methodology, based on the ensemble Kalman filter (EnKF), to estimate observation and forecast biases, as well as model state variables. More specifically, the methodology of Dee and Da Silva (1998), in which two Kalman filters are applied, is expanded to include observation biases as well. The major assumption of the proposed methodology, as opposed to state augmentation, is that the observation and forecast bias errors are independent of each other and of the errors in the unbiased model state variables. This assumption needs to be made in order to enable the derivation of a separate state and bias update equation. In this paper, we will demonstrate that, despite this assumption, reasonable results can be obtained. The equations for the estimation of the biases and the state variables are derived for a linear system, after which the application for nonlinear systems in an ensemble framework is explained. The method is then applied to a very simple rainfallrunoff model, into which discharge values are assimilated, first in a well-controlled synthetic experiment, and then in a real-world example. The performance of the new methodology is then analyzed in detail, and the possibilities for joint observation and forecast bias and model state estimation are assessed.
2 Derivation for a linear system

System description
The equations for the simultaneous estimation of system states and both forecast and observation biases will first be derived for a linear system. The application of the analytical expressions in an ensemble framework will then be explained.
In the bias-aware Kalman filter, the state of the system is propagated from time step k − 1 to time step k: (1) x k is the biased state vector, and f k−1 is the vector with model inputs (for example the meteorologic input data). w k−1 is the model error, which is a random error term with covariance Q k−1 . A k−1 and B k−1 are model matrices propagating states and forcings at time step k − 1 to states at time k. For the remainder of the paper, variables indicated with a[ .] refer to biased variables. The difference between this equation and the equation used in the derivation of the discrete Kalman filter (Kalman, 1960) is that, here, biased state variables are used instead of unbiased variables. The unbiased state vector x k is defined as (2) b m k is the forecast bias. The system is observed as follows: y k is the vector with the biased observations. v k is the zero mean (unbiased) observation error with covariance R k . H k is the observation matrix, and b o k is the observation bias. This equation differs from the equation used in the bias-free discrete Kalman filter through the observation bias term. The unbiased observations can be calculated as follows: The bias vectors are propagated as follows: These equations imply that, if no data are assimilated into the model, the bias at the next time step is simply assumed to be the same as in the previous time step.
In the derivation of the two-stage state and bias update equations, it is important to remark that the errors in the observation and forecast biases are assumed independent of each other and of the error in the unbiased state of the system.

Propagation of the states and biases
In a simulation framework, an estimate of the system state is used in the model. For the remainder of this paper, estimated variables are denoted with an [ .]. [ .] − indicates an a priori estimate (forecast, before the update), and[ .] + an a posteriori estimate (analysis, after the update). The a posteriori state estimate at time step k − 1 is propagated to time step k as follows: The system is observed as follows: The bias estimates are also propagated:

Update of the states and biases
The observations are used as follows to update the states and biases: K m k , K o k , and K k are the Kalman gains for the forecast bias, the observation bias, and the system states, respectively. The filter thus works in two steps. First, the a priori bias and state estimates are used to update the observation and forecast bias estimates. The update is performed using the unbiased estimates: the observation bias is subtracted from the observations, and the observation matrix is applied to the unbiased state vector. The a posteriori forecast bias estimateb m+ k is then used to calculate the a priori unbiased state estimate, which is defined aŝ This definition, in combination with the a posteriori observation bias estimates, is used to update the unbiased state estimates. It should be noted that the biased state is fed back into the model. In other words, Eq. (10) is an output equation, providing the unbiased state estimates, and is no part of the model integration. An analytical expression for the three Kalman gains is thus needed. It is relatively straightforward to prove that the last update in Eq. (9) leads to unbiased state estimates (Appendix A).

Definition of the error covariances
In order to derive an expression for the Kalman gains, a number of error covariances need to be defined. The error covariances of the unbiased states, forecast biases and observation biases can be written as Similarly, the error covariance of the biased states can be written as

The equations for a bias-aware linear system
Appendix B provides the details of the derivation of the equations for the application of the bias-aware Kalman filter. Figure 1 shows a schematic of the equations that need to be applied. The state and bias vectors in step 1 can be initialized with any appropriate prior guess, i.e., typically a spun up biased state estimate forx + k−1 , and zero estimates for the biases. In other words, the model can be applied repeatedly using the same forcings for one or a number of years, until the state variables of the first time step converge to a specific value.

Interpretation of the expressions for the Kalman gains
Step 3 and step 5 in Fig. 1 list the expressions for the three Kalman gains. These expressions can be compared to the expression for the Kalman gain for a linear, bias-unaware (or unbiased) system: In the above derived expressions, firstly the observation bias error covariance appears. For the bias estimation, the a priori bias error covariances are used, while for the state update Kalman gain the a posteriori error covariance is needed. This is a logical consequence of Eq. (9), where the a priori observation bias is subtracted from the biased observations in the innovations (the difference between the unbiased observations and simulations), while the a posteriori bias estimation is used in the state update. Further, for the bias Kalman gains (step 3 in Fig. 1

4) Updating of the bias error covariances
6) Updating of the state error covariance Fig. 1. Schematic of the methodology for a discrete Kalman filter.
is substracted from the observations.x − k −b m− k is defined as the a priori estimate of the unbiased state, with an error covariance matrix equal toP − k − P m− k . Both of these error covariances are used in the Kalman gain. For the system state update,x − k appears in the innovation, and the unbiased error covariance P − k is thus used (step 5 in Fig. 1). As a summary, the Kalman gain takes into account the uncertainty of all the terms in the update equation, which explains the different terms in the denominator for the three Kalman gain expressions.
A final remark is the minus that appears in the expression for the forecast bias Kalman gain. This is a direct result from the derivation shown in Appendix B. This can be explained by the definition of the forecast bias (Eq. 2) and the observation bias (Eq. 4). A positive forecast bias means that the biased state is larger than the unbiased state. A similar remark can be made regarding the observation bias. Assume a model with positive forecast and observation biases. Further, assume a positive observation system (in other words, all nonzero entries in H k are positive). This means that an increase in the state variables will lead to an increase in the observation. Assume also that the unbiased observation predictions are smaller than the actual unbiased observations (meaning that the expression between brackets in Eq. (9) is positive). This may imply that either the biased system state is underestimated, or the forecast bias is overestimated, or the observation bias is underestimated, or a combination of these possibilities. The possible overestimation of the forecast bias explains the minus in the expression for the forecast bias Kalman gain.

Interpretation of the method
The objective of this method is to separate the mismatch between the observations and the model results into forecast and observation bias, and random model and observation error. This is an additional difficulty as compared to the biasunaware KF, where this mismatch is separated into random model and observation error. The Kalman gain (Eq. 13) can be interpreted as the fraction of this mismatch that is assigned to the model noise, and maps this mismatch from observation space onto state-space through the observation operator H k . A similar reasoning can be made for the bias-aware KF. The Kalman gains K m k , K o k (step 3 in Fig. 1), and K k (step 5 in Fig. 1) indicate the fraction of the mismatch that can be attributed to the forecast bias, the observation bias, and the random forecast error, respectively. For the forecast bias and state estimates, these Kalman gains remap the difference between the unbiased observations and the unbiased simulations thereof to state-space.
Step 2 in Fig. 1 also shows that the propagation of the prior unbiased state error covariance contains an extra term as compared to the propagation for a bias-unaware (or unbiased) linear system: More specifically, the a posteriori forecast bias error covariance needs to be added to the forecast error covariance (the term P m+ k ). This can be explained by the update equations. Essentially, Eq. (1) shows that in the system the biased state vector is propagated. The propagation of the biased state error covariance thus appears in step 2 in Fig. 1. However, in the calculation of the unbiased state, the forecast bias is subtracted. This implies that the unbiased state error will consist of the error in the biased state and the forecast bias, which explains the extra term in step 2 in Fig. 1. The definition of the unbiased state forecast (or prior state) in Eq. (10)

General approach
The equations in Sect. 2.5 can easily be modified for an application in a nonlinear system. In this respect, a distinction must be made between the system model and the bias models. In Eq. (5) persistent (and thus linear) bias models are used, while the system matrix A k−1 is replaced by a nonlinear model. One logical way to apply the bias-aware Kalman filter is thus to have a mix between an EnKF for the system state, and a discrete KF for the biases. The use of a persistent bias model is by itself an argument for separate bias-estimation with a discrete KF. Because of the persistent nature of the bias model, an initial spread in the ensemble of biases would remain unaltered during forecast periods. This can be explained by the fact that during the forecast step no observations are used to update the biases, which do not change between time steps. Furthermore, the spread in the bias ensemble would decrease at each analysis step. This is a typical property of the EnKF: the analysis step causes the state variables (and also the biases) to merge to a specific value. Because of the different forcings and model parameters, the state variables will then again evolve to different values. However, since a persistent bias model is used, this evolution will not occur for the biases. This would cause filter divergence, unless some artificial inflation would be applied, would likely lead to inconsistencies between the state and bias estimates. Here, we will leverage off the ensemble state error covariance to approximate the bias error covariance estimate for the discrete KF.
It should be noted that the mixed approach (an EnKF for the system states and a discrete KF for the biases) has been applied in several studies focusing on bias estimation (De Lannoy et al., 2007). This approach is thus extended here for the inclusion of observation biases.

Two-stage state and bias estimation versus state augmentation
A two-stage filter has a number of advantages over state augmentation. Firstly, the dimensions of the state vector do not increase. For a small number of state variables this may not be important, but for large systems this can be a considerable advantage. The calculation of the forecast error covariance requires n 2 s · n e calculations (with n s the number of state variables and n e the number of ensemble members). If the biases are added to the state vector, the calculation of P − k would require (2 n s + n o ) 2 · n e calculations (with n o the number of observations). The increase in the required number of calculations thus evolves approximately quadratically with the number of state variables, which can be a significant drawback for large systems. It should be noted that in the application of the EnKF the forecast error covariance does not need to be calculated explicitly (Reichle et al., 2002). However, the cross-covariance between the system states and the observation simulations needs to be calculated, and a similar reasoning can be made. Another advantage is that, if a model already contains a bias-unaware EnKF, the bias-aware filter equations show that minimal code modification is needed to include the bias estimates.
The separate bias and state estimation is possible through the assumption of uncorrelated state and bias errors. State augmentation could take these correlations into account, but is computationally more expensive through the calculation of the cross-covariance between the system states and the observation simulations.

Estimation of the error covariances
Step 3 and step 5 in Fig. 1 show that a number of error covariances are needed: and P o+ k . These error covariances determine the partitioning of the difference between the observations and forecasts into the different error and bias components. However, it is not straightforward to optimally estimate each of these error covariances. Here, we describe some assumptions made for this paper. The biased state error covariance is given bỹ In order to estimate the forecast bias error covariance, one thus needs to know the unbiased and the biased state error covariances. In this paper, we assume that the unbiased state error covariance is a specified fraction of the biased state error covariance. This is an approach that is used in many papers focusing on the estimation of biases through data assimilation, including Dee and Da Silva (1998), Drécourt et al. (2006), andDe Lannoy et al. (2007). Calculating the biased state error covariance using the ensemble results, we can thus write γ is a filter parameter, between zero and one, which can be obtained through calibration. A value of zero indicates that the entire model error is assumed to be caused by bias, while a value of one indicates that noise is the only cause of errors in the model results.
The observation bias error covariances can be estimated under the assumption that a more uncertain observation prediction is accompanied by a more uncertain observation bias. For this reason, we estimate P o− k as a function of the error covariance of the observation predictions: κ is a filter parameter and can be estimated through calibration. Determining a typical value for this parameter is not straightforward, as it will depend on the magnitude of the different state variables and observations. γ and κ can thus be calibrated in order to optimize the filter performance. If the biases are correctly estimated, the innovations in the update equation (step 8 in Fig. 1) should consist of white noise since the bias is removed from both the observations and the simulations. This means that the autocorrelation length of these zero-mean innovations should be zero. Both parameters can thus be tuned in order to reach this zero value. If the system and observations are unbiased, zero values for both these parameters should be obtained. To summarize, the values for γ and κ are modified until the autocorrelation length of the innovations is equal to zero. An example of this filter parameter tuning is demonstrated in Sect. 7.2.

Summary
In summary, the bias-aware EnKF can be applied as follows. First, the states and biases are propagated from time step k − 1 to time step k: is a nonlinear operator representing the model in state-space, including the model parameters and the meteorological forcings. i is the ensemble member number, and w i k−1 is a realization of the model error, which can be obtained by a perturbation of the model parameters, state variables, and/or meteorological forcings. The next step is then the calculation ofP − k using the ensemble results: N is the number of ensemble members. P − k and P m− k are then calculated using Eq. (16), and P o− k is calculated using Eq. (17). The three Kalman gains are then calculated in step 3 and step 5 in Fig. 1 Before calculating the state Kalman gain, P o− k needs to be updated to P o+ k since this is needed in the expression (step 5 in Fig. 1). This is performed by Using the calculated Kalman gains and observation bias error covariance, the state variables and the biases can then be updated: indicates the average simulated observations, calculated across the ensemble. v i k is a random realization of the observation error, and is needed in order to ensure a sufficient ensemble spread (Burgers et al., 1998).

Evaluation of the methodology
The derived equations are tested through a synthetic study. A very simple rainfall-runoff model is first calibrated for the Zwalm catchment in Belgium. The obtained parameters are then used to generate discharge and storage values. The synthetically true storages are obtained by adding a predefined bias to the modeled storage values (which is consistent with Eq. 2). The synthetic discharge observations are then obtained by adding a predefined observation bias to the discharge, obtained with these biased storages. Furthermore, a random-error term with a predefined standard deviation is added to the synthetic discharge observations as well. This is consistent with Eq. (3). The synthetic observations are then assimilated into the model, and the retrieved storages and discharges can then be compared to the synthetic truth in order to evaluate the performance of the data assimilation algorithm.

Site and data description
The study is performed in the Zwalm catchment in Belgium. Troch et al. (1993) provide a complete description of this test site; only a very short overview is given here. The total drainage area of the catchment is 114.3 km 2 and the total length of the perennial channels is 177 km. The maximum elevation difference is 150 m. The average annual temperature is 10 • C, with January the coldest month (mean temperature 3 • C) and July the warmest month (mean temperature 18 • C). The average annual rainfall is 775 mm and is distributed evenly throughout the year. The annual actual evapotranspiration is approximately 450 mm.
Meteorological forcing data with a one-day time step from 1994 through 2002 were used in this study. The precipitation and all the variables needed to calculate the potential evapotranspiration using the Penman-Monteith equation were measured by the Belgium meteorological station in Kruishoutem. Discharge was measured continuously at the catchment outlet in Nederzwalm.

Model description
The Hydrologiska Byråns Vattenbalansavdelning (HBV) model, of which Fig. 2 shows a schematic, was originally developed by Linström et al. (1997). In this paper, the version of Matgen et al. (2006) is applied. The model uses observed precipitation (R tot (t)) and potential evapotranspiration (ETP(t)) as input, both in ms −1 . t is the time in seconds. The catchment is divided into a soil reservoir, a fast reservoir, and a slow reservoir. There are thus three state variables: the amount of water in the soil reservoir (S(t), m), the slow reservoir (S 1 (t), m), and the fast reservoir (S 2 (t), m).
A number of fluxes are calculated, which depend on the state variables of the system. The actual evapotranspiration ETR(t) (m 3 s −1 ) is first determined: λ is a dimensionless parameter, and S max is the storage capacity of the soil reservoir (m). The infiltration R in (t) (ms −1 ) is calculated as follows: b is a dimensionless parameter. After this, the effective precipitation R eff (t) (ms −1 ) is determined: The calculation of the percolation D(t) (ms −1 ) is then performed: Pe is a percolation parameter (ms −1 ), and β a dimensionless parameter. After this, the storage in the soil reservoir at the end of the time step can be calculated as follows: t is the time step in seconds. S(t + t) is always positive after model calibration.
The input in the fast reservoir R 2 (t) (ms −1 ) is then α is a dimensionless parameter. The outflow from this reservoir Q 2 (t) (m 3 s −1 ) is then determined: S 2,max is the storage capacity of the fast reservoir (m), and κ 2 (m 3 s −1 ) and ψ (-) are model parameters. After this, the storage in the fast reservoir at the end of the time step can be calculated as The input in the slow reservoir R 1 (t) (ms −1 ) is then computed: The outflow from this reservoir Q 1 (t) (m 3 s −1 ) can be calculated as κ 1 is a model parameter (m 2 s −1 ). Finally, the storage in the slow reservoir at the end of the time step is calculated: The total discharge q(t) is simply the sum of Q 1 (t) and Q 2 (t). A triangular unit hydrograph is used for runoff routing. Since in this paper daily time steps are used, and the concentration time of the catchment is only 14 h (Ferket et al., 2008), no routing needs to be performed for this study.
In summary, the model contains ten time-invariant parameters (λ, S max , b, α, P e , β, ψ, S 2,max , κ 2 and κ 1 ), and three   The model was calibrated using particle swarm optimization (Kennedy and Eberhart, 1995), using the data from 1994 through 1997. Table 2 lists the obtained parameter values, and Fig. 3 shows the comparison of the modeled to the observed discharge for the entire simulation period. Based on these results, the rainfall-runoff dynamics of the model are deemed to be sufficiently accurate to be used in a data assimilation study.

Ensemble generation
In order to thoroughly evaluate the performance of the filter, three experiments are performed. Table 1 shows an overview of these experiments. The first experiment considers only observation bias and noise, and no forecast bias. In order to generate the synthetic truth, no bias is added to the storages, and a bias of 0.5 m 3 s −1 and a random error with a standard deviation of 0.1 m 3 s −1 are added to the synthetically true discharge. These synthetic observations are then assimilated into the model with an assimilation interval of 7 days.
The second experiment considers observation bias and noise as well as forecast bias. Again, a bias of 0.5 m 3 s −1 and a random error with a standard deviation of 0.1 m 3 s −1 are added to the synthetically true discharge. However, in order to generate the synthetic truth, bias is added to the storages. Forecast bias is generated this way, in a manner consistent with the filter theory, so that one can assess to what extent the bias-aware EnKF will correctly estimate the true storage and discharge values. The value of the bias added to the storages is obtained by examining the standard deviation in the modeled storages, obtained using the calibrated model parameters. The bias was then assumed to be 10 % of this standard deviation. This resulted in a bias of 20, 0.4, and 0.2 mm for the surface, slow reservoir, and fast reservoir storages, respectively (see Sect. 6).
The third experiment considers only observation noise and forecast bias. A random error with a standard deviation of 1.369 × 10 −7 ms −1 κ 1 6.916 × 10 −7 s −1 0.1 m 3 s −1 is added to the synthetically true discharge, which is again obtained by adding a predefined bias to the storage values.
In order to investigate the possibility to estimate a temporally varying observation and forecast bias, the three experiments are repeated, but with a sinus wave added to the mean biases. The period of this wave is equal to one year and the amplitude equal to 0.25 m 3 s −1 for the observation bias, and 10, 0.2, and 0.1 mm for the surface, slow reservoir, and fast reservoir storages, respectively.
The experiments are applied with an ensemble of 32 members. The ensemble is generated by adding a Gaussian distributed random number with zero mean to each parameter value. The standard deviation of this random number is set to a fraction of the original parameter value. It was ensured that the parameter values did not exceed physical limits. At each time step, a random error is added to the observed precipitation and potential evapotranspiration. Again, the standard deviation of the random error is set to a fraction of the original observed value. The fractions to calculate the standard deviations are calibrated for each experiment to ensure an adequate ensemble spread, following De Lannoy et al. (2006). Ensemble sizes of 256, 128, 64, 32, and 16 members were analyzed, and it was found that for sizes larger than 32 the ensemble statistics no longer varied significantly. For this reason, 32 members were used for the remainder of the analysis.

Filter parameter estimation
Another issue in the application of the bias-aware EnKF is the estimation of the parameters γ (used in Eq. 16 to estimate P − k and P m− k ) and κ (used in Eq. 17). The objective of the determination of these parameters is to obtain stable bias estimates, meaning that the estimates do not continue to decrease or increase as the simulation proceeds. γ is clearly a value between zero and one, while an indicative value for κ is more difficult to determine. It has been found that a too low value for κ leads to observation bias estimates that diverge from the true value during the simulation. Figure 4 shows this for experiment 2 (a constant observation bias). The observation bias and the biases for S and S 1 evolve to clearly unrealistic values. A value of 100 for κ can be seen to lead to stable observation bias estimates. The results of the filter have been found to be less sensitive to the value of γ . A value of 0.1 for this parameter has been found to lead to stable estimates of the forecast bias. As explained in Sect. 3.3, the autocorrelation length of the innovations in step 8 in Fig. 1 should be zero with a zero mean. Figure 5 shows the ensemble average of the autocorrelation functions for each ensemble member for the case of sinusoidal observation and forecast biases. Clearly, for all lags larger than 0, the autocorrelations approach zero, which indicates that γ has been adequately estimated. Figure 6 shows the estimated storages for each of the experiments described above, using a constant observation bias. Figure 7 shows the analysis of the modeled discharges and the evolution of the estimated observation bias. For each of the three experiments, the estimated unbiased storages and discharges are compared to the storages obtained using a bias-unaware EnKF in which only states are estimated and biases are not taken into account. Figure 8 shows the time series of the true discharge, the results of the baseline run, and the results of the bias-aware EnKF for a selected period in the simulation period. Examining the results of experiment 1 (only observation bias), the advantage of the bias estimation is evident. The bias-unaware EnKF leads to relatively large errors in the storages, which practically disappear when bias is taken into account. Figure 7 shows that taking into account the observation bias also leads to an almost perfect estimate of the discharge. The observation bias shows fluctuations around the Hydrol. Earth Syst. Sci., 17, 3499-3521 mean value, but the true value of 0.5 m 3 s −1 is retrieved with reasonable accuracy. Similar conclusions, but to a lesser extent, can be drawn for the results of experiment 2 (both observation and forecast bias). Figure 6 shows that the estimation for the three storages is better for the bias-aware EnKF as compared to the bias-unaware EnKF. For S 1 and S 2 an almost perfect estimate is obtained, while for S the bias is reduced but not eliminated. However, as S does not directly influence the discharge (see the equations in Sect. 6), this should not lead to errors in the discharge estimation. Figure 7 shows that indeed the discharge is almost perfectly estimated, and that the estimated bias fluctuates around the correct value.

State and bias estimation for a constant observation and forecast bias
Regarding experiment 3 (only forecast bias), the results from the bias-aware EnKF are better than for the biasunaware EnKF. Figure 6 shows that for S the improvement is marginal, but for S 1 and S 2 a reduction in the RMSE (root mean square error) can be observed, while the bias in the estimates is almost unaltered. Figure 7 shows that the discharge is almost perfectly estimated, and that the estimation of the observation bias again fluctuates around the correct value. In experiments 2 and 3, S remains biased due to an observability issue: the observed (assimilated) discharge is not directly related to S, and hence this information cannot be efficiently used to update the bias estimate.
Further examination of Figs. 6 and 7 shows that the estimated storages and discharges obtained using the biasaware EnKF are almost identical for experiment 2 and experiment 3. This can be explained by the relatively accurate estimation of the observation bias. Since the bias-unaware EnKF does not take this into account, the results for experiment 3 (where no observation bias is present) are significantly better than the results for experiment 2 for the bias-unaware EnKF. Table 3 shows the relative RMSE increase for both the bias-aware and the bias-unaware EnKF. This relative increase is defined as RI is the relative increase (%), RMSE a the RMSE of the assimilation run, and RMSE b the RMSE of the baseline run (the run in which no data are assimilated). The RMSE values are calculated both for the state variables and the discharges. A positive value indicates an increase in RMSE as compared to the baseline run, a negative value a decrease. If observation and forecast bias are not taken into account, the analyzed discharge tends to be better estimated than for the baseline run, but this does not necessarily mean that the storages are better estimated. In other words, better discharge forecasts do not necessarily lead to better estimates of state variables. On the other hand, when biases are taken into account, the state variables are better estimated as compared to the baseline run (except for S). This also leads to better discharge estimates as compared to the results of a bias-unaware EnKF.  Figure 9 shows the comparison of the modeled storages to the synthetic truth for the three experiments with a sinusoidally evolving observation and forecast bias. This sinusoidal bias was chosen since discharge observations tend to show a seasonal cycle. The sinusoidal bias may be an exaggerated simplification of the true bias, but it is still more realistic than a constant bias. Figure 10 shows the comparison of the modeled discharge to the synthetic truth and the evolution of the observation bias. The same conclusions can be drawn for the experiments with a constant observation bias. In all cases the observation bias and the discharge are better estimated with a bias-aware EnKF, as compared to when a bias-unaware EnKF is used. The estimation of the storages also strongly improves when a bias-aware EnKF is used, as opposed to the use of the bias-unaware EnKF. The only exception is the estimation of S for experiment 3, where the bias and the RMSE slightly worsen. However, this storage does not directly influence the discharge, which explains the almost perfect estimation of the discharge. Similar for a constant observation bias, the results from experiment 2 and experiment 3 are almost identical when a bias-aware EnKF is used, but not when a bias-unaware EnKF is used. This can be explained by the relatively accurate estimation of the observation bias. Table 3 shows that, for the comparison of the results to the results of the baseline run, the same conclusions can be drawn for temporally constant observation and forecast biases. If biases are not taken into account, slightly better discharge estimates than the baseline run are obtained, with not necessarily better state estimates (except for experiment 3 where the discharge estimates are also worsened). Taking into account the biases again leads to better state estimates (except for the storage S).

Benefit of dual observation-forecast bias estimation
In order to demonstrate the benefit of the estimation of both forecast and observation biases, as opposed to the estimation of forecast biases alone (Dee and Da Silva, 1998;Dee and Todling, 2000;Drécourt et al., 2006;De Lannoy et al., 2007;Bosilovich et al., 2007;Reichle et al., 2010), the experiments described above were repeated, but the estimation of the observation bias was turned off. Under these conditions, the state and forecast bias estimation reduces to the methodology described in De Lannoy et al. (2007). Table 4 shows the results of the comparison of the analyzed storages and discharges to the synthetic truth. These statistics can be compared to the statistics in Figs. 6, 7, 9, and 10 in order to assess whether the estimation of both forecast and observation biases leads to better results than the estimation of forecast bias alone.
In all cases, as expected, the incorporation of an observation bias estimation leads to a better estimate of the states and discharges. This is expressed by the lower bias and RMSE values in Figs. 6, 7, 9, and 10 than in Table 4. These results thus clearly show the benefit of a dual state and observation and forecast bias estimation.

Analysis of the ensemble spread
A very important aspect of the application of the Kalman filter is to ensure an adequate ensemble spread. Figure 11 shows the square root of the diagonal elements of the biased a priori error covariance matrix (thus the ensemble spread) and the observation error covariance for the case with a sinusoidal observation and forecast bias. From these plots the conclusions can be drawn that the ensemble spread is stable throughout the simulation and that ensemble collapse or instability do not occur. For the other experiments, similar results were obtained. Figure 12 shows the relationship between these ensemble spreads and the simulated discharge. The surface storage is only slightly correlated to the total discharge, while the spread in the surface and groundwater reservoir storages is more strongly correlated to the total discharge. As can be expected, the standard deviation of the observation bias is strongly correlated to the total discharge.

Observability of the biases
An important aspect in the application of the Kalman filter is the observability of the system. This can be examined through the observability matrix, of which the rank needs to be equal to the number of state variables. This observability matrix can be written as n is the number of state variables, and A k is the Jacobian matrix, resulting from the linearization of f k,k−1 in Eq. (18).
Since we are using persistent bias models, with less observations than the number of state variables, the rank of the observability matrix for the bias vectors will always be smaller than the number of state variables. In other words, the use of persistent bias models will always lead to an unobservable system. However, this observability issue is bypassed by the use of the parameters γ and κ in Eq. (16) and κ in Eq. (17). What these parameters essentially perform is first separate the biased background error covariance into an error covariance of the unbiased states and an error covariance of the forecast bias and then relate the biased background error covariance to the observation bias error covariance. If these two parameters are well chosen by examining the correlation length of the innovations, one should expect an adequate partitioning of the mismatch between observations and simulations into biases and state variables. This is demonstrated by the results mentioned in the sections above. The development of realistic bias models is under investigation, but this subject falls outside the scope of this paper.

Assimilation of in situ discharge data
In order to demonstrate the applicability of the dual bias estimation in real-world situations, in situ observed discharge rates are assimilated instead of synthetically true values. Again, an observation error standard deviation of 0.1 m 3 s −1 has been assumed, and an assimilation interval of 7 days has been used. Persistent bias models are again used. Figure 13 shows the results of this experiment. A seasonal cycle in the observation bias can be seen. This is very realistic, because there is a relatively large scatter in the discharge-water level relationship that is used to invert the water level observations into discharge values (W. Defloor, Department of Operational Water Management, Flemish Environmental Agency, personal communication, 2011). Since the water levels are usually low in the summer and high in the winter, this could lead to a seasonal signal in the observation bias. The bias in the storages evolves to realistic values. The unbiased discharge estimates and observations also show a slightly better match than the biased values, which increases confidence in the obtained results. This real-world application shows, at least for this relatively simple model, that the methodology can also be used in realistic situations. Further investigation using more complicated models in more challenging environments (for example the presence of snow and frozen soils) is needed. However, this is outside the scope of the paper, which focuses on the development of the theoretical framework and the initial assessment using a simple model.

Discussion and conclusions
In this paper, we presented a two-stage hybrid Kalman filter for the simultaneous estimation of forecast and observation biases and model state variables using the EnKF. The filter equations were first derived for a linear system. A first step consists of the updating of the forecast and observation biases, which are then used to update the unbiased state estimates. The Kalman gains for the forecast bias, the observation bias, and the model state variables can be interpreted as the fraction of the mismatch between the observations and the corresponding model simulations, which can be attributed to forecast bias, observation bias, and random forecast and observation error, respectively, remapping this mismatch onto state-space for the model state and bias estimates. The equations were then modified for nonlinear processes and observation systems in an ensemble framework. We followed the same approach as in De Lannoy et al. (2007). More specifically, a hybrid approach between an EnKF and a discrete KF is suggested, in which the state vector is estimated using the EnKF, and the biases are estimated using the discrete KF. The unbiased forecast error covariance and the forecast bias error covariance are calculated as fractions of the biased forecast error covariance. The observation bias error covariance is calculated as a multiplication of the observation prediction error covariance. In order to apply the algorithm to existing data assimilation codes, minimal code modification is needed.
The developed methodology was then applied to a rainfallrunoff model in situations with either only observation bias, both observation and forecast bias, or only forecast bias. The bias-aware EnKF with both observation and forecast bias estimation outperformed the bias-unaware EnKF as well as the bias-aware EnKF with forecast bias estimation only.
The filter depends on two parameters, γ and κ, which are assumed constant in time and which are used to estimate the bias error covariances. κ has been estimated by examining the temporal evolution of the forecast bias. For low values of κ, a continuous absolute growth of the forecast and observation bias was observed. The results were found to be less sensitive to the values for γ . It can be argued that these parameters should be made temporally variable, thus that a model for these two parameters would have to be developed. A more realistic model for the evolution of the biases could also be developed. Although this could certainly be the subject of further investigations, this falls outside the scope of this paper. A next step in the development of the two-stage hybrid filter will be the assimilation of remotely sensed data such as soil moisture values into a spatially distributed, physicallybased land surface model. The reliability of the method can be assessed by examining the modeled discharge. It is well known that a good estimation of surface soil moisture values will not necessarily lead to a good estimation of modeled discharge and vice versa (Lin et al., 1994). As a consequence, the use of discharge data for the calibration of hydrologic models will probably lead to biased surface soil 3516 V. R. N. Pauwels et al.: Simultaneous bias estimations using a two-stage hybrid Kalman filter Fig. 11. Time series of the standard deviation of the a priori biased state error covariance (top three panels) and the a priori observation bias error covariance (bottom panel) for the experiment with sinusoidal observation and forecast biases. moisture estimates. The reliability of the newly developed method can thus be examined by examining the modeled discharge values.
This paper presents an initial step towards dealing with the complex problem of joint observation and forecast bias and state estimation. The overall conclusion, based on the results that have been described, is that there is potential to improve data assimilation results if both observation and forecast bias are taken into account, as opposed to the use of bias-unaware Kalman Filters.  (1) the standard deviations of the a priori biased state error covariance and the a priori observation bias error covariance, and (2) the simulated discharge. In the two top and the left-hand side bottom panels, P in the y axis refers to the a priori forecast error covariance. In the bottom right-hand side panel, Po refers to the a priori observation bias error covariance.

Proof of the unbiased state estimate
It is relatively straightforward to prove that the last update in Eq. (9) leads to unbiased state estimates. We can calculate the temporal average of the state estimate error: The following definitions of the a priori estimate errors are used: A similar definition is used for the a posteriori estimate error. In Eq. (A1) we add and subtractx k to the right-hand side term, and use the previous definition: Through substitution of Eq.
(3) we can write Substitution of Eqs. (A2) and (2) leads to Rearrangement and again substituting Eq. (A2) result in the following expression: By definition, each individual error component on the righthand side is equal to zero. For example, for the state vector we can write We add and subtract e − k and substitute this once by Eq. (A2): This reduces to Comparison to Eq.
(2) shows that by definition the estimate error average needs to be zero. The temporal average of the a posteriori estimate error is thus equal to zero, which means that the system estimates an unbiased state.

Appendix B
The derivation for a linear system

B1 Calculation of the observation bias Kalman gain
First the expression for the Kalman gain for the observation bias will be derived. For this purpose, the a posteriori error covariance of the observation bias (P o+ k ) needs to be minimized. Substitution of Eq. (9) by Eq. (11) leads to