Towards simplification of hydrologic modeling: identification of dominant processes

The Precipitation–Runoff Modeling System (PRMS), a distributed-parameter hydrologic model, has been applied to the conterminous US (CONUS). Parameter sensitivity analysis was used to identify: (1) the sensitive input parameters and (2) particular model output variables that could be associated with the dominant hydrologic process(es). Sensitivity values of 35 PRMS calibration parameters were computed using the Fourier amplitude sensitivity test procedure on 110 000 independent hydrologically based spatial modeling units covering the CONUS and then summarized to process (snowmelt, surface runoff, infiltration, soil moisture, evapotranspiration, interflow, baseflow, and runoff) and model performance statistic (mean, coefficient of variation, and autoregressive lag 1). Identified parameters and processes provide insight into model performance at the location of each unit and allow the modeler to identify the most dominant process on the basis of which processes are associated with the most sensitive parameters. The results of this study indicate that: (1) the choice of performance statistic and output variables has a strong influence on parameter sensitivity, (2) the apparent model complexity to the modeler can be reduced by focusing on those processes that are associated with sensitive parameters and disregarding those that are not, (3) different processes require different numbers of parameters for simulation, and (4) some sensitive parameters influence only one hydrologic process, while others may influence many.


Introduction
It has long been recognized that distributed-parameter hydrology models (DPHMs) are complex because of the subtlety and diversity of the hydrologic cycle that they aim to simulate (Freeze and Harlan, 1969; Amorocho and Hart, 1964). In this study, two different aspects of this complexity are addressed: 1. DPHMs have too many input parameters (Jakeman and Hornberger, 1993; Kirchner et al., 1996; Brun et al., 2001; Perrin et al., 2001; McDonnell et al., 2007). In this article, distributed parameters are defined as model inputs that remain constant through time but can vary spatially across the landscape. Those who apply these models often have difficulty understanding what these parameters are and how they are used in the model. Frequently, several parameters have a similar effect on the computations or constrain the model in unintended ways (Hrachowitz et al., 2014). Despite developers' claims that these DPHMs are more or less physically based, measurements or data sources are often not available for reliable development of all of the input parameters. Duan et al. (2006) describe "a gap in our understanding of the links between model parameters and the land surface characteristics". These unmeasured parameters, ostensibly tangible, are really empirical coefficients when it comes to application and calibration (Samaniego et al., 2010).  Mayer and Butler, 1993; Ewan, 2011). The meaning of output variables is not always intuitive, and results can sometimes seem contradictory (e.g., when streamflow does not seem to correlate with climate information). These complex issues have led to the study of parameter interaction (Clark and Vrugt, 2006) and equifinality (Beven, 2006).
Developing effective DPHM applications requires that the modeler address these two aspects of complexity at the same time (i.e., the uncertainty problem: "If I am uncertain when estimating input parameters, due to either incomplete or inaccurate information, what effect does it have on the output?", and the calibration problem: "I know the output I want, which parameters should I change and how much should I change them?") (Chaney et al., 2015). While the user of a DPHM can do nothing about the complexity of the model's internal structure, the apparent complexity can be reduced by limiting the parameters and the affected output under consideration (as described by Jakeman and Hornberger, 1993). Global parameter sensitivity analysis can determine the degree to which different parameter values affect the simulation of certain model outputs (Sanadhya et al., 2013). Furthermore, parameter sensitivity can be evaluated with respect to selected output variables, each representing a different aspect of the hydrologic cycle (hereafter referred to as processes). Sensitivity analysis of this form can be used to identify both the input parameters that are the most sensitive (i.e., the parameters that affect the simulation the most) and the dominant process(es) (i.e., those processes that are affected most by the most sensitive parameters) according to the DPHM.
Any particular DPHM must necessarily be able to simulate any and all hydrological processes that may occur anywhere on the landscape. However, with the application of a DPHM to a specific site, it can become much less complex when the dominant hydrological process(es) are identified, as not all processes are active to the same degree. The modeling problem becomes less complex to the modeler when hydrological processes not relevant to the modeled domain or watershed are removed from consideration (Wagener et al., 2003; Guse et al., 2014; Bock et al., 2016). Related to this, various methods have been developed that will group similar watersheds together for purposes of study (Wolock et al., 2004; Winter, 2001; Ali et al., 2012) or for parameter regionalization (He et al., 2011; Merz and Blöschl, 2004; Seibert, 1999; Vogel, 2005). In addition, dominant process concepts have been explored by several researchers as a way to classify watersheds and natural hydrologic systems for the purpose of simplifying DPHMs (Sivakumar and Singh, 2012; Sivakumar et al., 2007). Some have suggested this approach for use as a possible classification framework (e.g., Woods, 2002; Sivakumar, 2004). Pfannerstill et al. (2015) developed a framework for identification and verification of hydrologic processes in simulation models on the basis of temporal sensitivity analysis. Cuntz et al. (2015) describe a method of identifying only informative parameters as a screening step in order to reduce the effort required to perform global sensitivity analysis on the full parameter space. McDonnell et al. (2007) discuss the possibility of simplifying hydrologic modeling by identifying "fundamental laws" so that over-parameterized models are not needed. However, in our opinion we have not made much progress on that front and DPHMs are, in many ways and for many reasons, more complex than ever.
This article describes an approach for identification of sensitive parameters and processes for a modeling application of the conterminous US (CONUS, Fig. 1). Identification and simulation of regional CONUS sub-watersheds are determined by the resolution of the available information and how the DPHM responds to geophysical (e.g., topography, vegetation and soils) and climatological variation. Specifically, we propose to identify the sensitive parameters and dominant hydrologic process(es), thereby reducing the amount of parameter input and the number of output variables to consider (Chaney et al., 2015) and addressing the two aspects of complexity described above.

Distributed-parameter hydrology model
The US Geological Survey's Precipitation-Runoff Modeling System (PRMS) is the DPHM used in this study. PRMS is a modular, deterministic, distributed-parameter, physical-process watershed model used to simulate and evaluate the effects of various combinations of precipitation, climate, and land use on watershed response. Each hydrologic process simulated by PRMS is encoded in a modular piece of source code (i.e., a "module") and is represented by an algorithm that is based on a physical law (e.g., balance of energy required to melt the ice in a snowpack) or empirical relation with measured or estimated characteristics (e.g., a tank model used to simulate interflow). The reader is referred to Markstrom et al. (2015) for a complete description of PRMS.
A fundamental assumption of this study is that PRMS is able to simulate and differentiate hydrologic signals from all the different processes at the scale of the CONUS. Two possible ways to evaluate this are: (1) an analysis of PRMS's internal structure, and (2) the history of PRMS applications. A detailed analysis of PRMS's structure is beyond the scope of this article (see Markstrom et al., 2015); however, PRMS is implemented in a very linear fashion. Each parameter is clearly identified with an equation that is related to simulation of a specific process. Equations are solved sequentially, generally in the order that is defined by water moving through the hydrologic cycle, starting from the atmosphere as precipitation and moving through the rivers as streamflow. The outputs of one equation may be used as inputs to subsequent equations. All of the inputs for a particular equation are required before that equation can be solved. This interdependency in equations can lead to parameter interaction in the simulation of subsequent processes (as described by Beven, 1989; Grayson et al., 1992; Yilmaz et al., 2008; Pfannerstill et al., 2015). For example, parameters related to distribution of temperature and solar radiation may show correlation with each other when evaluated with respect to simulation of evapotranspiration, despite these parameters not being explicit terms in the evapotranspiration equations. Past studies indicate that PRMS has been very useful in water-resource and research studies across the CONUS (Battaglin et al., 2011; Boyle et al., 2006; Hay et al., 2011; Markstrom et al., 2012) and is capable of matching measured data (Bower, 1985; Cary, 1991; Dudley, 2008; Koczot et al., 2011) in a variety of geophysical and climatological settings.
To define the spatial domain for the CONUS application of PRMS, the locations of major river confluences, water bodies, and stream gages have been geo-referenced. Approximately 56 000 stream segments are used to connect these locations. Using these stream segments, the left and right bank areas that contribute runoff directly to each segment have been identified, resulting in approximately 110 000 irregularly shaped hydrologic response units (HRUs) of various sizes (500 m² to 14 000 km²) (Viger and Bock, 2014). These HRUs are derived by their geographic and topographic location, affecting their extent and resolution. The CONUS application is forced with values of daily precipitation and daily maximum and minimum air temperature from the DAYMET data set (Thornton et al., 2014). The climate information covers a time period from 1980 to 2013 on a daily time step, but a shorter period (1987–1989 used for warmup, and 1990–2000 used for evaluation) was used in this study.

Calibration parameters
The version of PRMS used in this study has 108 input parameters. A parameter is defined as an input value that does not change over the course of a simulation run. Of these parameters, most would never be modified from their initial values (hereafter referred to as non-calibration parameters, see Viger, 2014) because they are (1) computed directly from digital data sets through the use of a geographic information system (e.g., land-surface characterization parameters), (2) boundary conditions (e.g., parameters to adjust daily precipitation and daily air temperature forcings), or (3) model configuration options (e.g., unit conversions and model output options). This leaves 35 parameters under consideration for improved model performance, hereafter referred to as calibration parameters (Table 1). Each parameter is used within a PRMS code module that simulates a single hydrologic process in PRMS. The output variables of one module may be used as input variables to other modules. It is through these connections that calibration parameters associated with a PRMS module may affect the results of other modules.

Hydrologic processes
PRMS produces more than 200 output variables that indicate the simulated hydrologic response of a watershed through time (Markstrom et al., 2015, see Table 5 in Appendix 1). In this study, eight of these output variables have been selected to represent the response of major hydrologic processes at the HRU resolution. These processes are: (1) snowmelt (PRMS output variable snowmelt): the amount of water that has changed from ice to liquid and becomes either surface runoff or infiltrates into the soil zone of the HRU; (2) surface runoff (sroff): water from a rainfall or snowmelt event that travels quickly over the land surface from the HRU to the connected stream segment; (3) infiltration (infil): the sum of rain and snowmelt that passes into the soil zone of the HRU; (4) soil moisture (soil_moist): the storage state that represents the amount of soil water in the soil zone above wilting point and below total saturation in the HRU; (5) evapotranspiration (hru_actet): the total actual evapotranspiration lost from canopy interception, snow sublimation, and soil and plant losses from the root zone; (6) interflow (ssres_flow): shallow lateral flow in the unsaturated zone to the connected stream segment; (7) baseflow (gwres_flow): the component of flow from the saturated zone to the connected stream segment; and (8) runoff (hru_outflow): the total flow from the HRU contributing to streamflow in the connected stream segment. It is assumed that these eight output variables are representative of the processes typically considered in hydrological studies with DPHMs. Details of how these processes are simulated by PRMS are described by Markstrom et al. (2015).

Table 1 (excerpt; soil-zone and groundwater rows).
Parameter | Description | Module | Range
… | Fraction of the soil zone in which preferential flow occurs | soil-zone | 0.0–0.1
sat_threshold | Water capacity between field capacity and total saturation | soil-zone | 1.0–999.0
slowcoef_lin | Linear coefficient for interflow routing | soil-zone | 0.001–0.5
slowcoef_sq | Non-linear coefficient for interflow routing | soil-zone | 0.001–1.0
soil2gw_max | Maximum soil water excess that is routed directly to groundwater | soil-zone | 0.0–0.5
soil_moist_max | Maximum available water holding capacity of soil zone | soil-zone | 0.001–10.0
soil_rechr_max | Maximum available water holding capacity of recharge zone | soil-zone | 0.001–5.0
ssr2gw_exp | Non-linear coefficient in equation used to route soil-zone water to groundwater | soil-zone | 0.0–3.0
ssr2gw_rate | Linear coefficient in equation used to route soil-zone water to groundwater | soil-zone | 0.05–0.8
transp_tmax | Temperature that determines start of the transpiration period | soil-zone | 0.0–1000.0
gwflow_coef | Linear groundwater discharge coefficient | groundwater | 0.001–0.5

Performance statistics
For DPHMs, there are many different performance measures that have been developed for different purposes (Krause et al., 2005; Gupta et al., 2008, 2009; Mendoza et al., 2015a, b). Because this study is an analysis of model sensitivity, the performance measures need only track changes in model output and do not necessarily need to include observed measurements. Consequently, performance statistics can be developed for processes that are not normally evaluated by performance measures. Archfield et al. (2014) demonstrated that seven fundamental daily streamflow statistics (FDSS) can be used to group streams by similar hydrologic response and tend to provide non-redundant information. In this study, all seven FDSS were computed for each of the eight PRMS time-series output variables corresponding to the processes. For the purpose of illustration, this article focuses on three of the FDSS: (1) mean; (2) coefficient of variation (CV); and (3) the autoregressive lag 1 correlation coefficient (AR-1). In an intuitive sense, these three statistics can be thought to represent changes in total volume, "spikiness" or "flashiness", and day-to-day timing, respectively. These performance statistics are computed on the daily time series of the process variables for the 10-year evaluation period.
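As a minimal sketch of the three statistics discussed above, the following Python function computes the mean, CV, and AR-1 of a daily series. The function name is hypothetical, and AR-1 is taken here as the plain lag-1 Pearson correlation; the FDSS definitions of Archfield et al. (2014) may involve additional preprocessing that is not reproduced.

```python
import numpy as np

def performance_statistics(series):
    """Return (mean, CV, AR-1) for a daily time series.

    Assumption: AR-1 is the lag-1 Pearson correlation of the raw series.
    """
    x = np.asarray(series, dtype=float)
    mean = x.mean()
    std = x.std(ddof=1)  # sample standard deviation
    # CV is undefined when the mean is zero (e.g., a process that never occurs).
    cv = std / mean if mean != 0 else float("nan")
    # AR-1 is undefined when the series never changes through time.
    if std == 0:
        ar1 = float("nan")
    else:
        ar1 = float(np.corrcoef(x[:-1], x[1:])[0, 1])
    return float(mean), float(cv), ar1
```

A series that is zero on every day returns undefined (NaN) values for CV and AR-1, which corresponds to the case of an output variable that does not change through time.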

FAST analysis
Parameter sensitivity analysis measures the variability of model output given variability of calibration parameter values. This is determined by partitioning the total variability in the model output or change in performance statistics to individual calibration parameters. The Fourier amplitude sensitivity test (FAST) (Schaibly and Shuler, 1973; Cukier et al., 1973, 1975; Saltelli et al., 2006) was selected for this study because it has been demonstrated that it can efficiently estimate non-linear hydrologic model parameter sensitivity (Guse et al., 2014; Pfannerstill et al., 2015). FAST is a variance-based global sensitivity algorithm that estimates the first-order partial variance of model output explained by each calibration parameter (hereafter referred to as parameter sensitivity). Specifically, this first-order variance is the variability in the output that is directly attributable to variations in any one parameter and is distinguishable from higher order variances associated with parameter interactions. An important caveat is that these higher order variances are not accounted for in the analysis. It is assumed that first-order partial variance is sufficient to identify sensitive parameters. This same assumption, as applied to process identification, may be more problematic. If there are sets of interactive sensitive parameters that have not been identified, then the associated process(es) will not be identified as such. Selected parameters are varied within defined ranges at independent frequencies among different model runs. FAST identifies the variability of parameter sensitivities and their ranks, by means of their contribution to total power in the power spectrum. FAST has been implemented as the 'fast' library in the statistical software R (Reusser, 2013; R Core Team, 2015) in two parts.
In the first part, the user identifies the calibration parameters and respective value ranges for the test, then FAST generates sets of test calibration parameter values (hereafter referred to as trials).
Calibration parameter values are varied across the trials according to non-harmonic fundamental frequencies. The user then runs the DPHM for each trial and computes corresponding performance statistics. Then the user runs the second part of the FAST package that performs a Fourier analysis of the performance statistics over the trial space looking for the frequency signatures associated with each calibration parameter.
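The two parts described above can be illustrated with a small self-contained sketch. It is written in Python with NumPy and is not the interface of the R 'fast' package; the function name, the triangular search curve, the choice of frequencies, and the toy model are all assumptions made for illustration. Part one generates trials by moving each parameter along a search curve at its own integer frequency; part two reads the first-order sensitivity of each parameter from the Fourier power at that frequency and its first few harmonics.

```python
import numpy as np

def fast_first_order(model, freqs, n_harmonics=4):
    """Classic FAST first-order sensitivities for a model on the unit cube.

    model: callable taking a 1-D array of parameter values in [0, 1]
           (scale to real parameter ranges inside the model if needed).
    freqs: one distinct integer frequency per parameter (assumed chosen so
           that the first n_harmonics harmonics do not coincide).
    """
    freqs = np.asarray(freqs, dtype=int)
    n = 2 * n_harmonics * int(freqs.max()) + 1        # number of trials
    k = np.arange(1, n + 1)
    s = np.pi * (2 * k - n - 1) / n                   # equispaced in (-pi, pi)

    # Part 1: generate the trials (triangular search curve, values in [0, 1]).
    x = 0.5 + np.arcsin(np.sin(np.outer(s, freqs))) / np.pi

    # Run the model once per trial; each run is independent of the others.
    y = np.array([model(row) for row in x], dtype=float)

    # Part 2: Fourier analysis of the output over the trial space.
    total_var = y.var()
    sens = []
    for w in freqs:
        harmonics = w * np.arange(1, n_harmonics + 1)
        a = (y[:, None] * np.cos(np.outer(s, harmonics))).mean(axis=0)
        b = (y[:, None] * np.sin(np.outer(s, harmonics))).mean(axis=0)
        partial_var = 2.0 * np.sum(a ** 2 + b ** 2)   # variance at w, 2w, ...
        sens.append(partial_var / total_var)
    return np.array(sens)

# Toy additive model (hypothetical): the second parameter carries 4x the variance.
sensitivities = fast_first_order(lambda p: p[0] + 2.0 * p[1], freqs=[11, 35])
```

For this additive toy model the analytical first-order sensitivities are 0.2 and 0.8, and the sketch should return values close to these. In the study itself, the model call would be replaced by a PRMS run followed by computation of a performance statistic for one process on one HRU.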
The FAST methodology results in a simple procedure for computing parameter sensitivities on an HRU basis for all the CONUS. The steps in this process are as follows:
1. Assign appropriate ranges for the 35 calibration parameters (Markstrom et al., 2015; as in LaFontaine et al., 2013). These are shown in Table 1.
2. Run the first part of the FAST procedure (as described above) to develop over 9000 unique parameter sets, composed of value combinations for the calibration parameters. The total number and content of these parameter sets, and the results from their simulation by PRMS, are completely determined by the first part of the FAST procedure in order to investigate the trial space. Each of the prescribed simulations is independent of the others, so they can run in parallel on a computer cluster.
3. Compute the FDSS-based performance statistics (mean, CV, and AR-1) for each process.
4. Run the second part of the FAST procedure (as described above) using output from step 3, resulting in PRMS parameter sensitivities, at each HRU, for the 56 combinations of seven performance statistics and eight processes (plus totals).
Results
Figure 2 shows parameter sensitivity as a set of maps ordered by process and performance statistic. This illustrates the spatial variability in parameter sensitivity and the importance that choice of performance statistic can make in terms of evaluation of hydrologic response. In these maps, the HRUs are colored according to the parameter sensitivity, which is computed by summing the first-order sensitivity for all 35 parameters separately for each of the 8 output variables, each corresponding to their respective process. (These sums do not necessarily add up to 1.) Then each individual category of modeled process and performance statistic is scaled to account for total sensitivity. This summed sensitivity across the parameters, by each category, is hereafter referred to as cumulative parameter sensitivity. Parameter sensitivities associated with process (column labeled "Process average" in Fig. 2) are averaged across all of the parameter sensitivity values computed for the different performance statistics, while parameter sensitivities associated with the performance statistics (last row labeled "Performance statistic average" in Fig. 2) are averaged across all of the parameter sensitivity values computed for the different processes. These categories are indicated by their position in the rows and columns in Fig. 2. When looking at a single performance statistic for a single process, the cumulative parameter sensitivity can vary from near 0.0 (white colored HRUs) to near 1.0 (black colored HRUs). Low values in these maps indicate that there are no parameters that can be changed in any way to affect the performance statistic (this situation is hereafter referred to as an inferior process). Likewise, each HRU has a cumulative sensitivity value (i.e., the sum of all of the partial sensitivities for each process). The process with the largest sum on an HRU is referred to as the dominant process for that HRU.
An example of an inferior process is clearly seen in the case of the mean of the snowmelt process in the southern CONUS HRUs. This is because the occurrence of snow in these areas is very infrequent. Also, there were HRUs for which the values of some performance statistics were mathematically undefined for certain processes (e.g., AR-1 and CV for the baseflow and snowmelt processes). These cases occur when the output variable representing the process does not change at all through time, regardless of the parameter values, and are extreme examples of inferior processes. Likewise, a clear example of a dominant hydrologic process is the CV of interflow in the Intermountain West region of the CONUS (Figs. 1 and 2). This means that for these HRUs, there exist some calibration parameters that can be varied and that affect this process to a very high degree.

Parameter sensitivity by process and performance statistic
Also apparent from Fig. 2 is that there are clear spatial patterns in the parameter sensitivity that follow the geographical features of the CONUS. Generally, many of the maps show a sharp break in parameter sensitivity between mountain ranges and comparatively lower elevations, between northern and southern latitudes, and between humid and arid climates. Specific contrasts can be seen in several maps, for example between the humid Midwest and the Great Plains regions, and between the Pacific coastal areas and the Desert Southwest region of the CONUS (Fig. 1). In some maps, topographic features of the landscape are prominent (e.g., elevation for interflow), while in other maps, climate considerations seem to dominate (e.g., snowmelt). Another specific example is that the mean of each process, which indicates the ability of any parameter(s) to change the total volume of water during a simulation, seems to have a low sensitivity band in the Great Plains region for all processes except for snowmelt (Fig. 1). This band of low sensitivity has been noted in other modeling studies (Newman et al., 2015; Bock et al., 2016).

Parameter count required to parameterize each process
To identify the expected count of parameters required to parameterize a particular process, cumulative parameter sensitivity across all HRUs of the CONUS has been computed and plotted (Fig. 3a-h). The sensitivity level accounted for by the most sensitive parameter, regardless of which parameter it is, for all HRUs across the CONUS is plotted in position 1 on the x axis of each of these plots (Fig. 3a-h). Then, cumulative sensitivity is plotted for the parameter in rank 2, and so on, until the cumulative sensitivity of all 35 calibration parameters is accounted for. The plots in Fig. 3a-h show that far fewer than the full 35 parameters are needed to account for most of the parameter sensitivity. In fact, to account for 90 % of the parameter sensitivity, this count varies from a low value of just over two for snowmelt to an average high value of over nine for runoff in selected HRUs. The actual count of calibration parameters required to account for 90 % of the parameter sensitivity varies by process and region, as shown by the maps in Fig. 3i-p. These maps were generated by counting the number of parameters required to obtain the 90 % cumulative sensitivity level for each HRU. For example, Fig. 3o indicates that for the baseflow process, between three and nine parameters are needed to account for 90 % of the parameter sensitivity in the various HRUs across the CONUS, with the higher count needed in mountainous, Great Lakes, and New England regions. The maps also indicate that between 2 (Fig. 3i) and 13 parameters (Fig. 3k, n, and p) are required for parameterization of these processes. This analysis indicates that more parameters are needed to simulate the components of streamflow (e.g., baseflow, interflow, and surface runoff) than processes that do not result directly in flow (e.g., snowmelt, evapotranspiration, and soil moisture).
In addition, simulated processes that are identified as being sensitive to parameters with which they are not normally associated, may indicate that these processes are a convolution of other processes, consequently making parameters sensitive that are not normally sensitive.
Visually, these maps (Fig. 3i-p) indicate that HRU calibration parameter counts vary regionally. For most processes, higher parameter counts are seen in the more mountainous regions of the Cascade, Sierra Nevada, Rocky, Ozark, and Appalachian mountains, although this is true to a much lesser extent for the evapotranspiration and soil moisture processes (Fig. 3m and l). Higher values also seem prevalent in the New England and Great Lakes regions (Fig. 1). This result seems to indicate that, no matter which part of the hydrologic cycle is simulated, more parameters are required in these regions. In contrast, low parameter counts seem prevalent in the Great Plains and Desert Southwest regions.
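The per-HRU parameter count mapped in Fig. 3i-p can be sketched as follows. This is a minimal Python illustration, assuming that the 90 % level is taken relative to the sum of the first-order sensitivities of all parameters for one HRU and one process; the function name and the example values are hypothetical.

```python
import numpy as np

def params_needed(sensitivities, level=0.9):
    """Count the top-ranked parameters needed to reach `level` of the
    summed first-order sensitivity for one HRU and one process."""
    s = np.sort(np.asarray(sensitivities, dtype=float))[::-1]  # rank 1 first
    total = s.sum()
    if total <= 0:
        return 0  # no sensitive parameters at all (an inferior process)
    cumulative = np.cumsum(s) / total
    # Index of the first rank at which the cumulative fraction reaches level.
    first = int(np.searchsorted(cumulative, level))
    return min(first + 1, s.size)

# Illustrative values only: three parameters are needed to pass 90 %.
count = params_needed([0.05, 0.6, 0.1, 0.25])
```

Applying such a function to every HRU for a given process would yield one count per HRU, which is the quantity shown in each map.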
Finally, Fig. 3 illustrates the extent to which it is possible to decompose the parameter estimation problem into a sub-set of independent problems, and hence reduce the dimensionality of the inference problem and avoid the troublesome nature of parameter interactions. By considering a single (or reduced set of) processes and performance statistic categories at a time, the sensitive parameter space can be substantially reduced. It also illustrates that there is a strong spatial component to this decomposition. In order to make the information presented in Fig. 3 more useful for DPHM application, the particular sensitive parameters have been determined for each HRU by ranking the calibration parameters by sensitivity for each category of process and performance statistic. These rankings are summarized by counting the occurrence of each parameter across the HRUs and ranking the parameters within their respective category of process and performance statistic (Table 2). To address the issue of the spatial variability of these parameters, the percentage of the total number of HRUs for which that parameter is sensitive is shown as the number in parentheses after the parameter name in Table 2. Higher percentage values indicate that the corresponding parameter is sensitive across more of the CONUS. Refer to Table 1 for a complete description of these parameters.
When looking at the categorical parameter lists of Table 2, it is expected that different parameters would associate with different processes (i.e., along a column), but it is surprising to see how different the parameter lists are for different performance statistics (moving across a row) for the same process. An example of this is the baseflow process: the baseflow coefficient (PRMS parameter gwflow_coef) is the most sensitive parameter for performance statistics CV and AR-1, but is not even in the list of sensitive parameters for the performance statistics related to the mean of the process. This implies that this parameter is influential for affecting the timing of baseflow, while it does not have any effect on the total volume of baseflow.
Further inspection of Table 2 indicates that some calibration parameters occur in many of the 24 categories (8 processes times 3 performance statistics), while some parameters do not occur at all. A count of how many times each parameter occurs provides insight into how many process and/or performance statistic combinations that particular parameter influences. To investigate this for the CONUS application, another view of the information in Table 2 is shown in Fig. 4. The 25 sensitive calibration parameters from Table 2 are listed on the y axis of Fig. 4, ranked by the number of times that they appear in the process and/or performance statistic categories. Each appearance is indicated by an adjacent circle. Independent of the number of times a parameter occurs (number of circles), the color of each circle indicates the proportion of the CONUS HRUs that are affected by that parameter: a red circle indicates that more HRUs are affected, while blue indicates that fewer HRUs are affected. Figure 4 shows that 3 parameters affect 18 or more process and/or performance statistic categories, 7 parameters affect 7 to 14 categories, and 15 parameters affect 1 to 5 categories. Finally, of the 35 parameters studied, 10 never appear for any combination of process and performance statistic (Table 2 and Fig. 4).
It is apparent from Fig. 4 that, for the CONUS application of PRMS, the parameters affecting the most process categories are soil_moist_max (maximum available water holding capacity), jh_coef (Jensen-Haise air temperature coefficient), and dday_intcp (intercept in degree-day solar radiation equation). Because these parameters affect so many categories, modelers would be wise to invest their resources in developing the best values possible for these parameters to avoid unintended parameter interaction during calibration. Ideally, these parameters could be estimated from reliable external data, set for the model and not calibrated. The parameters that affect the fewest process categories (aside from the parameters that are never sensitive) are cecn_coef (convection condensation energy coefficient), ssr2gw_exp (coefficient in equation used to route water from the soil to the groundwater reservoir), emis_noppt (emissivity of air on days without precipitation), potet_sublim (fraction of potential evapotranspiration that is sublimated), and slowcoef_lin (slow interflow routing coefficient). Ideally, these parameters could be set to default values since there is limited value in calibrating them.
Also apparent from Fig. 4 is that there are many parameters between these two extreme groups. Parameters like smidx_coef (soil moisture index for contributing area calculation) can appear in several process categories without any high rankings, while other parameters like slowcoef_sq (slow interflow routing coefficient) appear in relatively few process categories but have high rankings. This behavior may be due to the vertical routing order (i.e., processes that occur nearer to the surface happen before the deeper ones) of the associated processes (Yilmaz et al., 2008; Pfannerstill et al., 2015). In PRMS, the partitioning of precipitation into either direct surface runoff or infiltration (controlled directly by parameter smidx_coef) is "faster" and occurs in the vertical routing order before the process of interflow generation (controlled directly by parameter slowcoef_sq). These parameters may be the best candidates for calibration because they are sensitive, while at the same time interaction across processes is perhaps limited.
Table 2. Ordered list of most sensitive Precipitation-Runoff Modeling System calibration parameters by process and performance statistic. The parameters listed in each cell of the table are those that are required to account for 90 % of the cumulative sensitivity across all hydrologic response units (HRUs). The number in parentheses following the parameter name is the proportion of the CONUS HRUs, in percent, in which that parameter is part of the set that accounts for 90 % of the cumulated sensitivity on an HRU-by-HRU basis. These parameters are described in Table 1.
(92) tmax_allrain (38), dday_intcp(29), rad_trncf(9), rad_trncf(28), freeh2o_cap(8), radmax(24), dday_intcp (7) tmax_allrain (17), jh_coef(15), freeh2o_cap (14), cecn_coef (14), emis_noppt (13)
Figure 4. [...] Table 2. The circles in each row adjacent to a parameter name indicate how many times the respective parameter occurs in these different categories. Parameters with more circles are affecting more process categories. The color of each circle indicates the extent of the spatial coverage of that occurrence; specifically, red circles (as opposed to blue) indicate that more hydrologic response units are affected by the respective parameter.
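The counting that underlies Fig. 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `categories` mapping is a hypothetical, much-reduced stand-in for Table 2 (its contents are not values from the study), and `unused_param` is an invented name standing in for a parameter that never appears.

```python
from collections import Counter

# Hypothetical excerpt of a Table 2-like structure: each (process, statistic)
# category maps to the parameters listed as sensitive for it.
# The contents are illustrative only, not results from the study.
categories = {
    ("snowmelt", "mean"): ["tmax_allrain", "dday_intcp", "rad_trncf"],
    ("snowmelt", "CV"): ["tmax_allrain", "jh_coef", "freeh2o_cap"],
    ("evapotranspiration", "mean"): ["jh_coef", "soil_moist_max", "dday_intcp"],
    ("runoff", "AR-1"): ["soil_moist_max", "jh_coef", "smidx_coef"],
}

# Full set of parameters studied; "unused_param" is a made-up placeholder.
all_parameters = {
    "tmax_allrain", "dday_intcp", "rad_trncf", "jh_coef",
    "freeh2o_cap", "soil_moist_max", "smidx_coef", "unused_param",
}


def count_category_occurrences(categories):
    """Count, for each parameter, the number of categories that list it."""
    counts = Counter()
    for params in categories.values():
        counts.update(set(params))  # a parameter counts once per category
    return counts


counts = count_category_occurrences(categories)

# Rank by number of categories (descending), ties broken alphabetically.
ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Parameters that never appear in any category.
never_listed = sorted(all_parameters - set(counts))
```

With the real Table 2 as input, `ranked` would give the y-axis order of Fig. 4 and `never_listed` would hold the 10 parameters that never appear.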

Identification of dominant and inferior processes by HRU
To identify the dominant and inferior process(es) by geographic area, the following procedure is applied to each HRU:
1. The parameter sensitivity scores are summed for each parameter, resulting in a score for each parameter for each time-series output variable and performance statistic.
2. The parameter scores are averaged by performance statistics, resulting in a score for each process.
3. The process scores are ranked for each HRU.
4. The top (and bottom) ranked process determines the most dominant (and most inferior) single process for each HRU, as shown in Fig. 5.
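The procedure above can be sketched with NumPy as follows. This is one possible reading, not the authors' implementation: it assumes the sensitivity scores are stored in a four-dimensional array indexed by HRU, process, performance statistic, and parameter, that scores are summed over parameters within each process and statistic, and that the result is then averaged over the statistics. The array layout, the function name, and the synthetic values are all assumptions made for illustration.

```python
import numpy as np

PROCESSES = ["snowmelt", "surface runoff", "infiltration", "soil moisture",
             "evapotranspiration", "interflow", "baseflow", "runoff"]


def dominant_and_inferior(sens):
    """sens has shape (n_hru, n_process, n_statistic, n_parameter).

    Returns the per-process scores and, for each HRU, the index of the
    highest-scoring (dominant) and lowest-scoring (inferior) process.
    """
    sens = np.asarray(sens, dtype=float)
    per_stat = sens.sum(axis=3)            # step 1: aggregate parameter scores
    per_process = per_stat.mean(axis=2)    # step 2: average over statistics
    # step 3: rank processes within each HRU, highest score first
    order = np.argsort(-per_process, axis=1, kind="stable")
    # step 4: first and last ranks give dominant and inferior processes
    return per_process, order[:, 0], order[:, -1]


# Synthetic scores for 2 HRUs, 8 processes, 3 statistics, 4 parameters.
sens = np.full((2, 8, 3, 4), 0.05)
sens[0, PROCESSES.index("evapotranspiration")] = 0.20
sens[0, PROCESSES.index("snowmelt")] = 0.00
sens[1, PROCESSES.index("snowmelt")] = 0.30
sens[1, PROCESSES.index("baseflow")] = 0.01

scores, dominant, inferior = dominant_and_inferior(sens)
dominant_names = [PROCESSES[i] for i in dominant]
inferior_names = [PROCESSES[i] for i in inferior]
```

Mapping `dominant_names` and `inferior_names` back onto the HRU geometry would give maps of the kind shown in Fig. 5.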
Generally, Fig. 5a shows that evapotranspiration is the most prevalent dominant process for the CONUS. This is probably because it is a major component of the hydrologic cycle and sensitive parameters are available to affect it in every HRU. However, this is not universal, and the dominant process varies by geographic region: snowmelt is the dominant process in the northern Great Plains and northern Rocky Mountains, total runoff is the most important in the Pacific Northwest, and interflow is important in bands across the Intermountain West (Fig. 1). Areas where runoff is more dominant than evapotranspiration, as in the Cascade Mountains and coastal areas of the Pacific Northwest, are locations where runoff is a substantially greater part of the water budget. Each process is dominant somewhere, depending on local conditions. Equally informative are the locations of the most inferior processes (Fig. 5b). Figure 5b shows that PRMS snowmelt parameters are not sensitive across the Central Valley of California, the Deep South, and the southwestern US (Fig. 1). Interestingly, infiltration and baseflow appear to be equally inferior across most of CONUS, with pockets of HRUs that are insensitive to soil moisture, surface runoff, and interflow, depending on local conditions. There are no HRUs that rank evapotranspiration as the most inferior process.
Dominant and inferior processes can be identified for HRUs at the watershed scale as well. Figure 5c shows the most dominant process by HRU for the Apalachicola-Chattahoochee-Flint River watershed in the southeastern US. This watershed has been the subject of previous PRMS modeling studies (LaFontaine et al., 2013).
At this finer resolution, Fig. 5c shows that evapotranspiration is the most dominant process watershed wide, but with pockets of HRUs in the northern part of the watershed where runoff is the most dominant and a pocket in the southern part of the watershed where infiltration is most dominant. Likewise, the most inferior process for each HRU is identified in Fig. 5d. Figure 5d indicates that parameters and performance statistics related to snowmelt, and to a lesser degree baseflow, do not need to be considered when modeling this watershed. Figure 5d also indicates that in the northern part of the watershed, infiltration and runoff are inferior processes as well, which could in part be due to impervious conditions around the Atlanta metropolitan area.

Causes of parameter sensitivity
There are regions where parameter sensitivity is typically high for a particular performance statistic (e.g., New England region (Fig. 1) for performance statistic based on mean of processes) or typically low (e.g., Great Plains region (Fig. 1) for mean of processes) regardless of the process (Fig. 2). Why do the HRUs of some regions exhibit parameter sensitivity to almost all processes, while others exhibit parameter sensitivity to almost none? All other things being equal, there can only be two sources of these spatial patterns:
1. The physiography that is used to define the noncalibration parameters (e.g., elevation, vegetation type, soil type) renders all calibration parameters insensitive. A theoretical example of this could be if an HRU is characterized as entirely impervious, resulting in the nonexistence of any simulated soil water.
2. Patterns in the climate data used to drive the model (e.g., daily temperature and precipitation) could control model response. A theoretical example of this could be an HRU that receives no precipitation.
The hydrologic response of the HRUs in either case would always remain unchanged, regardless of changes in any parameter value. In either case, these sources of information are independent of the DPHM and could lead to the conclusion that the dominant processes identified by the methods outlined in this article could correspond to perceptible dominant processes in the physical world (i.e., how the "real world" works).
The number of unique calibration parameters for each process in Table 2 (i.e., counting the parameters across each row) may provide some insight into the complexity of each process as represented in the model structure of PRMS. In theory, more "complicated" hydrologic processes would require more parameters for parameterization than the "simpler" ones. According to this view, runoff (16 calibration parameters), infiltration (12), and interflow (12) are the most complex processes to simulate, with soil moisture (4) being the simplest. Baseflow (11), snowmelt (11), surface runoff (10), and evapotranspiration (8) are in between. This reflects the fact that in PRMS, runoff is a much more complicated calculation, with many of the other processes directly contributing information. Also apparent is that more parameters are generally needed to simulate the components of streamflow (e.g., baseflow, interflow, and surface runoff) than processes that do not result directly in flow (e.g., snowmelt, evapotranspiration, and soil moisture). The only process that does not follow this pattern is infiltration. Storm-event-based infiltration is typically simulated with sub-daily time steps to account for the variability of time and intensity in this process. It is possible that PRMS, which is driven here by daily climate data, must compensate for the lack of sub-daily resolution with a more complex parameterization of the process.
Table 2 indicates that there are 10 calibration parameters that are never sensitive, regardless of the process or performance statistic. This indicates that, in this application, these parameters should be set to the default value, with minimal resources used to estimate them, and not be calibrated. Additional modeling studies could reveal situations where these parameters do exhibit some sensitivity, perhaps with smaller geographical domains or over different time periods. It is also possible that these parameters are never sensitive under any conditions, which would indicate a structural problem or unwarranted complexity in the DPHM; in that case, removing the associated algorithms from the source code of the DPHM may be advisable. Additional study of these 10 non-sensitive calibration parameters is required, and further review of the PRMS source code might reveal a structural problem (e.g., unintended constraint, nondifferentiable behavior, or software bug). Alternatively, the problem could be related to invalid parameter ranges in the FAST analysis or problems with the climate data used to drive the model. Finally, it could be that alternative or improved performance statistics could resolve this issue.
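How a parameter ends up "never sensitive" under the 90 % rule of Table 2 can be illustrated with a short sketch. It assumes one reading of the table caption: for each HRU, parameters are sorted by sensitivity and kept until their running sum reaches 90 % of that HRU's total. The function name, the tolerance, the array values, and `hypothetical_param` are illustrative assumptions, not part of the study.

```python
import numpy as np


def params_for_cumulative_share(sens, names, share=0.90):
    """Names of the fewest parameters, largest first, whose summed
    sensitivity reaches `share` of the HRU's total sensitivity."""
    sens = np.asarray(sens, dtype=float)
    total = sens.sum()
    if total <= 0.0:
        return []  # nothing is sensitive in this HRU
    order = np.argsort(-sens, kind="stable")   # most sensitive first
    cumulative = np.cumsum(sens[order]) / total
    # First position where the running share reaches the threshold
    # (the small tolerance guards against floating-point round-off).
    n_keep = int(np.searchsorted(cumulative, share - 1e-12)) + 1
    n_keep = min(n_keep, len(names))
    return [names[i] for i in order[:n_keep]]


# Hypothetical sensitivities: two HRUs (rows) by four parameters (columns).
names = ["jh_coef", "soil_moist_max", "dday_intcp", "hypothetical_param"]
sens_by_hru = np.array([
    [0.50, 0.30, 0.15, 0.05],
    [0.10, 0.85, 0.05, 0.00],
])

selected = [params_for_cumulative_share(row, names) for row in sens_by_hru]

# Percent of HRUs in which each parameter belongs to the 90 % set,
# analogous to the numbers in parentheses in Table 2.
percent_of_hrus = {
    name: 100.0 * sum(name in chosen for chosen in selected) / len(selected)
    for name in names
}

# Parameters that are never part of the 90 % set in any HRU.
never_selected = [n for n in names if percent_of_hrus[n] == 0.0]
```

A parameter can carry a small nonzero sensitivity in some HRUs and still land in `never_selected`, which is one reason the 10 parameters may deserve further study before being dismissed.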

Choice of performance statistic
The maps of Fig. 2 illustrate how strongly the choice of performance statistic can affect the evaluation of hydrologic response. When the maps of performance statistics within a single hydrologic process are compared (i.e., the maps across a single row), the spatial patterns and magnitude of the parameter sensitivity can be very different. This could indicate that the performance statistics based on the FDSS truly are non-redundant and are accounting for different aspects of the processes. Table 2 indicates that the baseflow coefficient (PRMS parameter gwflow_coef; Markstrom et al., 2015) is the most sensitive parameter for the CV and AR-1 performance statistics of the baseflow process, but is not sensitive for the mean. This shows that even when a parameter is known to be associated with the computation of a certain process, sensitivity analysis can reveal that the response of the simulation is completely different when the performance statistic changes. It also indicates that sensitivity analysis might be an important step in selecting an appropriate performance statistic and that uncritical application of performance statistics may be misleading.
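A small sketch shows why the three performance statistics can respond differently to the same change. The exact estimators used in the study are not given in this section, so standard sample definitions are assumed here: the arithmetic mean, the coefficient of variation (sample standard deviation divided by the mean), and the lag-1 autocorrelation coefficient. The function name and the example series are hypothetical.

```python
import numpy as np


def performance_statistics(series):
    """Return (mean, CV, AR-1) of a time series using standard sample forms."""
    x = np.asarray(series, dtype=float)
    mean = x.mean()
    std = x.std(ddof=1)                       # sample standard deviation
    cv = std / mean if mean != 0 else float("nan")
    d = x - mean                              # deviations from the mean
    denom = np.sum(d * d)
    # Lag-1 autocorrelation: covariance of neighbours over total variance.
    ar1 = np.sum(d[:-1] * d[1:]) / denom if denom > 0 else float("nan")
    return mean, cv, ar1


# Hypothetical output series, and the same series uniformly doubled.
base = [1.0, 2.0, 3.0, 4.0, 5.0]
scaled = [2.0 * v for v in base]

mean_b, cv_b, ar1_b = performance_statistics(base)
mean_s, cv_s, ar1_s = performance_statistics(scaled)
```

In this toy case, a change that only rescales the output doubles the mean but leaves CV and AR-1 untouched, so it would register as sensitive for one statistic and insensitive for the other two. The gwflow_coef result in Table 2 is the opposite pattern (sensitive for CV and AR-1, not for the mean), which is consistent with a parameter that alters the timing and variability of baseflow more than its long-term volume; that interpretation is offered only as a plausible reading.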

Spatial aspects of dominant and inferior processes
When the dominant and inferior processes are determined for an HRU (Fig. 5), it is possible that certain parameters are included in both the most dominant and most inferior processes at the same time. This apparent contradiction is not necessarily a conflict but indicates that the calibration parameters must work in concert with the evaluation method. For example, there exist HRUs where the evapotranspiration process is dominant and at the same time the runoff or infiltration processes are inferior (Fig. 5a and b). The parameter soil_moist_max is indicated as being sensitive for all three of these processes (Table 2). This parameter would demonstrate equifinality if evaluated within the context of the inferior processes (i.e., those output variables and performance statistics associated with the inferior process) but would be a very effective calibration parameter resulting in optimal values when viewed within the context of parameters and variables of the dominant process.
This method of identification of inferior and dominant processes for a specific geographical location (i.e., HRU, watershed, or region), determined by sensitivity analysis, is defined within the context of the application of the DPHM and may not necessarily have the same meaning within a different context. However, this methodology does have the ability to spatially classify watersheds and identify dominant processes. This classification scheme depends not only on the physiographic nature of the watershed, but also on the scale, resolution, and purpose that were considered by the modeler when the application was developed.

Further study
Providing modelers with reduced lists of calibration parameters on an HRU-by-HRU, watershed-by-watershed, or region-by-region basis is the first step in the path of this research. Subsequent steps to this approach could be developed into more sophisticated methods where orthogonal output variables and performance statistics could provide much more insight into methods of effective model calibration.
Other advancements in this approach may identify groups of parameters that effectively behave together, thus reducing the number of parameters and making specific model output respond more directly to a single or a few parameters, reducing parameter interaction. This suggests that model parameterization and calibration might benefit from a step-by-step strategy, using as much information as possible to set noninteractive parameters and remove them from consideration before the more interactive parameters are calibrated, reducing the dimensionality of the problem.
Another question for future research is: Does the classification of dominant hydrologic processes, both geographical and categorical, as described in this study, apply in other contexts? Comparable findings from other modeling studies, such as those by Newman et al. (2015) and Bock et al. (2016), might indicate that there could be a connection. These other studies use the same input information (i.e., being driven with the same climate data and using the same sources of information for parameter estimation), and thus simulation results and model sensitivity to this information might be similar. Also, can real world watersheds be classified by sensitivity analysis using DPHMs? Based on the findings of the work presented so far, the answer is inconclusive. Clearly there are some results that indicate that it might be possible. For example, the methods described here effectively identify "snowmelt watersheds" in the mountainous and northern latitudes, but is all of this effort necessary to do so? Might simpler methods (e.g., an isohyetal snowfall map) identify snowmelt watersheds just as effectively?
Questions remain about using parameter sensitivity for identification of structural inadequacies within the CONUS application and specifically the PRMS model itself. A full analysis of these parameters and how they relate to their respective process(es) is beyond the scope of this article, but it could reveal information about the structure of PRMS. In this study, certain hydrologic processes (e.g., depression storage, streamflow routing, flow-through lakes, and strong groundwater-surface-water interaction) were not considered because of additional data requirements and parameterization complexity. The PRMS model also allows for selection of alternative methods for many of the module types. Each of these modules uses different equations and calibration parameters. Future work might determine the effect of using different modules, or even guide the selection of PRMS modules through sensitivity analysis. Just as the spatial and temporal scope of any modeling project must be defined, the scope of the hydrologic processes, and the detail to which these processes are simulated, must be likewise defined. Also, alternative ways of defining HRUs (e.g., larger or smaller, or even based on dominant process instead of geographic location) could affect the analysis. Model development and application could perhaps proceed by first accounting for those factors that have the most effect.

Conclusion
Watersheds in the real world clearly exhibit hydrologic behavior determined by dominant processes based on geographic location (i.e., land surface conditions and climate forcings). A methodology has been developed to identify regions, watersheds, and HRUs according to dominant process(es) on the basis of parameter sensitivity response with respect to a distributed-parameter hydrology model. The parameters in this model were divided into two groups: those that are used for model calibration and those that are not. A global parameter sensitivity analysis was performed on the calibration parameters for all HRUs derived for the conterminous US. Categories of parameter sensitivity were developed in various ways, on the basis of geographic location, hydrologic process, and model response. Visualization of these categories provides insight into model performance and useful information about how to structure the modeling application, which should take advantage of as much local information as possible.
By definition, an insensitive parameter is one that does not affect the output. Ideally, a distributed-parameter hydrology model would have just a few calibration parameters, all of them meaningful, each controlling the algorithms related to the corresponding process. This would result in low parameter interaction and a clear correspondence between input and output. However, this is not always the case. Parameter interaction is unavoidable in these types of models, but this behavior is also seen in the real world. For instance, in watersheds where evaporation is very high, antecedent soil moisture is affected, which has a direct influence on infiltration. The real world process of evaporation has an effect on infiltration, just as evaporation parameters have an effect on simulation of infiltration in watershed hydrology models. Applications of distributed-parameter hydrologic models require that the uncertainty problem and the calibration problem be addressed at the same time. While the user of a DPHM can do nothing about the complexity of the model's internal structure, the apparent complexity can be reduced by limiting the parameters and the affected output under consideration.
Results of this study indicate that it is possible to identify the influence of different hydrologic processes when simulating with a distributed-parameter hydrology model on the basis of parameter sensitivity analysis. Factors influencing this analysis include geographic area, topography, land cover, soil, geology, climate, and other unidentified physical effects. Identification of these processes allows the modeler to focus on the more important aspects of the model input and output, which can simplify all facets of the hydrologic modeling application.

Data availability
The Precipitation-Runoff Modeling System software used in this study is developed, documented, and distributed by the US Geological Survey. It is in the public domain and freely available from their web site (http://wwwbrr.cr.usgs.gov/prms). Data analysis and plotting were done with the R software package (http://www.r-project.org), which is freely available, subject to the GNU General Public License.
The climate forcing data set used in this study came from the US Geological Survey Geo Data Portal (http://cida.usgs.gov/climate/gdp). The HRU delineation and default parameterization came from the US Geological Survey GeoSpatial Fabric (http://wwwbrr.cr.usgs.gov/projects/SW_MoWS/GeospatialFabric.html). Finally, the parameter sensitivity output values that were used to make the maps and tables in this article are available at ftp://brrftp.cr.usgs.gov/pub/markstro/hess.