Suggested Citation:"6. Regression Models of Water Use." National Research Council. 2002. Estimating Water Use in the United States: A New Paradigm for the National Water-Use Information Program. Washington, DC: The National Academies Press. doi: 10.17226/10484.

6
Regression Models of Water Use

This chapter explores the structure of the past National Water-Use Information Program (NWUIP) state-level aggregated water use data, based on corresponding (and routinely collected) demographic, economic, and climatic data. The purpose of this inquiry is to determine whether multiple regression models have the potential to explain the temporal and geographic variability across the United States of the aggregated water use estimates produced by the NWUIP. The statistical models examined here are derived using the U.S. Geological Survey (USGS) estimates of total withdrawals for public supply use and thermoelectric power use. A complete analysis of historical withdrawals is described in Dziegielewski et al. (2002a).

NATIONAL WATER USE DATA

Total water use in the United States has been estimated by the USGS every five years since 1950. National estimates focus primarily on measuring total water withdrawals, which include the annual extractions of both fresh water (with separate estimates for surface water and groundwater withdrawals) and saline water. The total withdrawals are subdivided into categories; all point withdrawals are aggregated and reported at the county and state levels. The structure of these reported withdrawals in 1995 (Solley et al., 1998) can be represented as:

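$$TW_t = \sum_{i} \left( PS_{it} + DM_{it} + CM_{it} + IR_{it} + LS_{it} + IN_{it} + MN_{it} + TE_{it} \right) \qquad (6.1)$$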

where

TW_t =	total (fresh and saline) water withdrawals in all states, the District of Columbia, Puerto Rico, and the U.S. Virgin Islands in million gallons per day (MGD) during calendar year t
PS_it =	public supply withdrawals (in state i during year t), MGD
DM_it =	domestic (self-supplied) withdrawals, MGD
CM_it =	commercial (self-supplied) withdrawals, MGD
IR_it =	irrigation withdrawals, MGD
LS_it =	livestock withdrawals, MGD
IN_it =	industrial (self-supplied) withdrawals, MGD
MN_it =	mining withdrawals, MGD
TE_it =	thermoelectric withdrawals, MGD

In the 1995 compilation, freshwater withdrawals were estimated for all eight categories (or sectors), and saline water withdrawals were estimated for industrial, mining, and thermoelectric categories. The freshwater withdrawals are separated into groundwater and surface water for all sectors, and saline withdrawals are separated by source for industrial, mining, and thermoelectric sectors. For example, the total withdrawals for thermoelectric power use, TE_t, can be represented as:

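$$TE_t = \sum_{i} TE_{it} = \sum_{i} \left( TE_{fs,it} + TE_{fg,it} + TE_{bs,it} + TE_{bg,it} \right) \qquad (6.2)$$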

where

TE_it =	withdrawal for thermoelectric power use in state i during year t; and the subscripts f, b, s, and g respectively indicate freshwater, brackish or saline water, surface water, and groundwater.

These eight categories are nonoverlapping and sum up to total withdrawals. However, public supply withdrawals include water delivered by public water supply systems to some commercial, industrial, and thermoelectric uses, and detailed sectoral-use tables in Solley et al. (1998) show both the self-supplied withdrawals and deliveries of water to each sector.
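The additive structure of Equation 6.1 can be sketched in a few lines of Python. This is only an illustration: the two accounting units and all withdrawal figures are hypothetical, and the function names are not part of any USGS tool.

```python
# Sector codes follow the definitions given for Equation 6.1.
SECTORS = ("PS", "DM", "CM", "IR", "LS", "IN", "MN", "TE")

# Hypothetical withdrawals in MGD; these are NOT USGS estimates.
withdrawals = {
    "State A": {"PS": 800.0, "DM": 60.0, "CM": 40.0, "IR": 1500.0,
                "LS": 30.0, "IN": 400.0, "MN": 20.0, "TE": 5000.0},
    "State B": {"PS": 300.0, "DM": 20.0, "CM": 10.0, "IR": 50.0,
                "LS": 5.0, "IN": 100.0, "MN": 15.0, "TE": 1200.0},
}


def state_total(by_sector):
    """Sum the eight nonoverlapping sector withdrawals for one state."""
    return sum(by_sector[sector] for sector in SECTORS)


def national_total(data):
    """Sum the state totals over all accounting units (TW_t)."""
    return sum(state_total(by_sector) for by_sector in data.values())


print(state_total(withdrawals["State A"]))
print(national_total(withdrawals))
```

Because the categories do not overlap, the order of summation (by sector first or by state first) does not affect the total.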

The reported estimates are obtained primarily from detailed inventories of point withdrawals within each accounting unit (i.e., county or state). The point withdrawals represent measured volumes of water at pumping or diversion points or estimates of the withdrawn volumes based on the time of pump operation, irrigated acreage, or some other indirect measure. Indirect measures depend on water use category and assume a specific relationship between the quantities of water use and the values of the corresponding indirect measures (USGS, 2000, Chapter 11). Statistical models of water use permit an explicit consideration of the relationships between water use and these indirect measures. These relationships are discussed in the following section.

WATER USE RELATIONSHIPS

Water use at the state level can be estimated indirectly by using multiple regression analysis. In regression models, water use relationships are expressed in the form of mathematical equations, showing water use as a mathematical function of one or more independent (explanatory) variables. The mathematical form (e.g., linear, multiplicative, exponential) and the selection of the right-hand-side (RHS) or independent variables depend on the category and on aggregation of water demand represented by the left-hand-side (LHS) or dependent variable. A large number of econometric studies of water use have been conducted. Hanemann (1998) summarizes the theoretical underpinnings of water demand modeling and reviews a number of determinants of water demand in major economic sectors. Useful summaries of econometric studies of water demand can be found in Boland et al. (1984). Dziegielewski et al. (2002b) reviewed a number of studies of aggregated sectoral and regional demand. A substantial body of work on model structure and estimation methods was performed by the USGS (Helsel and Hirsch, 1992).

Depending on the purpose for which the estimates are used, the dependent variable (i.e., water use) can be presented in different ways. For example, in studies of surface and groundwater resources, the data are usually available as daily, monthly, or yearly withdrawals at a point such as a river intake or a well. Because the water withdrawn is typically used (or applied) over a larger land area, an equivalent hydrologic definition of water use would be the use of water over a defined geographical area (e.g., an urban area, a county, or a river basin). As shown in Equation 6.1, total water use within a larger geographical area such as a county or state can be presented as a sum of water use by several groups of users within a number of subareas.

Generally, water use at any level of aggregation can be modeled as a function of one or more explanatory variables. However, the best results are obtained by breaking down total water use by sector, because different subsets of explanatory variables apply to different sectors. For example, public supply withdrawals can be estimated using the following linear model:

$$PS_{it} = a + \sum_{j} b_j X_{j,it} + \varepsilon_{it} \qquad (6.3)$$

where PS_it represents public supply withdrawals within geographical area i during year t, X_j is a set of j explanatory variables (e.g., air temperature, precipitation, price of water, median household income, and others), which are expected to explain public supply withdrawals, and ε_it is a random error term. The coefficients a and b_j can be estimated by fitting a multiple regression model to the historical data. This procedure has some parallels in modeling river loads, sediment-rating curves, and urban nonpoint pollution loads. Examples of studies of those subjects, which utilize statistical approaches, include Cohn et al. (1989) and Christensen et al. (2000).
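As a rough illustration of how the coefficients a and b_j in Equation 6.3 might be estimated, the sketch below fits an ordinary least squares model with NumPy. The data are synthetic (generated from assumed "true" coefficients plus noise), the two explanatory variables are chosen only for illustration, and none of the numbers come from the NWUIP data.

```python
import numpy as np

# Synthetic data: 192 observations, two hypothetical explanatory variables.
rng = np.random.default_rng(0)
n = 192
price = rng.uniform(1.0, 4.0, n)      # $/1,000 gal (made up)
precip = rng.uniform(5.0, 30.0, n)    # inches (made up)

# Assumed "true" model used only to generate the synthetic response.
true_a = 180.0
true_b = np.array([-8.0, -2.0])
X = np.column_stack([price, precip])
y = true_a + X @ true_b + rng.normal(0.0, 5.0, n)

# Add a column of ones so the first fitted coefficient is the intercept a.
design = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
a_hat, b_hat = coef[0], coef[1:]

# Share of variance explained by the fitted model.
residuals = y - design @ coef
r_squared = 1.0 - (residuals ** 2).sum() / ((y - y.mean()) ** 2).sum()

print(a_hat, b_hat, r_squared)
```

A dedicated statistics package would also report standard errors and t-ratios; this sketch only recovers the point estimates and the R² value.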

WEATHER NORMALIZATION OF WATER USE

The quantity of water withdrawn in any given year depends on weather conditions. Water withdrawals for most purposes increase during periods of hot and dry weather and decrease during periods of cool and wet weather. This dependence of withdrawals on weather conditions can be determined by including weather-related variables in the set of explanatory variables X_j in Equation 6.3 above.

The accuracy of the weather adjustment depends on the length of the time interval used in data averaging. The best results are obtained by modeling time-series data on daily or weekly water use; the relationship can be masked when monthly and seasonal data are used. For example, water use is negatively correlated with precipitation. However, if monthly data are used, it is possible that total precipitation during a given month could be higher than normal but concentrated during the last two days of the month. Water use during that month would be higher than normal because of the dry conditions during all but the last two days of the month, thus indicating a misleading positive correlation between water use and precipitation.

The selection of variables to represent weather conditions depends on the sector. In models of domestic demands, commonly used measures of weather conditions include antecedent precipitation (or antecedent rainless days) and air temperature. Evapotranspiration is often used in models of water use for landscape watering and irrigation demands, and cooling degree-days and heating degree-days are used to estimate industrial demands or thermoelectric power use (Boland et al., 1984; Dziegielewski et al., 1996).

The use of weather variables in multiple regression models is illustrated in the later sections of this chapter. The next section explores the structure of water demand in public supply sector water use and presents several statistical models that were fitted to the historical estimates of public supply withdrawals in the lower 48 states.

STATE-LEVEL MODELS OF PUBLIC SUPPLY WITHDRAWALS

Public supply water is water withdrawn by public or private water suppliers and delivered to users. The public supply withdrawals estimated by the NWUIP for the years 1980, 1985, 1990, and 1995 in each of the lower 48 states were used in regression analysis. Twenty-one variables were selected as likely predictors of public supply withdrawals at the state level, including the following:

Population: resident state population, population served, population density, and percent urban population;
Income: median family income, state per capita income;
Economy/employment: civilian labor force, gross state product per capita, average (weighted) price of water;
Housing mix: percentages of single-family housing units, multifamily housing units, and mobile homes;
Weather: total precipitation (during growing season), average air temperature (during growing season), and extreme monthly value of Palmer Drought Severity Index (PDSI); and
State water law: prior appropriation, riparian or riparian with permits.

These variables are measures of demographics, affluence, economic activity, housing stock, weather, and water allocation arrangements. Six indicator (binary) variables were constructed to represent the legal systems of water rights in each state for allocating surface water and groundwater to uses and users. A measure of “dryness” for weather conditions was chosen as the lowest monthly value of the PDSI during the data year for each state. PDSI may have significant limitations in capturing the effects of dry weather on water use and has been found not to be a nationally consistent measure of dryness (Alley, 1984; Guttman et al., 1992). There are other indicators of the evaporative demand of the atmosphere as it affects the consumptive use of water (e.g., Class A pan evaporation, reference crop evapotranspiration); however, the availability of such measures at the geographical scales used in this analysis is limited.

Population served by public water supply systems was used to express the dependent variable as average public supply withdrawal per capita per day for each state and data year. If the per capita rate of withdrawal in each state can be predicted with sufficient accuracy, then total public supply withdrawals can be estimated by multiplying the per capita withdrawal by population served.
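The conversion from a per capita rate to a state total is a single multiplication with a unit change. A minimal sketch follows; the function name and the population figure are hypothetical, and only the 183.7 gpcd mean is taken from the chapter.

```python
def total_withdrawal_mgd(per_capita_gpcd, population_served):
    """Convert gallons per capita per day into million gallons per day."""
    return per_capita_gpcd * population_served / 1_000_000


# Example: the sample mean of 183.7 gpcd applied to a hypothetical
# population served of 5 million people.
example_mgd = total_withdrawal_mgd(183.7, 5_000_000)
print(example_mgd)
```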

One advantage of modeling the per capita withdrawal is that by expressing total withdrawals in per capita terms, the dependent variable is “normalized” across states, and the problems associated with heterogeneity of total withdrawals among the states are avoided. Also, the “out of range” values of per capita withdrawal can be easily spotted in the data and investigated. It should be noted, however, that regression analysis can also be applied to total public supply, not just to per capita public supply withdrawals as described here.

It should be emphasized that the regression models presented here are for illustrative purposes only, as many details about model diagnostics and other aspects of the analysis have been omitted for clarity. Detailed discussions about potential bias in the estimators and alternative estimation techniques are described in Dziegielewski et al. (2002a).

Table 6.1 shows the coefficients of a linear regression (see Equation 6.3) of 1980–1995 state-level data (excluding the District of Columbia) on per capita public supply withdrawals using the ordinary least squares (OLS) procedure. The shorter data series for 1980–1995 was selected to take advantage of improved data collection procedures and to capture the recent trend of declining water use since the 1980 compilation.

The model shown in Table 6.1 explained 52 percent of the variance in per capita usage rates among states and across reporting years. The predictive properties (regression fit) of the model are limited as indicated by both the absolute and relative size of the residuals shown below the table. The mean absolute percentage error (APE) is 12.9 percent, and the root mean squared error is 31.6 gallons per capita per day (gpcd).
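The two fit statistics quoted above can be computed as in the following sketch. The sample values are made up, and the root mean squared error here divides by the number of observations; regression software often divides by the residual degrees of freedom instead, so this is an approximation of the "root MSE" reported with the tables.

```python
import math


def mean_ape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs(p - a) / a for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def rmse(actual, predicted):
    """Root mean squared error (divides by n, not by degrees of freedom)."""
    squared = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical per capita withdrawals (gpcd) and model predictions.
actual = [200.0, 150.0, 100.0]
predicted = [180.0, 165.0, 100.0]

print(mean_ape(actual, predicted))
print(rmse(actual, predicted))
```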

Despite the significant unexplained variance, the regression model in Table 6.1 can be considered a reasonable “explanatory” model, which reveals the structure of demand for public water supply even in the geographically aggregated data. The size and signs of the estimated regression coefficients fall within the ranges of expected values. These coefficients can be interpreted to mean that across the United States, from 1980 to 1995, the mean withdrawal was 183.7 gpcd (from the data). This average withdrawal rate would decrease by 7.8 gpcd if price were increased by $1/1,000 gallons, and it would increase by 1.7 gpcd if the gross state product per capita increased by $1,000. Because a significant portion of public supply withdrawals is used to supply industrial and commercial uses, the gross state product variable captures the effects of the relative volume of nonresidential uses together with the effect of the ability to pay for water, which is typically captured by per capita or median household income variables in models of residential use. The binary indicator variable, which assumes the value of 1 for states with prior appropriation groundwater rights (generally western states), indicates that on average, these states withdrew 29 gpcd more than states with riparian and riparian with permits systems. Also, in states with prior appropriation surface water rights, per capita withdrawals were on average higher by 17.2 gpcd than in riparian law states. The water rights variables most likely are an indirect measure of the arid climate of the states that use the prior appropriation system rather than indicating increased use because of appropriation rights.

TABLE 6.1 Linear Regression Model for State-Level Per-Capita Public Supply Withdrawals, 1980–1995

Dependent/Explanatory Variable	Regression Coefficient	t-Ratio	F-value Probability
Intercept (gpcd)	115.881	3.28	0.0012
Average price of water ($/1,000 gal., real 1995 dollars)	–7.779	–2.63	0.0091
Gross State Product per capita ($1,000, real 1995 dollars)	1.676	3.22	0.0015
Precipitation in summer months (May to Sept., in inches)	–2.119	–4.02	0.0001
Average temperature during summer (Fahrenheit degrees)	0.983	2.15	0.0326
Indicator of states with prior appropriation groundwater rights system	29.136	3.05	0.0027
Indicator of states with prior appropriation surface water rights system	17.218	1.81	0.0716
NOTES: Mean water use = 183.7 gpcd; n = 192; R² = 0.52; mean APE = 12.9%; root MSE = 31.6 gpcd; Nine observations of per capita withdrawal in the original data were adjusted using a data-smoothing procedure.

The effects of individual explanatory variables can also be expressed in terms of the elasticity of water demand with respect to changes in the value of each independent variable. Elasticity measures the percentage change in the dependent variable that would be caused by a 1.0 percent increase in the value of an independent variable. For example, the elasticity of demand with respect to price (estimated at the means) is –0.10. This value is found by multiplying the regression coefficient –7.779 by the ratio of average price to average per capita withdrawal in the data. An elasticity of –0.10 is relatively low (in absolute value), but it is close to expectation for aggregate public supply data. Also, the elasticity of demand with respect to income (as represented by gross state product) is +0.22. These elasticity values indicate that a 1.0 percent increase in price would result in a 0.10 percent decrease in demand while a 1.0 percent increase in per capita gross state product would result in a 0.22 percent increase in demand.
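The elasticity-at-the-means calculation can be sketched as follows. The coefficients and the mean withdrawal come from Table 6.1, but the chapter does not report the sample mean price or mean gross state product; the mean price of $2.36 per 1,000 gallons used here is an assumption back-calculated from the rounded elasticity, and the implied mean for gross state product is likewise only approximate.

```python
def elasticity_at_means(coefficient, mean_x, mean_y):
    """Elasticity of y with respect to x in a linear model, at the means."""
    return coefficient * mean_x / mean_y


def implied_mean_x(elasticity, coefficient, mean_y):
    """Invert the formula to recover the mean of x implied by an elasticity."""
    return elasticity * mean_y / coefficient


MEAN_GPCD = 183.7           # mean per capita withdrawal, Table 6.1 notes
PRICE_COEF = -7.779         # gpcd per $/1,000 gal, Table 6.1
GSP_COEF = 1.676            # gpcd per $1,000 of GSP per capita, Table 6.1
ASSUMED_MEAN_PRICE = 2.36   # hypothetical; not reported in the chapter

price_elasticity = elasticity_at_means(PRICE_COEF, ASSUMED_MEAN_PRICE, MEAN_GPCD)
implied_mean_gsp = implied_mean_x(0.22, GSP_COEF, MEAN_GPCD)

print(round(price_elasticity, 2))
print(implied_mean_gsp)  # in $1,000 per capita; approximate
```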

The estimated regression coefficients for temperature and precipitation in Table 6.1 clearly show the effect of weather on withdrawals and can be used in normalizing water use for weather. In this context, withdrawals during normal weather could be predicted by substituting into the regression equation “normal” values of average air temperature during summer months and total precipitation during the growing season for these independent variables. The regression coefficients of the two weather variables in the model indicate that the average per capita demand in a state decreases by 2.1 gallons per day (gpd) per one-inch increase in precipitation during the growing season (elasticity at the mean is –0.19). The per capita demand increases by approximately 1 gpd per one-degree increase in average summer temperature (elasticity at the mean is +0.37). These elasticity values indicate that per capita public supply withdrawals decrease by 0.19 percent for each one percent increase in precipitation and increase by 0.37 percent for each one percent increase in average temperature.
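One way to apply the weather coefficients is to shift an observed per capita rate by the modeled effect of the departure from normal weather. The sketch below does this with the Table 6.1 coefficients; it is a variant of the substitution described above that keeps the observed value (and hence the model residual) as its starting point. The observed and "normal" weather values are hypothetical.

```python
PRECIP_COEF = -2.119   # gpcd per inch of summer precipitation, Table 6.1
TEMP_COEF = 0.983      # gpcd per degree F of summer temperature, Table 6.1


def weather_normalized_gpcd(observed_gpcd, precip_in, temp_f,
                            normal_precip_in, normal_temp_f):
    """Adjust an observed rate to what the model implies for normal weather."""
    adjustment = (PRECIP_COEF * (normal_precip_in - precip_in)
                  + TEMP_COEF * (normal_temp_f - temp_f))
    return observed_gpcd + adjustment


# Hypothetical dry, warm summer: 12 in. of rain and 74 F against
# assumed normals of 16 in. and 72 F.
normalized = weather_normalized_gpcd(190.0, 12.0, 74.0, 16.0, 72.0)
print(normalized)
```

In this made-up case the observed 190 gpcd is lowered by about 10 gpcd, because both the rainfall deficit and the warmer temperature would have raised withdrawals above the normal-weather level.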

The predictions from the model in Table 6.1 can be improved by supplementing them with information that is contained in model residuals (i.e., differences between actual and predicted values). This can be done by introducing binary variables, which designate individual states. In a model with binary state indicator variables, the average value of residuals for each state is added to the predicted value for that state thus reducing the prediction error. Similarly, if the state residuals contain an increasing or decreasing time trend, such a state-specific trend can also be added to the prediction. However, the addition of separate intercepts and time trends for some states does increase the number of model parameters. If the resulting model is overspecified, the coefficients of the continuous variables, which form the structural component of the model, may be biased. Such bias is small when the inclusion of a state-specific intercept (or trend) does not result in an appreciable change in the value of the estimated coefficients of the structural variables. Still, as with any statistical model, careful evaluation of the model predictions is recommended before accepting the final form of model.
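The idea of adding each state's average residual to its prediction can be sketched as below. The states and values are hypothetical. In a regression with state indicator variables the state effects are estimated jointly with the other coefficients, so this mean-residual shortcut is only an approximation of that procedure.

```python
from collections import defaultdict


def mean_residual_by_state(records):
    """records: iterable of (state, actual_gpcd, predicted_gpcd) tuples."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for state, actual, predicted in records:
        sums[state] += actual - predicted
        counts[state] += 1
    return {state: sums[state] / counts[state] for state in sums}


def adjusted_prediction(state, predicted, adjustors):
    """Add the state's mean residual; states without one are left unchanged."""
    return predicted + adjustors.get(state, 0.0)


# Hypothetical in-sample results for two states over three data years.
records = [
    ("State A", 330.0, 250.0), ("State A", 320.0, 245.0), ("State A", 325.0, 240.0),
    ("State B", 150.0, 210.0), ("State B", 160.0, 215.0), ("State B", 155.0, 220.0),
]
adjustors = mean_residual_by_state(records)

print(adjustors)
print(adjusted_prediction("State A", 250.0, adjustors))
```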

An alternative model was fitted using a stepwise procedure that selected the best explanatory variables from both the continuous variables used in the model shown in Table 6.1 and the binary variables, which designate individual states. In addition, a time trend variable was fitted to the data with trend adjustments for several individual states. The model was estimated using a truncated subset of data for 1980, 1985, and 1990, which excluded the 1995 data. The estimated regression coefficients and other related information for this extended model are shown in Table 6.2.

An estimate of per capita public supply withdrawals for any state and year can be made using the model in Table 6.2. This can be done by substituting the corresponding values of price, per capita gross state product, total summer precipitation, and average temperature and adding four “intercept adjustors”—one for state groundwater law system, one for state surface water law system, one indicator of an individual state (if present in the model), and one state-specific trend (if present)—using the following equation:

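$$PS_{it} = 90.659 - 4.726\,AP_{it} + 2.430\,GP_{it} - 1.299\,R_{it} + 0.777\,T_{it} + 17.386\,LG_{it} + 38.697\,LS_{it} + a_i S_i + b_i Y_i D_i \qquad (6.4)$$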

where

PS_it =	per capita withdrawal (gallons per day) in state i during year t
AP_it =	average price in constant 1995 dollars
GP_it =	gross state product per capita in constant 1995 dollars
R_it =	total summer season precipitation in inches
T_it =	average summer temperature, degrees Fahrenheit

TABLE 6.2 “Extended” Per Capita Model of Public Supply Withdrawals, 1980–1990

Variables	Estimate	Std Error	t Ratio	Prob>|t|
Intercept (gpcd)	90.659	23.195	3.91	0.0002
Average price of water ($/1,000 gal)	–4.726	1.624	–2.91	0.0044
Gross state product per capita ($1,000)	2.430	0.352	6.91	<0.0001
Total precipitation during summer, inches	–1.299	0.365	–3.55	0.0006
Average temperature during summer (deg. F)	0.777	0.270	2.88	0.0048
States w/ prior appropr. groundwater rights	17.386	7.529	2.31	0.0229
States w/ prior appropr. surface water rights	38.697	6.980	5.54	<0.0001
Indicator for Alabama	50.543	11.722	4.31	<0.0001
Indicator for California	–47.292	13.000	–3.64	0.0004
Indicator for Connecticut	–29.507	8.065	–3.66	0.0004
Indicator for Delaware	–25.258	7.956	–3.17	0.002
Indicator for Florida	16.950	8.513	1.99	0.0491
Indicator for Idaho	27.171	9.020	3.01	0.0032
Indicator for Kansas	–60.388	9.506	–6.35	<0.0001
Indicator for Massachusetts	–32.888	7.805	–4.21	<0.0001
Indicator for Michigan	16.514	7.894	2.09	0.0389
Indicator for Montana	36.237	8.938	4.05	<0.0001
Indicator for Nevada	80.910	8.395	9.64	<0.0001
Indicator for New Hampshire	–23.742	7.714	–3.08	0.0027
Indicator for New Jersey	–14.228	7.744	–1.84	0.069
Indicator for North Dakota	–104.913	12.410	–8.45	<0.0001
Indicator for Oklahoma	–56.023	12.707	–4.41	<0.0001
Indicator for Oregon	–26.390	8.667	–3.05	0.0029
Indicator for Pennsylvania	33.247	7.521	4.42	<0.0001
Indicator for Rhode Island	–27.130	7.639	–3.55	0.0006
Indicator for South Dakota	–70.827	9.011	–7.86	<0.0001
Indicator for Utah	64.321	8.721	7.38	<0.0001
Indicator for Virginia	–22.074	7.454	–2.96	0.0038
Indicator for Washington	32.040	12.270	2.61	0.0103
Indicator for Wisconsin	27.198	7.787	3.49	0.0007
Trend adjustor for Alabama	–3.333	1.769	–1.88	0.0622
Trend adjustor for California	3.555	1.765	2.01	0.0466
Trend adjustor for Illinois	2.645	1.201	2.2	0.0299
Trend adjustor for Maryland	4.453	1.144	3.89	0.0002
Trend adjustor for Nebraska	3.668	1.335	2.75	0.0071
Trend adjustor for North Dakota	3.960	1.754	2.26	0.0261
Trend adjustor for Oklahoma	4.724	1.758	2.69	0.0084
Trend adjustor for Texas	–3.853	1.313	–2.93	0.0041
Trend adjustor for Washington	–3.860	1.758	–2.2	0.0303
NOTES: N = 144; R²_adj = 0.93; root MSE = 12.4 gpcd; mean APE = 6.3%.

LG_it =	indicator for state groundwater law system (equals 1 if prior appropriation, 0 otherwise)
LS_it =	indicator for state surface water law system (equals 1 if prior appropriation, 0 otherwise)
a_i =	intercept adjustor for individual states
S_i =	indicator for individual states (equals 1 if the state is included in the model, 0 otherwise)
b_i =	trend coefficient describing changes in withdrawals in gpcd per year for individual states
Y_i =	year since 1980 (equals 5 for 1985, 10 for 1990, and 15 for 1995)
D_i =	indicator for state-specific trend (equals 1 gpd if the state is included in the model, 0 gpd otherwise)
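A prediction from the extended model can be sketched as follows, using coefficients copied from Table 6.2. Only a few of the state intercept and trend adjustors are included to keep the example short, and every input value in the usage lines (price, gross state product, weather, and the water-law flags) is made up for illustration rather than taken from the data.

```python
# Structural coefficients from Table 6.2.
STRUCTURAL = {
    "intercept": 90.659, "price": -4.726, "gsp": 2.430,
    "precip": -1.299, "temp": 0.777,
    "gw_prior": 17.386, "sw_prior": 38.697,
}
# Subset of the state adjustors from Table 6.2 (not the full list).
STATE_INTERCEPT = {"California": -47.292, "Kansas": -60.388, "Nevada": 80.910}
STATE_TREND = {"California": 3.555, "Texas": -3.853}  # gpcd per year


def predict_gpcd(state, year, price, gsp_thousands, precip_in, temp_f,
                 gw_prior, sw_prior):
    """Per capita public supply withdrawal (gpcd) from the extended model."""
    value = (STRUCTURAL["intercept"]
             + STRUCTURAL["price"] * price
             + STRUCTURAL["gsp"] * gsp_thousands
             + STRUCTURAL["precip"] * precip_in
             + STRUCTURAL["temp"] * temp_f
             + STRUCTURAL["gw_prior"] * gw_prior
             + STRUCTURAL["sw_prior"] * sw_prior)
    # State-specific intercept and trend apply only where the model has them.
    value += STATE_INTERCEPT.get(state, 0.0)
    value += STATE_TREND.get(state, 0.0) * (year - 1980)
    return value


# Hypothetical inputs: $2.00/1,000 gal, $25,000 GSP per capita,
# 15 in. of summer precipitation, 80 F, both law flags set to 0.
no_adjustor = predict_gpcd("Ohio", 1985, 2.0, 25.0, 15.0, 80.0, 0, 0)
with_trend = predict_gpcd("Texas", 1990, 2.0, 25.0, 15.0, 80.0, 0, 0)
print(no_adjustor, with_trend)
```

With identical inputs, the two results differ only by the trend adjustor applied over ten years, which shows how the state-specific terms shift an otherwise common structural prediction.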

This model in Table 6.2, which contains significant “intercept effects” for 23 individual states and trend effects for 9 states, explained 93 percent of variance in per capita withdrawals in the 1980–1990 data. The removal of one data year (1995) and the addition of binary variables had some effect on the estimated coefficients of the continuous variables when compared to those presented in Table 6.1. The coefficients of the price and precipitation variables have significantly less negative values when compared to the explanatory model in Table 6.1. The differences in the estimated coefficients indicate that the structural component of the model in Table 6.1 is not robust with respect to changes in the number of observations in the data and the inclusion of the binary variables to designate individual states. However, all six coefficients (including the binary water rights indicator variables) in Table 6.2 have the expected signs and remain statistically significant.

The model statistics shown below Table 6.2 indicate that the mean absolute percentage error (APE) for in-sample predictions is 6.3 percent as compared to 12.9 percent in the explanatory model (Table 6.1). The out-of-sample prediction errors for the 1995 data, which were not used to estimate the model, are shown for individual states in Table 6.3.

The comparison of the predicted and actual values in Table 6.3 indicates that the predictions for the 1995 data year were within ±10 percent for 23 states. In 17 states, the 1995 predictions were between ±10 percent and ±20 percent, and in 8 states, the absolute percentage error was greater than 20 percent. The largest error of 33.5 percent was obtained for California. The mean absolute percentage error for all 48 states in 1995 was 13.4 percent. The mean APE of 13.4 percent would also apply to the estimates of total public supply withdrawals for each of the lower 48 states (in million gallons per day), generated by multiplying the estimated per capita value by population served. If the model predictions for individual states were to be used to prepare an estimate of the total national public supply withdrawals for 1995, then due to the compensating positive and negative prediction errors among individual states, the prediction error in the national total would be +2.2 percent.

TABLE 6.3 “Out-of-Sample” Predictions of Per Capita Public Supply Withdrawals for 1995

	Withdrawals (gpcd)				Withdrawals (gpcd)
State	Actual	Predicted	% Diff.	State	Actual	Predicted	% Diff.
Alabama	237.1	171.4	–27.7	Nebraska	221.4	272.9	23.3
Arizona	206.1	231.5	12.3	Nevada	324.8	339.8	4.6
Arkansas	190.8	171.3	–10.2	New Hampshire	140.0	154.5	10.4
California	184.5	246.4	33.5	New Jersey	149.5	168.8	12.9
Colorado	207.7	238.5	14.8	New Mexico	225.4	239.7	6.3
Connecticut	155.2	164.0	5.7	New York	185.1	188.6	1.9
Delaware	158.6	169.9	7.1	North Carolina	162.1	170.1	4.9
Florida	169.1	172.0	1.7	North Dakota	148.9	176.9	18.7
Georgia	195.5	177.2	–9.3	Ohio	153.1	174.8	14.2
Idaho	242.9	256.7	5.7	Oklahoma	193.8	210.5	8.6
Illinois	175.3	217.7	24.2	Oregon	234.8	213.0	–9.3
Indiana	156.1	169.2	8.4	Pennsylvania	170.8	204.0	19.5
Iowa	173.2	171.1	–1.2	Rhode Island	130.2	147.1	13.0
Kansas	159.1	157.8	–0.8	South Carolina	199.6	158.8	–20.4
Kentucky	147.8	163.5	10.6	South Dakota	146.7	158.6	8.1
Louisiana	165.8	175.8	6.0	Tennessee	175.9	166.4	–5.4
Maine	141.7	160.5	13.3	Texas	187.7	169.0	–9.9
Maryland	200.0	244.7	22.3	Utah	268.9	304.2	13.1
Massachusetts	130.0	160.5	23.5	Vermont	148.3	164.2	10.7
Michigan	188.4	183.8	–2.4	Virginia	158.5	155.0	–2.2
Minnesota	145.2	178.0	22.6	Washington	266.3	216.0	–18.9
Mississippi	151.8	158.1	4.1	West Virginia	133.7	149.2	11.6
Missouri	161.5	167.9	4.0	Wisconsin	168.6	195.4	15.9
Montana	222.1	253.6	14.2	Wyoming	260.6	250.7	–3.8
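The difference between state-level errors and the error in an aggregate can be illustrated with three rows from Table 6.3. The values are read as gallons per capita per day, consistent with the table title. The equal population served of one million per state is an artificial assumption used only to form totals; real populations would weight the states differently.

```python
def pct_diff(actual, predicted):
    """Signed percentage difference of a prediction from the actual value."""
    return 100.0 * (predicted - actual) / actual


# (state, actual gpcd, predicted gpcd) copied from Table 6.3.
rows = [
    ("Alabama", 237.1, 171.4),
    ("California", 184.5, 246.4),
    ("Iowa", 173.2, 171.1),
]
ASSUMED_POPULATION_SERVED = 1_000_000  # artificial equal weight per state

state_errors = {state: pct_diff(a, p) for state, a, p in rows}
mean_ape = sum(abs(e) for e in state_errors.values()) / len(state_errors)

# Totals in MGD under the equal-population assumption.
actual_total = sum(a * ASSUMED_POPULATION_SERVED / 1e6 for _, a, _ in rows)
predicted_total = sum(p * ASSUMED_POPULATION_SERVED / 1e6 for _, _, p in rows)
aggregate_error = pct_diff(actual_total, predicted_total)

print(state_errors)
print(mean_ape, aggregate_error)
```

For these three rows the mean absolute error is about 21 percent, while the error in the combined total is about 1 percent in absolute value, because the large underprediction and overprediction nearly cancel.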

STATE-LEVEL MODELS FOR THERMOELECTRIC WITHDRAWALS

State-level data for public water supply withdrawals are more accurate than data for thermoelectric cooling withdrawals. This is because public supply withdrawals are generally metered while withdrawals for thermoelectric cooling are more likely to be estimated based on pumping times and rated capacities of pumps.

The largest quantity of withdrawals from surface (and groundwater) sources is for thermoelectric power. The variables that can be examined as potential predictors of state-level thermoelectric withdrawals include the following:

Energy generation by fuel type: total thermoelectric generation, percent coal generation, percent petroleum generation, percent natural gas generation, and percent nuclear generation;
Generation by method: percent nuclear steam generation, percent conventional steam, and percent internal combustion;
Installed generation capacity: total generation capacity (megawatt), percent conventional steam, percent nuclear steam, and percent internal combustion;
Availability of cooling towers: total number of cooling towers, rated generation capacity with cooling towers (megawatt), number of cooling towers at coal steam plants, capacity (coal) with cooling towers (megawatt), number of cooling towers at petroleum/gas plants, capacity with cooling towers at petroleum/gas steam plants (megawatt);
Weather conditions: cooling degree-days, heating degree-days, average annual air temperature;
State water law: prior appropriation, riparian, riparian with permits; and
Number of generating units: within coal, petroleum, gas, and nuclear categories.

Total withdrawals for thermoelectric power differ greatly among states, and the reported volumes are not well correlated with the total amount of thermoelectric generation in each state. However, when states with small generation and low water withdrawals (i.e., generally less than 1,000 MGD) are removed from the sample, a significant improvement in this relationship is achieved.

Table 6.4 presents a multivariate model of unit water withdrawals, expressed as gallons per kilowatt-hour, for a group of states with large generation. The estimated regression coefficients indicate that the best explanatory variable for the quantity of withdrawals per kilowatt-hour is percent generation capacity in plants that utilize “closed-loop” systems (i.e., cooling towers) relative to capacity of plants that depend on “once-through” cooling systems. Other predictors include percent utilization of existing capacity, percent thermoelectric generation from coal fuel, average size of generating units, and total heating degree-days. Additional explanation is provided by the “water law” variable, which indicates lower unit water withdrawals in states with prior appropriation surface water law (primarily western states). The model reveals the underlying structure of the thermoelectric demand despite the high level of data aggregation. All model coefficients have the expected signs and are statistically significant. They point to the importance of technological alternatives (i.e., once-through vs. evaporative cooling or combined-cycle generation) as determinants of water withdrawals.

TABLE 6.4 Linear Model of Thermoelectric Withdrawals per Kilowatt-Hour

Variable	Estimated Coefficient	t Ratio	Prob. > |t|
Intercept	49.376	15.53	<0.0001
Percent generation capacity with cooling towers	–0.362	–8.02	<0.0001
Percent utilization of existing capacity	–0.423	–4.99	<0.0001
Percent generation from coal	–0.096	–3.43	0.0009
Average size of generating units	0.174	6.34	<0.0001
Total heating degree-days	0.002	4.11	<0.0001
States with prior appropriation surface water law	–3.962	–2.9	0.0047
NOTES: N = 91; R² = 0.80; root MSE = 6.3 gal./kWh; mean APE = 17.6%.
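
As an illustration of how the coefficients in Table 6.4 combine into a prediction, the following Python sketch evaluates the linear model for one hypothetical state. The variable names, the input values, and the assumed units (percentages on a 0 to 100 scale, average unit size in megawatts) are assumptions made for this example and are not specified in the table. The prior appropriation coefficient is entered as negative, consistent with its negative t ratio and with the lower unit withdrawals described in the text.

```python
# Coefficients transcribed from Table 6.4 (response: gallons per kWh).
# Dictionary keys are illustrative names, not identifiers from the source.
TABLE_6_4 = {
    "intercept": 49.376,
    "pct_capacity_cooling_towers": -0.362,
    "pct_capacity_utilization": -0.423,
    "pct_generation_coal": -0.096,
    "avg_unit_size": 0.174,          # units assumed (megawatts)
    "heating_degree_days": 0.002,
    "prior_appropriation": -3.962,   # 1 if prior appropriation state, else 0
}

def predict_gal_per_kwh(state):
    """Evaluate the linear model: intercept plus coefficient times value."""
    total = TABLE_6_4["intercept"]
    for name, coef in TABLE_6_4.items():
        if name != "intercept":
            total += coef * state[name]
    return total

# Hypothetical inputs for a single state-year (not real data).
example_state = {
    "pct_capacity_cooling_towers": 40.0,
    "pct_capacity_utilization": 50.0,
    "pct_generation_coal": 60.0,
    "avg_unit_size": 100.0,
    "heating_degree_days": 5000.0,
    "prior_appropriation": 0,
}

prediction = predict_gal_per_kwh(example_state)
print(f"predicted withdrawals: {prediction:.3f} gal/kWh")
```

Setting `prior_appropriation` to 1 while holding the other inputs fixed lowers the prediction by 3.962 gallons per kilowatt-hour, which is how the “water law” indicator enters the model.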

Although the regression model in Table 6.4 explains 80 percent of the variance in per kilowatt-hour thermoelectric water withdrawals, the mean absolute percentage error for in-sample predictions remains relatively high at 17.6 percent. As in the public supply sector, the predictions of the thermoelectric withdrawals model could be improved by introducing binary state indicator variables.
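
The fit statistics quoted for Table 6.4 (R², root MSE, and mean absolute percentage error) can be computed from observed and predicted values as in the following sketch. The function name and the sample numbers are hypothetical, and the root MSE here divides by the residual degrees of freedom (observations minus estimated parameters), which is a common convention but is an assumption about how the chapter's value was calculated.

```python
import math

def fit_metrics(observed, predicted, n_params):
    """Return (R squared, root MSE, mean absolute percentage error in %)."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    sse = sum(r * r for r in residuals)                 # error sum of squares
    mean_obs = sum(observed) / n
    sst = sum((o - mean_obs) ** 2 for o in observed)    # total sum of squares
    r_squared = 1.0 - sse / sst
    root_mse = math.sqrt(sse / (n - n_params))
    mean_ape = 100.0 * sum(abs(r) / abs(o)
                           for r, o in zip(residuals, observed)) / n
    return r_squared, root_mse, mean_ape

# Hypothetical observed and predicted unit withdrawals (gal/kWh).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 38.0]

r_squared, root_mse, mean_ape = fit_metrics(observed, predicted, n_params=2)
print(f"R2 = {r_squared:.3f}, root MSE = {root_mse:.2f}, "
      f"mean APE = {mean_ape:.2f}%")
```

Because the percentage error divides each residual by the observed value, a model can explain most of the variance and still show a sizable mean APE when observations with small unit withdrawals are poorly predicted.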

Potential Model Improvements

The first step in improving the predictive properties of regression models of water use would be to enhance the quality of the data used in estimating the model parameters. Indeed, one of the advantages of regression approaches is that they may reveal cause-effect relationships that provide insight into data limitations. That is, because errors in the explanatory variables can be minimized, poor model predictions for individual states or years may suggest data errors in the USGS water use compilations. Thus, this approach may add value to both the assessment of water use and the quality control of the data. The effort expended to improve the data must, of course, be balanced with the effort expended to obtain reliable prediction variables.

Historical and current data on some of these explanatory variables exist, as they are routinely collected and archived by federal, state, and local governmental agencies. For example, the NWUIP currently collects data on population served and irrigated acreage. However, data on other variables, such as retail and wholesale water prices and thermoelectric generation capacity with cooling towers, are not routinely collected. If justified by their explanatory contribution in water use estimation models, such data collection and archiving could be added to NWUIP or state-level programs.

A second step would involve respecification of the predictive models. The relationships between the independent and dependent variables are likely to be different between the states of the humid East and the more arid West. The states, therefore, could be separated into groups based on geography and separate relationships estimated for groups of states, thus allowing the regression coefficients to vary among different regions of the country.
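
A minimal sketch of this respecification, assuming a single explanatory variable and a two-region split, is shown below. The region labels, the data, and the helper names are hypothetical; a real application would use the multivariate models discussed in this chapter rather than a one-variable line.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_by_region(records):
    """Group (region, x, y) records and fit a separate line per region."""
    groups = {}
    for region, x, y in records:
        xs, ys = groups.setdefault(region, ([], []))
        xs.append(x)
        ys.append(y)
    return {region: fit_line(xs, ys) for region, (xs, ys) in groups.items()}

# Hypothetical records: (region, explanatory variable, unit water use).
records = [
    ("east", 1.0, 3.0), ("east", 2.0, 5.0), ("east", 3.0, 7.0), ("east", 4.0, 9.0),
    ("west", 1.0, 9.0), ("west", 2.0, 8.0), ("west", 3.0, 7.0), ("west", 4.0, 6.0),
]

models = fit_by_region(records)
for region, (intercept, slope) in sorted(models.items()):
    print(f"{region}: intercept = {intercept:.2f}, slope = {slope:.2f}")
```

Fitting each group separately lets both the intercept and the slope differ by region, whereas a single pooled model would force one set of coefficients on all states.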

A third step would involve the introduction of additional variables in the multivariate regressions. Such variables, like marginal price of water or water conservation activity, are difficult to measure at the state level although they are known to have a significant influence on water use. For example, the results in Table 6.3 show a significant overprediction of per capita rates in California, a state with aggressive water conservation programs. A variable that could capture the differences in water conservation efforts through time and among the different states could potentially improve these predictions.

Developing multiple regression models of withdrawals at the county level and obtaining the state totals by summing the county-level estimates could also improve the state-level estimates of water withdrawals. However, the county-level data, which were developed by the NWUIP for 1985, 1990, and 1995, contain many apparent errors, and reliable models can be developed only after the accuracy of a number of data points can be verified.

Finally, given the potential for improvements in the data and models through the application of the “science of water use,” the final statistical models for estimating water use may be of different form and structure than the examples developed here. However, the linear models used in this chapter to illustrate the approach do show the promise of the method.

CONCLUSIONS AND RECOMMENDATIONS

The examples presented in this chapter indicate that statistical models are a promising approach for estimating some categories of water withdrawals per unit (i.e., per capita or per kilowatt hour) within an acceptable estimation error. Based on the results presented in this chapter, the following conclusions can be drawn:

A large number of potential explanatory variables for water use exist and can be used in constructing multiple regression models for the major categories of water withdrawals.
Despite the state-level aggregation of the withdrawal data, these regression models reveal the underlying structure of water demand within several major sectors of use, and they reveal the key explanatory variables.
The predictive properties of the models can be improved through appropriately specified models and through the inclusion of both the standard explanatory variables and the indicator variables for individual states or counties to capture their “unique” water use characteristics as well as state-specific trends in usage rates over time.
The coefficients derived from regression models for adjustment of water use according to weather variations may be helpful in adjusting state-level water use estimates developed through statistical sampling or other means for departures from normal weather conditions in the year the estimates were made.
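
The weather adjustment described in the last conclusion can be sketched as follows. The function name and the input values are hypothetical, and the heating degree-day coefficient from Table 6.4 is used only as an example of a weather coefficient; the sketch assumes the adjustment is a simple linear correction for the departure from normal conditions.

```python
def weather_adjusted(estimate, coef_per_unit, actual_weather, normal_weather):
    """Remove the modeled effect of a departure from normal weather.

    estimate       -- unit water use estimated for the year in question
    coef_per_unit  -- regression coefficient for the weather variable
    actual_weather -- weather variable observed in that year
    normal_weather -- long-term normal value of the weather variable
    """
    return estimate - coef_per_unit * (actual_weather - normal_weather)

HDD_COEF = 0.002  # gal/kWh per heating degree-day, from Table 6.4

# Hypothetical year with 500 more heating degree-days than normal.
adjusted = weather_adjusted(
    estimate=30.0,
    coef_per_unit=HDD_COEF,
    actual_weather=5500.0,
    normal_weather=5000.0,
)
print(f"weather-normalized estimate: {adjusted:.2f} gal/kWh")
```

The same correction applies whether the original estimate came from a regression model, a statistical sample, or another method, provided the weather coefficient is appropriate for the sector and region.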

In summary, the data on water withdrawals and use that have accumulated under the NWUIP offer an excellent opportunity for advancing the “science of water use” and for understanding the structure and trends in national water use. The development of statistical models can be helpful in the quality assurance/quality control process for future national compilations and for estimating water use in states or counties with inadequate data on withdrawals. Still, many challenges relating to data quality, inconsistent variable definitions, and statistical methodology need to be addressed, and they represent a fertile area for applied research as part of the NWUIP. As part of its research on estimation methods, the USGS should undertake a systematic investigation of water use models as it has done for estimation of river loads, urban nonpoint pollution discharges, and other hydrologic quantities.