
3—Assessment Methods

Review of Existing Methods

Assessment methods generally are based on two types of mathematical models: (1) a model of the dynamics of the fish population under consideration, coupled with (2) a model of the relationship of observations to actual attributes of the entire fish population. These models are placed in a statistical framework for estimation of abundance and associated parameters and include assumptions about the kinds of errors that occur in each model and an assumption about the objective function* used to choose among alternative parameter values. These errors can be characterized broadly as either process errors or observational errors.

Process errors arise when a deterministic component of a population model inadequately describes population processes. Such errors can be found in the modeled relationships between recruitment (or year-class abundance) and parental spawning biomass and can occur as unpredictable variations in the age-specific fishing mortality rates of an exploited population from year to year. In contrast, observational errors arise in the process of obtaining samples from a fishery or by independent surveys. Decisions must be made regarding the type of probability distribution appropriate for each kind of error (e.g., normal, lognormal, or multinomial) and whether errors are statistically independent, correlated, or auto-correlated.
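The distinction can be made concrete with a small simulation. The sketch below (every parameter value is invented for illustration) generates a biomass trajectory whose production step carries lognormal process error, and then a survey index that layers lognormal observation error on top:

```python
import numpy as np

rng = np.random.default_rng(1)

# A biomass trajectory whose production step carries lognormal
# *process* error, plus a survey index that adds lognormal
# *observation* error on top.  All parameter values are invented.
B = [500.0]
for _ in range(9):
    production = 0.3 * B[-1] * (1.0 - B[-1] / 1000.0)   # logistic surplus production
    process_err = rng.lognormal(0.0, 0.3)               # process error
    B.append(max((B[-1] + production - 40.0) * process_err, 1.0))

q = 0.002                                               # survey catchability
index = [q * b * rng.lognormal(0.0, 0.2) for b in B]    # observation error
```

An assessment model confronted with `index` alone must disentangle the two error sources, which is why the distributional choices discussed above matter.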

The form of the objective function chosen for parameter estimation is based on a likelihood function.† In the models reviewed in this report, the objective is to maximize the total log-likelihood function. The total log-likelihood is generally a weighted sum of log-likelihood functions corresponding to the different types of observations. It is often simplified to an analogous problem of minimization of weighted least squares.
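The weighted-sum structure can be sketched as follows for two observation types under lognormal errors; the data, predictions, weights, and standard deviations below are all hypothetical:

```python
import numpy as np

# Two hypothetical observation sets (e.g., a CPUE series and a survey
# index) with corresponding model predictions; numbers are illustrative.
obs_cpue, pred_cpue = np.array([1.0, 0.8, 0.6]), np.array([0.9, 0.8, 0.7])
obs_survey, pred_survey = np.array([120.0, 95.0]), np.array([110.0, 100.0])

def neg_log_like(obs, pred, sigma):
    """Lognormal negative log-likelihood (additive constants dropped)."""
    r = np.log(obs) - np.log(pred)
    return len(obs) * np.log(sigma) + np.sum(r ** 2) / (2.0 * sigma ** 2)

# Total objective: weighted sum of the component log-likelihoods.
w_cpue, w_survey = 1.0, 2.0   # analyst-chosen emphasis weights
total = (w_cpue * neg_log_like(obs_cpue, pred_cpue, 0.2)
         + w_survey * neg_log_like(obs_survey, pred_survey, 0.1))
```

With the sigmas held fixed, minimizing `total` over the model parameters is the same as minimizing the weighted sum of squared log residuals, which is the least-squares simplification mentioned above.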

This report concentrates on complex population models, although the committee acknowledges that less structured stock assessment approaches may be more appropriate for some fisheries, as described in Chapter 1. Examples include the use of linear regression models for some Pacific salmon species and multispecies trend analyses (Saila, 1993).

 * Objective functions measure the "goodness of fit" between the population model and the observations. † A likelihood function gives the probability of obtaining a particular set of data as a function of a model having parameters that are unknown. By maximizing the likelihood with respect to the parameters, one estimates the parameters that provide the highest probability that the data occurred. Similarly, minimizing a sum of squares finds the parameters that make the data and the model predictions as close as possible in terms of squared deviations.

The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001



General types of population models include surplus production, delay-difference, age-based, and length-based models (see Chapter 5 for specific references and more detailed descriptions of the models used in this study). They rely on rates of change in biomass and productivity that can be calculated based on information about yield from fisheries, recruitment, and natural deaths. Detailed presentations of these models are given in Ricker (1975), Getz and Haight (1989), Hilborn and Walters (1992), and Quinn and Deriso (in press). Models of population change can be written as a differential equation,

dB/dt = P − Y = R + G − D − Y,   (3.1)

or, in words, the rate of change in biomass (dB/dt) equals productivity (P) minus yield (Y). The productivity of a population depends on the recruitment of progeny (R) and the growth (G) and death (D) of existing individuals. Barring time-dependent processes in Equation (3.1), equilibrium biomass and yield result only if some of the rates in Equation (3.1) are regulated by population densities; otherwise, the population can either increase or decrease without limit.

Surplus Production Models

This type of model can be implemented with an instantaneous response (no lags) or a 1-year difference equation approximation. These models have simple productivity parameters embedded and require no age or length data (Schaefer, 1954; Fletcher, 1978; Prager, 1994). Estimation is accomplished by fitting nonlinear model predictions of exploitable biomass to some indices of exploitable population abundance (usually standardized catch per unit effort, CPUE). The primary advantages of surplus production models are that (1) model parameters can be estimated with simple statistics on aggregate abundance and (2) the models provide a simple response between changes in abundance and changes in productivity.
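The fitting procedure for a surplus production model can be sketched in a few lines. The example below uses a discrete-time Schaefer model and a coarse grid search in place of a proper nonlinear least-squares routine; the catch and CPUE series and all parameter ranges are invented for illustration:

```python
import numpy as np

# Hypothetical catch series and CPUE index (illustrative numbers only,
# not from any real fishery).
catch = np.array([50.0, 60.0, 70.0, 80.0, 70.0, 60.0, 50.0, 40.0])
cpue = np.array([1.00, 0.95, 0.88, 0.78, 0.70, 0.68, 0.70, 0.74])

def predict_cpue(r, K, q):
    """Project biomass with a Schaefer surplus production model and
    return the predicted index q * B_t."""
    B = np.empty(len(catch))
    B[0] = K                      # assume the stock starts at carrying capacity
    for t in range(len(catch) - 1):
        B[t + 1] = max(B[t] + r * B[t] * (1.0 - B[t] / K) - catch[t], 1e-6)
    return q * B

def ssq(params):
    """Sum of squared log residuals (lognormal observation error)."""
    r, K, q = params
    resid = np.log(cpue) - np.log(predict_cpue(r, K, q))
    return float(np.sum(resid ** 2))

# Coarse grid search over (r, K, q) -- a stand-in for the nonlinear
# least-squares estimation described in the text.
best = min(
    ((r, K, q)
     for r in np.linspace(0.1, 1.0, 10)
     for K in np.linspace(500.0, 2000.0, 16)
     for q in np.linspace(5e-4, 5e-3, 10)),
    key=ssq,
)
print(f"r={best[0]:.2f} K={best[1]:.0f} q={best[2]:.4g} ssq={ssq(best):.4f}")
```

Only aggregate catch and an abundance index are needed, which is the first advantage listed above; the cost is that the three parameters carry all the biology.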
The primary disadvantages of such models are that (1) they lack biological realism (i.e., they require that fishing have an effect on the population within 1 year) and (2) they cannot make use of age- or size-specific information available from many fisheries. However, in some circumstances, surplus production models may provide better answers than age-structured models (Ludwig and Walters, 1985, 1989).

Delay-Difference or Aggregate-Matrix Models

These models incorporate age structure and provide a method for fitting an age- or size-structured population model to data aggregated by age (Deriso, 1980; Schnute, 1985; Horbowy, 1992). Estimation can be accomplished by fitting nonlinear model predictions of aggregate quantities to CPUE, biomass indices, and/or recruitment indices. Delay-difference models are a special-case solution to a more general aggregate-matrix model made possible by the assumption of a particular age-specific growth model (the von Bertalanffy equation [Ricker, 1975]). These types of models share the advantages of surplus production models; additionally, the functional relationship between productivity and abundance accounts for both yield-per-recruit and recruit-spawner effects. Unlike production models, the parameters of delay-difference or aggregate-matrix models have direct biological interpretations, but they cannot make full use of age- or size-specific information. In addition, these models require the estimation of more initial conditions than production models, unless a simplifying assumption, such as an initial equilibrium condition, is made.

Age-Based or Integrated Models

Age-based models use recursion equations to determine abundance of year classes as a function of several parameters (Fournier and Archibald, 1982; Deriso et al., 1985; Megrey, 1989; Methot, 1989, 1990; Gavaris, 1993). Relationships between spawning stock biomass and recruitment are not required but can be used. Because of the

parameters estimated jointly with abundance. Other sources of information about stock recruitment parameters can be included in the analysis, although this is not generally done. Depensation* usually is assumed not to occur. Many assessment methods require no specification of this parameter.

Using Survey Data in Models

If properly calibrated, fishery-independent trawl surveys can be used to estimate the absolute abundance of a fish population. Numbers at age a in year t (Na,t) can be estimated as

Na,t = Ia,t D / (ρa d),   (3.2)

where ρa is the probability that a fish of age a in the path of the trawl is captured, D is the area of the survey stratum, d is the area swept by the trawl gear, and Ia,t is the survey index of numbers at age. For fixed-gear recovery methods (e.g., longlines, pots, gillnets) it is not possible to estimate the area sampled, and even for trawl surveys it is difficult to estimate the probability of capture (ρa) accurately. In these cases, estimates from the survey data are assumed to measure relative abundance and are combined with other information about the fish population to estimate existing and past population sizes. In the ADAPT model's original form, the survey estimate of population numbers at age is assumed to be related to actual population numbers as

Ia,t = qa (Na,t)^β + εa,t.   (3.3)

The parameter β is not defined in the model document, but this is a commonly used form to express a nonlinear relationship between true abundance and a survey index; qa represents the catchability of fish of age a to the survey gear and is assumed to be constant over time. The random component of the model (εa,t) is assumed to be an independent, symmetrically distributed random variable with constant variance and a mean value of zero. This distributional assumption allows for the use of standard nonlinear least squares to estimate the model parameters (Gavaris, 1993).
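Both relationships can be illustrated numerically. In the sketch below every survey value is invented: the first part scales catch per tow up to the stratum using the area-swept calculation, and the second fits the power relationship between index and abundance on a log scale (to noise-free demonstration data, so the regression recovers q and β exactly):

```python
import numpy as np

# Area-swept estimate of absolute numbers at age; all values hypothetical.
D = 5000.0        # area of the survey stratum (km^2)
d = 0.05          # area swept by one tow (km^2)
rho_a = 0.4       # probability that a fish in the tow path is captured
I_at = 12.0       # mean survey catch in numbers per tow
N_at = I_at * D / (rho_a * d)   # estimated absolute abundance

# When rho_a cannot be estimated, the survey measures only relative
# abundance, I = q * N**beta.  On a log scale this is a straight line,
# so q and beta can be recovered by linear regression.
N = np.array([2.0e6, 1.5e6, 1.0e6, 0.7e6, 0.5e6])  # assumed true abundances
I = 1.2e-5 * N ** 0.9                              # index with q = 1.2e-5, beta = 0.9
beta_hat, log_q_hat = np.polyfit(np.log(N), np.log(I), 1)
print(N_at, beta_hat, np.exp(log_q_hat))
```

In practice the abundances are not known, so q and β must be estimated jointly with the population parameters inside the assessment model rather than by a simple regression like this.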
Recently, Myers and Cadigan (1995) developed a random-effects mixed model within a maximum likelihood framework to estimate the parameters of the ADAPT model. Their approach allows for a correlated within-year error structure for the survey data to deal with "year effects" in survey abundance-at-age estimates (see Smith and Page, 1996). The implicit parameters of Equation (3.3) that must be estimated, whatever the approach, are the fishing mortalities used to estimate Na in the underlying VPA (virtual population, or cohort, analysis) of the ADAPT method from catch-at-age data (Mohn and Cook, 1993).

Absolute abundance is sometimes derived by calculating the area swept by the gear and assuming that all animals in the path of the gear will be captured (i.e., the catchability equals 1). In reality, absolute abundance estimates may be under- or overestimates, depending on whether catchability is greater than 1 due to herding by the trawl gear or less than 1 because of escapement from the path of the trawl.

A different formulation is used in the Stock Synthesis method (Methot, 1990), which more naturally accommodates year effects in the surveys. Instead of fitting the model to age-specific abundance indices, the Stock Synthesis model treats the indices of overall stock abundance separately from the age composition of the survey catches. Thus, year effects affect only year-specific abundance indices and do not introduce correlations among the age-specific observations. The expected value for survey numbers in year t (It) for the Stock Synthesis model (Methot, 1990) is determined as

* Depensation is a reduction in per capita productivity at low stock sizes.

OCR for page 27
It = q Σa sa Na,t,   (3.4)

where q is catchability for fully recruited ages, sa is age-specific availability or selectivity, and Na,t is population abundance in numbers of fish in year t and age a. The survey can measure either relative (q ≠ 1) or absolute (q = 1) abundance. Note the correspondence with the ADAPT formulation in (3.3) by letting qa = qsa and β = 1. In the Stock Synthesis method, variances from the log transform of survey abundance estimates are included, if available, directly in the survey index component of the log-likelihood expression. Therefore, the impact of optimizing the survey design on the resultant estimates from Stock Synthesis can be studied directly. Although there is the possibility of using inverse variance weighting in the nonlinear least squares in ADAPT, this is not usually done. However, bootstrap and Monte Carlo methods are available for linking variation in the survey estimates to variation in the resultant estimates from ADAPT (Restrepo et al., 1992; Smith and Gavaris, 1993b). Ultimately, sampling programs should be evaluated with respect to the precision of the quantities being used to estimate stock status. The assumption of constant catchability and availability with age over time implied by the use of qa in ADAPT and sa in Stock Synthesis can be confounded by changes in both availability and catchability.

Bayesian Approaches

Fishery management involves decision-making in the presence of uncertainty. Fishery stock assessments should provide the quantitative support needed for managers to make regulatory decisions in the context of uncertainty. This support includes an evaluation of the consequences of alternative management actions. However, there is often considerable uncertainty that can be expressed as competing hypotheses about the dynamics and state of a fishery. The consequences of management actions may differ depending on which of these hypotheses is true.
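The point that consequences differ across hypotheses can be made concrete with a small decision table. Every number below is invented: two hypotheses about the stock with prior probabilities, two candidate quota levels, and an assumed long-term yield outcome for each combination:

```python
# Expected consequence of two management actions under two competing
# hypotheses about the stock; every number here is invented.
p_hypothesis = {"productive": 0.6, "depleted": 0.4}   # prior probabilities

# Long-term yield outcome of each (action, hypothesis) pair:
outcome = {
    ("high quota", "productive"): 100.0,
    ("high quota", "depleted"): 10.0,    # severe loss if the stock is depleted
    ("low quota", "productive"): 70.0,
    ("low quota", "depleted"): 50.0,
}

for action in ("high quota", "low quota"):
    expected = sum(p_hypothesis[h] * outcome[(action, h)] for h in p_hypothesis)
    print(action, expected)
```

With these invented numbers the expected yields are close (64.0 versus 62.0), yet the high-quota action is far riskier if the depleted hypothesis is true, which is exactly the kind of trade-off a probabilistic assessment should expose to managers.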
An appropriate means of providing quantitative support to managers in the presence of uncertainty is through the use of Bayesian statistical analysis. This section discusses the problem of building models of complex fishery systems having many parameters that are unknown or only partially known and the use of Bayesian methodology. Numerous papers and books have been published related to the application of Bayesian analyses in fisheries (e.g., Gelman et al., 1995; Punt and Hilborn, 1997). There are three major elements in the Bayesian approach to statistics that should be indicated clearly: (1) a likelihood describing the observed data; (2) quantification of prior beliefs about a parameter in the form of a probability distribution and incorporation of these beliefs into the analysis; and (3) inferences about parameters and other unobserved quantities of interest based exclusively on the probability of those quantities given the observed data and the prior probability distributions.

In a fully Bayesian model, unknown parameters for a system are replaced by known distributions for those parameters observed previously, usually called priors. If there is more than one parameter, each individual distribution, as well as the joint probability distributions, must be described. A distinction must be made between Bayesian models, which assign distributions to the parameters, and Bayesian methods, which provide point estimates and intervals based on the Bayesian model. The properties of the methods can be assessed from the perspective of the Bayesian model or from the frequentist* perspective. Historically, the "true" Bayesian analyst relied heavily on the use of priors. However, the modern Bayesian has evolved a much more pragmatic view. If parameters can be assigned reasonable priors based on scientific knowledge, these are used (Kass and Wasserman, 1996).
Otherwise, "noninformative" or "reference" priors are used.* These priors are, in effect, designed to give resulting methods properties that are nearly identical to those of the standard frequentist methods. Thus, the Bayesian model and methodology can simply be routes that lead to good statistical procedures, generally ones with nearly optimal frequentist properties. In fact, Bayesian methods can work well from a frequentist perspective, as long as the priors are reasonably vague about the true state of nature. In addition to providing point estimates with frequentist optimality properties, the posterior intervals for those parameter estimates are, in large data sets, very close to confidence intervals. Part of the modern Bayesian tool kit involves assessing the sensitivity of the conclusions to the priors chosen, to ensure that the exact form of the priors did not have a significant effect on the analysis.

There are differences of opinion among scientists about whether frequentist or Bayesian statistics should be used for making inferences from fishery and ecological data (Dennis, 1996). The basis for selection of the various prior distributions used in a stock assessment should be documented because the choice of priors can be a source of improper use of statistics. A rationale has to be constructed to indicate which models were considered in the analysis and why some models were not considered further (i.e., were given a prior probability of zero), even though they may be plausible. The use of standard procedures permits independent scientific review bodies to verify the plausibility of hypotheses included in the assessment and assign their own prior probability values to the selected hypotheses.

There are two general classes of Bayesian methods. Both are based on the posterior density, which describes the conditional probabilities of the parameters given the observed data.

* Frequentist statistical theory measures the quality of an estimator based on repeated sampling with a fixed, nonrandom set of parameters. Bayesian statistical theory measures the quality of an estimator based on repeated sampling in which the parameters also vary according to the prior distributions. Most beginning statistics courses focus on frequentist methods such as the t-test and analysis of variance.
This is, in effect, a modified version of the model's prior distribution, where the modification updates the prior based on new information provided by the data. In one form of methodology, this posterior distribution is maximized over all parameters to obtain "maximum a posteriori" (MAP) estimators. It has the same potential problem as maximum likelihood in that it may require maximization of a high-dimensional function that has multiple local maxima. The second class of methods generates point estimators for the parameters by finding their expectations under the posterior density. In this class, the problem of high-dimensional maximization is replaced with the problem of high-dimensional integration.

Over the past 10 years, Bayesian approaches have incorporated improved computational methods. Formerly, the process of averaging over the posterior distribution was carried out by traditional methods of numerical integration, which became dramatically more difficult as the number of different parameters in the model increased. In the modern approach, the necessary mean values are calculated by simulation using a variety of computational devices related more to statistics than to traditional numerical methods. Although this can greatly increase the efficiency of multiparameter calculations, the model priors must be specified with structures that make the simulation approach feasible.

Although they are not dealt with extensively here, a number of classes of models and methods have an intermediate character. For example, there are "empirical Bayes" methods, in which some of the parameters are viewed as arising from a distribution that is not completely known but rather known up to several parameters. There are also "penalized likelihood" methods, in which the likelihood is maximized after addition of a term that avoids undesirable solutions by assigning large penalty values to infeasible parameter values.
The net effect is much like having a prior that assigns greater weight to more reasonable solutions, then maximizing the resulting posterior. Another methodology used to handle many nuisance parameters is the "integrated" likelihood, in which priors are assigned to some of the parameters to integrate them out while the others are treated as unknown. This provides a natural hybrid modeling method that could have fishery applications.

Advantages of Bayesian Models and Methods

A number of features of Bayesian modeling make it particularly useful for fish stock assessments: In a complex model, if a key parameter is treated as totally unknown, the set of parameters of the model

* Such priors are sometimes "improper" in that the specified prior density is not a true density because it does not integrate to 1. A prior distribution is proper if it integrates to 1.
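The two classes of Bayesian point estimates described earlier, MAP and posterior expectation, can be illustrated with a one-parameter grid approximation; every value in this sketch (the catchability prior, the single survey observation, and the error standard deviation) is hypothetical:

```python
import numpy as np

# One unknown parameter: survey catchability q, with a single survey
# observation I = q*N + normal noise.  All values are hypothetical.
N, I_obs, sigma = 1.0e6, 11.0, 2.0
q_grid = np.linspace(1e-6, 3e-5, 2001)
dq = q_grid[1] - q_grid[0]

prior = np.exp(-0.5 * ((q_grid - 1.5e-5) / 5e-6) ** 2)      # prior belief about q
like = np.exp(-0.5 * ((I_obs - q_grid * N) / sigma) ** 2)   # likelihood of the data
post = prior * like
post /= post.sum() * dq                                     # normalize the posterior

q_map = q_grid[np.argmax(post)]          # class 1: maximize the posterior (MAP)
q_mean = (q_grid * post).sum() * dq      # class 2: posterior expectation
print(q_map, q_mean)
```

With one parameter the grid makes both estimates trivial; the point of the discussion above is that with many parameters the maximization and the integration each become hard in their own way, which is what motivates the simulation-based methods.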
