5—
Simulations

Simulation Approach

The major goal of the committee's simulation study was to evaluate the performance of stock assessment methods and subsets of information (fishery, survey, ageing) for simulated fish populations where the true population parameters are known and where assumptions commonly made in stock assessments are violated. This project was similar in principle to a study by the International Council for the Exploration of the Sea (ICES, 1993) Working Group on Fish Stock Assessment Methods, which compared a variety of age-structured methods. However, the violations considered herein are more severe than in the ICES study.

At its meeting on January 16-18, 1996, the committee designed the simulation model (the set of parameters and assumptions) that would generate simulated data sets to be used in the study. The committee used an age-structured model to generate 30 years of commercial catch and survey information as the basis of the simulations.* Complete details describing the procedure used to generate simulated data are given in Appendix E. The 30-year data series was longer than typically available because the committee was more interested in determining assessment failures due to violations of assumptions than in studying failures caused by shortness of the time series, although the latter problem also can be experienced in actual assessments. Simulated catch-age data were produced for ages 1-15, with the age 15 group containing information for all 15+ fish. The population was affected by natural mortality and fishing mortality; fishing mortality was an increasing function of age, as described by an asymptotic selectivity function. Fishing mortality also varied over time, with fishing effort being varied to achieve desired population trends and realistic variation. Recruitment to the population was governed by an asymptotic (Beverton-Holt) spawner-recruit relationship (Chapter 3) with a large, auto-correlated, environmental error component.
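The recruitment and selectivity components described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the committee's spreadsheet model (its actual equations are in Appendix E): the logistic form of the selectivity curve, the AR(1) lognormal form of the error, and every parameter value and function name below are hypothetical.

```python
import math
import random


def asymptotic_selectivity(age, age_at_half, steepness):
    """Logistic curve (assumed form): rises with age and levels off at 1."""
    return 1.0 / (1.0 + math.exp(-steepness * (age - age_at_half)))


def beverton_holt(spawners, alpha, beta):
    """Expected recruitment; approaches alpha / beta as spawners increase."""
    return alpha * spawners / (1.0 + beta * spawners)


def simulate_recruitment(spawner_series, alpha, beta, rho, sigma, seed=0):
    """Beverton-Holt recruitment multiplied by lognormal error whose
    log-deviations follow an AR(1) process (assumed error structure)."""
    rng = random.Random(seed)
    deviation = 0.0
    recruits = []
    for spawners in spawner_series:
        # AR(1): a fraction rho of last year's deviation carries over.
        deviation = rho * deviation + rng.gauss(0.0, sigma)
        recruits.append(beverton_holt(spawners, alpha, beta) * math.exp(deviation))
    return recruits


# Hypothetical inputs: 30 years of constant spawning biomass.
example = simulate_recruitment([1000.0] * 30, alpha=2.0, beta=0.001, rho=0.5, sigma=0.6)
```

With `sigma=0.0` the function returns the deterministic Beverton-Holt curve, which is a convenient check before noise is switched on.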

Certain features were included in the simulation model to test the robustness of stock assessment methods:

  1. Ageing error: Many studies have shown that ageing error is a major problem in fisheries stock assessment (e.g., Summerfelt and Hall, 1987). Mean ageing error in the simulated data sets varied from 0 at age 1 to -1 year at age 15, with increasing variation as age increased. Ageing error was included because it corrupts information contained in the age composition data about year-class progression. To simplify the analyses, ages in the age 15+

*  

The simulated data and instructions will be available at the Ocean Studies Board site on the World Wide Web at http://www2.nas.edu/osb/.



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.
group were not tracked; all fish in that group had the same probability of being misaged. This is not entirely realistic because older fish in this group are probably less likely to be aged outside the 15+ group (for example, at age 14). Nevertheless, the few fish in this group relative to other ages make this a minor concern (the effects of ageing error are discussed later in this chapter).

  2. Fishery catchability changes: Fishery catchability was composed of two factors; one varied as a function of time and the other as a function of abundance. The first factor increased exponentially as time passed to mimic improvements in vessel efficiency due to technological improvements and learning. The second factor was a power function of abundance with an exponent of 0.4 (a little stronger than a square-root relationship). This factor was included to simulate the hyper stability* often observed in fishery catch per unit effort (CPUE). In a stock displaying hyper stability, CPUE tends to decrease more slowly than actual population size, leading to possible stock assessment errors and risk of population collapse because there are fewer fish available for harvest than indicated by CPUE trends (Hilborn and Walters, 1992).

  3. Age selectivity† differences: For three of the five simulated data sets (1, 2, and 3), increased selectivity on younger fish by the fishery occurred in the last 10 years compared with the first 20 years, as shown in Figure 5.1. This feature was included because many assessments assume constant age selectivity and because selectivity changes can mimic changes in length-age relationships. Many actual fisheries appear to have changes in selectivity. For walleye pollock in the Bering Sea, changes in selectivity have resulted from changes in fishing patterns due to learning by fishers and spatial patchiness of fish populations (Quinn and Collie, 1990). When large year classes emerge, harvesters continue to target them as they age.
For Pacific halibut, there is evidence of a substantial reduction in size of fish at a given age over the past 15 years (Clark, 1996). As a result, the selectivity of young age classes has been reduced because (1) the longline gear used is less efficient at catching smaller fish and (2) a larger fraction of the young individuals are below the legal size and must be discarded (Parma and Sullivan, 1996).

The age of 50% maturity was much higher than the age of 50% selectivity to the fishery (Figure 5.1). The model created a population that could be quite susceptible to overexploitation because fish reproduced at a greater age than the age at which they are recruited to the fishery. Such a situation is exemplified by cod, haddock, and flounder in U.S. waters, which start to be recruited at age 1 and mature at age 2+, although the difference is more pronounced in the simulated populations.

The survey gear had a dome-shaped selectivity function as shown in Figure 5.1 ("survey selectivity"). This choice was made because dome-shaped selectivity and natural mortality are often confounded in stock assessment applications. In addition, one data set (3) had doubled survey catchability for the last 15 years. This feature mimicked a change in survey vessel; analysts were told that a change of vessel occurred after 15 years.

Most stock assessment models assume constant natural mortality. In the committee's simulations, natural mortality was constant for fish of all ages during a given year but varied from year to year; this is probably true in actual populations due to variations in predation by other species and the changing incidence of disease with age. Natural mortality was modeled as a uniform random variable between 0.18 and 0.27 (with a mean value of 0.225).

In some fisheries, catch statistics are inaccurate, which most likely involves underreporting (see Chapter 2). One data set (2) included underreporting of catch by 30%.
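The last two features above (natural mortality that varies by year and underreported catch) can be sketched as follows. This is a minimal illustration, not the committee's implementation: the function names and the seed are hypothetical, and treating "underreporting by 30%" as reported catch equal to 70% of true catch is an assumed reading of the text.

```python
import random

M_LOW, M_HIGH = 0.18, 0.27  # bounds on natural mortality stated in the text


def draw_natural_mortality(n_years, seed=0):
    """One M per year, shared by all ages in that year, drawn uniformly."""
    rng = random.Random(seed)
    return [rng.uniform(M_LOW, M_HIGH) for _ in range(n_years)]


def reported_catch(true_catch, underreporting=0.30):
    """Assumed reading: the reported catch omits a fixed fraction of the true catch."""
    return [c * (1.0 - underreporting) for c in true_catch]


# The midpoint of the uniform range is the mean value quoted in the text.
mean_m = (M_LOW + M_HIGH) / 2.0
m_by_year = draw_natural_mortality(30)
```

An assessment that fixes M at a single value is therefore misspecified in every simulated year, even when that value equals the long-run mean.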
Various process and measurement errors in the population's dynamics and the data were included in the model for realism, including random variation in recruitment, fishery catchability, survey catchability, fishery selectivity, fishing effort, ageing error, and sampling for ages.

Five data sets were generated with the age-structured model. The simulation model was constructed in an Excel spreadsheet. The simulation procedure can be visualized as

*  "Hyper stability" is explained further by Hilborn and Walters (1992). Its counterpart is "hyper depletion," in which CPUE decreases faster than the actual population. Hyper stability seems to be more common than hyper depletion in actual populations.

†  Age selectivity measures the vulnerability of different-age fish to the fishing gear relative to a reference age.

FIGURE 5.1 Maturity, fleet selectivity, and survey selectivity for data set 1.

The five simulated data sets differed in population trend, temporal changes in fishery selectivity, underreporting of catch, and changes in survey catchability (Table 5.1). The population trend simulated either a pristine stock being fished to a low level (data sets 1-4) or a depleted stock under recovery (data set 5), two common scenarios for fisheries. Fishery selectivity either was constant over time or declined stochastically from age 5 to age 3 in the last 15 years (see Appendix E). Although this design does not include all possible combinations of the four factors, the committee believed that the analysts could not have devoted the time to additional analyses. Quantification of these factors is described in Appendix E.

By comparing results from these data sets, the effects of particular factors can be understood. Data set 4 can be considered the easiest case; it has no changes in age-specific selectivity of the fishery or survey catchability over time and no underreporting. However, it did include changes in fishery catchability over time and as a function of biomass. Data set 3 can be considered the most difficult case because it includes changes in age selectivity as well as survey and fishery catchability. A comparison of results from data sets 1 and 2 shows the effect of underreporting. A comparison of results from data sets 1 and 3 shows the effect of the change in survey catchability. A comparison of results from data sets 1 and 4 shows the effect of the decrease in fishery selectivity. A comparison of results from data set 4 with data set 5 shows the effect of a decreasing population versus a recovering population.

The true exploitable biomass and the fishery and survey indices of exploitable biomass over time are shown for each data set in Figure 5.2. Each series has been scaled by its mean to show relative patterns over time.
Except for data set 3, the survey index has the same pattern as biomass; for data set 3, the doubling of catchability causes the survey index to underestimate relative biomass at the beginning and to overestimate relative biomass at the end of the period. For each data set, the fishery index does not have the same trend as exploitable biomass because of the increasing catchability of the fishery over time, the decreased age selectivity in some data sets, and the dependence of catchability on biomass. The survey index is more variable than the fishery index; the survey relative error was 30% versus 20% for the fishery.

TABLE 5.1 Characteristics of Simulated Data Sets

Data Set  Population Trend  Age at 50% Selectivity  Underreporting  Survey Catchability
1         Depletion         Lower later             None            Constant
2         Depletion         Lower later             30%             Constant
3         Depletion         Lower later             None            Higher later
4         Depletion         Constant                None            Constant
5         Recovery          Constant                None            Constant

The committee sought assistance from National Marine Fisheries Service (NMFS) analysts who regularly use the major types of stock assessment methods for real assessments. The models tested (listed in order of complexity) included a production model; two versions of a delay-difference model; and age-structured analyses using ADAPT, a spreadsheet, Stock Synthesis, and Autodifferentiation Model Builder (ADMB, a commercial package).* The spreadsheet implementation contained features similar to the Stock Synthesis program, but was a simpler implementation of the generic age-structured assessment (ASA) model. Further details about the implementation of these methods are given later in this chapter. The committee also utilized the services of a non-NMFS expert who performed additional ADAPT analyses.

Data sets were sent to analysts in mid-March 1996 (see Appendix F for transmittal letter). Although the purpose of this exercise was to compare methods, the implementation of each method was affected by complex interactions among the individuals involved in the analyses, subjective and objective modeling decisions, the base model used, and the computer implementation of the model.

In addition to the five sets of catch, age composition, CPUE, and survey data from years 1 to 30, analysts were given growth and maturity parameters (see Appendix E), the ageing error probability matrix (Richards and Schnute, 1992), and information about the structure of the population and the data. Analysts were not provided with information about natural mortality, catchability, selectivity, the recruitment process, or the amount of underreporting (although they were warned that underreporting might have occurred).

Analysts were requested to perform the analyses with three combinations of abundance indices:

  1. CPUE data only (coded "F")
  2. Survey data only ("S")
  3. Both CPUE and survey data ("B")

Analysts were asked to perform these analyses independently, that is, not to use results of one analysis to initiate others or to work with analysts using other methods.
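An ageing error probability matrix of the kind supplied to the analysts can be applied to a true age composition as in the sketch below. The 3-by-3 matrix and the proportions are toy values invented for illustration (the committee's matrix covered ages 1 to 15+ and is not reproduced here), and the row/column convention is an assumption.

```python
def apply_ageing_error(true_props, misage):
    """Expected observed age composition.

    Assumed convention: misage[i][j] is the probability that a fish of
    true age class i is read as age class j, so each row sums to 1.
    """
    n = len(true_props)
    observed = [0.0] * n
    for i, p in enumerate(true_props):
        for j in range(n):
            observed[j] += p * misage[i][j]
    return observed


# Toy example: three age classes, with older fish more likely to be misread.
toy_matrix = [
    [1.0, 0.0, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
]
toy_observed = apply_ageing_error([0.5, 0.3, 0.2], toy_matrix)
```

Because each row sums to 1, the observed proportions still sum to 1; what is lost is sharpness, since a strong year class is smeared across neighbouring ages.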
As mentioned earlier, the survey index has a greater relative error associated with it than does the fishery index. In other aspects, the survey index is less variable because surveys either are intentionally designed to be unchanging over time (e.g., same gear, sampling design, sampling methods, and sampling areas) or are changed only with associated calibration experiments to ensure that survey data are comparable over time. Thus, catchability and selectivity for the survey data were assumed to be constant (except for catchability in data set 3). Randomized sampling designs used in surveys reduce the hyper stability effect. Conversely, commercial fishers learn and change gear, fishing areas, and fishing methods to maintain or increase their harvests. The data sets provided to analysts are representative of differences known to exist between survey and fisheries data. The committee deliberately constructed the simulation according to conventional wisdom that a survey should provide a better index of abundance than data from a fishery. Nevertheless, the committee could have just as easily interchanged the fishery and the survey to make the survey the bad source of information. Hence, the simulation should be interpreted as having contained two indices of abundance, one of which was usually a good measure of abundance, and the other not.

*  Including ADMB in this analysis does not imply its endorsement by the committee. The model implemented using ADMB, however, possesses special characteristics not included in models commonly used by NMFS analysts. These characteristics include the ability to add a large number of process errors for recruitment, selectivity, catchability, and natural mortality. A Bayesian framework allows results to be synthesized in terms of posterior probability distributions for selected population parameters.

Analysts' results were received near the beginning of May 1996 and summarized prior to the committee's May meeting. Analysts cautioned the committee that the time they had available for the analyses was limited compared to a normal stock assessment. They noted that other information they would normally have available for doing a stock assessment (species characteristics, reports from harvesters and biologists, and other detailed information) was not given to them in this exercise. Such factors may have compromised the ability of analysts to obtain the absolutely best estimate, and the results presented herein do not necessarily reflect real-world conditions,* but all analysts operated under the same constraints, so this exercise constitutes a reasonable comparison of methods. One other caveat is that each data set represented the results of one replication of a stochastic process. The possibility that any given data set was extreme cannot be eliminated. Other caveats from the analysts will be discussed more completely in a planned National Oceanic and Atmospheric Administration (NOAA) Technical Memo that presents the analysts' reports.

*  The simulation was not meant to replicate all the aspects of a real-world assessment, in which more time, human resources, and data about species would be available for an assessment and analysts would have greater freedom to reject bad data sets, conduct open discussion, and build models iteratively.

FIGURE 5.2 True exploitable biomass, survey biomass index, and fishery CPUE for the five simulated data sets. Data are plotted as quantity divided by the mean of the quantity over the 30 simulated years.

The committee met with analysts on May 14, 1996. Each analyst presented the model results, as well as an indication of problems or insights gained in the analysis. The true values of biomass, recruitment, and exploitation were compared with analysts' estimates. The committee and analysts discussed what further work should be undertaken.

Three types of additional analyses were performed after the May meeting. First, because analysts estimated and used different values of natural mortality in the initial set of model runs (ranging from 0.15 to 0.25), the committee requested that they repeat the analyses with a common value for natural mortality equal to the true average natural mortality of 0.225. The committee asked four analysts to perform these in-depth analyses for the (1) ADAPT, (2) ASA spreadsheet, (3) Stock Synthesis, and (4) ADMB age-structured models. For each of the simulated scenarios, analysts were asked to use the three combinations of fishery and survey data, as in the previous analyses.

Second, a standard set of definitions of key management variables was agreed to and it was decided to calculate TAC (total allowable catch) in year 31 using a rate based on F40% (as described in Chapter 4). Some analysts tried additional methods to improve the assessment; these results are reported later in this chapter.

Most existing assessment methods provide some estimate of precision of the parameter estimates. However, these are based on the structure of the assessment model being correct.
Thus, unless the model structure is flexible enough to allow for major sources of uncertainty about the processes and data to be incorporated, estimates of precision tend to underestimate the true uncertainty in the assessment. It would have taken a considerable amount of additional work by both the analysts and the committee to evaluate estimates of uncertainty, so this was not done.

Finally, the committee decided to undertake retrospective analyses (see Chapter 3 for more detail about this subject) to determine the persistence of over- or underestimation of population parameters over time by the different methods. Although initial results indicated that the different methods were often able to recognize that the stock was severely depleted (or substantially recovered) at the end of the simulated time period, earlier recognition would be necessary for management to react in a timely fashion. Thus, retrospective analysis is important to determine how long it would take for the assessments to recognize underlying stock trends. The analysts were to use their methods on 16 subsets of the total data set: years 1-15, 1-16, …, 1-30. The committee collected the results of these analyses in August 1996, summarized them, and conducted additional statistical analyses of the results to test whether trends of estimates and trends of the true values were parallel (even if biased).

The committee met in August 1996 to review results obtained to that point, to formulate additional analyses of the simulation results, and to develop preliminary findings and recommendations.

Simulation Results

This section describes the approaches taken for each specific model and presents the simulation results. The simulation results using mortality rates selected by the analysts are contrasted with results using the true M (natural mortality rate). Methods are compared on the basis of summary statistics related to important management parameters.
Finally, the committee's additional analyses of model performance and retrospective analyses are presented.

Results Using Estimated M

Production Models

The production model used is described in Prager (1994). The production model estimates productivity parameters from total harvest and indices of biomass (CPUE, survey abundance over ages). No age-structured information was used. It was difficult for this model to produce reasonable estimates for many data set-data source combinations. In particular, no reasonable estimates could be obtained from data set 3, the most difficult one, because neither CPUE nor survey data were consistent indices of abundance over time.

Estimates by this method are shown in Table 5.2, along with the analyst's confidence in the results. Estimates were provided for MSY (maximum sustainable yield)*, FMSY, EMSY, BMSY, B30, and fishing mortality and biomass in year 30 relative to the MSY level (Frel and Brel, respectively).† The committee calculated relative fishing mortality from the fishing effort information and relative biomass from exploitable biomass values provided by the analyst. The analyst's confidence in the results was generally low, with some moderate confidence in results from data set 4 (the easiest). When both fishery and survey data were used, estimates of MSY were 4 to 20% below the true value. However, estimates of absolute and relative fishing mortality and biomass generally were not close to the true values. Curiously, the use of fishery CPUE alone sometimes produced estimates close to the true values, even though the simulated CPUE was a biased measure of biomass. Nevertheless, the estimates often differed substantially from the true values, suggesting a lack of robustness of production models for data such as the simulated data used in this study.

The analyst cogently summarized the limitations of production models as follows:

"The simulated data sets do not seem well suited to simple production modeling, and confidence in the quantitative validity of most of the results obtained is low. Noisy data, poorly correlated CPUE and survey indices, and relatively constant effort levels all probably contribute to this situation.
Without knowing the underlying population model and conducting simulations, it is impossible to say to what degree age-structure effects also contribute. The apparently high fishing mortality rates and the extensive age-structured data available suggest that these fisheries are better suited to analysis by cohort-based methods. This suggestion is strengthened by several sets of simulated data in which apparently constant fishing effort leads to a population increase and then a decrease; such a scenario is incompatible with the assumptions underlying simple production models. This suggests either environmental forcing of recruitment, nonconstant catchability, or both."

Delay-Difference Models

Two analysts fitted the Schnute version of Deriso's delay-difference model (Deriso, 1980; Schnute, 1985), using total harvest data, the two indices of abundance, and a recruitment index (age 5 fishery or survey data for the first delay-difference method [DD] and age 3 [data sets 1-4] or age 4 [data set 5] fishery or survey data for the second delay-difference method [DDKF]). For both models, recruitment was assumed to be knife-edge (i.e., all fish are vulnerable to the fishery at the same age). The population parameters estimated from this model therefore may not be strictly comparable to the true parameters, which were calculated from the age-dependent selectivity function.

The DD method was a measurement error model. Results from this model included estimates of biomass, recruitment, and fishing mortality. Natural mortality values were the same as those used in the first Stock Synthesis method described below (0.251, 0.251, 0.169, 0.201, and 0.191 for data sets 1 to 5, respectively). For many of the DD results (Figure 5.3), deviations in exploitable biomass from the true value showed little trend, but there were substantial deviations in absolute values over the entire time period.
Not surprisingly, the use of only fishery data (left panel of Figure 5.3) produced the most extreme deviations because the fishery data had a time trend in catchability that was not incorporated in the delay-difference models. Use of fishery data tended to lead to overestimates of biomass and failure to estimate the correct trend in exploitation fraction. Better results were obtained by using survey data alone or by using both data sources, but substantial discrepancies still remained. The deviations for data set 4 were particularly surprising, because this data set was relatively well structured. Nevertheless, the delay-difference model, by utilizing some information about age structure, showed improved estimates compared with production models. However, poor indices of abundance and the failure to use more age-structured data led to estimates that were more variable than those from the age-structured models discussed below. Models were rerun after the true average M value was provided, and those results are discussed in the following section.

*  MSY is given in metric tons (tonnes, t).

†  Throughout this chapter, F = fishing mortality rate, E = effort, and B = biomass.

TABLE 5.2 Comparison of Results of Single Runs of Production Models with True Values (a)

Data Set    1        1        1        1        2        2        2        3        3
Index       F(b)     S        B1       True     F        B1       True     All      True
MSY         580      90       300      312      995      218      240               315
Frel        1.6      13       2.2      1.4      1.9      2.7      1.8               2.0
Brel        0.18     0.11     0.23     0.14     0.07     0.21     0.19              0.36
Confidence  L(c)     L        L to N            L to N   L                 N
FMSY        0.21     0.03     0.31     0.196    0.21     0.25     0.158             0.151
EMSY        983      na(d)    1236     1223     917      1169     949               1096
BMSY        2744     3183     954      1924     4639     1162     1820              2480
E30                                    1703                       1694              2139
B30         480      430      236      276      324      322      346               903

Data Set    4        4        4        4        4        5        5
Index       F        S        B1       B2       True     F        True
MSY         2470     300      430      480      513      635      564
Frel        1.6      3.4      1.7      1.5      1.4      0.51     0.6
Brel        0.03     0.11     0.17     0.18     0.05     1.4      2.09
Confidence  VL       L to M   L to M   L to M            L
FMSY        0.20     0.12     0.28     0.28     0.252    0.13     0.28
EMSY        128      na       180      169      1827     2123     2025
BMSY        12600    2540     1546     1701     2477     4738     2473
E30                                             2552              1139
B30         388      309      264      290      115      6950     5158

(a) The analyst examined all data sets and sources of data; those not given in this table were assigned no confidence by the analyst and productivity results were not meaningful. The analyst would not normally report absolute estimates FMSY and BMSY, believing that relative values are more accurate and more useful for management. They are included in this table for scientific interest. The major characteristics of the five data sets are given in Table 5.1. True values are those calculated by the committee from known parameter values.
(b) Data source: F = fishery; S = survey; B1 = both, using standard methods; B2 = both, using alternative techniques such as iterative reweighting or a combined survey-fishery index.
(c) Confidence: M = moderate, L = low, VL = very low, N = none.
(d) Estimates of EMSY are not available from production models using survey indices only. However, the estimate of Frel (identical to Erel) serves the same purpose and is often more useful in practice.

NOTE: FMSY, EMSY, and BMSY are fishing mortality, effort, and biomass, respectively, at MSY. E30 and B30 are effort and biomass in year 30. Frel and Brel are fishing mortality and biomass in year 30 relative to FMSY and BMSY. E30 is a known value used to calculate Frel.

ADAPT

The ADAPT approach is described in Gavaris (1988), Conser and Powers (1989), Restrepo and Powers (1991), and Conser (1993). It is essentially a cohort analysis on catch-age data where indices of abundance are included to estimate a relatively small number of parameters. The major assumption is that there are no errors in the catch-age data.
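To make "cohort analysis on catch-age data" concrete, the sketch below back-calculates numbers at age along one cohort using Pope's approximation, a standard textbook form of cohort analysis. It is not the ADAPT procedure itself (ADAPT additionally tunes the terminal abundance to the abundance indices), and the catches, terminal abundance, and M values are hypothetical.

```python
import math


def pope_cohort(catches, terminal_n, m):
    """Numbers at the start of each age for one cohort.

    Pope's approximation treats the catch as taken at mid-year:
        N[a] = N[a + 1] * exp(m) + C[a] * exp(m / 2)
    The catch at age is taken as exact, which mirrors the assumption
    of no errors in the catch-age data.
    """
    n = [0.0] * (len(catches) + 1)
    n[-1] = terminal_n
    for a in range(len(catches) - 1, -1, -1):
        n[a] = n[a + 1] * math.exp(m) + catches[a] * math.exp(m / 2.0)
    return n


# Hypothetical cohort: three years of catch and 400 survivors at the end.
catches = [100.0, 80.0, 60.0]
n_low_m = pope_cohort(catches, terminal_n=400.0, m=0.15)
n_high_m = pope_cohort(catches, terminal_n=400.0, m=0.225)
```

For the same catches and terminal abundance, the run with the smaller M reconstructs fewer fish at every earlier age, because fewer natural deaths have to be added back.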

FIGURE 5.3 Percent relative deviations of estimated exploitable biomass from true exploitable biomass for model DD. Deviations greater than 400% are not shown in this figure or in Figures 5.4-5.12. The major characteristics of the five data sets are given in Table 5.1.
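The quantity plotted in Figure 5.3 can be computed as in the sketch below. It assumes that "percent relative deviation" means 100 × (estimate − true) / true, which is the usual definition but is not spelled out in the caption; the function names are hypothetical.

```python
def percent_relative_deviation(estimated, true):
    """100 * (estimate - true) / true for each year (assumed definition)."""
    return [100.0 * (e - t) / t for e, t in zip(estimated, true)]


def mask_for_plot(deviations, limit=400.0):
    """Replace deviations above the plotting limit with None,
    mirroring the caption's note that values over 400% are not shown."""
    return [d if d <= limit else None for d in deviations]


# Hypothetical two-year example: one overestimate and one underestimate.
deviations = percent_relative_deviation([150.0, 50.0], [100.0, 100.0])
```

A positive value means the assessment overestimated exploitable biomass in that year; a negative value means it underestimated it.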

The team of analysts who met for two days to perform the ADAPT analyses found that the ageing error matrix provided by the committee was more suitable to forward types of analyses such as ASA, SS, and ADMB and decided that they could not use such information. An M value of 0.15 was used in the ADAPT analyses, based on heuristic examination of the data and an attempt to estimate M from the data after making some assumptions about selectivity and catch variability. Generally, the procedure used was (1) to examine the data and pool it beyond some age into a plus group, (2) to obtain selectivity estimates using a separable virtual population analysis, (3) to fix the ratio of fishing mortality for the plus group to the next-youngest age, and (4) to run a nonlinear least-squares procedure to estimate population parameters. Estimates of total and exploitable biomass, recruitment, spawning stock biomass, total biomass averaged over the year (Bave), and average exploitation fraction (Y/Bave, where Y = yield) were calculated. Age-specific estimates of catchability were examined and trends were discovered in some data sets. Therefore, except for data set 5, fishery CPUE data were not used, despite the committee's request for these sets of analyses, because the ADAPT team did not believe the information was useful. For data set 5, this group attempted a fit of the combined survey and fishery CPUE.

The relative trend in the estimates obtained was similar to the other age-structured methods (Figure 5.4). However, estimates of abundance were negatively biased, because the choice of natural mortality of 0.15 was too low compared to the true average M of 0.225. These results confirm the general conclusion that underestimating natural mortality leads to underestimation of abundance. A consequence of underestimating abundance is a tendency to overestimate exploitation rate.
The team did not compute TAC because the procedure it would have used would have taken more time than was available.

Separable ASA Models

A family of age-structured assessment models is based on a statistical formulation of age-structured information and the assumption that fishing mortality is separable into age-selectivity multiplied by a full-recruitment fishing mortality (Doubleday, 1976; Fournier and Archibald, 1982; Deriso et al., 1985; Methot, 1989, 1990). A generic age-structured assessment model with these characteristics was formulated in a spreadsheet (ASA). It included a modification of the method assuming that all catch was taken halfway through the year and demonstrated that this approximation was accurate. The likelihood function consisted of a multinomial component that incorporated ageing error and a residual sum of squares term for the logarithm of each abundance index (number per boat-day from the fishery or survey catch in numbers, either pooled or by age).

For data set 3, ASA assumed different catchabilities for years 1-15 and years 16-30 because of the change in survey vessel. An asymptotic curve for survey selectivity was used, but the analyst who ran ASA noted that he would have considered using a dome-shaped selectivity curve (the true situation) if he had had other information to justify that choice. This spreadsheet model had characteristics similar to those of the Stock Synthesis model described next, but was simpler. Estimates of natural mortality came from the Alverson-Carney procedure (based on the longevity and growth of a species) and were 0.251, 0.251, 0.169, 0.201, and 0.191 for data sets 1 to 5, respectively. Trends from the survey and the fishery did not match, and the analyst suspected that the fishery data had trends in catchability. This analyst did a great deal of preliminary data exploration, which helped him discover structural features in the data sets that improved his analyses.
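The separability assumption and the mid-year catch approximation mentioned above can be sketched in a few lines. This is an illustration only, not the analyst's spreadsheet: the selectivity values, numbers at age, M, and full-recruitment F below are hypothetical, and the mid-year form used here (catch removed in a pulse after half a year of natural mortality) is one common way to write that approximation.

```python
import math

def separable_f(selectivity, full_f):
    """Fishing mortality at age = age selectivity * full-recruitment F."""
    return [s * full_f for s in selectivity]

def baranov_catch(n, f, m):
    """Catch when fishing and natural mortality act continuously all year."""
    z = f + m
    return n * (f / z) * (1.0 - math.exp(-z))

def midyear_catch(n, f, m):
    """Catch when all fishing is assumed to occur halfway through the year."""
    return n * math.exp(-m / 2.0) * (1.0 - math.exp(-f))

# Hypothetical inputs, chosen only for illustration.
selectivity = [0.1, 0.4, 0.8, 1.0, 1.0]
numbers = [1000.0, 800.0, 600.0, 400.0, 250.0]
natural_m = 0.2
full_f = 0.3

f_at_age = separable_f(selectivity, full_f)
exact = [baranov_catch(n, f, natural_m) for n, f in zip(numbers, f_at_age)]
approx = [midyear_catch(n, f, natural_m) for n, f in zip(numbers, f_at_age)]

# Relative difference between the mid-year approximation and the Baranov catch.
rel_diff = [(a - e) / e for a, e in zip(approx, exact)]
print([round(d, 4) for d in rel_diff])
```

For mortality rates of this size the two catch calculations differ by well under 1%, which is consistent with the report's remark that the approximation was accurate; the gap widens as F and M increase.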
Deviations of estimated exploitable biomass from the true value were generally smaller when using only survey data or both indices than when using only fishery data (Figure 5.5). The ASA method usually resulted in less trend in the deviations over time than the methods previously described.

The Stock Synthesis method was implemented in a computer program as described in Methot (1989, 1990) and denoted SS-P in this report. A natural mortality value of 0.2 was used. Some models with constant recruitment were fitted to the data to mimic a production model. No results were presented from these analyses because recruitment was obviously not constant. The basic configuration of the model accounted for ageing error, used logistic curves for fleet and survey selectivity, and calculated fishing mortality values that equated the observed and predicted catch biomass values each year. The analyst calculated effective sample sizes from the mean squared error and showed that these were similar to actual sample sizes, which suggested that the SS-P

FIGURE 5.4 Percent relative deviations of estimated exploitable biomass from true exploitable biomass for the ADAPT model.

FIGURE 5.19 Comparison of estimated and true biomass (thousand metric tons) in terminal years for 16 retrospective assessments (terminal years 15-30 assessments) conducted on data set 3 using Stock Synthesis (SS-P3, SS-P6, and SS-P7), AD Model Builder, and NRC ADAPT; M was fixed at the true average value in all cases.

FIGURE 5.20 Comparison of estimated and true biomass (thousand metric tons) in terminal years for 16 retrospective assessments (terminal years 15-30 assessments) conducted on data set 4 using Stock Synthesis (SS-P3, SS-P6, and SS-P7), AD Model Builder, and NRC ADAPT; M was fixed at the true average value in all cases.

FIGURE 5.21 Comparison of estimated and true biomass (thousand metric tons) in terminal years for 16 retrospective assessments (terminal years 15-30 assessments) conducted on data set 5 using Stock Synthesis (SS-P3, SS-P6, and SS-P7), AD Model Builder, and NRC ADAPT; M was fixed at the true average value in all cases.

TABLE 5.13 Average of Relative Deviations Between Estimate and True Exploitable Biomass in Terminal Year (for terminal years 15-30)

          SS-P6                SS-P7  ADMB4                NRC ADAPT
Data Set  F      B      S      F      F      B      S      F      B      S
1         106    13     -3     73     27     -28    -31    207    28     13
2         40     -23    -23    11     52     -33    -35    251    13     2
3         51     51     21     52     63     18     14     72     60     57
4         25     19     15     31     52     -1     8      413    102    58
5         -9     -3     -9     -3     -53    -55    -56    -51    -43    -43
Average   43     11     0      33     28     -20    -20    178    32     17

NOTE: Boldface values indicate average deviations not exceeding 25%.

TABLE 5.14 Serial Correlation Between Errors in the Estimates of Exploitable Biomass(a) in the Terminal Year

          SS-P6                SS-P7  ADMB4                NRC ADAPT
Data Set  F      B      S      F      F      B      S      F      B      S
1         0.86   0.71   0.56   0.89   0.64   0.54   0.47   0.46   0.38   0.41
2         0.26   0.21   0.46   0.02   0.32   0.51   0.56   --     0.32   0.27
3         0.66   0.04   0.42   -0.12  0.72   0.42   0.30   0.71   0.66   0.68
4         0.56   0.54   0.39   0.64   0.75   0.22   0.32   --     0.41   0.40
5         0.39   0.10   -0.04  0.38   0.37   0.07   0.00   0.30   0.08   0.07

(a) Deviations between estimated and true exploitable biomass (for terminal years 15-30).

TABLE 5.15 Average of Relative Absolute Deviations Between Estimate and True Exploitable Biomass in Terminal Year (for terminal years 15-30)

          SS-P6                SS-P7  ADMB                 NRC ADAPT
Data Set  F      B      S      F      F      B      S      F      B      S
1         133    40     40     114    65     44     44     207    38     32
2         75     24     28     54     56     33     36     251    30     23
3         83     55     45     66     84     26     24     76     65     63
4         44     34     35     48     52     22     34     388    104    63
5         18     11     13     15     53     55     56     51     43     43
Average   71     33     32     60     62     36     40     195    56     45

NOTE: Relative deviation = (estimated - true)/true; relative absolute deviation = |relative deviation|.

TABLE 5.16A Number of Assessments with Estimates of Terminal Exploitable Biomass Within ± 25% of True Value

          SS-P6                       ADMB4                NRC ADAPT
Data Set  B      F      F      S      F      B      S      B      S      F
1         6      2      3      5      4      3      3      4      7      0
2         9      6      3      8      7      6      5      8      11     2
3         2      0      1      3      5      10     10     3      3      4
4         9      5      6      9      7      11     7      6      7      0
5         16     12     12     14     1      0      0      0      0      0

NOTE: Maximum n = 16.

TABLE 5.16B Number of Assessments with Relative Errors in Estimates of Terminal Exploitable Biomass Within ± 50% of True Value

          SS-P6                       ADMB4                NRC ADAPT
Data Set  B      F      F      S      F      B      S      B      S      F
1         12     6      4      8      6      9      8      12     13     2
2         14     4      11     14     10     13     13     14     15     3
3         11     7      6      11     8      13     14     6      6      7
4         13     10     10     6      11     14     14     7      8      0
5         16     16     16     16     6      5      5      15     14     8
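The summary statistics reported in Tables 5.13 through 5.16B can be computed from a series of terminal-year estimates and true values along the following lines. This is a minimal sketch with made-up numbers. The relative deviation follows the definition given in the note to Table 5.15; the report does not state how the serial correlation was calculated, so the lag-1 Pearson correlation used here is an assumption.

```python
import math

def relative_deviations(estimates, truths):
    """Relative deviation = (estimated - true) / true, one per assessment."""
    return [(e - t) / t for e, t in zip(estimates, truths)]

def mean(values):
    return sum(values) / len(values)

def lag1_correlation(values):
    """Pearson correlation between each value and the next one in time
    (an assumed definition of 'serial correlation')."""
    a, b = values[:-1], values[1:]
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def count_within(deviations, tolerance):
    """Number of assessments whose relative deviation is within +/- tolerance."""
    return sum(1 for d in deviations if abs(d) <= tolerance)

# Made-up terminal-year biomass values, for illustration only.
true_biomass = [100.0, 110.0, 120.0, 90.0, 80.0]
estimated_biomass = [120.0, 121.0, 108.0, 99.0, 56.0]

devs = relative_deviations(estimated_biomass, true_biomass)
avg_dev = mean(devs)                        # as in Table 5.13 (as a fraction)
avg_abs_dev = mean([abs(d) for d in devs])  # as in Table 5.15
serial_r = lag1_correlation(devs)           # as in Table 5.14
n_within_25 = count_within(devs, 0.25)      # as in Table 5.16A
n_within_50 = count_within(devs, 0.50)      # as in Table 5.16B
```

In this toy series the average deviation is essentially zero while the average absolute deviation is 16%, which shows why the report presents both: positive and negative errors can cancel in the signed average.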

TABLE 5.17 Pearson Correlation Between Relative Deviation and True Change in Exploitable Biomass for Terminal Years 16-30

          SS-P3                SS-P7  ADMB4         NRC ADAPT
Data Set  F      B      S      F      F      B      F      B      S
1         -0.37  -0.48  -0.53  -0.39  -0.34  -0.36  -0.17  -0.55  -0.59
2         -0.48  -0.31  -0.33  -0.60  -0.35  -0.42  -0.52  -0.57  -0.64
3         -0.53  -0.51  -0.58  -0.46  -0.29  -0.37  -0.17  -0.11  -0.13
4         -0.28  -0.05  -0.10  -0.27  -0.40  -0.26  -0.03  -0.02  0.02
5         -0.51  -0.38  -0.49  -0.47  -0.46  -0.45  -0.38  -0.48  -0.46
Average   -0.43  -0.35  -0.41  -0.44  -0.36  -0.37  -0.25  -0.35  -0.36

NOTE: Change in biomass is between terminal year and previous year; the lowest coefficient for each data set is noted in boldface type.

TABLE 5.18 Percent of Positive Relative Deviations Given a Positive True Change (labeled as an "up") in Exploitable Biomass for Terminal Years 16-30 Assessments

                 SS-P3                SS-P7  ADMB          NRC ADAPT
Set      # Ups   F      B      S      F      F      B      F      B      S
1        6       67%    50%    17%    33%    50%    17%    100%   67%    50%
2        7       14%    0%     0%     14%    86%    0%     100%   29%    14%
3        5       20%    80%    20%    60%    20%    40%    100%   100%   100%
4        5       20%    80%    60%    40%    80%    20%    100%   100%   80%
5        8       13%    38%    13%    38%    0%     0%     0%     0%     0%
Average  6.2     27%    50%    22%    37%    47%    15%    80%    59%    49%

NOTE: Change in biomass is between terminal year and previous year. Percent positive deviations below 50% indicate a tendency of the method to underestimate biomass during periods of population increase. Relative deviation = (estimate - true)/true exploitable biomass for terminal year.

TABLE 5.19 Percent of Negative Relative Deviations Given a Negative True Change (labeled as a "down") in Exploitable Biomass for Terminal Years 16-30 Assessments

                  SS-P3                SS-P7  ADMB          NRC ADAPT
Set      # Downs  F      B      S      F      F      B      F      B      S
1        9        33%    56%    56%    56%    44%    67%    0%     33%    33%
2        8        25%    63%    50%    50%    25%    100%   0%     13%    25%
3        10       10%    0%     10%    0%     20%    10%    20%    20%    20%
4        10       30%    30%    30%    20%    10%    50%    0%     10%    10%
5        7        57%    29%    57%    57%    100%   100%   100%   100%   100%
Average  8.8      31%    35%    41%    37%    40%    65%    24%    35%    38%

NOTE: Change in biomass is between terminal year and previous year. Percent negative deviations below 50% indicate a tendency of the method to overestimate biomass during periods of population decrease. Relative deviation = (estimate - true)/true exploitable biomass for terminal year.
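The quantities in Tables 5.17 through 5.19 relate the sign and size of the assessment error to the direction of the true change in biomass. The sketch below shows one way such statistics could be computed; the biomass values are made up, and the function names are hypothetical rather than taken from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def percent_same_sign(rel_devs, changes, up=True):
    """Percent of assessments whose relative deviation has the same sign as
    the true change, among years where biomass went up (or down).
    Returns None if no year had a change in the requested direction."""
    if up:
        picked = [d for d, c in zip(rel_devs, changes) if c > 0]
        hits = sum(1 for d in picked if d > 0)
    else:
        picked = [d for d, c in zip(rel_devs, changes) if c < 0]
        hits = sum(1 for d in picked if d < 0)
    if not picked:
        return None
    return 100.0 * hits / len(picked)

# Made-up true biomass for six consecutive years and terminal-year
# estimates for the last five of them (illustration only).
true_biomass = [100.0, 110.0, 120.0, 90.0, 80.0, 95.0]
estimates = [99.0, 126.0, 99.0, 88.0, 85.5]

# True change between each terminal year and the previous year.
changes = [true_biomass[i] - true_biomass[i - 1]
           for i in range(1, len(true_biomass))]
rel_devs = [(e - t) / t for e, t in zip(estimates, true_biomass[1:])]

corr = pearson(rel_devs, changes)                          # Table 5.17
pct_up = percent_same_sign(rel_devs, changes, up=True)     # Table 5.18
pct_down = percent_same_sign(rel_devs, changes, up=False)  # Table 5.19
```

A negative correlation together with percentages below 50%, as in this toy series, is the pattern the table notes describe: estimates lag the true trajectory, falling short when biomass rises and overshooting when it falls.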

FIGURE 5.22 Example of retrospective error pattern (estimates from model SS-P3[F]) over assessments for terminal years 16-30 and all data sets.

spawner-recruit relationship into age-structured models, as shown in Deriso et al. (1985), which could then be used to calculate MSY and other such parameters. In ADMB, a recruitment model is specified which can include a spawner-recruit relationship. Age-structured models are able to reconstruct annual recruitment values without including a spawner-recruit relationship in the assessment model. In this study, analysts using the delay-difference model did not use a spawner-recruit relationship but instead used a recruitment index derived from the survey. Although the production model used a logistic stock-production relationship, one can construct a form of this model that incorporates a recruitment index. (This was mentioned as a possibility by the analyst but not used, because a recruitment index requires age-structured information and the goal was to see what could be extracted from a simple model.)

For management, some biological reference points (BRPs, such as F40%) related to spawning biomass per recruit can be calculated without using a spawner-recruit relationship, and such BRPs are being used with increasing frequency in actual assessments. But the spawner-recruit relationship should be investigated to evaluate whether the BRP is appropriate. Whether it is better to do so by incorporating the relationship within the assessment model or as a separate analysis using output from the assessment model is an open question.

Because a full evaluation of stock assessment models could not be done, the committee used its resources to explore the performance of the models in a simulated stock assessment setting in an attempt to (1) evaluate existing methods and (2) suggest new directions of research into stock assessment methods.
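As an illustration of the per-recruit reference points mentioned above, the following sketch computes F40%, the fishing mortality that reduces spawning biomass per recruit to 40% of its unfished level. No spawner-recruit relationship is needed. All schedules below (selectivity, maturity, weight at age) and the value of M are hypothetical, spawning is assumed to occur at the start of the year, and no plus group is modeled; none of this is taken from the study.

```python
import math

def spawning_biomass_per_recruit(f, m, selectivity, maturity, weight):
    """Spawning biomass produced over the lifetime of one recruit.

    Assumes spawning at the start of each year and no plus group
    (simplifications made for this sketch)."""
    survivors = 1.0
    total = 0.0
    for sel, mat, wt in zip(selectivity, maturity, weight):
        total += survivors * mat * wt
        survivors *= math.exp(-(m + sel * f))
    return total

def f_at_spr_fraction(target, m, selectivity, maturity, weight,
                      f_max=5.0, tol=1e-8):
    """Find F giving SPR(F) / SPR(0) = target by bisection.

    Relies on SPR decreasing with F and on SPR(f_max)/SPR(0) < target."""
    unfished = spawning_biomass_per_recruit(0.0, m, selectivity,
                                            maturity, weight)
    lo, hi = 0.0, f_max
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        ratio = spawning_biomass_per_recruit(mid, m, selectivity,
                                             maturity, weight) / unfished
        if ratio > target:
            lo = mid   # still above target, so F can be higher
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical schedules for ages 1-8.
selectivity = [0.1, 0.3, 0.6, 0.9, 1.0, 1.0, 1.0, 1.0]
maturity = [0.0, 0.0, 0.5, 1.0, 1.0, 1.0, 1.0, 1.0]
weight = [0.2, 0.5, 0.9, 1.3, 1.7, 2.0, 2.2, 2.4]
natural_m = 0.2

f40 = f_at_spr_fraction(0.40, natural_m, selectivity, maturity, weight)
spr_ratio = (spawning_biomass_per_recruit(f40, natural_m, selectivity,
                                          maturity, weight)
             / spawning_biomass_per_recruit(0.0, natural_m, selectivity,
                                            maturity, weight))
```

Because the calculation depends on M and on the selectivity schedule, errors in either one carry through to the reference point, which is one reason the report stresses getting those inputs right.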
Given the results of this study, a more comprehensive evaluation of stock assessment methods should clearly be undertaken. Issues related to the treatment of measurement and process errors, the functional dependence of population parameters (e.g., catchability) on biomass, the choice and weighting of individual data sets, information conflicts among data sets, and the appropriate level of model complexity are all unresolved at present and require attention if there is to be greater confidence in stock assessments. These would be fruitful areas of research.

The committee's analysis indicates that high-quality data, fundamentally the availability of reliable indices to calibrate the models, are essential to produce reliable abundance estimates. In most cases, use of the fishery abundance index resulted in poor performance unless the model contained additional parameters to deal with the trend in the index.

Surplus production and delay-difference models did not perform as well overall as age-structured models; this is not surprising because the simulated data were designed for use with age-structured methods. Surplus production models require a straightforward and immediate response of the population to changes in harvesting levels. The simulated populations were more affected by recruitment fluctuations than by changes in harvest levels. The corruption of indices of abundance by catchability and selectivity changes and by underreporting of catch would make stock assessment with surplus production models nearly impossible.

Better results were achieved for delay-difference models because analysts utilized an index of recruitment from the survey and/or fishery data, rather than relying on a stock-recruitment model. Using a knife-edge selectivity assumption in these models when there was an underlying selectivity pattern with age increased the uncertainty and potential bias in estimates of population parameters. Nevertheless, better results were obtained for these models when the survey index was used alone than when only the fishery index or both indices were used.
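The remark above that surplus production models require an immediate response of the population to harvest can be seen in the structure of the logistic (Schaefer) form, sketched below with hypothetical parameter values. Next year's biomass depends only on this year's biomass and catch; there is no term for recruitment variability, which is what dominated the simulated populations.

```python
def schaefer_projection(b0, r, k, catches):
    """Project biomass with a discrete logistic (Schaefer) production model:
    B[t+1] = B[t] + r * B[t] * (1 - B[t] / K) - C[t], floored at zero."""
    biomass = [b0]
    for catch in catches:
        b = biomass[-1]
        surplus = r * b * (1.0 - b / k)
        biomass.append(max(b + surplus - catch, 0.0))
    return biomass

# Hypothetical parameters: intrinsic growth rate r and carrying capacity K.
r, k = 0.4, 1000.0
msy = r * k / 4.0  # maximum sustainable yield of the logistic model

# Catch equal to MSY, starting at K/2: biomass stays where it is.
steady = schaefer_projection(k / 2.0, r, k, [msy] * 10)

# Catch 50% above MSY: biomass declines year after year.
overfished = schaefer_projection(k / 2.0, r, k, [1.5 * msy] * 10)
```

In this deterministic form any change in biomass is attributed to catch and density dependence alone, so a population driven mainly by recruitment fluctuations will be poorly described, as the simulation results indicated.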
Among the age-structured models, simple models such as ASA, SS-P3, or NRC ADAPT performed reasonably well when only the survey index was used and when the dynamics of the population and harvest were not too complex. More complex models such as SS-P6, SS-P7, and ADMB4 were sometimes able to handle more complex dynamics and indices with trends. However, the success of these more complicated models depended on correct specification of the dynamic changes in selectivity, catchability, and natural mortality.

Simulation results suggest that models with greater complexity offer promise for improving stock assessment. The Kalman filter (in DDKF) and generalized parametric approach (in AD Model Builder) allowed more realistic treatment of process and measurement errors. The Bayesian treatment of parameters (also in AD Model Builder) provided a means for incorporating uncertainty directly into the analysis and yielded results in terms of posterior probability distributions, which explicitly presented uncertainty. The incorporation of functional dependence of catchability and flexibility in model specification (in SS-P6 and SS-P7) provided a more deterministic way of adding realism. Although no specific model outperformed others in the simulations, the committee was intrigued with how more complex models could reduce, at least partially, the biasing effects related to fishery catchability and selectivity changes.

Simulation results showed that when there is substantial recruitment variability, production models do not perform well. Only with populations that exhibit a strong negative response to fishing should these models be used for routine assessment. Nevertheless, there will be situations in which data limitations preclude the use of other methods.

Delay-difference models fared better than production models but worse than age-structured models. Although delay-difference models might be used in situations in which ageing is subject to great error or not possible, it would be more prudent to utilize the age or length information in stock assessments. One of the reasons delay-difference models performed as well as they did in the simulations was the use of a recruitment index from the survey. Thus, the development of recruitment indices for use in stock assessments should be considered.

Although the construction of better stock assessment models is likely to lead to better assessments, accurate and precise information about the population is of paramount importance. Although this conclusion is fairly obvious ("garbage in-garbage out"), the simulation study provided a clear illustration of the importance of good data and information. In data set 4, there were few violations of the underlying assumptions used in the assessments, and not surprisingly, most of the stock assessment models performed acceptably. The worst results for this data set were obtained when only the fishery index was used, showing that bad data weaken a stock assessment. Each of the other data sets had some additional complicating factor that resulted in poorer results than for data set 4. Hence, poor information (e.g., not knowing about a change in catchability) is an additional factor that weakens a stock assessment. When the combination of poor data and poor information becomes large enough (e.g., data set 3), it can be almost impossible to extract any useful stock assessment information.

The major conclusion from the simulation study was that a good index of abundance is needed for useful stock assessment information, not that fishery indices should not be used. Much effort is required to validate any index as a measure of abundance.

Specific examples of poor information incorporated into the simulation study deserve further comment. The misspecification of natural mortality has long been recognized as a critical problem in stock assessment. Overestimation of natural mortality leads to overestimation of both population abundance and optimal harvesting rate due to yield-per-recruit consequences. In the simulation study, natural mortality was rarely overestimated. Rather, the variability in natural mortality made it difficult to estimate other parameters. Two possible approaches to incorporating variable natural mortality are (1) to pursue methods such as multispecies virtual population analysis (MSVPA), which utilize data on food habits of different species, and (2) to use Bayesian specification of priors for natural mortality, provided that appropriate priors can be found (e.g., through meta-analysis).

Underreporting in data set 2 led to underestimation of population biomass, with greater effects at the beginning of the data series. Thus, the proportionate decline in population over time was underestimated, which could lead managers to think that less strict harvesting policies are adequate to rebuild a depleted population. The obvious solution is to design the catch-reporting system so that underreporting is less likely to occur (see Chapter 2). It is notable that underreporting can work to the detriment of fishers in the long run by corrupting the data used in assessment models.

The decline in age selectivity in data sets 1, 2, and 3 resulted in a fishery index that would be higher than under constant selectivity, because more younger fish would be harvested and added to the fishery index. Such a change could occur through targeting of smaller fish as a population became depleted or through a change in the age selectivity curve due to density-dependent or environmentally induced changes in growth. Collection and analysis of growth data are critical to understanding changes in age selectivity. If changes can be detected from such observations, modeling can readily account for them, either by utilizing length-based methods or by having separate sets of age selectivity parameters (e.g., in SS-P7 or ADMB).

The change in survey catchability in data set 3 created a situation in which neither index of abundance was proportional to biomass. When a change in catchability was incorporated into the models, its value could be estimated in many cases, with a resultant improvement in stock assessment results. If the committee had not told the analysts of the potential change, however, it is doubtful whether it would have been detected. The implication is that calibration of survey catchability is an important consideration; calibration studies should be done when there are changes in vessels, crews, or operations that affect the way a survey is conducted (see ASMFC, 1997).

Retrospective analyses from the simulation revealed that stock assessments can vary substantially from the true values over time. These analyses also illustrated that the departures of estimated stock biomass from true values can persist in one direction over time. Consequently, management actions could have deleterious effects on the population long before they are observed. Thus, conservative harvest policies should be developed. The committee encourages greater use of retrospective analysis in stock assessment. The results can suggest when model misspecification is occurring and when data sources are providing contradictory information. The retrospective analysis reported herein focused on an evaluation of modeling.
Another type of retrospective analysis involves summarization of previous stock assessment results regardless of which model was used. This type of retrospective analysis can be useful in examining the stability of the stock assessment process as actually implemented.

The additional investments in stock assessment research recommended by the committee require additional commitments of personnel, field research, and analytical research. Chapter 4 contains a discussion of where investments should be made, not merely in collecting more of the same information but also in improving the type of information collected. Simulation results suggest that if such investments cannot be made, some stock assessments will periodically be far from the truth and, consequently, management mistakes leading to fish population collapses and other negative consequences will be made.

The approach of having a committee conduct simulation research involving NMFS scientists in an independent review of stock assessment methods is not undertaken frequently. This approach was quite useful for brainstorming issues related to stock assessment, and new understanding and novel approaches to stock assessment were inspired by this project. This work was designed to evaluate not the individual NMFS scientists involved but the methods used in stock assessment. Nevertheless, it took a great deal of courage for these scientists to participate, and the committee was impressed with the analysts' willingness and ability to provide what it requested, their creativity in applying existing methods and developing new ones, and their fortitude in taking on a difficult assignment in addition to their regular duties. To foster excellence in stock assessment, NMFS should continue to support and encourage scientists to engage in creative stock assessment activities (e.g., workshops, gaming sessions, and conferences) so that the process of doing stock assessment does not become routine and stale.
