Suggested Citation:"Appendix H: Defining and Managing Outliers in MRIP Output: An Order Statistics Approach." National Academies of Sciences, Engineering, and Medicine. 2021. Data and Management Strategies for Recreational Fisheries with Annual Catch Limits. Washington, DC: The National Academies Press. doi: 10.17226/26185.

Appendix H

Defining and Managing Outliers in MRIP Output: An Order Statistics Approach

The Marine Recreational Information Program (MRIP) produces estimates of recreational fish catch and variance of catch by 2-month wave and by year. These estimates are produced by domain, such as by species, by geographic region, or by fishing mode. Fishing regulations are typically based on recent MRIP catch estimates or statistics derived from them, such as the third-largest of the five most recent MRIP catch estimates. The influence of so-called outlier estimates on such derived statistics is therefore an important issue. Fishery managers face three questions: how to define an outlier, how to decide whether an outlier of a given magnitude should trigger a change in management policy, and how to update management policy given a triggering outlier. This appendix presents a method for answering these questions based on the statistical concept of order statistics.

ORDER STATISTICS

The statistical concept of order statistics offers one approach to defining, identifying, and measuring outliers. Order statistics provide a method for determining the probabilities that the first-largest, second-largest, third-largest, etc., in a set of ordered numbers will take particular values.

For example, denote the i = 1…n annual MRIP catch estimates for a particular fish species as X₁, X₂, …, X_n. Assuming no change in the fish population or fishery from year to year (an assumption that can be tested later), the X_i are independent and identically distributed random variables having a common probability density f(X) and a common cumulative distribution function F(X). For nonrare fish species, f(X) is typically the density function for the normal distribution, and F(X) is the cumulative distribution function for the normal distribution. (For rare fish species, f(X) and F(X) could be the probability mass function and cumulative distribution function for the Poisson or negative binomial distribution [see Appendix E].)

Arrange the X₁, X₂, …, X_n values in order from smallest to largest, and use subscripts j = 1 to n in parentheses to denote the order of the values as shown below:

X₍₁₎ = the smallest of the set X₁, X₂, …, X_n.

X₍₂₎ = the second-smallest of the set X₁, X₂, …, X_n.

X₍j₎ = the jth-smallest of the set X₁, X₂, …, X_n.

X₍n−1₎ = the second-largest of the set X₁, X₂, …, X_n.

X₍n₎ = the largest of the set X₁, X₂, …, X_n.
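The ordering notation can be illustrated with a minimal Python sketch. The catch values and the helper name `order_statistic` are hypothetical, chosen only for illustration; they are not MRIP data or part of any MRIP software.

```python
def order_statistic(values, j):
    """Return X_(j), the j-th smallest of `values` (j = 1 is the smallest)."""
    n = len(values)
    if not 1 <= j <= n:
        raise ValueError("j must be between 1 and n")
    return sorted(values)[j - 1]


# Hypothetical annual catch estimates (illustrative numbers, not MRIP data).
catches = [2400.0, 3100.0, 2750.0, 2900.0, 2600.0]
n = len(catches)

smallest = order_statistic(catches, 1)            # X_(1)
largest = order_statistic(catches, n)             # X_(n)
third_largest = order_statistic(catches, n - 2)   # X_(n-2)
```

With n = 5, the third-largest value X₍n−2₎ is the same as the third-smallest value X₍3₎, which is the statistic used in the regulatory example above.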

PROBABILITY DISTRIBUTIONS OF ORDER STATISTICS

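It can be shown (Ross, 1988, p. 225) that the joint density function f_joint of the order statistics is given by:

$$f_{\text{joint}}\left(x_{(1)}, x_{(2)}, \ldots, x_{(n)}\right) = n! \prod_{i=1}^{n} f\left(x_{(i)}\right), \qquad x_{(1)} < x_{(2)} < \cdots < x_{(n)}.$$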

The density function f(X₍j₎ = x) of the jth-order statistic X₍j₎ can be obtained by integrating the joint density function above to find (Ross, 1988, p. 227):

$$f\left(X_{(j)} = x\right) = \frac{n!}{(n-j)!\,(j-1)!}\,[F(x)]^{j-1}\,[1 - F(x)]^{n-j}\,f(x).$$
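As a rough sketch, the standard textbook density of the jth-order statistic, n!/((n − j)!(j − 1)!) · F(x)^(j−1) · (1 − F(x))^(n−j) · f(x), can be evaluated with Python's standard library. The normal parameters (mean 2,500, standard deviation 400) and the function name `order_stat_pdf` are assumptions made for illustration only.

```python
from math import factorial
from statistics import NormalDist


def order_stat_pdf(x, j, n, dist):
    """Density of the j-th smallest of n i.i.d. draws from `dist`, evaluated at x."""
    coeff = factorial(n) / (factorial(n - j) * factorial(j - 1))
    F = dist.cdf(x)
    return coeff * F ** (j - 1) * (1.0 - F) ** (n - j) * dist.pdf(x)


# Hypothetical catch distribution (assumed parameters, not estimated from data).
catch_dist = NormalDist(mu=2500.0, sigma=400.0)

# Density of the third-smallest of five catches, evaluated at the mean catch.
density_at_mean = order_stat_pdf(2500.0, j=3, n=5, dist=catch_dist)
```

For j = 3 and n = 5 the leading coefficient is 5!/(2!·2!) = 30, so at the mean (where F = 0.5) the density is 30 · 0.5² · 0.5² = 1.875 times the parent normal density.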

The cumulative distribution function F(X₍j₎ ≤ b) of the jth-order statistic X₍j₎ can be obtained by integrating the density function f(X₍j₎ = x) of the jth-order statistic to find (Ross, 1988, p. 227):

$$F\left(X_{(j)} \le b\right) = \sum_{k=j}^{n} \binom{n}{k}\,[F(b)]^{k}\,[1 - F(b)]^{n-k}.$$

For example, F(X₍n−2₎ ≤ 3,000) gives the probability that the third-largest catch out of n catches is less than or equal to 3,000. Similarly, 1 – F(X₍n−2₎ ≤ 3,000) gives the probability that the third-largest catch out of n catches is greater than 3,000.
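The probability in this example can be sketched in Python using the fact that X₍j₎ ≤ b exactly when at least j of the n catches are at or below b, which gives a binomial sum in F(b). The normal parameters and the function name `order_stat_cdf` are illustrative assumptions, not values taken from MRIP.

```python
from math import comb
from statistics import NormalDist


def order_stat_cdf(b, j, n, dist):
    """P(X_(j) <= b): probability that at least j of n i.i.d. draws are <= b."""
    F = dist.cdf(b)
    return sum(comb(n, k) * F ** k * (1.0 - F) ** (n - k) for k in range(j, n + 1))


# Hypothetical catch distribution (assumed parameters, not estimated from data).
catch_dist = NormalDist(mu=2500.0, sigma=400.0)
n = 5

# Probability that the third-largest of n = 5 catches is at most 3,000 ...
p_at_most = order_stat_cdf(3000.0, j=n - 2, n=n, dist=catch_dist)
# ... and the complementary probability that it exceeds 3,000.
p_greater = 1.0 - p_at_most
```

Two special cases give a quick check on the sum: for j = n it reduces to F(b)ⁿ, and for j = 1 it reduces to 1 − (1 − F(b))ⁿ.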

FISHERIES APPLICATIONS: DEFINING AN OUTLIER

First, consider the problem of determining whether the largest value of catch in n time periods is an outlier. Assume that i = 1 to n time periods of catch data are available for a nonrare fish species, where fish catch X_i in each time period i follows a normal distribution f(X_i) with the same mean μ and variance σ² for all i, and where F(X_i) is the normal CDF of X_i. Suppose fishery managers are trying to decide whether the largest value of catch from the n time periods, namely X₍n₎, is an outlier. One possible definition of an outlier would be any value of X₍n₎ so large that the probability of the largest catch exceeding it is less than the fishery manager’s preselected level of statistical significance (say, 5 percent). The “threshold” value of catch, denoted b, for this definition of outlier would be the solution to:

$$1 - F\left(X_{(n)} \le b\right) = 1 - [F(b)]^{n} = 0.05.$$

Hence, if the largest catch X₍n₎ in the n time periods is greater than b, it would be considered an outlier because a value that large has a less than 5 percent probability of occurring by chance alone. Similarly, if a fishery regulation were based on the third-largest of the five most recent MRIP catch estimates, which for n = 5 is the order statistic X₍n−2₎ = X₍3₎, then the threshold value c that the third-largest catch estimate has a 5 percent chance of exceeding is the solution to:

$$1 - F\left(X_{(3)} \le c\right) = 0.05, \qquad n = 5,$$

where any value for the third-largest catch greater than c would be considered an outlier.
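The two thresholds can be found numerically. The sketch below assumes a normal catch distribution with hypothetical parameters (mean 2,500, standard deviation 400), and uses simple bisection on the order-statistic CDF; the function names are illustrative. For the largest catch, the threshold also has a closed form, b = F⁻¹(0.95^(1/n)), which serves as a check.

```python
from math import comb
from statistics import NormalDist


def order_stat_cdf(b, j, n, dist):
    """P(X_(j) <= b) for the j-th smallest of n i.i.d. draws from `dist`."""
    F = dist.cdf(b)
    return sum(comb(n, k) * F ** k * (1.0 - F) ** (n - k) for k in range(j, n + 1))


def outlier_threshold(j, n, dist, alpha=0.05, tol=1e-9):
    """Solve 1 - P(X_(j) <= t) = alpha for t by bisection (the CDF increases in t)."""
    lo = dist.mean - 10.0 * dist.stdev
    hi = dist.mean + 10.0 * dist.stdev
    target = 1.0 - alpha
    while hi - lo > tol * dist.stdev:
        mid = 0.5 * (lo + hi)
        if order_stat_cdf(mid, j, n, dist) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical catch distribution (assumed parameters, not estimated from data).
catch_dist = NormalDist(mu=2500.0, sigma=400.0)
n = 5

b = outlier_threshold(j=n, n=n, dist=catch_dist)       # threshold for the largest catch
c = outlier_threshold(j=n - 2, n=n, dist=catch_dist)   # threshold for the third-largest
b_closed_form = catch_dist.inv_cdf(0.95 ** (1.0 / n))  # check for the j = n case
```

Because the third-largest catch is never greater than the largest, its threshold c is lower than b for the same significance level.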

FISHERIES APPLICATIONS: DECIDING WHETHER AN OUTLIER SHOULD TRIGGER A MANAGEMENT CHANGE

If an outlier were to occur, fishery managers would first check to ensure that the outlier was not due to an error in the data or an error in data processing. If the outlier were not due to an error, managers would need to decide whether (1) the outlier occurred by chance alone, and so should not trigger a change in fishery management policies (e.g., a change in control rules); or (2) the outlier is an indication that either the fish population or the fishery is changing, and that as a result, the probability distribution of X is shifting, so the outlier should trigger a change in fishery management policies. Typically, fishery managers would use their prespecified level of statistical significance (say, 5 percent) to decide between (1) and (2). If the observed catch exceeded the threshold value of catch (such as b or c in the above examples), managers would conclude that either the fish population or the fishery was changing, and that as a result, the probability distribution of X was shifting, so fishery management policies should be changed (or at least that further investigation was warranted).
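One way to see what the 5 percent significance level implies is a small simulation: if nothing in the fishery is changing, the largest of n catches should still exceed the threshold b in roughly 5 percent of cases, so about 1 in 20 triggers would be false alarms. The sketch below assumes a normal catch distribution with hypothetical parameters; the function name, trial count, and seed are arbitrary illustrative choices.

```python
import random
from statistics import NormalDist


def false_trigger_rate(n, dist, alpha=0.05, trials=20000, seed=12345):
    """Share of simulated unchanged fisheries whose largest catch exceeds b."""
    # Closed-form threshold for the largest of n i.i.d. catches.
    b = dist.inv_cdf((1.0 - alpha) ** (1.0 / n))
    rng = random.Random(seed)
    exceed = 0
    for _ in range(trials):
        largest = max(rng.gauss(dist.mean, dist.stdev) for _ in range(n))
        if largest > b:
            exceed += 1
    return exceed / trials


# Hypothetical catch distribution (assumed parameters, not estimated from data).
rate = false_trigger_rate(n=5, dist=NormalDist(mu=2500.0, sigma=400.0))
```

The simulated rate should land near the chosen significance level, which is the rate at which chance alone would be mistaken for a shift in the distribution of X.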

FISHERIES APPLICATIONS: HOW TO UPDATE MANAGEMENT POLICY GIVEN A TRIGGERING OUTLIER

Given a triggering outlier, the outlier value of catch would be used to update the probability distribution of fish catch using Bayesian updating methodology as described under the Bayesian model of in-season management outlined in this report (see Appendix D). Other fisheries management policies (e.g., control rules) could then be updated based on the updated probability distribution of fish catch.
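Appendix D is not reproduced here, so the following is only a generic illustration of Bayesian updating and not the specific model described there. It assumes a normal prior on mean catch and a normal observation with known standard deviation (the standard conjugate normal update); all numbers and the function name `update_normal_mean` are hypothetical.

```python
from math import sqrt


def update_normal_mean(prior_mean, prior_sd, obs, obs_sd):
    """Conjugate normal update of a mean, given one observation with known sd."""
    prior_prec = 1.0 / prior_sd ** 2   # precision = 1 / variance
    obs_prec = 1.0 / obs_sd ** 2
    post_prec = prior_prec + obs_prec
    post_mean = (prior_prec * prior_mean + obs_prec * obs) / post_prec
    return post_mean, sqrt(1.0 / post_prec)


# Hypothetical prior on mean catch, and a hypothetical triggering outlier of 3,500.
post_mean, post_sd = update_normal_mean(
    prior_mean=2500.0, prior_sd=400.0, obs=3500.0, obs_sd=400.0
)
```

With equal prior and observation precision, the posterior mean falls halfway between the prior mean and the outlier, and the posterior standard deviation shrinks by a factor of √2.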

REFERENCE

Ross, S. 1988. A First Course in Probability, 3rd Edition. New York: Macmillan.
