**Appendix B: Review of Capture-Recapture Ideas for Measuring the Flow of Unauthorized Crossings at the U.S.–Mexico Border**

**GENERAL APPROACHES TO CAPTURE-RECAPTURE SAMPLING**


Capture-recapture sampling (CRC) has a history reaching back at least to the 19th century (Bohning, 2008; Goudie and Goudie, 2007). It is often used to estimate the total number of individuals in a population. In its simplest form, an initial sample is obtained from the population and the individuals in the sample are “marked” in such a way that one can subsequently observe whether a given individual was in that sample. A second sample is obtained independently, and the number of individuals in it that were marked in the first sample is recorded. Under simplifying assumptions about the representativeness of marked individuals in both samples, the total number of individuals in the population can be estimated (Thompson, 2002). When there is more than one recapture sample, the names “multiple-recapture,” “multiple-system methods,” or “multiple list” are often used.

CRC methods have a long history in the estimation of the abundance of biological populations, such as fish, birds, and mammals. More recently, they have been used to estimate the abundance of hard-to-reach human populations such as the homeless (Hopper et al., 2008; Laska and Meisner, 1993; Sudman et al., 1988) and to adjust for census undercounts of minorities (Darroch et al., 1993). For human populations, CRC methods are referred to as “dual-system methods” or “dual-list methods.”

Let *N* be the population size, *n* and *m* be the initial and second sample sizes, and *X* be the number of marked individuals in the second sample. Intuitively, if the second sample is representative of the population as a whole, then the proportion of marked individuals in it, *X*/*m*, will be close to the proportion of marked individuals in the population, *n*/*N*. Thus, the size of the population can be estimated by equating these two proportions and solving for *N*: $\hat{N} = nm/X$. This is the so-called Petersen estimator (Seber, 2002).
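
The calculation can be sketched in a few lines of Python. Equating the marked share of the second sample, *X*/*m*, with the marked share of the population, *n*/*N*, and solving for *N* gives *nm*/*X*. The function name `petersen_estimate` and the sample counts below are hypothetical and chosen only for illustration.

```python
def petersen_estimate(n: int, m: int, x: int) -> float:
    """Petersen estimate of the population size N.

    n: size of the first (marked) sample
    m: size of the second sample
    x: number of marked individuals found in the second sample
    """
    if x <= 0:
        raise ValueError("at least one marked individual must be recaptured")
    if x > min(n, m):
        raise ValueError("recaptures cannot exceed either sample size")
    # Equate x/m (marked share of the second sample) with n/N and solve for N.
    return n * m / x


# Hypothetical counts, for illustration only.
print(petersen_estimate(n=200, m=150, x=30))  # 1000.0
```

The estimate is undefined when no marked individual is recaptured, which is why the sketch rejects `x = 0` rather than returning a value.
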

The International Working Group for Disease Monitoring and Forecasting (1995a, 1995b) provides an excellent discussion of classical capture-recapture ideas. Other good discussions are given by Seber (2002) and Thompson (2002:Chapter 18). In a special issue of an academic journal focusing on recent developments in CRC, an editorial by Bohning (2008) also succinctly describes the state of CRC research.

Log-linear models are important in demography and are very useful in analyzing CRC data (Bishop et al., 1975). Such models have been proposed to allow for departures from homogeneity of the capture probabilities between individuals and/or associations between the two sampling processes (Fienberg, 1972). The capture history of an individual can be classified into four categories based on observation or non-observation in the first and second sample. This can be represented by a four-cell multinomial model. If the capture probabilities of the individuals are homogeneous within each of the samples, then the maximum likelihood estimate of *N* is the integer part of the Petersen estimator. If the captures and recaptures are treated as separate factors, then the number of capture histories falling into the various categories can be modeled as Poisson or multinomial counts. Different estimators can be derived under different assumptions about the population and sampling processes. More importantly, log-linear models allow (positive or negative) dependencies between the captures to be modeled, especially if there are multiple recaptures (Bishop et al., 1975). A good application of this approach when two recaptures are made is given by Darroch and colleagues (1993). Pledger (2000) developed a unified linear-logistic framework for fitting many of these models. Baillargeon and Rivest (2007) present an R package to estimate many capture-recapture models, focusing on those that can be expressed in log-linear form.
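
As a hedged illustration of how the two-sample case fits into the log-linear framework, the short derivation below uses notation that is assumed here rather than taken from the text: $n_{11}$, $n_{10}$, and $n_{01}$ are the observed counts of individuals seen in both samples, the first only, and the second only; $\mu_{ij}$ are the corresponding expected counts; and $\mu_{00}$ is the expected count for the unobserved cell. Under the independence model (no interaction between the two captures), the estimated missing cell reproduces the Petersen form with $n = n_{11} + n_{10}$, $m = n_{11} + n_{01}$, and $X = n_{11}$.

```latex
\begin{aligned}
\log \mu_{ij} &= \lambda + \lambda^{(1)}_{i} + \lambda^{(2)}_{j},
  \qquad i, j \in \{0, 1\}, \\
\hat{\mu}_{00} &= \frac{n_{10}\, n_{01}}{n_{11}}, \\
\hat{N} &= n_{11} + n_{10} + n_{01} + \hat{\mu}_{00}
  = \frac{(n_{11} + n_{10})(n_{11} + n_{01})}{n_{11}}
  = \frac{n m}{X}.
\end{aligned}
```

With only two samples the interaction term cannot be estimated, because the fourth cell is never observed; dependence between captures becomes estimable only when there are additional samples or lists.
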

Other approaches model the heterogeneity in specific forms, typically by incorporating random effects for it. Darroch and colleagues (1993) developed Rasch-type models for CRC in the context of human censuses and supplementary demographic surveys. They also developed log-linear quasi-symmetry models. Other extensions include finite mixture methods that partition the population into two or more groups with relatively homogeneous capture probabilities. Examples of these are the logistic-normal generalized linear mixed model and log-linear latent class models with homogeneity within the classes (Agresti, 2002:Sections 12.3.6, 13.1.3, 13.2.6).
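
To see why heterogeneity in capture probabilities matters, the following Python sketch evaluates the Petersen formula at expected counts for a population made up of groups with different capture probabilities. It is an illustration under simplifying assumptions (two independent samples, the same capture probability for an individual in both samples), and the function name and numbers are hypothetical.

```python
def petersen_on_expected_counts(group_sizes, capture_probs):
    """Petersen formula evaluated at expected counts.

    Each member of group g is assumed to be caught with probability
    capture_probs[g] in each of two independent samples.
    """
    pairs = list(zip(group_sizes, capture_probs))
    # Expected size of each sample: sum over groups of N_g * p_g.
    expected_sample = sum(size * p for size, p in pairs)
    # Expected number caught in both samples: sum of N_g * p_g**2.
    expected_recaptures = sum(size * p ** 2 for size, p in pairs)
    return expected_sample * expected_sample / expected_recaptures


# Both populations have 1,000 members and an average capture probability of 0.3.
homogeneous = petersen_on_expected_counts([1000], [0.3])
mixed = petersen_on_expected_counts([500, 500], [0.1, 0.5])
print(round(homogeneous), round(mixed))  # 1000 692
```

In the mixed population the easily caught group dominates the recaptures, so the formula understates the true size of 1,000. This is the kind of departure that mixture and random-effects models are meant to address.
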

Fienberg and colleagues (1999) integrate many of the above approaches for multiple-recapture or multiple-list data in developing a mixed effects approach (fixed effects for the lists and random effects for the individuals). This approach allows the modeling of the dependence between lists and the incorporation of covariates. They develop Bayesian inference for their specification. Manrique-Vallier and Fienberg (2008) expand on this approach, modeling individual-level heterogeneity using a Grade of Membership model wherein individuals are postulated as mixtures of latent homogeneous but extreme “ideal” types.

Many populations, including that of unauthorized crossers, are open in the sense that the population experiences change during or between the sampling periods (e.g., births, deaths). Many of the models reviewed above implicitly presume the population is closed (i.e., has fixed and unchanging membership). For open populations, interest typically has focused on the case where the population is closed during the period of each capture and experiences immigration and mortality between the capture periods. Cormack (1989) reviews many of the classical models for this case. Pledger and colleagues (2003) extend these to allow for individual heterogeneity in survival and capture rates using a finite mixture formulation. These models are under continuing development (see the review by Royle and Dorazio [2010]).

**CAPTURE-RECAPTURE APPLICATIONS TO UNAUTHORIZED BORDER CROSSINGS**


The most direct expression of capture-recapture ideas as applied to unauthorized border crossings is the work of Espenshade (1990, 1995b) and Singer and Massey (1998). They develop simple CRC models in the context of apprehensions (“capture”) and re-apprehensions (“recapture”) of unauthorized crossers. Specifically, Espenshade (1995b) models the number of crossings an individual attempts until a successful crossing as a geometric distribution. Under the assumptions that individuals continue to attempt crossings until they succeed and that the probability of success is the same for each attempt, together with other strong assumptions, he derives the equivalent of the Petersen estimator for the number of unauthorized crossers. He does not develop measures of uncertainty for this estimate, nor does he tie the work into the broader CRC literature. This approach is similar in spirit to that of the “frequency of apprehension frequencies” discussed in Chapter 5. Chang and colleagues (2006) extend these methods to treat “discouragement” due to prior apprehension and “return and reentry” due to unobserved exit and reentry into the United States. However, the panel did not have access to their paper and therefore could not review it; the only available description was by Morral and colleagues (2011).
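
The logic of the geometric repeated-attempts model can be sketched with a small simulation. This is an illustration of the general idea under the stated assumptions (every individual keeps trying until successful, a constant apprehension probability, and perfect identification of repeat apprehensions); it is not a reproduction of Espenshade's derivation, and all names and numbers are hypothetical. Because every apprehension is followed by another attempt, the share of apprehensions that are repeats estimates the apprehension probability, and total attempts are then estimated as apprehensions squared over repeat apprehensions, which has the Petersen form.

```python
import random


def simulate_apprehensions(num_migrants, p_apprehend, seed=0):
    """Each migrant attempts to cross until succeeding; every attempt is
    apprehended independently with probability p_apprehend."""
    rng = random.Random(seed)
    total, repeat = 0, 0
    for _ in range(num_migrants):
        caught = 0
        while rng.random() < p_apprehend:
            caught += 1
        total += caught
        # Every apprehension after a migrant's first is a repeat apprehension.
        repeat += max(caught - 1, 0)
    return total, repeat


def estimate_from_apprehensions(total, repeat):
    """Estimate apprehension probability, attempts, and successful entries."""
    p_hat = repeat / total                 # share of apprehensions that are repeats
    attempts_hat = total / p_hat           # equals total**2 / repeat
    entries_hat = attempts_hat - total     # attempts that were not apprehended
    return p_hat, attempts_hat, entries_hat


# Hypothetical scenario: 100,000 migrants, apprehension probability 0.4.
total, repeat = simulate_apprehensions(100_000, 0.4)
p_hat, attempts_hat, entries_hat = estimate_from_apprehensions(total, repeat)
print(f"p_hat={p_hat:.3f}, estimated successful entries={entries_hat:,.0f}")
```

The estimate is only as good as the assumptions: discouragement after apprehension, or unobserved exits and reentries, would break the link between the repeat share and the apprehension probability.
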

A variant of CRC is “red teaming,” in which individuals are recruited to attempt to cross so as to obtain an estimate of the probability of apprehension. This is referred to as plant-capture in the ecological literature (Goudie et al., 2007).
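
A minimal sketch of how a plant-capture exercise might be summarized: the apprehension probability is estimated as the share of recruited crossers who are apprehended, here with a Wilson score interval to convey uncertainty. The choice of interval, the function name, and the counts are assumptions made for illustration and do not come from the studies cited.

```python
import math


def apprehension_probability(plants_sent, plants_caught, z=1.96):
    """Point estimate and Wilson score interval for the probability that a
    recruited crosser (a "plant") is apprehended."""
    if plants_sent <= 0 or not 0 <= plants_caught <= plants_sent:
        raise ValueError("need 0 <= plants_caught <= plants_sent and plants_sent > 0")
    p_hat = plants_caught / plants_sent
    denom = 1 + z ** 2 / plants_sent
    centre = (p_hat + z ** 2 / (2 * plants_sent)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / plants_sent + z ** 2 / (4 * plants_sent ** 2)
    )
    return p_hat, centre - half_width, centre + half_width


# Hypothetical exercise: 50 recruited crossers, 20 of them apprehended.
p_hat, lower, upper = apprehension_probability(50, 20)
print(f"p_hat={p_hat:.2f}, approximate 95% interval=({lower:.2f}, {upper:.2f})")
```

With small numbers of plants the interval is wide, so any estimate built on this probability inherits substantial uncertainty.
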
