the proportion in the population. Thus, the size of the population can be estimated by equating these two proportions and solving for it: N = mX/n. This is the so-called Petersen estimator (Seber, 2002).
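
As a minimal sketch of this calculation, the Python function below uses descriptive argument names rather than the symbols in the formula. The function name and the example counts are hypothetical and serve only as an illustration.

```python
def petersen_estimate(first_sample_size, second_sample_size, recaptured):
    """Estimate population size by equating the marked proportion in the
    second sample with the marked proportion in the population."""
    if recaptured <= 0:
        raise ValueError("at least one recapture is needed")
    return first_sample_size * second_sample_size / recaptured


# Hypothetical counts: 200 marked in the first sample, 150 caught in the
# second sample, 30 of which were already marked.
print(petersen_estimate(200, 150, 30))  # prints 1000.0
```

The estimate is undefined when no marked individuals are recaptured, which is why the sketch raises an error in that case.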

The International Working Group for Disease Monitoring and Forecasting (1995a, 1995b) provides an excellent discussion of classical capture-recapture ideas. Other good discussions are given by Seber (2002) and Thompson (2002:Chapter 18). In a special issue of an academic journal focusing on recent developments in CRC, an editorial by Bohning (2008) also succinctly describes the state of CRC research.

Log-linear models are important in demography and are very useful in analyzing CRC data (Bishop et al., 1975). Such models have been proposed to allow for departures from homogeneity of the capture probabilities between individuals and/or associations between the two sampling processes (Fienberg, 1972). With two samples, each individual's capture history falls into one of four categories, according to whether the individual was observed in the first sample, in the second sample, in both, or in neither. This can be represented by a four-cell multinomial model. If the capture probabilities of the individuals are homogeneous within each of the samples, then the maximum likelihood estimate of N is the integer part of the Petersen estimator. If the captures and recaptures are treated as separate factors, then the numbers of individuals with each capture history can be modeled as Poisson or multinomial counts. Different estimators can be derived under different assumptions about the population and sampling processes. More importantly, log-linear models allow (positive or negative) dependencies between the captures to be modeled, especially if there are multiple recaptures (Bishop et al., 1975). A good application of this approach when two recaptures are made is given by Darroch and colleagues (1993). Pledger (2000) developed a unified linear-logistic framework for fitting many of these models. Baillargeon and Rivest (2007) present an R package to estimate many capture-recapture models, focusing on those that can be expressed in log-linear form.
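
The two-sample case can be sketched in a few lines of Python. Under the log-linear model with no interaction between the samples (independence), the fitted odds ratio of the four-cell table equals 1, so the count in the unobserved "neither sample" cell is estimated from the three observed cells. The function name, the `odds_ratio` argument, and the counts are assumptions made for this illustration. With only two samples the odds ratio cannot be estimated from the data, so any value other than 1 has to be supplied from outside information.

```python
def two_sample_loglinear_estimate(both, first_only, second_only, odds_ratio=1.0):
    """Estimate the unobserved cell and the population size from a
    two-sample capture table.

    odds_ratio = 1 corresponds to independent samples; a value above 1
    represents positive dependence, a value below 1 negative dependence.
    """
    if both <= 0:
        raise ValueError("the 'both samples' cell must be positive")
    # The odds ratio is (both * missing) / (first_only * second_only),
    # so solving for the missing cell gives:
    missing = odds_ratio * first_only * second_only / both
    total = both + first_only + second_only + missing
    return missing, total


# Hypothetical counts: 30 seen in both samples, 170 only in the first,
# 120 only in the second.
missing, total = two_sample_loglinear_estimate(30, 170, 120)
print(missing, total)  # prints 680.0 1000.0
```

With these counts the first sample has 200 individuals and the second has 150, and the independence estimate of 1,000 is the same as 200 × 150 / 30, the value obtained by equating the two proportions. Assuming positive dependence (for example `odds_ratio=2.0`) raises the estimate, since individuals missed by one sample are then more likely to be missed by the other.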

Other approaches tend to model the heterogeneity in specific forms, typically by incorporating random effects for the individuals. Darroch and colleagues (1993) developed Rasch-type models for CRC in the context of human censuses and supplementary demographic surveys. They also developed log-linear quasi-symmetry models. Other extensions use finite mixtures to partition the population into two or more groups with relatively homogeneous capture probabilities. Examples of these approaches are the logistic-normal generalized linear mixed model and log-linear latent class models with homogeneity within the classes (Agresti, 2002:Sections 12.3.6, 13.1.3, 13.2.6).
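
A small simulation can illustrate why heterogeneity needs to be modeled at all. The sketch below is not one of the models cited above. It only compares the simple two-sample estimate in a population with a common capture probability against a population split into two hypothetical classes, one hard to capture and one easy to capture. The population size, the capture probabilities, and the function name are all made up for the illustration.

```python
import random


def simulate_two_sample_estimate(population_size, capture_probs, seed=0):
    """Simulate two independent samples and return first * second / both.

    Individuals are assigned to classes in rotation; each class has its
    own capture probability, used in both samples.
    """
    rng = random.Random(seed)
    first = second = both = 0
    for i in range(population_size):
        p = capture_probs[i % len(capture_probs)]
        in_first = rng.random() < p
        in_second = rng.random() < p
        first += in_first
        second += in_second
        both += in_first and in_second
    if both == 0:
        raise ValueError("no individual was captured in both samples")
    return first * second / both


# One class with probability 0.3 versus two equal-sized classes with
# probabilities 0.1 and 0.5 (same average capture probability).
homogeneous = simulate_two_sample_estimate(10_000, [0.3])
heterogeneous = simulate_two_sample_estimate(10_000, [0.1, 0.5])
print(round(homogeneous), round(heterogeneous))
```

In the mixed population the easily captured individuals dominate the recaptures, so the simple estimate falls well below the true size of 10,000. This is the kind of bias that random-effects and latent class models are designed to address.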

Fienberg and colleagues (1999) integrate many of the above approaches for multiple-recapture or multiple-list data in developing a mixed effects approach (fixed effects for the lists and random effects for the individuals). This approach allows the modeling of the dependence between lists


