Two main statistical approaches are used by cluster methods. In one, distances between all pairs of events (case-case, case-control, case-exposure, etc.) are calculated and summarized. The summary may be the number of case pairs closer than a certain separation distance, the mean distance between cases, or the mean distance from each case to the nearest other case. Alternatively, the whole distribution of distances may be used. If controls are used, then a summary of the case distances may be compared to a summary of the control distances or even to case-control distances.
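The distance summaries named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: events are taken to be points in a plane with Euclidean distance, and the function names, the coordinates, and the 2.0 threshold are all hypothetical rather than taken from any particular cluster method.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distances between all unordered pairs of points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def close_pair_count(points, threshold):
    """Number of pairs separated by less than `threshold`."""
    return sum(d < threshold for d in pairwise_distances(points))


def mean_pairwise_distance(points):
    """Mean distance over all unordered pairs."""
    distances = pairwise_distances(points)
    return sum(distances) / len(distances)


def mean_nearest_neighbor_distance(points):
    """Mean distance from each point to its nearest other point."""
    nearest = [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    return sum(nearest) / len(nearest)


# Illustrative case locations (hypothetical data, arbitrary units).
cases = [(0.0, 0.0), (0.0, 1.0), (3.0, 4.0)]

print(close_pair_count(cases, threshold=2.0))
print(mean_pairwise_distance(cases))
print(mean_nearest_neighbor_distance(cases))
```

When controls are available, the same functions could be applied to the control locations and the two sets of summaries compared.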
In the other approach, called the cell occupancy approach, the study area (i.e., time, space, or space-time) is divided into a set of cells, and each is assigned an expected number of cases on the basis of the overall disease rate, possibly adjusting for population density or other risk factors. Then, the observed number of cases in each cell is compared to the expected number in each cell.
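A minimal sketch of the cell occupancy comparison follows. It assumes the expected count in each cell is the overall disease rate multiplied by the cell's population, which is one simple way of adjusting for population density. The Pearson-type sum used to compare observed with expected counts is an assumption of this sketch, not a statistic prescribed by the text, and the counts and populations are hypothetical.

```python
def expected_counts(observed, population):
    """Expected cases per cell: overall rate times each cell's population."""
    overall_rate = sum(observed) / sum(population)
    return [overall_rate * n for n in population]


def pearson_statistic(observed, expected):
    """Sum over cells of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Illustrative data for four cells (hypothetical values).
observed = [2, 3, 10, 5]                  # observed cases per cell
population = [1000, 1500, 1500, 1000]     # population at risk per cell

expected = expected_counts(observed, population)
statistic = pearson_statistic(observed, expected)

for cell, (o, e) in enumerate(zip(observed, expected), start=1):
    print(f"cell {cell}: observed={o}, expected={e:.2f}")
print(f"statistic={statistic:.3f}")
```

A large value of the statistic indicates that the observed counts depart from the expected counts in at least some cells. Judging whether the departure exceeds chance would require a reference distribution, which this sketch does not compute.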
Two main characteristics are used to evaluate preepidemiologic methods: bias and sensitivity. When planning a preepidemiologic investigation, it is important to evaluate these characteristics in order to appreciate the strengths and limitations of the method. When working with communities, it is at least as important to discuss these characteristics with the community representatives, so that they understand what the method can and cannot do. If the residents do not appreciate the limitations at the outset and the study gives a negative result, the community representatives may believe that the scientists are hiding something and that they purposefully used a method that could not find a cluster. However, if they understand the limitations at the outset, they can accept or reject the methodological approach independently of the results, a far more objective and satisfying evaluation. Alternatively, researchers could work with residents to select a method that is both scientifically appropriate and responsive to residents' preferences. These characteristics are now considered.
Bias, the finding of a false effect or the obscuring of a real effect, is a critical concern. For example, investigators studying lung cancer incidence around an industrial facility might find a cluster of disease near the facility. However, data on smoking might not have been available, and a greater proportion of the people living close to the facility than of those living far away might have been smokers (due to clustering of people of similar ethnicity or socioeconomic status). If the investigators had been able to adjust for smoking, the cluster might have been fully explained, exonerating the industrial facility. By not adjusting the data for smoking patterns, the investigators would therefore have drawn the wrong conclusion. Unfortunately, there is no easy way to test for this type of bias unless data on all possible factors that could