possible to specify the appropriate groups to which the prediction equations, test score cutoffs, or other classification schemes will apply. For example, the category "convicted male felons" in one state is not necessarily comparable to the same category in another state.
Often, a research strategy is proposed in which tests, score cutoffs, or other methods are developed with one population and then tested on similar populations to demonstrate their robustness. However, the results to date of classification and prediction research on violence offer little comfort to those hoping for robust findings. Instead, the sequential research strategy frequently demonstrates that classification techniques applied to another group yield entirely different (unstable) categories, or that classification or prediction equations are highly inaccurate when applied to different study populations.
A related difficulty is that the original study population, unknown to the researchers, may differ from other populations in its variance on variables strongly correlated with violence. When the study population shows little variance on such variables, they will not emerge in the analysis as significantly associated with violence, even if they are measured correctly in that population. In addition, unmeasured variables pertinent to the selection process may also be predictors of violence. Such variables tend to be forgotten in interpreting the results of the analysis, especially if the researchers never had a genuine opportunity to observe or measure them. Two-stage prediction studies, in which the probability of selection is estimated first and the outcome is estimated second, can help reduce these ambiguities in research results. However, few studies in the existing violence prediction and classification literature control for potential selection biases in this way.
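The variance problem described above can be illustrated with a small simulation. The sketch below is not drawn from any study: the slope, the selection cutoff, the sample size, and the use of a continuous outcome score are all hypothetical assumptions chosen only to show the effect. It generates a risk variable that is genuinely related to the outcome in the full population, then keeps only the cases above a cutoff on that variable, as a selective intake process might, and compares the two correlations. It illustrates restricted variance only and does not implement a two-stage selection model.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n=20000, slope=0.6, cutoff=1.0, seed=0):
    """Return (r_full, r_selected) for a hypothetical risk factor.

    All parameter values are illustrative assumptions, not estimates
    from any study.
    """
    rng = random.Random(seed)
    risk = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Hypothetical continuous outcome score: linear in risk plus noise.
    outcome = [slope * x + rng.gauss(0.0, 1.0) for x in risk]
    # Selection: only cases above the cutoff enter the study population,
    # which restricts the variance of the risk variable in that sample.
    selected = [(x, y) for x, y in zip(risk, outcome) if x > cutoff]
    sel_x = [x for x, _ in selected]
    sel_y = [y for _, y in selected]
    return pearson(risk, outcome), pearson(sel_x, sel_y)


r_full, r_selected = simulate()
print(f"full population r = {r_full:.2f}")
print(f"selected sample r = {r_selected:.2f}")
```

Under these assumptions the correlation in the selected sample is noticeably weaker than in the full population, even though the underlying relationship and the measurement are identical in both. A researcher who saw only the selected sample could easily conclude that the variable matters little.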
The problem of classification and prediction instruments not transporting well from one setting to another is compounded by researchers' well-intentioned efforts to maximize prediction accuracy within a construction sample. Stepwise computing algorithms and test-retest procedures applied to the same set of data will typically produce predictions that maximize sample-specific correlations; such predictions are prone to shrinkage (a reduction in prediction accuracy) when used in a second sample that does not