more confidence in assigning ratings (although not necessarily greater accuracy) in the face of almost complete uncertainty.
Although the number of individuals in each group was small, the two pilot tests suggest that implementing the model will require two things: very clear and careful descriptions of the criteria, and several rounds of voting and discussion (conducted in conference or by other methods) to establish criterion weights. Some criteria, such as prevalence, are familiar to many people but are used in this model in specific ways, particularly when referring to procedures and screening technologies. Other criteria, such as burden of illness, are unfamiliar and require a clear definition to ensure that group members use them comparably.
The committee drew several conclusions from its pilot tests. First, the model is feasible, but those implementing it will need to establish a method (e.g., a training session or other form of education) to ensure a common understanding of the criteria. Second, there is considerable merit to using a two-stage group method that first anchors the ends of a given subjective criterion for a given candidate list and then assigns scores within these extremes. Third, it will be critical to establish the reliability of the criterion weighting process to ensure that the process is informed, stable, and efficient. Fourth, the model should be modified on the basis of use and experience. Aspects of validity include the reasonableness of the product and its acceptability to and employment by intended users. The committee's pilot test began this process of evaluation and modification, but it must be continued by the model's users.
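The two-stage group method named in the second conclusion can be illustrated with a minimal sketch. It is not the committee's procedure. The function name, the 1-to-5 scale, plurality voting for the anchors, averaging of member ratings, and clipping of out-of-range ratings are all assumptions made here for illustration.

```python
from collections import Counter
from statistics import mean


def two_stage_scores(anchor_votes, member_ratings, low=1.0, high=5.0):
    """Hypothetical sketch of a two-stage scoring round for one subjective criterion.

    anchor_votes: list of (lowest_candidate, highest_candidate) pairs, one per member.
    member_ratings: dict mapping each remaining candidate to a list of member ratings.
    low, high: scores given to the two anchors (assumed scale, not from the report).
    """
    # Stage 1: the group anchors the ends of the scale by plurality vote.
    lowest = Counter(vote[0] for vote in anchor_votes).most_common(1)[0][0]
    highest = Counter(vote[1] for vote in anchor_votes).most_common(1)[0][0]
    if lowest == highest:
        raise ValueError("the same candidate cannot anchor both ends")

    scores = {lowest: low, highest: high}

    # Stage 2: remaining candidates are scored within the anchored extremes.
    # Clipping and averaging are illustrative choices only.
    for candidate, ratings in member_ratings.items():
        if candidate in scores:
            continue  # anchors keep their fixed scores
        clipped = [min(max(r, low), high) for r in ratings]
        scores[candidate] = mean(clipped)
    return scores


example_scores = two_stage_scores(
    anchor_votes=[("A", "C"), ("A", "C"), ("B", "C")],
    member_ratings={"B": [2, 3, 4], "D": [4, 5, 7]},
)
```

In this made-up example, candidates A and C become the anchors and B and D receive averaged scores between them. A real implementation would likely replace the single averaging pass with the repeated rounds of voting and discussion described above.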