Reliability refers to the method's trustworthiness, as indicated by stability over time or among different groups, or by consistency in application by different researchers or in different contexts. Theoretically, reliability quantifies the degree to which a constructed test overlaps a perfect measure of the characteristic of interest (Golden et al., 1984).
Measures available for estimating the reliability of classification or prediction research include internal consistency, interrater reliability, and test-retest reliability.
Internal Consistency
Measuring internal consistency is distinct from the consistency check, mentioned above, on the validity of responses to logically equivalent or reversed items. Coefficient alpha (Cronbach, 1951), a common measure of internal consistency, is based on the average correlation among items purporting to measure the same theoretical construct. An alpha coefficient exceeding .80 is generally deemed to show that a measurement device is internally consistent. The Kuder-Richardson 20 formula (KR20) (Kuder and Richardson, 1937) is a similar statistical measure designed for dichotomous items.4
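As an illustration of how these two coefficients are usually computed, the following is a minimal sketch rather than a procedure taken from the sources cited above. It assumes the common variance-based form of coefficient alpha, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), and the matching KR20 form in which each item variance is p(1 - p). The function names, the data layout (one row per respondent, one column per item), and the use of population variance are assumptions made for the example.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Coefficient alpha for a list of respondents, each a list of k item scores.

    Assumes k >= 2 and that total scores are not all identical.
    """
    k = len(scores[0])
    items = list(zip(*scores))  # transpose: one tuple of scores per item
    item_var_sum = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


def kr20(scores):
    """KR20 for dichotomous items coded 0/1 (same data layout as above).

    Uses population variance for the total score so that it is
    consistent with the p * (1 - p) item variances.
    """
    k = len(scores[0])
    n = len(scores)
    pq_sum = 0.0
    for col in zip(*scores):
        p = sum(col) / n  # proportion of respondents scoring 1 on the item
        pq_sum += p * (1 - p)
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - pq_sum / total_var)


# Hypothetical data: five respondents answering four yes/no items.
example_binary = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
]
example_alpha = cronbach_alpha(example_binary)
example_kr20 = kr20(example_binary)
```

Under these assumptions the two functions give the same value for 0/1 data, which reflects the sense in which KR20 is the dichotomous-item counterpart of alpha. The resulting coefficient would then be compared against the .80 standard mentioned above.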
Interrater Reliability
Interrater reliability is ascertained by having multiple researchers or examiners score or code the identical set of raw data or the same observed behavior. The correlations among corresponding items are then calculated, with a typical standard of acceptance being average correlations exceeding .80 and preferably above .90.
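The calculation described above can be sketched as follows. This is one plausible reading, not a prescribed procedure: it assumes each rater scores the same cases in the same order, uses the Pearson correlation for each pair of raters, and averages across pairs. The function names and the dictionary layout are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length score lists.

    Assumes neither list is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mean_interrater_correlation(ratings):
    """Average Pearson correlation over all pairs of raters.

    ratings: dict mapping a rater label to that rater's scores for the
    same cases, listed in the same order. Requires at least two raters.
    """
    pairs = list(combinations(ratings.values(), 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def meets_standard(ratings, threshold=0.80):
    """Compare the average correlation with the .80 standard from the text."""
    return mean_interrater_correlation(ratings) > threshold


# Hypothetical scores from three raters coding the same five cases.
example_ratings = {
    "rater_a": [1, 2, 3, 4, 5],
    "rater_b": [2, 3, 4, 5, 6],
    "rater_c": [1, 3, 2, 5, 4],
}
example_mean_r = mean_interrater_correlation(example_ratings)
```

Note that a correlation measures whether raters order the cases similarly, not whether they assign identical scores: in the hypothetical data, rater_b scores every case one point higher than rater_a, yet the two correlate perfectly.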
Test-Retest Reliability
Test-retest reliability is determined by asking respondents the same questions twice in a single administration of a questionnaire or on two different occasions separated by days or weeks, and calculating correlations between corresponding items.5
Accuracy refers to the discriminating power of the method: the magnitude of the distinctions among subgroups or the proportion of a subgroup actually displaying the outcome predicted for them. A prediction equation is valid if there is some statistically significant correlation between the predictors and the actual outcomes