many readers may want to focus primarily on identifying the questions they need to ask about assessments under consideration and understanding the concepts well enough to appreciate the responses, rather than on a deep understanding of the statistical processes that determine how those questions can be answered.
Before an assessment instrument or test is used to make decisions about children, it is necessary to have evidence showing that the assessment does what it claims to do, namely, that it accurately measures a characteristic or construct (or “outcome,” as we refer to it in this report). The evidence gathered to support the use of an assessment is referred to as validity evidence. Generally, when one asks the question “Is the assessment doing what it is supposed to do?” one is asking for validity evidence. A special kind of validity evidence relates to the consistency of the assessment, whether over repeated assessment or over different versions or forms of the assessment. This is termed reliability evidence.
This chapter reviews the history and logic of validity and reliability evidence, especially as it pertains to infants and young children. Two points are important to note. First, when judging validity or reliability, one is judging a weight of evidence. Hence, one does not say that an assessment is “valid” or is “reliable”; instead, one uses an accumulation of evidence of diverse kinds to judge whether the assessment is suitable for the task for which it is intended. Second, evidence mustered for validity or reliability pertains to specific types of uses (i.e., types of decisions). Some forms of evidence inform a wider range of types of decisions than others. Nonetheless, one should always consider evidence as pertaining to a specific set of decisions.
The field of assessment of human behavior and development is an evolving one and has undergone many changes in the last half-century. Some changes are the result of developments in the field itself; others are responses to the social and political context