The concern for fairness is reflected in the procedures used to develop assessment tasks, in the content and language of the assessment tasks, in the processes by which students are assessed, and in the analyses of assessment results.
LARGE-SCALE ASSESSMENTS MUST USE STATISTICAL TECHNIQUES TO IDENTIFY POTENTIAL BIAS AMONG SUBGROUPS. Statistical techniques require that students of both sexes and of different racial and ethnic backgrounds be included in the development of large-scale assessments. Bias can be determined with some certainty through the combination of statistical evidence and expert judgment. For instance, if an exercise that uses a flywheel to assess understanding of inertia results in differential performance between females and males, a judgment that the exercise is biased might be plausible, based on the assumption that males and females have different experiences with flywheels.
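The text does not name a particular statistical technique. One commonly used option is the Mantel-Haenszel procedure for differential item functioning, sketched below as an illustration only. It compares how often two subgroups answer a single exercise correctly after matching students on their total score. The function name, the group labels "reference" and "focal", and the sample counts are assumptions made for this sketch and do not come from the source.

```python
from collections import defaultdict


def mantel_haenszel_odds_ratio(records):
    """Estimate the Mantel-Haenszel common odds ratio for one exercise.

    records: iterable of (group, total_score, correct) tuples, where
    group is "reference" or "focal" (hypothetical labels), total_score
    is the matching variable, and correct is True or False.
    Returns None when the ratio is undefined.
    """
    # One 2x2 table per total-score stratum: counts of [wrong, right].
    strata = defaultdict(lambda: {"reference": [0, 0], "focal": [0, 0]})
    for group, total_score, correct in records:
        strata[total_score][group][1 if correct else 0] += 1

    numerator = 0.0
    denominator = 0.0
    for table in strata.values():
        ref_wrong, ref_right = table["reference"]
        foc_wrong, foc_right = table["focal"]
        n = ref_wrong + ref_right + foc_wrong + foc_right
        if n == 0:
            continue
        numerator += ref_right * foc_wrong / n
        denominator += ref_wrong * foc_right / n

    if denominator == 0:
        return None
    return numerator / denominator


def make_records(group, total_score, n_right, n_wrong):
    """Build illustrative records from counts (made-up data)."""
    return ([(group, total_score, True)] * n_right
            + [(group, total_score, False)] * n_wrong)


if __name__ == "__main__":
    # Made-up counts for a single exercise, matched on two score levels.
    sample = (make_records("reference", 1, 10, 10)
              + make_records("focal", 1, 5, 15)
              + make_records("reference", 2, 16, 4)
              + make_records("focal", 2, 12, 8))
    print(mantel_haenszel_odds_ratio(sample))
```

A ratio near 1 suggests that matched students in the two groups perform similarly on the exercise; a ratio well above or below 1 flags the exercise for review. Consistent with the passage above, such a flag is only statistical evidence, and expert judgment is still needed to decide whether the exercise is actually biased.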
ASSESSMENT TASKS MUST BE MODIFIED APPROPRIATELY TO ACCOMMODATE THE NEEDS OF STUDENTS WITH PHYSICAL DISABILITIES, LEARNING DISABILITIES, OR LIMITED ENGLISH PROFICIENCY. Whether assessments are large scale or teacher conducted, the principle of fairness requires that data-collection methods allow students with physical disabilities, learning disabilities, or limited English proficiency to demonstrate the full extent of their science knowledge and skills.
ASSESSMENT TASKS MUST BE SET IN A VARIETY OF CONTEXTS, BE ENGAGING TO STUDENTS WITH DIFFERENT INTERESTS AND EXPERIENCES, AND MUST NOT ASSUME THE PERSPECTIVE OR EXPERIENCE OF A PARTICULAR GENDER, RACIAL, OR ETHNIC GROUP. The requirement that assessment exercises be authentic and thus in context increases the likelihood that all tasks have some degree of bias for some population of students. Some contexts will have more appeal to males and others to females. If, however, assessments employ a variety of tasks, the collection will be "equally unfair" to all. This is one way in which the deleterious effects of bias can be avoided.
The inferences made from assessments about student achievement and opportunity to learn must be sound.
When making inferences from assessment data about student achievement and opportunity to learn science, explicit reference needs to be made to the assumptions on which the inferences are based.
Even when assessments are well planned and the quality of the resulting data is high, interpretations of the empirical evidence can lead to quite different conclusions. Making inferences involves looking at empirical data through the lenses of theory, personal beliefs, and personal experience. Making objective inferences is extremely difficult, partly because individuals are not always aware of their assumptions. Consequently, confidence in the validity of inferences requires explicit reference to the assumptions on which those inferences are based.
For example, if the science achievement on a large-scale assessment of a sample of students from a certain population is high, several conclusions are possible. Students