EDUCATING CHILDREN WITH AUTISM

15 Methodological Issues in Research on Educational Interventions

Research on educational interventions for young children with autism should inform consumers, policy makers, and scientists about practices that produce positive outcomes for children and families. Ultimately, such research should be able to demonstrate that there is a causal relationship between an educational intervention and immediate or long-term changes that occur in development, behavior, social relationships, and normative life circumstances. A primary goal of early intervention research is to determine the types of practices that are most effective for children with specific characteristics (Guralnick, 1997).

If young children with autistic spectrum disorders were homogeneous in intelligence, behavior, and family circumstances, and if researchers and educators could apply a uniform amount of treatment in nearly identical settings and life circumstances, then a standard, randomized-group, clinical-trial research design could be employed to provide unequivocal answers to questions about treatments and outcomes. However, the characteristics of young children with autistic spectrum disorders and their life circumstances are exceedingly heterogeneous in nature. This heterogeneity creates substantial problems when scientists attempt to use standard research methodology to address questions about the effectiveness of educational treatments for young children with autistic spectrum disorders.

In this chapter we examine a range of issues related to research designs and methodologies. We begin by discussing the different research literatures that could inform early intervention research but which currently are relatively independent. We then consider a range of methodological issues pertaining to research involving children with autistic spectrum disorders, including information useful for describing samples; the benefits and practical problems of using randomized, clinical trial research design and the movement toward treatment comparison and aptitude-by-treatment interactions; the relative benefits and limitations of single-subject research methodology; assessing fidelity of treatment; potential use of current methodologies for modeling developmental growth of children and factors affecting growth; and group size.

SEPARATE LITERATURES

There are several distinct, substantial, and independent bodies of research addressing issues concerning young children with autistic spectrum disorders. One basic body of literature describes and attempts to explain the neurological (Minshew et al., 1997), behavioral (Sigman and Ruskin, 1999), and developmental (Wetherby and Prutting, 1984) characteristics of children with autistic spectrum disorders. A second body of research has addressed issues related to diagnosis, particularly early diagnosis, of autism (Lord, 1997) and the related issue of prevalence (Fombonne, 1999). A third body of literature has examined the effects of comprehensive treatment programs on the immediate and long-term outcomes for young children with autistic spectrum disorders and their families (e.g., Harris et al., 1991; McEachin et al., 1993; Rogers and DiLalla, 1991; Strain and Hoyson, 2000). A fourth body of research has addressed individual instructional or intervention approaches that focus on specific aspects of a child's behavior, such as social skills (McConnell, 1999), language and communication (Goldstein, 1999), or problem behavior (Horner et al., 2000). These four bodies of literature have different primary purposes (and research questions), conceptual and theoretical frames of reference, and research methodologies.
However, these research literatures all have the potential of informing the design, content, and evaluation of intervention procedures.

Similarly, funding for autism intervention and educational research has also come from a number of federal institutes with separate but overlapping missions. These include the Office of Special Education Programs (OSEP) in the U.S. Department of Education and the National Institute of Child Health and Human Development (NICHD), National Institute of Mental Health (NIMH), National Institute of Neurological Disorders and Stroke (NINDS), and National Institute on Deafness and Other Communication Disorders (NIDCD) in the U.S. Department of Health and Human Services. More recently, parent-initiated, nonprofit agencies such as the Autism Society of America Foundation, Cure Autism Now (CAN), and the National Alliance for Autism Research (NAAR) have had an increasing role in supporting and instigating research.

Although several of these literatures appear to be internally well integrated, there is remarkably little integration across literatures. For example, the information from the literature describing characteristics of children with autistic spectrum disorders is often not linked to treatment programs. Likewise, the developmental literature, which is descriptive in nature, has only rarely been integrated into individual intervention practice research, which tends to be behaviorally oriented (see Lifter et al., 1993, for a notable exception). Similarly, research that emphasizes the relationships among behaviors in response to treatment has been much more rare than descriptive studies of development in multiple domains (Wolery and Garfinkle, 2000).

Integration of the collective body of knowledge represented in these four literatures is important and could inform practice. It would be productive for leaders from these four research traditions to communicate regularly around the common issue of educational interventions for young children with autistic spectrum disorders. This communication could foster the research integration that appears to be missing from the literature. Communication could be enhanced by a series of meetings that bring together researchers and agencies who sponsor research, focusing on the task of reporting implications for designing programs for young children with autistic spectrum disorders.

EARLY SCREENING AND DIAGNOSIS

One assumption in early intervention research is that treatment should begin as soon as possible. However, to accomplish this, children must be identified. Early diagnosis has important implications for treatment, since different interventions would be appropriate for very young children (e.g., 15 months of age) than for children 2 or 3 years old.

There is a difference between screening and diagnosis. Screening, as understood in the United States, may mean two things.
One is a process carried out by a primary care provider to decide whether a referral for more services is warranted: for example, a pediatrician, told by parents that their 18-month-old child has poor eye contact and has stopped speaking within the last month, must decide whether and where to refer the child for further assessment. A second type of screening is a public health process by which health care providers routinely assess for risk for autistic spectrum disorders in children whose parents have not necessarily raised concerns.

Diagnosis is a much more comprehensive process carried out by a specialized team of professionals. For autistic spectrum disorders, diagnosis involves not only identifying the disorder and any other developmental and behavioral disorders associated with it, but also helping parents to understand the meaning of the diagnostic terms and what the parents can do to help their children. (Issues relating to diagnosis are discussed in detail in Chapter 2.)

In the early 1990s, the Checklist for Autism in Toddlers (CHAT) was developed as a creative, theoretically based attempt at a public health screening instrument (Baron-Cohen et al., 1992). With follow-up, however, it appeared that the sensitivity of the CHAT in identifying autism in nonreferred children was far too low to be considered an appropriate screening tool (Baird et al., 2000). Nevertheless, the instrument has made a significant contribution as a first step in this area. The techniques described in the CHAT may also be helpful in providing a primary health care professional with some behaviors on which to focus during screening (e.g., eye contact, pretending). Pilot data from a modification of this instrument, the M-CHAT, are in press.

Other screening tools, such as the Pervasive Developmental Disorders Screening Test (PDDST; Siegel, 1998) and the Screening Tool for Autism in Two Year Olds (STAT; Stone, 1998), are used to determine whether further diagnostic assessments are merited after a concern has arisen. Each of these instruments has promise: an initial empirical evaluation of the STAT has just been published (Stone et al., 2000); an evaluation of the PDDST is not yet available. The Autism Screening Questionnaire (ASQ; Berument et al., 1999) was developed for screening research participants 4 years of age and older. It has not yet been tested with younger children or with families who have not already received a diagnosis of autistic spectrum disorder. Chapter 2 provides more information about screening, as do the interdisciplinary practice parameter guidelines described by Filipek and colleagues (2000). An adequate screening instrument is not currently available either for public health screening or for a brief assessment when a concern arises.
Addressing this need is a high priority for researchers. It involves determining how specifically the features of autistic spectrum disorders can be defined in toddlers and contrasting the benefits of this approach with more general identification of risk status.

Research in diagnosis is at a quite different stage. Well-standardized and documented diagnostic instruments have been available for years. These include the Childhood Autism Rating Scale (CARS; Schopler et al., 1988), the Autism Diagnostic Interview-Revised (ADI-R; Lord et al., 1993), and the Autism Diagnostic Observation Schedule (ADOS; Lord et al., 2000). Although there are many ways that these instruments could be improved, their ability to document autism in a reliable and standardized way has been demonstrated. There are also numerous other instruments, including the Autism Behavior Checklist (Krug et al., 1980) and the Gilliam Autism Rating Scale (Gilliam, 1995), about which there are more questions regarding the degree to which their scores reflect accurate diagnosis.

Difficulties also remain for the most well-standardized instruments. While the CARS has been repeatedly shown to produce autism categorizations much like diagnoses, the items on the scale no longer reflect current diagnostic criteria. The ADI-R and the ADOS produce operational categories that fit with current conceptualizations of autism, but they require training and are intended to be used by experienced clinicians. The ADI-R is also quite lengthy, taking about 2 hours to administer. Standardization samples for both instruments are small, though replications of their diagnostic categorizations have been good (Yirmiya et al., 1994; Tanguay, 1998). Neither provides adequate discrimination between autism and other autistic spectrum disorders, though the ADOS makes a first attempt to do so. Thus, these instruments are important in providing standards for research, but their contributions to educational practice will require training of specialists (both in and outside educational systems) and perhaps modification of the instruments.

DESCRIPTION OF PARTICIPANTS IN STUDIES

To interpret the results of early intervention research and to conduct some of the sophisticated analyses described below, it is important to understand the characteristics of the participants in the studies. As mentioned above, heterogeneity in child characteristics is nearly as much a defining feature of autistic spectrum disorders as are the DSM-IV criteria. Children with the same diagnosis of autistic spectrum disorders, gender, chronological age, and IQ score may well have a range of other different characteristics (e.g., problem behaviors, communication skills, play skills) and may respond differently to intervention treatments. In most research on comprehensive intervention programs using group designs, a limited amount of information is provided about the children participating in the study.
Individual intervention practices research often uses a single-subject design; anecdotal descriptions of participants' behaviors are sometimes provided in addition to demographic information, but such descriptions do not follow a standard format. These limitations are reflected in the small proportion of studies that meet the highest standards for research in internal or external validity, as shown in Figures 1-1 and 1-2 (in Chapter 1), and the greater but still variable proportion that meet the second level of criteria in these areas.

Vaguely described samples pose a problem for both group and single-subject designs. One problem is related to internal validity of the study (i.e., the degree to which a researcher can rule out alternative hypotheses that account for treatment outcomes [Campbell and Stanley, 1963]). Unless specific information about participants is provided, it is impossible to know to whom the results of the study apply. For group design research, there are additional problems. When random assignment to treatment groups occurs, the assumption is that the groups will be equivalent. However, with a relatively small sample size, which is the case for most studies of intervention effectiveness, it is essential for the researcher to confirm that participants in different groups are equivalent on major variables that might affect outcome. If participants are vaguely described, then there is limited information about the equivalence of comparison groups.

The recruitment, selection, and attrition of participants are also important issues. Standards and expectations for reporting how potential research participants were identified and persuaded to participate, how they were selected from the pool of potential participants, and how many participants completed the study have been very different within different disciplines (e.g., experimental psychology and epidemiology) and different perspectives (e.g., developmental and behavioral). With increasing attempts to integrate perspectives (see Filipek et al., 2000) to produce practical guidelines or meta-analyses, this information becomes crucial. For example, it is much more difficult to interpret results of a meta-analysis of success rates when a potentially large number of participants proposed for the research may have not been selected because they were deemed likely to be poor responders to an intervention, and another significant proportion of participants may not have completed their course of treatment. If samples are to be combined, and if interpretations are going to span fields, then there will be a need for more information about these processes.

Researchers are often interested in the interactions between child or family characteristics and treatment, sometimes referred to as aptitude-by-treatment interactions. Such analyses allow researchers to determine if the intervention was more effective for participants with certain characteristics.
For example, one type of comprehensive treatment program might produce more positive outcomes for children who communicate verbally than for children who are nonverbal. The analysis requires that a reliable measure of the child characteristic or "aptitude" variable be collected. Vague participant descriptions could preclude the possibility of such analyses.

General, nonstandard participant descriptions also affect the external validity of studies (i.e., the degree to which the findings of a study can be generalized to other individuals not in the study [Campbell and Stanley, 1963]). To interpret for whom an individual intervention procedure or comprehensive intervention program might be effective, one has to have a clear understanding of who participated in the study. Both single-subject and group studies build their evidence for external validity on study replications. To compare the findings of different studies, researchers must be able to determine that children with similar characteristics participated in the study.

In many studies of children with autistic spectrum disorders, descriptions of the families' characteristics are either limited or absent. Family and community characteristics represent potential risk and opportunity variables (Garbarino and Ganzel, 2000); yet, there has been very limited research on the effects of such family and community variables on outcomes for children with autistic spectrum disorders (Wolery and Garfinkle, 2000). For example, it is possible that a young child with autism who lives in a single-parent family and low-income neighborhood will respond differently to treatment than a child with autism from a two-parent family living in a middle-class neighborhood. In order to investigate the effect of family and community characteristics on treatment outcomes, it is necessary to provide descriptive information about families of children who participate in intervention research.

In order to further knowledge of the effects of interventions, it is critical that researchers develop and use standard procedures for describing the characteristics of participants in their studies and of their families. In addition to the information that is routinely provided (e.g., standardized diagnosis, chronological age, gender, IQ), standard information should include measures of adaptive behavior, communication, social skills, school placement, and race. Also, information about the family should include number of parents living in the family, parents' education levels, and socioeconomic status. Although some recent studies have begun providing such information, this has not been the norm for the field.

METHODOLOGICAL ISSUES

To examine effectiveness of comprehensive early intervention programs and individual intervention practices for children with autistic spectrum disorders, standards must be established for determining the causal relationship between the treatment procedures and the identified outcomes.
The various experimental methodologies employed reflect the different literatures noted earlier. Studies documenting the effects of comprehensive treatment programs have employed experimental group designs, while those documenting individual practices have primarily employed single-subject designs, often replicated across several subjects.

Randomized Clinical Trials

The most rigorous approach for experimental group research design is the randomized clinical trial. In this design, study participants are randomly assigned, if possible by someone not associated with the program or knowledgeable about the participants' characteristics, to a treatment group that receives the educational intervention or to a comparison group that receives no educational intervention or a different form of intervention (Kasari, 2000). Measurement of potential treatment effects (e.g., developmental assessments, family measures) occurs before the educational intervention begins and again at the end of the intervention; the measurement is blind to which group a participant has been assigned to. Assuming that the groups are equivalent on the pretest measures, differences at the end of the intervention are attributed to the treatment. As noted above, the purpose of random assignment is to control for or reduce the likelihood that confounding variables (e.g., very determined parents requesting a particular treatment) would account for differences in outcomes for the treatment and contrast groups.

Reviews of the literature to date (Rogers, 1998) and individual papers prepared for this committee (Kasari, 2000; Wolery and Garfinkle, 2000) show that the randomized clinical trial model has only rarely been used to determine treatment outcomes (see Jocelyn et al. and Smith et al. for exceptions). Other studies have attempted to address the research question of treatment effectiveness by employing quasi-experimental designs (Cook and Campbell, 1979) in which nonrandomized control or contrast groups are used as a basis for gauging treatment effects (Fenske et al., 1985). Another approach has been to use single group designs in which the changes in children's development while they are in the program are compared with children's rates of development before they entered the program, or to the rate of development of typically developing children (Harris et al., 1991; Hoyson et al., 1984). These designs, while providing some information about treatment outcomes, may not control for important confounding variables, such as subject selection and nonspecific or placebo effects (see Campbell and Stanley's classic paper on group experimental methodology).
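
The random assignment and pretest equivalence check described for the randomized clinical trial can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a procedure taken from any of the cited studies: the participant identifiers, pretest scores, seed, and function names are all hypothetical, and the standardized difference is only one of several ways a researcher might gauge pretest equivalence.

```python
import random
from statistics import mean, stdev

def randomly_assign(participant_ids, seed):
    """Shuffle participants with a seeded generator and split them into two groups."""
    rng = random.Random(seed)  # in practice the seed would be held by someone outside the program
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def standardized_difference(scores_a, scores_b):
    """Difference in pretest means divided by the pooled standard deviation."""
    n_a, n_b = len(scores_a), len(scores_b)
    pooled_var = ((n_a - 1) * stdev(scores_a) ** 2 +
                  (n_b - 1) * stdev(scores_b) ** 2) / (n_a + n_b - 2)
    return (mean(scores_a) - mean(scores_b)) / pooled_var ** 0.5

# Hypothetical pretest scores, for illustration only (not data from any study).
pretest = {"c01": 52, "c02": 61, "c03": 47, "c04": 70, "c05": 58,
           "c06": 66, "c07": 49, "c08": 73, "c09": 55, "c10": 63}

treatment, comparison = randomly_assign(pretest, seed=2001)
imbalance = standardized_difference([pretest[c] for c in treatment],
                                    [pretest[c] for c in comparison])
print("treatment:", treatment)
print("comparison:", comparison)
print("standardized pretest difference:", round(imbalance, 2))
```

With a sample this small, the printed pretest difference can be far from zero even though the assignment was random, which is why equivalence on pretest measures has to be checked rather than assumed.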

For programs providing treatment to young children with autistic spectrum disorders and their families, random assignment is often a difficult procedure. By its very nature, it requires that some children and families be assigned to an alternative treatment condition. Unless two treatments of equal potential value can be compared, such assignment creates the ethical issue of not providing the most promising treatment to children who might benefit. An argument is sometimes made (as it often is in medical treatment studies) that until a treatment is supported by a randomized clinical trial, the evidence for effectiveness of the treatment does not exist. In addition, when children are randomly assigned to two different treatment conditions, a researcher still must closely assess the experiences of the child and family, because families may seek and obtain services for their children outside of the treatment study. Ideally, children and families could be assigned to equally attractive alternative treatments, so that the research question changes from one of single treatment effectiveness to treatment comparison. However, this approach would require the availability of two different and equally strong programs, usually within the same geographic area, and the willingness of the programs and parents to participate. This situation does not often occur.

Another issue related to random assignment is the heterogeneity of the population of children with autistic spectrum disorders. Most treatment studies, because of the prevalence of autistic spectrum disorders and the expense and labor intensity of treatment, will have small sample sizes. Random assignment within a relatively small, heterogeneous sample does not ensure equivalent groups, so a researcher may match children on relevant characteristics (e.g., IQ score, age) and then select from the matched sets to randomly assign children to control and treatment groups. As noted above, such stratification of the sample of participants requires a thorough description of the participants as well as confidence that the variable(s) on which children are matched are of greatest significance.

An issue related to the size and heterogeneity of groups in the randomized clinical trial approach is statistical power (Cohen, 1988). Groups have to be large enough to detect a significant difference in treatment outcomes when it occurs. The smaller the size of the group, the larger the difference in treatment outcomes has to be in order to show a statistically significant effect. Also, variability on pretest measures, as may occur with heterogeneous samples, sometimes obscures treatment differences if the sample size is not sufficiently large. Because the number of children with autistic spectrum disorders enrolled in particular treatment programs often is not large, sample size and within-group variability are challenges to the use of randomized clinical control methodology for determining the effectiveness of educational interventions for those children.
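
The trade-off between group size and detectable difference can be made concrete with a rough calculation. The sketch below is an illustration under stated assumptions rather than a method from the chapter: it uses the common normal approximation for comparing two group means, expresses the treatment difference as a standardized effect size (difference in means divided by the within-group standard deviation), and assumes a two-sided test at alpha = 0.05 with 80 percent power. The function names are hypothetical, and an exact t-based calculation would give slightly larger group sizes.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate number of children needed in each of two groups."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for a two-sided test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)

def smallest_detectable_effect(n, alpha=0.05, power=0.80):
    """Approximate standardized difference detectable with n children per group."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * sqrt(2 / n)

for d in (0.5, 0.8):
    print(f"effect size {d}: about {n_per_group(d)} children per group")
for n in (10, 20, 40):
    print(f"{n} per group: detectable effect about {smallest_detectable_effect(n):.2f}")
```

Under these assumptions, a group of about 10 children can reliably detect only a very large standardized difference (greater than one standard deviation), which illustrates why small, variable samples are a challenge for this methodology.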

Single-Subject Designs

In contrast to group experimental designs, single-subject design methodology uses a smaller number of subjects and establishes the causal relationship between treatment and outcomes by a series of intrasubject or intersubject replications of treatment effects (Kazdin, 1982). The two most frequently used methods are the withdrawal-of-treatment design and the multiple baseline design.

In the withdrawal-of-treatment design, a baseline level of performance (e.g., frequency of stereotypic behavior or social interactions) is established over a series of sessions, and a treatment is applied in a second phase of the study. When reliable changes in the outcome variable occur, the treatment is withdrawn in the third phase of the study, and concomitant changes in the outcome variable are examined. Often, the treatment is reinstated in a fourth phase of the study, with changes in the outcome variable expected. Changes in the outcome variable (e.g., increases in desired behavior or decreases in undesirable behavior) that reliably occur when the treatment is implemented and withdrawn indicate a functional (i.e., causal) relationship between the treatment and outcome variables (Barlow and Hersen, 1984). This design is usually replicated with at least two or three participants.

In a multiple baseline design, three (or more) participants may be involved. Data are collected for all participants in an initial baseline phase, and then the treatment is begun with one participant while the others remain in the baseline phase of the study. When changes occur for the first participant, the treatment is introduced for the second participant, and when changes occur for the second participant, the treatment is introduced for the third participant. Variations on this design include multiple baselines across behaviors of single individuals and multiple baselines across settings. Again, the researcher infers a functional relationship when changes reliably occur only after the treatment is implemented across (usually three) participants, settings, or behaviors.

Single-subject designs differ from group designs in three ways. First, changes in the outcome variables are measured frequently (e.g., daily, weekly) rather than at the beginning and end of the treatment. Second, visual analysis of differences in trends in the data (e.g., increases in social interaction or decreases in stereotypic behavior) is usually used to determine the effectiveness of treatment, rather than statistical analyses between groups. Third, unlike group designs, in which the treatments often represent a range of theoretical perspectives, treatments evaluated through single-subject designs tend to follow an applied behavior analysis theoretical orientation (Kazdin, 1982).
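
The logic of the multiple baseline design across participants can be sketched with a small amount of code. The example below is only an illustration: the session counts, start points, tolerance, and function names are hypothetical, and the phase means are a crude numerical stand-in for the visual analysis of trends that single-subject researchers usually rely on.

```python
from statistics import mean

# Hypothetical counts of social interactions per session, for illustration only.
# "start" is the index of the first treatment session; starts are staggered.
participants = {
    "P1": {"start": 4,  "scores": [2, 3, 2, 3, 7, 8, 9, 8, 9, 9, 8, 9]},
    "P2": {"start": 7,  "scores": [1, 2, 2, 1, 2, 1, 2, 6, 7, 8, 8, 7]},
    "P3": {"start": 10, "scores": [3, 2, 3, 3, 2, 3, 3, 2, 3, 3, 8, 9]},
}

def phase_means(scores, start):
    """Return (baseline mean, treatment mean) for one participant."""
    return mean(scores[:start]), mean(scores[start:])

def summarize(data):
    """Baseline mean, treatment mean, and gain for every participant."""
    summary = {}
    for name, record in data.items():
        baseline, treated = phase_means(record["scores"], record["start"])
        summary[name] = {"baseline": baseline, "treatment": treated,
                         "gain": treated - baseline}
    return summary

def baselines_stable_until_own_start(data, tolerance):
    """Check that no one's baseline rises while other participants are already treated."""
    earliest = min(record["start"] for record in data.values())
    for record in data.values():
        early = record["scores"][:earliest]
        late = record["scores"][earliest:record["start"]]
        if late and mean(late) - mean(early) > tolerance:
            return False
    return True

for name, row in summarize(participants).items():
    print(name, {key: round(value, 2) for key, value in row.items()})
print("baselines stable:", baselines_stable_until_own_start(participants, tolerance=1.0))
```

A functional relationship is suggested only when both conditions hold: each participant's level changes after that participant's own treatment start, and the participants still in baseline do not change when treatment begins for someone else.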

There are methodological problems and limitations when single-subject designs are applied to studying children with autistic spectrum disorders. The most obvious is that only a small number of children are involved in any single study, so the applicability of findings of a single study to other children is limited. Single-subject designs build their external validity on systematic replications across studies (Tawney and Gast, 1983). One set of current standards (Lonigan et al., 1998) stipulates that nine replications of studies with good experimental designs and treatment comparisons should be required for effectiveness of an intervention to be "well-established," while three replications of studies with the acceptable methodological characteristics are necessary for an intervention to be identified as "probably efficacious." These are arbitrary, though useful, designations.

The issue of inter- and intrasubject variability also exists for this methodology. Single-subject designs require that some level of stability in the participants' performance be reached before another phase is implemented, and variability in participants' behavior, as occurs for children with autistic spectrum disorders, may obscure comparisons across phases.
As noted above, the characteristics of the participants must be described explicitly in single-subject methodology, and variability in the characteristics of children with autistic spectrum disorders could result in children with very different characteristics participating in the same study. Such variability could contribute to the limitations of the external validity of a study.

Two key issues in single-subject methodology relate to generalization and maintenance of treatment effects. In this context, generalization refers to the occurrence of desired treatment outcomes outside of the treatment settings and with individuals who were not involved in the treatment. Maintenance refers to the continued performance of the behaviors or skills acquired in treatment after the treatment has ended. Reviews of the literature suggest that evidence for generalization and maintenance is weak for some single-subject treatments or that these outcomes have not routinely been assessed (Horner et al., 2000; McConnell, 1999). It should be emphasized that the issues of maintenance and generalization are not unique to single-subject research. Group design studies of comprehensive intervention programs have not often used measures of generalization and maintenance; the notable exceptions are the studies that have examined long-term follow-up of participants in comprehensive treatment programs (e.g., Harris and Handleman, 2000; McEachin et al., 1993; Strain and Hoyson, 2000). As shown in Figure 1-3 (in Chapter 1), generalization to natural settings was studied in about 30 percent of reported research concerning social and communication interventions, and not at all in the research reviewed in other areas. Some measurement of generalization and/or maintenance was addressed in an additional 10 to 40 percent of studies, with the greatest frequency in positive behavioral and communication interventions, but there is still much room for improvement.
For research on early interventions for young children with autistic spectrum disorders, assessment of generalization and maintenance should be a standard feature of single-subject and group design studies. Particularly in autism, generalization to new contexts cannot be assumed, though it is the goal of most interventions.

Developmental and Nonspecific Effects

Two other related methodological issues affect both single-subject and pre-post group designs: the effects of development on maturation and the nonspecific, positive effects of participating in an intervention (even if no specific treatment is offered, as in placebo effects). Nonspecific treatment effects may also occur in single-subject designs. Both of these issues are relevant, to different degrees, to many studies in autistic spectrum disorders conducted from a range of theoretical perspectives. For many behaviors, most children with autistic spectrum disorders show gradual improvement, whether or not they receive intervention. For example, some children with autism learn to talk without direct language intervention; many learn to sit, dress themselves, and sort and match items without highly specific interventions. In addition, there are carryover effects of one intervention to another (e.g., teaching appropriate play often decreases repetitious behavior and may increase eye contact). This carryover is a positive factor that is extremely important for children. However, it limits interpretation of designs, such as multiple baselines, that assume that behaviors are independent, and designs such as pre-post testing, which assume that all improvements are due to the treatment specified (and not to carryover from other phenomena, such as a change in parents' behavior).

For children and their families, there are also strong effects of being in a program and feeling that they are receiving treatment, even when there is no "active ingredient" of the intervention. These effects have been repeatedly documented in education, medicine, and psychology in comparisons of open trials with randomized clinical trials; they are also relevant to single-subject designs in which the intervenor is also the principal data collector. "Blindness" to which children and families receive which treatments, and to the characteristics of participants, in at least some of the assessments, even in single-subject designs, would considerably improve the interpretability of results.

On the whole, developmental and nonspecific or placebo effects are positive factors for children and families. They attest to the positive trajectory of many behaviors and the power of hope and perceived purpose. However, recognizing the potential contributions of these factors is crucial in interpreting the results of specific interventions.
There are methodological features of research designs that can be applied to control for maturation and nonspecific effects. For example, a randomized group design that uses a contrast intervention as a control for the treatment of interest, and a single-subject design in which some form of treatment is provided during baseline, can both help in interpreting such effects.

Replications and Measures of Treatment Effects

For single-subject and group experimental designs, the issues of replication of studies and measurement of treatment outcomes are important. Research on comprehensive intervention programs and individual intervention approaches tends to be conducted and replicated by the individuals who developed the approaches. Evidence for the effectiveness of these approaches is strengthened when researchers who are independent of the developers replicate findings of effectiveness. This form of replication has generally not occurred in the research on comprehensive treatment programs. For individual intervention techniques, interventions addressing language and communication skills (see Goldstein, 1999) and problem behaviors (see Horner et al., 2000) are the most often replicated by different investigators.

Independent measurement or verification of treatment outcome is another important issue. The potential effect of experimenter bias exists when outcome assessments are conducted by individuals who know about the nature of the research study, the treatment groups to which children are assigned, and the phases of studies in which children are participating. For most group and single-subject design research, outcome data are collected by project staff; this may introduce a potential confounding effect. This confounding effect may be countered by having blind or naive assessors collect pre- and post-outcome data for group designs and daily performance data for single-subject designs. Also, for single-subject designs, the assessment of socially important outcomes of interventions by individuals outside of the project, called "social validity" (Schwartz and Baer, 1991; Wolf, 1978), provides some control of potential bias by observers, raters, and testers.

Interaction Between Treatment and Child or Family Characteristics

In experimental group designs, the average or mean performances of children on outcome measures and standard deviations are generally reported for each group. The standard deviation describes the variation of outcome scores around the mean. In group-design studies, children make different amounts of progress, with some possibly scoring much higher and some scoring much lower than the mean. Analyses of group means do not provide information about which children benefited the most or least from treatment.
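The limitation of group means can be illustrated with a small sketch. The Python example below uses invented gain scores for two hypothetical treatments, A and B. The data, the pretest cutoff of 60, and the function names are assumptions made purely for illustration and are not results from any study. The two groups are constructed to have identical means and standard deviations, yet the children who gain the most differ between them.

```python
from statistics import mean, stdev

# Hypothetical (pretest score, gain score) pairs; NOT data from any study.
gains = {
    "A": [(40, 2), (45, 4), (50, 6), (70, 14), (75, 16), (80, 18)],
    "B": [(40, 18), (45, 16), (50, 14), (70, 6), (75, 4), (80, 2)],
}


def summarize(records):
    """Return the mean and standard deviation of the gain scores."""
    values = [gain for _, gain in records]
    return mean(values), stdev(values)


def split_by_pretest(records, cutoff=60):
    """Return mean gain for children below and at-or-above a pretest cutoff."""
    low = [gain for pretest, gain in records if pretest < cutoff]
    high = [gain for pretest, gain in records if pretest >= cutoff]
    return mean(low), mean(high)


for name, records in gains.items():
    group_mean, group_sd = summarize(records)
    low_mean, high_mean = split_by_pretest(records)
    print(
        f"Treatment {name}: mean gain {group_mean:.1f}, SD {group_sd:.1f}; "
        f"low-pretest mean {low_mean:.1f}, high-pretest mean {high_mean:.1f}"
    )
```

In this constructed case, a comparison of group means alone would suggest that the two treatments are equivalent, while the split by pretest score shows that different children benefit from each.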
To obtain more specific knowledge about the characteristics of children that are associated with performance, researchers analyze aptitude-by-treatment interactions, or ATIs. For example, an examination of different language training curricula for preschool children with disabilities (not specifically autism) did not find a main effect for treatment (i.e., both treatments appeared to be equally effective) (Cole et al., 1991). However, when the authors analyzed the interaction of treatment by aptitude, they found that children who were higher performers on pretest measures benefited more from a didactic language training approach, and children who were lower performers at pretest benefited more from a responsive curriculum approach to language training.

This type of aptitude-by-treatment-interaction analysis has the potential for providing valuable information about the characteristics of children with autistic spectrum disorders that are associated with outcomes
for comprehensive treatment programs, but these analyses have rarely been conducted. Studying interactions between child or family features and treatment requires a sample size large enough to generate sufficient power to detect a difference. For example, in one study, children diagnosed as having autism or pervasive developmental disorder were randomly assigned to an intensive intervention program based upon the UCLA Young Autism Project model or a parent training model. Although it appeared that children with pervasive developmental disorder scored consistently higher than children with autism on some measures, there were no significant differences between groups (Smith et al., 2000). The authors attributed the failure to find a significant difference to the small sample size (6-7 in each subgroup in each experimental condition). In another example, Harris and Handleman (2000) examined class placements of children with autism 4-6 years after they had left a comprehensive early intervention program. In an aptitude-by-treatment-interaction type analysis, they found that children who entered their program at an earlier age (mean = 46 months) and had relatively higher IQ scores at intake (mean = 78) were significantly more likely to be in regular class placements, and children with relatively lower IQ scores at intake who entered the program later (54 months) were more likely to be placed in special education classes. Even with a relatively small number of participants (28), the robustness of this finding provided information about characteristics of the children who were likely to benefit most from the program.

Fidelity of Treatment

In addition to assessing outcome measures, it is important for researchers examining the effects of educational interventions to verify that the treatment was delivered.
Measurement of the delivery of an individual intervention practice or comprehensive intervention program has been called fidelity of treatment, treatment implementation, and procedural reliability (Billingsley et al., 1980; Hall and Loucks, 1977). Here we use the term treatment fidelity.

Treatment fidelity requires that researchers operationally define their intervention or the components of their comprehensive program well enough so that they or others can assess the degree to which procedures have been carried out. Such assessment takes different forms (e.g., direct observations with discrete behavioral categories, checklists). For example, staff of the LEAP preschool program (see Chapter 12) have developed a set of fidelity-of-treatment protocols that assess whether eight components of the program are being implemented: positive behavioral guidance, interactions with families, teaching strategies, interactions with children, classroom organization and planning, teaching communication
skills, IEPs and measuring progress, and promoting social interaction (LEAP Preschool and Outreach Project, 1999). These protocols could be used in a research capacity to document the level of implementation of the comprehensive program. Also, as Strain (2000) indicated, they were used in the LEAP program to provide feedback to staff on their level of implementation in order to maintain treatment fidelity. Some researchers use hours of service provided as a measure of the intensiveness of intervention (Smith et al., 2000). Although it provides important information, hours of service is not an adequate measure of treatment fidelity, because it does not describe the procedures used during the service hours. Assessment of treatment fidelity has a long history in general education (see Leinhardt, 1980) and has been proposed as a standard for high quality intervention research in early intervention for children with disabilities (LeLaurin and Wolery, 1992). However, one review of early intervention programs for children with autism (Wolery and Garfinkle, 2000) found that only 4 out of 15 programs provided any evidence of implementation of program components. In future research on educational intervention for young children with autistic spectrum disorders and their families, measurement of the fidelity of treatment should be a standard feature of the program of research and publication of findings.

Modeling Growth and Intervention Effects

In most experimental group studies, as noted above, the developmental growth of children with autistic spectrum disorders is measured through the collection of pretest and posttest outcome measures, followed by analyses of differences between groups.
More sophisticated procedures for examining the growth and development of children are available (Dunst and Trivette, 1994), but they have not been used in analyses of intervention outcomes for young children with autistic spectrum disorders. Growth curve analysis (Burchinal and Appelbaum, 1991) and the related techniques of hierarchical linear regression modeling (Bryk and Raudenbush, 1987) and structural equation modeling (Willett and Sayer, 1994) have been used to model the growth of groups of children for whom longitudinal data are available. These techniques may also be used to examine patterns of growth for children with different types of characteristics or children involved in different types of treatment conditions or programs (e.g., Burchinal, 1999; Burchinal, Bailey, and Snyder, 1994; Hatton et al., 1997). Natural history studies of development in children with autistic spectrum disorders that use these methods are critical for providing both theoretically based insight and empirical "baselines."

The advantage of growth curve analysis and related regression models is that they allow researchers to control for nested variables (e.g., children participating in the same intervention but in different classrooms), nonrandom missing data (i.e., an assessment that occurred at the wrong time or that is missing), and extreme scores of students (Burchinal et al., 1994). Also, hierarchical linear regression modeling and structural equation modeling allow researchers to examine variables, beyond assignment to an early intervention or contrast group condition, that are associated with the development of children (e.g., family characteristics, degree of implementation of the program).

One difficulty in using these techniques in studies of children with autistic spectrum disorders is that many of them require large sample sizes, but most studies of young children with autistic spectrum disorders have small numbers of participants. Nevertheless, to the extent possible, researchers of educational intervention programs for young children with autistic spectrum disorders should consider adopting these or similar models for analyzing variables affecting children's development and learning. This may require that program developers accumulate sufficient sample sizes in their programs over several years; multiple data points per participant are also required.

Group Size and Experimental Group Design

A clear problem mentioned at several points in the preceding discussion is that the methodological tools available to researchers, such as studies of individual differences in response to treatments and sophisticated regression-based techniques like hierarchical linear regression modeling, are limited by the number of children with autistic spectrum disorders in intervention programs and the number of data points collected. Implementing an early intervention program for children and families is a labor-intensive and expensive endeavor. Because of the expense, length of treatment, and heterogeneous nature of autistic spectrum disorders, the number of young children in an individual treatment program is usually small.
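To give a rough sense of why small programs limit group comparisons, the following sketch approximates the number of children needed per group to detect a standardized mean difference between two groups. It assumes a two-sided test at alpha = .05, power of .80, equal group sizes, and a normal approximation. The function name and the effect sizes shown are illustrative assumptions rather than values drawn from the studies reviewed here.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate children needed per group for a two-group mean comparison.

    Uses the normal approximation
        n = 2 * (z_{1 - alpha/2} + z_{power})**2 / d**2
    where d is the standardized mean difference. This slightly
    underestimates the n given by an exact t-test calculation.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Illustrative standardized differences (assumed values, not study results).
for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {n_per_group(d)} children per group")
```

Under these assumptions, a standardized difference of 0.5 requires roughly 63 children per group, which is more than many individual programs enroll at one time.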
As noted, one solution for program developers is to collect data for multiple cohorts, building their numbers across years. However, this approach requires multiple years of funding and long-term commitments from investigators.

Another solution to the sample size problem is the development of a multi-site study of treatment effectiveness. Such a study could be based on a treatment comparison model and could perhaps (because of its potential magnitude) be funded by multiple coordinating agencies (e.g., National Institute of Child Health and Human Development, Office of Special Education Programs, National Institute of Mental Health, Centers for Disease Control and Prevention, National Institute on Deafness and Other Communication Disorders, National Institute of Neurological Disorders and Stroke). There is a precedent for federal funding for large initiatives such as this in other areas (e.g., Fast Track project for aggressive children, Infant Health
and Development Project, National Institute of Child Health and Human Development Child Care Study). The current coordination of the biomedical grants in autism funded by the National Institute of Child Health and Human Development and the National Institute on Deafness and Other Communication Disorders in the Collaborative Programs of Excellence in Autism (CPEA), and efforts to coordinate genetics studies funded by many different agencies, may represent models for such a project.

Qualitative Research

We have not reviewed qualitative or ethnographic research studies. Although such studies may add to the knowledge about program features and outcomes for young children with autistic spectrum disorders (Schwartz et al., 1998), the research literature is quite small and does not contain systematic examinations of programwide effects for young children and families. Qualitative and ethnographic research does hold promise for uncovering important features in educational intervention programs that affect the development of young children with autistic spectrum disorders and their families.

FROM RESEARCH TO PRACTICE

There is an active research literature on the developmental characteristics, diagnostic criteria, comprehensive treatment programs, and individual intervention strategies for young children with autistic spectrum disorders. The literature provides a tentative but important basis on which to design intervention strategies and make decisions about treatment options for individual children. However, there are concerns about methodological issues. Considering these concerns, funding agencies and professional journals should require minimal standards in design and description of intervention research studies.
These studies should include the following information: participants' chronological age, developmental assessment data (including verbal and nonverbal levels of performance), standardized diagnoses, gender, race, family characteristics, socioeconomic status, and relevant health or other biological impairments. In addition, fidelity of treatment documentation must operationally define the intervention in sufficient detail so that an external group could replicate it as well as assess the degree of implementation. Independent, objective assessment of expected outcomes should be conducted at regular intervals, and immediate and long-term assessment of effects on children and families should include measures of generalization and maintenance.

Future research on intervention programs for young children with autistic spectrum disorders should address the following methodological
issues: application of standardized procedures for describing participants in intervention studies, including children's diagnoses, chronological age, developmental and behavioral information, family information, gender, socioeconomic status, race, and pertinent health or biological information; the association between fidelity of treatment information and treatment outcomes; the association between participants' characteristics and treatment outcomes (e.g., aptitude-by-treatment interactions); the development of early identification procedures and their relationship to early access to services; and identification of program features (i.e., "active ingredients" of intervention programs) that relate most directly to child and family outcomes. The impact of treatment on the growth of young children with autistic spectrum disorders may be measured with techniques such as growth curve analysis, hierarchical linear modeling, or structural equation modeling, which model longitudinal growth and treatment effects.

Addressing these methodological issues will require larger sample sizes, longitudinal follow-ups of participants, and interdisciplinary collaboration. To enable such needed research, initiatives should be funded jointly by federal agencies responsible for research, development, and services for young children with autistic spectrum disorders (including the Office of Special Education Programs, the Office of Educational Research and Improvement, the National Institute of Child Health and Human Development, the National Institute of Mental Health, the National Institute of Neurological Disorders and Stroke, and the National Institute on Deafness and Other Communication Disorders).
These initiatives should include a task force that meets regularly to design and provide a synthesis of the diagnostic, developmental, behavioral, and treatment research that would inform the design and implementation of early educational treatment for young children with autistic spectrum disorders; consideration of the feasibility of a national, cross-site, longitudinal investigation of early intervention treatments for young children with autistic spectrum disorders and their families; and development of specific measurement tools for early diagnosis of children with autistic spectrum disorders and treatment outcomes (e.g., social functioning, spontaneous communication and language, peer relationships, and competence in natural settings). Agencies funding competitive research initiatives should include personnel with sufficient research and experiential background to judge the scientific and practical merits of proposals.