B Outcome Measures for Assessing Integrity in the Research Environment

This appendix to the report describes outcome measures and models for the development of outcome measures that could be used or adapted for use by institutions and educators who wish to assess integrity in the research environment. These measures can be applied to assessments of individuals or institutions by processes recommended in this report.

The appendix describes two kinds of outcome measures. First, it describes measures that have been used to assess the moral climate of an institution. Although measures have not been developed specifically for assessment of the climate of integrity in the research institution, measures and methods that could be adapted for use by research institutions have been developed in other settings.

Second, the appendix describes measures that have been used to assess aspects of integrity of the individual. The goal is to recommend measures that could be used (or adapted) by researchers or institutions interested in assessing outcomes of educational efforts to promote the development of integrity in research in trainees. The emphasis will be on outcome measures that are theoretically grounded, that are at least indirect measures of behavior, and that either have been effectively used or have good potential for linking the development of aspects of integrity (e.g., ethical sensitivity, moral reasoning and judgment, and identity formation) to institutional effectiveness. In cases in which a recommended measure cannot be used exactly as designed, the criterion for determination of inclusion in this review is whether the method of assessment has



The National Academies | 500 Fifth St. N.W. | Washington, D.C. 20001
Copyright © National Academy of Sciences. All rights reserved.





been sufficiently well validated—even if it is in a setting other than research—to warrant adaptation to the research environment. In summary, measures that meet the following criteria are included: (1) they are theoretically well grounded in a model of morality that demonstrates the relationship between aspects of integrity and behavior; (2) they meet or exceed the minimal criteria for validity and reliability; (3) they have been successfully used to assess learning outcomes for adults either in research ethics programs or in professional ethics programs; (4) they have been used effectively to assess institutional effectiveness in promoting one or more aspects of integrity; and (5) the method of measurement is appropriate for assessment of an aspect of integrity in the research environment, even though the content of the measure may be specific to another discipline.

Note that this discussion does not include measures or tests that assess content knowledge of the rules related to the conduct of research, measures that assess perceptions of the integrity of others (e.g., survey instruments designed for the Acadia Institute study), or measures designed to assess the norms of scientists with respect to misconduct and questionable research practices (Bebeau and Davis, 1996; Korenman et al., 1998). The latter might serve as a resource for the development of items for use in a survey of the moral climate of an institution or for items for assessment of role concept development.

METHODS AND MEASURES FOR ASSESSING INTEGRITY IN THE RESEARCH ENVIRONMENT

Two bodies of literature contribute to the understanding of moral climate and its importance for the assessment of integrity in the research environment. The first is the literature on individual moral development, indicating that individual characteristics are not sufficient as an explanation for ethical behavior.
Thus, efforts to influence behavior by focusing on the development of abilities related to decision making may be necessary, but not sufficient, to affect integrity in the research environment. The second is the literature on organizational culture and climate that highlights the different kinds of cultures that may be operating in the environment. There is a growing belief that organizations are social actors responsible for the ethical or unethical behaviors of their employees. In fact, corporations (Bowen and Power, 1993) have been held responsible under the law for acts of malfeasance and misfeasance engaged in by employees, sometimes even when the acts of those employees were beyond the scope of their employment. Such instances prompted scholars in the field of organizational development to turn their attention to the assessment of moral climate and to an analysis of the effects of moral climate on decision making.

Individual Development and Its Relationship to Collective Norms

In the early 1980s, developmental psychologists working in correctional facilities and high schools introduced the concept of a “moral atmosphere” or “just community” to explain the social context that shaped collective norms, which seemed either to inhibit or to override the influence of individual moral development on behavior.

To measure moral atmosphere, researchers (Higgins et al., 1984; Power, 1980; Power et al., 1989) presented students with dilemmas likely to occur in their environment. For example, in a high school setting, researchers might present situations involving someone who cheated on an exam or someone who was rude to others. The researcher elicited judgments of responsibility (e.g., What do you think _____ should do? Why?) and judgments of practicality (e.g., What would you do? Why?). These were contrasted with perceptions of the collective norms (What would most others in your school do in this situation? Why would they do that?). Through interviews, researchers were able to identify collective norms and establish whether the norm emerged from within the group or was stipulated by authority external to the group. Then, the degree to which the norm met moral standards and the degree to which individuals were committed to each norm were assessed.

By use of this strategy, it was possible to detect groups with strong, but morally defective, collective norms.1 Furthermore, researchers were able to show that groups develop collective norms that belong only to the group. When prosocial collective norms defined what was expected of group members as group members, individuals tended to conform to group norms even when their competence in moral decision making was not well developed.
However, when the collective norms did not encourage prosocial behavior,2 individuals with higher levels of competence in moral development felt alienated and discouraged from engaging in actions consistent with their level of competence. Higgins and colleagues (1984) concluded that practical moral action is not simply a product of an individual’s moral competence but is a product of the interaction between his or her competence and the moral features of the situation.

Melissa Anderson, in a National Science Foundation-funded longitudinal study of doctoral students’ acquisition of the concepts of science and its norms, uses interview questions similar to those used to elicit implicit norms that shape behavior in the studies cited above. Anderson describes the interview questions as follows: “A series of questions ask students to consider and comment on the relationship between academic norms and behavior (Do you see any conflicts between what people think or say you should do and the way work is actually done?), between their own perspectives and behavior (Do you see people around here acting contrary to your advice [to doctoral students on how to avoid serious mistakes]?) and between their own normative perspectives and academic norms (Are there any ideas or rules about how you should do your work that you don’t agree with?)” (Anderson, 2001, p. 2). Narrative accounts are then analyzed in terms of the contrasts presented above. At a conference sponsored by the Office of Research Integrity (ORI), U.S. Department of Health and Human Services (DHHS), in 2001, Anderson reported findings from an analysis of interviews with 30 first-year doctoral students. (See Chapter 5 for a further discussion of the initial findings and their relationship to education in the responsible conduct of research.)

1 Examples of groups with morally defective collective norms might include repressive totalitarian states, fanatical cults, violent gangs, and organized crime.

2 Psychologists use the term prosocial behaviors to distinguish behaviors that are clearly beneficial to another and support societal or communal norms from behaviors that may be norm or rule based (as in a teen-age gang or criminal group) but support the self, or hurt others. A prosocial behavior is not necessarily selfless.

Organizational Literature

Building on the early work on moral atmosphere, which attempted to define collective norms operating in the environment, Cullen and colleagues argued “that corporations, like individuals, have their own sets of ethics that help define their characters. And just as personal ethics guide what an individual will do when faced with moral dilemmas, corporate ethics guide what an organization will do when faced with issues of conflicting values” (Cullen et al., 1989, p. 50). Ethical climates were conceptualized as general and pervasive characteristics of organizations that affect a broad range of decisions.
In the organizational literature, work climate is defined as “perceptions that are psychologically meaningful moral descriptions that people agree characterize a system’s practices and procedures” (Cullen et al., 1993, p. 180). In contrast to the interview strategy, which, although labor intensive, has the advantage of gauging individual concepts of responsibility as well as perceptions of the group norms, Cullen and colleagues (1993) developed and validated a 36-item questionnaire, the Ethical Climate Questionnaire, to assess perceptions of the norms operating within an organization. Examples of items used to assess climate are as follows:

In this company, people are mostly out for themselves.
The major responsibility for people in this company is to consider efficiency first.

In this company, people are expected to follow their own personal and moral beliefs.
People are expected to do anything to further the company’s interests.
In this company, people look out for each other’s good.
There is no room for one’s own personal morals or ethics in this company.
It is very important to follow strictly the company’s rules and procedures here.
Work is considered substandard only when it hurts the company’s interests.
Each person in this company decides for himself what is right and wrong.
In this company, people protect their own interests above other considerations.
The most important consideration in this company is each person’s sense of right and wrong.
The most important concern is the good of all the people in the company.
The first consideration is whether a decision violates any law.
People are expected to comply with the law and professional standards over and above other considerations.
Everyone is expected to stick by company rules and procedures.

Responses to the questionnaire confirm the multidimensional nature of ethical climate and substantiate the existence of a number of hypothesized ethical climates. Victor and Cullen’s (1988) measure is well validated, and their studies confirm that ethical climates are perceived at the psychological level and that individuals within organizations are able to describe the moral atmosphere that prevails in their work units. The kinds of moral climates that prevail differ dramatically among organizations. Furthermore, there appears to be variance in the ethical climate within organizations by position, tenure, and work group membership. The authors argue that ethical climates, although relatively enduring, are not static. A careful assessment of the climate enables an organization to reflect on its policies and practices and institute reforms.
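Responses to items like those above are typically rated on a Likert scale, averaged into dimension scores per respondent, and then averaged across a work unit. A minimal sketch of that aggregation follows; the item-to-dimension grouping is purely illustrative and is not Victor and Cullen's published scoring key.

```python
from statistics import mean

# Hypothetical grouping of questionnaire items into climate dimensions;
# Victor and Cullen's actual key groups the 36 items differently.
DIMENSIONS = {
    "instrumental": [0, 1],  # e.g., "people are mostly out for themselves"
    "caring":       [2, 3],  # e.g., "people look out for each other's good"
    "rules":        [4, 5],  # e.g., "stick by company rules and procedures"
}

def dimension_scores(responses):
    """Per-respondent mean rating for each dimension.

    responses: one Likert rating (e.g., 0-5) per item, in item order.
    """
    return {dim: mean(responses[i] for i in items)
            for dim, items in DIMENSIONS.items()}

def unit_climate(all_responses):
    """Characterize a work unit by averaging each dimension across respondents."""
    per_person = [dimension_scores(r) for r in all_responses]
    return {dim: mean(p[dim] for p in per_person) for dim in DIMENSIONS}
```

Comparing the resulting dimension profiles across units is one way the variance by position, tenure, and work group membership noted above could be made visible.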
Examples of efforts to evaluate the organizational climate in settings that seem relevant to the research environment follow. As useful as these illustrations are for showing how an organization might assess its moral and ethical climate, it is still up to the institution to implement changes and then to reassess the climate to determine whether the improvements have occurred.

Examples of Climate Assessments Conducted in Related Fields

U.S. Office of Government Ethics

In 1999, the U.S. Office of Government Ethics (OGE) hired a consulting firm to assess the effectiveness of the executive branch ethics program and to assess the ethical culture of the executive branch from the employees’ perspective (OGE, 2000). The objective of the executive branch ethics program is to prevent conflicts of interest and misconduct that undermine the public’s trust in government. The study assessed employee perceptions of the ethical culture in the executive branch and enabled OGE to make specific decisions regarding the ethics training programs for executive branch employees; the effectiveness of communication regarding the purpose, goals, and objectives of the ethics program; and the extent to which the program helped employees avoid at-risk situations. Because the study was a first attempt to assess the ethical climate of the executive branch, it focused on overall awareness rather than an analysis of the climate within individual executive branch agencies.

The OGE survey was based on the IntraSight Assessment, a tool developed by Arthur Andersen researchers and academic researchers in the fields of business ethics and organizational behavior. Although the full report claims that the measure is statistically reliable and valid, a summary of validity and reliability data on the measure was not provided. The IntraSight Assessment examines the impact of an organization’s ethics program by assessing employees’ perceptions of observed unethical or illegal behaviors and several desirable outcomes of ethics efforts. It examines program elements and cultural factors that, in the original study, had the greatest relationship with desirable outcomes. By pairing a measure of outcomes with a measure of the related factors, the IntraSight Assessment provides direction for improving outcomes by addressing the factors most highly related to them. The assessment process provided data that OGE could use for continuous quality improvement. One might expect future efforts at quality assessment to focus on evaluating the effectiveness of ethics programs within individual agencies.

Academic Integrity Assessment

The Center for Academic Integrity at Duke University developed a process and measures that assist institutions of higher learning with assessing the extent to which the climate on their campuses promotes academic integrity (Burnett et al., 1998). The process begins with the appointment of a campus committee charged with evaluating the state of academic integrity on campus and,

after a data collection process, drawing conclusions and making recommendations for ways that programs charged with ensuring academic integrity can improve. The committee assembles background information about the policies and disciplinary procedures (including information and statistics about sanctions that have been imposed); collects descriptions of the educational programs and activities that inform students, faculty, and administrators about academic integrity on campus; conducts focus groups for administrators; and facilitates the collection of data on perceptions of the moral climate from students and faculty.

The center conducts surveys using the Student Academic Integrity Survey and the Faculty Academic Integrity Survey designed by Donald McCabe. According to the developer, the surveys can be modified to address specific content issues that may be unique to the institution and to address objectives defined by the committee. The survey has been used in several studies, but the guide to the survey provides no references to its psychometric properties. A recent communication (January 2002) with the test developer confirmed that there are no published data on the validity of the measure. The developer does periodically check its reliability, however, and could make those data available. Included in the guide are criteria for review of an institution’s policies, disciplinary procedures, and outcomes. The center analyzes the data collected by the surveys, as well as comparison data from national samples, for the committee’s use in examining the results. The committee’s final task is to draw conclusions and make recommendations for ways in which the institution’s academic integrity programs can be improved.

Additional Examples

The U.S. Army uses the Ethical Climate Assessment Survey (ECAS) and the Framework for Establishing/Changing Ethical Climate as part of leadership development for members of the U.S. military (U.S. Army, 2001). Leaders are directed to periodically assess their unit’s ethical climate and take appropriate actions to maintain the high ethical standards expected of all organizations that are part of the U.S. Army. According to information from the web site (U.S. Army, 2001), an ethical climate is one in which “stated Army values are routinely articulated, supported, practiced and respected.” An organization’s climate is determined by “the individual character of unit members, the policies and practices within the organization, the actions of unit leaders, and environmental and mission factors.” ECAS is a self-administered questionnaire that a leader uses to assess his or her perceptions of the unit and of leader actions. Col. George Forsythe (personal communication, United States Military Academy, January 2002) indicated that although the Army has used the measure extensively, studies of the validity of the measure have not been systematically conducted.

The National Center for Education Statistics of the U.S. Department of Education compiled the responses of teachers in private and public elementary and secondary schools to an ethical climate survey. The 27-item questionnaire is intended for use by individual schools to assess the organization’s ethical culture.

Summary

It is apparent from the number of measures of moral climate that have been developed that scholars, at least scholars in organizational development, accept the notion that institutions differ in the kinds of moral and ethical climates that prevail and that the moral and ethical climate of an institution can influence a broad range of outcomes for which a given institution may be held accountable. There also appears to be a belief that institutions have a responsibility to assess the moral and ethical climate that prevails, to reflect on the policies and practices that contribute to that climate, to make appropriate adjustments, and to reassess their moral and ethical climates. It is also apparent that whereas a number of measures have been developed to document the prevailing moral and ethical climate, with the exception of the measure designed by Victor and Cullen (1988), little attention has been given to establishing that the data collected by such surveys provide an accurate and reliable picture of the prevailing moral and ethical climate. As easy as it may be to adapt items from existing measures to develop a climate survey for use in research institutions, it is incumbent upon the research community to establish the validity, reliability, and usefulness of such measures.
METHODS AND MEASURES FOR ASSESSING INTEGRITY OF THE INDIVIDUAL

This section provides descriptions of measures or methods used to assess aspects of the moral integrity of the individual. Included are measures of general abilities that are developmental and that are linked to ethical behavior (Bebeau et al., 1999). Measures that assess aspects of the Four-Component Model of Morality of Rest (1983)3 are described and are classified under the following headings: ethical sensitivity, ethical reasoning and judgment, identity formation, and ethical implementation.

In most cases, the measures described are profession specific, in that the content of the measure would not be appropriate for the assessment of integrity in research. Nonetheless, the competence being assessed is an ability that is relevant to the integrity of the researcher. If the content of the test is adapted, as has been the case in many of the examples cited below, the measurement strategy should be as effective for assessments of important learning outcomes in the research setting as it has been in other professional settings. Descriptions of some assessment strategies that rely on Rest’s Four-Component Model of Morality (1983) for their theoretical grounding and that seem promising for application to research ethics follow.

3 See Chapter 5, Box 5-1, for an operational definition of each of the components of morality.

Ethical Sensitivity

Performance-based methods for assessment of ethical sensitivity were first developed in dentistry (Bebeau et al., 1985), and the most extensive work on the validity of the method has been conducted with the Dental Ethical Sensitivity Test (Forms A and B). (See Rest et al. [1986] and Bebeau [1994, 2001] for summaries of the validation studies.) The general strategies for ethical sensitivity assessment have been applied in other professional settings: counselor education (Brabeck and Weisgerber, 1989; Volker, 1984); computer users (Liebowitz, 1990); undergraduate education (McNeel, 1990; Mentkowski and Loacker, 1985); geriatric dentistry (Ernest, 1990); social work (Fleck-Henderson, 1995); journalism (Lind, 1997); and school personnel, including administrators, teachers, and school psychologists (Brabeck et al., 2000).
An ethical sensitivity test (Bebeau and Rest, 1990; Ernest, 1990) places students in real-life situations in which they witness an interaction on either videotape or audiotape. The interaction replicates professional interactions and provides clues to a professional ethical dilemma. For example, the Racial Ethical Sensitivity Test (Brabeck, 1998) consists of five videotaped scenarios that portray acts of intolerance exhibited by professionals in school settings. Each scenario includes from five to nine acts of racial and gender intolerance that violate one or more of the common principles specified in ethical codes of school-based professions. Distinct from the cases typically used in ethics courses, the information is not predigested or interpreted. At a point in the presentation, the student is asked to take on the role of the professional in the situation and respond (on an audiotape) as though he or she were that person. Following his or her response to a patient, client, or colleague, the student answers a number of probe questions that ask why he or she said what was said; how he or she expects the patient, client, or colleague to respond; what he or she thinks should be done in like situations; and so on. Using well-validated criteria established by an interdisciplinary team that includes practitioners, judges rate the extent to which the student adequately interprets the significant issues and professional responsibilities presented in the situation.

Studies assessing the ethical sensitivities of both professionals in training and professionals in practice (Bebeau, 2001; Bebeau and Brabeck, 1987; Bebeau et al., 1985; Fleck-Henderson, 1995) indicate considerable variability among professionals in terms of their sensitivities to the ethical issues they may encounter. Thus, completion of professional training does not ensure development of sensitivity to professional issues. Studies also show, however, that ethical sensitivity can be improved with instruction (Bebeau and Brabeck, 1987; Liebowitz, 1990; Mentkowski and Loacker, 1985; Sirin et al., submitted for publication). Furthermore, studies show that ethical sensitivity is distinct from the ability to reason (the second component of the Four-Component Model of Morality of Rest [1983]) about what ought to be done in a situation (Bebeau and Brabeck, 1987; Bebeau et al., 1985; Brabeck et al., 2000). Consequently, one cannot assume that education that focuses on ethical reasoning will transfer the ethical reasoning ability to the interpretive process.

Because the assessment process is relatively expensive, requiring transcription of a semistructured interview and scoring by trained raters, measures of ethical sensitivity have typically been used in research studies. Recently, however, Brabeck and Sirin (2001) produced a computerized version of the Racial Ethical Sensitivity Test (REST-CD), intended to make their test more efficient.
A subsequent study (Sirin et al., submitted for publication) concluded that the more efficient assessment process provides a reliable and valid measure of ethical sensitivity to instances of racial and gender intolerance. The modified ethical sensitivity assessment strategy of Brabeck and colleagues seems ideal for assessment of sensitivity to the cultural, interpersonal, and value conflicts that arise between parties (e.g., mentors and students, collaborators, or administrators and researchers) in the research setting. Notice, however, that in addition to assessing the professional’s attention to behaviors of the person, the cases assess knowledge of the rules, regulations, and codes of ethics in the context in which they are used. Tests that assess the application of knowledge in context usually provide better assurances of knowledge acquisition. The cases developed by ethical sensitivity researchers are not unlike the dialogue cases.
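Because these assessments depend on trained raters scoring transcribed responses, establishing interrater agreement is part of the validation work described above. A chance-corrected agreement statistic such as Cohen's kappa is one standard choice; the sketch below is illustrative only, the rating labels are hypothetical, and the source does not specify which agreement statistic the cited validation studies used.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same items."""
    n = len(rater_a)
    assert n == len(rater_b) and n > 0
    # Observed proportion of exact agreement.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum((freq_a[label] / n) * (freq_b[label] / n)
                   for label in set(freq_a) | set(freq_b))
    return (observed - expected) / (1 - expected)
```

A kappa of 1.0 indicates perfect agreement and 0 indicates chance-level agreement; unlike raw percent agreement, kappa discounts the agreement two raters would reach by guessing from their own label frequencies.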

Ethical Reasoning and Judgment

Assessing Written Essays

Perhaps the most familiar approach to measuring ethical reasoning and judgment is the analysis of written arguments, typically conducted by faculty who teach philosophy or professional ethics (Howe, 1982). In dentistry (Bebeau, 1994) and nursing (McAlpine et al., 1997), for example, researchers have demonstrated that essays can be reliably assessed and that instruction is effective in promoting the ability to develop well-written essays that meet criteria specified in advance of instruction. Such methods lack practicality for assessing competence in reasoning as a function of an institution’s efforts to promote reasoning about dilemmas in integrity in research, as they are labor intensive and require considerable expertise in philosophy or ethics. However, assessment of written essays is a particularly effective way to promote learning, especially if it is accompanied by clearly stated criteria, frequent opportunities for practice, and feedback (Bebeau, 1994).

These methods have been applied to integrity in research with varying degrees of success. For example, Stern and Elliott (1997) describe the challenges in establishing interrater reliability and the lack of a measurable effect when the criteria used to judge moral arguments are not presented as part of the instructional program. Recognizing both the need to teach the criteria used to make judgments about the adequacy of moral arguments and the need to be able to reliably apply the criteria to the evaluation of arguments developed by students, the Poynter Center developed and validated a set of cases and criteria for the assessment of moral reasoning in scientific research.
Moral Reasoning in Scientific Research: Cases for Teaching and Assessment (Bebeau et al., 1995) is an 80-page booklet that features six one- to two-page case studies, as well as extensive information on how to use the case studies and a discussion of the theoretical underpinnings of the approach. In addition to notes that provide the instructor with guidance on leading case discussions, the booklet includes a handout for students that details the criteria used to judge the adequacies of moral arguments. As its title implies, Moral Reasoning in Scientific Research is designed to facilitate improvements in moral reasoning skills, as well as to facilitate assessments of such improvements. Evidence of the effectiveness of the techniques for facilitating reasoning and the validity of the assessment are described and referenced in the booklet. Ken Pimple, a coauthor on the project, recently converted the booklet to PDF format and made it available via the Poynter Center’s World Wide Web site (Bebeau et al., 1995).

(moderate gains), whereas the effect size for comparison groups was only 0.09 (little gain).

Linkage to many prosocial behaviors and to desired professional decision making. DIT is significantly linked to many prosocial behaviors and to desired professional decision making. One review reports that the links for 37 of 47 measures were statistically significant.

Linkage to political attitudes and political choices. DIT is significantly linked to political attitudes and political choices. In a review of several dozen correlates of political attitude, DIT typically correlates with r values in the range of 0.40 to 0.60. When coupled with measures of cultural ideology, the combination predicts up to two-thirds of the variance on controversial public policy issues (such as abortion, religion in public schools, the roles of women, the rights of accused individuals, the rights of homosexuals, and free speech issues).

Reliability is good. The Cronbach alpha value5 is in the upper 0.70s to low 0.80s, and the test-retest reliability of DIT is stable. Furthermore, DIT shows discriminant validity from verbal ability-general intelligence and from conservative-liberal political attitudes; that is, the information in a DIT score predicts the seven validity criteria above and beyond what is accounted for by verbal ability or political attitude. DIT is equally valid for males and females.

DIT-2 (Rest et al., 1999b) is an updated version of the original DIT (DIT-1) devised 25 years ago. Compared with DIT-1, DIT-2 not only has stories that are not dated but is also shorter, has clearer instructions, and retains more subjects through subject reliability checks. In studies conducted so far, the shorter test does not sacrifice validity; if anything, validity improves.
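The gains described above are reported as effect sizes, that is, standardized mean differences. As a minimal illustration of how such a statistic is computed, the sketch below implements Cohen's d with a pooled standard deviation; the function name and the pretest/posttest scores are hypothetical and are not data from the studies cited here.

```python
from statistics import mean, stdev

def cohens_d(group_a, group_b):
    """Standardized mean difference between two groups, using a pooled SD."""
    na, nb = len(group_a), len(group_b)
    # Pool the two sample variances, weighted by degrees of freedom.
    pooled_var = ((na - 1) * stdev(group_a) ** 2 +
                  (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5

# Hypothetical posttest and pretest scores for one cohort (illustration only).
post = [48.0, 52.0, 55.0, 50.0]
pre = [40.0, 44.0, 47.0, 42.0]
print(round(cohens_d(post, pre), 2))
```

By the usual convention, values near 0.2 are read as small, 0.5 as moderate, and 0.8 as large, which is how the contrast between an effect size of about 0.80 for intervention groups and 0.09 for comparison groups should be read.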
The correlation of the results of DIT-1 with those of DIT-2 is 0.78, approaching the test-retest reliability of DIT-1 with itself.

Using DIT to Assess Educational Effects

Because DIT has been used to assess the effects of interventions in professional ethics and research ethics (Heitman et al., 2000), a brief summary of findings is included here.

5   Cronbach alpha (Cronbach, 1951) provides an estimate of the internal consistency of a test. Because ranking data are used to calculate the P Index and the N2 index, individual items would not be the appropriate unit of analysis for determining internal-consistency reliability. Furthermore, ranking data are ipsative; that is, if one item is ranked in first place, then no other item can be ranked in first place. The unit of internal reliability is therefore the story, not the item, and Cronbach alpha is the appropriate strategy for estimating internal consistency. Calculated across the six stories of DIT-1, the estimate is 0.76; for the five stories of DIT-2, it is 0.81. Both are somewhat lower than the estimate of 0.90 obtained when the calculation is made across all 11 stories of the two forms of the test (Rest et al., 1997).
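Because the footnote specifies the story, not the item, as the unit of internal consistency, Cronbach alpha for an instrument of this kind can be sketched directly from its standard definition, alpha = k/(k-1) * (1 - sum of story variances / variance of total scores), with stories playing the role of items. The data below are invented for illustration and do not come from DIT norms.

```python
from statistics import pvariance

def cronbach_alpha(story_scores):
    """Internal-consistency estimate across stories.

    story_scores: one inner list per story, each holding every respondent's
    score on that story (the story, not the item, is the unit of analysis).
    """
    k = len(story_scores)                               # number of stories
    totals = [sum(col) for col in zip(*story_scores)]   # per-respondent totals
    item_var = sum(pvariance(s) for s in story_scores)  # summed story variances
    return k / (k - 1) * (1 - item_var / pvariance(totals))

# Hypothetical scores for four respondents on a three-story form.
stories = [
    [3, 4, 5, 2],
    [2, 4, 4, 1],
    [3, 5, 4, 2],
]
print(round(cronbach_alpha(stories), 2))
```

The same ratio is obtained whether population or sample variances are used, since the scaling factor cancels between numerator and denominator.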

Typically, researchers have reported scores in terms of the P Index (the proportion of items selected that appeal to postconventional moral frameworks for decision making). The average adult selects postconventional moral arguments about 40 percent of the time; the average Ph.D. candidate in moral philosophy or political science does so about 65.2 percent of the time, the average graduate student 53.5 percent, the average college graduate 42 percent, and the average high school student 31.8 percent (Rest et al., 1999b). Similar to college graduates, the 280 graduate students in a research ethics course sampled by Heitman and colleagues (2000) achieved a mean score of 43.9 (standard deviation [SD], 13.1). In contrast, a sample of 14 scientists (from a variety of disciplines) who completed DIT while attending a summer institute on the teaching of research ethics achieved a mean score of 53 (SD, 13), comparable to the mean and variance for graduate students. What is important about this data set is that the variability among those interested in teaching research ethics is comparable to the variability observed among students and among professionals such as physicians and dentists. In other words, one cannot assume the development of postconventional thinking on the basis of one's achievement as a scientist. Furthermore, a recent analysis of the DIT profiles of entering professional students (i.e., the proportions of arguments selected within the personal interest, maintaining norms, and postconventional moral frameworks) indicates that fully 47 percent of a sample of 222 first-year students were in a "transitional status" of developmental change in their mode of thinking (Bebeau, 2001).
In other words, their DIT profiles indicated that they were not distinguishing less adequate from more adequate moral arguments as well as students who had completed their ethics program were. As a consequence of this observation and a recent meta-analysis of the effects of interventions on moral judgment development (Yeap, 1999), Bebeau (2001) recommends that researchers studying the effects of an intervention conduct a profile analysis rather than rely only on the P Index as a measure of change. Although progress in moral judgment is developmental, and development proceeds as long as an individual is in an environment that stimulates moral thinking, gains in moral judgment are typically not found to be associated with professional education programs (e.g., veterinary medicine, medicine, dentistry, and accounting programs) unless the program has a specially designed ethics curriculum (Rest and Narváez, 1994). Furthermore, for some students (Bebeau and Thoma, 1994) and some professions (Ponemon and Gabhart, 1994), educational programs actually seem to inhibit growth in moral judgment. For example, Ponemon and Gabhart speculate that the heavy emphasis placed on learning and applying regulatory codes in the education of accountants may inadvertently promote a maintaining norms moral framework that inhibits the development of the advanced moral frameworks needed to reason through new moral issues. Such findings reinforce the importance of the use of outcome measures to assess institutional effectiveness in promoting the development of reasoning ability.

Development of a Prototype Intermediate Concept Measure

Tests like DIT are valuable for assessment of a general reasoning ability that is a critical element of professional ethical development, but they may not be sensitive to the specific concepts taught in a professional ethics course, or, indeed, in a research ethics course. Referring to teacher education, Strike points out: "It is no doubt desirable that teachers acquire sophisticated and abstract principles of moral reasoning [as measured by DIT].... But a teacher who has a good grasp of abstract moral principles may nevertheless lack an adequate grasp of specific moral concepts, such as due process" (Strike, 1982, p. 213). The question for educators is often whether to teach specifically to the codes or policy manuals or to teach concepts particular to a discipline: informed consent, intellectual property, conflict of interest, and so on. Strike (1982) refers to such profession-specific concepts as "intermediate-level ethical concepts," as they lie in an intermediate zone between the more general principles (e.g., autonomy, justice, and beneficence) described by philosophers and the more prescriptive directives often included in codes of conduct.
To test whether a profession-specific test of ethical reasoning could assess the acquisition of intermediate concepts taught in a curriculum and could be used to study the relationship between abstract reasoning and competence in reasoning about new professional problems, Bebeau and Thoma (1999) designed and tested the Dental Ethical Reasoning and Judgment Test (DERJT). Like DIT, the test presents five ethical problems in dentistry, each accompanied by a set of action choices and justification choices. The action and justification choices for each problem were generated by a group of dental faculty and residents. The scoring key reflects consensus among a national sample of 14 dental ethicists as to better, worse, and neutral choices and justifications but does not prescribe a single best action or justification. When taking the test, a respondent rates each action or justification and then selects the two best and the two worst action choices and the three best and the two worst justifications. Scores are determined by calculating the proportion of times that a respondent selects action choices and justifications consistent with "expert judgment." High levels of agreement among the 14 dental ethicists as to better and worse action choices (88

percent agreement for appropriate and inappropriate actions, and 95 and 93 percent agreement for appropriate and inappropriate justifications, respectively) demonstrated the validity of the construct. Bebeau and Thoma (1999) reported effect sizes of 0.93 and 0.56 for action and justification choices, respectively, between first-year college students and first-year dental school students, and effect sizes of 0.85 and 0.56, respectively, between first-year dental school students and dental school seniors in the class of 1997. Additionally, in a recent study of 308 graduates who completed both DERJT and DIT, Bebeau and Thoma (2000) report that scores on DERJT are related to those on DIT (r = 0.22) but that the two tests are not redundant sources of information about competence in ethical reasoning and judgment. In addition, the results indicated that students with a good grasp of abstract moral schemas (high DIT P Index scores) were better able to solve the novel ethical problems presented on DERJT. As with other measures of ethical development, scores on DERJT were not related to a student's grade point average.

Identity Formation and Role Concept Development

One of the chief objectives of On Being a Scientist (NAS, 1989, 1995) was to convey the central values of the scientific enterprise. In an earlier era, such values were typically conveyed informally, through mentors and research advisers. Today, educators recognize the need to introduce these responsibilities more formally. Anderson (2001), in her study of doctoral students' conceptions of science and its norms, concludes that students might not be subject to as much group socialization through osmosis as many faculty assume. Nonetheless, the means by which socialization to the normative aspects of academic life is communicated remain primarily informal (Anderson, 2001).
In addition to providing support for the need to socialize students more deliberately to the norms of the research enterprise, Anderson's study will likely provide grist for the design or modification of items used to assess role concept development in researchers. Such measures have been developed in some professions to assess identity formation and its relationship to ethical action.

Professional Role Orientation Inventory

The Professional Role Orientation Inventory (PROI) (Bebeau et al., 1993; Thoma et al., 1998) consists of four 10-item Likert scales that assess commitment to privileging professional values over personal values. Two of the scales assess dimensions of professionalism that are theoretically

linked to models of professionalism described in the professional ethics literature (e.g., Emanuel and Emanuel, 1992; May, 1983; Ozar, 1985; Veatch, 1986). The PROI scales, in particular the responsibility and authority scales, have been shown to consistently differentiate beginning student, advanced student, and practitioner groups, who are expected to differ in their role concepts. By plotting the responses of a cohort on a two-dimensional grid (Bebeau et al., 1993), it is possible to observe four distinctly different views of professionalism that, if applied, would favor different decisions about the extent of responsibility to others. When practicing dentists are compared with entering students and graduates, Minnesota graduates consistently express a significantly greater sense of responsibility to others than either entering students or practicing dentists from the region. This finding has been replicated for five cohorts of graduates (n = 379). Additionally, the mean score for the graduates was not significantly different from that for a group of 48 dentists who demonstrated a special commitment to professionalism by volunteering to participate in a national seminar that trains individuals to lead ethics seminars. A recent comparison of pretest and posttest scores for students in the classes of 1997 to 1999 (Bebeau, 2001) indicates a significant change (p < 0.0001) from the pretest to the posttest. Cross-sectional studies of differences between pretest and posttest scores for students in a comparable dental program suggest that instruction in ethics accounts for the change. The most direct evidence of a relationship between role concept and professionalism comes from the study of the performance of 28 members of the practicing community referred for courses in dental ethics because of violations of the Dental Practice Act.
Although the practitioners varied considerably on measures of ethical sensitivity, reasoning, and ethical implementation, 27 of the 28 individuals were unable to clearly articulate the role expectations for a professional (Bebeau, 1994). (See Bebeau et al. [1993] for a more extensive description of the theoretical grounding of this measure.)

Professional Decisions and Values Test

Rezler and colleagues (1992) designed the Professional Decisions and Values Test for lawyers and physicians to assess action tendencies and the values underlying them in situations that present ethical problems. Patterned after DIT and the Medical Ethics Inventory, the test consists of 10 case vignettes; each is followed by three alternative actions and seven reasons to explain the action chosen. Actions are arranged from the least to the most intrusive, and the reasons represent one of seven values commonly used to resolve an ethical dilemma. The cases were selected to represent three

themes: (1) obligation to the patient versus obligation to society, (2) respect for client autonomy versus professional responsibility, and (3) protection of the patient's interest versus respect for authority. The published findings present data for two consecutive classes of entering medical and law students (n = 340) and compare their action choices and values. Although the findings support the construct validity of the test, test-retest reliability is stable over time for action choices but not for values. The developers hypothesize that values do not become stable until later in the curriculum; thus, the test may be more useful for the assessment of change in groups over time than for the tracking of changes in individuals. Differences by sex and profession were observed when the measure was used. Whether the lack of stability in the retest reliability study can be attributed to changes influenced by the curriculum is a question worthy of further study. Although further validation work needs to be done with this measure, the test is cited here because its format shows promise for the design of a measure of role concept.

Ethical Implementation

With respect to the implementation dimension of professional ethics, Braxton and Baird (2001) point to the importance of preparation for professional self-regulation, and Fischer and Zigmond (2001) stress the importance of a variety of skills relevant to professional practice. To date, objective measures have not been devised to assess competence in the implementation of effective action plans.
Although there may be some generic abilities, such as problem solving and interpersonal and written communication, that could be assessed with objective tests, it is hard to imagine designing anything but performance-based assessments of the broad range of skills required for effective, responsible research practice. Instructional programs could consider collecting examples of professional performance for evaluation by faculty and students, similar to the portfolios that Gilmer (1995) has students develop for her courses in research ethics. Institutions could also draw attention to the importance of integrity in the conduct of science by including questions derived from the definition of integrity in regular faculty evaluations of research competence, including evaluations used to make promotion and tenure decisions.

SUMMARY AND CONCLUSION

A considerable amount of work has been done on the development of measures of ethical integrity that is relevant for research institutions concerned with the assessment of integrity in the research environment. This appendix has described outcome measures and models for the development of outcome measures that address two specific purposes. The first is to assess the ethical and moral culture and climate of an institution to ensure that the climate, which includes policies and procedures related to the ethical conduct of research, supports the individual researcher's ability to function at the leading edge of professional integrity. Research in organizational behavior indicates that the ethical and moral climate of an institution can either inhibit or promote the responsible conduct of research. The second purpose is to describe measures and methods developed in other settings of education in professional ethics that could be used directly or adapted to assess the effectiveness of courses on the responsible conduct of research or of an institution's efforts to promote integrity in research. The following criteria were used to select measures for the latter category: the measures had to be theoretically grounded in a well-validated psychological theory of morality, had to be at least indirect measures of behavior, and either had been used effectively or had good potential to link the development of aspects of integrity (e.g., ethical sensitivity, moral reasoning and judgment, and identity formation) to institutional effectiveness. In the case of methods and measures that an institution might use in a self-assessment of its moral climate, none that are directly applicable to the research setting have been developed.
On the other hand, by modifying the content of the process for assessing an institution's moral climate and the survey items used to collect information on the perceptions of individuals who work in that climate, it should be possible for an institution to gather information that would enable it to conduct an effective self-study. A reviewer of the section on the assessment of an institution's moral climate will notice that data on the psychometric properties of the surveys developed for climate assessment are not readily available for the examples described here. In the absence of such data, it would be necessary not only to modify the content of such a survey but also to conduct appropriate validation studies. In the case of measures for the assessment of outcomes of instruction in the responsible conduct of research, the content of the measures would need to be adapted, with the exception of DIT (a well-validated test of moral development over the life span that has been used effectively in intervention studies and in institutional outcome studies). Several models for measurement have been sufficiently tested in the context of professional ethics education programs to warrant their application to the setting of integrity in research. Chapter 5 of this report gives considerable attention to teaching the responsible conduct of research. Far less attention, however, has been given to assessments of learning. One reason is the lack of well-validated outcome measures that can be used to assess the effects of instruction on the responsible conduct of research. Because individual teachers and even individual institutions are unlikely to be able to mount the kind of research and development program needed to design and validate measures that assess the important outcomes of education in the responsible conduct of research, a national effort is needed. The design of such measures should be grounded in a well-established theory of ethical development, and the measures should be sufficiently user friendly to enable their use for a variety of purposes. Such purposes may include the following: (1) determining the range of criteria that define competence in ethical behavior in various disciplines; (2) conducting needs assessments to identify areas where instructional resources should be placed; (3) identifying individual differences or problems that require intervention or remediation; (4) providing feedback to individuals, departments, and institutions on competence in research ethics; (5) determining the effects of current programs; (6) certifying competence in research ethics; and (7) studying the relationship between competence and ethical behavior. Given the paucity of suitable methods for the assessment of integrity in the research environment and the skepticism that education in the responsible conduct of research can make a measurable difference in important abilities related to the responsible conduct of research, there is a clear need to develop measures that would serve the research community. There is also a need to design, modify, or adapt methods and survey measures to evaluate the culture and climate that promote integrity in research.

REFERENCES

Anderson M. 2001.
What Would Get You in Trouble: Doctoral Students' Conceptions of Science and Its Norms. Proceedings of the ORI Conference on Research on Research Integrity. [Online]. Available: http://www-personal.umich.edu/~nsteneck/rcri/index.html [Accessed March 13, 2002].
Bebeau MJ. 1994. Influencing the moral dimensions of dental practice. In: Moral Development in the Professions: Psychology and Applied Ethics. Hillsdale, NJ: L. Erlbaum Associates. Pp. 121–146.
Bebeau MJ. 2001. Influencing the Moral Dimensions of Professional Practice: Implications for Teaching and Assessing for Research Integrity. Proceedings of the ORI Conference on Research on Research Integrity. [Online]. Available: http://www-personal.umich.edu/~nsteneck/rcri/index.html [Accessed March 13, 2001].
Bebeau MJ, Brabeck MM. 1987. Integrating care and justice issues in professional moral education: A gender perspective. Journal of Moral Education 16:189–203.
Bebeau MJ, Davis EL. 1996. Survey of ethical issues in dental research. Journal of Dental Research 75:845–855.
Bebeau MJ, Rest JR. 1990. The Dental Ethical Sensitivity Test. Minneapolis, MN: Division of Health Ecology, School of Dentistry, University of Minnesota.

Bebeau MJ, Thoma SJ. 1994. The impact of a dental ethics curriculum on moral reasoning. Journal of Dental Education 58:684–692.
Bebeau MJ, Thoma SJ. 1999. "Intermediate" concepts and the connection to moral education. Educational Psychology Review 11:343–360.
Bebeau MJ, Thoma SJ. 2000 (July 8). The Validity and Reliability of an Intermediate Ethical Concepts Measure. Paper presented at the annual meeting of the Association for Moral Education, Glasgow, Scotland.
Bebeau MJ, Rest JR, Yamoor CM. 1985. Measuring dental students' ethical sensitivity. Journal of Dental Education 49:225–235.
Bebeau MJ, Born DO, Ozar DT. 1993. The development of a Professional Role Orientation Inventory. Journal of the American College of Dentists 60(2):27–33.
Bebeau MJ, Pimple KD, Muskavitch KMT, Borden SL, Smith DL. 1995. Moral Reasoning in Scientific Research: Cases for Teaching and Assessment. Bloomington, IN: Indiana University. [Online]. Available: http://www.indiana.edu/~poynter/mr-main.html [Accessed March 15, 2002].
Bebeau MJ, Rest JR, Narváez DF. 1999. Beyond the promise: A perspective for research in moral education. Educational Researcher 28(4):18–26.
Bowen MG, Power CP. 1993. The moral manager: Communicative ethics and the Exxon Valdez disaster. Business Ethics Quarterly 3:97–115.
Brabeck MM. 1998. Racial Ethical Sensitivity Test: REST videotapes. Chestnut Hill, MA: Lynch School, Boston College.
Brabeck MM, Sirin S. 2001. The Racial Ethical Sensitivity Test: Computer Disk Version (RESTCD). Chestnut Hill, MA: Lynch School, Boston College.
Brabeck MM, Rogers LA, Sirin S, Henderson J, Benvenuto M, Weaver M, Ting K. 2000. Increasing ethical sensitivity to racial and gender intolerance in schools: Development of the racial ethical sensitivity test. Ethics & Behavior 10:119–137.
Brabeck MM, Weisgerber K. 1989. Responses to the Challenger tragedy: Subtle and significant gender differences.
Sex Roles 19:639–650.
Braxton J, Baird L. 2001. Preparation for professional self-regulation. Science and Engineering Ethics 7:593–614.
Burnett D, Rudolph L, Clifford K, eds. 1998. Academic Integrity Matters. Washington, DC: National Association of Student Personnel Administrators, Inc.
Colby A, Kohlberg L, Speicher B, Hewer A, Candee D, Gibbs J, Power C. 1987. The Measurement of Moral Judgment, Vols. 1 and 2. New York, NY: Cambridge University Press.
Cronbach LJ. 1951. Coefficient alpha and the internal structure of tests. Psychometrika 16:297–334.
Cullen J, Victor B, Stephens C. 1989. An ethical weather report: Assessing the organization's ethical climate. Organizational Dynamics 18:50–62.
Cullen JB, Victor B, Bronson JW. 1993. The ethical climate questionnaire: An assessment of its development and validity. Psychological Reports 73:667–674.
Emanuel E, Emanuel L. 1992. Four models of the physician-patient relationship. Journal of the American Medical Association 267:2221–2226.
Ernest M. 1990. Developing and Testing Cases and Scoring Criteria for Assessing Geriatric Dental Ethical Sensitivity. M.S. thesis. University of Minnesota, Minneapolis.
Fischer BA, Zigmond MJ. 2001. Promoting responsible conduct in research through "survival skills" workshops: Some mentoring is best done in a crowd. Science and Engineering Ethics 7:563–587.
Fleck-Henderson A. 1995. Ethical Sensitivity: A Theoretical and Empirical Study. Doctoral dissertation. The Fielding Institute, Santa Barbara, CA.
Gibbs JC, Basinger KS, Fuller D. 1992. Moral Maturity: Measuring the Development of Sociomoral Reflection. Hillsdale, NJ: Erlbaum Associates.

Gilmer PJ. 1995. Teaching science at the university level: What about the ethics? Science and Engineering Ethics 1:173–180.
Heitman E, Salis P, Bulger RE. 2000. Teaching ethics in biomedical sciences: Effects on moral reasoning skills. Paper presented at the ORI Research Conference on Research Integrity, Washington, DC, November 2000. [Online]. Available: http://ori.dhhs.gov/multimedia/acrobat/papers/heitman.pdf [Accessed March 15, 2002].
Higgins A, Power C, Kohlberg L. 1984. The relationship of moral atmosphere to judgments of responsibility. In: Kurtines WM, Gewirtz JL, eds. Morality, Moral Behavior, and Moral Development. New York, NY: Wiley. Pp. 74–108.
Howe K. 1982. Evaluating philosophy teaching: Assessing student mastery of philosophical objectives in nursing ethics. Teaching Philosophy 5(1):11–22.
Kohlberg L. 1984. The Psychology of Moral Development: The Nature and Validity of Moral Stages. Essays on Moral Development, Vol. 2. San Francisco, CA: Harper & Row.
Korenman SG, Berk R, Wenger NS, Lew V. 1998. Evaluation of the research norms of scientists and administrators responsible for academic research integrity. Journal of the American Medical Association 279:41–47.
Leibowitz S. 1990. Measuring Change in Sensitivity to Ethical Issues in Computer Use. Doctoral dissertation. Boston College, Boston, MA.
Lind R. 1997. Ethical sensitivity in viewer evaluations of a TV news investigative report. Human Communication Research 23:535–561.
Lind G, Wakenhut R. 1985. Testing for moral judgment competence. In: Lind G, Hartmann HA, Wakenhut R, eds. Moral Development and the Social Environment. Chicago, IL: Precedent. Pp. 79–105.
May WE. 1983. The Physician's Covenant: Images of the Healer in Medical Ethics. Philadelphia, PA: Westminster Press.
McAlpine H, Kristjanson L, Poroch D. 1997.
Development and testing of the ethical reasoning tool (ERT): An instrument to measure the ethical reasoning of nurses. Journal of Advanced Nursing 25:1151–1161.
McNeel SP. 1990. Development of a measure of moral sensitivity for college students. In: Teaching Values Across the Curriculum. Project report. Dunbarton, NH: The Christian College Consortium.
Mentkowski M. 2000. Learning That Lasts: Integrating Learning, Development, and Performance in College and Beyond. San Francisco, CA: Jossey-Bass.
Mentkowski M, Loacker G. 1985. Assessing and validating the outcomes of college. In: Ewell PT, ed. Assessing Educational Outcomes. New Directions for Institutional Research, No. 47. San Francisco, CA: Jossey-Bass. Pp. 47–64.
NAS (National Academy of Sciences). 1989. On Being a Scientist. Washington, DC: National Academy Press.
NAS. 1995. On Being a Scientist, 2nd ed. Washington, DC: National Academy Press.
OGE (U.S. Office of Government Ethics). 2000. Executive Branch Employee Ethics Survey 2000. [Online]. Available: http://www.usoge.gov/pages/forms_pubs_otherdocs/fpo_files/surveys_ques/srvyemp_if_00.pdf [Accessed March 15, 2002].
Ozar DT. 1985. Three models of professionalism and professional obligation in dentistry. Journal of the American Dental Association 110:173–177.
Pascarella ET, Terenzini PT. 1991. Moral development. In: How College Affects Students: Findings and Insights from Twenty Years of Research. San Francisco, CA: Jossey-Bass. Pp. 335–368.
Ponemon LA, Gabhart DRL. 1994. Ethical reasoning research in the accounting and auditing professions. In: Rest JR, Narváez D, eds. Moral Development in the Professions: Psychology and Applied Ethics. Hillsdale, NJ: Lawrence Erlbaum Associates. Pp. 101–119.

Power C. 1980. Evaluating just communities: Toward a method of assessing the moral atmosphere of the school. In: Moser R, ed. Moral Education: A First Generation of Research and Development. New York, NY: Praeger. Pp. 223–265.
Power C, Higgins A, Kohlberg L. 1989. Lawrence Kohlberg's Approach to Moral Education. New York, NY: Columbia University Press.
Rest J. 1983. Morality. In: Mussen PH (series ed.) and Flavell J, Markman E (vol. eds.). Handbook of Child Psychology, Vol. 3, Cognitive Development, 4th ed. New York, NY: Wiley. Pp. 556–629.
Rest J, Narváez D, Bebeau MJ, Thoma SJ. 1999a. Postconventional Moral Thinking: A Neo-Kohlbergian Approach. Hillsdale, NJ: L. Erlbaum Associates.
Rest J, Narváez D, Thoma SJ, Bebeau MJ. 1999b. DIT2: Devising and testing a revised instrument of moral judgment. Journal of Educational Psychology 91(4):644–659.
Rest JR. 1979. Development in Judging Moral Issues. Minneapolis, MN: University of Minnesota Press.
Rest JR, Narváez DF, eds. 1994. Moral Development in the Professions: Psychology and Applied Ethics. Hillsdale, NJ: Erlbaum Associates. Pp. 51–70.
Rest J, Thoma SJ, Narváez D, Bebeau MJ. 1997. Alchemy and beyond: Indexing the Defining Issues Test. Journal of Educational Psychology 89(3):498–507.
Rest JR, Bebeau MJ, Volker J. 1986. An overview of the psychology of morality. In: Rest JR, ed. Moral Development: Advances in Research and Theory. Boston, MA: Praeger Publishers. Pp. 1–39.
Rezler AG, Schwartz RL, Obenshain SS, Lambert P, McGibson J, Bennahum DA. 1992. Assessment of ethical decisions and values. Medical Education 26:7–16.
Sirin S, Brabeck MM, Satiani A, Rogers LA. Submitted for publication. Development of computerized racial ethical sensitivity test.
Stern J, Elliott D. 1997. The Ethics of Scientific Research: A Guidebook for Course Development. Hanover, NH: University Press of New England.
Strike KA. 1982.
Educational Policy and the Just Society. Chicago, IL: University of Chicago.
Thoma SJ, Bebeau MJ, Born DO. 1998. Further analysis of the Professional Role Orientation Inventory. Journal of Dental Research 77(Special Issue):120 (abstract 116).
U.S. Army. 2001. Ethical Climate Assessment Survey. Document GTA 22-6-1. [Online]. Available: http://www.leadership.army.mil/leaderphilosophyandvision/ECAS.htm [Accessed June 20, 2001].
Veatch RM. 1986. Models for ethical medicine in a revolutionary age. In: Mappes TA, Zembaty J, eds. Biomedical Ethics, 2nd ed. New York, NY: McGraw-Hill.
Victor B, Cullen JB. 1988. The organizational bases of ethical work climates. Administrative Science Quarterly 33:101–125.
Volker JM. 1984. Counseling Experience, Moral Judgement, Awareness of Consequences, and Moral Sensitivity in Counseling Practice. Doctoral thesis. University of Minnesota, Minneapolis, MN.
Yeap CH. 1999. An Analysis of the Effects of Moral Education Interventions on the Development of Moral Cognition. Doctoral dissertation. University of Minnesota, Minneapolis, MN.