A University Statistics Program Based on Quality Principles
Edward D. Rothman
University of Michigan
Introduction
The purpose of this paper is to present aspects of a university statistics program that have had a positive impact on the student's joy in learning. The program's plan is to teach students an efficient learning process aligned with the purpose of a liberal arts education. Implementing the proposed program nationally will require changes in curriculum, research, and pedagogy, and as a result will require faculty to adapt. When students become more efficient learners they increase their joy in learning.
Efficient learning depends especially on asking appropriate questions. The statistician can then provide suggestions regarding the purpose of the study, the measurement process, study design, the planned analysis, and the presentation of results. To be useful we must understand the limitations of statistics and then make clear what can be said or when there is a leap of faith. This notion of being useful is substantially more demanding than correctness or appropriateness. We want to communicate information about the client's world rather than the idealized world that never obtains. This is the basis of the Deming message about differences in analytic and enumerative studies (see, for instance, Deming, 1993, 1990).
Consider that much of what statisticians teach is based on samples drawn from a single urn so that we can say something about the entire urn's composition. The reality is that we take samples from a succession of urns and we hope to say something about an urn that has yet to be sampled. Is it a wonder that 95%-level-of-confidence intervals do not cover the parameter of interest as often as expected? The systematic biases are important but the way we put it all together is critical, and neither is given adequate attention in our programs.
System thinking is much easier to describe than to implement. One feature is that the performance of the system is viewed in terms of impact on the "bottom line," and so interaction is a key. The effectiveness of measurement process, the design, analysis, presentation, and every step need to be considered together. Optimal designs and decision rules for conditions that do not obtain are not useful. Our clients are better served when we recognize this problem and teach this coordinated approach in our classes. It is far more than the study of robust design or analysis. We need to pay attention to the purpose of the study and every element involved in the study.
The proposed program must also take account of variation. Especially evident are the varying interests and talents of the students and faculty. Whatever program we arrive at must be flexible. Much of what we see is a deterministic take-it-or-leave-it solution. Consider how little flexibility exists within and between our programs.
In the language of "quality," the variation is reflected in two distributions. One, labeled the voice of the customer, reflects the distribution of customer preferences while the other, called the voice of the process, represents what the process delivers at a point in time.
Achieving the aim means making the distributions congruent. And to achieve congruence we require a program that is central to a liberal arts education.
At the University of Michigan, alignment of all programs with a liberal arts education is supported by our administration. Consider that in every promotion case and in reviews of our department, we have been asked to describe how the area or subject is central. The response by departments to the administrative challenge has been slow. Few programs teach system thinking, place substantial emphasis on measurement, distinguish between analytic and enumerative studies, or are very concerned about graphics and presentation. The pedagogy must also be altered if we expect to establish a pattern of lifelong learning among a broad cross section of students.
Voice of the Customer
Although employers, parents, faculty, and others have a substantial claim as the primary external customer, we at Michigan look to our students first as indicators of the success of the proposal. Their increased joy in learning could be measured, in part, by enrollments, but other indicators may be considered.
The voice of the customer not only varies between customers but can also, as with any service or product, be affected by others. The business community might call this marketing or advertising, but we will not be quite so crass. Contact with students at an early stage to discuss what they ought to be concerned about is important. Though our preference would be to have a faculty member in the counseling office on a regular basis, we do not have the resources. Instead, we talk to some students at orientation, and the message we share with them manages to make the rounds of the counselors.
These orientation talks have met with substantial success. Though they are only thirty-five minutes long, we include a statement about learning, read some poetry, and present a couple of examples of sense deception. The poem by Yeats, "For Anne Gregory," is used, but there are many alternatives that reflect the difficulty of "seeing" or learning by observation. Sense deception examples are abundant, too. We select examples based on aggregation and coincidence.
Among the students addressed, about 20% elect an introductory class in statistics during their freshman year. The message that statistics is central to a liberal arts education and that they are here to learn how to learn rather than simply accumulate facts, which may soon be seen as incorrect, is quite compelling.
Beyond this introductory role, it is important to understand who our students are and what they need and want. Students want to be productive. They want to contribute to society and also be viewed as individuals. Though these two desires are sometimes in conflict, the balance between the two needs to be addressed at the individual level. We are not, however, and should not be in the training business. Industry must expect to take the lead in this activity. What we need to provide, as the Peace Corps message suggests, are lessons in fishing and not simply fish dinners.
Once we identify learning styles and student backgrounds, alternative options can be provided. Some students may benefit through early involvement in research. Our summer research opportunity program for minority students here has had some success. Other students benefit from team learning experiences. These allow the teaching staff to spend time with individuals while most students are engaged in the group activity.
Voice of the Process
The education process is dependent on the faculty, administration, equipment, and materials. But the faculty is our most important challenge. The faculty may be unwilling to change, may have had little exposure to pedagogy, and many may not have worked as consultants in applied research.
We expect coaching will improve the quality of the teaching in a very substantial way for those faculty interested in improvement. (Somehow writing these words now is easier than when I was chair of the department.) However, changing what is taught is another matter. There is a general discomfort with anything new, especially if we require radical changes in the curriculum. The proposed countermeasure is to broaden our focus through new faculty, through continuing education of present faculty, and by allowing those who are more comfortable with theory to broaden their view of the theory.
The applied statistician would need to become much more involved with the integration of all elements required for efficient learning. Theoretical statisticians would continue to develop theory but would broaden the focus by consideration of more general objectives. It is especially important that they become more involved with the metaphor of the theory.
For example, they might ask what the theory says about what is the appropriate question. In Robert Pirsig's Zen and the Art of Motorcycle Maintenance, he asks, "If the cycle doesn't start should we ask what is wrong with the starter or why doesn't the cycle start?" Deming, as part of his four-day seminar, illustrates the issue through the "Red Bead Game." The game is a physical simulation of a sequence of identically distributed random variables. The random variable is the number of red beads selected from a bead basket containing several thousand red and white beads. The paddle has 50 holes and is filled after much fanfare from the basket. Participants, "the six willing workers," take turns drawing samples from the basket. There is much to learn from the game, but the relevant issue is that it is not useful to ask why a worker has a particularly large number of red beads, because the answer does not depend on the worker at all.
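The game described above can be imitated in a few lines. This is only a sketch: the basket composition (800 red, 3,200 white), the four rounds per worker, and the function name `red_bead_game` are assumptions made for illustration, since the text says only that the basket holds several thousand beads and that the paddle has 50 holes.

```python
import random

def red_bead_game(n_red=800, n_white=3200, paddle=50, workers=6, rounds=4, seed=1):
    """Return counts[worker][round]: red beads on each paddle draw.

    Basket composition and number of rounds are hypothetical choices.
    """
    rng = random.Random(seed)
    basket = [1] * n_red + [0] * n_white   # 1 = red bead, 0 = white bead
    counts = []
    for _ in range(workers):
        # Every draw is taken from the same full basket (beads are returned),
        # so all workers face exactly the same distribution.
        counts.append([sum(rng.sample(basket, paddle)) for _ in range(rounds)])
    return counts

counts = red_bead_game()
for worker, row in enumerate(counts, start=1):
    print(f"worker {worker}: {row}  total = {sum(row)}")
```

Because the worker index never enters the sampling step, any ranking of workers by red-bead count in the output reflects chance alone, which is the point of the game.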
We need to engage in continuing education of our faculty to sustain the improvement. Some faculty might benefit from sabbaticals that allow them to study with nonstatisticians, and the forced reductions in summer moneys available to researchers may encourage some to spend part of their research time engaged in consultation. Our consulting laboratory, the Center for Statistical Consultation and Research, encourages faculty to provide several hours per week in consultation with other researchers. Graduate student employees of the center screen requests for consulting time so that it is more likely that the faculty time is productive. The result is often a collaborative research effort that yields a higher-quality product.
The departmental seminar series can be modified to broaden faculty focus. Seminar speakers from nonstatistical disciplines can be primed to initiate a dialogue that can lead to learning. The dialogue would generally require someone to keep the seminar moving forward, and there may need to be some discussion of the purpose of the dialogue.
The provision of reference material on the use of theoretical statistics in the broader context of learning would help, too. However, a radically new product is less likely to be published, and we need a mechanism to encourage such efforts.
Curriculum Principles
The curriculum described below focuses on what we should teach but does not say how. We expect that the topics selected are generally consistent with the backgrounds of the current faculty, but changes in the faculty are essential. Use of external resources including faculty in other fields, adjunct professors, video, and other sources of material would also enable a more enriched experience, and team teaching may also prove effective.
The general principles advocated for the curriculum include the following:

Courses should have material aligned for a common purpose.
As proponents of efficient learning, we need to bring coherence to the program. Each course should have a clear, stated purpose consistent with the overall aim, and context setting is especially important. Students benefit from a discussion that indicates how the course work fits with the overall aim.
There are many ways to achieve a common purpose, and we are not advocating a single curriculum for every department. Indeed, the needs of the customers will be better served with a richer variety of options.

Subject matter should be integrated into the entire curriculum.
Application of what we learn requires integration of ideas and skills. Our models and approaches to learning have become too compartmentalized. Remember, we want to enable students to apply their knowledge to situations beyond the limits of the course. An appreciation for some model requires that we know both when it is useful and contexts where it is not.
Rather than having courses devoted solely to large sample theory, for example, and having the large sample theory covered only in such classes, we propose that the material be integrated into virtually every class. History of statistics classes, too, should be integrated. The ideas should provide a foundation for the questions we ask and should establish a more scholarly approach to learning.

Subject matter should be useful at several levels.
Material should yield benefits at several levels, as some literature does. Lewis Carroll's Alice in Wonderland, for example, can be studied and restudied. Our students, and all students, should have the opportunity to harvest what they are capable of handling. Using material that can be revisited allows for efficiencies and accommodates variation between students.

The introductory class, Statistics and the Scientific Inquiry, involves lectures by a faculty person and a recitation section led by graduate students. The comments from the graduate students are that they learned as much from this course as they did in any other class. It is likely that the perspectives they bring to the material designed for freshmen allowed them to learn at a different level.

The curriculum should be more heavily weighted toward measurement and more generally toward design of the study rather than analysis.
Efficient learning requires a well-designed study. This means careful attention to the purpose of the study, the measurement process, and to the general design. Our courses need to reflect the fact that most studies are observational rather than randomized.
The current curriculum is too focused on analysis, and the measurement process is virtually always taken as given. Discussions of variation are based on analysis issues, and even though some design has been included in our introductory courses, much is still devoted to aspects of the analysis. Further, the analysis has been concerned with questions of statistical significance or the sampling variation associated with a certain repeatable process. Neither question is as important as we suggest, and, further, there are many more important questions that are not asked.

Models and other characterizations of results should attach more weight to physically based models and not be simply mathematical expressions of the analysis.
Our inclination is to seek mathematically simple models. Sometimes these models are not simple when seen through the eyes of a scientist. Physically meaningful models should be given at least as much attention as those that are mathematically based. The parameters should have physical meaning and, to the extent possible, describe (uncorrelated) features of the data.

Statistical theory has much broader implications than as a framework for data analysis. These broader implications should be built into the curriculum.
The major emphasis here is on the provision of questions. A Talmudic scholar once said that a good question is more than half the answer. Our theoretical models need to provide investigators with questions.
Some Suggestions for the Curriculum
The following sections provide some detail on what we should include in courses. This is not proposed as a comprehensive list but as a means of encouraging a different perspective from that we acquired when we were graduate students. The intention is to describe subjects other than data analysis that lack attention.
Measurement
Although we complain about the requests to perform analyses on poorly designed studies, we sometimes take for granted the quality of the measurement process, and this process of measurement is clearly an integral part of the design. Much of the material in our statistics texts and research papers begins with the measurement process as given, yet we have much to share about this process.
A measurement is defined by answering three questions: Why are we making the measurement? What is the measurement? How is it measured? The "why" question needs to be addressed because we want our system approach to learning to be aligned with a purpose.
The statistician has much to say about what is measured. Does the measure contribute to the purpose? How many dimensions are required to characterize the quantity? Measurement always causes us to focus on an aspect of the purpose and is thus an abstraction or model. The "what" needs to be returned to throughout the curriculum. At one level, we could discuss scale and units of measurement, and at another the dimension question can be raised.
Finally, we need to define the method used to measure. Each method will yield differences, and comparison is not possible without a common operational definition. Systematic changes arising from different methods of measurement can clearly be far greater than the sampling variation. And for a particular method of measurement we need to consider systematic biases. The discussion should not be relegated to a course in survey research but should be addressed throughout the curriculum.
Variation
Conventional courses discuss the measurement of variation in an abbreviated manner, place much attention on the standard deviation, and then describe the connection with probability through Chebyshev's inequality. In addition to these topics, the new curriculum should also emphasize the selection of an appropriate measure of variation and should describe the factors that influence the measure.
The selection of an appropriate measure of variation is like the choice of any measure. We find that students understand the measure if they are involved in establishing the purpose. Should the measure reflect the number of items studied or how different they are? If we add an additional source of variation, then should our measure reflect that fact? What units should the measure possess? Should the measure be simple to calculate and have a straightforward interpretation? Each question allows us to say something useful about standard measures such as the range and mean absolute deviation.
Use of the standard deviation with angular measures might provide a poor reflection of spread. The dependence on the choice of origin is clear. Variation in nominal data can be measured in many ways: an entropy measure or the homozygosity measure in genetics are just two examples. The questions raised in the preceding paragraph have broad implications, and students will benefit from the general discussion.
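The two points in this paragraph can be made concrete with a short sketch. The bearings and category counts below are hypothetical, the helper names (`recenter`, `entropy`, `homozygosity`) are introduced only for illustration, and the entropy shown is the Shannon form with natural logarithms, which is one common choice rather than the only one.

```python
import math
import statistics

def recenter(angles_deg, origin):
    """Express angles in the half-open interval [origin, origin + 360)."""
    return [((a - origin) % 360) + origin for a in angles_deg]

# Hypothetical compass bearings clustered around north.
angles = [350, 355, 5, 10]
sd_from_0 = statistics.pstdev(recenter(angles, 0))            # values 350, 355, 5, 10
sd_from_minus180 = statistics.pstdev(recenter(angles, -180))  # values -10, -5, 5, 10

def entropy(counts):
    """Shannon entropy (natural log) of the category proportions."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

def homozygosity(counts):
    """Sum of squared proportions: chance two random draws fall in the same category."""
    n = sum(counts)
    return sum((c / n) ** 2 for c in counts)

print(f"sd with origin at 0:    {sd_from_0:.1f} degrees")
print(f"sd with origin at -180: {sd_from_minus180:.1f} degrees")
for counts in ([25, 25, 25, 25], [97, 1, 1, 1]):   # hypothetical nominal data
    print(counts, f"entropy = {entropy(counts):.3f}", f"homozygosity = {homozygosity(counts):.3f}")
```

The same four bearings give very different standard deviations under the two origins, while the two nominal measures move in opposite directions as the counts concentrate in one category.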
Variation is, in general, increased with aggregation, differences in operational definitions, complexity, and overcontrol. The current curricula do pay substantial attention to effects of aggregation, but the other three generic sources of variation are given little attention. Demonstration of the effects of overcontrol is facilitated through the use of the Nelson funnel experiment.
In the experiment, beans are dispatched through a kitchen funnel placed above a "target." The funnel is held, initially, in place about 8 to 10 inches above the target. After a bean is dispatched and the point where the bean comes to rest is recorded, the funnel is either left in place, rule 1, or its location adjusted according to the outcome. The objective is to minimize the average of the distances between the rest positions of the sequence of beans and the target.
There are various adjustments. Rule 2 moves the funnel from where it last was by an amount needed to compensate for the last error. Rule 3 also examines the last outcome, but compensates from the target rather than the last funnel location. Rule 4 requires that the funnel be placed above the last point where a bean came to rest.
Rule 1 leaves the funnel in place above the target. Though it is clearly the optimal strategy, it is analogous to management sitting on its hands and is not what many people want to do. Rules 2 and 3 increase the average distance, but rule 4 is most commonly used. Rule 4 produces a sequence of distances that we can model as a random walk. The implications on the variation are clear to a statistician. What students find interesting is that the failure of many common practices can be seen through this process. Making the next house key from the last one and the children's game "telephone" are two examples. The purpose of these analogies, and of similar ones for rules 2 and 3, is to understand the impact of overcontrol on a stable (that is, stationary) process.
Learning how to reduce the variation in the funnel experiment is also generally informative. Three systematic changes are lowering the funnel, using a sticky material around the target, and cooking the beans. Each of these actions is not a response to the last outcome but does change the "common cause" of the variation.
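The four rules can be compared in a small simulation. This is a sketch under simplifying assumptions that are not in the text: the experiment is reduced to one dimension, the scatter of a bean around the funnel position is taken to be Gaussian with standard deviation `sigma`, and the function name `funnel` is hypothetical.

```python
import random

def funnel(rule, n_drops=2000, sigma=1.0, seed=0):
    """Mean distance from the target (at 0) over n_drops beans, in one dimension."""
    rng = random.Random(seed)
    position = 0.0   # the funnel starts directly above the target
    total = 0.0
    for _ in range(n_drops):
        rest = position + rng.gauss(0.0, sigma)   # where the bean comes to rest
        total += abs(rest)
        if rule == 2:
            position -= rest   # move from the last funnel position by minus the last error
        elif rule == 3:
            position = -rest   # compensate for the last error, measured from the target
        elif rule == 4:
            position = rest    # place the funnel over the last resting point
        # rule 1: leave the funnel in place
    return total / n_drops

results = {rule: funnel(rule) for rule in (1, 2, 3, 4)}
for rule, mean_distance in results.items():
    print(f"rule {rule}: mean distance = {mean_distance:.2f}")
```

Under these assumptions rule 2 yields resting points equal to the difference of two successive errors, so their variance is doubled, while rules 3 and 4 accumulate errors and drift without bound. Lowering `sigma` plays the role of the systematic changes mentioned above, such as lowering the funnel.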
Mathematical Statistics and Probability
The theory of statistics and probability will continue to play an important role in the curriculum. Teaching students about deduction, asymptotics, and abstraction is helpful in the construction of useful theory. However, the separation of theory and applied statistics is counterproductive. Our overall purpose requires that we diminish the barriers between areas and approaches to knowledge.
Asymptotic theory provides a case in point. Some students do not appreciate the importance of large sample theory to the modeling effort. The division also creates a potentially dangerous view of what we do, especially in the eyes of an administration that considers application an inferior investment.
The current curriculum in most statistics departments is, however, entirely too focused on hypothesis testing. We are not providing our customers with useful information when we do this, especially because we neglect far more important models. Like the theory of best linear unbiased estimation, the hypothesis testing theory is straightforward, fun, and easy to teach, but the view that we are sampling from a basket of balls does not provide a useful model for most applications. Indeed, it is much better for us to imagine that each observation comes from an entirely new basket and that our purpose is to learn about a basket that has not been observed. The "leap of faith" needed to make such inferences should be made explicit.
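The cost of treating a succession of baskets as a single basket can be illustrated by simulation. The model below is an assumption chosen for illustration only: each observation comes from a new basket whose mean is the previous basket's mean plus a Gaussian step with standard deviation `drift_sd`, and the usual normal-theory interval is judged by whether it covers the mean of the next, unsampled basket. The function name `coverage` is hypothetical.

```python
import random
import statistics

Z = statistics.NormalDist().inv_cdf(0.975)   # normal quantile for a nominal 95% interval

def coverage(drift_sd, n=30, trials=2000, seed=0):
    """Share of nominal 95% intervals that cover the mean of the next basket."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(trials):
        basket_mean = 0.0
        sample = []
        for _ in range(n):
            basket_mean += rng.gauss(0.0, drift_sd)   # a new basket for every observation
            sample.append(basket_mean + rng.gauss(0.0, 1.0))
        next_mean = basket_mean + rng.gauss(0.0, drift_sd)   # the basket not yet observed
        half_width = Z * statistics.stdev(sample) / n ** 0.5
        if abs(statistics.fmean(sample) - next_mean) <= half_width:
            covered += 1
    return covered / trials

coverage_single = coverage(drift_sd=0.0)   # one basket: the textbook setting
coverage_drift = coverage(drift_sd=0.3)    # a succession of baskets
print(f"single basket: {coverage_single:.3f}")
print(f"drifting baskets: {coverage_drift:.3f}")
```

With no drift the interval behaves roughly as advertised. With drift the same calculation covers the target far less often, and nothing in the sample itself announces the shortfall; that gap is the leap of faith.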
Analogy and Metaphor
The curriculum should also make more substantial use of analogy and metaphor to generate working models. The current process moves from assumptions and a deductive argument or from data to density estimation. We propose that analogy be used to generate models, too. Cognitive psychologists suggest that such learning is quite common.
Examples are plentiful, especially in portions of the curriculum dealing with deterministic systems. Newton's law, F = ma, connecting an action, F, with a reaction, a, through a proportionality constant, m, has analogies everywhere. Ohm's law, E = RI, and blood pressure = R × cardiac output are two examples. The force is a voltage or pressure on the left side, and the reaction is current or cardiac output on the right, and in these two cases the proportionality constant is resistance. Indeed, it is difficult to find a field without a first-order kinetic law.
What corresponds to such cases in statistics is not clear. There are abundant urn models, but many of the more exciting ones may not respond to simple techniques (that is, techniques that do not require some mathematical maturity). When we oversimplify, much of the flavor is lost. However, computer simulation to illustrate the behavior of solutions can be much more instructive than an analytic approach that can cause some students to lose the forest for the trees.
Asking students to suggest analogues of theorems is a useful learning device. It helps them appreciate the conditions and the implications of the result. In a freshman class, we at the University of Michigan have students describe situations in which a random walk might describe a process. They have little problem with the exercise, and the usefulness of the result is much clearer to them. For example, they have a deeper appreciation for the growth in the standard deviation when they ask what happens to a message that is passed from one person to another with some noise added each time.
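The growth in the standard deviation in the message-passing exercise can be written out in one line, assuming (as a simplification not stated in the text) that each pass adds independent noise with mean zero and common variance \sigma^2, and writing M_n for the message after n passes:

```latex
M_n = M_0 + \sum_{i=1}^{n} \varepsilon_i,
\qquad
\operatorname{Var}(M_n - M_0) = \sum_{i=1}^{n} \operatorname{Var}(\varepsilon_i) = n\sigma^2,
\qquad
\operatorname{sd}(M_n - M_0) = \sigma\sqrt{n}.
```

Under these assumptions the distortion grows with the square root of the number of passes rather than settling down.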
Graduate students in statistics may know Wald's identity but may not understand what it says as clearly as when they apply it to situations outside a sequential analysis class. Several years ago, a biologist colleague of mine, Julian Adams, returned from a AAAS conference that included a talk about preferences for male offspring and the impact on the ratio of sexes. In the talk, it was suggested that such preferences by Roman families increased the proportion of males in their society. Wald's identity is not common knowledge in our introductory classes, but even in graduate classes, students may have a difficult time making the transition from theory to use.
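A simulation sketch shows what Wald's identity has to say about the claim. The assumptions here are not from the talk: every birth is independently male with the same probability `p_boy` in every family, and each family stops at its first boy or after `max_children` births. Under those assumptions the identity gives E[boys] = p E[children] for any such stopping rule, so the stopping rule changes family size but not the proportion of males.

```python
import random

N_FAMILIES = 100_000   # hypothetical population size

def simulate_families(n_families=N_FAMILIES, p_boy=0.5, max_children=6, seed=0):
    """Each family stops at its first boy or after max_children births."""
    rng = random.Random(seed)
    boys = children = 0
    for _family in range(n_families):
        for _birth in range(max_children):
            children += 1
            if rng.random() < p_boy:
                boys += 1
                break   # a boy was born, so the family stops
    return boys, children

boys, children = simulate_families()
print(f"children per family: {children / N_FAMILIES:.3f}")
print(f"proportion of boys:  {boys / children:.4f}")
```

If the probability of a male birth differed between families, the conclusion could change; the sketch addresses only the identically distributed case.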
In an enlightening paper by Tversky and Kahneman (1974; see also Kahneman et al., 1982), we learn about other difficult transitions. These involve the regression effect and the law of large numbers. Even after studying the law of large numbers, many students may not know that a hospital with 50 births per day is less likely to show a wide disparity between the proportions of boys and girls born than is a hospital with fewer births. Even after a class on the regression effect, students may find a more complicated explanation for the subsequent success of a baseball player's use of a rabbit's foot after a poor start.
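Read in terms of proportions, the hospital comparison can be checked with an exact binomial calculation. The sketch assumes each birth is a boy with probability 1/2, takes "wide disparity" to mean that more than 60% of a day's births are boys, and uses a hypothetical smaller hospital with 15 births per day; none of these specifics come from the text.

```python
from math import comb

def prob_boys_exceed_60_percent(n_births):
    """Exact P(more than 60% boys) when each birth is a boy with probability 1/2."""
    favorable = sum(
        comb(n_births, k)
        for k in range(n_births + 1)
        if 10 * k > 6 * n_births   # integer form of k / n_births > 0.6
    )
    return favorable / 2 ** n_births

p_small = prob_boys_exceed_60_percent(15)   # hypothetical smaller hospital
p_large = prob_boys_exceed_60_percent(50)   # the 50-birth hospital
print(f"15 births per day: {p_small:.4f}")
print(f"50 births per day: {p_large:.4f}")
```

The smaller hospital sees such lopsided days more often, which is what the law of large numbers implies for proportions.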
Model Description and Presentation
The representation of models in algebraic terms also limits their usefulness. Geometrical and algorithmic expressions provide a richer symbolism, but much remains to be done in providing vehicles and strategies that convey understanding. Although statisticians have some interesting thoughts and excellent examples involving the presentation of a complicated story in simple terms, we need to develop a more general theory or set of principles for the presentation of results. Although much has already been accomplished (see Cleveland, 1993a, b; and Tufte, 1983, 1990), more progress in this direction is needed.
Bias
Compared to variation, systematic bias is given little attention in much of the curriculum, with the exception of the sample survey class. But this lack of attention is not reflected in the applications we encounter. Indeed, we act as though randomization, blinding, and other experimental techniques eliminate bias, and we know that they do not do so. In the introductory classes it is important to identify the various forms of systematic bias and when possible to provide a quantitative analysis of the impact of these biases.
Selection biases, including size- and length-biased selection, are particularly important. The direct effects can be developed easily, while the indirect effects can be introduced for identification purposes only. For example, if we want to learn about the average area of farms from a sample of farms selected by choosing farms hit by a dart thrown at a map, we can see that large farms are more likely to be included than smaller farms. Similarly, the indirect effect situation is characterized by a screening program in which individuals with longer latency periods for the disease are more likely to be selected than those with a shorter latency period.
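The dart example lends itself to the kind of quantitative analysis suggested above. In this sketch the farm areas are hypothetical (drawn from a lognormal distribution), and a dart is assumed to hit a farm with probability exactly proportional to its area. Under that assumption the reciprocal of a sampled area has expectation equal to one over the population mean area, so the harmonic mean of the sampled areas serves as a simple correction.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical farm areas in hectares: many small farms, a few large ones.
areas = [rng.lognormvariate(3.0, 1.0) for _ in range(5000)]
true_mean = statistics.fmean(areas)

# A dart lands on a farm with probability proportional to its area.
hits = rng.choices(areas, weights=areas, k=20000)

naive_mean = statistics.fmean(hits)               # ignores the size bias
corrected_mean = statistics.harmonic_mean(hits)   # undoes selection proportional to area

print(f"true mean area:      {true_mean:.1f}")
print(f"naive sample mean:   {naive_mean:.1f}")
print(f"harmonic-mean fix:   {corrected_mean:.1f}")
```

The naive average overstates the typical farm by a wide margin, while the corrected figure lands close to the population mean.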
The impact of a low sampling rate is described in upper-level time-series classes, but the effect of aliasing is of general importance. Simple versions of it can be presented in introductory classes to indicate the impact, and then a mathematical description can be given in the advanced classes.
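A simple version suitable for an introductory class might look like the following sketch. The frequencies and sampling rate are hypothetical: a 9 Hz sine wave sampled 10 times per second produces exactly the same readings as a sign-flipped 1 Hz sine wave, because sin(2π·9·n/10) = −sin(2π·n/10) for every integer n.

```python
import math

sample_rate = 10.0        # samples per second (hypothetical)
fast, slow = 9.0, 1.0     # signal frequencies in hertz (hypothetical)

times = [n / sample_rate for n in range(20)]
fast_samples = [math.sin(2 * math.pi * fast * t) for t in times]
slow_samples = [-math.sin(2 * math.pi * slow * t) for t in times]

# Largest difference between the two sets of readings at the sampled times.
max_gap = max(abs(a - b) for a, b in zip(fast_samples, slow_samples))
print(f"largest difference at the sampled times: {max_gap:.2e}")
```

From the sampled values alone the fast signal cannot be told apart from the slow one, so a low sampling rate can manufacture an apparent slow cycle that is not in the process.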
Generally, we act as though the samples we select are drawn from the "basket" that represents the target for our inference. This notion has little foundation and causes others to see statisticians as less useful. The textbooks are filled with calculations based on this premise yet have little helpful information to assist us in the "leap of faith."
Sense Deception
Even when students have been exposed to a wide variety of statistical concepts, they may not be able to identify everyday situations in which these concepts are applicable. Tversky and Kahneman provide many examples of this with special attention to the law of large numbers and the regression effect (see Tversky and Kahneman, 1974; Kahneman et al., 1982). Also important are aggregation and coincidence.
There are many examples of the aggregation effect in our standard textbooks. Students find it difficult to appreciate the impact of a third variable on the relationship between two variables studied. Since the third variable may not be known, conclusions regarding the relationship between the other two could under certain circumstances be reversed. It is curious that in spite of these examples, disaggregation techniques are not more prevalent in our courses. Of particular importance is a paper by Joiner (1981), who provides several examples that offer clues to the existence of lurking variables and how conclusions need to be reconsidered.
Design of Observational and Randomized Experiments
Much of the data we study arises from observational rather than randomized studies. Such studies come in retrospective and prospective varieties, and our students need to know how to identify the type of study and decide what are appropriate questions to raise.
We also need to reexamine the purpose of our studies. Much attention is placed on attempting to learn about differences that we know exist anyway. There are many other purposes that require different approaches. Understanding what a relatively flat or sharp response surface entails is one example.
Pedagogy
Grades
Students may not be focused on everything we talk about in class, but they certainly pay attention to the bottom line, the final grade. They want to know how they will be graded and what is required to achieve a high grade. However, we know that there is a conflict between learning and preparation to obtain a high grade. Test preparation activities encourage short-term memory, cause students to compete rather than cooperate, cause them to focus on doing the minimum required to achieve a target, and create general unhappiness with learning.
The unhappiness resulting from the grading process stems, in part, from the use of grades as an incentive for an activity that is intrinsic. Learning is a desire that virtually all are born with. Incentives such as pay for performance create an alternative purpose that can undermine the intrinsic desire for the activity of learning. Even students who receive a high grade may not be happy if they feel that the reward is overjustified.
The grade incentive does, however, facilitate classroom management. Students become quite attentive when they hear, "The test will include …" But the short-term advantage of grades as a means to get students to focus in class undermines the long-term goal of continual learning.
Grades do provide feedback. However, feedback is more effective when it leads to enhanced learning; grades tell us something about where we were and not how to get ahead effectively. Alternatives to grades are provided below.
Mastery
Conventional tests, midterms and finals, are often inspections of what has or has not been learned. The usefulness of the results is limited, and the final examination comes too late to provide students with help on the current material. It is seen as a statement of closure rather than as a learning opportunity.
There are several alternatives to these tests. Only one alternative—the evaluation of mastery—is described here. The approach is similar in spirit to the Keller "Personalized System of Instruction" program (see Keller, 1968). We at the University of Michigan continue to experiment with variations on the implementation of this evaluation in our class, Statistics and the Scientific Investigation.
In this course, there is a natural division of material into ten fundamental principles. Demonstration of mastery requires that students first complete prerequisites and then demonstrate that they have the appropriate facts, understand the principle, can apply the principle to a situation that is unique, and can "explain" each of these steps in a clear fashion to an "educated layperson." The results are evaluated as "mastered" or "not yet." Masteries are not failed, and when students receive a "not yet" designation they can retake versions of a particular principle.
The prerequisite serves to place some of the learning responsibilities on the students' shoulders. However, students sometimes choose to take a mastery evaluation with a narrower sense of what is wanted than the lecture provided. This wastes the teaching fellows' energy and undermines learning.
Initially, a homework assignment on related material served as a prerequisite. Students needed to complete this ungraded assignment before they were allowed to take the mastery evaluation. However, feedback on the homework is needed to focus attention on students' weaknesses. Software is being developed to provide online tutorial help with each principle.
Journal
If the purpose is to teach students to apply what they have learned to entirely new situations, then the evaluation ought to include such cases. University of Michigan students apply what they have learned to a collection of newspaper, magazine, and scientific articles of their choice. Each page of their student journal focuses on a principle.
Since we want the activity to be exciting, students choose the articles, but we prescribe the format. Each journal entry identifies a principle, asks a question based on the principle, and then provides an explanation as to why the question could lead to a different perspective.
For example, a recent headline in a local paper indicated that whereas 40% of black applicants had applications for property insurance rejected, only 10% of nonblack applicants had their applications rejected. A student asked whether family income was a factor. He argued that rejection rates might be higher for nonblack applicants among both poor and nonpoor applicants, yet appear lower for nonblack applicants when the two income groups are aggregated. The student then indicated with a numerical example how this could happen.
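A numerical example of the kind the student offered can be sketched as follows. The counts are hypothetical, chosen only so that the aggregated rates match the 40% and 10% in the headline; they are not the student's figures or data from the article. The function name `rejection_rate` is likewise an illustrative choice.

```python
from fractions import Fraction

# Hypothetical counts for illustration only: (applicants, rejected).
# Black applicants are mostly poor; nonblack applicants are mostly nonpoor.
counts = {
    ("black", "poor"): (800, 384),
    ("black", "nonpoor"): (200, 16),
    ("nonblack", "poor"): (100, 50),
    ("nonblack", "nonpoor"): (4000, 360),
}

def rejection_rate(group, income=None):
    """Rejection rate for a group within one income stratum,
    or aggregated over both strata when income is None."""
    cells = [
        cell for (g, i), cell in counts.items()
        if g == group and (income is None or i == income)
    ]
    applicants = sum(a for a, _ in cells)
    rejected = sum(r for _, r in cells)
    return Fraction(rejected, applicants)

for income in ("poor", "nonpoor", None):
    label = income if income is not None else "all incomes"
    black = float(rejection_rate("black", income))
    nonblack = float(rejection_rate("nonblack", income))
    print(f"{label:<12} black: {black:.0%}   nonblack: {nonblack:.0%}")
```

With these counts, the rejection rate is higher for nonblack applicants within each income group (50% versus 48% among the poor, 9% versus 8% among the nonpoor), yet the aggregated rates are 40% for black applicants and 10% for nonblack applicants. The reversal arises because the two groups are distributed very differently across income levels.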
Although we have had some problems, especially with some dishonest students, the results of this effort are the most exciting byproduct of the changes made.
Feedback to Instructors
The purpose of feedback on students' understanding is to provide both the student and the instructor with direction. The feedback should be frequent, it should provide an opportunity to improve, and it should focus on a demonstration of learning and not on the accumulation of facts. Student evaluation of teaching at term's end is useful but is too little and too late.
The one-minute paper proposed by others can be adapted to improve teaching. By providing comprehensive timely feedback, it causes students to focus on the lecture, and it helps instructors to provide a firm foundation.
We ask students two questions during the last minute or two of class. First they identify questions that they believe they can now answer as a result of the material covered, and then they identify questions that they would like us to answer to help them understand the material at a deeper level. Feedback based on their responses is provided at the beginning of the next lecture.
Feedback to students must be timely and organized. A Pareto chart that displays the frequencies of responses from most to least frequent can be used to summarize the information. The few most frequent response categories can be reviewed at the start of each class. Subsequent revisitings of a topic can be made more efficient by orienting the discussion in a manner that anticipates the problems.
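The tabulation behind such a Pareto chart might be sketched as below. This is a minimal illustration that assumes each student response has already been coded by hand into a topic category; the category names, the sample responses, and the function name `pareto_table` are hypothetical.

```python
from collections import Counter

def pareto_table(responses):
    """Return (category, count, cumulative percent) rows,
    ordered from most to least frequent category."""
    counts = Counter(responses)
    total = sum(counts.values())
    table = []
    cumulative = 0
    for category, n in counts.most_common():
        cumulative += n
        table.append((category, n, round(100 * cumulative / total, 1)))
    return table

# Hypothetical coded responses from one class session.
responses = (
    ["confidence intervals"] * 5
    + ["sampling bias"] * 3
    + ["randomization"]
    + ["scaling"]
)

table = pareto_table(responses)
for category, n, cum_pct in table:
    # A text bar stands in for the chart itself.
    print(f"{category:<22}{'#' * n:<6}{cum_pct:>6}%")
```

The cumulative percentage column shows how much of the class's difficulty is covered by reviewing only the first few categories, which is the reason to order the chart by frequency.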
Discovery
Focusing on the lecturer is not a long-term solution. What we need to provide are fishing techniques rather than the seemingly more expedient serving of fish. One approach is through the process of discovery. Although discovery is seen as a slower process for learning prescribed material in the short run, it can facilitate learning when lectures end.
Courses on design of surveys and experiments, as well as probability classes, are prime candidates for discovery. In the survey design class and experimental design classes at the University of Michigan, students design surveys and experiments and carry out the design. Obtaining firsthand information on what could go wrong is more effective than listing potential sources of error. In analysis too, the appreciation of interactions is enhanced when the interactions are discovered. Probability classes, when coupled with a computer laboratory, provide an excellent opportunity for discovery. Appreciation for scaling, in particular, is greatly enhanced when students experiment with the choice of scales.
Group Learning
Learning is enhanced when students can share. Expressing what they think they understand to others and hearing what others can contribute are both worthwhile activities. Each requires some interaction with an expert, but the interaction should be limited.
Summary
University classes should emphasize the process of learning rather than the presentation of facts. We need to view what we do as educators as distinct from training, which should be done on the job. Students want to learn something that they view as useful both today and tomorrow, and the resulting tension should be seen as an opportunity to improve what we do in our statistics programs.
We should view education in statistics as part of the system of education. It is essential that what and how we teach, and the connections or interactions between statistics and the other parts of the system, be aligned for this common purpose. To view statistics education as separate from university education is the antithesis of the proposal.
This paper has focused only on aspects of this transformation; much remains to be completed. The proposed transformation involves changes in some courses, a major emphasis on design of studies, and a substantial focus on critical reasoning. And the way we teach needs to change, too, from a lecture format to discovery, group learning, and a general recognition of variation among our students and faculty. Finally, we need an honest examination of the way we evaluate students and a replacement for grades.
Although enrollments in statistics classes at the University of Michigan continue to grow, the growth is not broad based, and few of the students choose graduate school in statistics. Standard classes have remained stable or have declined, while courses that focus on the role of statistics in learning grow at an incredible rate. Statistics and the Scientific Investigation grew from 80 students per semester to over 250 in four terms.
References
Cleveland, W. 1993a. The Elements of Graphing Data. 2nd ed. New York: Van Nostrand Reinhold.
Cleveland, W. 1993b. Visualizing Data. Summit, N.J.: Hobart Press.
Deming, W. E. 1990. Sample Design in Business Research. New York: Wiley.
Deming, W. E. 1993. Quotations of Dr. Deming: The Little Blue Book. 2nd ed. S. McCrea, ed. Fort Lauderdale, Fla.: South Florida Electric Auto Assn. 128 pp.
Gnanadesikan, R., and J. R. Kettenring. 1988. Statistics teachers need experience with data. Coll. Math. J. 19:12-14.
Joiner, B. 1981. Lurking variables: Some examples. Am. Stat. 35:227-233.
Kahneman, D., P. Slovic, and A. Tversky, eds. 1982. Judgment Under Uncertainty: Heuristics and Biases. New York: Cambridge University Press.
Keller, F. S. 1968. Goodbye, teacher. … J. Appl. Behav. Anal. 1:79-89.
Tufte, E. R. 1983. The Visual Display of Quantitative Information. Cheshire, Conn.: Graphics Press. 197 pp.
Tufte, E. R. 1990. Envisioning Information. Cheshire, Conn.: Graphics Press. 128 pp.
Tversky, A., and D. Kahneman. 1974. Judgment under uncertainty: Heuristics and biases. Science 185:1124-1131.
Velleman, P. F., and L. Wilkinson. 1993. Nominal, ordinal, interval and ratio typologies are misleading. Am. Stat. 47:65-72.