Some International Perspectives
S. Rao Jammalamadaka
University of California at Santa Barbara
My aim is to share with you some thoughts on statistical education and, specifically, bring in some international perspectives. Before I do this, I should tell you a bit about my background. My entire statistical education (bachelor's, master's, and PhD degrees) was at the Indian Statistical Institute, Calcutta, under such people as C. R. Rao, P. C. Mahalanobis, and J. B. S. Haldane. I have been teaching for the past 25 years, mostly in the United States, but I have also had opportunities to teach and observe statistical education at various universities and institutions in India, Sweden, and Australia.
There is no argument that statistics is a very interdisciplinary subject. Statistics derives its strength, usefulness, and its research motivation from applications and the practical problems that arise in other fields. Therefore, I believe that, for them to become effective statisticians, statistics students need to be trained not only in statistics but also in one or more substantive disciplines. To this end, I advocate the idea that statistics should be offered at the undergraduate level, exclusively or mostly as a "double major," in conjunction, say, with computer science, economics, biology, environmental science, sociology, psychology, and so forth. In this connection, I will describe in the next section some of my own experiences as a student.
At the master's and PhD levels, students still need to learn "theory" alongside the "practical" methods. The question as to how much of each, the "balance" issue, depends on what these students are training for. Given a finite amount of time and a finite number of courses that students can be asked to take, and the almost unlimited variety of topics available, we have to be very selective in what we teach. I shall briefly describe in the second section below our experiences in this context at the University of California, Santa Barbara (UCSB).
Statistics—Theory versus Applications
I believe in the theme that statistics is a "technology," much like medicine or engineering, and not an art or a science. Mahalanobis (1965) distinguishes science as the "effort to know nature more adequately" (that is, learn the truth—often for its own sake), and technology as "the effort to use scientific knowledge for the fulfillment of specific purposes." He also says that a technologist, unlike a scientist, must have a "knowledge and experience of a wide range of scientific subjects," and he quotes Fisher: "a professional statistician, as a technologist, must talk the language of both the theoreticians and the practitioners."
NOTE: A substantially similar version of this paper was published in the Bulletin of the International Statistical Institute 54, Book 1, 3/7, 1/9. This updated version is reprinted here with the permission of the ISI and the author.
Fisher, Mahalanobis, and Haldane were, in the main, responsible for formulating the degree programs and courses that have been initiated and taught since 1960 at the Indian Statistical Institute, Calcutta. This institute is truly a university with statistics at the center—a concept to which Carl Morris alluded earlier. I was among the first group of students selected for the 4-year professional statistics degree program, the B. Stat. (as contrasted to the 3 years it normally takes to complete a bachelor's degree in India). This very ambitious program's objective was "to offer comprehensive instruction in the theory and practice of statistics and provide at the same time a general education, together with the necessary background knowledge in the basic natural and social sciences, expected of a professional statistician." Students were taught a multitude of subjects where statistics finds applications. They included biology, economics, physics, chemistry, genetics, sociology, engineering, demography, and so on. Practical aspects of statistics were emphasized just as much as the theory. For a more detailed description of this curriculum, see Rao (1969).
Even before we students knew what statistics was about, we were taking part in a post-census survey for the 1960 decennial census and trying to understand such basic notions as "household," "family income," and how crucial such an understanding is for the survey to be meaningful. One of the class projects consisted of extrapolating the 1960 census figure from all the previous censuses and the post-census survey data. One biology project under Haldane, of which I have a vivid recollection, was to count and record the number of petals on hundreds of flowers of a particular species, so as to learn not only the botanical aspects of these flowers but also statistical concepts including frequency distributions, variation in nature, and so forth. Later, after learning a bit about sampling, we were involved (knee-deep in mud, literally and figuratively) in a "crop estimation" survey. The goal of these projects was to prepare statisticians who could analyze data sets and also have a good understanding of the context.
This bachelor's degree program demanded that, in addition to mathematics and statistics, the student learn a great deal about the wide variety of substantive areas. While the program's goal of training statisticians with a broad understanding of the applications was laudable, I believe it was far too ambitious. What now remains at the Indian Statistical Institute is but a skeleton of this idealized experiment at training well-rounded statisticians.
Although the master's degree program was very mathematical, my PhD thesis research tackled a problem in paleocurrent analysis, involving directional data, brought up by a geologist at the Indian Statistical Institute. When this geologist later inquired about a sampling strategy for the paleocurrent data, I was sent to the "field," a very inaccessible forest where he was collecting such data, so that I could more thoroughly understand the context in advising him.
However, from the foregoing description of the Indian Statistical Institute of the 1960s, which I consider to be ideal training for statisticians, I do not want you to get the impression that statistical education is being optimally managed at all Indian universities. Unlike the Indian Statistical Institute, most Indian universities suffer, as do most places, from lack of resources, both financial and intellectual.
For U.S. universities, I strongly advocate a more modest scale of incorporating one or more application areas at the undergraduate level—say, as a "double major" in statistics along with another substantive area such as biology or economics. In a bachelor's degree program, students should spend approximately equal amounts of time on the four subjects of mathematics, computing, statistics, and application areas.
At the master's degree level, it should be possible to separate the students into at least two streams, depending on their goals and interests. For instance, a "Mathematical Statistics" stream would be for those who wish to go on to a PhD, and an "Applied Statistics" stream for those who wish to make the master's a terminal degree and join government or industry. These latter students should be exposed to a wide range of statistical techniques and methods, such as applied multivariate analysis, categorical data analysis, some biostatistics including survival analysis, sampling methods, and time series. They should also be able to effectively use statistical packages for data analysis, be able as consultants to solve general statistical problems from industry and government, and be able to communicate with other scientists. There is an excellent report by the Statistical Education Section of the American Statistical Association on this type of training (ASA, 1980). At UCSB, we found that students who obtain their BA or BS degree in one of the substantive areas (for example, biology or psychology) are equally successful at pursuing this stream, if they are willing to make up deficiencies. This "retro-training" of substantive discipline majors for graduate study in statistics ties in closely with my recommendation of double majors.
Students in the more theoretical stream should take additional courses in mathematics, probability, and statistical theory, as appropriate to their chosen research topic. However, I firmly believe that students in the PhD degree programs should be encouraged to develop considerable breadth in statistics and familiarity with applications and computing, before they are allowed to specialize in their chosen research topic. This might be achieved, for example, through various course requirements or "qualifying examinations." To obtain a PhD in statistics, a student should have working knowledge at the basic level of, in addition to computing, various areas of statistics such as survey sampling, design of experiments, and time series. Students should not be awarded the doctoral degree in statistics for abstract exercises in "statistical mathematics" having no connection or even potential applications to analysis of real data.
Students at every level of statistics education should get the sense that the subject is driven by applications. The best place to teach such applications, whether in an elementary or an advanced course, is alongside relevant theory. Teachers do not always find this to be easy or convenient. Most feel comfortable with tossing coins and dice for "concrete" examples. It seems that the training (or retraining) should actually begin at the faculty level, with the faculty developing interests in some application areas. This can be achieved through joint appointments or by statistics faculty going on temporary assignments to government or private agencies, so as to acquaint themselves with real-world problems. Other ways for statisticians to keep in touch with reality include participation in a statistical consulting laboratory at the university, and/or private consulting. The statistical laboratory can also be an educational tool for the students and an extremely useful resource for the rest of the university.
Whether statistics is a separate department or resides in the mathematics department, building bridges between statisticians and the substantive disciplines, which sometimes teach their own courses on statistics and have a slightly different agenda, can be a rather delicate task in most U.S. universities. It is crucial that statistics departments find common ground with them in order to build both joint educational and joint research programs. At UCSB, such interaction is encouraged by "adjunct" appointments being provided in our group for those faculty from the substantive disciplines with statistical interests. If some become frustrated in making such efforts, consider that in Sweden, the split between applied and mathematical statistics groups is somewhat more formalized and traditional than in the United States. There, many universities have two separate departments with no apparent interaction. I strongly believe that such a state of affairs is detrimental to both the groups.
The Program at UCSB
At UCSB, the Department of Statistics and Applied Probability currently offers programs for (1) a BA and BS degree in statistical science; (2) an MA in statistics with three possible specializations: applied statistics, mathematical statistics, and operations research; and (3) the PhD in statistics.
The two bachelor's degree programs require a substantial amount of course work in mathematics, computing, and statistics. However, at present, the programs do not require course work in application areas such as I recommended above. We hope to introduce several double majors, where, for instance, statistics majors will also study biology, economics, or computer science. The BA degree requires 10 one-quarter courses in statistics (30 fifty-minute sessions of instruction) at the upper-division level, while the BS degree requires 13 such one-quarter courses. An abbreviated, categorized list of courses (all of which are one quarter unless otherwise indicated) is as follows:
Probability and Mathematical Statistics (3 quarters)
Design and Analysis of Experiments
Statistics in Industry
Ranking and Selection Methods
Applied Stochastic Processes (2 quarters; includes some time-series)
Actuarial Statistics and Risk Theory (3 quarters)
Internship in Statistics
Independent Studies in Statistics
The Independent Studies in Statistics course is a vehicle for learning any other topic that is not taught as a regular course. The Internship in Statistics encourages students to participate in a faculty-supervised, academic internship in industrial or research firms in the area. This is an excellent way for the students as well as faculty to relate the academic course work to real-world problems. Faculty who supervise such interns and their projects should receive appropriate teaching credits, if not other considerations and rewards, when they are evaluated for merit raises and promotions; however, we have not yet been able to arrange for that at UCSB. A reduced teaching load is given to the faculty member who runs our statistics consulting laboratory.
I should also mention in this connection the very successful annual "Careers Day" held by the Southern California Chapter of the American Statistical Association, in which our students participate. This is an excellent way for students to learn about what jobs are available and what kinds of course work prospective employers look for. This annual "Careers Day" is now being duplicated in Sweden by the Swedish Statistical Society's Educational Committee.
An MA degree with any of the three specializations requires about 11 one-quarter courses (mostly graduate level, but with a few approved upper-division undergraduate courses allowed). An abbreviated list of currently offered graduate courses (all of which are three quarters unless indicated otherwise) includes:
Statistical Decision Theory
Life Testing and Reliability (1 quarter)
Case Studies in Operations Research (1 quarter)
Advanced Statistical Methods
Seminars and Projects in Statistical Consulting (1 or 2 quarters)
Advanced Probability Theory
Advanced Stochastic Processes
There is also a catchall course called Seminars in Probability and Statistics, in which topics (generally advanced) vary with the instructor's interests.
Students in the Applied Statistics master's program take the Advanced Statistical Methods course that covers a wide range of topics (not always with formal proofs), and they have to do an internship in the statistics consulting laboratory for a quarter or two. Here they participate in actual consulting under supervision and write a project report on their statistical analysis and conclusions. This is where the students are exposed to the "inconvenient" real-world problems that may violate traditional assumptions, may have missing data, and may indeed call for the development of new theory. The Operations Research specialization requires a similar case-study project after a year-long graduate course in operations research has been taken. Students in the Mathematical Statistics specialization take the more theory-oriented courses (Statistical Theory, Probability Theory, and so on). These students are encouraged to keep up their interest in an application area by academic credit being provided for appropriate courses taken outside the statistics department.
To make sure that the PhD students have the appropriate breadth, they are asked to take "qualifying examinations" in any four of the following six areas: mathematical analysis, mathematical statistics, applied statistics, operations research, stochastic processes, and probability theory. There are still two foreign-language requirements, and one of them can be fulfilled by demonstrating proficiency in a computer language. Almost all the students take this option and thereby do a rather substantial exercise in computing.
Although there is no unique formula that works for training all categories of statisticians, I believe there is general agreement that statistical training as well as statistical research should be driven by applications. We statisticians are in the fortunate position of being able to motivate all the theory we teach through real-world applications, and this synergy between theory and applications should be conveyed to our students. A good understanding of the theory along with its limitations, an appreciation for applications, and a familiarity with statistical computing should form the basic themes for any university degree in statistics. To this end, I recommend the idea of double majors at the undergraduate level, the establishment of an active statistical consulting laboratory, internships in statistics, and coordination with the local industry and government.
Since statistics is a discipline that is still evolving and changing, it is imperative that the courses and curricula be reexamined on a periodic basis and updated as necessary. For instance, the tremendous power of the computer now at our disposal has revolutionary implications for statistics and should be taken advantage of. Computer-intensive methods such as resampling techniques (including bootstrapping), iteratively reweighted least squares, density estimation, simulations, dynamic and interactive graphics, and so forth have all become topics of great interest. See, for instance, Speed (1985) and Tierney (1990). These topics, as well as a host of new and emerging theoretical areas such as inference for stochastic processes, robustness and influence functions, directional data analysis, spatial statistics, image reconstruction, interacting particle systems, and so on, should be incorporated into statistics curricula alongside the more traditional courses that are already offered.
American Statistical Association (ASA). 1980. Preparing statisticians for careers in industry: Report of the ASA Section on Statistical Education, Committee on Training Statisticians for Industry. Am. Stat. 34:65-67.
Mahalanobis, P. 1965. Statistics as a key technology. Am. Stat. 19:45-46.
Rao, C. R. 1969. A multidisciplinary approach to teaching statistics and probability. Sankhya, Ser. B 31:321-340.
Speed, T. P. 1985. Teaching statistics at the university level: How computers can help us find realistic models for real data and reasonably assess their reliability. Pp. 184-195 in Teaching Statistics in the Computer Age. L. Råde and T. Speed, eds. Lund, Sweden: Studentlitteratur.
Tierney, L. 1990. LISP-STAT: An Object-Oriented Environment for Statistical Computing and Dynamic Graphics. New York: Wiley.