What the Government Needs
N. Phillip Ross
Environmental Protection Agency
I will talk a bit about what a statistician does in the federal government. Of course, the federal government involves a very broad spectrum of activities. The Environmental Protection Agency is a regulatory agency, and most of what EPA focuses on in the statistical area is public policy decision making.
By the Office of Management and Budget's definition, federal statistical agencies are in effect agencies that deal with public policy information. This means, in 80 or 90 percent of government activities, the idea of what statistics is for and what statisticians do differs from what I learned at the University of Maryland.
In recruiting, we look for statisticians that are persons for all seasons. We would like an applicant to be able to do everything and be a specialist in everything, not only a jack of all trades, but also a master of them all. But I doubt that being both a universalist and a specialist is possible. Nor is it possible to know everything and to become, for example, a universal chemist or physicist; yet you must become the expert, the person to whom everybody comes. Even in the policy arena, EPA looks for training and experience in working as part of a team, and especially so for a statistics PhD. An individual should not come into an organization thinking that he or she has to be an expert in every aspect of that field as well as in the substantive knowledge associated with the social and hard sciences needed to develop information systems, organize the data, and do the analysis for appropriate decision making.
Being on a panel as a client is a unique position for me. I am usually convening panels of my clients and asking them what it is that we should be providing them. Here I can instead say, "This is what I would like you to provide us," and I do not have to tell you how you are going to do it. As Peter Bickel said, I do not know how you would do it.
It is not necessarily the role of the university, in its training programs, its core curriculum, and its development of the degree, to provide this teamwork and communication ability. However, there are opportunities in the university program and process (cooperative agreements, internships, and other such things), so that such training does not have to be squeezed into classes. I teach a statistics tool course at American University, and I argue with the department that two semesters of it are needed rather than one, or rather than trying to squeeze more into the one semester. So I know firsthand that finding time to do it all is already a difficult thing.
The most important thing I would like to see is people emerging from graduate school understanding that they are going to play on a team, and knowing how to communicate in that team setting. That is very difficult for people, especially those majoring in mathematics and statistics. As an example from my own experience, after I had worked on my dissertation research for several months, it seemed rather trivial to me. I thought it certain that in my oral exams, the committee was going to ask, "Why did you bother to pick this problem; it is quite evident what the answer is." They did not. In fact, when I was considering teaching at universities and circulating the dissertation, one university where the operations research department was interested in hiring a statistician sent my dissertation to the mathematics department because it was too complex for them to understand. And the mathematics department's response was, "We did not have enough time to go into it in depth, so perhaps Dr. Ross could talk about what he is doing right now as opposed to what is in his dissertation." As a recent PhD graduate, fresh out of school and with no experience, I found that wonderful. They did not understand it and I did, and I thought it was trivial. The education that directed me, the competition that I had to go through as a graduate student, made me want to express something that I knew was right, but that others had difficulty understanding. It made me feel much better about myself.
When I went to work, initially in the private sector and then in the government, I remember talking with managers in a roundtable about an approach, of which I said, "If you take that size sample, then you cannot infer." The person managing this roundtable said, "Hold it; speak English." Prior to that, I had not known that using "sample" and "infer" would pose a problem of understanding for some people. Although most people should understand that terminology, they do not for one reason or another. One of the critical things I have learned through experience in the federal government is to be careful about language, to be careful about jargon, and to become a member of a team that solves a problem as opposed to showing off my technical mastery, so to speak. Currently, statisticians who are hired straight out of the university must learn that on the job. Some do and some do not. As indicated earlier for industry, we at EPA also can no longer afford the time for new hires to acquire that training on the job. Not only is industry cutting back in hiring; so also is the government. Having experience in teamwork is very important, especially with respect to communications. We want people who can communicate easily and across disciplines, and who can make the other team members understand what statisticians are trying to tell them.
Another desirable quality is an entrepreneurial element in the individual. This entails understanding that the problems you are going to solve do not come knocking at your door asking you to solve them. The people with whom you will work do not necessarily know what you can do for them, and will not be coming out of their offices to ask you to solve some problem. In some government agencies, that did take place. I initially worked in the Army Research Institute, and we sat in our offices doing what we wanted until the telephone rang and somebody said, "We have a problem; come out and help us solve it." We did, and there was never a lack of work to be done. But in the public policy area, that is not the case. The managers and others are not necessarily sure of what you can provide them.
In particular, upper management in the government have not the foggiest idea what is going on below them in terms of the information needs that they have. Statisticians need to understand that. As an example of this at EPA, a few years back one of the assistant administrators for water gave a talk at a statistics conference. She was known for not having much use for statisticians, so we invited her to find out why. It was not a milieu of animosity or antagonism. She liked us, but just did not see what we did for her. She gave one of her reasons as an example. She had to advise the Army Corps of Engineers on whether or not they should be allowed to build a dam on a river, given some environmental impact concerns. This advice had to be given to the Army Corps of Engineers in the next three months, and they would then act on it. Since statisticians obviously deal with data and provide information, she brought in statisticians from her staff and asked them to provide her with as much information as they could so she could make that decision. The answer from the statisticians was that the data that had been collected were not very good; the monitoring systems were set up to measure compliance with the law, not the state of the environment or the impact that might occur. They said if they were given a couple of million dollars, and about two years, they would set up a survey design and monitoring system, and would come back and tell her what to tell the Army Corps of Engineers. This was not of much help to her, of course, because she had to make the decision in the next three months.
The students who intern at EPA do not understand that aspect of policy. They need this experience, to be confronted with this type of situation and individual. What is the role of statisticians in this policy office, and what should be their attributes? They should be people who would say, "Let us take a look at this; we will see what we can do." They should assemble a team of experts, because they are not the only experts, and do the best they can within the time constraints, explaining the limitations of the information, and cautioning the decision makers. Decision makers understand uncertainty. They do not necessarily understand it in the quantitative sense that statisticians may, but they do understand it.
The flip side of this argument is that if you simply throw up your hands and say, "It is all garbage, you cannot do it, call somebody else," they will call somebody else; and that somebody else will do something for them and give them an answer. The next thing you know, the statisticians who said, "We cannot help you," start criticizing that answer. That is not a very good or popular way to operate in a public policy arena, and here is where statisticians run into trouble. Part of that trouble is that, outside of an academic world or outside of a research laboratory with which they might be affiliated in the university, they have had no experience or training in what their role would be in the real working world.
A statistician should be trained to use statistical thinking on the job and to be a problem solver, not to come to an organization or a situation with a toolbox of methods, and perhaps the mathematical training to innovate and modify those methods, but needing the problem to fit those tools. If I enter with a hammer as a tool, then everything looks like a nail and I make things conform to it. If my training in statistics is in experimental design using general linear models, then every problem that I see will be framed by those general linear model tools in my solution attempts. I should instead come into a situation as part of a multidisciplinary team in which we jointly look at what the problems are and try to figure out new and innovative ways to solve them. This difference does not rely on the premise that, because I am a statistician, I have to solve the problem statistically. Surprising as it may seem, a good number of problems are not solved statistically. They have to be solved in other ways. But the way statisticians think and the way we train statisticians to think will ultimately help them to solve the problems.
The other aspect encountered in government that students must be able to deal with is the need to make a decision in three months for which a "quick and dirty" solution must be found. We have to start acknowledging that quick and dirty solutions will involve difficulty, but there are solutions and approaches using extant information and data that can be brought together to give indications of how to make the decision. That is actually all the decision maker wants.
As a client, I would thus like to see more students emerging from universities with experience using real data, including real data that are not very good. Very good data are not much of a challenge. Good data do not push the student to ponder what to do, but data that are not very good will. For example, the Environmental Monitoring and Assessment Program (EMAP) is attempting to set up a probabilistically based monitoring system to provide data that can be aggregated to give a measure of the health of the nation's ecosystems. It is well thought out. The basic idea is to collect data in a statistical manner that would allow some form of inference and a relationship analysis to be done. However, in the federal government that is not done very easily because the Environmental Protection Agency is not the only federal agency collecting environmental data. The Department of the Interior is implementing a new National Biological Survey in which they want to do the same thing. The Agriculture Department collects forestry data. The U.S. Geological Survey collects water data. The National Oceanic and Atmospheric Administration collects air and coastal information. Each of these agencies has designed its own systems, and each of these agencies has been collecting data for a long time.
EMAP wants to look at the whole picture. To do that, they have to use the data that the other agencies are collecting, not only the past data but also data that are being collected now, because those agencies are not going to reinvest hundreds of millions of dollars in their monitoring system to conform with the idea that EPA is proposing. So one of the real difficulties EPA now has is getting people who know how to deal with and integrate spatially and temporally different data sets, even with a known probability or sample distribution. But there is also the situation, such as in the Great Lakes or in the Chesapeake Bay, where the question is, how do you deal with temporally and spatially different data that were collected by different people at different times? This is an extension of, at least, the meta-analysis activity.
Statisticians must learn how to do that. It involves more than just developing methods. Statisticians must be developed who know how to attack those problems, people who can work with the other scientists to integrate and understand how to approach new questions and realize that the answers are not going to be perfect. That is what is really needed and is where teamwork and communication come in. There are interesting real problems of that nature requiring people to solve them. The nation has spent hundreds of millions of dollars on these data. One cannot just throw them out and start over. The Congress and others would not allow EPA to do so.
Those are some of the major characteristics I, as a client, would like to see in the statisticians who apply to EPA. I hope in this day-and-a-half symposium you can find ways to satisfy all these desires, because we would be delighted.