Closing Remarks
William Eddy
Carnegie Mellon University
This is a good moment for me to make a long series of rambling remarks and comments that may or may not address that last question.
First, I want to thank all of you for coming and providing a number of very insightful dimensions that the panel had not considered in their deliberations before today. I want to thank all the speakers and my colleagues on the panel. I also want to thank the National Science Foundation (NSF) for its financial support, and I would like to thank the staff at the National Research Council who organized this forum.
As was mentioned this morning, the panel welcomes additional input from you, preferably in writing. To comment on the presentations starting from this morning, Paul Velleman made a statement to the effect that good statistical software drives bad statistical software out of the marketplace. I happen to not believe that. In fact, for quite a long time I have been saying there is an explicit Gresham's law for statistical software, namely, the bad drives out the good. Al Thaler of the NSF recently pointed out to me that my view on this is not completely correct, that rather there is a modified law: if the software is good enough, it does not have to get any better. That is certainly the case. As a supporting example, I am currently teaching a course to undergraduates. The particular statistical package being used is identical to the one I used in 1972. This package has not changed an iota in 19 years. It was good enough, and so it did not need to change.
Underneath all of these discussions is software. Software is one of the most unusual commodities that humans have ever encountered. It has a unique feature in that if the tiniest change is made in the input, in the program, in the controls, or in anything, there can be huge changes in what results; software is in this way discontinuous. This fact drives a great deal of work in software engineering and reliability, but it should also drive our thinking about statistical software.
In relation to this, Bill DuMouchel's mention of switching from means to medians intrigued me because I know that there exist sets of numbers that differ from one another by only epsilon, but which would produce substantially different answers in his system. That switch is not a smooth transition, but is instead a switch that discontinuously kicks in. The fact that it is such a switch troubles me, because I like things to vary in a smooth way. If there is one single goal that I would set for software, it is that ultimately the response would be a smooth function of the input. I do not see any way to achieve this, and cannot offer a solution.
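The kind of discontinuity described above can be illustrated with a small sketch. The switching rule here is purely hypothetical and is not the rule used in DuMouchel's system: the function name `switching_location`, the median-absolute-deviation test, and the threshold of 3 are all assumptions made for illustration.

```python
from statistics import mean, median


def switching_location(xs, threshold=3.0):
    """Hypothetical rule: report the mean, unless some point lies more than
    `threshold` median absolute deviations (MADs) from the median, in which
    case report the median instead."""
    med = median(xs)
    mad = median([abs(x - med) for x in xs])
    has_outlier = any(abs(x - med) > threshold * mad for x in xs)
    return med if has_outlier else mean(xs)


# Two data sets that differ only in the last value, and only by 1e-9.
base = [1.0, 2.0, 3.0, 4.0, 5.0]
just_inside = switching_location(base + [8.0])          # no point flagged: mean
just_outside = switching_location(base + [8.0 + 1e-9])  # last point flagged: median

print(just_inside, just_outside)
```

Under these assumptions the first call returns the mean (about 3.83) and the second returns the median (3.5), so a change of one part in a billion in a single input moves the answer by about a third of a unit.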
Another thing that troubles me is that there are software systems available in the marketplace that do not meet what I would consider minimally acceptable quality standards on any kind of measure you select. These systems are not the 6 or 10 or 20 big ones that everyone knows. There are 300 of them available out there. A lot of bad software is being foisted on unsuspecting customers, and I am deeply concerned about that.
To put it in concrete terms, what does it take for the developer of a statistical package to be able to claim that the package does linear least-squares regression? What minimal functionality has to be delivered for that? There exists software that does not meet any acceptability criteria--no matter how little you ask.
Another matter troubling me, even more so now that I am a confirmed UNIX user, is the inconsistent interfaces in our statistical software systems. Vendors undoubtedly view this as a feature, it being what distinguishes package A from package B. But as a user, I cannot think of anything I hate more than having to read some detestable manual to find out what confounded things I have to communicate to this package in order to do what I want, since the package is not the one I usually use. This interface inconsistency is not only on the human side of things, where it is readily obvious, but also on the computer side. There is now an activity within the American Statistical Association to develop a way to move data systematically from package to package.
Users of statistical software cannot force vendors to do anything, but gentle pleas can be made to make life a little easier. That is what this activity is really about, simplifying my life as well as yours and those of many others out there.
There was discussion this morning on standardizing algorithms or test data sets as ways to confirm the veracity of programs. That is important, but what is even more important is the tremendous amount of wasted human energy when yet another program is written to compute $(X^{T}X)^{-1}X^{T}Y$. There must be 400 programs out on the market to do that, and they were each written independently. That is simply stupid. Although every vendor will say its product has a new little twist that is better than the previous vendor's package, the fact is that vendors are wasting their resources doing the wrong things.
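For readers less familiar with the expression, the least-squares coefficient vector can be obtained by solving the normal equations, and a routine can then be checked against a test data set whose answer is known in advance. The following is only a minimal sketch under stated assumptions: the function name `least_squares`, the crude singularity tolerance, and the tiny test data set are all made up for illustration, and forming the cross-product matrix explicitly is known to be numerically fragile compared with orthogonalization methods such as QR.

```python
def least_squares(X, y):
    """Solve the normal equations (X^T X) b = X^T y by Gaussian elimination
    with partial pivoting. X is a list of rows; y is a list of responses."""
    n, p = len(X), len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [
        [sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
        + [sum(X[k][i] * y[k] for k in range(n))]
        for i in range(p)
    ]
    # Forward elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:  # crude absolute tolerance (assumption)
            raise ValueError("X^T X is singular or nearly so")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for c in range(col, p + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    b = [0.0] * p
    for i in reversed(range(p)):
        rest = sum(A[i][j] * b[j] for j in range(i + 1, p))
        b[i] = (A[i][p] - rest) / A[i][i]
    return b


# Made-up test data lying exactly on the line y = 1 + 2x, so a correct
# routine should recover intercept 1 and slope 2 up to rounding error.
X = [[1.0, x] for x in (0.0, 1.0, 2.0, 3.0)]
y = [1.0 + 2.0 * row[1] for row in X]
coef = least_squares(X, y)
print(coef)
```

A check of this kind speaks only to the easy case. A serious test suite would also include ill-conditioned and collinear designs, which is where independently written implementations tend to disagree.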
We heard several speakers this afternoon discuss the importance of simplifying the output and the analysis. If vendors would expend effort there instead of adding bells and whistles or in writing another wretched regression program, everyone would be much better off. The future report of the Panel on Guidelines for Statistical Software will include, I hope, some indications to software vendors as to where they should be focusing their efforts.
I do not know the solution to all these problems. Standards are an obvious partial solution. When a minimum standard is set everybody agrees on what it is and, when all abide by the standard, life becomes simpler. However, life does not become optimal. If optimality is desired, then simplicity must usually be relinquished. Nevertheless, standards are important, not so much in their being enforced, but in their providing targets or levels of expectation of what is to be met.
I hope this panel's future report will influence statistical research in positive ways, in addition to having positive effects on software development, and thereby affect the software infrastructure that is more and more enveloping our lives at every turn.
The critical thing that is most needed in software is adaptability. I loved Daryl Pregibon's “do.it = True” command. That is an optimal command. From now on, that is all my computer is going to do, so that I do not have to deal with it anymore. An afterthought: an essentially similar function is needed in regard to documentation, namely, “GO AWAY!” Why is it that documentation has not evolved very far beyond the old program textbooks? There is much room for improvement here, and it is sorely needed.
One interesting thing expressed today was that maybe it is statistics, and not just the software, that needs to be changed. As said repeatedly by several speakers, statisticians tend to focus on the procedure rather than the task. That is an educational problem that we as academics need to address in the way statistics is taught, and so this relates to methodological weaknesses in statistical training. The numbers that come out of a statistical package are not the solution; numbers are the problem. There is far too much focus on having the software produce seven-digit numbers with decimal points in the right place. Statistical software must be made more sophisticated, to the point that it “talks” to the user and facilitates correct and appropriate analyses. A paraphrase of an old aphorism says that you can lead a fool to data, but you can't force him to analyze it correctly. Future statistical software should also guide the user to correct analyses.