noticed that massive data sets seem to be very common—the business people out there seem to have a lot more data than they know what to do with. So I thought I would come and pick your brains while I had the chance.
Michael Cohen (Committee on National Statistics). I am a statistician interested in large data sets.
Allen McIntosh (Bellcore). I deal with very large data sets of the sort that John Schmitz was talking about: data on local telephone calls. Up until now, I have been fairly comfortable with the tools I have had to analyze data. But recently I have been getting data sets that are much larger than I am used to dealing with, and that is making me uncomfortable. I am here to talk a little bit about it and to try to learn new techniques that I can apply to the data sets that I analyze.
Stephen Eick (Bell Laboratories). I want to make pictures of large data sets. We work hard on how to make pictures of big networks, and how to come up with ways to visualize software. I think the challenge now, at least for AT&T, is to learn how we can make a picture to visualize our 100 million customers.
Jim Hodges (University of Minnesota). There have been two streams in my work. One is an integral involvement in a sequence of applied problems to the point that I felt I actually knew something about the subject area, originally at the Rand Corporation in the areas of combat analysis and military logistics, and now at the University of Minnesota in clinical trials related to AIDS. The other is an interest in the foundations of statistics, particularly the disconnect between the theory that we learn in school and read about in the journals, and what we really do in practice. I did not know I was interested in massive data sets until Daryl Pregibon invited me to this conference and I started reading the position papers. Prior to this, the biggest data set I ever worked on had a paltry 40,000 cases and a trifling hundred variables per case, and so I thought the position papers were extremely interesting, and I have a lot to learn.