Statistical Analysis of Massive Data Streams: Proceedings of a Workshop
lot of statistical things involved in what is called triggering. So, things are going on in this detector and the question is when to record data, since they don’t record all 22 terabytes a second, although they would like to, I guess, if they could.
The interesting statistic that I heard was that, with what they do now, they think they get 99.1 percent of the interesting events among all the billions of events that turn out not to be interesting. So, 99.1 is perhaps not a bad collection ratio. Much of the really interesting statistics that we have talked about is the off-line type. In other words, once you have stored away these gigabytes of data, there are lots of interesting pattern-recognition problems and so on. On the real-time data mining issue, we didn’t pursue that very deeply. What struck everybody was how time-sensitive the science is here, and that the way statisticians do science is at a dinosaur pace, while the way physicists do it is that, if they only slept three hours a night, the science would get done quicker, and it is a shame they can’t stay up 24 hours a day. There was lots of discussion about magic tricks to make the science work quicker.
All in all, I think the conversation really grew in intensity and excitement for collaborations, and almost everybody seemed to have ideas about how they could contribute to the discussion. I think I would like to leave it there and ask anybody else in the group if they wanted to add something.