Statistical Analysis of Massive Data Streams: Proceedings of a Workshop (2004)
Board on Mathematical Sciences and Their Applications (BMSA)


"Session 1: Atmospheric and Meteorological Data." Statistical Analysis of Massive Data Streams: Proceedings of a Workshop. Washington, DC: The National Academies Press, 2004.


Report from Breakout Group

Instructions for Breakout Groups

MS. KELLER-McNULTY: There are three basic questions, or issues, that we would like the subgroups to come back and report on.

First of all, what sort of outstanding challenges do you see relative to the collection of material that was in the session? In particular, we heard in all these cases that there are real, specific constraints on these problems that have to be taken into consideration. We can't just assume we can process whatever we want infinitely fast.

The second thing is, what are the needed collaborations? It is really wonderful today. So far, we are hearing from a whole range of scientists. So, what are the needed collaborations to really make progress on these problems?

Finally, what are the mechanisms for collaboration? You know, Amy, for example, had a whole list of suggestions with her talk.

So, the three things are: what are the scientific challenges, what are the needed collaborations, and what are some ideas on mechanisms for realizing those collaborations?

Report from Atmospheric and Meteorological Data Breakout Group

MR. NYCHKA: The first thing that the reporter has to report is that we could not find another reporter except for me. I am sorry, I was hoping to give someone the opportunity, but everybody shrank from it.

So, we tried to keep on track on the three questions. I am sure that the other groups realized how difficult that was.

Let me first talk about some technical challenges. The basic product you get out of this is a field. It is maybe a variable collected over space and time. There are some basic statistical problems of how you summarize those fields in terms of probability density functions and, if you have multiple samples of them, how you manipulate and work with them. Also, if you wanted to study, say, a particular variable under an El Niño period versus a La Niña period, there are all those kinds of conditioning issues. So, that is basically very mainstream space-time statistics.
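As an illustration of the conditioning problem described above, the following is a minimal Python sketch that estimates the probability density of one variable separately for El Niño and La Niña periods. It is only a sketch under stated assumptions: the data are synthetic, a plain histogram stands in for whatever density estimator one would actually use, and the names `histogram_density` and `conditional_densities` are hypothetical and do not come from the workshop.

```python
import random

def histogram_density(samples, edges):
    """Histogram estimate of a probability density on the given bin edges."""
    counts = [0] * (len(edges) - 1)
    for x in samples:
        for i in range(len(counts)):
            # Half-open bins [lo, hi); the last bin is closed on the right.
            is_last = i == len(counts) - 1
            if edges[i] <= x < edges[i + 1] or (is_last and x == edges[-1]):
                counts[i] += 1
                break
    n = sum(counts)  # samples outside the bin range are ignored
    return [
        c / (n * (edges[i + 1] - edges[i])) if n else 0.0
        for i, c in enumerate(counts)
    ]

def conditional_densities(values, phases, edges):
    """Split a series by regime label and estimate one density per regime."""
    groups = {}
    for v, p in zip(values, phases):
        groups.setdefault(p, []).append(v)
    return {p: histogram_density(vs, edges) for p, vs in groups.items()}

# Synthetic stand-in for a time series at one grid cell (not real climate data):
# the variable is shifted up in "el_nino" periods and down in "la_nina" periods.
rng = random.Random(0)
phases = [rng.choice(["el_nino", "la_nina"]) for _ in range(2000)]
values = [rng.gauss(1.0 if p == "el_nino" else -1.0, 1.0) for p in phases]

edges = [-5.0 + 0.5 * k for k in range(21)]  # 20 bins covering [-5, 5]
dens = conditional_densities(values, phases, edges)

def density_mean(density, edges):
    """Mean implied by a histogram density, using bin centers."""
    return sum(
        d * (edges[i + 1] - edges[i]) * 0.5 * (edges[i] + edges[i + 1])
        for i, d in enumerate(density)
    )

for phase in sorted(dens):
    print(phase, round(density_mean(dens[phase], edges), 2))
```

For a full space-time field, the same split-then-summarize step would be repeated at every grid cell, which is where the data volume starts to matter.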

Another important component that came out of this is the whole issue of uncertainty. This is true in general, and there was quite a bit of discussion about aligning these efforts with the Climate Change Research Initiative, which is a very high-level, organized effort by the U.S. government to study climate. Uncertainty measures are an important part of that, and it is no surprise that the typical deterministic geophysical community tends to ignore these, but it is something that needs to be addressed.

There was also sort of the sentiment that one limitation is partly people’s backgrounds. People use what they are familiar with. What they tend to do is limited by the tools that they know. They are sort of reticent to take on new tools. So, you have this sort of vicious circle that you only do things that you know how to do. I think an interesting thing that came out of this—and let me highlight this as a very interesting technical challenge, and it is one of these curious things where, all of a sudden, a massive
