being captured—recording just the source, destination, type, or volume of the communications can reveal information that a user would prefer to keep private.

If network providers could find ways of being more open while protecting legitimate proprietary or privacy concerns, considerably more data could be available for study. Current understanding of data anonymization techniques, the nature of private and sensitive information, and the interaction of these issues with accurate measurement is rudimentary. Too simplistic a procedure may be inadequate: If the identity of an ISP is deleted from a published report, particular details may permit the identity of the ISP in question to be inferred. On the other hand, too much anonymity may hide crucial information (for example, about the particular network topology or equipment used) from researchers. Attention must therefore be paid to developing techniques that limit disclosure of confidential information while still providing sufficient access to information about the network to enable research problems to be tackled. In some circumstances, these limitations may prevent the export of raw measurement data—provoking the need to develop configurable “reduction agents” that can remotely analyze data and return results that do not reveal sensitive details.
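As a rough illustration of the two approaches discussed above, the following minimal Python sketch contrasts exporting pseudonymized records with running a "reduction agent" that returns only aggregate results. Everything in it is a hypothetical assumption made for illustration and not something the report specifies: the `FlowRecord` fields, the function names, the use of a keyed hash (HMAC-SHA256) for pseudonymization, and the `min_flows` suppression threshold. Keyed hashing is only one simple option, and, as the paragraph notes, a simplistic procedure of this kind may still allow identities to be inferred from other details.

```python
import hashlib
import hmac
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical flow summary; field names are illustrative only."""
    src: str
    dst: str
    protocol: str
    num_bytes: int


def pseudonymize(address: str, key: bytes) -> str:
    """Map an address to a keyed-hash token (truncated to 12 hex chars).

    The same address and key always yield the same token, so flows can
    still be correlated, but the raw address is not exported.
    """
    digest = hmac.new(key, address.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def export_pseudonymized(flows, key: bytes):
    """Approach 1: export per-flow records with addresses replaced."""
    return [
        FlowRecord(
            src=pseudonymize(f.src, key),
            dst=pseudonymize(f.dst, key),
            protocol=f.protocol,
            num_bytes=f.num_bytes,
        )
        for f in flows
    ]


def reduce_flows(flows, min_flows: int = 3):
    """Approach 2: a toy "reduction agent".

    Runs where the raw data lives and returns only per-protocol totals.
    Protocols seen in fewer than `min_flows` flows are suppressed so that
    a single flow cannot be singled out from the result.
    """
    counts = defaultdict(int)
    volume = defaultdict(int)
    for f in flows:
        counts[f.protocol] += 1
        volume[f.protocol] += f.num_bytes
    return {
        proto: {"flows": counts[proto], "bytes": volume[proto]}
        for proto in counts
        if counts[proto] >= min_flows
    }
```

In this sketch a provider would call `reduce_flows` on its own raw records and hand researchers only the returned dictionary, while `export_pseudonymized` shows the alternative of releasing per-flow data with hashed addresses. The second approach hides more, at the cost of the topology and equipment detail that researchers may need, which is the trade-off the paragraph describes.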

Finally, realizing the “day in the life” concept will require the development of a community process for coming to a consensus on what the essential measurements are, the scope and timing of the effort, and so forth. It will require the efforts of many researchers and the cooperation of at least several Internet service providers. The networking research community itself will need to develop better discipline in the production and documentation of results from underlying data. This includes the use of more careful statistical and analytic techniques and sufficient explanation to allow archiving, repeatability, and comparison. To this end, the community should foster the creation of current benchmark data sets, analysis techniques, and baseline assumptions. Several organizations have engaged in such efforts in the past (on a smaller scale than envisioned here), including the Cooperative Association for Internet Data Analysis (CAIDA)5 and the Internet Engineering Task Force’s IP Performance Metrics working group (ippm).6 Future measurement efforts would benefit from the networking community at large adopting analogous intergroup data-sharing practices.

Copyright © National Academy of Sciences. All rights reserved.