Suggested Citation: "ABSTRACT OF PRESENTATION." National Research Council. 2004. Statistical Analysis of Massive Data Streams: Proceedings of a Workshop. Washington, DC: The National Academies Press. doi: 10.17226/11098.

ABSTRACT OF PRESENTATION

FSD Models for Open-Loop Generation of Internet Packet Traffic
William S. Cleveland, Bell Laboratories (with Jin Cao and Don X. Sun)

Abstract: The packet traffic on an Internet link is a marked point process. The packet arrival times are the point process, and the packet sizes are the marks. The traffic arises from connections between pairs of computers. For each pair, the link is part of a path of links over which files are transferred between the computers. Each file is broken up into packets on one computer, and the packets are then sent to the other computer, where they are reassembled to form the file. Packets arriving for transmission on the link enter a queue. Many issues of Internet engineering depend heavily on the queue-length distribution, which in turn depends on the statistical properties of the packet process, so understanding and modeling the process are important for engineering. These statistical properties change as the mean connection load changes; consequently, the queuing characteristics change with the load. While much important analysis of Internet packet traffic has been carried out, comprehensive statistical models for the packet marked point process that reflect the changes in statistical properties with the connection load have not previously been developed. We introduce a new class of parametric statistical models, fractional sum-difference (FSD) models, for the packet marked point process and describe the process we have used to identify the models and then validate them. The models account for the changes in the statistical properties through different values of the parameters, and the parameters are modeled as a function of the mean load, so the modeling is hierarchical. The models are simple, and the simplicity enhances the basic understanding of traffic characteristics that arises from them.
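As a small illustration of the marked point process view described in the abstract, the sketch below stores each packet as an arrival time (the point) together with a size (the mark) and splits a trace into the two sequences a model would describe: interarrival times and sizes. The `Packet` class, the helper names, and the trace values are hypothetical and chosen only for illustration; they are not taken from the authors' data or code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Packet:
    arrival_us: int   # arrival time in microseconds: the "point"
    size_bytes: int   # packet size in bytes: the "mark"


def interarrival_times(packets: List[Packet]) -> List[int]:
    """Gaps between successive arrivals, in microseconds."""
    times = [p.arrival_us for p in packets]
    return [later - earlier for earlier, later in zip(times, times[1:])]


def sizes(packets: List[Packet]) -> List[int]:
    """The sequence of marks, in arrival order."""
    return [p.size_bytes for p in packets]


# Made-up trace for illustration only, not measured traffic.
trace = [
    Packet(0, 1500),
    Packet(2000, 40),
    Packet(3000, 576),
    Packet(7000, 1500),
]

gaps = interarrival_times(trace)
marks = sizes(trace)
print(gaps)
print(marks)
```

Integer microseconds are used here only to keep the arithmetic exact; a real trace could equally carry floating-point timestamps.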
The models can be used to generate synthetic packet traffic for engineering studies; only the traffic load and certain parameters of the size marginal distribution that do not change with the load need to be specified. The mean load can be held fixed for the generation or can be varied. FSD models provide good fits to the arrivals and sizes provided the mean connection load (the mean number of simultaneous active connections using the link) is above about 100. The models apply directly only to traffic where the link input router delays only a small fraction of the packets, about 15% or less; but if delayed traffic is needed, it can be generated very simply by putting the synthetic model traffic through a queue. C code is available for generation, as well as an implementation in the widely used NS-2 simulation system.
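The idea of producing delayed traffic by putting traffic through a queue can be sketched as follows. This is a minimal sketch, assuming a single first-in-first-out queue with unlimited buffer served at a constant link speed. The function name `fifo_queue`, the parameter `link_bps`, and the input values are hypothetical, and the input trace is a placeholder rather than output of an FSD model or of the authors' C or NS-2 code.

```python
from typing import List, Sequence, Tuple


def fifo_queue(
    arrival_times: Sequence[float],
    sizes_bytes: Sequence[int],
    link_bps: float,
) -> Tuple[List[float], List[float]]:
    """Pass packets through one FIFO queue served at link_bps bits per second.

    arrival_times must be in nondecreasing order (seconds).
    Returns (waiting_times, departure_times), both in seconds.
    """
    if len(arrival_times) != len(sizes_bytes):
        raise ValueError("arrival_times and sizes_bytes must have equal length")
    if link_bps <= 0:
        raise ValueError("link_bps must be positive")

    waits: List[float] = []
    departures: List[float] = []
    link_free_at = float("-inf")  # time at which the link finishes its current packet

    for t, size in zip(arrival_times, sizes_bytes):
        start = max(t, link_free_at)      # wait only if the link is still busy
        service = 8 * size / link_bps     # transmission time of this packet
        link_free_at = start + service
        waits.append(start - t)
        departures.append(link_free_at)
    return waits, departures


# Placeholder input: four 125-byte packets on a 1000 bit/s link,
# so each packet takes exactly 1 second to transmit.
arrivals = [0.0, 1.0, 1.5, 4.0]
packet_sizes = [125, 125, 125, 125]
waits, departures = fifo_queue(arrivals, packet_sizes, link_bps=1000.0)
fraction_delayed = sum(w > 0 for w in waits) / len(waits)
print(waits, departures, fraction_delayed)
```

In this toy run only the third packet finds the link busy, so one packet in four is delayed. The departure times form the delayed version of the input stream, and the fraction of packets with a positive wait is the quantity the abstract refers to when it limits direct applicability to lightly delayed traffic.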

 Statistical Analysis of Massive Data Streams: Proceedings of a Workshop

Massive data streams, large quantities of data that arrive continuously, are becoming increasingly commonplace in many areas of science and technology. Consequently, the development of analytical methods for such streams is of growing importance. To address this issue, the National Security Agency asked the NRC to hold a workshop to explore methods for the analysis of streams of data so as to stimulate progress in the field. This report presents the results of that workshop. It provides presentations that focused on five different research areas where massive data streams are present: atmospheric and meteorological data; high-energy physics; integrated data systems; network traffic; and mining commercial data streams. The goals of the report are to improve communication among researchers in the field and to increase relevant statistical science activity.
