A STREAM PROCESSOR FOR EXTRACTING USAGE INTELLIGENCE FROM HIGH-MOMENTUM INTERNET DATA

…robust, quickly adaptable, and scalable in order to meet the broad range of requirements for fast data collection and analysis. It was in the process of meeting this challenge that the concept of a general-purpose stream processor emerged. Two related suites of technologies are discussed in this article. Internet Usage Manager (IUM) (http://openview.hp.com/products/ium/index.html) is the platform technology that provides the basic stream-processing capability. Dynamic Netvalue Analyzer (DNA) (http://openview.hp.com/products/dna/index.html) extends the IUM platform with stream statistical-analysis capabilities.

2. BUSINESS CHALLENGES FOR THE NSPs

With the recent spectacular collapse of some major NSP players, the challenge within the industry of creating a profitable return on sizable infrastructure investments could not be more visible (McGarty 2002; Sidak 2003). During the technology buildup of the late 1990s, many NSPs invested heavily in rapid expansion of network capacity at the expense of infrastructure for metering and analyzing subscriber behavior. In the frenzy of the technology hype and market buildup, with cheap capital readily available, it was easy to believe that bandwidth was free, should be free, or would become free. Why bother to measure usage? Given the wide publicity of the ever-expanding bandwidth of optical fiber, a superficial examination of the issues could lead one to that conclusion. Unfortunately, the real bandwidth limitations are not in the Internet backbone but in the access networks (the "last mile"), where upgrade costs are high.
It is ironic that even those ISPs that were operating units of larger, well-established telecommunications companies, which had developed extensive telephone-subscriber data collection and analysis capabilities over the preceding 20 years, did not make substantive investments in measuring and understanding subscriber usage behavior. This is rapidly changing today. On the revenue side, the business motivations for understanding usage behavior include various usage-based charging models for billing, and subscriber segmentation for marketing and product planning. Additionally, because IP-based services are still young, having a statistical basis for trial-testing potential pricing models for these services is certainly better than having no data at all. The motivations on the expense and investment side are equally strong. Without the ability to measure or analyze the impact of various usage behaviors, legitimate or not, the tasks of network management, security, performance, and capacity planning reduce to a guessing game. Some of our early R&D work focused on usage mediation and tracking in support of billing for Telstra's BigPond™ Cable and DSL Internet services (http://www.bigpond.com). In Australia, as in many parts of the world outside the U.S., international transit fees, based on bandwidth usage, represented a significant variable cost that was constantly increasing on a per-subscriber basis but was not transferable to subscribers, who were billed at that time only on a flat, all-you-can-use pricing model. Data collected at Telstra from nine different DSL and cable broadband Internet services revealed that the distribution of subscriber usage can be fitted very closely by a lognormal (with a shape