. "Luncheon Keynote Address." Statistical Analysis of Massive Data Streams: Proceedings of a Workshop. Washington, DC: The National Academies Press, 2004.
The following HTML text is provided to enhance online
readability. Many aspects of typography translate only awkwardly to HTML.
Please use the page image
as the authoritative form to ensure accuracy.
Statistical Analysis of Massive Data Streams: Proceedings of a Workshop
Abstract of Presentation
Graph Mining: Discovery in Large Networks Daryl Pregibon (with Corinna Cortes and Chris Volinsky), AT&T Shannon Labs
Large financial and telecommunication networks provide a rich source of problems for the data mining community. The problems are inherently quite distinct from traditional data mining in that the data records, representing transactions between pairs of entities, are not independent. Indeed, it is often the linkages between entities that are of primary interest. A second factor, network dynamics, induces further challenges as new nodes and edges are introduced through time while old edges and nodes disappear.
We discuss our approach to representing and mining large sparse graphs. Several applications in telecommunications fraud detection are used to illustrate the benefits of our approach.