Massive data streams, large quantities of data that arrive continuously, are becoming increasingly commonplace in many areas of science and technology. Consequently, the development of analytical methods for such streams is of growing importance. To address this issue, the National Security Agency asked the National Research Council (NRC) to hold a workshop exploring methods for the analysis of data streams, with the aim of stimulating progress in the field. This report presents the results of that workshop. It contains presentations on five research areas in which massive data streams arise: atmospheric and meteorological data; high-energy physics; integrated data systems; network traffic; and the mining of commercial data streams. The goals of the report are to improve communication among researchers in the field and to increase relevant statistical science activity.

- 396 pages
- English
- PDF
Table of contents
- STATISTICAL ANALYSIS OF MASSIVE DATA STREAMS
- Copyright
- ACKNOWLEDGEMENT OF REVIEWERS
- Preface and Workshop Rationale
- Sallie Keller-McNulty Welcome and Overview of Sessions
- James Schatz Welcome and Overview of Sessions
- Douglas Nychka, Chair of Session on Atmospheric and Meteorological Data Introduction by Session Chair
- John Bates Exploratory Climate Analysis Tools for Environmental Satellite and Weather Radar Data
- Amy Braverman Statistical Challenges in the Production and Analysis of Remote Sensing Earth Science Data at the Jet…
- Ralph Milliff Global and Regional Surface Wind Field Inferences from Spaceborne Scatterometer Data
- Report from Breakout Group
- David Scott, Chair of Session on High-Energy Physics Introduction by Session Chair
- Robert Jacobsen Statistical Analysis of High Energy Physics Data
- Paul Padley Some Challenges in Experimental Particle Physics Data Streams
- Miron Livny Data Grids (or, A Distributed Computing View of High Energy Physics)
- Report from Breakout Group
- Daryl Pregibon Keynote Address: Graph Mining: Discovery in Large Networks
- Sallie Keller-McNulty, Chair of Session on Integrated Data Systems Introduction by Session Chair
- J. Douglas Beason Global Situational Awareness
- Kevin Vixie Incorporating Invariants in Mahalanobis Distance-Based Classifiers: Applications to Face Recognition
- John Elder Ensembles of Models: Simplicity (of Function) Through Complexity (of Form)
- Report from Breakout Group
- Mark Hansen Untitled Presentation
- Wendy Martinez, Chair of Session on Network Traffic Introduction by Session Chair
- William Cleveland FSD Models for Open-Loop Generation of Internet Packet Traffic
- Johannes Gehrke Processing Aggregate Queries over Continuous Data Streams
- Edward Wegman Visualization of Internet Packet Headers
- Paul Whitney Toward the Routine Analysis of Moderate to Large-Size Data
- Leland Wilkinson, Chair of Session on Mining Commercial Streams of Data Introduction by Session Chair
- Lee Rhodes A Stream Processor for Extracting Usage Intelligence from High-Momentum Internet Data
- Pedro Domingos A General Framework for Mining Massive Data Streams
- Andrew Moore kd-, R-, Ball-, and AD-Trees: Scalable Massive Science Data Analysis
- Concluding Comments