Currently Skimming:

Enabling Petabyte Computing
Pages 405-411

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 405...
... Large archives will be created that will make various types of data available: remote-sensing data, economic statistics, clinical patient records, digitized images of art, government records, scientific simulation results, and so on. To remain competitive, one must be able to analyze this information.
From page 406...
... But an important component of the size increase is expected to come from incorporating additional ancillary information into the databases. Clinical patient records will be augmented with the digitized data sets produced by modern diagnostic equipment such as magnetic resonance imaging, positron emission tomography, x-rays, and so on.
From page 407...
... The next-generation parallel computers will be able to sustain I/O rates of 12 gigabytes/second to attached peripheral storage devices, allowing the movement of a petabyte of data per day between the data archive and the compute platform. The parallel computers will also be able to execute at rates exceeding 100 gigaflops, thus providing the associated compute power needed to process the data.
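The figures in this excerpt can be checked with simple arithmetic. The sketch below is illustrative only: it assumes decimal units (1 gigabyte = 10^9 bytes, 1 petabyte = 10^15 bytes), and the function names are hypothetical, not taken from the chapter.

```python
# Back-of-the-envelope check of the quoted I/O and compute rates.
# Assumption: decimal units (1 GB = 10**9 bytes, 1 PB = 10**15 bytes).
GIGABYTE = 10**9
PETABYTE = 10**15
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_moved_per_day(rate_gb_per_s):
    """Bytes transferred in one day at a sustained rate given in GB/s."""
    return rate_gb_per_s * GIGABYTE * SECONDS_PER_DAY


def flops_per_byte(gigaflops, rate_gb_per_s):
    """Floating-point operations available per byte streamed."""
    return gigaflops / rate_gb_per_s


daily_pb = bytes_moved_per_day(12) / PETABYTE
print(f"{daily_pb:.2f} PB/day")                      # 1.04 PB/day
print(f"{flops_per_byte(100, 12):.1f} flops/byte")   # 8.3 flops/byte
```

At a sustained 12 gigabytes/second, one day of transfer comes to roughly one petabyte, which matches the excerpt. At 100 gigaflops, that leaves on the order of eight floating-point operations per byte moved.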
From page 408...
... The advantages provided by such a system are that the application does not have to coordinate the data movement; data sets can be accessed through relational queries rather than by file name, and data formats can be controlled by the database, eliminating the need to convert between file formats. The software infrastructure needed to do this consists of a library interface between the application and the database to convert application read and write requests into SQL-*
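A minimal sketch of what such a library interface might look like follows. It is not the system described in the chapter: the `DatasetStore` class, the `datasets` table, and its columns are all hypothetical, and an in-memory SQLite database stands in for the archive database. It only illustrates the idea of an application issuing reads and writes that the library translates into SQL, so that data sets are selected by attributes rather than by file name.

```python
import sqlite3


class DatasetStore:
    """Hypothetical library layer that turns read/write calls into SQL."""

    def __init__(self, conn):
        self.conn = conn
        # Illustrative schema; the database, not the application, owns the format.
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS datasets ("
            "id INTEGER PRIMARY KEY, instrument TEXT, year INTEGER, payload BLOB)"
        )

    def write(self, instrument, year, payload):
        """Application write request, converted into an INSERT statement."""
        self.conn.execute(
            "INSERT INTO datasets (instrument, year, payload) VALUES (?, ?, ?)",
            (instrument, year, payload),
        )
        self.conn.commit()

    def read(self, instrument, min_year):
        """Application read request, converted into a relational query."""
        rows = self.conn.execute(
            "SELECT payload FROM datasets "
            "WHERE instrument = ? AND year >= ? ORDER BY year",
            (instrument, min_year),
        )
        return [row[0] for row in rows]


store = DatasetStore(sqlite3.connect(":memory:"))
store.write("mri", 1994, b"scan-a")
store.write("mri", 1996, b"scan-b")
store.write("pet", 1996, b"scan-c")
print(store.read("mri", 1995))  # [b'scan-b']
```

The application never names a file or handles a file format; it asks for "MRI data from 1995 onward" and the library issues the query.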
From page 409...
... The second project is developing the requisite common authentication, file, and scheduling systems needed to support distributed data movement.
Data-Caching Technology
Making petabyte computing available as a resource on the NII to access distributed sources of information will require understanding how to integrate cache management across multiple data-delivery mechanisms.
From page 410...
... If the flow to the local disk is included, the amount will be up to seven times larger. A teraflops computer will need to be able to support an appreciable fraction of the data movement associated with petabyte computing.
From page 411...
... initiative will allow future commercial systems to be constructed more rapidly. With the rapid advance of hardware technology, commercial versions of the petabyte compute capability will be feasible within 5 years.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.