being funded to perform computations on a university com-
puter. In these cases, net cost savings and performance
gains may very well be realized by switching to a dedi-
cated minicomputer-array processor system, and the users
should be encouraged to submit a collaborative proposal
to obtain such a system.
3. The Panel applauds the support of astrophysical
computation provided by Lawrence Livermore Laboratory,
NASA/Ames, NASA/Langley, Los Alamos National Laboratory,
the National Center for Atmospheric Research, and others.
The Panel recommends that these cooperative efforts be
continued and expanded, since they work to the advantage
of both parties. Astronomers have access to larger and
faster machines than would otherwise be available, while
the laboratories receive broader recognition and stimula-
tion from more diverse viewpoints than would otherwise be
the case.
Astronomical use of these large facilities may be
expanded by providing funds to support problem definition
and software development and testing with university
computers or dedicated minicomputers (as above) and by
encouraging the large centers to make more time available
for production runs. The software that has been developed
locally may then be run at the large centers either via
remote links or travel of the investigator to the center.
The Panel considered a number of other options but
rejected them as being too costly. A National Computing
Center for Astronomy with a large vector machine would
cost $5 million per year. The provision of funds to buy
time at existing large computers (at cost) could only be
justified if the problem could not be run in a reasonable
amount of time on a minicomputer-array processor system.
Some problems are large and would cost $0.5 million to
$1.0 million per problem.
V. IMAGE PROCESSING AND ANALYSIS
Despite the title of this section, we do not intend to
ignore the processing and analysis required for other
kinds of data. It is simply that data generated by image
detectors are so voluminous that if we have the capability
to process and analyze these data we will certainly have
the capability to handle other forms of astronomical data.
It could conceivably be argued that we are collecting
too many data--that we should cut back until the data
collected match our capabilities to analyze and interpret
them. Such an argument is specious. We do not generate
data simply for the sake of generating data. Each astro-
nomical observation is performed in order to obtain data
that will lead to new knowledge concerning specific
scientific questions. Even survey observations (which
may appear to some as indiscriminate collection of data)
are designed to provide new knowledge that can be obtained
only through statistical studies of large bodies of data.
We unabashedly admit that we generate new knowledge for
the sake of new knowledge.
Until a few years ago, most astronomical observing
techniques produced data at very modest rates.
A night's
observation might typically produce tens of stellar-source
brightness or radio-flux measurements. However, with the
advent of solid-state array detectors and rapid scanning
microphotometers, data are now produced at a prodigious
rate, typically in the form of digital images. In an
observing run of a few nights, optical astronomers now
generate many tens of images, which in turn consist of
over 10^6 picture elements (each pixel containing up to
16 bits of intensity information). A single photographic
plate taken at the prime focus of a large reflector
contains well over 10^9 bits.
Optical astronomers are hardly alone in their ability
to generate large numbers of digital images. The Very
Large Array produces radio maps with spatial resolutions
from 0.1 arcsec to a few arcseconds. Each radio picture
may contain up to 10^7 pixels with 16 bits per pixel.
For many programs, images will be made at up to 256
frequencies simultaneously in order to map the velocity
field in selected objects.
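The image sizes quoted above translate directly into bit counts. The following is a minimal sketch of that arithmetic; the helper name `image_bits` is illustrative, and the pixel counts are the round figures from the text ("over 10^6", "up to 10^7"), so the results are order-of-magnitude values rather than exact detector specifications.

```python
# Round figures taken from the text; the function name is illustrative.
BITS_PER_PIXEL = 16


def image_bits(pixels, bits_per_pixel=BITS_PER_PIXEL, planes=1):
    """Return the number of bits in an image, or in a stack of image planes."""
    return pixels * bits_per_pixel * planes


optical_image = image_bits(10**6)          # one solid-state array image
vla_map = image_bits(10**7)                # one radio map
vla_cube = image_bits(10**7, planes=256)   # maps at 256 frequencies

print(optical_image)  # 16000000
print(vla_map)        # 160000000
print(vla_cube)       # 40960000000
```

A single multifrequency radio observation at the upper limits quoted would therefore hold roughly forty times as many bits as a photographic plate of 10^9 bits.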
The Space Telescope Wide-Field/Planetary Camera will
produce typically fifty 1600 x 1600 images per day. The
high spatial resolution afforded by this space observa-
tory should, by mid-decade, provide astronomers with an
unprecedented view of the deep reaches of the Universe.
The application of these new imaging techniques, both
on the ground and in space observatories, offers enormous
promise in our quest to understand the geometry and
structure of the Universe and the evolution of galaxies
and stars. While astronomers are all beginning to take
advantage of the potential for discovery afforded by
application of these new tools, the community has just
begun to realize that these technological opportunities
present a major and immediate challenge to develop
innovative techniques for the analysis of digital images.
Application of interactive image analysis at several
large universities and national centers has
already per-
mitted astronomers to attack certain classes of astro-
nomical problems. For example, large-scale surveys of
galaxy surface brightness distributions have provided new
insights into the effects of environment on the formation
and evolution of galaxies. Statistical studies of stellar
images near the limit of detectability have suggested
that, in our Local Group of galaxies, there appear to be
radical differences in the history of star formation and
the types of stars formed in different systems. The fac-
tors that appear to influence the rate at which new stars
are formed in external galaxies and the factors that
determine how many stars are formed as a function of mass
are starting to be sorted out. Radio maps, with arcsecond
and subarcsecond resolution, have revealed structure rang-
ing from the remnants of protostellar clouds to the
detailed intensity and polarization structures of galactic
jets.
In all cases, progress on these problems depends in
large measure on our ability to study not one or a few
examples of an object or class of objects but rather a
statistically significant sample. For each problem,
astronomers must invent new techniques for extracting the
essential data from digital image arrays. The horizon
for creative analysis of these arrays is bounded solely
by the imagination of individual scientists and their
access to adequate image analysis facilities.
While we can easily predict an explosion in the number
of digital images generated on the ground and in space-
based experiments, there appear at present to be few
facilities capable of permitting individual astronomers
to interact creatively with these data. If we are to
take advantage of the promise of new imaging technology,
it is crucial that the astronomical community, in con-
junction with federal funding agencies, plan to develop
and disseminate adequate image data-analysis capability
throughout the United States.
The need for extensive image-analysis capability for
the astronomical community arises from (1) the increase
in the number of solid-state array detectors that will be
deployed at the major ground-based facilities over the
next several years; (2) the high rate of image data gener-
ation from the present-day deployment of such detectors,
which already threatens to overwhelm existing data-
analysis systems--data that will be produced by the Space
Telescope, particularly from the Wide-Field/Planetary
Camera and the Faint-Object Camera, will further burden
existing analysis systems; (3) our ability to recover,
with digital microphotometers, the information contained
on photographic plates (which will remain unchallenged as
recorders of moderate-photometric-accuracy large-area-
coverage data); and (4) the generation of high spatial
and spectral resolution maps from the Very Large Array
(VLA). We estimate that in the latter half of the 1980's
these sources will be generating image data at a rate of
approximately 5 X 10^12 to 10 X 10^12 bits per year
(including calibration frames and ancillary data).
Numbers in this range can be arrived at in several ways:
(1) by multiplying the estimated number of telescopes
equipped with image detectors by a typical night's output
by the number of nights per year or (2) by scaling the
output of Space Telescope (which has been estimated at
1ol bits per year from a detailed mission simulation)
to the number of ground-based telescopes that are expected
to be equipped with image detectors (taking account of
the fact that Wide Field/Planetary Camera images will be
1600 X 1600 pixels but ground-based systems built around
the same chip will probably produce 800 X 800 pixel
images). Also, these numbers are in agreement with
estimates made for the United Kingdom Starlink system
(see below) if the difference in availability of telescopes
is taken into account.
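The two routes to a yearly data rate described above can be sketched as simple products. The Wide-Field/Planetary Camera figures (fifty 1600 X 1600 images per day) come from the text; the 16 bits per pixel and 365 operating days are assumptions, and every input to the ground-based example is a hypothetical placeholder, not a value used by the Panel.

```python
def yearly_bits(images_per_day, side_pixels, bits_per_pixel, days_per_year):
    """Bits per year from one detector producing square images."""
    return images_per_day * side_pixels ** 2 * bits_per_pixel * days_per_year


# Method (2) starting point: the Wide-Field/Planetary Camera alone.
# From the text: fifty 1600 x 1600 images per day.
# ASSUMPTIONS: 16 bits per pixel and 365 days of operation per year.
wfpc_bits = yearly_bits(50, 1600, 16, 365)
print(f"{wfpc_bits:.2e}")  # 7.48e+11

# A ground-based 800 x 800 image holds one quarter of the pixels.
pixel_ratio = (1600 ** 2) // (800 ** 2)
print(pixel_ratio)  # 4


# Method (1): telescopes x nightly output x nights per year.
# ASSUMPTION: every argument in the example call is a hypothetical placeholder.
def community_bits(n_telescopes, images_per_night, nights_per_year):
    per_telescope = yearly_bits(images_per_night, 800, 16, nights_per_year)
    return n_telescopes * per_telescope


example = community_bits(n_telescopes=30, images_per_night=40, nights_per_year=250)
print(f"{example:.2e}")  # 3.07e+12
```

The placeholder inputs only show how the product behaves; changing any one of them by a factor of two moves the total by the same factor.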
Astronomers must be able to extract from digital images
the data relevant to the solutions of the problems they
pose. Extracting these data will involve two types of
operations--first, processing of the data, which involves
removing from the data the signatures characteristic of
an instrument. As an example, images generated from
solid-state arrays must be corrected pixel by pixel for
variation in sensitivity across the array. In the case
of intensified arrays, geometrical corrections to remove
the effects of image distortion introduced by the inten-
sifier may also have to be applied. Radio maps must be
explicitly corrected for sidelobe and other distortion
effects, to provide appropriate intensity versus position
maps at each observed frequency.
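The pixel-by-pixel sensitivity correction mentioned above can be illustrated as follows. This is a minimal sketch, assuming the relative sensitivity of each pixel is already known from a calibration frame and that a simple division is adequate; the function name and the sample values are hypothetical, and real reductions typically involve further steps not described here.

```python
def flat_field_correct(raw, sensitivity):
    """Divide each raw pixel by its sensitivity relative to the array mean."""
    n_pixels = sum(len(row) for row in sensitivity)
    mean_sensitivity = sum(sum(row) for row in sensitivity) / n_pixels
    return [
        [value * mean_sensitivity / s for value, s in zip(raw_row, sens_row)]
        for raw_row, sens_row in zip(raw, sensitivity)
    ]


# Hypothetical 2 x 2 frame of a uniformly illuminated field: the raw counts
# differ only because the pixels differ in sensitivity.
raw = [[90.0, 100.0], [110.0, 100.0]]
sensitivity = [[0.9, 1.0], [1.1, 1.0]]

corrected = flat_field_correct(raw, sensitivity)
for row in corrected:
    print([round(value, 3) for value in row])
```

After correction every pixel in the sample frame is close to 100, as expected for a uniform source.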
The second function that must be performed is analysis
of the data. After an array containing accurate inten-
sity-position data is produced, the paths to problem
solution begin to diverge. Each type of problem may
require the development of special techniques for ex-
tracting the relevant data from the image array and
providing such problem-specific measures as object
brightness, statistical "counts" of numbers of objects
brighter than a certain value, and distribution of bright-
ness with position. Operations carried out under the
rubric "analysis" often demand the development of imagina-
tive new techniques. In some cases, analysis will reveal
that residual instrumental effects are still present in
the data, thus forcing a return to the last stages of
processing. This is particularly true in the case of VLA
images, where often only analysis reveals the potential
and need for improvement in dynamic range.
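One of the problem-specific measures named above, the statistical "count" of objects brighter than a certain value, can be sketched in a few lines. The brightness values and thresholds below are hypothetical and in arbitrary units; object detection itself is assumed to have been done already.

```python
def counts_brighter_than(brightnesses, thresholds):
    """For each threshold, count the objects strictly brighter than it."""
    return {t: sum(1 for b in brightnesses if b > t) for t in thresholds}


# Hypothetical object brightnesses extracted from one image (arbitrary units).
measured = [3.2, 15.0, 7.5, 1.1, 22.4, 9.9]

counts = counts_brighter_than(measured, [1.0, 5.0, 10.0, 20.0])
print(counts)  # {1.0: 6, 5.0: 4, 10.0: 2, 20.0: 1}
```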
In developing a strategy for deployment of computing
equipment required to meet the challenge of image analy-
sis, we conclude the following: (a) Facilities capable
of processing and analyzing data should be placed at or
near the place of data origin. Each major ground-based
observatory should be provided computing capability ade-
quate to render data in the form of intensity-position
information. Such computers would have sufficient power
to permit the staff and a relatively small number of out-
side users to analyze their data in parallel with continu-
ing processing activities. (b) Analysis functions, on
the other hand, should be carried out as close to one's
home institution as possible.
The best scientific results
derive from the active participation in data analysis by
research astronomers and their graduate students operating
on their own schedules with time to think about the inter-
mediate results as they are obtained. For some problems,
many man-months or -years are required in order to develop
appropriate reduction algorithms.
These two distinct environments, with their differing
constraints of user access and throughput, require different
scales of investment. In the university community,
local facilities are required for data analysis and theo-
retical computation. Since these facilities are local,
comparatively small numbers of astronomers are served,
and they can make use of a ready access to the facility
over an extended period of time. By contrast, at the
National Astronomy Centers, including the Space Telescope
Science Institute (STScI), and those centers of data
acquisition where user service is contemplated, a large
number of users must be served. Each user must be able
to complete his or her work in a comparatively short time,
since most astronomers cannot easily arrange for long
periods away from their home institutions.
Also, at the
National Centers it is imperative that all data processing
(calibration and removal of instrumental effects) be done
so that all observers can, at minimum, be provided with
data as free as practical from instrumental aberrations.
This data-reduction problem should not propagate beyond
the data source. Thus, while it is usual for an analysis
problem to take an astronomer many months to complete at
his or her home institution, a few days is the typical
limit that one will likely be able to stay at a National
Center beyond the scheduled observing time for data col-
lection. National Centers should therefore be able to
provide processed data within a day or so of the end of
the observations. Routines should be available at
National Centers to allow visitors to perform standard
manipulations of data at a rate at least equal to that at
which it is gathered and to provide useful output formats
(maps, pictures, tables, plots).
Home institution computer
facilities are more useful for extensive interactive
image analysis, which may include development of new
techniques requiring experimentation and iteration.
This basic difference in the required user response
time of the facility serves to set the scale and nature
of the equipment required. National Centers require
integrated computer systems that allow efficient inter-
leaving of batch and interactive processing in large
volumes. Such a system may take the form of a major
batch facility (the host) interconnected to multiple,
smaller interactive satellite computer systems. In some
cases, rather elaborate pipelined computer structures may
be needed to handle extensive, special processing prob-
lems. The multiplicity of interactive computer facilities
is essential to assure the availability and response
required by short-term visitors. These satellite pro-
cessors could serve as models (with modest upgrading of
peripherals and memory) for individual university systems.
Larger university systems could more closely follow the
structure of the host processor, using this host to serve
both batch and multiple interactive stations. For the
National Centers, at present, a combined host-interactive
satellite computer scheme will cost (capital) $1 million
for a fully integrated system capable of serving simul-
taneously 4-6 interactive picture-processing functions
and the concurrent batch stream. Single-processor
university systems will range in cost from $100,000 to
$400,000, depending on the scope of the problems to be
attacked. The costs are likely to remain relatively
constant through the 1980's; as costs drop, demand may be
expected to increase at least as much.
We recommend that the funding agencies (NASA and NSF)
provide funds for the implementation of decentralized
image processing and analysis systems equivalent to 20
"canonical" systems (see Appendix 5.A). A continuing
level of funding is required so that when a "steady state"
is reached, the funding provides for the replacement of
systems every 6 years--the typical life of a computer
system before it becomes obsolete. Appendix 5.A elabo-
rates on the need for replacement and also discusses the
maintenance required for these systems.
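The steady-state replacement rate implied by 20 systems with a 6-year life is a one-line calculation. In the sketch below the system count and lifetime come from the text; the unit costs are an assumption, borrowed from the $100,000 to $400,000 range quoted for single-processor university systems, since the cost of a "canonical" system is given only in Appendix 5.A. Function names are illustrative.

```python
def steady_state_replacements(n_systems, lifetime_years):
    """Average number of systems replaced per year once a steady state is reached."""
    return n_systems / lifetime_years


def annual_replacement_cost(n_systems, lifetime_years, unit_cost):
    """Yearly capital outlay for replacements at a given cost per system."""
    return steady_state_replacements(n_systems, lifetime_years) * unit_cost


per_year = steady_state_replacements(20, 6)
print(round(per_year, 2))  # 3.33

# ASSUMPTION: unit costs taken from the university-system range in the text,
# not from the "canonical" system definition in Appendix 5.A.
low = annual_replacement_cost(20, 6, 100_000)
high = annual_replacement_cost(20, 6, 400_000)
print(round(low), round(high))  # 333333 1333333
```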
The British Starlink System (see below) provides a
similar ratio (actually somewhat higher) of computer
capacity to data volume as will be supplied by the pro-
posed systems with the data volumes estimated above.
Without array processors, the combined capacity of
these 20 systems would be sufficient to perform approxi-
mately 100 computer operations per pixel if the data
rates are as described above. (Here we use "operation"
to mean an elementary computer operation such as an
addition or multiplication of two numbers; in other
usage, the term "operation" may denote a function such as
addition of two images.)
This may seem like a lot of
capacity until one realizes that to perform a single func-
tion, many operations are required. Even simple func-
tions, such as reading an image from a disk file and
formatting it for display on a video monitor, can consume
tens of operations per pixel. Complex functions, such as
geometric corrections, correction for nonlinear transfer
functions, or applications of spatial filters can consume
hundreds of operations per pixel. The estimate that 20
systems are required is unlikely to be too large and may,
in fact, be considerably too small. However, the capa-
bility of adding array processors to these systems pro-
vides a margin of protection against this possibility.
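The argument above amounts to comparing a budget of about 100 operations per pixel with the cost of the functions an astronomer actually applies. The sketch below makes that comparison explicit; the per-function costs are hypothetical values chosen inside the ranges the text gives ("tens" for simple functions, "hundreds" for complex ones).

```python
BUDGET_OPS_PER_PIXEL = 100  # capacity of the 20 systems, from the text

# ASSUMPTION: illustrative per-pixel costs within the ranges quoted in the text.
pipeline = [
    ("read from disk and format for display", 30),  # simple: tens
    ("geometric correction", 300),                  # complex: hundreds
    ("spatial filter", 200),                        # complex: hundreds
]

total_ops = sum(cost for _, cost in pipeline)
over_budget = total_ops > BUDGET_OPS_PER_PIXEL

print(total_ops)                          # 530
print(over_budget)                        # True
print(total_ops / BUDGET_OPS_PER_PIXEL)   # 5.3
```

Under these placeholder costs a three-function sequence already needs several times the available capacity, which is why the estimate of 20 systems may be too small.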
Up to now the discussion in this section has concen-
trated on the hardware facilities required to provide
adequate image processing and analysis capabilities. Of
course, a substantial effort must be devoted to software
development. As described in Appendix 5.A, a typical
system will require, at a minimum, the support of one
full-time software person to provide for maintenance of
the system software and the development and maintenance
of basic applications software. Also as described in
Appendix 5.A, there exists considerable flexibility in
the manner in which this support is provided--part-time
student or faculty support, dedicated support, shared
support, or other.
However, with the decentralization of facilities as
proposed above, there is the potential that much of the
software development will be duplicated at each facility.
This is much more likely for image processing and analy-
sis applications than for theoretical applications. Soft-
ware required for the former is based on a core of stan-
dard functions such as contrast stretching, filtering,
and object identification, while software used for theo-
retical computations tends to be much more application-
dependent. The theorist tends to put his or her knowl-
edge of the physics of the problem into the software
itself, while the image analyst uses his or her knowledge
of the problem to determine the sequence of basic func-
tions to be applied to the data. (Of course, image
analysis often requires the development of unique
software to supplement the core of standard functions.)
A possible solution to the potential problem of dup-
lication of software development efforts is to insist
that all facilities be built around machines with the
same architecture (i.e., machines from a single family
from a single vendor) and to designate one of the facil-
ities to be responsible for software development for all
facilities. Such a solution would be similar to that
adopted by the British for their Starlink System.
The Starlink System, currently being implemented, is a
distributed system dedicated to image processing and
analysis. There are to be six nodes in the system, each
built around a Digital Equipment Corporation VAX 11/780
computer. The nodes will be connected by dedicated 9600-
baud lines, and one node will be responsible for software
development and maintenance. (A baud is a measure of
information-carrying capacity; in typical applications a
baud corresponds to 0.73 bit per second. Thus, a 9600-baud
line can carry about 875 eight-bit characters per second.)
The communications links will be used primarily for dis-
tribution of software updates and documentation.
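The line capacity quoted above follows from the figures in the text, and the same arithmetic suggests why such links suit software updates better than image transfer. This is a rough sketch: the 0.73 bit per baud figure is the one given in the text, and the image size (10^6 pixels of 16 bits) is the round figure used earlier in this section, with no allowance for protocol overhead or compression.

```python
BAUD_RATE = 9600
BITS_PER_BAUD = 0.73      # effective figure quoted in the text
BITS_PER_CHARACTER = 8

effective_bits_per_second = BAUD_RATE * BITS_PER_BAUD
characters_per_second = effective_bits_per_second / BITS_PER_CHARACTER
print(round(characters_per_second))  # 876

# Time to send one digital image of 10^6 pixels at 16 bits per pixel.
image_bits = 10**6 * 16
transfer_minutes = image_bits / effective_bits_per_second / 60
print(round(transfer_minutes, 1))  # 38.1
```

At well over half an hour per image, routine movement of image data over such lines would be impractical, whereas source code and documentation are small by comparison.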
While such a system has advantages and is well matched
to the needs of the British astronomical community, we
believe that the Starlink model is not entirely appropri-
ate for image processing and analysis in the United
States. The U.S. astronomical community is spread over a
much wider geographical area than is the British com-
munity, and, even with low speed lines, the communication
links would be a significant expense. Furthermore, funds
are probably not available to purchase the complete facil-
ities at the same time as the British are doing. This
means that the facilities would have to be installed grad-
ually and updated gradually. If, in addition, the key to
the successful operation of the software is the fact that
the same computer and peripheral architectures exist at
each node, then the facilities would be locked into the
architecture they started with until that family of
computers was discontinued by the manufacturer. At this
point a tremendous spike in funding would be required to
convert both hardware and software to a new architecture.
Finally, it appears that a distributed system such as
Starlink offers less flexibility than a decentralized
system for tailoring the capabilities of individual nodes
to the problems to be addressed at the node.
Our suggested approach to the software problem is not
a sure-fire solution, but it does represent, we believe,
the best solution available at present. The National
Centers (particularly Kitt Peak National Observatory, the
National Radio Astronomy Observatory, and STScI) should
take an active role in software development by
1. Developing software that is as machine independent
as possible;
2. Adequately documenting the software;
3. Distributing standard processing and analysis
software and/or algorithms; and
4. Assisting users in the implementation and use of
software developed at the National Centers.
Of these four functions, only the last two should impose
any additional costs on the Centers. The first two are
required in any well-managed software effort. We estimate
that the last two functions can be provided at the addi-
tional expense of one man-year (plus overhead) per center
per year.
In addition, it is necessary that there be some com-
munity-wide coordination in such areas as software devel-
opment and data format standards. For this purpose, a
permanent committee should be established, perhaps under
the auspices of the AAS and perhaps with federal support.
The committee would be responsible for continuing evalua-
tion of astronomical computing needs, developing standards
for data formats and software compatibility, and collect-
ing and disseminating software and hardware news of rele-
vance to the community. The committee should also play a
coordinating role in the archiving and data-base activ-
ities described in the next two sections. Finally, the
committee should provide a focal point for liaison with
Starlink and other non-U.S. image processing and analysis
facilities.