Methods of Data Collection, Representation, and Analysis
This chapter concerns research on collecting, representing, and analyzing
the data that underlie behavioral and social sciences knowledge. Such research,
methodological in character, includes ethnographic and historical approaches,
scaling, axiomatic measurement, and statistics, with its important relatives,
econometrics and psychometrics. The field can be described as including the
self-conscious study of how scientists draw inferences and reach conclusions
from observations. Since statistics is the largest and most prominent of meth-
odological approaches and is used by researchers in virtually every discipline,
statistical work draws the lion's share of this chapter's attention.
Problems of interpreting data arise whenever inherent variation or measurement
fluctuations make it difficult to understand data or to judge whether
observed relationships are significant, durable, or general. Some examples: Is
a sharp monthly (or yearly) increase in the rate of juvenile delinquency (or
unemployment) in a particular area a matter for alarm, an ordinary periodic
or random fluctuation, or the result of a change or quirk in reporting method?
Do the temporal patterns seen in such repeated observations reflect a direct
causal mechanism, a complex of indirect ones, or just imperfections in the
data? Is a decrease in auto injuries an effect of a new seat-belt law? Are the
disagreements among people describing some aspect of a subculture too great
to draw valid inferences about that aspect of the culture?
Such issues of inference are often closely connected to substantive theory
and specific data, and to some extent it is difficult and perhaps misleading to
treat methods of data collection, representation, and analysis separately. This
report does so, as do all sciences to some extent, because the methods developed
often are far more general than the specific problems that originally gave rise
to them. There is much transfer of new ideas from one substantive field to
another—and to and from fields outside the behavioral and social sciences.
Some of the classical methods of statistics arose in studies of astronomical
observations, biological variability, and human diversity. The major growth of
the classical methods occurred in the twentieth century, greatly stimulated by
problems in agriculture and genetics. Some methods for uncovering geometric
structures in data, such as multidimensional scaling and factor analysis, orig-
inated in research on psychological problems, but have been applied in many
other sciences. Some time-series methods were developed originally to deal
with economic data, but they are equally applicable to many other kinds of
data.
Within the behavioral and social sciences, statistical methods have been
developed in and have contributed to an enormous variety of research, includ-
ing:
· In economics: large-scale models of the U.S. economy; effects of taxa-
tion, money supply, and other government fiscal and monetary policies;
theories of duopoly, oligopoly, and rational expectations; economic effects
of slavery.
· In psychology: test calibration; the formation of subjective probabilities,
their revision in the light of new information, and their use in decision
making; psychiatric epidemiology and mental health program evaluation.
· In sociology and other fields: victimization and crime rates; effects of
incarceration and sentencing policies; deployment of police and fire-fight-
ing forces; discrimination, antitrust, and regulatory court cases; social net-
works; population growth and forecasting; and voting behavior.
Even such an abridged listing makes clear that improvements in method-
ology are valuable across the spectrum of empirical research in the behavioral
and social sciences as well as in application to policy questions. Clearly, meth-
odological research serves many different purposes, and there is a need to
develop different approaches to serve those different purposes, including ex-
ploratory data analysis, scientific inference about hypotheses and population
parameters, individual decision making, forecasting what will happen in the
event or absence of intervention, and assessing causality from both randomized
experiments and observational data.
This discussion of methodological research is divided into three areas: de-
sign, representation, and analysis. The efficient design of investigations must
take place before data are collected because it involves how much, what kind
of, and how data are to be collected. What type of study is feasible: experi-
mental, sample survey, field observation, or other? What variables should be
measured, controlled, and randomized? How extensive a subject pool or ob-
servational period is appropriate? How can study resources be allocated most
effectively among various sites, instruments, and subsamples?
The construction of useful representations of the data involves deciding what
kind of formal structure best expresses the underlying qualitative and quanti-
tative concepts that are being used in a given study. For example, cost of living
is a simple concept to quantify if it applies to a single individual with unchang-
ing tastes in stable markets (that is, markets offering the same array of goods
from year to year at varying prices), but as a national aggregate for millions of
households and constantly changing consumer product markets, the cost of
living is not easy to specify clearly or measure reliably. Statisticians, economists,
sociologists, and other experts have long struggled to make the cost of living
a precise yet practicable concept that is also efficient to measure, and they must
continually modify it to reflect changing circumstances.
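
The simple case described above, a single consumer buying the same array of goods from year to year at varying prices, can be sketched as a fixed-basket price index (often called a Laspeyres-type index). This is only an illustration under stated assumptions: the function name, the goods, the quantities, and the prices are hypothetical and do not come from the chapter, and the sketch deliberately ignores the changing tastes and product markets that make the national aggregate hard to define.

```python
def fixed_basket_index(basket, base_prices, current_prices):
    """Cost of a fixed basket at current prices, relative to base prices (base = 100).

    basket maps each good to the quantity bought; the two price dictionaries
    map each good to its unit price in the base and current periods.
    """
    base_cost = sum(qty * base_prices[good] for good, qty in basket.items())
    current_cost = sum(qty * current_prices[good] for good, qty in basket.items())
    if base_cost <= 0:
        raise ValueError("base-period cost of the basket must be positive")
    return 100.0 * current_cost / base_cost


# Hypothetical two-good basket for one consumer with unchanging tastes.
basket = {"bread": 10, "milk": 5}
base_prices = {"bread": 2.0, "milk": 1.0}
current_prices = {"bread": 2.5, "milk": 1.2}

index = fixed_basket_index(basket, base_prices, current_prices)
```

Once the basket itself must change from year to year, or differ across millions of households, a single formula of this kind no longer settles what the index means, which is the difficulty the paragraph describes.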
Data analysis covers the final step of characterizing and interpreting research
findings: Can estimates of the relations between variables be made? Can some
conclusion be drawn about correlation, cause and effect, or trends over time?
How uncertain are the estimates and conclusions and can that uncertainty be
reduced by analyzing the data in a different way? Can computers be used to
display complex results graphically for quicker or better understanding or to
suggest different ways of proceeding?
Advances in analysis, data representation, and research design feed into and
reinforce one another in the course of actual scientific work. The intersections
between methodological improvements and empirical advances are an important
aspect of the multidisciplinary thrust of progress in the behavioral and social
sciences.
DESIGNS FOR DATA COLLECTION
Four broad kinds of research designs are used in the behavioral and social
sciences: experimental, survey, comparative, and ethnographic.
Experimental designs, in either the laboratory or field settings, systematically
manipulate a few variables while others that may affect the outcome are held
constant, randomized, or otherwise controlled. The purpose of randomized
experiments is to ensure that only one or a few variables can systematically
affect the results, so that causes can be attributed. Survey designs include the
collection and analysis of data from censuses, sample surveys, and longitudinal
studies and the examination of various relationships among the observed
phenomena. Randomization plays a different role here than in experimental
designs: it is used to select members of a sample so that the sample is as
representative of the whole population as possible. Comparative designs involve the
retrieval of evidence that is recorded in the flow of current or past events in
different times or places and the interpretation and analysis of this evidence.
Ethnographic designs, also known as participant-observation designs, involve
a researcher in intensive and direct contact with a group, community, or
population being studied, through participation, observation, and extended
interviewing.
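
The two roles of randomization distinguished above, assigning subjects to conditions in an experiment and selecting members of a sample in a survey, can be sketched briefly. This is a minimal illustration rather than a procedure from the chapter: the function names, the integer identifiers, and the fixed seeds are assumptions made only for the example.

```python
import random


def assign_treatment(subject_ids, seed=0):
    """Experimental use: randomly split subjects into treatment and control."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "treatment": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }


def draw_sample(population_ids, n, seed=0):
    """Survey use: draw a simple random sample of n units without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(list(population_ids), n))


# Hypothetical study: 20 recruited subjects, and a population of 1,000 units.
groups = assign_treatment(range(20))
sample = draw_sample(range(1000), 50)
```

In the first function every recruited subject is used and chance decides only the condition; in the second, chance decides who is observed at all.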
Experimental Designs
Laboratory Experiments
Laboratory experiments underlie most of the work reported in Chapter 1,
significant parts of Chapter 2, and some of the newest lines of research in
Chapter 3. Laboratory experiments extend and adapt classical methods of de-
sign first developed, for the most part, in the physical and life sciences and
agricultural research. Their main feature is the systematic and independent
manipulation of a few variables and the strict control or randomization of all
other variables that might affect the phenomenon under study. For example,
some studies of animal motivation involve the systematic manipulation of amounts
of food and feeding schedules while other factors that may also affect motiva-
tion, such as body weight, deprivation, and so on, are held constant. New
designs are currently coming into play largely because of new analytic and
computational methods (discussed below, in "Advances in Statistical Inference
and Analysis").
Two examples of empirically important issues that demonstrate the need for
broadening classical experimental approaches are open-ended responses and
lack of independence of successive experimental trials. The first concerns the
design of research protocols that do not require the strict segregation of the
events of an experiment into well-defined trials, but permit a subject to respond
at will. These methods are needed when what is of interest is how the respond-
ent chooses to allocate behavior in real time and across continuously available
alternatives. Such empirical methods have long been used, but they can gen-
erate very subtle and difficult problems in experimental design and subsequent
analysis. As theories of allocative behavior of all sorts become more sophisti-
cated and precise, the experimental requirements become more demanding,
so the need to better understand and solve this range of design issues is an
outstanding challenge to methodological ingenuity.
The second issue arises in repeated-trial designs when the behavior on suc-
cessive trials, even if it does not exhibit a secular trend (such as a learning
curve), is markedly influenced by what has happened in the preceding trial or
trials. The more naturalistic the experiment and the more sensitive the
measurements taken, the more likely it is that such effects will occur. But such
sequential dependencies in observations cause a number of important concep-
tual and technical problems in summarizing the data and in testing analytical
models, which are not yet completely understood. In the absence of clear
solutions, such effects are sometimes ignored by investigators, simplifying the
data analysis but leaving residues of skepticism about the reliability and sig-
nificance of the experimental results. With continuing development of sensitive
measures in repeated-trial designs, there is a growing need for more advanced
concepts and methods for dealing with experimental results that may be influ-
enced by sequential dependencies.
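
One simple first check for the sequential dependencies discussed above is the lag-1 autocorrelation of the trial-by-trial responses. The sketch below is an assumption-laden illustration, not a method proposed in the chapter: the function name and the response series are hypothetical, and a single lag-1 coefficient cannot capture dependencies that reach back over several trials.

```python
def lag1_autocorrelation(values):
    """Correlation between each observation and the one that follows it."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    num = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    return num / denom


# Hypothetical responses with no trend but strict alternation across trials.
responses = [1, 0, 1, 0, 1, 0, 1, 0]
r1 = lag1_autocorrelation(responses)  # -0.875: each trial tends to reverse the last
```

A value near zero is consistent with independent trials; a value far from zero, as here, signals that analyses treating trials as independent may misstate the reliability of the results.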
Randomized Field Experiments
The state of the art in randomized field experiments, in which different
policies or procedures are tested in controlled trials under real conditions, has
advanced dramatically over the past two decades. Problems that were once
considered major methodological obstacles (such as implementing randomized
field assignment to treatment and control groups and protecting the
randomization procedure from corruption) have been largely overcome. While
state-of-the-art standards are not achieved in every field experiment, the com-
mitment to reaching them is rising steadily, not only among researchers but
also among customer agencies and sponsors.
The health insurance experiment described in Chapter 2 is an example of a
major randomized field experiment that has had and will continue to have
important policy reverberations in the design of health care financing. Field
experiments with the negative income tax (guaranteed minimum income) con-
ducted in the 1970s were significant in policy debates, even before their com-
pletion, and provided the most solid evidence available on how tax-based
income support programs and marginal tax rates can affect the work incentives
and family structures of the poor. Important field experiments have also been
carried out on alternative strategies for the prevention of delinquency and other
criminal behavior, reform of court procedures, rehabilitative programs in men-
tal health, family planning, and special educational programs, among other
areas.
In planning field experiments, much hinges on the definition and design of
the experimental cells, the particular combinations needed of treatment and
control conditions for each set of demographic or other client sample charac-
teristics, including specification of the minimum number of cases needed in
each cell to test for the presence of effects. Considerations of statistical power,
client availability, and the theoretical structure of the inquiry enter into such
specifications. Current important methodological thresholds are to find better
ways of predicting recruitment and attrition patterns in the sample, of designing
experiments that will be statistically robust in the face of problematic sample
recruitment or excessive attrition, and of ensuring appropriate acquisition and
analysis of data on the attrition component of the sample.
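
The specification of a minimum number of cases per cell can be illustrated with a standard normal-approximation calculation for comparing two group means. This is a rough sketch under stated assumptions: the function name and default values are hypothetical, the effect size is taken to be a standardized mean difference, and attrition is treated as a known, uniform rate, which is exactly the quantity the paragraph says is hard to predict.

```python
import math
from statistics import NormalDist


def cases_per_cell(effect_size, alpha=0.05, power=0.80, attrition=0.0):
    """Approximate cases to recruit per cell for a two-group comparison of means.

    effect_size is the difference in means divided by the common standard
    deviation. The normal approximation
        n = 2 * ((z_{1 - alpha/2} + z_{power}) / effect_size) ** 2
    gives the number of completed cases needed; dividing by (1 - attrition)
    inflates recruitment so the expected number remaining meets that target.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not 0 <= attrition < 1:
        raise ValueError("attrition must be in [0, 1)")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n_complete = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n_complete / (1 - attrition))


n_no_attrition = cases_per_cell(0.5)                   # 63 per cell
n_with_attrition = cases_per_cell(0.5, attrition=0.2)  # 79 per cell
```

The calculation makes visible how quickly required recruitment grows when attrition is substantial or the expected effect is small.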
Also of major significance are improvements in integrating detailed process
and outcome measurements in field experiments. To conduct research on program
effects under field conditions requires continual monitoring to determine
exactly what is being done (the process) and how it corresponds to what was
projected at the outset. Relatively unintrusive, inexpensive, and effective im-
plementation measures are of great interest. There is, in parallel, a growing
emphasis on designing experiments to evaluate distinct program components
in contrast to summary measures of net program effects.
Finally, there is an important opportunity now for further theoretical work
to model organizational processes in social settings and to design and select
outcome variables that, in the relatively short time of most field experiments,
can predict longer-term effects: For example, in job-training programs, what
are the effects on the community (role models, morale, referral networks) or
on individual skills, motives, or knowledge levels that are likely to translate
into sustained changes in career paths and income levels?
Survey Designs
Many people have opinions about how societal mores, economic conditions,
and social programs shape lives and encourage or discourage various kinds of
behavior. People generalize from their own cases, and from the groups to which
they belong, about such matters as how much it costs to raise a child, the extent
to which unemployment contributes to divorce, and so on. In fact, however,
effects vary so much from one group to another that homespun generalizations
are of little use. Fortunately, behavioral and social scientists have been able to
bridge the gaps between personal perspectives and collective realities by means
of survey research. In particular, governmental information systems include
volumes of extremely valuable survey data, and the facility of modern com-
puters to store, disseminate, and analyze such data has significantly improved
empirical tests and led to new understandings of social processes.
Within this category of research designs, two major types are distinguished:
repeated cross-sectional surveys and longitudinal panel surveys. In addition,
and cross-cutting these types, there is a major effort under way to improve and
refine the quality of survey data by investigating features of human memory
and of question formation that affect survey response.
Repeated cross-sectional designs can either attempt to measure an entire
population (as does the oldest U.S. example, the national decennial census)
or they can rest on samples drawn from a population. The general principle is
to take independent samples at two or more times, measuring the variables of
interest, such as income levels, housing plans, or opinions about public affairs,
in the same way. The General Social Survey, collected by the National Opinion
Research Center with National Science Foundation support, is a repeated
cross-sectional data base that was begun in 1972. One methodological question of
particular salience in such data is how to adjust for nonresponses and "don't
know" responses. Another is how to deal with self-selection bias. For example,
to compare the earnings of women and men in the labor force, it would be
mistaken to first assume that the two samples of labor-force participants are
randomly selected from the larger populations of men and women; instead,
one has to consider and incorporate in the analysis the factors that determine
who is in the labor force.
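
A small simulation can show why labor-force participants cannot be treated as a random sample of the population. It is a deliberately stylized sketch, not an analysis from the chapter: the function name, the normal distributions for wage offers and reservation wages, and the rule that a person participates only when the offer exceeds the reservation wage are all assumptions chosen for illustration.

```python
import random


def simulate_selection(n=20000, seed=2):
    """Compare mean wage offers in the whole population and among participants."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    offers = []
    observed = []
    for _ in range(n):
        offer = rng.gauss(10.0, 2.0)        # hypothetical wage offer
        reservation = rng.gauss(10.0, 2.0)  # hypothetical reservation wage
        offers.append(offer)
        if offer > reservation:             # self-selection into the labor force
            observed.append(offer)
    population_mean = sum(offers) / n
    participant_mean = sum(observed) / len(observed)
    participation_rate = len(observed) / n
    return population_mean, participant_mean, participation_rate


population_mean, participant_mean, participation_rate = simulate_selection()
```

Under these assumptions the mean wage among participants exceeds the mean offer in the population, even though nothing about the offers differs between the two groups; the gap is produced entirely by who chooses to participate.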
In longitudinal panels, a sample is drawn at one point in time and the relevant
variables are measured at this and subsequent times for the same people. In
more complex versions, some fraction of each panel may be replaced or added
to periodically, such as expanding the sample to include households formed
by the children of the original sample. An example of panel data developed in
this way is the Panel Study of Income Dynamics (PSID), conducted by the
University of Michigan since 1968 (discussed in Chapter 3).
Comparing the fertility or income of different people in different
circumstances at the same time to find correlations always leaves a large proportion
of the variability unexplained, but common sense suggests that much of the
unexplained variability is actually explicable. There are systematic reasons for
individual outcomes in each person's past achievements, in parental models,
upbringing, and earlier sequences of experiences. Unfortunately, asking people
about the past is not particularly helpful: people remake their views of the past
to rationalize the present and so retrospective data are often of uncertain va-
lidity. In contrast, generation-long longitudinal data allow readings on the
sequence of past circumstances uncolored by later outcomes. Such data are
uniquely useful for studying the causes and consequences of naturally occurring
decisions and transitions. Thus, as longitudinal studies continue, quantitative
analysis is becoming feasible about such questions as: How are the decisions
of individuals affected by parental experience? Which aspects of early
decisions constrain later opportunities? And how does detailed background
experience leave its imprint? Studies like the two-decade-long PSID are bring-
ing within grasp a complete generational cycle of detailed data on fertility, work
life, household structure, and income.
Advances in Longitudinal Designs
Large-scale longitudinal data collection projects are uniquely valuable as
vehicles for testing and improving survey research methodology. In ways that
lie beyond the scope of a cross-sectional survey, longitudinal studies can
sometimes be designed (without significant detriment to their substantive
interests) to facilitate the evaluation and upgrading of data quality; the analysis of
relative costs and effectiveness of alternative techniques of inquiry; and the
standardization or coordination of solutions to problems of method, concept,
and measurement across different research domains.
Some areas of methodological improvement include discoveries about the
impact of interview mode on response (mail, telephone, face-to-face); the effects
of nonresponse on the representativeness of a sample (due to respondents'
refusal or interviewers' failure to contact); the effects on behavior of continued
participation over time in a sample survey; the value of alternative methods of
adjusting for nonresponse and incomplete observations (such as imputation of
missing data, variable case weighting); the impact on response of specifying
different recall periods, varying the intervals between interviews, or changing
the length of interviews; and the comparison and calibration of results obtained
by longitudinal surveys, randomized field experiments, laboratory studies, one-
time surveys, and administrative records.
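
Of the adjustment methods listed above, variable case weighting is the simplest to sketch. The example below uses a weighting-class adjustment, in which respondents in each class are weighted up to stand in for that class's nonrespondents. It is an illustration only: the function names, class labels, counts, and means are hypothetical, and the method rests on the strong assumption that nonrespondents resemble respondents within the same class.

```python
def nonresponse_weights(sampled, responded, base_weight=1.0):
    """Weight for each class = base weight * (cases sampled / cases responding)."""
    weights = {}
    for cell, n_sampled in sampled.items():
        n_resp = responded.get(cell, 0)
        if n_resp == 0:
            raise ValueError(f"no respondents in class {cell!r}; collapse classes first")
        weights[cell] = base_weight * n_sampled / n_resp
    return weights


def weighted_mean(cell_means, responded, weights):
    """Mean of the respondents' values after applying the class weights."""
    total_weight = sum(weights[c] * responded[c] for c in cell_means)
    weighted_sum = sum(weights[c] * responded[c] * cell_means[c] for c in cell_means)
    return weighted_sum / total_weight


# Hypothetical survey in which urban households respond at a lower rate.
sampled = {"urban": 600, "rural": 400}
responded = {"urban": 300, "rural": 320}
cell_means = {"urban": 10.0, "rural": 20.0}

weights = nonresponse_weights(sampled, responded)
adjusted = weighted_mean(cell_means, responded, weights)
unadjusted = sum(responded[c] * cell_means[c] for c in cell_means) / sum(responded.values())
```

Here the unadjusted respondent mean overrepresents the rural class, and the weights restore each class to its share of the original sample.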
It should be especially noted that incorporating improvements in method-
ology and data quality has been and will no doubt continue to be crucial to
the growing success of longitudinal studies. Panel designs are intrinsically more
vulnerable than other designs to statistical biases due to cumulative item non-
response, sample attrition, time-in-sample effects, and error margins in re-
peated measures, all of which may produce exaggerated estimates of change.
Over time, a panel that was initially representative may become much less
representative of a population, not only because of attrition in the sample, but
also because of changes in immigration patterns, age structure, and the like.
Longitudinal studies are also subject to changes in scientific and societal con-
texts that may create uncontrolled drifts over time in the meaning of nominally
stable questions or concepts as well as in the underlying behavior. Also, a
natural tendency to expand over time the range of topics and thus the interview
lengths, which increases the burdens on respondents, may lead to deterioration
of data quality or relevance. Careful methodological research to understand
and overcome these problems has been done, and continued work as a com-
ponent of new longitudinal studies is certain to advance the overall state of the
art.
Longitudinal studies are sometimes pressed for evidence they are not de-
signed to produce: for example, in important public policy questions concern-
ing the impact of government programs in such areas as health promotion,
disease prevention, or criminal justice. By using research designs that combine
field experiments (with randomized assignment to program and control con-
ditions) and longitudinal surveys, one can capitalize on the strongest merits of
each: the experimental component provides stronger evidence for causal
statements that are critical for evaluating programs and for illuminating some
fundamental theories; the longitudinal component helps in the estimation of
long-term program effects and their attenuation. Coupling experiments to ongoing
longitudinal studies is not often feasible, given the multiple constraints of not
disrupting the survey, developing all the complicated arrangements that go
into a large-scale field experiment, and having the populations of interest over-
lap in useful ways. Yet opportunities to join field experiments to surveys are
of great importance. Coupled studies can produce vital knowledge about the
empirical conditions under which the results of longitudinal surveys turn out
to be similar to, or divergent from, those produced by randomized field
experiments. A pattern of divergence and similarity has begun to emerge in
coupled studies; additional cases are needed to understand why some naturally
occurring social processes and longitudinal design features seem to approxi-
mate formal random allocation and others do not. The methodological impli-
cations of such new knowledge go well beyond program evaluation and survey
research. These findings bear directly on the confidence scientists and oth-
ers can have in conclusions from observational studies of complex behavioral
and social processes, particularly ones that cannot be controlled or simulated
within the confines of a laboratory environment.
Memory and the Framing of Questions
A very important opportunity to improve survey methods lies in the reduc-
tion of nonsampling error due to questionnaire context, phrasing of questions,
and, generally, the semantic and social-psychological aspects of surveys. Survey
data are particularly affected by the fallibility of human memory and the sen-
sitivity of respondents to the framework in which a question is asked. This
sensitivity is especially strong for certain types of attitudinal and opinion ques-
tions. Efforts are now being made to bring survey specialists into closer contact
with researchers working on memory function, knowledge representation, and
language in order to uncover and reduce this kind of error.
Memory for events is often inaccurate, biased toward what respondents
believe to be true (or should be true) about the world. In many cases in
which data are based on recollection, improvements can be achieved by shifting
to techniques of structured interviewing and calibrated forms of memory elic-
itation, such as specifying recent, brief time periods (for example, in the last
seven days) within which respondents recall certain types of events with ac-
ceptable accuracy.
Experiments on individual decision making show that the way a question is
framed predictably alters the responses. Analysts of survey data find that some
small changes in the wording of certain kinds of questions can produce large
differences in the answers, although other wording changes have little effect.
Even simply changing the order in which some questions are presented can
produce large differences, although for other questions the order of presenta-
tion does not matter. For example, the following questions were among those
asked in one wave of the General Social Survey:
· "Taking things altogether, how would you describe your marriage? Would
you say that your marriage is very happy, pretty happy, or not too happy?"
· "Taken altogether how would you say things are these days—would you
say you are very happy, pretty happy, or not too happy?"
falsity makes sense. In particular, statements that remain invariant under certain
symmetries of structure have played an important role in classical geometry,
dimensional analysis in physics, and in relating measurement and statistical
models applied to the same phenomenon. In addition, these ideas have been
used to construct models in more formally developed areas of the behavioral
and social sciences, such as psychophysics. Current research has emphasized
the communality of these historically independent developments and is at-
tempting both to uncover systematic, philosophically sound arguments as to
why invariance under symmetries is as important as it appears to be and to
understand what to do when structures lack symmetry, as, for example, when
variables have an inherent upper bound.
Clustering
Many subjects do not seem to be correctly represented in terms of distances
in continuous geometric space. Rather, in some cases, such as the relations
among meanings of words (which is of great interest in the study of memory
representations), a description in terms of tree-like, hierarchical structures
appears to be more illuminating. This kind of description appears appropriate
because of both the categorical nature of the judgments and the hierarchical,
rather than trade-off, nature of the structure. Individual items are represented
as the terminal nodes of the tree, and groupings by different degrees of similarity
are shown as intermediate nodes, with the more general groupings occurring
nearer the root of the tree. Clustering techniques, requiring considerable com-
putational power, have been and are being developed. Some successful appli-
cations exist, but much more refinement is anticipated.
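
The tree representation described above can be sketched with a small agglomerative procedure that repeatedly merges the two closest groups. The sketch uses single linkage (distance between groups is the smallest distance between their members), which is one of several possible choices and is not attributed to the chapter; the function name, the words, and the dissimilarity values are hypothetical.

```python
def single_linkage_tree(labels, dist):
    """Build a nested-tuple tree by repeatedly merging the two closest clusters.

    dist maps frozenset({a, b}) to the dissimilarity between items a and b.
    Returns the tree and a list of (merge distance, sorted members) records.
    """
    if not labels:
        raise ValueError("need at least one item")
    # Each cluster is a pair: (tree built so far, set of member labels).
    clusters = [(label, frozenset([label])) for label in labels]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(
                    dist[frozenset((a, b))]
                    for a in clusters[i][1]
                    for b in clusters[j][1]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        tree = (clusters[i][0], clusters[j][0])
        members = clusters[i][1] | clusters[j][1]
        merges.append((d, sorted(members)))
        remaining = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters = remaining + [(tree, members)]
    return clusters[0][0], merges


# Hypothetical judged dissimilarities among four word meanings.
words = ["dog", "cat", "oak", "pine"]
dissimilarity = {
    frozenset(("dog", "cat")): 1, frozenset(("oak", "pine")): 2,
    frozenset(("dog", "oak")): 6, frozenset(("dog", "pine")): 7,
    frozenset(("cat", "oak")): 5, frozenset(("cat", "pine")): 8,
}
tree, merges = single_linkage_tree(words, dissimilarity)
```

The individual words are the terminal nodes, each merge is an intermediate node, and the last merge, at the largest dissimilarity, is the root. The nested loops also show why such techniques demand considerable computation as the number of items grows.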
Network Models
Several other lines of advanced modeling have progressed in recent years,
opening new possibilities for empirical specification and testing of a variety of
theories. In social network data, relationships among units, rather than the
units themselves, are the primary objects of study: friendships among persons,
trade ties among nations, cocitation clusters among research scientists, inter-
locking among corporate boards of directors. Special models for social network
data have been developed in the past decade, and they give, among other things,
precise new measures of the strengths of relational ties among units. A major
challenge in social network data at present is to handle the statistical depend-
ence that arises when the units sampled are related in complex ways.
STATISTICAL INFERENCE AND ANALYSIS
As was noted earlier, questions of design, representation, and analysis are
intimately intertwined. Some issues of inference and analysis have been
discussed above as related to specific data collection and modeling approaches.
This section discusses some more general issues of statistical inference and
advances in several current approaches to them.
Causal Inference
Behavioral and social scientists use statistical methods primarily to infer the
effects of treatments, interventions, or policy factors. Previous chapters in-
cluded many instances of causal knowledge gained this way. As noted above,
the large experimental study of alternative health care financing discussed in
Chapter 2 relied heavily on statistical principles and techniques, including
randomization, in the design of the experiment and the analysis of the resulting
data. Sophisticated designs were necessary in order to answer a variety of
questions in a single large study without confusing the effects of one program
difference (such as prepayment or fee for service) with the effects of another
(such as different levels of deductible costs), or with effects of unobserved
variables (such as genetic differences). Statistical techniques were also used to
ascertain which results applied across the whole enrolled population and which
were confined to certain subgroups (such as individuals with high blood pres-
sure) and to translate utilization rates across different programs and types of
patients into comparable overall dollar costs and health outcomes for alternative
financing options.
A classical experiment, with systematic but randomly assigned variation of
the variables of interest (or some reasonable approach to this), is usually con-
sidered the most rigorous basis from which to draw such inferences. But ran-
dom samples or randomized experimental manipulations are not always fea-
sible or ethically acceptable. Then, causal inferences must be drawn from
observational studies, which, however well designed, are less able to ensure
that the observed (or inferred) relationships among variables provide clear
evidence on the underlying mechanisms of cause and effect.
Certain recurrent challenges have been identified in studying causal infer-
ence. One challenge arises from the selection of background variables to be
measured, such as the sex, nativity, or parental religion of individuals in a
comparative study of how education affects occupational success. The adequacy
of classical methods of matching groups in background variables and adjusting
for covariates needs further investigation. Statistical adjustment of biases linked
to measured background variables is possible, but it can become complicated.
Current work in adjustment for selectivity bias is aimed at weakening implau-
sible assumptions, such as normality, when carrying out these adjustments.
Even after adjustment has been made for the measured background variables,
other, unmeasured variables are almost always still affecting the results (such
as family transfers of wealth or reading habits). Analysis of how the conclusions
might change if such unmeasured variables could be taken into account is
essential in attempting to make causal inferences from an observational study,
and systematic work on useful statistical models for such sensitivity analyses
is just beginning.
The third important issue arises from the necessity for distinguishing among
competing hypotheses when the explanatory variables are measured with dif-
ferent degrees of precision. Both the estimated size and significance of an effect
are diminished when it has large measurement error, and the coefficients of
other correlated variables are affected even when the other variables are meas-
ured perfectly. Similar results arise from conceptual errors, when one measures
only proxies for a theoretical construct (such as years of education to represent
amount of learning). In some cases, there are procedures for simultaneously
or iteratively estimating both the precision of complex measures and their effect
on a particular criterion.
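The diminishing effect of measurement error described above can be made concrete with the classical errors-in-variables result for a single explanatory variable. The notation below is an assumption introduced for illustration and does not come from the report: x is the true variable, x* its error-prone measurement, and u a measurement error that is independent of both x and the regression error.

```latex
% Assumed notation (illustrative only): true model and error-prone measurement.
\[
  y = \alpha + \beta x + \varepsilon, \qquad x^{*} = x + u .
\]
% Regressing y on x^{*} instead of x shrinks the estimated slope toward zero.
\[
  \hat{\beta}_{x^{*}} \;\xrightarrow{\;p\;}\;
  \beta \,\frac{\sigma_{x}^{2}}{\sigma_{x}^{2} + \sigma_{u}^{2}} .
\]
```

Under these assumptions the multiplying factor lies between 0 and 1, so the larger the error variance relative to the variance of the true variable, the more the estimated effect is understated.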
Although complex models are often necessary to infer causes, once their
output is available, it should be translated into understandable displays for
evaluation. Results that depend on the accuracy of a multivariate model and
the associated software need to be subjected to appropriate checks, including
the evaluation of graphical displays, group comparisons, and other analyses.
New Statistical Techniques
Internal Resampling
One of the great contributions of twentieth-century statistics was to dem-
onstrate how a properly drawn sample of sufficient size, even if it is only a tiny
fraction of the population of interest, can yield very good estimates of most
population characteristics. When enough is known at the outset about the
characteristic in question (for example, that its distribution is roughly normal),
inference from the sample data to the population as a whole is straightforward,
and one can easily compute measures of the certainty of inference, a
common example being the 95 percent confidence interval around an estimate.
But population shapes are sometimes unknown or uncertain, and so inference
procedures cannot be so simple. Furthermore, more often than not, it is difficult
to assess even the degree of uncertainty associated with complex data and with
the statistics needed to unravel complex social and behavioral phenomena.
Internal resampling methods attempt to assess this uncertainty by generating
a number of simulated data sets similar to the one actually observed. The
definition of similar is crucial, and many methods that exploit different types
of similarity have been devised. These methods provide researchers the freedom
to choose scientifically appropriate procedures and to replace procedures that
are valid under assumed distributional shapes with ones that are not so restricted.
Flexible and imaginative computer simulation is the key to these
methods. For a simple random sample, the "bootstrap" method repeatedly
resamples the obtained data (with replacement) to generate a distribution of
possible data sets. The distribution of any estimator can thereby be simulated
and measures of the certainty of inference be derived. The "jackknife" method
repeatedly omits a fraction of the data and in this way generates a distribution
of possible data sets that can also be used to estimate variability. These methods
can also be used to remove or reduce bias. For example, the ratio-estimator, a
statistic that is commonly used in analyzing sample surveys and censuses, is
known to be biased, and the jackknife method can usually remedy this defect.
The methods have been extended to other situations and types of analysis,
such as multiple regression.
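The two resampling schemes just described can be sketched in a few lines. What follows is a minimal illustration, not a production implementation: the function names, the percentile form of the bootstrap interval, the delete-one form of the jackknife, and the survey data are all assumptions made for the example.

```python
import random
import statistics


def bootstrap_interval(data, estimator, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for an arbitrary estimator."""
    rng = random.Random(seed)
    n = len(data)
    # Each simulated data set is drawn from the observed data with replacement.
    estimates = sorted(
        estimator([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    k = int(round((1.0 - level) / 2.0 * n_resamples))
    return estimates[k], estimates[n_resamples - 1 - k]


def jackknife(data, estimator):
    """Delete-one jackknife: bias-corrected estimate and standard error."""
    n = len(data)
    full = estimator(data)
    # Recompute the estimator with each observation omitted in turn.
    leave_one_out = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    mean_loo = sum(leave_one_out) / n
    corrected = n * full - (n - 1) * mean_loo
    variance = (n - 1) / n * sum((e - mean_loo) ** 2 for e in leave_one_out)
    return corrected, variance ** 0.5


def ratio_estimator(pairs):
    """Ratio of totals, sum(y) / sum(x), for a list of (x, y) pairs."""
    return sum(y for _, y in pairs) / sum(x for x, _ in pairs)


# Hypothetical survey data: x = household size, y = household spending.
rng = random.Random(1)
sample = []
for _ in range(40):
    x = rng.randint(1, 6)
    sample.append((x, 100.0 * x + rng.gauss(0.0, 30.0)))

low, high = bootstrap_interval(sample, ratio_estimator)
corrected_ratio, ratio_se = jackknife(sample, ratio_estimator)
median_interval = bootstrap_interval([y for _, y in sample], statistics.median)
```

Because the estimator is passed in as a function, the same two routines apply unchanged to a ratio, a median, or any other statistic, which is the freedom from assumed distributional shapes that the text emphasizes.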
There are indications that under relatively general conditions, these methods,
and others related to them, allow more accurate estimates of the uncertainty
of inferences than do the traditional ones that are based on assumed (usually,
normal) distributions when that distributional assumption is unwarranted. For
complex samples, such internal resampling or subsampling facilitates estimat-
ing the sampling variances of complex statistics.
An older and simpler, but equally important, idea is to use one independent
subsample in searching the data to develop a model and at least one separate
subsample for estimating and testing a selected model. Otherwise, it is next to
impossible to make allowances for the excessively close fitting of the model
that occurs as a result of the creative search for the exact characteristics of the
sample data, characteristics that are to some degree random and will not
predict well to other samples.
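A small simulation can illustrate why a separate subsample is needed. The sketch below rests on assumed, hypothetical data in which the outcome is pure noise and unrelated to every candidate predictor; the variable names and sample sizes are choices made for the example only.

```python
import random


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)
n_cases, n_candidates = 200, 50

# Hypothetical data in which the outcome is unrelated to every candidate.
outcome = [rng.gauss(0.0, 1.0) for _ in range(n_cases)]
candidates = [[rng.gauss(0.0, 1.0) for _ in range(n_cases)]
              for _ in range(n_candidates)]

# Randomly split the cases into a search half and a confirmation half.
order = list(range(n_cases))
rng.shuffle(order)
search, confirm = order[: n_cases // 2], order[n_cases // 2:]


def corr_on(rows, predictor):
    """Correlation between one predictor and the outcome on the given cases."""
    return correlation([predictor[i] for i in rows], [outcome[i] for i in rows])


# Model search: keep the candidate that fits the search half best.
best = max(candidates, key=lambda c: abs(corr_on(search, c)))
search_fit = abs(corr_on(search, best))    # flattered by the search itself
confirm_fit = abs(corr_on(confirm, best))  # honest check on untouched cases
```

In a setup like this, the fit on the search half is the largest of fifty chance correlations and so is typically well above the value obtained on the confirmation half, even though no real relationship exists.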
Robust Techniques
Many technical assumptions underlie the analysis of data. Some, like the
assumption that each item in a sample is drawn independently of other items,
can be weakened when the data are sufficiently structured to admit simple
alternative models, such as serial correlation. Usually, these models require
that a few parameters be estimated. Assumptions about shapes of distributions,
normality being the most common, have proved to be particularly important,
and considerable progress has been made in dealing with the consequences of
different assumptions.
More recently, robust techniques have been designed that permit sharp, valid
discriminations among possible values of parameters of central tendency for a
wide variety of alternative distributions by reducing the weight given to oc-
casional extreme deviations. It turns out that by giving up, say, 10 percent of
the discrimination that could be provided under the rather unrealistic as-
sumption of normality, one can greatly improve performance in more realistic
situations, especially when unusually large deviations are relatively common.
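One way to reduce the weight given to occasional extreme deviations is a Huber-type estimate of location, sketched below. The report does not name a specific estimator, so the choice of the Huber weighting, the tuning constant k = 1.345, and the use of the median absolute deviation as a scale are assumptions made here for illustration.

```python
import statistics


def huber_location(values, k=1.345, tol=1e-8, max_iter=100):
    """Location estimate that down-weights observations far from the centre."""
    center = statistics.median(values)
    # Robust scale: median absolute deviation, rescaled so that it matches
    # the standard deviation when the data are normal.
    scale = 1.4826 * statistics.median([abs(v - center) for v in values])
    if scale == 0:
        return center
    for _ in range(max_iter):
        weights = []
        for v in values:
            z = abs(v - center) / scale
            # Full weight near the centre, shrinking weight beyond k scale units.
            weights.append(1.0 if z <= k else k / z)
        new_center = sum(w * v for w, v in zip(weights, values)) / sum(weights)
        if abs(new_center - center) < tol:
            return new_center
        center = new_center
    return center


# Hypothetical measurements with one wild value.
scores = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 25.0]
ordinary_mean = statistics.mean(scores)
robust_center = huber_location(scores)
```

With these illustrative data the ordinary mean is pulled well above 11 by the single wild value, whereas the robust estimate stays close to 10, where the bulk of the observations lie.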
These valuable modifications of classical statistical techniques have been
extended to multiple regression, in which procedures of iterative reweighting
can now offer relatively good performance for a variety of underlying distri-
butional shapes. They should be extended to more general schemes of analysis.
In some contexts, notably the most classical uses of analysis of variance, the
use of adequate robust techniques should help to bring conventional statistical
practice closer to the best standards that experts can now achieve.
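The iterative reweighting mentioned above for regression can be sketched for the simplest case of a single predictor. This is an illustrative outline under stated assumptions, not the procedure of any particular package: the Huber weights, the constant k = 1.345, the fixed number of iterations, and the data are all hypothetical choices.

```python
import statistics


def weighted_line(xs, ys, ws):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx
    return my - slope * mx, slope


def robust_line(xs, ys, k=1.345, n_iter=50):
    """Refit repeatedly, down-weighting points with large residuals."""
    ws = [1.0] * len(xs)
    for _ in range(n_iter):
        intercept, slope = weighted_line(xs, ys, ws)
        residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
        scale = 1.4826 * statistics.median([abs(r) for r in residuals])
        if scale == 0:
            break
        ws = [1.0 if abs(r) <= k * scale else k * scale / abs(r)
              for r in residuals]
    return intercept, slope


# Hypothetical data close to y = 2 + 3x, with one grossly wrong response.
xs = list(range(10))
ys = [2.0 + 3.0 * x + (0.1 if x % 2 == 0 else -0.1) for x in xs]
ys[4] += 50.0

ols_intercept, ols_slope = weighted_line(xs, ys, [1.0] * len(xs))
robust_intercept, robust_slope = robust_line(xs, ys)
```

In this example the ordinary least-squares line is visibly distorted by the single bad response, while the reweighted fit recovers a slope close to 3. Weights of this kind guard against extreme responses; they do not by themselves protect against unusual values of the predictor.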
Many Interrelated Parameters
In trying to give a more accurate representation of the real world than is
possible with simple models, researchers sometimes use models with many
parameters, all of which must be estimated from the data. Classical principles
of estimation, such as straightforward maximum-likelihood, do not yield re-
liable estimates unless either the number of observations is much larger than
the number of parameters to be estimated or special designs are used in con-
junction with strong assumptions. Bayesian methods do not draw a distinction
between fixed and random parameters, and so may be especially appropriate
for such problems.
A variety of statistical methods have recently been developed that can be
interpreted as treating many of the parameters as or similar to random quan-
tities, even if they are regarded as representing fixed quantities to be estimated.
Theory and practice demonstrate that such methods can improve the simpler
fixed-parameter methods from which they evolved, especially when the num-
ber of observations is not large relative to the number of parameters. Successful
applications include college and graduate school admissions, where quality of
previous school is treated as a random parameter when the data are insufficient
to separately estimate it well. Efforts to create appropriate models using this
general approach for small-area estimation and undercount adjustment in the
census are important potential applications.
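The idea of treating many parameters as random quantities can be illustrated by partial pooling of group means. In the sketch below, each group's mean is pulled toward the overall mean, and the pull is stronger when the group has few observations. The function name, the school data, and the assumption that the within-group and between-group variances are known in advance are all hypothetical; in practice those variances would themselves be estimated.

```python
def shrink_group_means(groups, within_var, between_var):
    """Partial pooling of group means toward the grand mean."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)
    shrunk = {}
    for name, values in groups.items():
        n = len(values)
        raw_mean = sum(values) / n
        # Weight on the group's own mean: near 1 when the group is large,
        # smaller when the group supplies little data.
        weight = between_var / (between_var + within_var / n)
        shrunk[name] = weight * raw_mean + (1.0 - weight) * grand_mean
    return shrunk


# Hypothetical first-year grades of admitted students, by previous school.
grades = {"small school": [4.0], "large school": [2.0] * 9}
estimates = shrink_group_means(grades, within_var=1.0, between_var=1.0)
```

Here the school with a single student is moved halfway toward the grand mean of 2.2, whereas the school with nine students is moved only one-tenth of the way, which mirrors the admissions example in which school quality is estimated poorly when data are sparse.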
Missing Data
In data analysis, serious problems can arise when certain kinds of (quantitative
or qualitative) information are partially or wholly missing. Various approaches
to dealing with these problems have been or are being developed.
One of the methods developed recently for dealing with certain aspects of
missing data is called multiple imputation: each missing value in a data set is
replaced by several values representing a range of possibilities, with statistical
dependence among missing values reflected by linkage among their replace-
ments. It is currently being used to handle a major problem of incompatibility
between the 1980 and previous Bureau of Census public-use tapes with respect
to occupation codes. The extension of these techniques to address such prob-
lems as nonresponse to income questions in the Current Population Survey
has been examined in exploratory applications with great promise.
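A minimal sketch of multiple imputation for a single variable follows. It assumes that values are missing completely at random, draws replacements by an approximate Bayesian bootstrap from the observed values, and combines the completed-data results with the commonly used rule of adding the between-imputation variance, inflated by (1 + 1/m), to the average within-imputation variance. None of these specifics is taken from the report, and the income figures are hypothetical.

```python
import random
import statistics


def multiply_impute_mean(values, n_imputations=20, seed=0):
    """Estimate a mean and its standard error from data with None entries."""
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    n = len(values)
    n_missing = n - len(observed)
    estimates, variances = [], []
    for _ in range(n_imputations):
        # Approximate Bayesian bootstrap: resample the observed values first,
        # then draw the replacements from that resample.
        donors = rng.choices(observed, k=len(observed))
        completed = observed + rng.choices(donors, k=n_missing)
        estimates.append(statistics.mean(completed))
        variances.append(statistics.variance(completed) / n)
    pooled = statistics.mean(estimates)
    within = statistics.mean(variances)
    between = statistics.variance(estimates)
    total = within + (1.0 + 1.0 / n_imputations) * between
    return pooled, total ** 0.5


# Hypothetical incomes (in thousands) with three nonresponses.
incomes = [31.0, 45.0, None, 52.0, 38.0, None, 61.0, 47.0, 35.0, None, 58.0, 42.0]
pooled_mean, pooled_se = multiply_impute_mean(incomes)
```

The between-imputation term is what distinguishes this approach from filling each gap with a single value: it carries the uncertainty about the missing entries into the reported standard error. Realistic applications would also condition the replacements on other variables, which this sketch does not attempt.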
Computing
Computer Packages and Expert Systems
The development of high-speed computing and data handling has funda-
mentally changed statistical analysis. Methodologies for all kinds of situations
are rapidly being developed and made available for use in computer packages
that may be incorporated into interactive expert systems. This computing capability
offers the hope that much data analysis will be more carefully and
more effectively done than previously and that better strategies for data analysis
will move from the practice of expert statisticians, some of whom may not have
tried to articulate their own strategies, to both wide discussion and general use.
But powerful tools can be hazardous, as witnessed by occasional dire misuses
of existing statistical packages. Until recently the only strategies available were
to train more expert methodologists or to train substantive scientists in more
methodology, but such training tends to become outmoded unless it is updated.
Now there is the opportunity to capture in expert systems the
current best methodological advice and practice. If that opportunity is exploited,
standard methodological training of social scientists will shift to emphasizing
strategies in using good expert systems (including understanding
the nature and importance of the comments they provide) rather than how
to patch together something on one's own. With expert systems, almost all
behavioral and social scientists should become able to conduct any of the more
common styles of data analysis more effectively and with more confidence than
all but the most expert do today. However, the difficulties in developing expert
systems that work as hoped for should not be underestimated. Human experts
cannot readily explicate all of the complex cognitive network that constitutes
an important part of their knowledge. As a result, the first attempts at expert
systems were not especially successful (as discussed in Chapter 1). Additional
work is expected to overcome these limitations, but it is not clear how long it
will take.
Exploratory Analysis and Graphic Presentation
The formal focus of much statistics research in the middle half of the twen-
tieth century was on procedures to confirm or reject precise, a priori hypotheses
developed in advance of collecting data—that is, procedures to determine
statistical significance. There was relatively little systematic work on realistically
rich strategies for the applied researcher to use when attacking real-world
problems with their multiplicity of objectives and sources of evidence. More
recently, a species of quantitative detective work, called exploratory data anal-
ysis, has received increasing attention. In this approach, the researcher seeks
out possible quantitative relations that may be present in the data. The tech-
niques are flexible and include an important component of graphic represen-
tations. While current techniques have evolved for single responses in situa-
tions of modest complexity, extensions to multiple responses and to single
responses in more complex situations are now possible.
Graphic and tabular presentation is a research domain in active renaissance,
stemming in part from suggestions for new kinds of graphics made possible
by computer capabilities, for example, hanging histograms and easily assimi-
lated representations of numerical vectors. Research on data presentation has
been carried out by statisticians, psychologists, cartographers, and other spe-
cialists, and attempts are now being made to incorporate findings and concepts
from linguistics, industrial and publishing design, aesthetics, and classification
studies in library science. Another influence has been the rapidly increasing
availability of powerful computational hardware and software, now available
even on desktop computers. These ideas and capabilities are leading to an
increasing number of behavioral experiments with substantial statistical input.
Nonetheless, criteria of good graphic and tabular practice are still too much
matters of tradition and dogma, without adequate empirical evidence or theo-
retical coherence. To broaden the respective research outlooks and vigorously
develop such evidence and coherence, extended collaborations between statis-
tical and mathematical specialists and other scientists are needed, a major
objective being to understand better the visual and cognitive processes (see
Chapter 1) relevant to effective use of graphic or tabular approaches.
Combining Evidence
Combining evidence from separate sources is a recurrent scientific task, and
formal statistical methods for doing so go back 30 years or more. These methods
include the theory and practice of combining tests of individual hypotheses,
sequential design and analysis of experiments, comparisons of laboratories, and
Bayesian and likelihood paradigms.
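One long-established way of combining tests of individual hypotheses is Fisher's method, which is sketched below as an illustration; the report does not single out a particular technique. The method assumes that the tests are independent, and the function name is hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values into one statistic and overall p-value."""
    k = len(p_values)
    # Under the joint null hypothesis, this statistic follows a chi-square
    # distribution with 2k degrees of freedom.
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # Upper-tail chi-square probability; for even degrees of freedom it has
    # the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


# Two hypothetical studies, each only marginally significant on its own.
combined_statistic, combined_p = fisher_combined_p([0.05, 0.05])
```

Two results that each sit at the 0.05 level combine to an overall p-value of about 0.017 here, which shows how modest individual findings can reinforce one another when the studies really are independent.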
There is now growing interest in more ambitious analytical syntheses, which
are often called meta-analyses. One stimulus has been the appearance of syntheses
explicitly combining all existing investigations in particular fields, such as prison
parole policy, classroom size in primary schools, cooperative studies of ther-
apeutic treatments for coronary heart disease, early childhood education in-
terventions, and weather modification experiments. In such fields, a serious
approach to even the simplest question (how to put together separate estimates
of effect size from separate investigations) leads quickly to difficult and
interesting issues. One issue involves the lack of independence among the
available studies, due, for example, to the effect of influential teachers on the
research projects of their students. Another issue is selection bias, because only
some of the studies carried out, usually those with "significant" findings, are
available and because the literature search may not find all relevant studies
that are available. In addition, experts agree, although informally, that the
quality of studies from different laboratories and facilities differs appreciably
and that such information probably should be taken into account. Inevitably,
the studies to be included used different designs and concepts and controlled
or measured different variables, making it difficult to know how to combine
them.
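The simplest formal answer to the question of how to put together separate estimates of effect size is an inverse-variance weighted average, sketched below. This fixed-effect form is an assumption chosen for illustration: it treats the studies as independent and of comparable quality, which are exactly the conditions the preceding paragraph warns may fail. The function name and the numbers are hypothetical.

```python
def pool_effects(effects, variances):
    """Inverse-variance weighted average of independent effect estimates."""
    # More precise studies (smaller variance) receive more weight.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    standard_error = (1.0 / total_weight) ** 0.5
    # Heterogeneity statistic: large values suggest that the studies are not
    # all estimating one common effect.
    q_statistic = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    return pooled, standard_error, q_statistic


# Hypothetical effect sizes and sampling variances from three studies.
pooled_effect, pooled_se, heterogeneity = pool_effects(
    [0.20, 0.35, 0.50], [0.010, 0.020, 0.040]
)
```

The heterogeneity statistic gives a first warning when the studies disagree by more than their sampling variances would explain, in which case a single pooled value may be misleading and the issues of design, quality, and selection raised above need direct attention.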
Rich, informal syntheses, allowing for individual appraisal, may be better
than catch-all formal modeling, but the literature on formal meta-analytic models
is growing and may be an important area of discovery in the next decade,
relevant both to statistical analysis per se and to improved syntheses in the
behavioral and social and other sciences.
OPPORTUNITIES AND NEEDS
This chapter has cited a number of methodological topics associated with
behavioral and social sciences research that appear to be particularly active and
promising at the present time. As throughout the report, they constitute illus-
trative examples of what the committee believes to be important areas of re-
search in the coming decade. In this section we describe recommendations for
an additional $16 million annually to facilitate both the development of meth-
odologically oriented research and, equally important, its communication
throughout the research community.
Methodological studies, including early computer implementations, have for
the most part been carried out by individual investigators with small teams of
colleagues or students. Occasionally, such research has been associated with
quite large substantive projects, and some of the current developments of
computer packages, graphics, and expert systems clearly require large, orga-
nized efforts, which often lie at the boundary between grant-supported work
and commercial development. As such research is often a key to understanding
complex bodies of behavioral and social sciences data, it is vital to the health
of these sciences that research support continue on methods relevant to prob-
lems of modeling, statistical analysis, representation, and related aspects of
behavioral and social sciences data. Researchers and funding agencies should
also be especially sympathetic to the inclusion of such basic methodological
work in large experimental and longitudinal studies. Additional funding for
work in this area, both in terms of individual research grants on methodological
issues and in terms of augmentation of large projects to include additional
methodological aspects, should be provided largely in the form of investigator-
initiated project grants.
Ethnographic and comparative studies also typically rely on project grants
to individuals and small groups of investigators. While this type of support
should continue, provision should also be made to facilitate the execution of
studies using these methods by research teams and to provide appropriate
methodological training through the mechanisms outlined below.
Overall, we recommend an increase of $4 million in the level of investigator-
initiated grant support for methodological work. An additional $1 million
should be devoted to a program of centers for methodological research.
Many of the new methods and models described in the chapter, if and when
adopted to any large extent, will demand substantially greater amounts of
research devoted to appropriate analysis and computer implementation. New
user interfaces and numerical algorithms will need to be designed and new
computer programs written. And even when generally available methods (such
as maximum-likelihood) are applicable, model application still requires skillful
development in particular contexts. Many of the familiar general methods that
are applied in the statistical analysis of data are known to provide good ap-
proximations when sample sizes are sufficiently large, but their accuracy varies
with the specific model and data used. To estimate the accuracy requires ex-
tensive numerical exploration. Investigating the sensitivity of results to the
assumptions of the models is important and requires still more creative, thoughtful
research. It takes substantial efforts of these kinds to bring any new model on
line, and the need becomes increasingly important and difficult as statistical
models move toward greater realism, usefulness, complexity, and availability
in computer form. More complexity in turn will increase the demand for com-
putational power. Although most of this demand can be satisfied by increas-
ingly powerful desktop computers, some access to mainframe and even su-
percomputers will be needed in selected cases. We recommend an additional
$4 million annually to cover the growth in computational demands for model
development and testing.
Interaction and cooperation between the developers and the users of statistical
and mathematical methods need continual stimulation in both directions.
Efforts should be made to teach new methods to a wider variety of potential users
than is now the case. Several ways appear effective for methodologists to com-
municate to empirical scientists: running summer training programs for grad-
uate students, faculty, and other researchers; encouraging graduate students,
perhaps through degree requirements, to make greater use of the statistical,
mathematical, and methodological resources at their own or affiliated univer-
sities; associating statistical and mathematical research specialists with large-
scale data collection projects; and developing statistical packages that incor-
porate expert systems in applying the methods.
Methodologists, in turn, need to become more familiar with the problems
actually faced by empirical scientists in the laboratory and especially in the
field. Several ways appear useful for communication in this direction: encour-
aging graduate students in methodological specialties, perhaps through degree
requirements, to work directly on empirical research; creating postdoctoral
fellowships aimed at integrating such specialists into ongoing data collection
projects; and providing for large data collection projects to engage relevant
methodological specialists. In addition, research on and development of sta-
tistical packages and expert systems should be encouraged to involve the mul-
tidisciplinary collaboration of experts with experience in statistical, computer,
and cognitive sciences.
A final point has to do with the promise held out by bringing different
research methods to bear on the same problems. As our discussions of research
methods in this and other chapters have emphasized, different methods have
different powers and limitations, and each is designed especially to elucidate
one or more particular facets of a subject. An important type of interdisciplinary
work is the collaboration of specialists in different research methodologies on
a substantive issue, examples of which have been noted throughout this report.
If more such research were conducted cooperatively, the power of each method
pursued separately would be increased. To encourage such multidisciplinary
work, we recommend increased support for fellowships, research workshops,
and training institutes.
Funding for fellowships, both pre- and postdoctoral, should be aimed at
giving methodologists experience with substantive problems and at upgrading
the methodological capabilities of substantive scientists. Such targeted fellow-
ship support should be increased by $4 million annually, of which $3 million
should be for predoctoral fellowships emphasizing the enrichment of meth-
odological concentrations. The new support needed for research workshops is
estimated to be $1 million annually. And new support needed for various kinds
of advanced training institutes aimed at rapidly diffusing new methodological
findings among substantive scientists is estimated to be $2 million annually.