5
Data Bases
This chapter provides an overview and assessment of the data bases
used in the development of the panel's flow diagram and its CEUE
model. Fourteen distinct data bases were used to obtain the necessary
data and estimates on the education and employment of those in the
engineering community. The panel also made use of a large variety of
secondary sources that are not reviewed here. The chapter is divided
into four parts: an overview, a discussion of data coverage, a technical
assessment, and recommendations.
Overview of Data Bases
Table 1 summarizes the main features of the 14 data bases used in
developing the panel's flow diagram and model. The features
summarized in the table are discussed below.
Data Base Manager
The term data base manager refers to the organization that is
principally responsible for storing and reporting the data. The National
Science Foundation (NSF), National Research Council, Engineering
Manpower Commission (EMC), Bureau of the Census (Census),
Bureau of Labor Statistics (BLS), and National Center for Education
Statistics (NCES) are the primary data base managers. There is,
however, a considerable level of interdependence among the data base
managers. For example, NSF and BLS both rely on the census for some of
their samples and the actual administration and compilation of the
surveys. NSF, in turn, operates as a funding or sponsoring body for the
Research Council.
[TABLE 1 Main features of the 14 data bases: contents not recoverable from the machine-read text]
The significance of the interrelationships among the data base
managers stems from the increased likelihood that the individual data bases
could be made fully compatible with each other at some time in the
future. NSF acts as the primary integrative organization, particularly by
standardizing key survey questions and by sponsoring a mathematical
model that combines a number of its data bases.
At the present time, however, the variety of data base managers
increases the difficulties associated with integrating the individual
data bases. Each data base is constructed to address the issues that
follow from immediate organizational purposes of the data base man-
ager. Thus, NSF focuses heavily on the training of scientists and engi-
neers, on the functioning of higher education, and on the types of
employment of scientists and engineers. By contrast, BLS focuses
exclusively on where scientists and engineers are employed. One major
consequence of these differences is that definitions of engineer and
engineering differ. As a result, there are marked discrepancies in the
estimates of the number of engineers derived from different sources
using different definitions. (For example, estimates of the number of
engineers practicing in the United States range from 1.2 million to 1.9
million.)
Data Collection Methods
Mail surveys are the primary means used by the 14 data bases (listed
in Table 1) for collecting data. Only the Current Population Survey
(CPS) consistently uses an interview method. Mail surveys are the
most cost-effective method for collecting the data, and, given the
relatively high existing response rates to the mail surveys (see section on
"Respondents" below), there is little reason for changing data
collection methods. Studies conducted by the Census indicate that mail
surveys do not in themselves result in biased or inaccurate data.
Respondents
Respondents to data base surveys fall into three categories: (1) the
targeted individual, (2) the household in which the targeted individual
resides, and (3) the establishments where the individual works or is
educated. NSF, Research Council, and National Society of Professional
Engineers (NSPE) surveys are directed at the targeted individual, EMC
surveys at the employment establishment or educational institution,
and the Census surveys and CPS at the head of the household. BLS and
NCES direct certain surveys to establishments and other surveys to
individuals. Battelle's survey is sent to both individuals and
establishments at the same time.
The choice of respondent affects key aspects of the data collection
and analysis effort. Individuals and households are more willing to
provide more detailed information than is an establishment. Establishment
surveys are therefore generally designed to collect limited
amounts of information on relatively few categories (e.g., occupation
and industry). The unit of analysis is generally a group of individuals
rather than an individual. As a result, it is possible to generate data on
"males" or "Asiatics" but seldom "male Asiatics" without adding
enormously to the reporting burdens of an organization. If the number
of categories on an establishment survey is large or if the survey uses
unfamiliar categories, it becomes more probable that the data will be
less reliable. At this time, despite the use of establishment data to
project manpower needs and opportunities, hardly any published
research exists on the accuracy of establishment surveys.1
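To illustrate why cross-classified categories add so heavily to an establishment's reporting burden, the short sketch below compares the number of counts needed when characteristics are reported separately with the number needed when every combination is reported. The category counts are hypothetical and are not taken from any of the surveys.

```python
from math import prod

# Hypothetical numbers of categories on an establishment survey form.
CATEGORY_COUNTS = {"occupation": 20, "sex": 2, "race": 5}

# Reporting each characteristic separately needs one count per category.
marginal_cells = sum(CATEGORY_COUNTS.values())

# Reporting every combination needs one count per cell of the cross-tabulation.
cross_tab_cells = prod(CATEGORY_COUNTS.values())

print(marginal_cells, cross_tab_cells)  # 27 separate counts versus 200 combined cells
```

Under these assumed counts, the combined form asks an establishment for roughly seven times as many figures as the separate form.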
Considerably more detail can be obtained from individual respon-
dents, but there remains the concern that individuals may selectively
distort their responses, particularly in status-related areas such as
degrees, salary, occupation, and organizational level. Although the
research is limited, Census research studies tend to indicate that while
errors in reporting do occur, they are not major and there is no consis-
tent pattern of bias.
Similar concerns exist for household surveys, with the added issue
that the person answering the survey may have little idea about what
other members of the household actually do or earn and what their
educational backgrounds are. As is the case with establishment sur-
veys, there is little research to support or refute this issue.
Data for the panel's flow diagram and the CEUE model were
developed from surveys of individuals. Although establishment data can be
used to estimate the size of the various stocks, the level of detail
required for an analysis of flows requires that data be collected on
individuals. As a result, the most useful data bases are those in which
the respondent is an individual or household.
Target Population
The term target population refers to the group of individuals on
whom data are sought. Some target populations are very broadly
defined (e.g., by Census, CPS, and the Occupational Employment
Survey [OES]); others are very narrowly defined (e.g., Survey of Doctorate
Recipients [DR], Survey of Recent Science and Engineering Graduates
[RSE]). For example, DR only provides information on engineers with
doctorates, and RSE only provides information on engineers and
computer specialists who have graduated with bachelor's or master's
degrees.
The flow diagram and model require personal, educational, and
employment information on three specific occupational groups: engi-
neers, computer specialists, and technicians. A number of the data
bases only address a distinct segment of these populations.
Overall, the data bases as currently configured do not provide
detailed up-to-date information on engineers and computer specialists
without degrees, associate degree engineers, and computer specialists
and technicians. Some information is available from Census, CPS, and
the Postcensal Manpower Survey (PMS), but in the case of Census and
PMS, it is provided only once every 10 years. The CPS data base is
constructed on a relatively small sample size.
The absence of data in these areas in part reflects the differences in
the definitions of engineers and engineering that underlie the various
data collection efforts. Attention has been focused principally on a
primary segment of the engineering community, the engineer with a
degree (B.S. or above). Yet NSF and other bodies acknowledge that
industry frequently meets local shortages of B.S. engineers by upgrad-
ing other technical staff, many of whom have no degrees.
Other small but significant segments of the engineering community
also receive little attention. Military personnel with B.S. degrees or
equivalent training in engineering and individuals with computer spe-
cialties are not treated as part of the general labor force. For example,
the BLS labor participation rate is calculated after excluding those in
military service. This figure is relatively large: in 1979-1980, 2.1 per-
cent of the graduating class in engineering entered the armed forces.2 In
addition, a significant percentage of physical science graduates also
entered the armed forces as engineers.
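The 2.1 percent figure can be checked against the counts given in the notes (an estimated 66,973 engineering B.S. graduates in 1979-1980, of whom 1,419 entered the armed forces). A minimal sketch of that arithmetic follows; the function and variable names are illustrative only.

```python
def share_percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts reported in the notes for the 1979-1980 graduating class.
ENGINEERING_BS_GRADUATES = 66_973
ENTERED_ARMED_FORCES = 1_419

military_share = share_percent(ENTERED_ARMED_FORCES, ENGINEERING_BS_GRADUATES)
print(f"{military_share:.1f} percent")  # prints "2.1 percent"
```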
The inability to obtain complete coverage of the engineering com-
munity using the existing data bases means that the data elements in
the flow diagram tend to underestimate the size of the stocks and flows.
Focus
As mentioned earlier, the focus or purpose of each data base varies but
can be captured in a combination of three categories: personal,
education, and employment. Establishment surveys tend to focus on a single
category, while individual surveys cover two or more categories. The
precise coverage of the different data bases is explained in more detail
below (see section on "Data Base Coverage").
Frequency
The frequency with which a data base is updated is a function of its
purpose and complexity and of the pace at which the statistics change.
For example, the size and scope of the census ensure that its frequency
remains low. By comparison, the need to adjust salaries in inflationary
periods resulted in an increase in the frequency of various salary
surveys, such as the Professional Engineer Income and Salary Survey (PE).
The frequency of updating the data bases listed in Table 1 varies from
1 month (CPS) to 10 years (Census and PMS). The majority of the
surveys are conducted on an annual or biennial basis.
NSF's National Survey of Experienced Scientists and Engineers (ESE)
is unique among the surveys. It was designed to follow the careers of a
sample of scientists and engineers over an eight-year period
(1970-1978). ESE provides the only genuine measure of the flow of engineers
throughout the engineering community.
Time Period Covered
Some of the data bases cover long periods of time, such as those of the
Census, CPS, and Professional Income of Engineers (PIE). The majority
of the data bases were started in the 1960s, in part as a response to the
Sputnik challenge and the subsequent increased demand for scientists
and engineers in aerospace and defense industries. As a result, few of
the current data bases go back to 1960. In part, the new data bases
replaced others, such as the Engineering Register, but major differences
in target populations and survey items severely limit the usefulness of
the earlier data bases. The first year in which the data elements con-
form to the needs of the flow diagram as presently structured is 1962.
Availability
All of the data bases listed in Table 1 are computerized. In most cases,
the data tapes are available to the public. However, in a number of
instances (e.g., EMC and NCES), the existing tabulations are
exhaustive enough to make the tapes redundant for most users.
The majority of the panel's work was done using existing
tabulations, with the notable exception of NSF's data (PMS, ESE, and RSE).
While the quality of reporting of the data in tabular form was in general
very high, serious difficulties were encountered in the use of NSF's RSE
tapes. For example, the 1979 RSE tape documentation lists 12,285 cases,
yet only 11,543 appear on the tape. More critical was an
apparent difficulty in attaching the correct weights to individual cases
in the 1976, 1978, and 1979 tapes. (For 1979 data, approximately 5
percent of the cases have incorrect weights.) This problem made it
impossible to reconstruct and validate NSF published tables. If they have not
been addressed already, these technical issues need to be resolved by
NSF.
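The two checks described in this section, comparing the number of cases on a tape with the documented count and confirming that the case weights reproduce a published total, can be sketched as follows. The record layout, the tolerance, and the sample records are hypothetical; only the 1979 counts (12,285 documented, 11,543 on tape) come from the text.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Case:
    """One survey record; the fields are illustrative, not the RSE tape layout."""
    case_id: int
    weight: float


def missing_cases(documented: int, on_tape: int) -> Tuple[int, float]:
    """Return the shortfall in cases and its percentage of the documented count."""
    shortfall = documented - on_tape
    return shortfall, 100.0 * shortfall / documented


def weights_reproduce_total(cases: Iterable[Case], published_total: float,
                            tolerance: float = 0.005) -> bool:
    """True if the summed weights fall within a relative tolerance of the published total."""
    weighted_total = sum(case.weight for case in cases)
    return abs(weighted_total - published_total) <= tolerance * published_total


shortfall, shortfall_pct = missing_cases(documented=12_285, on_tape=11_543)
print(shortfall, round(shortfall_pct, 1))  # 742 cases, about 6.0 percent

# Hypothetical records and published total, for illustration only.
sample = [Case(1, 10.0), Case(2, 12.5), Case(3, 9.5)]
print(weights_reproduce_total(sample, published_total=32.0))  # True
```

A check of this kind only flags a discrepancy; it cannot say which cases are missing or which weights are wrong.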
Data Base Coverage
Table 2 summarizes in detail the data elements covered by the data
bases. The data elements fall into three categories: personal, educa-
tion, and employment. The table provides only a limited indication of
the adequacy of the data bases. There is a need not only to determine
whether a specific topic is covered but for what period of time, in what
detail, the underlying unit of analysis, and the representativeness of the
sample. Each of these issues is addressed below in terms of the require-
ments of the flow diagram.
Personal Variables
The six establishment-respondent surveys (see Table 1) provide no
personal background data. The remaining surveys, with the exception
of OES, provide standard information on age, sex, and marital status (see
Table 2).
Prior to 1972 the surveys did not include items on racial or ethnic
background with the exception of the census. For the PMS and ESE data
bases, NSF used the individual's census response to cover this variable.
The census and PMS responses of individuals in the ESE sample are
used in the ESE data base. Citizenship is included in the individual
respondent data bases except for CPS and OES. Total income, as
opposed to base pay from a primary source of employment, is asked in
some surveys (Census, PMS, CPS, and DR) but not in others (ESE, RSE,
OES, and the Survey of Earned Doctorates [ED]).
Education
The PMS, ESE, RSE, and ED generate considerable data on types and
level of education. However, the range of postsecondary education cov
INFRASTRUCTURE DIAGRAMMING AND MODELING
TABLE 2 Data Elements in Existing Data Bases
Surveys (columns): Census; Postcensal Manpower Survey (PMS); National
Survey of Experienced Scientists and Engineers (ESE); Survey of Recent
Science and Engineering Graduates (RSE); Current Population Survey
(CPS); Occupational Employment Survey (OES)
Data Elements (rows):
Personal
Age
Sex
Race
Marital status
Citizenship
Income
Education
Level
Field
History
Type of degree
Field
Date received
Date enrolled
Current status
Other training
Future plans
Source of support
Employment
Employment status
[Cell entries for the rows above are not recoverable from the machine-read text]
Current job
Occupation X X X X X X
Type of employer X X X X X X
Industry X X X - X X
Level - X
Tenure - X
Work activities X X X X
Salary X X X X X
Satisfaction - X X
Skill utilization - - X X
Job history
Occupation - X X
Type of employer - X X
Industry - X X
Level - X
Tenure - X
Work activities - X X
Salary - X X
Satisfaction - X X
Skill utilization - - X
Remaining surveys in Table 2 (columns): Survey of Earned Doctorates (ED);
Survey of Doctorate Recipients (DR); Engineering and Technology Degrees
Granted (E/T Degrees); Engineering and Technology Enrollments (E/T
Enrollments); Professional Income of Engineers (PIE); National Survey of
Compensation (Battelle); Professional Engineer Income and Salary Survey (PE)
[Cell entries for these surveys are not recoverable from the machine-read text]
data on the employment of engineers. The others are of limited useful-
ness, although they can be used to verify estimates derived from the
PMS and ESE.
Table 3 is a summary evaluation of the data bases. The PMS and ESE
provide the most complete coverage. However, no data base currently
available adequately covers the flow of high school students into col-
leges and universities. Nor is there sufficient coverage of the flows of
students across the fields within the higher-education system. Finally,
although the PMS and ESE provide detailed coverage of the remaining
data elements, the data are only collected every 10 years and need to be
augmented on a more frequent and representative basis using the NSF
model to generate the needed estimates for updating. The RSE and DR
data bases are deficient in item coverage, and the associate and nonde-
greed segment of the engineering community is not covered.
Technical Characteristics
Table 4 summarizes and reviews a number of the key technical char-
acteristics of the data bases, which are discussed below. This section
also addresses the possibility of creating an integrated data base.
Sampling Frame
The sampling frames used for the different data bases are generally
well defined in the sense that the target population is clearly identified
and a listing of potential respondents can be constructed. For example,
the Postcensal Manpower Survey (PMS) uses a clearly identified subset
of the census population as its target population, and the Engineering
Manpower Commission (EMC) has a complete list of colleges and
universities offering bachelor's degrees in engineering. In a few
instances, the sampling frame is either ill-defined or incomplete. For
example, EMC does not have an exhaustive list of colleges and univer-
sities offering less than bachelor's-level degrees and programs in engi-
neering. Also, EMC's salary survey has no clearly identified target
population although NSPE does, namely, its own membership.
Even though a sampling frame may be well defined it may not be
representative of the desired target population. For example, National
Society of Professional Engineers (NSPE) membership tends to be older
and better qualified than are engineers in general. Therefore, care needs
to be taken in generalizing the results of this survey to the entire engi-
neering population. For the most part, however, the sampling frames
are representative of the target populations for the surveys.
TABLE 3 Data Base Coverage
Adequacy of Coverage
Education Employment
Data Base Personal Current History Current History
Census X - - X
Postcensal Manpower Survey (PMS)
National Survey of Experienced Scientists and Engineers (ESE)
National Survey of Recent Science and Engineering Graduates (RSE)
Survey of Earned Doctorates (ED)
Survey of Doctorate Recipients (DR)
Current Population Survey (CPS)
Occupational Employment Survey (OES)
[The entries below could not be matched to rows in the machine-read text]
X X X
X X X
X
X X X ?
X X X ?
X
X
X X
?
NOTE: X = fairly complete coverage; ? = some elements are covered, but there are
major elements that are not covered.
Sampling Procedures
A number of the surveys are based on a "100 percent sample," and
therefore, the sampling procedures are not an issue. Those data bases
actually using a sample employ standard procedures to randomly select
the sample. In most instances, a stratified random sample is used in
order to ensure adequate coverage of key demographic, geographic, or
size variables. Design effects from the stratification procedure tend to
be small.
Sample Size and Sampling Fraction
[TABLE 4 Key technical characteristics of the data bases: contents not recoverable from the machine-read text]
The sample size and sampling fraction are the key determinants of
the reliability of the subsequent population estimates. As the sample
size or the sampling fraction increases, or both, the standard
error of an estimate declines. Sample sizes and sampling fractions are
chosen so that the standard error for the same set of key estimates falls
within certain acceptable limits. In general, the decisions concerning
sample size and sampling fraction reflect a considerable amount of
careful planning and analysis. While no definitive judgment can be
made as to the acceptability of the sampling errors, the standard error
data provided by NSF, the National Research Council, Census, and BLS
indicate that care should be taken when using estimates of any group
comprising less than 10 percent of the target population. For example,
the majority of estimates on the distribution of minorities on any
dimension have large standard errors, making the estimates somewhat
unreliable.
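The relationship among sample size, sampling fraction, and standard error can be illustrated with the textbook approximation for an estimated proportion under simple random sampling without replacement. This is an assumption made for illustration: the surveys reviewed here mostly use stratified designs, and the population and sample sizes below are hypothetical. The sketch also shows why estimates for small groups deserve caution, since the standard error grows relative to the estimate as the group's share shrinks.

```python
import math


def proportion_standard_error(p: float, n: int, population: int) -> float:
    """Approximate standard error of an estimated proportion p from a simple
    random sample of n drawn without replacement from a finite population."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if not 0 < n <= population:
        raise ValueError("n must be positive and no larger than the population")
    sampling_fraction = n / population
    # The (1 - sampling_fraction) term is the finite population correction.
    return math.sqrt((1.0 - sampling_fraction) * p * (1.0 - p) / n)


def relative_standard_error(p: float, n: int, population: int) -> float:
    """Standard error expressed as a fraction of the estimate itself."""
    return proportion_standard_error(p, n, population) / p


# Hypothetical population and sample sizes, chosen only for illustration.
POPULATION = 1_000_000
for n in (1_000, 10_000, 100_000):
    for p in (0.50, 0.10, 0.02):
        rse = relative_standard_error(p, n, POPULATION)
        print(f"n={n:>7,} p={p:.2f} relative SE={rse:.3f}")
```

For a fixed sample, the relative standard error of a group making up 2 percent of the population is several times that of a group making up half of it.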
Response Rate
The response rate refers to the percentage of usable responses. The
majority of the data base managers expend considerable effort in ensur-
ing an adequate response rate. Census, NSF, and the Research Council
all undertake an analysis of responses to ensure that differences in
response rates are not a significant source of error. As a result, the
response rates achieved, while not perfect, are very high for mail sur-
veys. In particular, the 82.1 percent response rate for the ESE in 1978 is
exceptionally high, given the longitudinal design of the data base.
Results of the various analyses of responses suggest that some
significant differences do exist between respondents and nonrespondents. For
example, those under 30 are less likely to respond than are those 30 to
65 (71.2 percent versus 75.3 percent);9 when sampled, engineers
without college degrees are less likely to respond than are graduate
engineers (68.7 percent versus 78.5 percent).10 Engineers are less likely to
respond than are physical scientists (74.7 percent versus 79.5
percent);11 and master's-degree holders are less likely to respond than are
bachelor's-degree holders (54.3 percent versus 65.2 percent for
engineers).12 These results strongly indicate that in-depth analyses of
responses should continue in order to avoid additional sources of error
based on response irregularities.
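The response-rate comparisons quoted in this section can be tabulated to show the size of each gap in percentage points. The rates are those given in the text; the data structure and names are illustrative only.

```python
# Each entry pairs the less responsive group with the more responsive one (rates in percent).
COMPARISONS = [
    (("under 30", 71.2), ("30 to 65", 75.3)),
    (("engineers without college degrees", 68.7), ("graduate engineers", 78.5)),
    (("engineers", 74.7), ("physical scientists", 79.5)),
    (("master's-degree holders", 54.3), ("bachelor's-degree holders", 65.2)),
]


def response_gap(lower_rate: float, higher_rate: float) -> float:
    """Gap between two response rates, in percentage points."""
    return round(higher_rate - lower_rate, 1)


gaps = {
    f"{low_name} vs. {high_name}": response_gap(low_rate, high_rate)
    for (low_name, low_rate), (high_name, high_rate) in COMPARISONS
}

# List the comparisons from the widest gap to the narrowest.
for label, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{label}: {gap} points")
```

The widest gaps, by degree level and by degree status, are the ones most likely to matter when response patterns are analyzed.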
Accuracy of Data Base
As noted earlier, the choice of data collection method and respondent
has a potential impact on the accuracy of the responses and hence on
the accuracy or reliability of the data base. With the exception of the
Census and the Current Population Survey (CPS), however, very little
research has been done to assess the accuracy of individual responses.
The data bases, therefore, may contain inaccuracies based on response
inaccuracies. However, the reliability studies that have been done for
the Census indicate that self-report measures of occupation and
industry are moderately consistent with an employer's reports of the
individual's occupations and industry.13 In addition, mail survey responses are
consistent with interviewer-obtained responses with regard to
occupation and industry.14
The absence of reliability studies on the NSF and Research Council
data bases, particularly with regard to the specification of the major
work activity, needs to be addressed. The work activity item occurs in a
more or less standard form in the NSF and Research Council data bases
and is critical to the flow diagram and the CEUE model. The sheer
complexity of the question and the ambiguity of many of the response
categories suggest that some effort needs to be made to check the
accuracy and validity of responses.
Data Compatibility
The Panel on Infrastructure Diagramming and Modeling considered
the feasibility of creating a single data file that could be used in
conjunction with the flow diagram and CEUE model to examine different
assumptions and make projections. Since no single data base provides
complete and up-to-date coverage, the compatibility of the different
data bases needs to be assessed.
Based on their current formats, the NSF and Research Council data
bases are all technically compatible, that is, they can be used to con-
struct a single data base covering a significant number of items. The
actual items that could be included in an integrated data base would be
essentially limited to those in the Survey of Recent Science and
Engineering Graduates (RSE). The integrated data base would have,
therefore, considerably fewer data elements than have the PMS and ESE.
NSF's current model is essentially based on such an integrated data
base. A single data set, however, has not been created.
While it is possible to create an integrated data base using existing
data bases, the resultant data base would have the following clear limi-
tations:
· It would include no information on technicians.
· It would grossly underrepresent engineers with less than a bache-
lor's degree.
· It would have limited information on the type of current employ-
ment and no information on employment history.
· It would have little information on education outside of program
completion.
Therefore, a more complete data base is clearly needed. Given the
limitations in coverage of the current data bases, it would be more
appropriate to consider the feasibility of developing a data base that will
be more comprehensive.
Conclusions and Recommendations
Existing data bases provide a limited picture of the engineering
community as defined in the flow diagram and the simulation model
developed as part of the study undertaken by the Committee on the
Education and Utilization of the Engineer. In order to provide even a
limited understanding of the stocks and flows of engineers, multiple
data bases must be used. Moreover, the data from any source are
extremely limited prior to 1970. In the future, therefore, a more
complete understanding of the engineering community (or of any other
technical occupation, for that matter) will require some significant
modifications in categorizing the community from which data are
collected, in the range of information gathered, and in the coordination of
the various data collection efforts. The Panel on Infrastructure
Diagramming and Modeling suggests that these modifications may be
achieved through the following:
· Monitoring of existing data collection efforts should be
undertaken by an organization not currently involved in any specific data
collection effort. The organization should have the perspective and
resources to review realistically and to integrate the need for accurate
and timely data on the engineering community with the data collection
efforts of the various government and nongovernment agencies.
· Such an organization would be responsible for ensuring that future
data collection efforts were guided by an agreed-upon general model of
the engineering community similar to the flow diagram described in
this report. The panel suggests that the National Academy of Engineer-
ing may be suited to this role of data base monitoring.
· At a minimum, however, future data collection efforts should be
extended to cover (1) segments of the engineering community not
currently covered, (2) the flow of students through the various engineering
institutions, and (3) the employment category of members of the
engineering community, including industry where employed.
· The National Science Foundation should continue its existing
plans for longitudinal studies of scientific and engineering manpower.
In addition, NSF should make an effort to extend the time period
covered from the current eight years. One distinct possibility is to arrange
for a follow-up study of the sample of engineers and scientists surveyed
between 1972 and 1978.
· The National Science Foundation should, separately or in
conjunction with the Engineering Manpower Commission, extend the
survey of recent bachelor's- and master's-degree graduates to cover the
placement of graduates from all major types of scientific and technical
programs.
· If the Bureau of Labor Statistics' Occupational Employment
Survey is to remain a major input in the analysis of the demand and supply
of technical manpower, it is essential that an effort be made to assess
the reliability and accuracy of the data.
Notes
1. Discussions with BLS personnel indicate that there is a strong need to undertake
such research in order to determine the reliability of BLS manpower forecasts.
2. Unpublished data from 1980 Recent College Graduates Survey. Of an estimated
66,973 engineering B.S. graduates in 1979-1980, 1,419 entered the armed forces.
3. National Science Foundation, The 1972 Scientists and Engineers Population
Redefined. Vol. 1. Demographic, Educational, and Professional Characteristics.
NSF 75-313 (Washington, D.C.: NSF, 1975), Table B1, p. 42.
4. Bureau of the Census, Technical Paper 33, Table 5, p. 68.
5. National Science Foundation, U.S. Scientists and Engineers: 1980. NSF 82-314
(Washington, D.C.: NSF, 1982), Table B52, p. 209.
6. Engineering Manpower Commission, E/T Enrollments, 1981, p. 16; E/T
Degrees, 1982, p. 14.
7. A similar pattern exists for previous years. The figures cited do not take into
account part-time students or those taking a fifth year.
8. National Science Foundation, Recent Science and Engineering Graduates: 1980.
NSF 82-313 (Washington, D.C.: NSF, 1982), Table B60, p. 244.
9. Bureau of the Census, Technical Paper No. 33, p. 146.
10. Ibid.
11. Ibid. The difference is less marked among new graduates.
12. United States Personnel and Funding Resources for Science, Engineering, and
Technology: Survey of Recent Science and Engineering Graduates, 1980
(Washington, D.C.: National Science Foundation), p. 46.
13. Bureau of the Census, Census of Population and Housing: 1970 (Washington,
D.C.: U.S. Government Printing Office), Table 3, p. 11.
14. Ibid., Table B, p. 7.