National Academies Press: OpenBook
Suggested Citation:"5. Data Bases." National Research Council. 1986. Engineering Infrastructure Diagramming and Modeling. Washington, DC: The National Academies Press. doi: 10.17226/587.


5 Data Bases

This chapter provides an overview and assessment of the data bases used in the development of the panel's flow diagram and its CEUE model. Fourteen distinct data bases were used to obtain the necessary data and estimates on the education and employment of those in the engineering community. The panel also made use of a large variety of secondary sources that are not reviewed here. The chapter is divided into four parts: an overview, a discussion of data coverage, a technical assessment, and recommendations.

Overview of Data Bases

Table 1 summarizes the main features of the 14 data bases used in developing the panel's flow diagram and model. The features summarized in the table are discussed below.

[Table 1, summarizing the main features of the 14 data bases, appears here in the original; its contents are not recoverable from the machine-read text.]

Data Base Manager

The term data base manager refers to the organization that is principally responsible for storing and reporting the data. The National Science Foundation (NSF), National Research Council, Engineering Manpower Commission (EMC), Bureau of the Census (Census), Bureau of Labor Statistics (BLS), and National Center for Education Statistics (NCES) are the primary data base managers. There is, however, a considerable level of interdependence among the data base managers. For example, NSF and BLS both rely on the Census for some of their samples and for the actual administration and compilation of the surveys. NSF, in turn, operates as a funding or sponsoring body for the Research Council.

The significance of the interrelationships among the data base managers stems from the increased likelihood that the individual data bases could be made fully compatible with each other at some time in the future. NSF acts as the primary integrative organization, particularly by standardizing key survey questions and by sponsoring a mathematical model that combines a number of its data bases.

At the present time, however, the variety of data base managers increases the difficulties associated with integrating the individual data bases. Each data base is constructed to address the issues that follow from the immediate organizational purposes of the data base manager. Thus, NSF focuses heavily on the training of scientists and engineers, on the functioning of higher education, and on the types of employment of scientists and engineers. By contrast, BLS focuses exclusively on where scientists and engineers are employed. One major consequence of these differences is that definitions of engineer and engineering differ. As a result, there are marked discrepancies in the estimates of the number of engineers derived from different sources using different definitions. (For example, estimates of the number of engineers practicing in the United States range from 1.2 million to 1.9 million.)

Data Collection Methods

Mail surveys are the primary means used by the 14 data bases (listed in Table 1) for collecting data. Only the Current Population Survey (CPS) consistently uses an interview method.
Mail surveys are the most cost-effective method for collecting the data, and, given the relatively high existing response rates to the mail surveys (see section on "Respondents" below), there is little reason for changing data collection methods. Studies conducted by the Census indicate that mail surveys do not in themselves result in biased or inaccurate data.

Respondents

Respondents to data base surveys fall into three categories: (1) the targeted individual, (2) the household in which the targeted individual resides, and (3) the establishments where the individual works or is educated. NSF, Research Council, and National Society of Professional

Engineers (NSPE) surveys are directed at the targeted individual, EMC surveys at the employment establishment or educational institution, and the Census surveys and CPS at the head of the household. BLS and NCES direct certain surveys to establishments and other surveys to individuals. Battelle's survey is sent to both individuals and establishments at the same time.

The choice of respondent affects key aspects of the data collection and analysis effort. Individuals and households are more willing to provide detailed information than is an establishment. Establishment surveys are therefore generally designed to collect limited amounts of information on relatively few categories (e.g., occupation and industry). The unit of analysis is generally a group of individuals rather than an individual. As a result, it is possible to generate data on "males" or "Asiatics" but seldom "male Asiatics" without adding enormously to the reporting burdens of an organization. If the number of categories on an establishment survey is large or if the survey uses unfamiliar categories, the data are likely to be less reliable. At this time, despite the use of establishment data to project manpower needs and opportunities, hardly any published research exists on the accuracy of establishment surveys.¹

Considerably more detail can be obtained from individual respondents, but there remains the concern that individuals may selectively distort their responses, particularly in status-related areas such as degrees, salary, occupation, and organizational level. Although the research is limited, Census research studies tend to indicate that while errors in reporting do occur, they are not major and there is no consistent pattern of bias.
Similar concerns exist for household surveys, with the added issue that the person answering the survey may have little idea about what other members of the household actually do or earn and what their educational backgrounds are. As is the case with establishment surveys, there is little research to support or refute this concern.

Data for the panel's flow diagram and the CEUE model were developed from surveys of individuals. Although establishment data can be used to estimate the size of the various stocks, the level of detail required for an analysis of flows requires that data be collected on individuals. As a result, the most useful data bases are those in which the respondent is an individual or household.

Target Population

The term target population refers to the group of individuals on whom data are sought. Some target populations are very broadly

defined (e.g., by Census, CPS, and the Occupational Employment Survey (OES)); others are very narrowly defined (e.g., Survey of Doctorate Recipients (DR), Survey of Recent Science and Engineering Graduates (RSE)). For example, DR only provides information on engineers with doctorates, and RSE only provides information on engineers and computer specialists who have graduated with bachelor's or master's degrees.

The flow diagram and model require personal, educational, and employment information on three specific occupational groups: engineers, computer specialists, and technicians. A number of the data bases only address a distinct segment of these populations. Overall, the data bases as currently configured do not provide detailed up-to-date information on engineers and computer specialists without degrees, associate degree engineers, and computer specialists and technicians. Some information is available from Census, CPS, and the Postcensal Manpower Survey (PMS), but in the case of Census and PMS, it is provided only once every 10 years. The CPS data base is constructed on a relatively small sample size.

The absence of data in these areas in part reflects the differences in the definitions of engineers and engineering that underlie the various data collection efforts. Attention has been focused principally on a primary segment of the engineering community, the engineer with a degree (B.S. or above). Yet NSF and other bodies acknowledge that industry frequently meets local shortages of B.S. engineers by upgrading other technical staff, many of whom have no degrees.

Other small but significant segments of the engineering community also receive little attention. Military personnel with B.S. degrees or equivalent training in engineering and individuals with computer specialties are not treated as part of the general labor force. For example, the BLS labor participation rate is calculated after excluding those in military service.
This figure is relatively large: in 1979-1980, 2.1 percent of the graduating class in engineering entered the armed forces.² In addition, a significant percentage of physical science graduates also entered the armed forces as engineers.

The inability to obtain complete coverage of the engineering community using the existing data bases means that the data elements in the flow diagram tend to underestimate the size of the stocks and flows.

Focus

As mentioned earlier, the focus or purpose of each data base varies but can be captured in a combination of three categories: personal, education, and employment. Establishment surveys tend to focus on a single

category, while individual surveys cover two or more categories. The precise coverage of the different data bases is explained in more detail below (see section on "Data Base Coverage").

Frequency

The frequency with which a data base is updated is a function of its purpose and complexity and of the pace at which the statistics change. For example, the size and scope of the census ensure that its frequency remains low. By comparison, the need to adjust salaries in inflationary periods resulted in an increase in the frequency of various salary surveys, such as the Professional Engineer Income and Salary Survey (PE). The frequency of updating the data bases listed in Table 1 varies from 1 month (CPS) to 10 years (Census and PMS). The majority of the surveys are conducted on an annual or biennial basis.

NSF's National Survey of Experienced Scientists and Engineers (ESE) is unique among the surveys. It was designed to follow the careers of a sample of scientists and engineers over an eight-year period (1970-1978). ESE provides the only genuine measure of the flow of engineers throughout the engineering community.

Time Period Covered

Some of the data bases cover long periods of time, such as those of the Census, CPS, and Professional Income of Engineers (PIE). The majority of the data bases were started in the 1960s, in part as a response to the Sputnik challenge and the subsequent increased demand for scientists and engineers in aerospace and defense industries. As a result, few of the current data bases go back to 1960. In part, the new data bases replaced others, such as the Engineering Register, but major differences in target populations and survey items severely limit the usefulness of the earlier data bases. The first year in which the data elements conform to the needs of the flow diagram as presently structured is 1962.

Availability

All of the data bases listed in Table 1 are computerized.
In most cases, the data tapes are available to the public. However, in a number of instances (e.g., EMC and NCES), the existing tabulations are exhaustive enough to make the tapes redundant for most users.

The majority of the panel's work was done using existing tabulations, with the notable exception of NSF's data (PMS, ESE, and RSE).

While the quality of reporting of the data in tabular form was in general very high, serious difficulties were encountered in the use of NSF's RSE tapes. For example, the 1979 RSE tape documentation lists 12,285 cases, yet only 11,543 appear on the tape. More critical was an apparent difficulty in attaching the correct weights to individual cases in the 1976, 1978, and 1979 tapes. (For 1979 data, approximately 5 percent of the cases have incorrect weights.) This problem made it impossible to reconstruct and validate NSF published tables. If they have not been addressed already, these technical issues need to be resolved by NSF.

Data Base Coverage

Table 2 summarizes in detail the data elements covered by the data bases. The data elements fall into three categories: personal, education, and employment. The table provides only a limited indication of the adequacy of the data bases. There is a need to determine not only whether a specific topic is covered but also for what period of time, in what detail, with what underlying unit of analysis, and with what representativeness of the sample. Each of these issues is addressed below in terms of the requirements of the flow diagram.

Personal Variables

The six establishment-respondent surveys (see Table 1) provide no personal background data. The remaining surveys, with the exception of OES, provide standard information on age, sex, and marital status (see Table 2). Prior to 1972 the surveys, with the exception of the census, did not include items on racial or ethnic background. For the PMS and ESE data bases, NSF used the individual's census response to cover this variable. The census and PMS responses of individuals in the ESE sample are used in the ESE data base. Citizenship is included in the individual respondent data bases except for CPS and OES.
Total income, as opposed to base pay from a primary source of employment, is asked in some surveys (Census, PMS, CPS, and DR) but not in others (ESE, RSE, OES, and the Survey of Earned Doctorates (ED)).

TABLE 2 Data Elements in Existing Data Bases

Surveys (columns): Census; Postcensal Manpower Survey (PMS); National Survey of Experienced Scientists and Engineers (ESE); Survey of Recent Science and Engineering Graduates (RSE); Current Population Survey (CPS); Occupational Employment Survey (OES); Survey of Earned Doctorates (ED); Survey of Doctorate Recipients (DR); Engineering and Technology Degrees Granted (E/T Degrees); Engineering and Technology Enrollments (E/T Enrollments); Professional Income of Engineers (PIE); National Survey of Compensation (Battelle); Professional Engineer Income and Salary Survey (PE).

Data elements (rows):
Personal: age; sex; race; marital status; citizenship; income.
Education: level; field; history (type of degree, field, date received, date enrolled); current status; other training; future plans; source of support.
Employment: employment status; current job (occupation, type of employer, industry, level, tenure, work activities, salary, satisfaction, skill utilization); job history (occupation, type of employer, industry, level, tenure, work activities, salary, satisfaction, skill utilization).

[The X marks showing which survey covers which data element are not recoverable from the machine-read text.]

Education

The PMS, ESE, RSE, and ED generate considerable data on types and level of education. However, the range of postsecondary education covered in RSE and ED is narrowly defined by their target populations: those in engineering with at least a bachelor's degree. The PMS and the related ESE are the primary sources of data on engineers and technicians with less than a bachelor's degree. Again, this is a major shortcoming in the coverage of the existing data bases, since a significant percentage of the engineers and almost all technicians come from less than four-year programs. For example, in 1972 NSF estimated that 11.3 percent³ of engineers had less than a bachelor's degree. The Bureau of the Census estimated 30.7 percent.⁴ In 1980, NSF estimated that 4.0 percent⁵ of engineers had less than a bachelor's degree. This decline from 1972 to 1980 reflects a failure to account for those individuals entering engineering with less than a bachelor's degree during the 1970s.

In terms of the panel's flow diagram and model, the data bases provide limited information on the flows of individuals among fields, particularly at four-year colleges and universities. The EMC Engineering and Technology Enrollment (E/T Enrollment) data provide some aggregate information on the size of classes for each specialty within engineering, but it is impossible to determine where students switching out of engineering go and where students entering engineering come from. NCES and other sources of educational data are primarily establishment data and provide no insight into the movement of students among fields. The potential importance of the issue can be illustrated using EMC E/T enrollment and degree data: Of 110,000 full-time freshmen in engineering in 1980, 87,500 became full-time sophomores in engineering in 1981. This suggests a dropout rate or change in major of 20 percent.
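The cohort arithmetic behind these EMC figures can be made explicit with a short calculation. The sketch below is illustrative only: it uses the enrollment and degree counts quoted in this section, and the function names (`attrition_rate`, `net_changes`) are hypothetical helpers, not part of any EMC or NSF tool. It computes net changes between class years, which is all that aggregate establishment data permit; the gross flows into and out of engineering remain unobserved.

```python
# Counts quoted in this section from EMC E/T enrollment and degree data.
FRESHMEN_1980 = 110_000
SOPHOMORES_1981 = 87_500

# Class of 1982, in order of progression (full-time engineering students).
CLASS_OF_1982 = [
    ("freshmen", 95_800),
    ("sophomores", 78_600),
    ("juniors", 80_000),
    ("seniors", 92_400),
    ("graduates", 67_000),
]


def attrition_rate(start: int, end: int) -> float:
    """Fraction of the starting group that is absent at the next stage."""
    return (start - end) / start


def net_changes(stages):
    """Net change in head count between consecutive stages.

    A positive value means more students at the later stage than at the
    earlier one (net inflow); a negative value means a net loss.
    """
    return [
        (f"{a} -> {b}", n_b - n_a)
        for (a, n_a), (b, n_b) in zip(stages, stages[1:])
    ]


rate = attrition_rate(FRESHMEN_1980, SOPHOMORES_1981)
print(f"1980 freshmen not enrolled as 1981 sophomores: {rate:.1%}")

for label, delta in net_changes(CLASS_OF_1982):
    print(f"{label}: {delta:+,}")

counts = dict(CLASS_OF_1982)
non_graduates = counts["seniors"] - counts["graduates"]
share = non_graduates / counts["graduates"]
print(f"Seniors who did not graduate: {non_graduates:,} ({share:.0%} of graduates)")
```

A net gain between two class years can conceal substantial movement in both directions, which is why establishment totals alone cannot answer where departing students go or where arriving students come from.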
Of more interest is that in the class of 1982 there were 95,800 entering full-time engineering students, 78,600 sophomores, then 80,000 juniors, 92,400 seniors, and, finally, 67,000 graduates in 1982.⁶

These data raise a series of questions, not the least of which is the source of the additional 12,400 seniors in engineering and the fate of the 25,400 engineering majors who did not graduate. The latter represent approximately 38 percent of the 1982 baccalaureate engineering population.⁷ The lack of clarity around these flows increases the difficulty of constructing the flow diagram.

Another difficulty with the educational data elements stems from the failure of the current status item in the RSE to distinguish between graduate students enrolled in master's and doctoral programs and to indicate the field of the program. Given the infrequency of the PMS, the RSE is critical in determining the flows of baccalaureates in engineering, science, and other fields into master's and doctoral programs in engineering and other fields. The same problem is encountered when using NCES fall enrollment data. The flows can only be approximately estimated from total enrollment figures.

Current data on the flow of students from high school into colleges and universities are basically limited to establishment data on the numbers of high school graduates. Very little information is available on which institutions and programs high school graduates select. The American Freshmen study provides the only measure of program choices.

Employment

Two surveys, PMS and ESE, provide almost complete coverage of the employment data needed for the flow diagram and model. Unfortunately, as noted earlier, these data bases only provide a current picture of employment patterns in the engineering community every 10 years. The main means for updating the PMS data are the RSE and DR. While these two data bases are largely compatible with the PMS and ESE and provide considerable information about work activities and salary, they do not provide any information on type of industry. This in itself is not critical for the flow diagram and model at their highest level of aggregation, but given NSF's estimate that 78.2 percent⁸ of employed engineers work in private industry, it would be useful to determine the differences among the various industrial sectors and to see where new Ph.D.s are going. The 1982 PMS and Census will provide more current information when they become available on public use computer tapes.

The job history data available from the PMS and the mobility data derived from the ESE provide the only means of estimating the flow of engineers between work activities, specialties, and occupations. The OES and CPS provide limited data on occupations and industry.
The structure of the OES data limits their usefulness with regard to the model and flow diagram, although they provide a cross-check on the representativeness of the PMS and ESE data. Similarly, the Professional Engineer Income and Salary Survey (PE) provides a check on the representativeness of the NSF data despite the sample's being limited to National Society of Professional Engineers members. The Battelle data can serve a similar function, but here the population is even more limited since only R&D establishments are surveyed.

In sum, the PMS and ESE data bases provide a reasonable amount of

data on the employment of engineers. The others are of limited usefulness, although they can be used to verify estimates derived from the PMS and ESE.

Table 3 is a summary evaluation of the data bases. The PMS and ESE provide the most complete coverage. However, no data base currently available adequately covers the flow of high school students into colleges and universities. Nor is there sufficient coverage of the flows of students across the fields within the higher-education system. Finally, although the PMS and ESE provide detailed coverage of the remaining data elements, the data are only collected every 10 years and need to be augmented on a more frequent and representative basis using the NSF model to generate the needed estimates for updating. The RSE and DR data bases are deficient in item coverage, and the associate and nondegreed segment of the engineering community is not covered.

Technical Characteristics

Table 4 summarizes and reviews a number of the key technical characteristics of the data bases, which are discussed below. This section also addresses the possibility of creating an integrated data base.

Sampling Frame

The sampling frames used for the different data bases are generally well defined in the sense that the target population is clearly identified and a listing of potential respondents can be constructed. For example, the Postcensal Manpower Survey (PMS) uses a clearly identified subset of the census population as its target population, and the Engineering Manpower Commission (EMC) has a complete list of colleges and universities offering bachelor's degrees in engineering. In a few instances, the sampling frame is either ill-defined or incomplete. For example, EMC does not have an exhaustive list of colleges and universities offering less than bachelor's-level degrees and programs in engineering.
Also, EMC's salary survey has no clearly identified target population, although NSPE's does, namely, its own membership.

Even though a sampling frame may be well defined, it may not be representative of the desired target population. For example, National Society of Professional Engineers (NSPE) membership tends to be older and better qualified than are engineers in general. Therefore, care needs to be taken in generalizing the results of this survey to the entire engineering population. For the most part, however, the sampling frames are representative of the target populations for the surveys.

TABLE 3 Data Base Coverage

Columns (adequacy of coverage): Personal; Education, Current; Education, History; Employment, Current; Employment, History.

Data bases (rows): Census; Postcensal Manpower Survey (PMS); National Survey of Experienced Scientists and Engineers (ESE); National Survey of Recent Science and Engineering Graduates (RSE); Survey of Earned Doctorates (ED); Survey of Doctorate Recipients (DR); Current Population Survey (CPS); Occupational Employment Survey (OES).

NOTE: X = fairly complete coverage; ? = some elements are covered, but there are major elements that are not covered.

[The individual cell entries are not recoverable from the machine-read text.]

Sampling Procedures

A number of the surveys are based on a "100 percent sample," and therefore the sampling procedures are not an issue. Those data bases actually using a sample employ standard procedures to randomly select the sample. In most instances, a stratified random sample is used in order to ensure adequate coverage of key demographic, geographic, or size variables. Design effects from the stratification procedure tend to be small.

Sample Size and Sampling Fraction

The sample size and sampling fraction are the key determinants of the reliability of the subsequent population estimates. As the sample size or the sampling fraction increases, or both, the standard error of an estimate declines. Sample sizes and sampling fractions are chosen so that the standard error for the same set of key estimates falls within certain acceptable limits. In general, the decisions concerning sample size and sampling fraction reflect a considerable amount of careful planning and analysis. While no definitive judgment can be made as to the acceptability of the sampling errors, the standard error

[Table 4, summarizing the key technical characteristics of the data bases, appears here in the original; its contents are not recoverable from the machine-read text.]

data provided by NSF, the National Research Council, Census, and BLS indicate that care should be taken when using estimates of any group comprising less than 10 percent of the target population. For example, the majority of estimates on the distribution of minorities on any dimension have large standard errors, making the estimates somewhat unreliable.

Response Rate

The response rate refers to the percentage of usable responses. The majority of the data base managers expend considerable effort in ensuring an adequate response rate. Census, NSF, and the Research Council all undertake an analysis of responses to ensure that differences in response rates are not a significant source of error. As a result, the response rates achieved, while not perfect, are very high for mail surveys. In particular, the 82.1 percent response rate for the ESE in 1978 is exceptionally high, given the longitudinal design of the data base.

Results of the various analyses of responses suggest that some significant differences do exist between respondents and nonrespondents. For example, those under 30 are less likely to respond than are those 30 to 65 (71.2 percent versus 75.3 percent);⁹ when sampled, engineers without college degrees are less likely to respond than are graduate engineers (68.7 percent versus 78.5 percent).¹⁰ Engineers are less likely to respond than are physical scientists (74.7 percent versus 79.5 percent);¹¹ and master's-degree holders are less likely to respond than are bachelor's-degree holders (54.3 percent versus 65.2 percent for engineers).¹² These results strongly indicate that in-depth analyses of responses should continue in order to avoid additional sources of error based on response irregularities.

Accuracy of Data Base

As noted earlier, the choice of data collection method and respondent has a potential impact on the accuracy of the responses and hence on the accuracy or reliability of the data base.
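The caution above about groups comprising less than 10 percent of the target population can be illustrated with a small calculation. The sketch below is not taken from any of the surveys discussed: it assumes a simple random sample drawn without replacement (the surveys themselves mostly use stratified samples, although design effects are said to be small), and the function names, population size, sample sizes, and group shares are all hypothetical. It shows that the standard error of an estimated proportion falls as the sample size or sampling fraction rises, while the error relative to the estimate grows as the group becomes a smaller share of the population.

```python
import math


def proportion_standard_error(p, n, population):
    """Approximate standard error of an estimated proportion p.

    Assumes a simple random sample of size n drawn without replacement
    from `population` units (an illustrative simplification).
    """
    if not 0 < n <= population:
        raise ValueError("n must be between 1 and the population size")
    sampling_fraction = n / population
    # Finite population correction: the error shrinks as the
    # sampling fraction approaches 1 (a "100 percent sample").
    fpc = 1.0 - sampling_fraction
    return math.sqrt(fpc * p * (1.0 - p) / n)


def relative_standard_error(p, n, population):
    """Standard error expressed as a fraction of the estimate itself."""
    return proportion_standard_error(p, n, population) / p


# Hypothetical values chosen only for illustration, not survey figures.
POPULATION = 1_000_000
for n in (1_000, 10_000, 100_000):
    for share in (0.50, 0.10, 0.02):
        rse = relative_standard_error(share, n, POPULATION)
        print(f"n={n:>7}  group share={share:.2f}  relative SE={rse:.1%}")
```

Under these assumptions, the relative standard error for a group that is a small share of the population is several times that of a group making up half of it at the same sample size, which is consistent with the advice to treat estimates for small groups with care.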
With the exception of the Census and the Current Population Survey (CPS), however, very little research has been done to assess the accuracy of individual responses. The data bases, therefore, may contain inaccuracies based on response inaccuracies. However, the reliability studies that have been done for the Census indicate that self-report measures of occupation and industry are moderately consistent with an employer's reports of the individual's occupation and industry.¹³ In addition, mail survey responses are consistent with interviewer-obtained responses with regard to occupation and industry.¹⁴

The absence of reliability studies on the NSF and Research Council data bases, particularly with regard to the specification of the major work activity, needs to be addressed. The work activity item occurs in a more or less standard form in the NSF and Research Council data bases and is critical to the flow diagram and the CLUE model. The sheer complexity of the question and the ambiguity of many of the response categories suggest that some effort needs to be made to check the accuracy and validity of responses.

Data Compatibility

The Panel on Infrastructure Diagramming and Modeling considered the feasibility of creating a single data file that could be used in conjunction with the flow diagram and CLUE model to examine different assumptions and make projections. Since no single data base provides complete and up-to-date coverage, the compatibility of the different data bases needs to be assessed.

Based on their current formats, the NSF and Research Council data bases are all technically compatible; that is, they can be used to construct a single data base covering a significant number of items. The actual items that could be included in an integrated data base would be essentially limited to those in the Survey of Recent Science and Engineering Graduates (RSE). The integrated data base would have, therefore, considerably fewer data elements than have the PMS and ESE. NSF's current model is essentially based on such an integrated data base. A single data set, however, has not been created.

While it is possible to create an integrated data base using existing data bases, the resultant data base would have the following clear limitations:

· It would include no information on technicians.
· It would grossly underrepresent engineers with less than a bachelor's degree.
· It would have limited information on the type of current employment and no information on employment history.
· It would have little information on education outside of program completion.

Therefore, a more complete data base is clearly needed. Given the limitations in coverage of the current data bases, it would be more

appropriate to consider the feasibility of developing a data base that will be more comprehensive.

Conclusions and Recommendations

Existing data bases provide a limited picture of the engineering community as defined in the flow diagram and the simulation model developed as part of the study undertaken by the Committee on the Education and Utilization of the Engineer. In order to provide even a limited understanding of the stocks and flows of engineers, multiple data bases must be used. Moreover, the data from any source are extremely limited prior to 1970. In the future, therefore, a more complete understanding of the engineering community, or of any other technical occupation for that matter, will require some significant modifications in categorizing the community from which data are collected, in the range of information gathered, and in the coordination of the various data collection efforts. The Panel on Infrastructure Diagramming and Modeling suggests that these modifications may be achieved through the following:

· Monitoring of existing data collection efforts should be undertaken by an organization not currently involved in any specific data collection effort. The organization should have the perspective and resources to review realistically and to integrate the need for accurate and timely data on the engineering community with the data collection efforts of the various government and nongovernment agencies.
· Such an organization would be responsible for ensuring that future data collection efforts were guided by an agreed-upon general model of the engineering community similar to the flow diagram described in this report. The panel suggests that the National Academy of Engineering may be suited to this role of data base monitoring.
· At a minimum, however, future data collection efforts should be extended to cover (1) segments of the engineering community not currently covered, (2) the flow of students through the various engineering institutions, and (3) the employment category of members of the engineering community, including industry where employed.
· The National Science Foundation should continue its existing plans for longitudinal studies of scientific and engineering manpower. In addition, NSF should make an effort to extend the time period covered from the current eight years. One distinct possibility is to arrange for a follow-up study of the sample of engineers and scientists surveyed between 1972 and 1978.

· The National Science Foundation should, separately or in conjunction with the Engineering Manpower Commission, extend the survey of recent bachelor's- and master's-degree graduates to cover the placement of graduates from all major types of scientific and technical programs.
· If the Bureau of Labor Statistics' Occupational Employment Survey is to remain a major input in the analysis of the demand and supply of technical manpower, it is essential that an effort be made to assess the reliability and accuracy of the data.

Notes

1. Discussions with BLS personnel indicate that there is a strong need to undertake such research in order to determine the reliability of BLS manpower forecasts.
2. Unpublished data from 1980 Recent College Graduates Survey. Of an estimated 66,973 engineering B.S. graduates in 1979-1980, 1,419 entered the armed forces.
3. National Science Foundation, The 1972 Scientists and Engineers Population Redefined. Vol. 1. Demographic, Educational, and Professional Characteristics. NSF 75-313 (Washington, D.C.: NSF, 1975), Table B1, p. 42.
4. Bureau of the Census, Technical Paper 33, Table 5, p. 68.
5. National Science Foundation, U.S. Scientists and Engineers: 1980. NSF 82-314 (Washington, D.C.: NSF, 1982), Table B52, p. 209.
6. Engineering Manpower Commission, E/T Enrollments, 1981, p. 16; E/T Degrees, 1982, p. 14.
7. A similar pattern exists for previous years. The figures cited do not take into account part-time students or those taking a fifth year.
8. National Science Foundation, Recent Science and Engineering Graduates: 1980. NSF 82-313 (Washington, D.C.: NSF, 1982), Table B60, p. 244.
9. Bureau of the Census, Technical Paper No. 33, p. 146.
10. Ibid.
11. Ibid. The difference is less marked among new graduates.
12. United States Personnel and Funding Resources for Science, Engineering, and Technology: Survey of Recent Science and Engineering Graduates, 1980 (Washington, D.C.: National Science Foundation), p. 46.
13. Bureau of the Census, Census of Population and Housing: 1970 (Washington, D.C.: U.S. Government Printing Office), Table 3, p. 11.
14. Ibid., Table B, p. 7.

