About this PDF file: This new digital representation of the original work has been recomposed from XML files created from the original paper book, not from the original typesetting files. Page breaks are true to the original; line lengths, word breaks, heading styles, and other typesetting-specific formatting, however, cannot be retained, and some typographic errors may have been accidentally inserted. Please use the print version of this publication as the authoritative version for attribution.
APPENDIX H
Using Mobility Data to Develop
Occupational Classifications: Exploratory
Exercises
JOHN A. HARTIGAN
How can occupational mobility data help occupational classification? They
may help determine that two occupational titles with slightly different
definitions are similar enough to be amalgamated or that some occupation is
attracting two distinctly different types of workers and should perhaps be split.
They may also supplement the Dictionary of Occupational Titles in suggesting
plausible cross-listings for job titles. Occupational mobility data can contribute
only a little, however, to the definition of occupations in terms of job tasks: for
that, occupational analysis or some alternative methodology is needed.
The most significant use of job mobility data is to suggest a suitable
hierarchical organization of occupations, given a set of occupational definitions.
Mobility data are of value in grouping occupations in a way that reflects the
transfer of workers between occupations within a group. Mobility data also are
of value in constructing career ladders, that is, hierarchies of occupations up
which workers tend to move in the course of successful careers.
We have conducted an exploratory analysis of alternative methods of
classifying occupations. This analysis assessed the feasibility of developing
classifications consisting of groups of occupations between which there are high
rates of labor mobility.
Our basic data consist of the transfers between the 441 U.S. Census
occupational categories between 1965 and 1970.¹ Unfortunately, data on
various extra-labor-force statuses (e.g., unemployed, in school, in armed forces)
were not available to us. Similarly, no data coded into the 12,099 DOT
occupational titles are available, nor are any data available that give complete
work histories or short-term transfers between jobs.² More appropriate data are
needed for future work in this area. We use the available census data for our
exploratory purposes to illustrate how one might proceed in constructing a
classification based on naturally occurring patterns of labor mobility.
The first problem we faced in this analysis was the storage and
manipulation of the full mobility matrix for the 441 detailed census
occupations. A 441×441 matrix is formidable (194,481 cells), and the
12,099×12,099 matrix for the DOT (more than 146 million cells) is even worse to
contemplate. A more manageable way to manipulate such data is to represent
them in a list structure, which gives for each 1965 occupation a list of 1970
occupations to which transfers took place and the corresponding counts in each
of these occupations. The total storage is reduced without much loss by
eliminating very small counts. It is also necessary to carry the transposed list
ordered by 1970 occupational categories.
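The list structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_transfer_lists`, the `min_count` cutoff, and the sample records are hypothetical, and the input is assumed to hold one already-aggregated record per origin-destination pair.

```python
from collections import defaultdict

def build_transfer_lists(records, min_count=1):
    """Store transfers as two sparse lists instead of a full matrix.

    records: iterable of (origin_1965, destination_1970, count) triples,
             assumed to contain at most one record per pair.
    Returns (by_origin, by_destination):
      by_origin[i][j]      = count of transfers from job i to job j
      by_destination[j][i] = the same count, indexed by 1970 job first
    """
    by_origin = defaultdict(dict)
    by_destination = defaultdict(dict)
    for origin, destination, count in records:
        if count < min_count:
            continue  # eliminate very small counts to reduce storage
        by_origin[origin][destination] = count
        by_destination[destination][origin] = count  # transposed list
    return dict(by_origin), dict(by_destination)

# Hypothetical counts, not taken from the census data.
records = [(3, 4, 120), (4, 3, 95), (3, 20, 2), (20, 21, 40)]
by_origin, by_destination = build_transfer_lists(records, min_count=5)
print(by_origin[3])        # destinations reached from job 3
print(by_destination[3])   # origins feeding job 3
```

Only nonzero cells are stored, so the storage grows with the number of observed transfer pairs rather than with the square of the number of occupations.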
STANDARDIZED RATES AND PROBABILITY MODELS
In order to adjust for different numbers of workers in various occupations,
Goldhamer (1948) proposed the standardized rate
$t_{ij} = N n_{ij} / (n_{i\cdot} n_{\cdot j})$,
where
$n_{ij}$ = number transferring from job i in 1965 to job j in 1970;
$n_{i\cdot}$ = number in job i in 1965;
$n_{\cdot j}$ = number in job j in 1970;
$N$ = total number of workers.
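As a small worked illustration, the sketch below computes a standardized rate for each observed cell, assuming the rate is the ratio of observed transfers to the number expected if origin and destination were independent, that is, N·n_ij/(n_i.·n_.j). The function name and the two-job counts are hypothetical.

```python
def standardized_rates(counts):
    """counts: dict mapping (i, j) -> n_ij (stayers included as (i, i)).

    Returns a dict mapping (i, j) -> N * n_ij / (n_i. * n_.j)
    for every cell with a positive count.
    """
    row_totals, col_totals = {}, {}
    for (i, j), n in counts.items():
        row_totals[i] = row_totals.get(i, 0) + n   # n_i. : workers in job i in 1965
        col_totals[j] = col_totals.get(j, 0) + n   # n_.j : workers in job j in 1970
    total = sum(counts.values())                   # N
    return {
        (i, j): n * total / (row_totals[i] * col_totals[j])
        for (i, j), n in counts.items()
        if n > 0
    }

# Hypothetical two-job example.
counts = {("a", "a"): 80, ("a", "b"): 20, ("b", "a"): 10, ("b", "b"): 90}
rates = standardized_rates(counts)
print(rates[("a", "b")])  # below 1: fewer a-to-b transfers than independence predicts
```

A rate above 1 means more transfers than expected under independence, and a rate below 1 means fewer.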
Hauser (1978) notes that this measure does not adjust for expected
diagonal peculiarities and suggests a measure in which the “margins” $n_{i\cdot}$
and $n_{\cdot j}$ ignore specified cells such as the diagonal ones, using Goodman's (1969,
1971) quasi-independence techniques. For example, $n_{i\cdot}$, $n_{\cdot j}$, and $N$ might all
plausibly be defined ignoring the diagonals.
¹ See Sommers and Eck (1978) for a description of the data used in these analyses.
² Had the work history data routinely collected from Employment Service job
applicants been available for analysis, we could have conducted a much more interesting
and informative exercise. Unfortunately, although the work history data are initially
coded with nine-digit DOT codes, all but the first two digits are dropped when the data are
put on tape.
For a hierarchical structure on the set of jobs, consider the model
$p_{ij} = p_{i\cdot} p_{\cdot j} \lambda_{G(i,j)}$,
where
$p_{ij}$ = probability of observing a transfer i to j;
$p_{i\cdot}$ = probability (roughly) that a worker begins in job i;
$p_{\cdot j}$ = probability (roughly) that a worker ends in job j;
$\lambda_{G(i,j)}$ = transfer rate corresponding to the smallest group G containing jobs i and j;
there will be a different rate for each group.
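Following the standard quasi-independence procedure (Haberman, 1974),
the maximum likelihood estimates of $p_{i\cdot}$, $p_{\cdot j}$, $\lambda_G$ are obtained by setting the
observed margins and between-group transfers equal to their expected values
under the model:
$n_{i\cdot} = N \sum_j p_{i\cdot} p_{\cdot j} \lambda_{G(i,j)}$ for each job i,
$n_{\cdot j} = N \sum_i p_{i\cdot} p_{\cdot j} \lambda_{G(i,j)}$ for each job j,
$\sum_{(i,j): G(i,j)=G} n_{ij} = N \sum_{(i,j): G(i,j)=G} p_{i\cdot} p_{\cdot j} \lambda_G$ for each group G.
Solutions may be obtained by solving successively for $\{p_{i\cdot}\}$, $\{p_{\cdot j}\}$, $\{\lambda_G\}$ with
the other parameters fixed. The overall fit of the model may be measured by the
log likelihood
$L = \sum_{i,j} n_{ij} \log p_{ij}$.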
This measure permits comparison of various hierarchies. It also allows
construction of new hierarchies by seeking groups G that make L as large as
possible. Conceptually, the procedure is straightforward; computationally, it
would be quite a chore to design iterative parameter estimates for a list data
structure and to improve the hierarchy by moving jobs between groups.
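To make the successive-solution idea concrete, here is a minimal sketch that fits the model by iterative proportional scaling, one common way to match observed and expected totals in turn. It is not the author's program. It assumes a hierarchy given as `paths` (a hypothetical representation: each job maps to its chain of group labels from the top down), treats a job by itself as the smallest group for the diagonal cell (i, i), and loops over the full matrix for readability rather than using the list structure recommended earlier.

```python
import math

def smallest_group(paths, i, j):
    """Label of the smallest group in the hierarchy containing jobs i and j."""
    if i == j:
        return ("job", i)  # assumption: a single job is its own smallest group
    shared = []
    for a, b in zip(paths[i], paths[j]):
        if a != b:
            break
        shared.append(a)
    return ("group", tuple(shared))  # empty tuple means the set of all jobs

def fit_hierarchy(counts, paths, sweeps=200):
    """Fit p_ij = row_i * col_j * lam_G(i,j); return (fitted p_ij, log likelihood)."""
    jobs = sorted(paths)
    total = sum(counts.values())
    group = {(i, j): smallest_group(paths, i, j) for i in jobs for j in jobs}
    row = {i: 1.0 / len(jobs) for i in jobs}
    col = {j: 1.0 / len(jobs) for j in jobs}
    lam = {g: 1.0 for g in set(group.values())}

    def p(i, j):
        return row[i] * col[j] * lam[group[i, j]]

    def rescale(params, key_of):
        # Scale each parameter so its fitted total matches its observed total.
        observed = {k: 0.0 for k in params}
        fitted = {k: 0.0 for k in params}
        for i in jobs:
            for j in jobs:
                k = key_of(i, j)
                observed[k] += counts.get((i, j), 0)
                fitted[k] += total * p(i, j)
        for k in params:
            if fitted[k] > 0:
                params[k] *= observed[k] / fitted[k]

    for _ in range(sweeps):
        rescale(row, lambda i, j: i)            # match origin margins
        rescale(col, lambda i, j: j)            # match destination margins
        rescale(lam, lambda i, j: group[i, j])  # match between-group transfers

    probs = {(i, j): p(i, j) for i in jobs for j in jobs}
    loglik = sum(n * math.log(probs[ij]) for ij, n in counts.items() if n > 0)
    return probs, loglik

# Hypothetical example: two jobs in one group "x".
counts = {("a", "a"): 80, ("a", "b"): 20, ("b", "a"): 10, ("b", "b"): 90}
probs, loglik = fit_hierarchy(counts, {"a": ("x",), "b": ("x",)})
print(round(probs["a", "b"], 4), round(loglik, 2))
```

Running `fit_hierarchy` on the same counts with two candidate hierarchies and comparing the returned log likelihoods is one way to compare groupings in the manner the text describes.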

CLUSTERING ANALYSES
Alternative procedures are available. A hierarchical clustering has been
carried out by Dauffenbach (1973). He discusses the 1970 Census classification
and principles for constructing a new classification. For job i a vector is
constructed equal to the proportion that transfer from job j to job i for all j.
Distance between jobs is the Euclidean distance between these vectors. (Some
other distances and data vectors are also considered.) Thus two jobs are similar
if there are similar patterns of movement into them. Complete linkage clustering
(cf. Hartigan, 1975) was then used to construct a binary tree of clusters on the
set of all jobs. The results are not very different from the census classification.
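A rough sketch of this kind of procedure follows; it is not Dauffenbach's code. It assumes that the entry for job j in the vector of job i is the share of job j's 1965 workers who were in job i in 1970 (one possible reading of "proportion"), and it uses a naive complete-linkage loop that is only practical for small examples. All names and counts are hypothetical.

```python
import math

def inflow_vectors(counts, jobs):
    """For each job i, the vector over origins j of n_ji / n_j. (assumed definition)."""
    out_totals = {j: sum(counts.get((j, i), 0) for i in jobs) for j in jobs}
    return {
        i: [counts.get((j, i), 0) / out_totals[j] if out_totals[j] else 0.0
            for j in jobs]
        for i in jobs
    }

def complete_linkage(vectors):
    """Merge clusters repeatedly; return a list of (merged cluster, merge distance)."""
    clusters = [(job,) for job in vectors]
    merges = []

    def dist(c1, c2):
        # Complete linkage: distance between clusters is the largest pairwise distance.
        return max(math.dist(vectors[a], vectors[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        d, x, y = min(
            (dist(c1, c2), x, y)
            for x, c1 in enumerate(clusters)
            for y, c2 in enumerate(clusters)
            if x < y
        )
        merged = clusters[x] + clusters[y]
        merges.append((merged, d))
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return merges

# Hypothetical three-job example.
jobs = ["a", "b", "c"]
counts = {
    ("a", "a"): 8, ("a", "b"): 1, ("a", "c"): 1,
    ("b", "a"): 1, ("b", "b"): 8, ("b", "c"): 1,
    ("c", "a"): 6, ("c", "b"): 2, ("c", "c"): 2,
}
for cluster, distance in complete_linkage(inflow_vectors(counts, jobs)):
    print(cluster, round(distance, 3))
```

The sequence of merges defines the binary tree of clusters: each merge joins two existing clusters at the distance shown.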
The measure of distance and the data vector of proportional transfers used
by Dauffenbach are not wholly adequate. In particular, there will be large
transfers from jobs with many workers, and such jobs will tend to make large
contributions; there will be many entries near zero in every vector, and it seems
wrong to ignore this property of the vectors; the essential information in the
data is carried by the transfers from each job to just a few other jobs. The
problem with the measure of distance is that after we have computed Euclidean
distance between two vectors of length 441, we do not know what we have.
Complete linkage is statistically inconsistent. Nevertheless, Dauffenbach's
clusters are suggestive.
An alternative method of constructing clusters uses a quasi-independence
model (see Appendix G). This would require advanced programming that has
not been done. A simpler method is to use the standardized transfer rates
$t_{ij} = N n_{ij} / (n_{i\cdot} n_{\cdot j})$,
where
$n_{i\cdot}$ = total number transferring from job i;
$n_{\cdot j}$ = total number transferring into job j;
$N$ = total number of transfers.
Any two jobs i and j are similar if $t_{ij}$ and $t_{ji}$ are both high; the measure of
distance between i and j is $d_{ij} = 1/\min(t_{ij}, t_{ji})$. The single linkage technique
constructs clusters by linking together jobs for which the transfer rate exceeds
some threshold; a cluster is made up of jobs linked together. Varying the
threshold produces a hierarchy of clusters.
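The threshold procedure can be sketched as follows. This is a minimal illustration, not the program used for the analysis: it assumes the rate is N·n_ij/(n_i.·n_.j) computed from off-diagonal transfers only, and it uses a union-find structure (my choice) to collect linked jobs. The function names and counts are hypothetical.

```python
def transfer_rates(counts):
    """Standardized rates from off-diagonal transfers only (stayers ignored)."""
    moves = {(i, j): n for (i, j), n in counts.items() if i != j and n > 0}
    out_totals, in_totals = {}, {}
    for (i, j), n in moves.items():
        out_totals[i] = out_totals.get(i, 0) + n  # total transferring from job i
        in_totals[j] = in_totals.get(j, 0) + n    # total transferring into job j
    total = sum(moves.values())                   # total number of transfers
    return {(i, j): n * total / (out_totals[i] * in_totals[j])
            for (i, j), n in moves.items()}

def single_linkage(counts, threshold):
    """Clusters of jobs chained together by mutual rates above the threshold."""
    rates = transfer_rates(counts)
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (i, j), t in rates.items():
        # Link i and j only if the rate is high in BOTH directions.
        if min(t, rates.get((j, i), 0.0)) > threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for job in list(parent):
        clusters.setdefault(find(job), set()).add(job)
    return sorted(sorted(c) for c in clusters.values())

# Hypothetical counts for four jobs.
counts = {(1, 2): 50, (2, 1): 40, (2, 3): 5, (3, 2): 5,
          (3, 4): 30, (4, 3): 30, (1, 4): 5, (4, 1): 5}
for threshold in (5.0, 1.0, 0.4):
    print(threshold, single_linkage(counts, threshold))
```

Lowering the threshold can only merge clusters, never split them, which is why varying it yields a nested hierarchy. Jobs that link to nothing at a given threshold are simply left out of the result.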

We have applied this technique: the clusters obtained are shown in
Table H-1. Like Dauffenbach's clusters they draw together different levels of
skills, such as librarian and library attendant or health record technician and
medical secretary. They also show some absurd associations, such as dentist
and flight engineer, which are due in part to single linkage chaining together a
number of slightly related jobs and in part to the unreliability of transfer rates
that (because diagonal terms are removed) may be rather high for jobs with high
retention rates, from which people transfer to just a few other jobs.
CAREER LADDERS
We would like a classification scheme not only to group together
occupations between which transfers are likely but also to order occupations so
that transfers tend to take place from lower-ranked jobs to higher-ranked jobs.
In order to accommodate both aims and to explain the transfer data succinctly, it
would seem desirable to put jobs close together in the structure whenever there
are many transfers in either direction. The small groups should therefore
consist of families of jobs within which a career ladder exists; there may only
be a weak ladder relationship between the larger groups. (In the census scheme
there are strong ladder relations between the large groups.)
A probabilistic model constructs an ordering and a hierarchical
classification of all jobs. The probability of a transfer i to j is
$p_{ij} = p_{i\cdot} p_{\cdot j} \lambda_{ij}$,
where
$p_{i\cdot}$ is the probability (roughly) that a person is in job i in 1965,
$p_{\cdot j}$ is the probability (roughly) that a person is in job j in 1970, and
$\lambda_{ij}$ is constant over all i

TABLE H-1 Single-Linkage Clusters
Occupational Title | Pairs With Transfer Rates Greater Than Expected
1. Computer programmers, system analysts | 3, 4
2. Farm management advisor, agricultural scientist, archivist, biological scientist, social scientist, agriculture teacher | 17, 27, 23, 29, 61, 64
3. Home management advisors, dietitians | 19, 45
4. Judges, lawyers, librarians, law teachers, library attendants | 20, 21, 22, 84, 178
5. Actuaries, mathematicians, statisticians | 24, 25, 26
6. Chemists, chemistry teachers | 30, 67
7. Marine scientists, physicists, physics teachers, engineering teachers, political scientists, psychologists, sociologists, mathematics teachers, psychology teachers, business teachers, economics teachers, history teachers, sociology teachers, social science teachers, foreign language teachers, unspecified university teachers | 32, 33, 68, 69, 57, 58, 59, 70, 72, 73, 74, 75, 76, 77, 82, 88
8. Dentists, optometrists, pharmacists, physicians, health teachers, airplane pilots, air traffic controllers, flight engineers, dental lab technicians | 38, 39, 40, 41, 71, 103, 104, 106, 229
9. Podiatrist, clinical technicians | 42, 48
10. Dental hygienists, health record technicians, medical secretaries | 49, 50, 196
11. Theology teachers, clergymen, religious workers, n.e.c. | 85, 54, 55
12. Social workers, clerical assistants | 62, 168
13. Atmospheric teachers, biology teachers | 65, 66
14. Surveyors, chainmen | 101, 312
15. Embalmers, funeral directors | 105, 130
16. Authors, editors and reporters, radio and television announcers | 113, 116, 121
17. Painters and sculptors, sign painters | 118, 292
18. Photographers, engravers, photoengravers, pressmen plate, pressmen apprentices | 119, 234, 277, 284, 285
19. Sailors (deckhands), boatmen, fishermen, pilots | 344, 362, 377, 136
20. Railroad conductors, brakemen, switchmen | 141, 369, 370
21. Auto accessory installers, bakers, bookbinders | 212, 213, 216
22. Tile setters, floor layers | 299, 236
23. Metal heaters, heat treaters, rollers | 325, 242, 286
24. Locomotive engineers, locomotive firemen | 247, 248
25. Opticians, special craft apprentices | 272, 303
26. Painter-apprentices, plumbers (pipe fitters) | 274, 281
27. Plasterers, plasterer apprentices | 279, 280
28. Sheetmetal workers, sheetmetal apprentices | 288, 289
29. Shoe machine operators, shoe repairmen | 347, 291
30. Telephone installers, telephone linemen | 347, 291
31. Tool and die makers, tool and die apprentices | 300, 301
32. Dressmakers, milliners, blasters, and powdermen | 316, 331, 310
33. Mine operatives, mine motormen | 332, 367
34. Lumbermen, teamsters | 382, 384
35. Private cooks, housekeepers, maids | 437, 438, 440
NOTE: Single-linkage clusters joining pairs of jobs with mutual transfer rates exceeding 65, the expected number. Numbering is the ordered sequence of 441 jobs in the 1970
Census classification. All other jobs do not associate at this threshold. Christel Mack of Yale University is to be thanked for her work in the preparation of this table.

Sophisticated programming is required to construct a hierarchical
clustering and an ordering according to this model.
A quick but less adequate way to construct an ordering is as follows. Let $s_i$
be the level of the ith job. Compute $\{s_i\}$ so that most transfers from i are to jobs
j for which $(s_j - s_i)$ is small. The easiest criterion to minimize is the sum of $(s_i - s_j)^2$
over all transfers, subject to the condition that the sum of $s_i^2$ over all workers be
fixed. This criterion leads to an iteration in which the improved estimate $\hat{s}_i$
equals the average of the old estimates $s_j$ over all transfers to and from i:
$\hat{s}_i = \sum_j (n_{ij} + n_{ji}) s_j / \sum_j (n_{ij} + n_{ji})$.
The starting point for the estimates would
be the original numbering for the jobs, which will give a crude rank order by
level in the standard classifications. The procedure should be repeated several
times.
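One sweep of this averaging step might look like the sketch below. It is a hedged illustration only: the function name is hypothetical, stayers (i to i) are skipped on the assumption that only actual transfers enter the average, jobs with no transfers keep their old level, and the rescaling needed to hold the sum of squares fixed is left out.

```python
def update_levels(counts, levels, sweeps=1):
    """Replace each s_i by the transfer-weighted average of s_j over all
    transfers to and from job i, repeating for the given number of sweeps.

    counts: dict mapping (i, j) -> n_ij
    levels: dict mapping job -> current level s_i
    """
    levels = dict(levels)
    for _ in range(sweeps):
        weighted, weight = {}, {}
        for (i, j), n in counts.items():
            if i == j:
                continue  # assumption: stayers do not enter the average
            # A transfer i -> j pulls s_i toward s_j and s_j toward s_i.
            for here, there in ((i, j), (j, i)):
                weighted[here] = weighted.get(here, 0.0) + n * levels[there]
                weight[here] = weight.get(here, 0) + n
        levels = {
            job: (weighted[job] / weight[job] if weight.get(job) else s)
            for job, s in levels.items()
        }
    return levels

# Hypothetical example: start from the original job numbering as crude levels.
counts = {(0, 1): 3, (1, 2): 1}
start = {0: 0.0, 1: 2.0, 2: 6.0, 3: 9.0}
print(update_levels(counts, start))  # {0: 2.0, 1: 1.5, 2: 2.0, 3: 9.0}
```

Because repeated averaging pulls all levels toward a common value, only a few sweeps are useful unless the levels are rescaled between sweeps.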
Another simple procedure is to reorder the jobs so that as many transfers as
possible move upward in the ordering; this is simpler conceptually but
more complicated in computation than the procedure described above.
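A simple local-search heuristic gives the flavor of such a reordering. It is my own illustrative choice, not a method given in the text: swapping two adjacent jobs changes only the direction of the transfers between those two jobs, so the sketch swaps whenever that raises the number of upward transfers. It stops at a local optimum, which need not be the best possible ordering. Names and counts are hypothetical.

```python
def upward_transfers(counts, order):
    """Number of transfers that go from a lower-ranked to a higher-ranked job."""
    rank = {job: k for k, job in enumerate(order)}
    return sum(n for (i, j), n in counts.items() if rank[i] < rank[j])

def improve_order(counts, order):
    """Swap adjacent jobs while doing so increases the upward transfer count."""
    order = list(order)
    improved = True
    while improved:
        improved = False
        for k in range(len(order) - 1):
            low, high = order[k], order[k + 1]
            # After a swap, transfers high -> low become upward instead of low -> high.
            if counts.get((high, low), 0) > counts.get((low, high), 0):
                order[k], order[k + 1] = high, low
                improved = True
    return order

# Hypothetical transfers among three jobs, starting from a poor ordering.
counts = {
    ("clerk", "manager"): 9, ("manager", "clerk"): 1,
    ("trainee", "clerk"): 7, ("clerk", "trainee"): 2,
    ("trainee", "manager"): 1,
}
start = ["manager", "clerk", "trainee"]
best = improve_order(counts, start)
print(best, upward_transfers(counts, start), upward_transfers(counts, best))
```

Each swap strictly increases the upward count, so the loop always terminates.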
FEASIBILITY
Our analyses were carried out to explore the feasibility of using mobility
data to construct an occupational classification. Our tentative conclusions are
the following:
1. Mobility data can be useful for constructing a hierarchical classification
and ordering of occupations, but the basic occupational titles on which the
mobility data are collected must be defined by other procedures.
2. There are formidable statistical and computational problems involved in
constructing a classification in this way. In particular, in developing
classifications for job-worker matching, it is crucial to pay careful attention
to activities before entry and after exit from the work force. In addition,
computations should be carried out using list structures; a standard matrix
representation is not feasible.
3. Some plausible statistical models for transfers are available and could be
used as a guide in evaluating and generating classifications and career-
ladder orderings.
4. Crude reclassifications and orderings suggest that the 1970 Census
classification had many pairs of similar jobs in quite different groups,
owing to its emphasis on socioeconomic status.
5. It would be feasible to construct occupational groupings so that most
transfers take place within relatively small groups and so that most
transfers take place upon a career ladder.