Copyright © 2009. National Academy of Sciences. All rights reserved.
H
Using Mobility Data to Develop Occupational Classifications:
Exploratory Exercises
JOHN A. HARTIGAN
How can occupational mobility data help occupational classification? They
may help determine that two occupational titles with slightly different
definitions are similar enough to be amalgamated or that some occupation
is attracting two distinctly different types of workers and should perhaps
be split. They may also supplement the Dictionary of Occupational Titles in
suggesting plausible cross-listings for job titles. Occupational mobility data
can contribute only a little, however, to the definition of occupations in
terms of job tasks: for that, occupational analysis or some alternative
methodology is needed.
The most significant use of job mobility data is to suggest a suitable
hierarchical organization of occupations, given a set of occupational
definitions. Mobility data are of value in grouping occupations in a way
that reflects the transfer of workers between occupations within a group.
Mobility data also are of value in constructing career ladders, that is,
hierarchies of occupations up which workers tend to move in the course of
successful careers.
We have conducted an exploratory analysis of alternative methods of
classifying occupations. This analysis assessed the feasibility of developing
classifications consisting of groups of occupations between which there are
high rates of labor mobility.
Our basic data consist of the transfers between the 441 U.S. Census
WORK, JOBS, AND OCCUPATIONS
occupational categories between 1965 and 1970.¹ Unfortunately, data on
various extra-labor-force statuses (e.g., unemployed, in school, in armed
forces, etc.) were not available to us. Similarly, no data coded into the
12,099 DOT occupational titles are available, nor are any data available
that give complete work histories or short-term transfers between jobs.²
More appropriate data are needed for future work in this area. We use the
available census data for our exploratory purposes to illustrate how one
might proceed in constructing a classification based on naturally occurring
patterns of labor mobility.
The first problem we faced in this analysis was the storage and
manipulation of the full mobility matrix for the 441 detailed census
occupations. A 441 × 441 matrix is formidable (194,481 cells), and the
12,099 × 12,099 matrix for the DOT (more than 146 million cells) is even
worse to contemplate. A more manageable way to manipulate such data is
to represent them in a list structure, which gives for each 1965 occupation
a list of 1970 occupations to which transfers took place and the
corresponding counts in each of these occupations. The total storage is
reduced without much loss by eliminating very small counts. It is also
necessary to carry the transposed list ordered by 1970 occupational
categories.
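The list structure described above can be sketched as follows. This is only an illustration: the function name `build_transfer_lists`, the `min_count` cutoff, and the sample records are assumptions made for the example, not part of the original analysis.

```python
from collections import defaultdict

def build_transfer_lists(records, min_count=1):
    """Build the forward list (by 1965 job) and the transposed list (by 1970 job).

    records: iterable of (origin, destination, count) triples.
    Records with a count below min_count are dropped to save storage.
    """
    by_origin = defaultdict(dict)   # origin -> {destination: count}
    by_dest = defaultdict(dict)     # destination -> {origin: count}
    for origin, dest, count in records:
        if count < min_count:
            continue
        by_origin[origin][dest] = by_origin[origin].get(dest, 0) + count
        by_dest[dest][origin] = by_dest[dest].get(origin, 0) + count
    return dict(by_origin), dict(by_dest)

# Hypothetical counts, for illustration only.
records = [
    ("clerk", "clerk", 900),
    ("clerk", "bookkeeper", 40),
    ("clerk", "dentist", 1),
    ("bookkeeper", "clerk", 25),
]
by_origin, by_dest = build_transfer_lists(records, min_count=5)
```

Only nonzero cells are stored, so the storage grows with the number of observed transfers rather than with the square of the number of occupations.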
STANDARDIZED RATES AND PROBABILITY MODELS
In order to adjust for different numbers of workers in various occupations,
Goldhamer (1948) proposed the standardized rate
$$t_{ij} = \frac{n_{ij}\,N}{n_{i\cdot}\,n_{\cdot j}},$$
where
$n_{ij}$ = number transferring from job $i$ in 1965 to job $j$ in 1970;
$n_{i\cdot}$ = number in job $i$ in 1965;
$n_{\cdot j}$ = number in job $j$ in 1970;
$N$ = total number of workers.
¹See Sommers and Eck (1978) for a description of the data used in these analyses.
²Had the work history data routinely collected from Employment Service job applicants been
available for analysis, we could have conducted a much more interesting and informative
exercise. Unfortunately, although the work history data are initially coded with nine-digit
DOT codes, all but the first two digits are dropped when the data are put on tape.
Hauser (1978) notes that this measure does not adjust for expected
diagonal peculiarities and suggests a measure in which the "margins" $n_{i\cdot}$
and $n_{\cdot j}$ ignore specified cells such as the diagonal ones, using Goodman's
(1969, 1971) quasi-independence techniques. For example, $n_{i\cdot}$, $n_{\cdot j}$, and $N$
might all plausibly be defined ignoring the diagonals.
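A minimal sketch of the standardized rate, with an option to define the margins and the total ignoring the diagonal cells, might look like this. The function name `standardized_rates`, the `skip_diagonal` flag, and the counts are assumptions for illustration. The flag implements only the simple "ignore the diagonals" variant mentioned above, not a full quasi-independence fit.

```python
def standardized_rates(counts, skip_diagonal=False):
    """Return {(i, j): t_ij} with t_ij = n_ij * N / (n_i. * n_.j).

    counts: dict mapping (i, j) to the number moving from job i to job j.
    If skip_diagonal is True, stayers (i == j) are left out of the
    margins and the total, and no rate is reported for them.
    """
    row, col, total = {}, {}, 0
    for (i, j), n in counts.items():
        if skip_diagonal and i == j:
            continue
        row[i] = row.get(i, 0) + n
        col[j] = col.get(j, 0) + n
        total += n
    rates = {}
    for (i, j), n in counts.items():
        if skip_diagonal and i == j:
            continue
        rates[(i, j)] = n * total / (row[i] * col[j])
    return rates

# Hypothetical two-job table, for illustration only.
counts = {("a", "a"): 80, ("a", "b"): 20, ("b", "a"): 10, ("b", "b"): 90}
rates_all = standardized_rates(counts)
rates_off = standardized_rates(counts, skip_diagonal=True)
```

A rate of 1 corresponds to what independence of origin and destination would predict; values above 1 indicate more transfers than expected from the margins alone.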
For a hierarchical structure on the set of jobs, consider the model
$$p_{ij} = p_{i\cdot}\,p_{\cdot j}\,\lambda_{G(i,j)},$$
where
$p_{ij}$ = probability of observing a transfer $i$ to $j$;
$p_{i\cdot}$ = probability (roughly) that a worker begins in job $i$;
$p_{\cdot j}$ = probability (roughly) that a worker ends in job $j$;
$\lambda_{G(i,j)}$ = transfer rate corresponding to the smallest group $G$ containing jobs
$i$ and $j$; there will be a different rate for each group.
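Following the standard quasi-independence procedure (Haberman, 1974),
the maximum likelihood estimates of $p_{i\cdot}$, $p_{\cdot j}$, $\lambda_G$ are obtained by setting the
observed margins and between-group transfers equal to their expected
values under the model:
$$\frac{n_{i\cdot}}{N} = p_{i\cdot}\Big[\sum_j p_{\cdot j}\,\lambda_{G(i,j)}\Big],$$
$$\frac{n_{\cdot j}}{N} = p_{\cdot j}\Big[\sum_i p_{i\cdot}\,\lambda_{G(i,j)}\Big],$$
$$\frac{1}{N}\sum_{G(i,j)=G} n_{ij} = \lambda_G \sum_{G(i,j)=G} p_{i\cdot}\,p_{\cdot j}.$$
Solutions may be obtained by solving successively for $\{p_{i\cdot}\}$, $\{p_{\cdot j}\}$, $\{\lambda_G\}$
with the other parameters fixed. The overall fit of the model may be
measured by the log likelihood
$$L = \sum_{i,j} n_{ij} \log p_{ij}.$$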
This measure permits comparison of various hierarchies. It also allows
construction of new hierarchies by seeking groups G that make L as large
as possible. Conceptually, the procedure is straightforward; computation-
ally, it would be quite a chore to design iterative parameter estimates for a
list data structure and to improve the hierarchy by moving jobs between
groups.
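The successive-solution scheme can be sketched on a small dense table. This is an illustrative sketch under several assumptions: it uses a full matrix rather than a list structure, it assumes every job has a positive row and column total, and the names `fit_hierarchy_model`, `log_likelihood`, `a`, `b`, `lam`, the group labels, and the counts are all invented for the example.

```python
import math

def fit_hierarchy_model(n, group, iters=500):
    """Fit p_ij = a[i] * b[j] * lam[group[i][j]] by cyclic updates.

    n: square list of lists of counts.
    group: same shape; group[i][j] labels the smallest group holding i and j.
    Each pass solves for a, then b, then lam, holding the others fixed.
    """
    k = len(n)
    total = sum(map(sum, n))
    row = [sum(n[i]) / total for i in range(k)]
    col = [sum(n[i][j] for i in range(k)) / total for j in range(k)]
    observed = {}
    for i in range(k):
        for j in range(k):
            g = group[i][j]
            observed[g] = observed.get(g, 0.0) + n[i][j] / total
    a, b = row[:], col[:]
    lam = {g: 1.0 for g in observed}
    for _ in range(iters):
        a = [row[i] / sum(b[j] * lam[group[i][j]] for j in range(k))
             for i in range(k)]
        b = [col[j] / sum(a[i] * lam[group[i][j]] for i in range(k))
             for j in range(k)]
        expected = {g: 0.0 for g in observed}
        for i in range(k):
            for j in range(k):
                expected[group[i][j]] += a[i] * b[j]
        lam = {g: observed[g] / expected[g] for g in observed}
    return a, b, lam

def log_likelihood(n, group, a, b, lam):
    """L = sum of n_ij * log p_ij over cells with n_ij > 0."""
    k = len(n)
    return sum(n[i][j] * math.log(a[i] * b[j] * lam[group[i][j]])
               for i in range(k) for j in range(k) if n[i][j] > 0)

# Hypothetical 4-job table with two small groups, {0, 1} and {2, 3}.
n = [[50, 20, 3, 2], [15, 60, 4, 1], [2, 3, 40, 25], [1, 4, 30, 55]]
label = lambda i, j: "A" if max(i, j) <= 1 else "B" if min(i, j) >= 2 else "root"
group = [[label(i, j) for j in range(4)] for i in range(4)]
a, b, lam = fit_hierarchy_model(n, group)
L = log_likelihood(n, group, a, b, lam)
```

Fitting two candidate hierarchies to the same table and comparing their values of `L` is one way to carry out the comparison of hierarchies described above.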
CLUSTERING ANALYSES
Alternative procedures are available. A hierarchical clustering has been
carried out by Dauffenbach (1973). He discusses the 1970 Census
classification and principles for constructing a new classification. For job i
a vector is constructed whose jth entry is the proportion transferring from
job j to job i, for all j. Distance between jobs is the Euclidean distance between
these vectors. (Some other distances and data vectors are also considered.)
Thus two jobs are similar if there are similar patterns of movement into
them. Complete linkage clustering (cf. Hartigan, 1975) was then used to
construct a binary tree of clusters on the set of all jobs. The results are not
very different from the census classification.
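The approach attributed to Dauffenbach can be illustrated roughly as follows. This is a sketch under assumptions, not his implementation: the text does not say how the proportions are normalized, so the example assumes the jth entry for job i is the share of job j's workers who moved to job i, and the naive complete linkage routine, the function names, and the counts are all invented for illustration.

```python
import math

def inflow_profiles(n):
    """Profile for job i: share of each origin job j's workers moving to i.

    Assumption: entry j is n[j][i] divided by the row total of job j.
    """
    k = len(n)
    out = [sum(n[j]) for j in range(k)]
    return [[n[j][i] / out[j] if out[j] else 0.0 for j in range(k)]
            for i in range(k)]

def euclidean(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def complete_linkage(dist):
    """Naive agglomerative clustering; returns merges as (left, right, height).

    The distance between two clusters is the largest distance between
    any pair of their members.
    """
    clusters = [frozenset([i]) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                h = max(dist[i][j] for i in clusters[x] for j in clusters[y])
                if best is None or h < best[0]:
                    best = (h, x, y)
        h, x, y = best
        merges.append((clusters[x], clusters[y], h))
        merged = clusters[x] | clusters[y]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (x, y)]
        clusters.append(merged)
    return merges

# Hypothetical 4-job table, for illustration only.
n = [[50, 20, 3, 2], [15, 60, 4, 1], [2, 3, 40, 25], [1, 4, 30, 55]]
profiles = inflow_profiles(n)
dist = [[euclidean(u, v) for v in profiles] for u in profiles]
merges = complete_linkage(dist)
```

The sequence of merges defines the binary tree of clusters; each merge height is the complete linkage distance at which the two clusters were joined.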
The measure of distance and the data vector of proportional transfers
used by Dauffenbach are not wholly adequate. In particular, there will be
large transfers from jobs with many workers, and such jobs will tend to
make large contributions; there will be many entries near zero in every
vector, and it seems wrong to ignore this property of the vectors; the
essential information in the data is carried by the transfers from each job
to just a few other jobs. The problem with the measure of distance is that
after we have computed Euclidean distance between two vectors of length
441, we do not know what we have. Complete linkage is statistically
inconsistent. Nevertheless, Dauffenbach's clusters are suggestive.
An alternative method of constructing clusters uses a quasi-independence
model (see Appendix G). This would require advanced programming that
has not been done. A simpler method is to use the standardized transfer
rates
$$t_{ij} = \frac{n_{ij}\,N}{n_{i\cdot}\,n_{\cdot j}},$$
where
$n_{i\cdot}$ = total number transferring from job $i$;
$n_{\cdot j}$ = total number transferring into job $j$;
$N$ = total number of transfers.
Any two jobs $i$ and $j$ are similar if $t_{ij}$ and $t_{ji}$ are both high; the measure of
distance between $i$ and $j$ is $d_{ij} = 1/\min(t_{ij}, t_{ji})$. The single linkage technique
constructs clusters by linking together jobs for which the transfer rate
exceeds some threshold; a cluster is made up of jobs linked together.
Varying the threshold produces a hierarchy of clusters.
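The single linkage step amounts to finding connected components among jobs whose mutual transfer rates exceed the threshold. The sketch below shows one way to do this with a union-find structure; the function name `single_linkage_clusters` and the rates are hypothetical and chosen only to illustrate the mechanics.

```python
def single_linkage_clusters(rates, jobs, threshold):
    """Link i and j when min(t_ij, t_ji) exceeds threshold.

    rates: dict mapping (i, j) to the standardized transfer rate t_ij;
    a missing pair is treated as a rate of zero.
    Returns the clusters (sets of jobs) formed by the links.
    """
    parent = {job: job for job in jobs}

    def find(x):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), t in rates.items():
        if i != j and min(t, rates.get((j, i), 0.0)) > threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for job in jobs:
        clusters.setdefault(find(job), set()).add(job)
    return sorted(clusters.values(), key=lambda c: sorted(c))

# Hypothetical rates, for illustration only.
jobs = ["librarian", "library attendant", "clerk", "dentist"]
rates = {
    ("librarian", "library attendant"): 6.0,
    ("library attendant", "librarian"): 4.0,
    ("library attendant", "clerk"): 2.5,
    ("clerk", "library attendant"): 2.0,
    ("clerk", "dentist"): 0.4,
    ("dentist", "clerk"): 0.1,
}
tight = single_linkage_clusters(rates, jobs, threshold=3.0)
loose = single_linkage_clusters(rates, jobs, threshold=1.5)
```

Lowering the threshold can only merge clusters, never split them, which is why the results for a decreasing sequence of thresholds nest into a hierarchy. In this example the lower threshold chains "clerk" onto the library cluster through "library attendant".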
We have applied this technique: the clusters obtained are shown in
Table H-1. Like Dauffenbach's clusters, they draw together different levels
of skills, such as librarian and library attendant or health record technician
and medical secretary. They also show some absurd associations, such as
dentist and flight engineer, which are due in part to single linkage chaining
together a number of slightly related jobs and in part to the unreliability of
transfer rates that (because diagonal terms are removed) may be rather
high for jobs with high retention rates, from which people transfer to just a
few other jobs.
CAREER LADDERS
We would like a classification scheme not only to group together
occupations between which transfers are likely but also to order
occupations so that transfers tend to take place from lower-ranked jobs to
higher-ranked jobs. In order to accommodate both aims and to explain the
transfer data succinctly, it would seem desirable to put jobs close together
in the structure whenever there are many transfers in either direction. The
small groups should therefore consist of families of jobs within which a
career ladder exists; there may only be a weak ladder relationship between
the larger groups. (In the census scheme there are strong ladder relations
between the large groups.)
A probabilistic model constructs an ordering and a hierarchical
classification of all jobs. The probability of a transfer $i$ to $j$ is
$$p_{ij} = p_{i\cdot}\,p_{\cdot j}\,\lambda_{ij},$$
where
$p_{i\cdot}$ is the probability (roughly) that a person is in job $i$ in 1965,
$p_{\cdot j}$ is the probability (roughly) that a person is in job $j$ in 1970, and
$\lambda_{ij}$ is constant over all $i < j$ such that $G$ is the smallest group containing
$i$ and $j$.
To estimate the parameters given the order and hierarchy, it is sufficient to
know the marginal numbers of workers, the number of transfers from
lower-status to higher-status occupations within each group, and the
number of transfers within each group.
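The statistics named above can be tallied in one pass over the transfer counts. The sketch below is an assumption-laden illustration: it takes "transfers within each group" to mean transfers whose smallest containing group is that group, it counts stayers among them, and the names `ladder_statistics`, `rank`, and `smallest_group`, along with the data, are invented for the example.

```python
def ladder_statistics(counts, rank, smallest_group):
    """Tally margins, within-group transfers, and upward within-group transfers.

    counts: dict mapping (i, j) to the number moving from job i to job j.
    rank: dict mapping each job to its position in the ordering (low to high).
    smallest_group: function (i, j) -> label of the smallest group holding both.
    """
    origin, dest, within, upward = {}, {}, {}, {}
    for (i, j), n in counts.items():
        origin[i] = origin.get(i, 0) + n
        dest[j] = dest.get(j, 0) + n
        g = smallest_group(i, j)
        within[g] = within.get(g, 0) + n
        if rank[i] < rank[j]:          # move from lower to higher status
            upward[g] = upward.get(g, 0) + n
    return origin, dest, within, upward

# Hypothetical ordering, grouping, and counts, for illustration only.
rank = {"library attendant": 0, "librarian": 1, "dentist": 2}
library = {"library attendant", "librarian"}
smallest_group = lambda i, j: "library" if {i, j} <= library else "all"
counts = {
    ("library attendant", "librarian"): 30,
    ("librarian", "library attendant"): 5,
    ("library attendant", "dentist"): 1,
    ("dentist", "librarian"): 2,
}
origin, dest, within, upward = ladder_statistics(counts, rank, smallest_group)
```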
[Table H-1, printed on pages 416-417 of the original, shows the clusters obtained by the single linkage technique; its contents are not recoverable from the machine-read text.]
Sophisticated programming is required to construct a hierarchical
clustering and an ordering according to this model.
A quick but less adequate way to construct an ordering is as follows. Let
$s_i$ be the level of the $i$th job. Compute $\{s_i\}$ so that most transfers from $i$ are
to jobs $j$ for which $(s_j - s_i)$ is small. The easiest criterion to minimize is the
sum of $(s_i - s_j)^2$ over all transfers, subject to the condition that the sum of
$s_i^2$ over all workers be fixed. This criterion leads to an iterative procedure
in which the improved estimate $s_i^*$ equals the average $s_j$ over all transfers
to and from $i$,
$$s_i^* = \frac{\sum_j (n_{ij} s_j + n_{ji} s_j)}{\sum_j (n_{ji} + n_{ij})},$$
given the old estimates $s_j$. The starting point for the estimates would be the original numbering for
the jobs, which will give a crude rank order by level in the standard
classifications. The procedure should be repeated several times.
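One pass of this averaging procedure can be sketched as below. Several choices here are assumptions rather than part of the text: stayers (i equal to j) are not treated as transfers, the constraint is enforced by rescaling after each pass so that the worker-weighted sum of squared levels equals `target`, and the function name `update_levels` and the data are hypothetical.

```python
import math

def update_levels(counts, levels, workers, target=1.0):
    """Replace each level by the mean level over transfers to and from that job.

    counts: dict mapping (i, j) to the number moving from job i to job j.
    levels: dict mapping job to its current level s_i.
    workers: dict mapping job to its number of workers (constraint weights).
    The result is rescaled so that sum_i workers[i] * s_i**2 == target.
    """
    num = {i: 0.0 for i in levels}
    den = {i: 0.0 for i in levels}
    for (i, j), n in counts.items():
        if i == j:
            continue                   # stayers are not transfers
        num[i] += n * levels[j]        # transfer out of i, into j
        den[i] += n
        num[j] += n * levels[i]        # the same transfer seen from j
        den[j] += n
    new = {i: (num[i] / den[i] if den[i] else levels[i]) for i in levels}
    scale = math.sqrt(target / sum(workers[i] * new[i] ** 2 for i in new))
    return {i: s * scale for i, s in new.items()}

# Hypothetical data; the starting levels are the original job numbering.
counts = {(1, 2): 10, (1, 3): 2, (2, 3): 10, (3, 4): 10, (2, 4): 3, (2, 1): 2}
workers = {1: 100, 2: 100, 3: 100, 4: 100}
levels = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}
for _ in range(3):                     # "repeated several times"
    levels = update_levels(counts, levels, workers)
```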
Another simple procedure is to reorder the jobs so that as many
transfers as possible take place in the direction of increasing order; this
is simpler conceptually but more complicated in computation than the procedure
described above.
FEASIBILITY
Our analyses were carried out to explore the feasibility of using mobility
data to construct an occupational classification. Our tentative conclusions
are the following:
1. Mobility data can be useful for constructing a hierarchical clas-
sification and ordering of occupations, but the basic occupational titles on
which the mobility data are collected must be defined by other procedures.
2. There are formidable statistical and computational problems in-
volved in constructing a classification in this way. In particular, in
developing classifications for job-worker matching, it is crucial to pay
careful attention to activities before entry and after exit from the work
force. In addition, computations should be carried out using list structures;
a standard matrix representation is not feasible.
3. Some plausible statistical models for transfers are available and could
be used as a guide in evaluating and generating classifications and career-
ladder orderings.
4. Crude reclassifications and orderings suggest that the 1970 Census
classification had many pairs of similar jobs in quite different groups,
owing to its emphasis on socioeconomic status.
5. It would be feasible to construct occupational groupings so that most
transfers take place within relatively small groups and so that most
transfers take place upon a career ladder.