Under the direction of the committee, RTI International investigated the statistical reliability of the BLS wage data. Given the time and data constraints faced by the project, it was not possible to study the full spectrum of BLS wage data for all occupations. Instead, a simulation was performed with all-industry mean hourly wages and relative standard errors (RSEs) publicly available from the BLS website for three common hospital occupations. These three occupations would comprise a substantial portion of a BLS hospital wage index including all occupations.
OCCUPATIONS USED IN THE ANALYSIS
The occupations used in the simulation were
- Registered nurses;
- Nursing aides, orderlies, and attendants; and
- Office and administrative support occupations.
Together, these three occupations account for half (49.9 percent) of total hospital employment nationally, according to the May 2009 Occupational Employment Statistics national employment estimates for hospitals from the BLS website. Each occupation's share of the combined employment in the three occupations is:
- Registered nurses: 56 percent;
- Nursing aides, orderlies, and attendants: 15 percent; and
- Office and administrative support occupations: 29 percent.
The reliability of the estimated weighted average wage of these three occupations was investigated, that is, the reliability of average wage = [0.56 · (registered nurse wage)] + [0.15 · (nursing aide wage)] + [0.29 · (administrative support wage)].
GEOGRAPHIC AREAS USED IN THE ANALYSIS
The analysis was done separately for metropolitan and nonmetropolitan areas. BLS reported at least some data for 400 metropolitan statistical areas (MSAs), but data on wages or RSEs for at least one of the three selected occupations were missing for 21 areas, leaving data for 379 MSAs in the analysis. BLS reported data for 172 nonmetropolitan areas, but data on wages or RSEs for at least one of the selected occupations were missing for four areas, leaving data for 168 nonmetropolitan areas in the analysis. Each nonmetropolitan area was contained within a single state, but many states contained more than one nonmetropolitan area (for example, northeast Alabama nonmetropolitan area, northwest Alabama nonmetropolitan area, southeast Alabama nonmetropolitan area, and southwest Alabama nonmetropolitan area). RTI analyzed data for these nonmetropolitan areas individually and did not aggregate them into a single statewide nonmetropolitan area. RTI’s analysis likely overstates the RSEs of single statewide nonmetropolitan areas.
CALCULATION OF RSE FOR EACH AREA
The RSE for each area was calculated according to the following derivation:
Let a1, a2, and a3 be the weights of the components in the index, where component 1 is registered nurses; component 2 is nursing aides; and component 3 is administrative support occupations.
The weights are as follows:
a1 = 0.56, a2 = 0.15, and a3 = 0.29.
Let y1, y2, y3 = estimated mean wages for the components (in a certain area).
The index value (Y; weighted average wage) is defined as Y = (a1 · y1 + a2 · y2 + a3 · y3).
Let s1, s2, s3 = standard errors (SEs) for y1, y2, and y3, respectively. The SEs (s) were calculated from the BLS-reported RSEs as s1 = (RSE1 · y1), s2 = (RSE2 · y2), and s3 = (RSE3 · y3), with each RSE expressed as a proportion rather than a percentage.
Let c be the sampling correlation between pairs of y variables, which is unknown. For this simulation, it was assumed that c is equal to 0.5 (the extreme values are 0 and 1).
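Now, the squared SE of Y, denoted V, can be calculated for each area, using the national employment weights a1, a2, and a3; the assumed sampling correlation c; and the SEs s1, s2, and s3 derived from the BLS-reported RSEs. With the same correlation c assumed for every pair of components, the variance of the weighted sum is V = (a1 · s1)² + (a2 · s2)² + (a3 · s3)² + 2 · c · [(a1 · s1)(a2 · s2) + (a1 · s1)(a3 · s3) + (a2 · s2)(a3 · s3)].
Then, the RSE for each area is calculated as RSE = √V / Y.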
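The calculation for a single area can be sketched in a few lines of Python. This is a minimal illustration, not RTI's code: it assumes the standard variance formula for a weighted sum with one common pairwise correlation c, takes the RSE of Y to be the square root of V divided by Y, and expects RSEs as proportions (0.02 for 2 percent). The function name `area_rse` and the example inputs are hypothetical.

```python
import math

# National employment weights from the text:
# registered nurses, nursing aides, administrative support.
WEIGHTS = (0.56, 0.15, 0.29)


def area_rse(wages, rses, c=0.5, weights=WEIGHTS):
    """Return (Y, V, RSE of Y) for one area.

    wages: mean hourly wages y1, y2, y3
    rses:  relative standard errors as proportions (assumption)
    c:     assumed common sampling correlation between wage estimates
    """
    # Index value: weighted average wage.
    y_index = sum(a * y for a, y in zip(weights, wages))
    # Standard errors recovered from the RSEs: s_i = RSE_i * y_i.
    ses = [r * y for r, y in zip(rses, wages)]
    # Weighted standard errors a_i * s_i.
    terms = [a * s for a, s in zip(weights, ses)]
    # Own-variance terms plus covariance terms, with every pairwise
    # correlation set to c (assumed form of the variance).
    v = sum(t * t for t in terms)
    for i in range(len(terms)):
        for j in range(i + 1, len(terms)):
            v += 2.0 * c * terms[i] * terms[j]
    return y_index, v, math.sqrt(v) / y_index


# Hypothetical wages and RSEs for one area (not BLS data).
y_index, v, rse = area_rse((30.0, 12.0, 15.0), (0.02, 0.03, 0.01))
```

Setting c to 0 or 1 in this sketch gives the two extremes mentioned above: independent estimates, where only the own-variance terms remain, and perfectly correlated estimates, where V equals the square of the summed weighted SEs.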
CALCULATION OF RELIABILITIES FOR EACH AREA
To calculate reliabilities, an estimate of the between-area population (model) variance T was first obtained as follows:
(1) Areas with very large values of V were discarded to improve the efficiency of the estimator. RTI removed areas with V > 2 for nonmetropolitan areas (five areas) and V > 3 for metropolitan areas (five areas).
(2) T was estimated as T = SD(Y)² − mean(V), where SD(Y) is the standard deviation of the remaining Y values and mean(V) is the mean of the remaining V values.
Then, the reliability for each area was calculated as T/(V + T), where V is the area-specific quantity derived above.
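The two steps and the reliability ratio can be sketched as follows. This is an illustration under stated assumptions rather than RTI's implementation: it uses the sample variance (n − 1 denominator) for SD(Y)², which the text does not specify, and it computes a reliability for every area, including those excluded when estimating T. The function name `reliabilities` and the example values are hypothetical.

```python
import statistics


def reliabilities(y_values, v_values, v_cutoff):
    """Return (T, list of per-area reliabilities).

    y_values: index value Y for each area
    v_values: squared standard error V for each area
    v_cutoff: areas with V above this value are excluded when estimating T
    """
    # Step 1: keep only areas whose V does not exceed the cutoff.
    kept = [(y, v) for y, v in zip(y_values, v_values) if v <= v_cutoff]
    kept_y = [y for y, _ in kept]
    kept_v = [v for _, v in kept]
    # Step 2: between-area variance = observed variance of Y minus
    # the mean sampling variance (sample variance is an assumption).
    t = statistics.variance(kept_y) - statistics.mean(kept_v)
    # Reliability uses each area's own V (all areas, by assumption).
    return t, [t / (v + t) for v in v_values]


# Hypothetical Y and V values for five areas (not BLS data).
t, rel = reliabilities(
    [20.0, 22.0, 24.0, 26.0, 40.0],
    [0.1, 0.2, 0.3, 0.2, 5.0],
    v_cutoff=2.0,
)
```

In the analysis described above, the cutoff would be 2 for nonmetropolitan areas and 3 for metropolitan areas. If sampling noise exceeded the observed spread in Y, this estimator could produce a negative T; the sketch does not guard against that case.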
Table D-1 shows the distribution of metropolitan and nonmetropolitan areas by ranges of simulated RSEs in the weighted-average wage of the three common hospital occupations that together account for about half of total hospital employment. About three-quarters of both metropolitan and nonmetropolitan areas have RSEs of 1 to 3 percent. Less than 10 percent of metropolitan areas and less than 5 percent of nonmetropolitan areas have RSEs greater than 5 percent or have missing data. Fully 95 percent of non-missing metropolitan employment, and 90 percent of non-missing nonmetropolitan employment, for these occupations was located in areas with RSEs of 3 percent or less. Less than 1 percent of non-missing metropolitan employment, and less than 3 percent of non-missing nonmetropolitan employment, for these occupations was located in areas with RSEs greater than 5 percent.
Table D-2 shows the distribution of metropolitan and nonmetropolitan areas by ranges of simulated reliabilities in the weighted-average wage of the three common hospital occupations. A reliability of 90 percent means that 90 percent of the measured wage variation among areas is due to real wage differences among areas rather than sampling variation. If a reliability greater than 90 percent is considered “very good,” then the average wage estimates for about 90 percent of both metropolitan and nonmetropolitan areas have very good reliability. Fully 99 percent of non-missing metropolitan employment and 94 percent of non-missing nonmetropolitan employment for the three occupations are in areas with “very good” reliability of wage estimates. Another 3 percent of metropolitan areas and 5 percent of nonmetropolitan areas have reliabilities between 80 and 90 percent, which could be considered “acceptable.” Only 7 percent of metropolitan areas and 6 percent of nonmetropolitan areas have reliabilities of 80 percent or less or have missing data, and these areas account for only 0.4 percent of non-missing metropolitan employment and 3.8 percent of non-missing nonmetropolitan employment. (Areas with missing data are likely to have the lowest reliability.)
The statistical reliability of the BLS wage data is adequate for most metropolitan and nonmetropolitan areas (at least 90 percent of areas). For a small proportion of areas, the BLS data are not as reliable. For areas with less reliable data, steps that could be taken to improve the reliability of estimated wages include consolidating them with adjacent areas; increasing the proportion of sampled employers who respond to the BLS Occupational Employment Statistics survey; increasing the number of employers surveyed by BLS; and adding more years of data. For example, where nonmetropolitan wage data are less reliable, multiple sub-state nonmetropolitan areas could be consolidated into a single statewide nonmetropolitan area.

Table D-1. Distribution of metropolitan and nonmetropolitan areas by simulated relative standard error of the weighted-average wage

| Relative Standard Error (%) | Metropolitan Areas: Number | Metropolitan Areas: % of Areas | Metropolitan Areas: % of Employment | Nonmetropolitan Areas: Number | Nonmetropolitan Areas: % of Areas | Nonmetropolitan Areas: % of Employment |
|---|---|---|---|---|---|---|
| >5 to 10 | 12 | 3 | 0.8 | 4 | 2.3 | 2.2 |
| >4 to 5 | 12 | 3 | 0.8 | 6 | 3.5 | 2.7 |
| >3 to 4 | 41 | 10.3 | 3.1 | 13 | 7.6 | 5 |
| >2 to 3 | 117 | 29.3 | 15.7 | 50 | 29.1 | 24.8 |
| >1 to 2 | 177 | 44.3 | 58.3 | 91 | 52.9 | 62.8 |
| 0 to 1 | 18 | 4.5 | 21.2 | 4 | 2.3 | 2.6 |

Table D-2. Distribution of metropolitan and nonmetropolitan areas by simulated reliability of the weighted-average wage

| Reliability (%) | Metropolitan Areas: Number | Metropolitan Areas: % of Areas | Metropolitan Areas: % of Employment | Nonmetropolitan Areas: Number | Nonmetropolitan Areas: % of Areas | Nonmetropolitan Areas: % of Employment |
|---|---|---|---|---|---|---|
| >90 to 100 | 362 | 90.5 | 98.9 | 153 | 89 | 94 |
| >80 to 90 | 11 | 2.8 | 0.7 | 9 | 5.2 | 2.1 |
| >70 to 80 | 4 | 1 | 0.3 | 3 | 1.7 | 1.7 |
| >50 to 70 | 0 | 0 | 0 | 3 | 1.7 | 2.1 |
| 0 to 50 | 2 | 0.5 | 0.1 | 0 | 0 | 0 |