OCR for page 157

Appendix F
A Multivariate Analysis of Potential
Biases in the Final Evaluation Scores
Because bivariate relationships can be obscured if the data generating
processes are multivariate, the data were also examined using a multi-
variate regression approach. As was also true in the case of the bivariate
statistical analyses, the multivariate model was designed to explore the
statistical significance of potential sources of bias in the determination of
National Sea Grant Office (NSGO) Final Evaluation Review (FE) scores. Thus the
model did not include measures of program accomplishments and success,
but instead assumed that the Program Assessment Team (PAT) and
FE scores provide accurate assessments of program quality according to
the assessment criteria, but might be subject to random errors associated
with: differences between Cycle 1 and Cycle 2; the number of years that
particular NSGO program officers are associated with particular Sea
Grant programs; program maturity; the size of state and federal budget
allocations awarded to programs; the within-cycle order of review of
programs; and the number of years that particular program officers have
served as program officers. The general linear model that was estimated
can be represented by:
\[
FE_{ij} = f\left(\text{Cycle}_{j},\ \text{PO Continuity}_{ij},\ \text{Program Maturity}_{ij},\ \text{State Budget}_{ij},\ \text{Federal Budget}_{ij},\ \text{Order of Review}_{i},\ \text{PO Seniority}_{ij}\right)
\]
where Cycle_j is a binary variable used to differentiate between scores
awarded in Cycle 1 and Cycle 2; PO Continuity is the number of years that
a particular NSGO program officer is assigned to the ith individual Sea
Grant program during the jth review cycle; Program Maturity is the number
of years that elapsed between the initial chartering of the ith individual
Sea Grant program and the jth review cycle; State Budget is the
average state budget allocated to the ith individual Sea Grant program for
2000 through 2002 for observations from Cycle 1 and the 2003 budget for
Cycle 2; Federal Budget is the average federal budget allocated to the ith
individual Sea Grant program for 2000 through 2002 for observations
from Cycle 1 and the 2003 budget for Cycle 2; Order of Review is a pair of
binary variables used to differentiate between individual Sea Grant pro-
grams reviewed in the first or second year of each cycle from those that
were reviewed in the third or fourth year of that cycle; and PO Seniority is
a set of binary variables used to differentiate between individual Sea
Grant programs that were reviewed by program officers with 1 year or less,
2 to 3 years, 4 to 10 years, or more than 10 years of experience as program officers.
With observations from Cycle 1 and Cycle 2, there were 44 observations
available to use in the analysis. The initial model coefficient estimates are:
Variable                       Coefficient   Standard Error   P-value
Intercept                      2.723         0.608            0.000
Cycle Dummy                    0.068         0.156            0.667
PO Continuity                  -0.087        0.038            0.029
Program Maturity               -0.040        0.019            0.046
State Budget                   1.17E-07      2.32E-07         0.617
Federal Budget                 7.68E-08      1.11E-07         0.493
Prog Reviewed in Year 1        0.049         0.186            0.793
Prog Reviewed in Year 2        0.088         0.162            0.591
PO Experience < or = 1 year    -0.277        0.411            0.505
PO Experience 2 to 3 years     0.093         0.212            0.664
PO Experience 4 to 10 years    -0.117        0.146            0.428
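
The binary coding of PO Seniority described above can be sketched as follows. This is an illustration only: the function name `po_seniority_dummies` and the dictionary keys are hypothetical, and the handling of fractional years is an assumption. Program officers with more than 10 years of experience are treated as the reference category, which is what the three PO Experience rows in the table imply.

```python
def po_seniority_dummies(years):
    """Map years of experience to three 0/1 indicators (hypothetical helper).

    The reference category, coded as all zeros, is more than 10 years
    of experience as a program officer.
    """
    return {
        "exp_le_1": int(years <= 1),
        "exp_2_to_3": int(1 < years <= 3),
        "exp_4_to_10": int(3 < years <= 10),
    }


# A program officer with 2 years of experience falls in the second category;
# one with 15 years falls in the reference category (all indicators zero).
print(po_seniority_dummies(2))
print(po_seniority_dummies(15))
```

Each coefficient on such an indicator is then read as the average score difference relative to the reference category, holding the other variables constant.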
The structure of the model can be viewed as an attempt to explain
variations in FE scores for the individual programs using information or
proxy information for potential sources of bias that were suggested by the
individual Sea Grant program directors. Thus, if the model were to pro-
vide accurate predictions of the FE scores, there would be evidence to
support the concerns of the individual Sea Grant program directors. The
value of R² (0.292) indicates that the estimated model accounts for 29.2
percent of the observed variation in FE scores. The F-statistic (1.359) is
used to test whether the model estimates provide a statistically significant
improvement over simply using the average of all FE scores as a predictor.
The null hypothesis for the test is that the sum of squared deviations
of the estimates is not significantly different from the sum of squared
deviations about the mean. Because the p-value for this test (0.242), the
probability of observing an F-statistic at least this large if the null
hypothesis were true, is greater than 0.05, the null hypothesis cannot be
rejected.
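
The reported F-statistic can be approximately reproduced from R² alone. The sketch below assumes the usual overall F-test for a regression with an intercept, 44 observations, and 10 explanatory variables; the helper names are hypothetical, and the tail probability is obtained by simple numerical integration rather than by whatever routine produced the reported value. Because R² is rounded to three decimals, the results should land close to, not exactly on, 1.359 and 0.242.

```python
import math


def regularized_incomplete_beta(x, a, b, steps=2000):
    """I_x(a, b) by Simpson's rule; adequate here because a > 1, b >= 1, x < 1."""
    h = x / steps

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    beta = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return (total * h / 3) / beta


def f_survival(f_stat, d1, d2):
    """P(F > f_stat) for an F distribution with (d1, d2) degrees of freedom."""
    return regularized_incomplete_beta(d2 / (d2 + d1 * f_stat), d2 / 2, d1 / 2)


n, k, r2 = 44, 10, 0.292                      # values taken from the text
f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))  # overall F-statistic
p_value = f_survival(f_stat, k, n - k - 1)
print(round(f_stat, 3), round(p_value, 3))
```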
Although the overall model performance does not lend credence to
the hypothesized biases, it is instructive to look at the model coefficients.
The coefficients are the partial derivatives of the model with respect to the
explanatory variables. That is, the coefficients are the estimated changes
in the value of the FE score for a marginal increase in the associated
explanatory variable, holding the value of all other explanatory variables
constant.
The coefficient associated with the Cycle dummy suggests that there
has been an average increase of 0.068 points in the scores of programs in
Cycle 2 relative to the scores of programs in Cycle 1. This increase could
be due to across-the-board degradation in the programs or tougher grad-
ing, but the difference could also have resulted from pure chance. Indeed,
the probability that a value of 0.068 could have been observed even if the
truth were that there is no effect is 0.667; consequently, it can be con-
cluded that the estimated difference is not significantly different from
zero.
The PO Continuity variable is associated with a coefficient of -0.087.
This suggests that for each additional year that a particular program of-
ficer spends working with a particular Sea Grant program, the average FE
score falls by 0.087 points. This is consistent with public testimony that
suggested that the scores would be lower for individual Sea Grant programs
that enjoyed longer working relationships with their program officers.
Consequently, the relevant null (no effect) hypothesis is that this
coefficient is greater than or equal to zero. Because the probability of
observing an estimate of -0.087 if the true value of this coefficient were
greater than or equal to zero is 0.014, the null hypothesis can be rejected.
That is, there is statistical support for the assertion that individual Sea
Grant programs with long-term relationships with their program officers
scored lower than programs with less program officer continuity.
The coefficient associated with the Program Maturity variable (-0.040)
suggests that for every additional year of age, program scores decline by
0.040 points. Because testimony suggested that there is an inverse rela-
tionship between program age and the FE score, the null hypothesis is
that the estimated coefficient is greater than or equal to zero. Because the
probability that we would observe an estimate of -0.040 if the true value
of the coefficient were greater than or equal to zero is 0.023,¹ the null
hypothesis can be rejected; there is statistical support for the assertion
that mature programs are scored lower than newer programs.

¹ The p-value for a 1-tail test is one half the magnitude of the p-value for a 2-tail test;
Excel's regression output defaults to a 2-tail p-value.
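
The halving rule for 1-tail p-values can be illustrated with the PO Continuity estimate from the initial model. This is a sketch under stated assumptions: the reported p-values are taken to come from t-tests with 44 - 10 - 1 = 33 residual degrees of freedom, the function names are hypothetical, and the tail area is computed by numerical integration of the Student's t density. Because the published coefficient and standard error are rounded, the results only approximate the reported 0.029 (2-tail) and 0.014 (1-tail).

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def one_tail_p(t_stat, df, steps=2000):
    """P(T > |t_stat|), using Simpson's rule on [0, |t_stat|]."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


df = 44 - 10 - 1            # assumed residual degrees of freedom
t_stat = -0.087 / 0.038     # coefficient divided by its standard error
p_one = one_tail_p(t_stat, df)
p_two = 2 * p_one           # 2-tail value, as reported in regression output
print(round(p_one, 3), round(p_two, 3))
```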
The coefficients associated with the magnitude of state and federal
budgets allocated to the individual Sea Grant programs indicate that programs
with larger budgets earn higher scores, but the effect is minuscule:
a $1 increase in the individual program's state budget is associated with
an increase of 1.17E-07 in the score, and a $1 increase in the individual
program's federal budget is associated with an increase of 7.68E-08 in the
score. That is, to increase the score by 1 point, the individual program's
state budget would need to be increased by about $8.5 million or the
individual program's federal budget would need to be increased by about
$13 million. Moreover, the standard errors of the coefficient estimates are
so large that neither effect is statistically distinguishable from zero:
the p-values for the state and federal budget coefficients are 0.617 and
0.493, respectively.
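
The budget arithmetic follows directly from the linear form of the model: the budget change implied by a given score change is the score change divided by the coefficient. The short sketch below uses the two coefficients from the initial model; the function name is hypothetical. It evaluates a full 1-point change, and the amounts scale proportionally, so a 0.1-point change corresponds to one tenth of these figures.

```python
STATE_COEF = 1.17e-07    # score change per $1 of state budget (initial model)
FEDERAL_COEF = 7.68e-08  # score change per $1 of federal budget (initial model)


def dollars_for_score_change(delta_score, coef):
    """Budget change implied by a linear model for a given change in score."""
    return delta_score / coef


state = dollars_for_score_change(1.0, STATE_COEF)
federal = dollars_for_score_change(1.0, FEDERAL_COEF)
print(f"state: ${state / 1e6:.1f} million, federal: ${federal / 1e6:.1f} million")
# prints: state: $8.5 million, federal: $13.0 million
```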
The effect of Order of Review is represented by two binary variables,
so the influence of order of review must consider both coefficients to-
gether. The appropriate test is an F-test that compares the predictive abil-
ity of the model presented above and a model that differs from the above
model by excluding the two binary variables used to represent the order
of review. The probability that the order of review has no statistically
significant influence on the FE score is 93 percent.
The effect of PO Seniority is represented by three binary variables,
each of which represents the average difference in scores awarded to
programs with the most senior program officers relative to the scores
awarded to programs with one of the three categories of less experienced
program officers. The statistical significance of the influence of program
officer seniority is tested with an F-test similar to the test applied for Order
of Review. The probability that program officer seniority has no statisti-
cally significant influence on the FE score is 64 percent.
Because preliminary analysis failed to eliminate the possibility that
PO Continuity or Program Maturity exercise statistically significant influ-
ence on FE scores, the model was respecified using only those variables as
explanations of the observed variation in final scores. The restricted model
coefficient estimates are:

Variable            Coefficient   Standard Error   P-value
Intercept           2.504         0.530            2.74E-05
PO Continuity       -0.079        0.029            0.010
Program Maturity    -0.023        0.015            0.137
Although the value of R² (0.184) for this simpler model is smaller than
the R² for the initial model (0.292), the difference in model performance is
not statistically significant.²
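
The comparison between the initial and restricted models can be sketched as a partial F-test on the two R² values. The code assumes the standard nested-model F-test with 44 observations, 10 variables in the initial model, and 8 variables dropped; the function names are hypothetical and the tail probability is computed by numerical integration. With the rounded R² values, the result should be close to the footnoted probability of 0.747.

```python
import math


def regularized_incomplete_beta(x, a, b, steps=2000):
    """I_x(a, b) by Simpson's rule; adequate here because a > 1, b >= 1, x < 1."""
    h = x / steps

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    beta = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return (total * h / 3) / beta


def partial_f_test(r2_full, r2_restricted, dropped, n, k_full):
    """Return (F, p) for dropping `dropped` variables from the full model."""
    d2 = n - k_full - 1  # residual degrees of freedom of the full model
    f_stat = ((r2_full - r2_restricted) / dropped) / ((1 - r2_full) / d2)
    p = regularized_incomplete_beta(d2 / (d2 + dropped * f_stat), d2 / 2, dropped / 2)
    return f_stat, p


f_stat, p_value = partial_f_test(0.292, 0.184, dropped=8, n=44, k_full=10)
print(round(f_stat, 3), round(p_value, 3))
```

A large p-value here means the eight dropped variables add little explanatory power beyond PO Continuity and Program Maturity.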
In the restricted model, the coefficient (-0.079) associated with the PO
Continuity variable suggests that for each additional year that a particular
program officer spends working with a particular individual Sea Grant
program, the average FE score falls (is improved) by 0.079 points. Again,
because public testimony suggested that the scores would be lower for
programs that enjoyed longer working relationships with their program
officers, the null (no effect) hypothesis is that this coefficient is greater
than or equal to zero. Because the probability of observing an estimate
of -0.079 if the true value of this coefficient were greater than or equal to
zero is 0.005, the null hypothesis can be rejected. That is, there is again
statistical support for the assertion that individual Sea Grant programs
that have enjoyed long-term relationships with their program officers
scored lower (better) than programs with less program officer continuity.
The coefficient associated with the Program Maturity variable (-0.023)
suggests that for every additional year of age, program scores decline by
0.023 points. Because testimony suggested that there is an inverse rela-
tionship between program age and the FE score, the null hypothesis is
that the estimated coefficient is greater than or equal to zero. However,
because there is a 0.069 probability of observing an estimate of -0.023 even
if the true value of the coefficient were greater than or equal to zero, the
null hypothesis cannot be rejected; thus there is insufficient statistical
support for the assertion that mature programs are scored lower than
newer programs.
The results of the restricted model suggest that the model could be
further simplified without statistically significant loss of performance.
The coefficient estimates for a simple linear regression model are:
Variable         Coefficient   Standard Error   P-value
Intercept        1.715         0.103            0.000
PO Continuity    -0.077        0.030            0.013

² If the true difference in performance between the initial model and the restricted model
were zero, the probability of observing a decrease in model fit this large with the
elimination of 8 explanatory variables is 0.747.

Although the value of R² (0.138) for this model is again smaller than
the R² for the initial model (0.292), the difference in model performance is
not statistically significant.³
In this model, the PO Continuity variable is associated with a coeffi-
cient of -0.077, suggesting that for each additional year that a particular
program officer spends working with a particular individual Sea Grant
program, the average FE score falls (improves) by 0.077 points. Again,
because public testimony suggested that the scores would be lower for
Sea Grant Colleges and Institutes that enjoyed longer working relationships
with their program officers, the null (no effect) hypothesis is that
this coefficient is greater than or equal to zero. Because the probability
of observing an estimate of -0.077 if the true value of this coefficient
were greater than or equal to zero is only 0.007, the null hypothesis can be
rejected. That is, there is again statistical support for the assertion that
individual Sea Grant programs with long-term relationships with program
officers are scored lower (better) than programs with less program
officer continuity.
In summary, the results of the multivariate analysis are generally
consistent with the results of the bivariate analyses and do not support
the suggestions that the FE scores are biased as a result of program officer
seniority, program funding levels, program maturity, order of review
within a cycle, or between Cycle 1 and Cycle 2. However, there is persis-
tent and statistically significant evidence that program officer continuity
with the individual Sea Grant program is inversely related to the FE score.
Indeed, there is less than a 0.007 probability of observing an estimate as
large as |-0.077| if the true value of the coefficient were zero.
The analysis suggests that knowing how long a program officer has
been assigned to a state program carries information that is reflected in
the FE scores, but the analysis does not identify whether the observed
effect is a consequence of program officers representing the program during
the PAT or FE, of program officers helping to mentor the
individual Sea Grant programs, or of some other cause. While an effect of
0.077 points seems small, in 2004-05 the average difference between Category
1A and Category 1B was 0.13 points, approximately the predicted magnitude
of a two-year difference in the length of time that a particular
program officer is assigned to a particular individual Sea Grant program.
The average difference between Category 1B and Category 1C is of a
similar magnitude. Thus for two otherwise identical individual Sea Grant
programs that deserve to be rated in Category 1A, one with a new program
officer and one with a program officer who has been with an individual
Sea Grant program for 4 years, the program with the new officer
would be expected to score 0.307 points higher (worse), a difference large
enough to move it from Category 1A to Category 1C.

³ If the true difference in performance between the initial model and the restricted model
were zero, the probability of observing a decrease in model fit this large with the
elimination of 9 explanatory variables is 0.622.
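
The category comparison can be checked with a few lines of arithmetic. The sketch assumes the rounded simple-model coefficient (0.077 points per year) and, since the text says only that the Category 1B to 1C gap is of similar magnitude, treats that gap as equal to the 0.13-point 1A to 1B gap; both the names and that equality are assumptions. The rounded coefficient gives 0.308 for a four-year difference, which matches the reported 0.307 up to rounding of the coefficient.

```python
COEF_PER_YEAR = 0.077  # magnitude of the PO Continuity coefficient (simple model)
GAP_1A_1B = 0.13       # average Category 1A to 1B difference, 2004-05
GAP_1B_1C = 0.13       # assumption: "similar magnitude" taken as equal


def predicted_gap(years_apart, coef=COEF_PER_YEAR):
    """Predicted score difference for a given difference in PO continuity."""
    return years_apart * coef


two_year = predicted_gap(2)
four_year = predicted_gap(4)
spans_two_categories = four_year > GAP_1A_1B + GAP_1B_1C
print(round(two_year, 3), round(four_year, 3), spans_two_categories)
# prints: 0.154 0.308 True
```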
