APPENDIX F
DATA ON PUBLICATION RECORDS
Data for these measures were provided by a subcontractor, Computer
Horizons, Inc. A detailed description of the derivation of these
measures and examples of their use is given in:
Francis Narin, Evaluative Bibliometrics: The
Use of Publications and Citations Analysis in the
Evaluation of Scientific Activity, Report to the
National Science Foundation, March 1976.
The following pages have been excerpted from Chapters VI and VII of
this report and describe operational considerations in compiling the
publication records included here (measure 15) and the methodology
used in determining the "influence" of published articles (measure 16).
VI. OPERATIONAL CONSIDERATIONS
A. Basics of Publication and Citation Analysis
The first section of this chapter discusses the major stages
of publication and citation analysis technique in evaluative
bibliometrics. Later sections of the chapter consider publica-
tion and citation count parameters in further detail, including
discussions of data bases, of field-dependent characteristics
of the literature, and of some cautions and hazards in perform-
ing citation analyses for individual scientists.
The basic stages which must be kept in mind when doing a
publication or citation analysis are briefly summarized in Figure
6-1.
1. Type of Publication
For a publication analysis the fundamental decision is
which type of publication to count. A basic count will include
all regular scientific articles. However, notes are often count-
ed since some engineering and other journals often contain notes
with significant technical content. Reviews may be included.
Letters-to-the-editor must also be considered as a possible cate-
gory for inclusion, since some important journals are sometimes
classified as letter journals. For example, publications in
Physical Review Letters were classified as letters by the Science
Citation Index prior to 1970, although they are now classified
as articles.
For most counts in the central core of the scientific lit-
erature, articles, notes and reviews are used as a measure of
scientific output. When dealing with engineering fields, where
many papers are presented at meetings accompanied by reprints
and published proceedings, meeting presentations must also be
considered. In some applied fields, e.g., agriculture, aerospace
and nuclear engineering, where government support has been
particularly comprehensive, the report literature may also be im-
portant. Unfortunately, reports generally contain few refer-
ences, and citations to them are limited so they are not amenable
to the normal citation analyses.
Books, of course, are a major type of publication, espec-
ially in the social sciences where they are often used instead
of a series of journal articles. In bibliometrics a weighting
of n articles equal to one book is frequently used; no uniform-
ly acceptable value of n is available. A few of the papers
discussed in Chapter V contain such measures.
[Figure 6-1, summarizing the basic stages of publication and citation analysis, appears here in the original.]
2. Time Spans
A second important decision in making a publication count
is to select the time span of interest. In the analysis of the
publications of an institution a fixed time span, usually one
year or more, is most appropriate. In comparing publication
histories of groups of scientists, their professional ages
(normally defined as years since attaining the PhD degree) must
be comparable so that the build-up of publications at the begin-
ning of a career or the decline at the end will not complicate
the results. A typical scientist's first publication appears
soon after his dissertation; if he continues working as a scientist,
his publications may continue for thirty or more years.
The accurate control of the time span of a count is not
as trivial as it might seem. Normally, the publication count is
made from secondary sources (abstracting or indexing services)
rather than from scanning the publications individually. Since
most abstracting and indexing sources have been expanding their
coverage over time, any publication count covering more than a
few years must give careful consideration to changes in coverage.
Furthermore, the timeliness of the secondary sources varies
widely, with sources dependent on outside abstracters lagging
months or even years behind. Since these abstracting lags may
depend upon language, field and country of origin, they are
a particular problem in international publication counts.
The Science Citation Index is one of the most current
secondary sources, with some 80% to 90% of a given year's publications
in the SCI for that year.
Of course, no abstracting or indexing service can be per-
fect, since some journals are actually published months after
their listed publication dates. Nevertheless, variations in
timeliness are large from one service to another.
3. Comprehensiveness of Source Coverage
An important consideration in making a publication count
is the comprehensiveness of the source coverage. Most abstract-
ing and indexing sources cover some journals completely, cover
other journals selectively, and omit some journals in their
field of interest. The Science Citation Index is an exception
in that it indexes each and every important entry from any jour-
nal it covers. This is one of the major advantages in using
the SCI as a data base. Chemical Abstracts and Biological
Abstracts have a group of journals which they abstract complete-
ly, coupled with a much larger set of journals from which they
abstract selectively, based upon the appropriateness of the
article to the subject coverage. In some cases the abstracter
or indexer may make a quality judgment, based on his estimate
of the importance or the quality of the article or upon his
knowledge of whether similar information has appeared elsewhere;
Excerpta Medica is a comprehensive abstracting service for which
articles are included only if they meet the indexers' quality
criteria.
Some data on the extent of coverage of the major secondary
sources are presented in Section D of this chapter.
4. Multiple Authorships and Affiliations
Attributing credits for multiple authorships and affili-
ations is a significant problem in publication and citation anal-
ysis. In some scientific papers the authors are listed alpha-
betically; in others the first author is the primary author;
still others use different conventions. These conventions have
been discussed by Crane1 and by other social scientists.2
There does not seem to be any reasonable way to deal with the
attribution problem, except to attribute a fraction of a publi-
cation to each of the authors. For example, an article which
has three authors would have one-third of an article attributed
to each author. The amount of multiple authorship unfortunately
differs from country to country and from field to field. Several
studies have investigated the problem, but no comprehensive
data exist.3
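The fractional attribution rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the author names and paper lists are hypothetical, and equal shares of 1/n per author are assumed, as in the three-author example in the text.

```python
from collections import defaultdict
from fractions import Fraction


def fractional_counts(papers):
    """Give each author 1/n of a paper that has n authors."""
    credit = defaultdict(Fraction)
    for authors in papers:
        share = Fraction(1, len(authors))  # equal share per author (assumed rule)
        for name in authors:
            credit[name] += share
    return dict(credit)


# Hypothetical author lists, one list per paper.
papers = [
    ["Jones", "Smith", "Lee"],  # each author receives 1/3
    ["Jones"],                  # sole author receives 1
    ["Smith", "Lee"],           # each author receives 1/2
]
counts = fractional_counts(papers)
for name, value in sorted(counts.items()):
    print(name, value)
```

Because every paper contributes exactly one unit of credit in total, the fractional counts of all authors add up to the number of papers, which is a convenient consistency check on any such tabulation.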
Multiple authorship takes on particular importance when
counting an individual's publications since membership on a
large research team may lead to a single scientist being a co-
author of ten or more publications per year. This number of
publications is far in excess of the normal publication rate
of one to two articles per year per scientist.
Multiple authorship problems arise less often in institutional
publication counts since there are seldom more than one
or two institutions involved in one publication.

1Diana Crane, "Social Structure in a Group of Scientists:
A Test of the 'Invisible College' Hypothesis," American Sociological
Review 34 (June 1969):335-352.
2James E. McCauly, "Multiple Authorship," Science 141
(August 1963):579; Beverly L. Clark, "Multiple Authorship Trends in
Scientific Papers," Science 143 (February 1964):822-824.
3Harriet Zuckerman, "Nobel Laureates in Science: Patterns
of Productivity, Collaboration, and Authorship," American
Sociological Review 32 (June 1967):391-403.

A particularly vexing aspect of multiple authorship is the
first author citation problem: almost all citations are to the
first author in a multi-authored publication. As a result, a
researcher who is second author of five papers may receive no
citations under his own name, even though the papers he co-authored
may be highly cited. Because of this, a citation count for a
person must account for the citations which appear under the
names of the first authors of publications for which the author
of interest was a secondary author. This can lead to a substan-
tial amount of tedious additional work, since a list of first
authors must be generated for all of the subjects' multi-author-
ed papers. Citations to each of these first authors must then
be found, the citations of interest noted, and these citations
fractionally attributed to the original author. Since multiple
years of the Citation Index are often involved, the amount of
clerical work searching from volume to volume and from author
to author, and citation to citation can be quite large.
A note of caution about the handling of multiple author-
ship in the Corporate Index of the Science Citation Index: SCI
lists a publication giving all the corporate affiliations, but
always with the first author's name. Thus a publication by
Jones and Smith where Jones is at Harvard and Smith is at Yale
would be listed in the Corporate Index under Harvard with the
name Jones and also under Yale with the name Jones. To find
the organization with which the various authors are affiliated,
the original article must be obtained.
Although the publisher of the Science Citation Index, the
Institute for Scientific Information, tries to maintain a con-
sistent policy in attributing institutional affiliations, when
authors have multiple affiliations the number of possible var-
iants is large. In the SCI data base on magnetic tape, suffic-
ient information is included to assign a publication with auth-
ors from a number of different institutions in a reasonably
fair way to those institutions; however, in the printed Corporate
Index one has to refer to the Source Index to find the
actual number of authors, or to the paper itself to find the
affiliations of each of the authors.
5. Completeness of Available Data
Another consideration in a publication analysis is the
completeness of data available in the secondary source, since
looking up hundreds or thousands of publications individually
is tedious and expensive. One difficulty here is that most of
the abstracting and indexing sources are designed for retrieval
and not for analysis. As a result, some of the parameters which
are of greatest analytical importance, such as the affiliation
of the author and his source of financial support, are often
omitted. Furthermore, some of the abstracting sources are
cross-indexed in complex ways, so that a publication may only
be partially described at any one point, and reference must be
made to a companion volume to find even such essential data
as the author's name. While intellectually trivial, these
searches can be exceedingly time consuming when analyzing large
numbers of publications.
The specific data which are consistently available in the
secondary sources are the basic bibliographic information: i.e.,
authors' name, journal or report title, volume, page, etc. This
information is the basic data used for retrieval, and since the
abstracting and indexing services are retrieval oriented, this
bibliographic information is always included.
Data which are less consistently available in the secondary
source are the authors' affiliation and the authors' rank
or title. Both of these are of interest in analysis. For ex-
ample, the ranking of universities based on publication in a
given subject area is often of interest. This ranking can be
tabulated only from a secondary source which gives the authors'
university affiliation.
6. Support Acknowledgements
The source of the authors' financial support is seldom
given in any secondary source, although it is now being added
to the MEDLARS data base. Since this financial data can be used
to define the fraction of a subject literature which is being
supported by a particular corporate body such as a governmental
agency, the data are of substantial evaluative interest.
The amount of acknowledgement of agency support in the
scientific literature has changed over time. In a Computer
Horizons study completed in 1973 the amount of agency support
acknowledgement was tabulated in twenty major journals from
five different fields.4 Table 6-1 summarizes those support
acknowledgements for 1969 and 1972.
In 1969, only 67% of the articles in 20 major journals
acknowledged financial support. By 1972, the percentage of
articles acknowledging financial support had risen to approximately
85%. The table shows that the sources of support differ
from one field to another and also shows that the fields of interest
to these sources differ as well. For example, the
National Science Foundation is the major source of acknowledged
support in mathematics, while the National Institutes of Health
clearly dominate the support of biology. Chemistry is the field
with the largest amount of non-government (private sector)
support in the U.S.
Note also that the 20 journals used were major journals
in their fields; as less prestigious journals are examined, the
amount of support acknowledgement generally decreases.
4Computer Horizons, Inc., Evaluation of Research in the
Physical Sciences Based on Publications and Citations, Washington,
D.C., National Science Foundation, Contract No. NSF-C627, November
1973.
[Table 6-1, summarizing agency support acknowledgements by field for 1969 and 1972, appears here in the original.]
In an attempt to account for the 15% of unacknowledged
papers, a questionnaire was sent to all U.S. authors in the 1972
sample who did not acknowledge agency support. Almost 70% of the
authors who had not listed sources of support responded to the
questionnaire. Of the authors who responded, over two-thirds
were supported by their institutions as part of their regular
duties; approximately 20% of the respondents cited specific
governmental agencies as sources of support, even though they
had not acknowledged these in the article itself. Twelve percent
of the respondents listed no agency or institutional support;
research done as fulfillment of graduate studies was included
in this category.
Overall, the 1972 tabulation and survey showed that 88%
of the research reported in these prestigious journals was ex-
ternally supported, and that 97% of the externally supported
work was acknowledged as such.
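The 88% and 97% figures can be reproduced, to rounding, from the numbers quoted above. The short sketch below is one plausible reading of that arithmetic, not the report's own calculation: it assumes that the questionnaire respondents are representative of all non-acknowledging authors, and that only the roughly 20% who cited specific governmental agencies count as externally supported. The variable names are illustrative.

```python
acknowledged = 0.85                   # share of 1972 articles acknowledging support
unacknowledged = 1.0 - acknowledged   # the remaining 15%
agency_share = 0.20                   # respondents who named a governmental agency

# Externally supported work that carried no acknowledgement (about 3% of articles).
hidden_external = unacknowledged * agency_share

# All externally supported work, acknowledged or not.
external_total = acknowledged + hidden_external

# Fraction of externally supported work that was acknowledged as such.
acknowledged_fraction = acknowledged / external_total

print(round(external_total * 100), round(acknowledged_fraction * 100))
```

Under these assumptions the two rounded percentages agree with the 88% and 97% stated in the text.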
7. Subject Classification
Having constructed a basic list of publications, the next
step in analysis is normally to subject classify the publica-
tions. Either the journals or the papers themselves may be
classified. When a large number of papers is to be analyzed,
classification of the papers by the field of the journal can
be very convenient. Such a classification implies, of course,
a degree of homogeneity of publication which is normally ade-
quate when analyzing hundreds of papers. Such a classification
may not be sufficient for the analysis of the scientific pub-
lications of one or a few individuals.
Subject classification schemes differ from one abstract-
ing and indexing service to another. Therefore, a comparison
of a collection of papers based on the classification schemes
of more than one abstracting and indexing service is almost
hopeless. A classification of papers at the journal level has
been used in the influence methodology discussed in Chapters
VII through X.
8. Citation Counts
Citation counts are a tool in evaluative bibliometrics
second in importance only to the counting and classification
of publications. Citation counts may be used directly as a
measure of the utilization or influence of a single publica-
tion or of all the publications of an individual, a grant, con-
tract, department, university, funding agency or country.
Citation counts may be used to link individuals, institutions,
and programs, since they show how one publication relates to
another publication.
In addition to these evaluative uses, citations also have
important bibliometric uses, since the references from one paper
to another define the structure of the scientific literature.
Chapter III discusses how this type of analysis may be carried
out at a detailed, micro-level to define closely related papers
through bibliographic coupling and co-citation. That chapter
also describes how citation analysis may be used at a macro-
level to link fields and subfields through journal-to-journal
mapping. The bibliometric characteristics of the literature
also provide a numeric base against which evaluative parameters
may be normalized.
Some of the characteristics of the literature which are
revealed by citation analysis are noted on Figure 6-1. These
characteristics include:
The dispersion of references: a measure
of scientific "hardness", since in fields
that are structured and have a central
core of accepted knowledge, literature
references tend to be quite concentrated.
The concentration of papers and influence:
another measure of centrality in a field,
dependent upon whether or not a field has
a core journal structure.
The hierarchic dependency relationships
between field, subfield and journals,
including the comparison of numbers of
references from field A to field B
compared with number of references from
field B to field A: this comparison pro-
vides a major justification for the pur-
suit of basic research as a foundation
of knowledge utilized by more applied
areas.
The linkages between fields, subfields
and journals: a measure of the flow of
information, and of the importance of
one sector of the scientific mosaic to
another.
VII. THE INFLUENCE METHODOLOGY
A. Introduction
In this chapter an influence methodology will be described
which allows advanced publication and citation techniques to be
applied to institutional aggregates of publications, such as
those of departments, schools, programs, support agencies and
countries, without performing an individual citation count. In
essence, the influence procedure ascribes a weighted average set
of properties to a collection of papers, such as the papers in
a journal, rather than determining the citation rate for the
papers on an individual basis.
The influence methodology is completely general, and can
be applied to journals, subfields, fields, institutions or countries.
There are three separate aspects of the influence method-
ology which are particularly pertinent to journals. These are
1. A subject classification for each journal
2. A research type (level) classification for
the biomedical journals, and
3. Citation influence measures for each journal.
It is the third of these, the citation influence measures, which
add a quality or utilization aspect to the analysis. The influ-
ence methodology assumes that, although citations to papers vary
within a given journal, aggregates of publications can be char-
acterized by the influence measures of the journals in which
they appear. Chapter IX discusses this assumption in some de-
tail.
Older measures of influence all suffer from some defect
which limits their use as evaluative measures.
The total number of publications of an individual, school
or country is a measure of total activity only; no inferences
concerning importance may be drawn.
The total number of citations to a set of publications,
while incorporating a measure of peer group recognition, de-
pends on the size of the set involved and has no meaning on an
absolute scale.
The journal "impact factor" introduced by Garfield is a
size-independent measure, since it is defined as the ratio of
the number of citations the journal receives to the number of
publications in a specified earlier time period.1 This
measure, like the total number of citations, has no meaning on an
absolute scale. In addition the impact factor suffers from three
more significant limitations. Although the size of the journal,
as reflected in the number of publications, is corrected for,
the average length of individual papers appearing in the journal
is not. Thus, journals which publish longer papers, namely review
journals, tend to have higher impact factors. In fact
the nine highest impact factors obtained by Garfield were for
review journals. This measure can therefore not be used to
establish a "pecking order" for journal prestige.

1Eugene Garfield, "Citation Analysis As a Tool in
Journal Evaluation," Science 178 (November 3, 1972):471.

The second limitation is that the citations are unweighted,
all citations being counted with equal weight, regardless of the
citing journal. It seems more reasonable to give higher weight
to a citation from a prestigious journal than to a citation from
a peripheral one. The idea of counting a reference from a more
prestigious journal more heavily has also been suggested by
Kochen.2
A third limitation is that there is no normalization for
the different referencing characteristics of different segments
of the literature: a citation received by a biochemistry journal,
in a field noted for its large numbers of references and short
citation times, may be quite different in value from a citation
in astronomy, where the overall citation density is much lower
and the citation time lag much longer.
In this section three related influence measures are developed,
each of which measures one aspect of a journal's influence,
with explicit recognition of the size factor. These
measures are:
(1) The influence weight of the journal:
a size-independent measure of the
weighted number of citations a jour-
nal receives from other journals,
normalized by the number of refer-
ences the journal gives to other jour-
nals.
(2) The influence per publication for the
journals: the weighted number of ci-
tations each article, note or review
in a journal receives from other
journals.
(3) The total influence of the journal:
the influence per publication times
the total number of publications.
2M. Kochen, Principles of Information Retrieval (New
York: John Wiley & Sons, Inc., 1974), 83.
B. Development of the Weighting Scheme
1. The Citation Matrix
A citation matrix may be used to describe the interactions
among members of a set of publishing entities. These entities
may, for example, be journals, institutions, individuals, fields
of research, geographical subdivisions or levels of research
methodology. The formalism to be developed is completely gener-
al in that it may be applied to any such set. To emphasize this
generality, a member of a set will be referred to as a unit
rather than as a specific type of unit such as a journal.
The citation matrix is the fundamental entity which con-
tains the information describing the flow of influence among
units.
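The matrix has the form
$$
C = \begin{pmatrix}
C_{11} & C_{12} & \cdots & C_{1n} \\
C_{21} & C_{22} & \cdots & C_{2n} \\
\vdots & \vdots & & \vdots \\
C_{n1} & C_{n2} & \cdots & C_{nn}
\end{pmatrix}
$$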
A distinction is made between the use of the terms "refer-
ence" and "citation" depending on whether the issuing or receiv-
ing unit is being discussed. Thus, a term Cij in the citation
matrix indicates both the number of references unit i gives to
unit j and the number of citations unit j receives from unit i.
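As an illustration of this convention, the following sketch assembles a small citation matrix from tallied reference counts. The unit names, the counts, and the helper name citation_matrix are all hypothetical and chosen only to make the row and column meanings concrete: row i holds the references unit i gives, and column j holds the citations unit j receives.

```python
def citation_matrix(units, reference_counts):
    """Return C as a list of rows, where C[i][j] is the number of
    references unit i gives to unit j (citations j receives from i)."""
    index = {name: pos for pos, name in enumerate(units)}
    n = len(units)
    C = [[0] * n for _ in range(n)]
    for (citing, cited), count in reference_counts.items():
        C[index[citing]][index[cited]] += count
    return C


# Hypothetical tallies: (citing unit, cited unit) -> number of references.
units = ["A", "B", "C"]
reference_counts = {
    ("A", "B"): 10, ("A", "C"): 5,
    ("B", "A"): 20, ("B", "C"): 5,
    ("C", "A"): 10, ("C", "B"): 10,
}
C = citation_matrix(units, reference_counts)

references_given = [sum(row) for row in C]           # row sums
citations_received = [sum(col) for col in zip(*C)]   # column sums
print(C, references_given, citations_received)
```

Every reference issued by one unit is a citation received by another, so the row sums and the column sums must have the same grand total.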
The time frame of a citation matrix must be clearly under-
stood in order that a measure derived from it be given its proper
interpretation. Suppose that the citation data are based on
references issued in 1973. The citations received may be to
papers in any year up through 1973. In general, the papers
issuing the references will not be the same as those receiving
the citations. Thus, any conclusions drawn from such a matrix
assume an on-going, relatively constant nature for each of the
units. For instance, if the units of study are journals, it is
assumed that they have not changed in size relative to each
other and represent a constant subject area. Journals in rapid-
ly changing fields and new journals would therefore have to be
treated with caution.
A citation matrix for a specific time lag may also be form-
ulated. This would link publications in one time period with
publications in some specified earlier time period.
2. Influence Weights
For each unit in the set a measure of the influence of
that unit will be extracted from the citation matrix. Because
total influence is clearly a size-dependent quantity, it is
essential to distinguish between a size-independent measure of
influence, to be called the influence weight, and the size-
dependent total influence.
To make the idea of a size-independent measure more pre-
cise, the following property of such a measure may be specified:
if a journal were randomly subdivided into smaller entities,
each entity would have the same measure as the parent journal.
The citation matrix may be thought of as an "input-
output" matrix with the medium of exchange being the citation.
Each unit gives out references and receives citations; it is
above average if it has a "positive citation balance", i.e.,
receives more than it gives out. This reasoning provides a
first order approximation to the weight of each unit, which is
$$
W_i^{(1)} = \frac{\text{total number of citations to the } i\text{th unit from other units}}{\text{total number of references from the } i\text{th unit to other units}}
$$
This is the starting point for the iterative procedure for the
calculation of the influence weights to be described below.
The denominator of this expression is the row sum
$$
S_i = \sum_{j=1}^{n} C_{ij}
$$
corresponding to the ith unit of the citation matrix; it may be
thought of as the "target size" which this unit presents to the
referencing world.
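A minimal sketch of this first order approximation, for a hypothetical three-unit matrix, is given below. It assumes that the diagonal (self-reference) entries are zero, so that the full row and column sums coincide with the "from other units" and "to other units" totals; the function name first_order_weights is illustrative.

```python
def first_order_weights(C):
    """Citations received (column sum) divided by the target size
    S_i (row sum) for each unit of the citation matrix C."""
    n = len(C)
    target_size = [sum(C[i]) for i in range(n)]                     # S_i
    received = [sum(C[k][i] for k in range(n)) for i in range(n)]   # cites to unit i
    return [received[i] / target_size[i] for i in range(n)]


# Hypothetical matrix: C[i][j] = references from unit i to unit j.
C = [
    [0, 10, 5],
    [20, 0, 5],
    [10, 10, 0],
]
weights = first_order_weights(C)
print(weights)
```

In this example the first unit receives twice as many citations as it gives references, so it has a positive citation balance and a first order weight above 1; the other two units fall below 1.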
The influence weight, Wi, of the ith unit is defined as
$$
W_i = \sum_{k=1}^{n} \frac{W_k C_{ki}}{S_i}
$$
In the sum, the number of cites to the ith unit from the
kth unit is weighted by the weight of kth (referencing) unit.
The number of cites is also divided by the target size Si of
the unit i being cited. The n equations, one for each unit, pro-
vide a self consistent "bootstrap" set of relations in which each
unit plays a role in determining the weight of every other unit.
The following summarizes the derivation of those weights.
The equations defining the weights,
$$
W_i = \sum_{k=1}^{n} \frac{W_k C_{ki}}{S_i}, \qquad i = 1, \ldots, n \tag{1}
$$
are a special case of a more general system of equations which
may be written in the form
$$
\sum_{k=1}^{n} W_k \gamma_{ki} - \lambda W_i = 0, \qquad i = 1, \ldots, n \tag{2}
$$
Here $\gamma_{ki} = C_{ki}/S_i$, and Equation 1 is seen to be
a special case of Equation 2 corresponding to $\lambda = 1$. As will be
explained shortly, the system of equations given in (2) will not,
in general, possess a non-zero solution; only for certain values
of $\lambda$, called the eigenvalues of the system, will there be non-zero
solutions.
With the choice of target size $S_i$, the value $\lambda = 1$ is
in fact an eigenvalue so that Equation 1 itself does possess a
solution.
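A short check of this claim may be helpful. It is a sketch added for the reader's convenience rather than part of the original derivation. It writes $\gamma_{ki} = C_{ki}/S_i$ for the coefficients of the system and $\lambda$ for the eigenvalue parameter, uses only the definition of the target size as the row sum $S_k = \sum_i C_{ki}$, and assumes that every $S_i$ is non-zero.

```latex
\sum_{i=1}^{n} \gamma_{ki}\, S_i
  = \sum_{i=1}^{n} \frac{C_{ki}}{S_i}\, S_i
  = \sum_{i=1}^{n} C_{ki}
  = S_k , \qquad k = 1, \ldots, n .
```

The vector of target sizes $(S_1, \ldots, S_n)$ is therefore reproduced by $\gamma$, so that 1 is an eigenvalue of $\gamma$. Since a matrix and its transpose have the same eigenvalues, the transposed system, which is Equation 1, also has a non-zero solution for $\lambda = 1$.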
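Using the notation $\gamma^T$ for the transpose of $\gamma$,
defined by
$$
\gamma^T_{ik} = \gamma_{ki},
$$
and introducing the Kronecker delta symbol
$$
\delta_{ik} = \begin{cases} 1, & i = k \\ 0, & i \neq k \end{cases}
$$
the equation can then be written
$$
\sum_{k=1}^{n} \left( \gamma^T_{ik} - \lambda \delta_{ik} \right) W_k = 0 . \tag{3}
$$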
This is a system of n homogeneous equations for the weights.
In order that a solution for such a system exists, the determin-
ant of the coefficients must vanish. This gives an nth order
equation for the eigenvalues
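$$
\begin{vmatrix}
\gamma_{11} - \lambda & \gamma_{21} & \cdots & \gamma_{n1} \\
\gamma_{12} & \gamma_{22} - \lambda & \cdots & \gamma_{n2} \\
\vdots & \vdots & & \vdots \\
\gamma_{1n} & \gamma_{2n} & \cdots & \gamma_{nn} - \lambda
\end{vmatrix} = 0 \tag{4}
$$
called the characteristic equation.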
Only for values of $\lambda$ which satisfy this equation does
a non-zero solution for the W's exist. Moreover, Equation 3
does not determine the values of the $W_k$ themselves, but at best
determines their ratios. Equivalently the eigenvalue equation
may be thought of as a vector equation for the vector unknown
$W = (W_1, \ldots, W_n)$,
$$
\gamma^T W = \lambda W \tag{5}
$$
from which it is clear that only the direction of W is determined.
The normalization or scale factor is then fixed by the
condition that the size-weighted average of the weights is 1, or
$$
\frac{\sum_{k=1}^{n} S_k W_k}{\sum_{k=1}^{n} S_k} = 1 \tag{6}
$$
This normalization assures that the weight values have an absolute
as well as a relative meaning, with the value 1 representing an
average value.
Each root of the characteristic equation determines a solu-
tion vector or eigenvector of the equation, but the weight vector
being sought is the eigenvector corresponding to the largest
eigenvalue. This can be seen from the consideration of an alter-
native procedure for solving the system of equations, a procedure
which also leads to the algorithm of choice.
Consider an iterative process starting with equal weights
for all units. The values $W_i^{(0)} = 1$ can be thought of as
zeroth order approximations to the weights. The first order
weights are then
$$
W_i^{(1)} = \sum_{k=1}^{n} \frac{C_{ki}}{S_i}
$$
This ratio (total cites to a unit divided by the target size
of the unit) is the simplest size-corrected citation measure
and, in fact, corresponds to the impact measure used by Garfield.
These values are then substituted into the right hand side of
Equation 1 to obtain the next order of approximation. In gener-
al, the mth order approximation is
$$
W_i^{(m)} = \sum_{k=1}^{n} W_k^{(m-1)} \frac{C_{ki}}{S_i} = \sum_{k=1}^{n} W_k^{(m-1)} \gamma_{ki}
$$
The exact weights are therefore
$$
W_i = W_i^{(\infty)} = \sum_{j=1}^{n} \left( \lim_{m \to \infty} \gamma^m \right)_{ji}
$$
This provides the most convenient numerical procedure for finding
the weights, the whole iteration procedure being reduced to successive
squarings of the $\gamma$ matrix.
This procedure is closely related to the standard method
for finding the dominant eigenvalue of a matrix. Since $\lambda = 1$
is the largest eigenvalue, repeated squarings are all that is
needed. If the largest eigenvalue had a value other than 1, the
normalization condition, Equation 6, would have to be reimposed
with each squaring. Convergence to three decimal places usually
occurs with six squarings, corresponding to raising $\gamma$ to the
64th power.
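The repeated-squaring procedure can be sketched as follows for a small hypothetical citation matrix. This is an illustration under stated assumptions, not the original program: the matrix entries and function names are invented, every unit is assumed to give at least one reference (so no target size is zero), and the coefficient matrix is taken to be gamma[k][i] = C[k][i] / S[i], with the weights read off as column sums of the squared matrix.

```python
def matmul(A, B):
    """Multiply two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def influence_weights(C, squarings=6):
    """Approximate the influence weights by squaring gamma `squarings` times."""
    n = len(C)
    S = [sum(row) for row in C]  # target sizes S_i (row sums of C)
    # gamma[k][i] = C_ki / S_i
    gamma = [[C[k][i] / S[i] for i in range(n)] for k in range(n)]
    power = gamma
    for _ in range(squarings):   # after the loop: gamma ** (2 ** squarings)
        power = matmul(power, power)
    # W_i = sum over j of (gamma^m)_ji, i.e. the column sums of the power
    return [sum(power[j][i] for j in range(n)) for i in range(n)]


# Hypothetical matrix: C[i][j] = references from unit i to unit j.
C = [
    [0, 10, 5],
    [20, 0, 5],
    [10, 10, 0],
]
W = influence_weights(C)
S = [sum(row) for row in C]
size_weighted_mean = sum(s * w for s, w in zip(S, W)) / sum(S)
print([round(w, 3) for w in W], round(size_weighted_mean, 3))
```

For this example the exact solution of Equation 1 under the normalization of Equation 6 is W = (18/11, 10/11, 7/11), and six squarings reproduce it well beyond three decimal places. The size-weighted mean stays at 1 without renormalization, as expected when the dominant eigenvalue is 1.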