
Proceedings of the International Conference on Scientific Information -- Two Volumes (1959)



The Relation Between Completeness and Effectiveness of a Subject Catalogue

C.S.SABEL

Anyone who has frequently undertaken literature searches will have used the references in material already found as a lead to further material relevant to the subject in which the search is being made. This experience suggests that it might be interesting to investigate the material that can be retrieved in this manner in order to see whether too much effort was perhaps being put into aiming at 100% storage of material and at 100% retrieval of the items (regarded as documents) stored.

The preliminary investigation described here analysed the references contained in the documents from three different sources dealing with the same subject (Controlled thermonuclear reactions).

The A.E.R.E. unclassified reports on radioactivation analysis were also examined to see how far the results from these reports agreed with those obtained from the documents on controlled thermonuclear reactions.

Analysis of references

The 98 documents on controlled thermonuclear reactions studied in this investigation were in three classes. These were: (a) 27 Atomic Energy Research Establishment unclassified reports, (b) 20 published articles by A.E.R.E. authors, other than unclassified reports, (c) 51 United States Atomic Energy Commission unclassified reports, listed in a bibliography prepared by the U.S.A.E.C.

For the purpose of this study, these could be regarded as representing, in each class, a 100% sample of the documents dealing with the subject.

The number of times a document was quoted as a reference in other documents is set out in Table 1, which shows, for example, that seventy-seven documents were not quoted as a reference in any other document, twelve documents were quoted as a reference in one other document, etc.

C.S.SABEL Atomic Energy Research Establishment, Harwell, England.


TABLE 1
All documents on controlled thermonuclear reactions

No. of documents    Times quoted as reference
77                  0
12                  1
3                   2
6                   3

Considering separately within themselves the three classes of documents on controlled thermonuclear reactions, we have Tables 2–4.

TABLE 2
A.E.R.E. unclassified reports

No. of documents    Times quoted as reference
19                  0
4                   1
4                   2

TABLE 3
A.E.R.E. authors' papers

No. of documents    Times quoted as reference
11                  0
4                   1
5                   2

TABLE 4
U.S.A.E.C. unclassified reports

No. of documents    Times quoted as reference
49                  0
2                   1

The references in A.E.R.E. unclassified reports on radioactivation analysis, studied as a comparison, gave the breakdown shown in Table 5.

TABLE 5

No. of documents    Times quoted as reference
19                  0
4                   1
0                   2
0                   3
1                   4

A 25th report, a bibliography, was excluded from Table 5 as the presence of a bibliography in a subject field will obviously have a considerable influence on a search, provided its existence is known. In this case, as the bibliography was recent, none of the other reports included it as a reference. The bibliography included 17 of the 24 reports.
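As a rough check on the tables above, the following Python sketch recomputes the total number of documents in each table and the proportion never quoted as a reference. Only the counts come from Tables 1 to 5; the dictionary layout and the function names `total_documents` and `uncited_fraction` are illustrative assumptions, not part of the original study.

```python
# Counts transcribed from Tables 1 to 5: times quoted -> number of documents.
TABLES = {
    "Table 1 (all documents)": {0: 77, 1: 12, 2: 3, 3: 6},
    "Table 2 (A.E.R.E. reports)": {0: 19, 1: 4, 2: 4},
    "Table 3 (A.E.R.E. authors' papers)": {0: 11, 1: 4, 2: 5},
    "Table 4 (U.S.A.E.C. reports)": {0: 49, 1: 2},
    "Table 5 (radioactivation analysis)": {0: 19, 1: 4, 2: 0, 3: 0, 4: 1},
}


def total_documents(table):
    """Total number of documents covered by one table."""
    return sum(table.values())


def uncited_fraction(table):
    """Proportion of documents not quoted as a reference in any other document."""
    return table[0] / total_documents(table)


if __name__ == "__main__":
    for name, table in TABLES.items():
        print(f"{name}: {total_documents(table)} documents, "
              f"{uncited_fraction(table):.0%} never quoted")
```

The totals reproduce the class sizes given in the text (98, 27, 20 and 51 documents, plus the 24 radioactivation analysis reports), which is a useful consistency check on the transcription.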

Consideration of Tables 1 to 5

The very high proportion of documents which are not included as references in any other document suggests that a complete retrieval from only some of the documents in a given category is unlikely.

The results were a disappointment from the statistical viewpoint in that it was hoped that they would be amenable to analysis as, possibly, a binomial distribution, in which case p = k/n, where p is the probability of one document being quoted in another document, n is the total number of documents, and k is a constant. With a value for k, it would be possible, by weighting the above tables with results from tables of the frequency of quotations in documents, to arrive at, say, a 95% certainty of obtaining 100% of the documents by choosing some number less than the total number of documents and examining the references within these documents. However, the sample above was not large enough for one to be statistically categorical, and the values of k that were obtained were not consistent. Intuitively it does appear to be true that the probability of a document being quoted in another document is inversely proportional to the total number of documents.

Further analysis

In addition, the results were studied to see whether chains of references existed leading from a few to many documents, and to see how far references in one class of document led to documents in another class.

It was found that for the documents on controlled thermonuclear reactions there were no “key” references which led to a large number of others in the subject, and no document was quoted as a reference in each of the three classes of document. In particular, references tended to be restricted to their own class of document. For example, there was no reference in the A.E.R.E. reports or published papers to give entry to the U.S.A.E.C. reports and only one reference in the U.S.A.E.C. reports to an A.E.R.E. published paper.
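The binomial model mentioned in the consideration of Tables 1 to 5 can be illustrated with a short Python sketch. It assumes, as one possible reading of the text, that each document is quoted independently by each of the other n - 1 documents with probability p = k/n, and it estimates p from the mean number of quotations in Table 1. The function names and the estimation method are assumptions made for illustration; the paper does not state how its values of k were obtained.

```python
from math import comb

# Table 1 of the paper: times quoted -> number of documents.
TABLE_1 = {0: 77, 1: 12, 2: 3, 3: 6}


def estimate_p(table):
    """Estimate p as (total quotations) / (number of ordered document pairs).

    Assumption: every document could be quoted by each of the other n - 1.
    """
    n = sum(table.values())
    total_quotations = sum(times * docs for times, docs in table.items())
    return total_quotations / (n * (n - 1))


def estimate_k(table):
    """Constant k in p = k/n, derived from the estimated p."""
    n = sum(table.values())
    return estimate_p(table) * n


def expected_counts(table, max_times):
    """Expected number of documents quoted j times, for j = 0..max_times,
    if quotations followed a Binomial(n - 1, p) distribution."""
    n = sum(table.values())
    p = estimate_p(table)
    trials = n - 1
    return {
        j: n * comb(trials, j) * p**j * (1 - p) ** (trials - j)
        for j in range(max_times + 1)
    }


if __name__ == "__main__":
    print(f"estimated k = {estimate_k(TABLE_1):.3f}")
    for j, expected in expected_counts(TABLE_1, 3).items():
        print(f"quoted {j} times: observed {TABLE_1[j]}, expected {expected:.1f}")
```

Under these assumptions the model expects fewer unquoted documents than the 77 observed in Table 1, which may be one illustration of why the data were not amenable to this kind of analysis.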
Conclusions

Tables 1 to 5 show that the references in an incomplete list of documents are unlikely to indicate more than a small proportion of the remaining documents, and hence one is unlikely to be justified in retrieving less than all the documents that can be found from a subject catalogue.

The further analysis of the references also shows that, in the subject studied, there is no possibility of selecting documents for storage which would in themselves indicate most of the material one would wish to retrieve. The results also indicate the necessity, in planning an information service, of ensuring adequate retrieval of all types of relevant documents.

This is only a preliminary survey and further subject fields and different types of documents should be analysed to see if the restriction of references to documents within their own class is as marked in other fields.
