SUMMARY OF DISCUSSION
Professor Eric de Grolier, Panel Chairman for Area 4, began the discussion of comparative characteristics of existing systems by drawing an analogy between the evolution of systems for transportation over water and those for scientific communication. Man began, in the one case, by such “pedestrian” methods as wading and swimming, followed by the use of artifacts, the development of galleys and sails, and eventually the steam, diesel, and atom-powered ships of our latter day. Similarly, systems for communication evolved from oral methods through the use of artifacts such as books carved in stone, through the development of printing to mechanical methods of handling information and again, with recent rapid progress in new technologies, to the use of electronic computers for information selection and retrieval. If such use of computers is rightly viewed as analogous to the exploits of the Nautilus, immediately the question of the basic objective is brought to the fore: Should we use the Nautilus for crossing a ford that we could wade, on the one hand, or should we attempt to swim the Arctic Ocean, on the other?
Other useful bases of comparison in the analogy include the fact that as the artifacts used become more and more complex, but also more and more efficient, their investment costs become higher and higher, yet their operating costs become smaller and smaller for a given workload. There is also the important point of similarity that the new machines and methods have never entirely suppressed the old. They have supplemented them, replaced them for certain tasks, and, significantly, have permitted other and entirely new tasks to be undertaken.
In intellectual as well as in physical systems of communication there are different fields of application of each means or tool that is developed and there are different thresholds which separate these fields as a function of the difficulty, complexity, and size of the task to be accomplished.
As the various systems for scientific communications have evolved, there has been a continuing coexistence of the products of the different stages of evolution: correspondence between working scientists, publications in primary journals, the emergent bibliographical aids, and, presently, the multitude of large and small documentation centers with at least the three functions of accessioning, processing, and redistributing material.
Since 1939, there have been various attempts to centralize or unify documentary efforts on a national basis, ranging from such efforts as the French National Center for Scientific Research to the All-Union Institute of the USSR. Worldwide, there is somewhat analogous evidence of the will to cooperate, to coordinate, yet this may be taken in the sense of a French saying to the effect that when one does not really know how to organize, then one coordinates.
The point is, however, that what one wishes to coordinate, centralize, and organize, embraces the gamut of information selection and retrieval systems at all levels from the small but meaningful index maintained by the individual bench worker to that represented by huge national centers such as that of the USSR. What is good as documentation practice for the individual scientist is not necessarily good, for instance, for the French Atomic Energy center where documents accumulate at the rate of more than 100,000 new accessions per year. Evaluations of different proposed systems cannot reasonably be made without differentiation between such varying requirements.
A second necessary objective is to recognize the different fields of scientific information: for example, are the social sciences, even the science of religion, to be considered as within this scope? If so, the user requirements of the social scientist may differ from those of the physical scientist, as also those of the chemist differ from those of the physicist or the engineer. Must we not recognize that each field has its own particular needs and particular requirements? Finally, there are the different types of information for different communication purposes, from horizontal communication between scientists on the same level, to vertical communication from the pure scientists to the man on the street, passing through the applied scientist, the technologist, the shop foreman, and at the other extreme, to researchers in quite different disciplines.
Such compelling, and challenging, reflections set the stage for the ensuing discussions, both general and specific. For the panel, Mr. Eugene Wall first discussed some general terms on which comparisons of existing systems might be based. Two sets of variables must either be controlled, or their effects on the system being evaluated or compared must be quantitatively known or measured. These are (1) the external or environmental variables and (2) those peculiar to the internal system. It is the necessary interaction between the two types of variables which must determine both the technical and the economic effectiveness of any system. Thus it is doubtful that any one panacea for all information handling problems is realizable, that is, that any one combination of internal system characteristics can be optimal for varying combinations of environmental factors. We should seek, therefore, for any underlying set of principles by which we can adjust internal or system variables for optimal interaction with environmental requirements. Perhaps this can best be achieved by the improvements of communication between the user and the system, and of the match between the typical question and its answer, and not by seeking to improve the match beyond an economic point.
In seeking such an improvement, various technical problems must be solved. First is that of viewpoint, since the experience of various individuals is usually such as to exclude universal agreement on the assignment of any given concept to any one logical class. The second problem is that of generics or breadth or narrowness of viewpoint. The sought-for answer should be available at any equal or lower generic level of any of the family trees that stem from a given concept. The third problem is that of semantics—the meaning of words or relations between concepts and their symbols. Different words often mean the same thing, or identical words different things. A fourth problem is syntactic, that of the ordering of words or concepts in such a way as to express the direction of relationships between them.
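The problem of generics described above can be illustrated with a minimal sketch in Python. It assumes a small, entirely hypothetical table of broader and narrower terms and a hypothetical four-document index; neither comes from the panel discussion. A question put at one generic level is expanded so that documents indexed at that level or at any lower level of the same family tree are returned.

```python
from collections import deque

# Hypothetical broader-term -> narrower-terms table (illustrative only).
NARROWER = {
    "metals": ["ferrous metals", "nonferrous metals"],
    "ferrous metals": ["steel", "cast iron"],
    "nonferrous metals": ["copper"],
}

# Hypothetical index: document id -> terms assigned by the indexer.
INDEX = {
    "doc1": {"steel"},
    "doc2": {"copper"},
    "doc3": {"metals"},
    "doc4": {"ceramics"},
}


def expand_generic(term):
    """Return the term plus every term at a lower generic level."""
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for child in NARROWER.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def generic_search(term):
    """Return ids of documents indexed under the term or any narrower term."""
    wanted = expand_generic(term)
    return sorted(doc for doc, terms in INDEX.items() if terms & wanted)


if __name__ == "__main__":
    print(generic_search("metals"))
    print(generic_search("ferrous metals"))
```

The sketch addresses only the generic problem. Viewpoint, semantics, and syntax would each need their own treatment, for instance a synonym table for the semantic problem.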
Mr. Wall suggested the need for a continuous rather than a discontinuous solution to these four problems, that is, a solution which can by increasing costs incrementally thereby also incrementally increase the effectiveness of the match between question and answer. Only by the development of principles which enable the solution of these problems in a continuous fashion can a basis be found for evaluation of interactions of information systems with their specific environments. General principles yielding continuous solutions to such linguistic and semantic problems would provide a smooth curve along which to plot cost versus effectiveness. Small steps along the curve would then focus optimum points for particular systems and their particular environments.
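The idea of taking small steps along a smooth cost-versus-effectiveness curve can be sketched as follows. The curve used here (an exponential with diminishing returns), the step size, and the monetary value placed on a perfect match are all assumptions made for illustration; the discussion does not specify any particular curve.

```python
import math


def effectiveness(cost):
    """Hypothetical smooth curve: fraction of questions matched at a given cost."""
    return 1.0 - math.exp(-cost / 10.0)


def optimum_cost(value_of_perfect_match, step=0.5, max_cost=100.0):
    """Step along the curve while the added benefit still exceeds the added cost."""
    cost = 0.0
    while cost + step <= max_cost:
        gain = value_of_perfect_match * (
            effectiveness(cost + step) - effectiveness(cost)
        )
        if gain <= step:
            # The next increment would cost more than it returns.
            break
        cost += step
    return cost


if __name__ == "__main__":
    for value in (5.0, 100.0, 200.0):
        print(value, optimum_cost(value))
```

Under these assumptions, an environment that places a higher value on a good match justifies moving further along the curve, while one that places a low value on it may justify no expenditure at all. With a discontinuous solution there would be no such small steps to take.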
All facets of documentation from the origination of the information to its ultimate use must be considered because they interact with each other as well as with the environment. The cost-benefit relationship must be determined, not only for a single task, such as indexing, but for the system as a whole.
Mr. Wall then outlined the principal environmental factors to be considered. First is the value of the information in the collection. The more valuable the information the more often it will be referred to and, conversely, the lower will be the accession rate since low value material probably will not be retained. This results in a relatively high ratio of references to accessions, where it may be profitable to expend effort to put things away well so that the subsequent work of reference is speeded. Contrariwise, with a low output-to-input ratio it may be best to search harder rather than to index better.
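The trade-off between indexing better and searching harder can be made concrete with a small worked sketch. The unit costs below are hypothetical figures chosen only to show how the ratio of references to accessions changes the preferred strategy; they are not taken from the discussion.

```python
def total_cost(accessions, references, index_cost, search_cost):
    """Total effort: indexing is paid once per accession, searching once per reference."""
    return accessions * index_cost + references * search_cost


def better_strategy(accessions, references):
    """Compare two hypothetical strategies for a given workload."""
    # Assumed unit costs: careful indexing costs more per accession
    # but makes each later search cheaper.
    deep = total_cost(accessions, references, index_cost=5.0, search_cost=1.0)
    shallow = total_cost(accessions, references, index_cost=1.0, search_cost=4.0)
    return "index better" if deep < shallow else "search harder"


if __name__ == "__main__":
    print(better_strategy(accessions=1000, references=5000))  # high ratio
    print(better_strategy(accessions=1000, references=500))   # low ratio
```

With these assumed costs the break-even point falls where references exceed accessions by a factor of four-thirds; different unit costs would move that point but not the shape of the argument.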
A second external factor is the relative transience of the material to be filed. A third is the number of items in the file, requiring, with increasing volume, an increasing specificity of questions asked and a correspondingly greater effort on input in order to ease the output task. A fourth consideration relates to the number of potential customers, and the fifth to the types of question asked.
To summarize: first, there is no one solution adaptable for all environments, no one right tool for all tasks. Secondly, various possible systems must solve the technical problems of viewpoint, generics, semantics, and syntactics. A set of general principles for the solution of these problems must surely exist. Thirdly, Mr. Wall suggested that comparisons among systems will be valid only as the solutions to such problems are continuous and as environmental factors can be generalized in the comparative evaluation. Fourthly, he declared that the logic of the system should dictate machine design. Finally, he stressed the opinion that all facets of a documentation system must be considered as well as a number of important environmental factors.
Dr. Brian C. Vickery was the next panel speaker. His comments reinforced the view of the necessity for consideration of systems as wholes. In designing a system for a particular situation, we must take into account the total environment, the needs of the user, and the costs and benefits of each step in the whole documentary operation. The environment determines what a retrieval system is required to do, that is, its basic function. If a new system is to be designed to carry out certain specified functions, then we must know how those functions and the structure of the proposed system are related. Comparisons of one system vis-a-vis another may show one better suited to a particular environment than the other, but we should know more than that: we should know why, and for what purpose, one system does or does not function better than another. If there are different ways of carrying out a certain function, then these ways should be intercompared to decide which does a given job most efficiently. It may be necessary to construct a series of model systems in which only one feature is varied at a time.
Of the various internal system variables suggested in the introduction to Area 4, only a very few have been tested under any circumstances for any systems at all. Tests reported in the papers for Area 4 provide, as yet, no basis for generalization. Indeed, the tests of Mr. Cleverdon, although designed to study variable features of the over-all problem, underline the truth that no single variable factor can be considered in isolation.
But the whole point of intercomparing the characteristics of existing systems should not be to show that one system is better than another. On the contrary, a principal concern of conferences such as this, and of all who are interested in the field of documentation, should be to find out how we can design new and improved systems that will aid all in the most efficient way. We want to be able to generalize from comparisons, and perhaps to find systems, as a result of comparisons, which will be better than any system we now have.
The comments of the next panelist, Dr. C. J. de Haan, re-introduced the perspective of looking at the common factors in conventional systems and in modern systems using machines. In both, the objective is to reduce work, namely, retrieval work. The needs of the user, in particular differing needs, such as those of the research scientist in contrast to those of the applied scientist, must be taken into account. Economic factors are basic, since we are never justified in working out new systems whose costs in labor, investment, and effort exceed what could be done with similar results by a more conventional system. What the systems should be depends largely on the information that the users of the system are likely to require.
In order to meet the objectives for which machine systems are being developed, three operations are of great importance. First, the analysis of each document in the file to be searched in terms of all features relevant for future retrieval. Second, the drawing up of dictionaries and standardized terminology for such retrieval features. Third, the translation of identified retrieval features into a language suitable for machine use.
First, then, the analyses of documents must be carried out in the most accurate way possible. Each feature overlooked or improperly measured must lead to incomplete or erroneous answers when machine searches are made. This task of analysis is both the most important and the most expensive part of mechanized search systems. It requires a great amount of labor on the part of technically qualified persons. Thus, in machine searching, the exercise of human intellect and judgment cannot be eliminated. It can be concentrated. It can be condensed in such a way that the work need not be repeated for each question to be answered, but at least once the intellectual effort must be expended. A pertinent criterion for determining applicability of machine systems is therefore the question of manpower investment necessary to select and define essential features—those that will be asked for by users of the system—and to trace these features in the analysis of the documents.
The remarks of Dr. John W. Kuipers again stressed the difficulties of carrying out the mandate of the Area 4 panel, of separating the considerations proper to this Area from those of Areas 5 and 6, and of attempting to fix criteria for the evaluation of systems. He suggested that the basis for these and similar difficulties, not only in Area 4 but throughout the Conference discussions, may lie in a tendency to concern ourselves with secondary questions rather than with primary ones. Such matters as specific machines and recording media, economic questions, and notation schemes are secondary to basic questions concerned with the over-all problem of communication. What is the over-all task to which we set ourselves? The system, the integrated set of procedures by which specific tasks may be accomplished, will necessarily be complex. The system should take into account all required functions, including those of generating messages, recording, processing, disseminating, transmitting, storing, searching, collecting, and using. The system must take into account the dynamic nature of the situation in which it operates. It should take into account that there are many individuals and many organized groups attempting to communicate. Certain parts of the ultimate over-all system are vague at this time, but there is much that can be stated and clarified. We can hope to fill out the picture only as we begin to get better answers to other primary questions: What is the nature of the messages to be carried in the system? How do we get from expressions in natural language as used by human individuals to whatever information units will flow in the machine part of the system? If there are limitations to be imposed on the messages the system handles, what are these limitations and why do we accept them? If we can come to some general clarification and agreement on questions such as these, we may hope to proceed more profitably with matters of secondary concern such as are involved in comparative characteristics of classifications and of machines.
This part of the panel discussions concerned with the general philosophy of evaluation was closed with brief remarks by Dr. Jacques Samain urging greater standardization in the preparation and presentation of documents, and by concise, practical observations by Dr. Hans Selye based upon experience with a system that has been used for over a century.
Dr. Selye suggested first that in the development of any system a primary objective should be the molding of that system by the practice gained in its use, rather than by theoretical presuppositions or by undue concern with how many hits or errors the system makes. It is a matter of giving higher credit in any system to the finding of such information as is particularly difficult to find. An occasional positive hit, the tracing of a very important but rare document, might be very much more important to the scientist than a high average of successful findings of information on a statistical basis.
Dr. Selye also re-emphasized the basic importance of the intellectual task of taking a document and extracting from it that information which will be really useful to later users. He therefore urged the desirability of establishing better cooperation between those who know the mechanical and linguistic aspects of organizing information for search and retrieval and those who actually use such information. Suggestions for the organization of working parties for detailed consideration of practical problems and tasks were endorsed by the chairman, who then opened the discussion to the floor.
Discussions from the floor following the general comments by panel members included clarification of certain specific points raised by the papers or made by the panel members. Mr. R. A. Fairthorne offered comments on the apparent misunderstanding as to the extent to which machines could perform intellectual work, stating that what the machine is meant to do, just as a book is meant to do, is rather to prevent the repetition of the same intellectual work over and over again.
Dr. M. J. Taube suggested that the question of the right tool for the right job, a best system for a particular situation, depends upon the assertion of a theory of search, storage, and retrieval.
Questions were raised with respect to suggested criteria, such as the relationship of input-output ratio to the cost-benefit factor; to possible reasons for differences between the Herner and Herner results, using user language, and those reported by Dr. Whaley; to possibilities for converting any one system to any other; to problems in planning and carrying out tests and experiments, such as those of Mr. Cleverdon; and to re-emphasis of the fact that a feature of any system is that we cannot get out what has not been put in. Mr. Farradane offered an emphatic plea for greater application of scientific method in testing and experimentation, and for realistic objectives in terms of system efficiency. It is time, he suggested, that we adopt a truly scientific approach to what must become, if it is not yet, a scientific subject area. We need proper experimentation, proper ideas of isolating different parts of a problem, and of testing them properly in accordance with well-defined, objective procedures. Dr. E. F. Moore suggested that an important factor in the success of systems in use today may well be the interest and abilities of those who do the indexing and abstracting and of those who are in charge of the operations, rather than the performance characteristics of the machine or system itself.
These comments from the floor provided ready transition to the next topics discussed by the Area 4 panel: the problems of attempting actual intercomparisons and the selection of specific criteria for evaluation. Mr. Gerald Jahoda reported on the results of a questionnaire survey of some 39 correlative or coordinate indexing systems now in operation, mostly located in the United States. These systems are relatively small. Over 80 per cent of the installations participating in the survey reported collections of 20,000 or fewer documents. The systems are not extensively used, since 60 per cent reported 200 or fewer inquiries per year. Finally, there was a strong indication from the questionnaire responses that in the large majority of the installations more traditional methods such as alphabetical subject indexing or conventional classification systems might have been used to equal advantage. If so, the question is properly to be raised as to whether we should not be less extreme in formulating information retrieval requirements. At the present time, neither correlative indexing systems nor conventional indexing techniques prove particularly adequate for doing an increasingly formidable job. Should we not then ask ourselves if there may be quite different answers possible to problems of information storage and retrieval and proceed to research on such other solutions? On this last point, another panelist confirmed the need to explore alternate possible solutions but interposed the suggestion that since the less conventional systems are very new they should not be judged prematurely, as though, in an example attributed to Sir Winston Churchill, we were to ask, “What good is a new-born baby?”
In the following remarks, Mrs. Claire K. Schultz discussed the practical problems in isolating, defining, and applying comparative factors. An experiment in intercomparing two closely similar mechanized systems, dealing with the same general subject matter, led to the conclusion that the systems should be studied as whole systems, but it also revealed the difficulty that the necessary interrelationships between the functional parts of a system are not all easily traced. One of the important lessons learned from the study was the need to use real questions, or real questioners, in evaluating output.
Some of the factors considered included the following questions: How quickly must the answer be found? For what purpose is the answer wanted? Is the client a student, a specialist in the field, an administrator, someone who needs to document a legal dispute? Is the answer to be limited, for example, to review articles or to a particular language? What form of output—bibliography, journal, microfilm, digest of information contained in the selected references—is desired? What is the usual depth of indexing, bearing in mind that the number of descriptors used may not be directly correlated with this since various depths may be represented in several fields that may be covered in any one document? What proportion of the indexable material in the fields covered is indexed? What types of errors can be made in input and what are the costs of errors that are undetected?
These and other critical factors were discussed with emphasis upon time and costs of necessary staff effort. For example, if the user would allow two weeks to elapse before obtaining an answer to his question this would not necessarily mean that the staff should use two weeks to find that answer. Again, the time spent screening to reject or eliminate material should be accounted for as well as the time spent on actual input of selected material. How much professional time, and at what cost, is involved in entering selected documents into the system? How costly in manpower, money, and materials are checks that are made to prevent or reduce the introduction of errors?
After these specific examples of critical factors, Mr. C. D. Gull commented that follow-up of criteria suggested in the papers and in the discussions would be desirable both in terms of expansion and refinement and in terms of the design of suitable tests and experiments to apply them. He then took the Wright-Wilson paper as a point of departure for the elaboration of further critical considerations. Can we, for example, regard a system as satisfactory if the user must discard a high proportion of the output product as irrelevant to his query? If a relatively large number of descriptors or index headings are assigned per document, why is it that significant points are still missed? To this latter question the senior author rejoined, thereby reinforcing a major theme of the Area 4 discussions, that the failures were in the intellectual analysis, not in the mechanics of the system used.
The final panel member presentation was by Dr. Robert M. Hayes, who first re-stated such key questions as that of whether systems were to be considered as equipment, as coding or classificatory schemes, or as total organizations of resources and procedures to meet specified objectives, and the question of overlap of Area 4 with other Areas. He pointed out that both the development of a new system and the comparison of two existing systems involve the trade-off between a number of factors and that criteria typically have practical meaning only when they are considered within the framework of a general theory. He raised the point of necessary objectivity in the development and application of criteria for comparison and questioned the extent to which tests such as those of Miller, Cleverdon, and others produce results that can be extrapolated to a general situation. Within the framework of a good general theory, however, tests can be used to establish basic parameters for given systems within a controlled environment.
Continuing the discussion of specific possible criteria, Dr. Hayes suggested first the desirability of interdisciplinary meetings where users, librarians, systems analysts, and engineers might outline the relevant factors of particular concern to each. He then outlined a number of possible qualitative factors for consideration, including those that reflect usage requirements, such as the purpose of the system and the level of complexity of the material to be filed and searched; those that are involved in organization, data format, code notations, arrangements of the file; those that involve the organization and performance of equipment; and finally those that involve elements of design.
The floor was then opened for general discussion, with comments on the use of large versus small computers, the suggestion that a most significant criterion in the evaluation of systems is that of the capacity of the system to change, and discussion of the possibilities for abstracting selected operations from a total system in order to measure them objectively.
In closing, the panel chairman drew three principal conclusions from the discussions. First was the implication that it is necessary to plan and conduct many more experiments than have been tried hitherto, in order to test conditions under which comparisons may be meaningful and to define and improve the criteria suggested for application. The second conclusion drawn was that the survey that had been originally proposed but not carried out prior to the Conference could and possibly should be conducted, drawing upon the actual experience of this Conference and following methods such as those of descriptive natural history. The third conclusion was that special studies of subjects not specifically included in the papers or discussions, especially matters of intellectual categorization or codification, should be given attention.
MARY ELIZABETH STEVENS, Rapporteur and Area Program Chairman
ERIC DE GROLIER, Discussion Panel Chairman