At the closing session of the Conference the Chairmen of the Area Discussion Panels attempted to bring together the more important points that emerged in the Area discussions, whether there was consensus or disagreement on them.
In general terms, the main needs of scientists are fairly clear. They need means of becoming aware of information of concern to them; they need access to particular journals and to copies of particular papers; they need means of access to the whole scientific literature both in their own and other fields; and they need up-to-date media such as monographs, reviews, and compendia.
In the discussion for Area 1 there appeared to be tacit agreement on several points.
The requirements of users of scientific information and the uses to which they put the information they require should determine the design of information systems and services.
Study of the informational requirements of scientists is difficult, for the communication of scientific information is extremely complex. Needs vary with the subject field, the type of research, the availability of information services and source materials, geographical situations, differences in human abilities, and the use that is to be made of the information. Furthermore, the real needs of scientists and their expressed wants may be quite different. It was pointed out that the studies made thus far have not been concerned with information requirements as such. They have sought data on what scientists do, what they think about the present situation, and what they say they want in the way of improved services. The development of information services based simply on an analysis of present user habits is likely to be inadequate.
The studies made thus far, although admittedly imperfect, have produced data and conclusions that have served as a basis for action in particular local situations. It was suggested that the studies already reported be carefully reviewed, evaluated, and compared. These studies appear to indicate certain things:
The various services now available to the research worker would go a considerable way toward meeting his requirements if he were wiser in their use, developed more facility with them, and made more use of them.
Scientists place great reliance on face-to-face and other informal means of communication. Personal communication, however, and the conveyance of information through the literature are mutually dependent.
The temporal span of the usefulness of the literature for current research is relatively short.
Abstracts are less used than one might expect; information is obtained rather more from journals and the references they cite.
Interdisciplinary communication is both important and difficult.
There is still a need to know more about the real requirements of users and the various factors affecting the efficiency of scientific communication, and to develop more sophisticated and reliable techniques for studying them. Studies should be designed so as to produce conclusions that could be tried out experimentally. A number of suggestions concerning future work in this field were made:
In future studies of the operations research type, a qualitative judgment as to the value of the information should be obtained along with quantitative data and objective measurements.
Knowledge of human perception, motivation, capabilities and limitations is relevant to the whole communication problem, and the point of view of the physiological psychologist should be taken into consideration in planning future work.
Socio-psychological inquiries, of the sort made by Bavelas, might be made of the nature of the conversational way of getting information. For example, one might set up a simply-patterned “game” involving information gathering in order to see how differently structured teams succeed.
Inquiry into the scientific meeting as a means of transferring information should be made. One might ask: Are present meetings too large? Are ten-minute papers too short? Does one always learn more “in the hall” after the sessions than during the sessions—and, if so, why?
Inasmuch as no research worker has access to all the information he ought to have, it is important to determine to what extent this matters.
Since it is difficult to remember what one does and uses, perhaps it would be possible to devise a system of tallies to be placed on library materials so that their use could be summarized.
Other suggestions concerning future studies were made in their conference papers by Professor Bernal and Dr. Menzel.
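The tally suggestion made above can be pictured in a few lines of Python. This is only an illustrative sketch: the item names and the list of marks are hypothetical, and it assumes that each use of an item is recorded as one tally mark.

```python
from collections import Counter


def summarize_use(tally_marks):
    """Count tally marks per library item and rank items by use.

    tally_marks: iterable of item identifiers, one entry per recorded use.
    Returns (item, count) pairs, most used first; ties are broken
    alphabetically so that the ranking is reproducible.
    """
    counts = Counter(tally_marks)
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical marks collected from slips attached to three items.
marks = [
    "J. Chem. Phys. 27", "Chem. Abstr. 51", "J. Chem. Phys. 27",
    "Beilstein", "J. Chem. Phys. 27", "Chem. Abstr. 51",
]
summary = summarize_use(marks)
for item, uses in summary:
    print(f"{item}: {uses}")
```

A summary of this kind records only that an item was handled, not whether the information in it proved useful, so it would complement rather than replace the qualitative judgments called for above.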
It was pointed out in the Area 6 discussion that for a sound theoretical analysis of information retrieval systems a greater understanding of empirical processes in their use is needed; and that the person who uses a system or machine should therefore be required to describe himself in a way that will indicate his field of interest and objective. Until a hierarchy or classification of users and their needs has been established, however, it is not at all obvious how we should ask users to describe themselves.
It was suggested that much of the problem of better and faster information services could be solved rather simply if the financial support for information services could be made to keep pace with the support of scientific research. Certainly it is clear that more support for our information services is needed and that this would solve some of our problems. The funds we can spend on information services, however, are not unlimited and the number of qualified persons who could be put to work on them is limited. Since there is a limit to our resources, it will always be important to try to make the allocation of these resources that will best meet the needs of scientists.
It has to be remembered that the body scientific is an organic whole adapting itself to circumstances and changing all the time. Conclusions drawn from studies may be to some extent ephemeral. Provisions for meeting information needs should therefore be flexible.
In Area 2 there was considerable discussion on the advantages and disadvantages of highly centralized, comprehensive, national information services, versus the decentralized type where much of the activity ties in closely with professional societies.
The discussion on the question of the authorship of abstracts reiterated what has been said many times on author versus non-author preparation, indicative versus informative content, telegraphic versus complete-sentence style. If one can generalize from the limited analytical studies that are reported in the Conference papers, the information content of author and non-author abstracts differs very little; the author-prepared abstract is likely to be both a little more precise and a little more verbose. This seems to indicate that the intellectual job of preparing abstracts or annotated titles should be placed on authors, with some educational effort expended to achieve conciseness of expression. Some further studies in this aspect might end the discussion, hitherto based on opinion only, on author versus non-author abstracts and the alleged slanting of abstracts.
A point made in connection with comprehensiveness of coverage was that of what one is willing to pay for that last little approach to perfection. It is somewhat like the comprehensive bibliography problem where finding the last five per cent of the references requires 95 per cent of the cost. At some point one has to ask, is it worth it? Even if we postulate that the best possible abstract for a given scientist is one written by another expert in that scientist’s own narrow field of interest, what would one have to pay in terms of money and time to have this? Is it worth it, or should we compromise in certain other ways in order to get a more generally valuable answer?
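The arithmetic behind the bibliography comparison can be made explicit. The short Python sketch below takes the five per cent and 95 per cent figures quoted above at face value (they are a rule of thumb from the discussion, not measured data) and computes how much more each of the last references costs than each of the earlier ones; the function name is only illustrative.

```python
def marginal_cost_ratio(last_share, last_cost_share):
    """Cost per reference in the final share, relative to the rest.

    last_share:      fraction of all references found last (e.g. 0.05)
    last_cost_share: fraction of total cost spent finding them (e.g. 0.95)
    """
    if not (0 < last_share < 1 and 0 < last_cost_share < 1):
        raise ValueError("shares must lie strictly between 0 and 1")
    cost_per_ref_last = last_cost_share / last_share
    cost_per_ref_rest = (1 - last_cost_share) / (1 - last_share)
    return cost_per_ref_last / cost_per_ref_rest


ratio = marginal_cost_ratio(0.05, 0.95)
print(round(ratio))
```

On these figures each of the last references costs roughly 361 times as much to find as each of the first 95 per cent, which is the sense in which one has to ask whether it is worth it.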
There was a considerable amount of favorable opinion for a sequence that would consist of an author draft written according to a generally circulated set of ground rules, followed by the editor of the primary journal taking the same kind of responsibility for the abstract that he normally does for the paper. He should feel perfectly willing to send the manuscript back and say the paper is all right but the abstract has to be done over again for such and such reasons. The abstracting service editor, of course, would have the final veto or modification power over the result.
Another point noted had to do with cooperative efforts beyond the one of exchange of proof of abstracts among various services. As it stands now, major abstracting services in the subject fields use different classification systems, different indexing systems, different abbreviation lists for journals, different translation assistance. Certainly this is an area for cooperative effort, and such is already in progress—in the field of physics and chemistry under the ICSU Abstracting Board.
The discussion in both Areas 1 and 2 strongly emphasized the need for better education of users and training of their habits in information search and retrieval. This training should begin at least in the graduate school. The design and development of information services should be determined in part by the present habits of users and in part by habits that might reasonably be expected to be conditioned by training.
It was pointed out that our present types of information services are at once highly developed for the fields of applied science or technology (medicine, agriculture, metallurgy, engineering) and at the same time most inadequate for satisfying urgent needs to know in those fields. The inadequacies of indexes are particularly serious.
The opinion was expressed that the present organization of uncoordinated abstracting and bibliographic services is on the verge of collapse, and that it may have to be replaced soon by another type of organization, based on specialized centers, each retaining a considerable degree of independence but combined and harmonized in an over-all system. Such a system would almost inevitably be a highly mechanized one and could be used to full advantage only if the principles of division of labor, coordination and large use output are applied.
In Area 3 the discussion of the effectiveness of monographs, compendia, specialized centers, and of new techniques and types of services led to no detailed consensus; indeed none could be expected. Some generalizations did emerge, however.
The present rapid growth in the mass of scientific information, in both its fundamental aspects and its utilization, in the scope which it covers and in its complexity, parallels exactly present political and economic growth. No longer can any countries except the very largest ones be really self-sufficient in politics or economics or science.
It is just the same with scientific information. We can no longer be self-sufficient either as individual scientists, as institutes, or as countries. With the increase in the quantity of publication and the ephemeralness of its importance, there is occasioned a tremendously increased need for selection of what is significant, and speed in getting it to the individual scientist user. With the increase in the amount of specialization and subspecialization in science, there is need not only for the subjects discussed in Areas 1 and 2, namely, indication of the existence of information and availability of the original, but also, and much more, for complementary scientific activities and background scientific publications such as reviews, compendia, monographs, and so forth.
No one asks for fewer of these, although there was some doubt in some of the papers as to whether large systems like Beilstein and Gmelin could in fact continue, because they are very expensive. The answer is probably that they will have to continue.
There is also an agreed need for annual reports on the advancement of this or that subject, for monographs giving comprehensive treatment of individual documents from time to time as their importance becomes clear, and a very great need for critical evaluation. We have heard much in the last year or two about the importance of the scientific literature from the Soviet Union. A great deal of translation is being done. But one gathers the impression from talking to working scientists that the real user’s demand for this material is not yet fully developed, or likely to be, for two reasons. Much of the material has not been available because of language difficulties until now. Also, people are uncertain as to the value and the perspective to be given to the literature of a country with which they are not completely familiar. Therefore, the critical type of review or monograph which brings forward and presents a number of the important points of growth in a science has become especially valuable in recent years, and will become more valuable as it becomes more and more difficult to read anything except a very small portion of the papers on subjects in which one is interested.
In addition to this type of non-serial publication, research and laboratory reports have an immediate though transient importance because of their speed factor. Although these complicate the system and although they may not be abstracted completely because eventually the material they contain turns up in papers in journals, nevertheless there is a need for them.
Compendia, critical tables, and quantitative data of all kinds, which represent the concentration of scientific data from the past, may be the most important type of scientific information of all. But even the accumulated quantitative data are growing in amount so quickly that summaries of the more important and more commonly used data are also required. There is clearly a great deal to be done in this field which can be classified generally as non-serial publication.
The situation with respect to special information centers and services is very different for fundamental science from what it is for applied science. In fundamental science, a national or an international specialized institute or information service has very often as its function the publication of original material as well as the production of informative abstracts, reviews, and bibliographic materials of all kinds.
On the whole, there is far more need for the specialized information center in applied science. In the Conference the discussions very properly concentrated on the scientists’ need for scientific information, and said rather little specifically about the need for information by the practitioners of applied science, although the economic compulsion here is the reason why so much science is being done. In applied science for industry, agriculture, medicine, and engineering, information requirements are of two types. There is first the searching of the fundamental scientific literature to find what is appropriate for the particular field of application, and, secondly, the additional task of finding out what is new and significant in application and development for that field.
So the task here is different, and decentralized activities, although connected in a centralized system, are inevitable. For example, in a rapidly developing industrial field the progressive firm needs a great amount of both basic and applied scientific and technical information as rapidly as it becomes available. The technical information may be fairly easy to gather from a limited number of journals. For the fundamental scientific advances on which all of the technology rests, there is a minute proportion of pertinent material in each of a great number of journals. Consequently the specialist information center with an industrial objective has to vary and multiply the tools required in information work. It has probably a great need to provide for industry abstracts of a completely different type from what is the basic service need for indicative abstracts. It may in fact abstract from a certain paper only one paragraph which is relevant to the particular industry, but the informative aspect of that abstract may be of tremendous importance.
Consequently many of these bodies bring out and publish, after much searching, informative abstracts of a particular type. They also encourage verbal communication among their firms and among scientists in their industry through such media as symposia and conferences, and of late there has been a great growth in many parts of the world of field liaison services of various kinds.
In the concluding survey of Area 4, the discussions were concerned with the scope and difficulties of comparison of systems, the need to determine primary objectives for any retrieval system both for purposes of design and of evaluation, and with the question of determining the specific criteria to be used in testing the merits of a given system.
Comparison and evaluation of systems are still very hazardous, because of the many variables that enter into the design of any particular system and the fact that the design of any system is decided to some extent on specific objectives and somewhat unique combinations of objectives. It is doubtful that the comparative tests that have been made or that are planned can decide the relative merits of systems for general application. They can only help to decide on the merits of certain particular systems for certain particular applications. Hence there is real danger in drawing general conclusions from experience that is as yet too limited.
The observation was reiterated that our new machines, systems, and methods have only supplemented and not replaced the old. As various systems in scientific communication have evolved, there is a continuing coexistence of the previous ones, from oral communication, correspondence between working scientists, publication in primary journals, secondary media of information, bibliographical aids, and the present multitude of large and small documentation centers for at least the functions of accessioning, processing, and redistributing material. Is it likely that this pattern will be reversed and that various mechanisms developed in the past will be supplanted completely?
Areas 5 and 6
In summarizing the discussions in Areas 5 and 6, it was reiterated that despite claims of proponents, none of the systems in use or suggested is entirely attractive and safe from the point of view of the scientist-users themselves. It is not the nature of machines that is going to determine the systems, but the nature of the scientists who use them. Even in systems of indexing classification where there is always a certain amount of interplay between the seeker and the librarian, documentalist or machine operator, the basic frame of reference must be the seeker. This second stage of interplay is still highly unorganized. It is hoped that impetus will be given by the Conference for undertaking serious efforts in operations research study of types, forms, vagueness, degree of sophistication and specificity of questions by users.
Another type of problem upon which some people will be interested in doing research in the next few years was touched on briefly in the discussions. It is not too difficult to make a retrieval system work for a small or circumscribed body of knowledge. The difficulty arises with the overlaps into different fields. New problems arise and it is not yet clear what these problems are. To bring enormous systems into existence, with the rigidity that their existence implies, before these problems are thoroughly recognized and explored would be a mistake.
Many fields and attitudes were represented at the Conference, whose greatest value may have been in increasing the appreciation of these complexities and of the relevance of the attitudes and actions of other groups. Thus more librarians have come to understand that the mathematical formalization of procedures that librarians have long carried out competently and informally is a necessary prerequisite to the automation of even some of the easier aspects of a librarian’s work. Information retrieval is, like all other broad fields, in the process of mathematization.
It seems obvious that theories must grow by stages and that a complete general theory of information may never come. It is also obvious that readymade mathematical techniques are unlikely to take us very far. Mathematical theories need to be developed within a wider framework of functions, that is, functions in the sense of the way in which things work, not in a mathematician’s sense. Here a framework wide enough to cover systems both present and future, both manual and automatic, is needed.
For the next few years such theories may be expected to have limited generality. But, for all of this, mathematical theories developed with the aid of deep insight and experimental verification are the long-run hope of information retrieval. In particular, the applications of information theory to information retrieval will be specialized, possibly including, for example, such things as the optimum arrangement of documents in a file not subject to direct human access or the optimum coding of addresses for such a file.
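One way to picture what an "optimum coding of addresses" might mean is to give short address codes to the file locations consulted most often. The Python sketch below uses Huffman coding for this, a standard information-theoretic technique chosen here purely for illustration; the summary does not name any particular method, and the access probabilities are invented.

```python
import heapq
from itertools import count


def huffman_code_lengths(access_probs):
    """Return {address: code length in bits} for a binary Huffman code.

    access_probs: dict mapping each file address to its (assumed)
    probability of being requested.
    """
    if not access_probs:
        return {}
    if len(access_probs) == 1:
        return {next(iter(access_probs)): 1}
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(p, next(tiebreak), [addr]) for addr, p in access_probs.items()]
    heapq.heapify(heap)
    lengths = {addr: 0 for addr in access_probs}
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)
        p2, _, group2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every address below them.
        for addr in group1 + group2:
            lengths[addr] += 1
        heapq.heappush(heap, (p1 + p2, next(tiebreak), group1 + group2))
    return lengths


def expected_length(access_probs, lengths):
    """Average number of address bits per request."""
    return sum(p * lengths[addr] for addr, p in access_probs.items())


# Hypothetical file with four addresses and unequal demand.
probs = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}
lengths = huffman_code_lengths(probs)
average_bits = expected_length(probs, lengths)
print(lengths, average_bits)
```

With these invented figures the average address needs 1.75 bits, against 2 bits for a fixed-length code over four addresses. The saving depends entirely on how uneven the demand is, which is one reason such applications are described above as specialized.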
Further work on over-simplified mathematical models carefully evaluated will be valuable and should be actively continued. The results thus obtained should be regarded as starting points, not stopping points, and should be compared with experience and experiment to show which of the considerations left out is the most important.
A wide variety of types of experimentation is badly needed. Arrangements of physical objects, trial libraries, simulations on general purpose computers, even writing experimental computer programs, all are experiments. Adequate attention must be given to problems of scale, particularly to the use of extreme ingenuity in modifying small scale situations so that they will exhibit behavior ordinarily shown on much larger scales. In particular, objective and empirical studies of present-day patterns of both queries and retrieval are needed. It is seriously to be doubted if we know how retrieval is actually being carried out with manual systems.
INFORMATION RETRIEVAL SYSTEMS AS WHOLES
There is a growing realization that a large mechanized system will include functions which today’s manual system assigns to the library staff. However, information retrieval systems need not be single stage even after mechanization. Breaks between stages may correspond either to human intervention and decision or to alterations in a programmed strategy. Indeed, the problems of large, medium and small systems differ enough to generate quantitative differences in the character of the likely optimum solutions.
INFORMATION RETRIEVAL SYSTEMS IN RELATION TO PEOPLE
It may well prove desirable to ask persons connected with input, that is, authors, referees and so forth, to aid information retrieval by simple expressions of judgment. We must attach more importance to the continuation of both freedom of expression by users in querying, even though many queries may be maladroit, especially since old habits of query formulation will at best die slowly, and education by the information retrieval system of those who ask queries. For these and other reasons, two-way communication between querist and information retrieval system, human or machine, will continue to have high priority. We sometimes forget that we do have two-way communication between the querist and the librarian. Indeed, as such strategies are automatized, users are likely to have to teach the machine how to improve its search strategy.
INTERNAL SEARCH, LANGUAGES, CLASSIFICATION, AND RELATED TOPICS
The language in which queries are discussed with the querist need not be the same as the internal language in which a query expresses the desire for search. Internal search languages of three extreme types received the most discussion.
Hierarchically ordered languages, that is, classification in the restricted sense.
Languages consisting of descriptions, simple, inflected, combined with relations, and so on. About these, two remarks are important. Such relations as interlocking and interfixes are likely to be but the beginning of relational descriptive syntax. The presence of either inflexional or relational descriptive syntax requires mathematical structures more complex than those of lattice theory.
Associative languages in which expressions of the relevancy of pairs of documents provide the basis for search and queries are expressed in the form “like this and this and this, but not like that and that.” Again, two remarks are needed. First, classification is implicit in such a system, and is thus much more easily updated automatically. Second, the documents are likely to be parts of present-day documents.
These simple extreme types were thought of only as exhaustive. The need for including the querist’s background, for example, as “an organic chemist interested in dyestuffs,” was stressed, as was the possibility of simultaneous use of two or more sublanguages; an elementary, not very realistic example would be the simultaneous use of the UDC and the Colon Classification. In mechanization of classification, internal search language must be specified as a tool, not as an object of art, although semantic considerations need not be absent.
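A query of the form "like this and this, but not like that" can be sketched in Python as follows. The sketch makes assumptions that go beyond the discussion: pairwise relevance is approximated by the overlap of index terms (the Jaccard coefficient), whereas the discussion left open how relevance between documents would be expressed, and all document names and terms are hypothetical.

```python
def jaccard(a, b):
    """Overlap of two term sets: |a & b| / |a | b| (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def associative_search(docs, like, unlike):
    """Rank documents by similarity to `like` minus similarity to `unlike`.

    docs:   dict mapping document id -> set of index terms
    like:   ids of example documents the querist wants more of
    unlike: ids of example documents the querist wants to avoid
    The example documents themselves are left out of the ranking.
    """
    examples = set(like) | set(unlike)
    scores = {}
    for doc_id, terms in docs.items():
        if doc_id in examples:
            continue
        attraction = sum(jaccard(terms, docs[d]) for d in like)
        repulsion = sum(jaccard(terms, docs[d]) for d in unlike)
        scores[doc_id] = attraction - repulsion
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical collection of five indexed documents.
docs = {
    "d1": {"azo", "dye", "synthesis"},
    "d2": {"azo", "dye", "fastness"},
    "d3": {"dye", "textile", "fastness"},
    "d4": {"polymer", "synthesis", "catalyst"},
    "d5": {"polymer", "catalyst", "kinetics"},
}
ranking = associative_search(docs, like=["d1", "d2"], unlike=["d5"])
print(ranking)
```

No classification scheme appears anywhere in the sketch; whatever grouping exists is implicit in the pairwise similarities, which is the property noted above for associative languages.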
SEARCH STRATEGY, MACHINES, PROGRAMMING
Details of search strategies are even less clear than details of internal search languages, but the view was strongly expressed that such strategies must be closely tied to the most modern techniques for programming digital computers, techniques which go far beyond computation. Today programs which involve randomness of start and guidance by results, so-called heuristic programs, beat good players at checkers and have given a neater proof of one theorem of propositional calculus than did Whitehead and Russell. Thus self-checking exploration allows machines to handle a new group of actions once considered “thought.”
Special formal languages used in programming are now so flexible and powerful that they may be much better adapted to the discussion of search strategies than either natural languages or the formal languages of pre-programming mathematics. Techniques for obtaining programs which may be safely modified by intervention without detailed study (and hence may be safely self-modified) appear to be in sight. Both heuristic and safely self-modifiable aspects are likely to be essential if search strategies are to be both teachable by human intervention and adequately able to profit from experience. These are both abilities which will continue to be of great importance in information retrieval systems.
RELEVANCE AND ASSOCIATION
The problems of using a specific formula to express mutual relevancies (query to document) were discussed and many ideas were hinted at. There is a clear need for both cooperative and collaborative work to provide several reasonable alternatives and for some real trials. Similar needs exist in other areas of search strategy where many problems call for combination of background, for careful discussion and for experimentation.
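As one example of what a specific formula for query-to-document relevance could look like, the Python sketch below scores a document by the cosine of the angle between term-count vectors. This is offered only as one of the "reasonable alternatives" that would need trial; the discussion did not settle on any formula, and the terms used are invented.

```python
import math
from collections import Counter


def cosine_relevance(query_terms, doc_terms):
    """Cosine similarity between term-count vectors of query and document.

    Returns a value between 0.0 (no shared terms) and 1.0 (same proportions).
    """
    q, d = Counter(query_terms), Counter(doc_terms)
    dot = sum(q[t] * d[t] for t in q)
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    norm_d = math.sqrt(sum(v * v for v in d.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_q * norm_d)


# Hypothetical query and two hypothetical documents.
query = ["azo", "dye"]
doc_a = ["azo", "dye", "dye", "synthesis"]
doc_b = ["polymer", "catalyst"]
rel_a = cosine_relevance(query, doc_a)
rel_b = cosine_relevance(query, doc_b)
print(round(rel_a, 3), rel_b)
```

A formula of this kind ignores word order, synonyms, and the querist's background, so comparing it against other candidates on real queries is exactly the sort of trial called for above.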
Once search procedures have determined the call-signs of documents, abstracts, and so forth, there remain significant engineering problems of how best to arrange for the fetching of these documents or abstracts.
Some things that have been mentioned are but hopes. Others have already been done, but without regard to cost. The ideas are stimulating and the net value of their discussion and consideration is great so long as we remember that many of them will not be embodied in procurable devices for a substantial time. Meanwhile, those with immediate problems must do the best they can now. They dare not wait for an overall general theory whose arrival is unpredictable, nor for theories of intermediate generality, nor for safely modifiable heuristic programming, which will come sooner than most of us think. New and diverse special and interim solutions can and should be attempted. They will contribute both to the solution of the specific problems and to our common store of knowledge.
In summarizing the discussion in Area 7 there was a consensus that this area shared responsibility in the entire subject of the Conference, although the responsibility is not of a specifically scientific sort. It is rather at the administrative level but it involves there the arts and politics. It involves the support of documentation services, the support of research in documentation and the whole question of the practicability and the method for working toward unification and simplification.
There is as yet no measure of the problem which we are facing, no complete picture of the present economics of the situation, and not even any good descriptions of methods employed to work out present arrangements in economic and administrative terms. We do not even have a systematic catalog and description of the tools that are now available. Yet the real reason for the Conference, beyond the mere fact that we are faced by mountains of documentation, is that we do have available some very powerful tools, specifically that group of techniques, arts and sciences which are lumped together under the name automation. There is the existence of the computers and the other data processing machines, of material moving devices, of mechanisms for reducing to minute facsimile and to other terms and in other ways the facts and information in which we are interested, and for transmitting information at high rates of speed. The powerful tools of mathematics, of statistical analysis, and the techniques of the behavioral sciences, including psychology and that group of techniques brought together frequently under a name such as operations research, the tools of linguistics, all these are around us in active development but are to a very limited extent brought to bear on this particular problem.
There was tacit agreement that full use must continue to be made of all the facilities, of all the organizations working in this area, and of all the potentialities which are available for improving work. Specifically, as for the responsibilities of scientists individually or in groups as scientific societies, of the indexing and abstracting services, of national governments and of international groups, both governmental and otherwise, there seemed to be no inclination on anybody’s part to say that specific responsibilities of one kind or another always adhere to one or another of these bodies, or that any of them can now be relieved of any responsibility which they now have undertaken.
It also seemed to be accepted that attempts at global solutions should be regarded with some suspicion and question. Progress is to be made rather by small experiments or, as one speaker put it, by seeking answers in restricted areas, permitting us to walk before we run.
International cooperation must certainly be encouraged, but in its early stages between national groups rather than on a global international centralized scale. Many types of continued research and inquiry are needed on the size of the problem, on the difference of the size of the problems as affecting the various sciences, the various different disciplines, on the effects of differences in size on the kinds of documentation services and the organization of these services, on the economics of the problem, and on the real reasons for the failure of certain efforts.
In the field of education for scientific information work, it seemed to be agreed that interesting developments were in existence worldwide and that there is no need at the present time for any freezing of the situation. Rather, it is preferable to let the present experimentation proceed and get where it can. In the matter of unification and simplification, too little attention was given. It was noted here again that the tremendous cost and expenditure of effort which the present situation represents poses the question: can we afford to waste this energy in the present scattering, duplication and laggardness of effort?
It might be agreed that this question should be answered in the negative, but as to what action should result therefrom, there was no consensus. There were sour looks at all the proposals for centralization, but this entire picture might be changed by the change of one detail in either technological or methodological development.
A basic problem is that of organization, including integration of services both nationally and internationally, and of negotiation between organizations and countries as to what shall be done and in what manner to meet the needs of different kinds of scientists and institutions much more carefully and precisely than hitherto.
This organization, including a division of tasks, provision for meeting specialized needs, and the establishment of a certain degree of standardization, may be all that is required, except financial support. Certain clearing-house activities and communication services are the maximum that might be necessary for world centralization or even national centralization.
As to standardization, component rather than rigid standardization is the desirable goal. We need the components in different organizations and in different countries to be of such a type that they at least fit together. This, rather than the external form, is what matters; in fact the variety which is the richness of science must be preserved. The problem is essentially of a sociological, organizational, and administrative type. Although it is complex, it is no more difficult than other problems that man as a social animal has already solved.
and Area Chairmen addressed themselves particularly. Many specific suggestions, partial solutions and new points of view in addition to new factual information were presented during the Conference and in the working papers. It was urged that these not be neglected, but that they be worked over, digested and studied in and through a continuing activity of the Conference. Rather than waiting a number of years before holding another large general conference, it was urged that beginning in the near future there be organized a series of smaller symposium-conferences on restricted and specialized areas and particular topics, both those of the subject areas of this Conference and new ones. These should be organized on a more truly international basis than was possible for the present occasion. Without a deliberate follow-up much of the value of the Conference would be lost.
MILTON O. LEE, Rapporteur and Discussion Panel Chairman
CHARLES I. CAMPBELL, Program Committee Chairman