6 Deciding to Acquire a Powerful New Research Tool: Supercomputing

Beverly Eccles
Abbott Laboratories

Abbott Laboratories is a health care company that has five major divisions. Each division behaves somewhat as a separate entity. I belong to the Pharmaceutical Products Division, and I provide support to the computational chemistry area of "discovery" research.

USING COMPUTATION TO PRODUCE PHARMACEUTICALS

In the Pharmaceutical Products Division, computation is done in two distinct areas that are functionally, physically, and managerially separate. The first is corporate computation, which supports payroll and production and sales and is primarily IBM based. The second is R&D computation, which is done primarily on Digital Equipment Corporation VAX computers.

Research and development computational support spans several areas. A network of VAX systems forms the backbone for general-purpose support functions: electronic mail, graphics, word processing, and so on. Clinical and toxicological data processing is another area of support. Government regulations require that these data be kept in archives and that statistical analyses be done on the data to prove Abbott's claims of drug effectiveness and safety. Laboratory automation systems include many special-purpose instruments that connect to computers for data collection, analysis, and possibly delivery via the computer network to a central processor or database. Computational chemistry, the area in which I am directly involved, will be the focus of the remainder of this discussion.
Computational chemistry spans three primary areas that utilize computationally intensive methods to solve problems:

1. Computer-assisted molecular design. This is the computer version of putting together the ball-and-stick plastic models of chemical compounds. The computer further enables the researcher to visualize a molecular structure, examine it, compute its theoretical properties, and attempt to determine what it is about the molecule that makes it particularly useful for a drug application.
2. X-ray crystallography. This area deals with analysis of the crystalline form of materials to determine the structure of the molecules of a chemical substance. Again, chemical structure is very important.
3. Nuclear magnetic resonance spectroscopy. This involves another physical measurement that helps the chemist determine the structure of a molecule, the spatial relationships between portions of molecules, and the relationships between a molecule (a drug) and a substrate.

The common theme is to try to understand the structures of molecules and how they relate to the chemical actions of these molecules.

These computational activities are far from what would commonly be considered wet chemistry. Although computational chemists do perform experiments, the context of their experiments is the computational domain. Modeling is their world of chemical reality. How well a model matches observed chemical behavior is a good measure of the usefulness of the model. The current methods used in computational chemistry give a broad picture. The goal is not to model every minute detail of a molecule and its behavior, but rather to gain an understanding of how structure and function are related. Two methods can be used to perform experiments to gain such understanding.
The first involves interactive processing, which may include querying a database or looking at a graphic display of the model of a molecule (a three-dimensional depiction of the structure) on a high-resolution, high-performance computer graphics workstation and interacting with the structure in view (using techniques of rotating, zooming, color coding, and so on). Rather than physically holding a plastic model of a structure, one can turn dials and cause the structure to move on a display screen to try to answer a question such as, How do these two molecules fit together? This is the inspection and manipulation stage, the manual part of getting acquainted with these chemical structures.

The second computational method involves batch processing. A computational chemist who has an interesting structure to investigate can do any of a number of things that come under the classification of batch computations. These do not necessarily take very long to run. However, many such computations do. Some of the computations running on VAX/785-type equipment can take from a week to several months of elapsed time-shared
computing time. Such time requirements are not uncommon. Basically, our computers at Abbott are kept running night and day, every day, doing these sorts of computations. Batch computations permit the computational chemist to evaluate the theoretical physical properties of compounds of interest and to simulate the behavior of these compounds, both individually and in interaction with other molecules.

The ultimate goals of interactive and batch processing are to elucidate chemical structures, derive theoretical physical properties, relate these to observed chemical behavior, and deduce what aspects of a chemical compound result in a desired drug action. With the understanding gained from such information, it may be possible to develop a better drug: one that is more easily absorbed by the body, or will not break down in body tissues, or will not have serious side effects, or will be more specific in its behavior.

The computational method is a cyclical one of interactive processing interleaved with batch processing. The computational chemist views the results of the batch processing and may make adjustments in the structure, test new hypotheses, or simply resume the computation where it left off, all in the spirit of experimentation. The computational chemist works in collaboration with the more traditional bench chemist, aiding in developing rational approaches to drug design based on physical and chemical principles.

The computational chemist uses the computer daily in experimental work and conducts computational experiments on computers of varying power and function. Much of the work proceeds in a direction based on chemical intuition from accumulated years of experience. This science of computer modeling and simulation of chemical behavior is still something of an art.
One cannot compute with great precision everything about a reasonably sized molecule because there are not enough computer resources in the world to do so. One has to make approximations involving adjustable parameters. One must sacrifice accuracy for feasibility of computation. In the course of experimentation, the computational chemist must manipulate many interrelated experimental parameters according to a "try-it-and-see" methodology.

ASSESSING ADVANTAGES OF SUPERCOMPUTING

Now the question is, given this cyclical methodology for performing computational experiments, what can a supercomputer do for us? The simplest advantage is that a supercomputer can speed up most calculations. Computations that take 30 days can possibly be accomplished in 1 day, or even in hours. But simply speeding up a calculation is not enough. We want to look for new ways of analyzing our data.
Immediate Feedback for Rapid Results

If we can tighten the loop between the experimental design (the interactive step) and the experimental outcome (the batch step), thus shortening the time between asking the question and seeing the answer, then we can very quickly ask the next question, and the next, thus making creative thought flow more easily. It is similar to the difference between writing a letter and making a telephone call. The feedback is immediate, and channels are followed that perhaps would not be if results came back in a week. When results come more slowly, we are more conservative in the questions we can ask, and we leave many more stones unturned. When the answers come back while the questions are still fresh, the train of thought can continue, and less time is spent trying to remember where one left off and what line of reasoning was being explored. A very important thing that a supercomputer can do is to tighten the loop between inception of an experiment and the final outcome.

Increased Human Input

A second important thing that a supercomputer can do is to put a human into the loop in some of what is now batch, iterative computation. Right now the human element in the process exists only at the initial point when the input data are assembled for a batch computation. At this point a decision must be made as to how far to carry an iterative calculation, and often this decision is based on arbitrary criteria. Then one must wait until the batch calculation is completed, only to find, perhaps, that some of the computational time was wasted on exploring an unfruitful avenue. A human can get into the loop if the time it takes to calculate each iterative step is reduced to the point that a batch calculation becomes interactive, that is, to the point that the iteration interval is short enough that a human can monitor the progress of the iterations continuously and can intervene to take a corrective or exploratory action.
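The idea of putting a human into the loop of an iterative calculation can be sketched in a few lines of Python. This is only an illustrative sketch, not Abbott's software: the one-dimensional "energy" function, the names `minimize_with_monitor` and `stop_when_converged`, and the step size are all hypothetical, and a simple callback stands in for the chemist watching intermediate results at a workstation. After each iteration the monitor sees the current state and may let the run continue, change the step size, or stop it.

```python
from typing import Callable, List, Optional, Tuple

# A monitor receives (iteration, x, energy) and returns (keep_going, new_step).
# new_step is None when the step size should be left unchanged.
Monitor = Callable[[int, float, float], Tuple[bool, Optional[float]]]


def toy_energy(x: float) -> float:
    """Hypothetical stand-in for a molecular energy; its minimum is at x = 2."""
    return (x - 2.0) ** 2


def toy_gradient(x: float) -> float:
    """Derivative of toy_energy with respect to x."""
    return 2.0 * (x - 2.0)


def minimize_with_monitor(
    x0: float, step: float, max_iters: int, monitor: Monitor
) -> Tuple[float, List[float]]:
    """Gradient descent that consults the monitor after every iteration."""
    x = x0
    energies: List[float] = []
    for iteration in range(max_iters):
        x -= step * toy_gradient(x)          # one iterative step
        energy = toy_energy(x)
        energies.append(energy)              # intermediate result for display
        keep_going, new_step = monitor(iteration, x, energy)
        if new_step is not None:
            step = new_step                  # corrective or exploratory action
        if not keep_going:
            break                            # stop instead of wasting time
    return x, energies


def stop_when_converged(
    iteration: int, x: float, energy: float
) -> Tuple[bool, Optional[float]]:
    """Simulated observer: continue only while the energy is still large."""
    return energy >= 1e-6, None


final_x, energies = minimize_with_monitor(0.0, 0.25, 1000, stop_when_converged)
print(f"stopped after {len(energies)} of 1000 allowed iterations, x = {final_x:.4f}")
```

In a pure batch run the stopping rule would have to be fixed in advance, often arbitrarily. Here the monitor could be replaced by any function, including one that waits for input from a user viewing a display of the intermediate results.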
There are several examples of batch activities that can profit from such human intervention. An additional benefit of continuous feedback of results is that the computational chemist can gain new chemical insight when the temporal aspect of a computation is compressed enough to reveal principles that could not have been inferred from a postprocessing review of the results of a batch computation.

Visualization of Results

A third advantage of a supercomputer is that it can enable the visualization of scientific results. Present-day computational chemists are
accustomed to viewing the ball-and-stick models, both in real plastic and in computer-produced graphic displays. This is not the only way to view a molecular structure. An atom is not a ball, nor the bond between atoms a stick. On the contrary, a molecule is more correctly viewed as a malleable cloud of electrons surrounding a number of atomic nuclei that are constrained to move in some rather preferred distances and orientations with respect to one another. From these fundamental particles, physics allows us to derive some physical properties that vary continuously throughout the volume of the molecular structure. We are looking for new ways to visualize these physical properties to gain new insights. The computational chemist who can view the data in just the right way can perhaps discern new patterns, derive new hypotheses, and explore new directions. We at Abbott are looking forward to the enhanced visualization possibilities that a supercomputer will afford us in the display of derived physical data to lead us to the development of new algorithmic approaches.

CREATING AN ENVIRONMENT FOR SUPERCOMPUTING

At Abbott Laboratories we have convinced ourselves that we do need a supercomputer for all of the reasons I have listed. However, we are also convinced that we cannot just put a supercomputer on the floor and turn the users loose on it. We need the whole integrated supercomputer environment. The environment must take into account the fact that if it is too much trouble to put a supercomputer to work for the scientists, they will not use it. They are used to sitting down to their computer terminals every day. They read their electronic mail. They work comfortably at graphics workstations. They manage input and output data sets stored in data files and databases on disk. If they are handed a supercomputer, they must be able to make use of it in the same way that they currently use existing computer resources.
It must fit seamlessly into their environment. This means that we need to be able to supply interactive access to the supercomputer. It cannot just be a batch machine. With the coming of age of Unix in the supercomputer and workstation marketplace and, more importantly, in the physical and chemical sciences, a standard interactive operating system is now a reality. But we want to do more than just interactively submit batch jobs. We also want to run those jobs interactively. We want to be able to tie the computation in progress on the supercomputer into the graphics workstation at our desk, permitting a two-way flow of data: a real-time display of intermediate results of calculations to monitor progress, and interactive input from the user to modify the course of the calculation. We want the flexibility of being able to derive new ways to visualize the data. We need the ability to program the environment to suit our evolving needs and experiments;
hence we need a development platform. The supercomputer is a new kind of tool. It is going to stimulate new thinking if it is put into the hands of people who can use it easily, and this will lead to the development of new computational algorithms.

Gaining Acceptance

At Abbott Laboratories we have had to deal with two obstacles to bringing in a supercomputer. The first, selling it to upper management, was, surprisingly, the easier to overcome. To a great degree this was due to our ability to easily demonstrate the supercomputer's usefulness in a particular pilot application. We were able to show how a supercomputer could save approximately 1 year of labor for a crystallographer in doing a refinement for determining a crystal structure. Management can see the dollars and cents of that. We were able to assert the competitive advantage of being the first pharmaceutical company to purchase a supercomputer. Also significant was the fact that we already had some champions in the ranks of upper management who had drawn their own conclusions, early on, about the scientific promise of a supercomputing environment.

The second, and greater, obstacle to acquiring a supercomputer has been gaining acceptance by the user community. We have heard of this difficulty a number of times from other organizations trying to accomplish the same thing. The problem is that the users do not see a cost-justifiable way of owning this technology and incorporating it into their research. They have short-range goals that involve a slight profit from the application of a supercomputer. Their long-range goals do not include the supercomputing environment as a necessity. A supercomputing environment is new and unfamiliar to them. We need to educate the users as to what they can do; we need to open up the creative flow. The users perceive a supercomputer not as an opportunity but as a responsibility and a burden thrust upon them.
They feel that if the supercomputer is not fully utilized or if no great breakthroughs come because of its presence, they will be responsible for a perceived failure. They feel that the expectations of management will have to be high to match the capital outlay and operating expenses of a supercomputer. They are researchers and cannot predict or engineer breakthroughs, so it is very difficult for them to stand up and say that they need and can justify the acquisition of a supercomputer. We have as yet only partially sold acceptance of supercomputers to the users.

Defining Computing Requirements

At Abbott we have gone through the standard steps for acquiring any kind of computer system, starting with defining our computing requirements. Interviews with the computational chemists and other researchers indicated the need for a machine with a performance class well into the range of supercomputers. Given this, the next step was to define other measurables and deliverables that would affect vendor selection. For Abbott, the first of these was a list of turnkey applications: three or four third-party packages that we said we absolutely must have. These codes are our bread and butter in pharmaceutical computational research; they already exist commercially; we know how to put them to work in our research today. Next, we specified that we must have a strong program development platform: an interactive operating system (that works), a compiler (that works) with the capability to optimize code to suit the computer architecture, file management and data integrity, program optimization tools, subroutine libraries, and so forth. Making use of this development platform will allow Abbott to realize a competitive advantage in new computational methods. Next, we insisted on connectivity, the ability to make this machine talk to all of the computer hardware we have on our site.

Protecting Corporate Investment

Finally, we laid out the specifications for things that are not so easily quantifiable but that have played a very large role in our decision-making process for vendor selection. To try to help protect our investment and achieve our desired goals, we considered the following:

1. Upgrade path. What new hardware will be developed, and, more importantly, how will everything done this year move to that new piece of hardware? How long will a machine be down while it is being upgraded?
2. Support. Will the vendor help researchers who encounter problems? Will help be available when a machine's performance is less than optimal?
3. Does the vendor understand the purchaser's business? This is extremely important.
A vendor that does not understand your business will not understand what you need, and the vendor's development strategies may not support your strategies. Assessing vendor understanding and support has been very important to those of us in the computational chemistry area.
4. Do the vendor and the purchaser share a common vision? There are so many pieces of hardware available now, including the whole hierarchy from personal computers to the supercomputer, networks, and so forth. Does the vendor share your vision of what you are trying to put in place in your company? Every company has different attitudes about departmental computing versus central computing, operating systems, graphics requirements, and so on. Making sure that the vendor and purchaser share a common vision ensures a greater ability to achieve desired goals.
MAKING A DECISION

After this whole process at Abbott of evaluating computers, evaluating vendors, and evaluating whether or not we really want to acquire our own supercomputer (or whether we want simply to get some time-sharing on another computer), we have decided that the time is now. The simple reason is summed up in an adage that applies to everything from personal computers to supercomputers: There will always be something better next year. We at Abbott have made that decision even though there are several players in the picture, each with distinct advantages, whether available today or promised for the future. There are the Crays that have been available for a long time and some machines that are just 80 percent developed now. In selecting a vendor it is necessary to consider the whole environment, the whole picture, and to keep in mind that as soon as a piece of hardware is bought, there will be something better next month or next year that can be taken advantage of in the next round, when fiscal possibilities allow.