Mathematical Challenges from Theoretical/Computational Chemistry


CHAPTER 4 continued

Efficient Generation of Points That Satisfy Physical Constraints
in a Many-Particle System

Prototypical Problem

Consider N particles in a cube in three-dimensional space, each with x, y, and z coordinates in the range [0, L]. The state of the system is then given by specifying the positions r of all N points.

In the statistical mechanics modeling of condensed phases, one typically is interested in restricted sets of particle configurations. For instance, one's interest is often restricted to only those states for which

| r - r | for all 1 i < j N

Imagine that the points represent the locations of the centers of hard spheres of diameter unity. The conditions state that the hard spheres do not overlap one another; that is, the spheres repel each other strongly when they are close together, so that each center-center distance must be greater than unity.

For the present problem, we are interested in the regime where N is large, of the order of 10 to
10. The volume is large enough so that N/L is in t he range (0,2). In the actual problem of common interest for computer simulation of materials, the system will satisfy peri odic boundary conditions. In effect, this means that the restriction above is more precisely stated as

|r - r - nL| for all 1 i < j N

and all vectors n with integer components.

The problem is to generate efficiently states of the system that satisfy these constraints. The states will more than likely be generated by a stochastic process of some sort. A more ambitious goal is to generate states such that all states that satis fy the constraints are equally likely to be generated. (More precisely, if the set of positions {r,...,rN} is regarded as a set of random variables, the joint distribution function for these variables, which is zero when one or more of the constraints is violated, should be a constant f or values that satisfy the constraint.)

This is an example of a problem for which each constraint is relatively easy to state and express in terms of a small number of the variables in the problem, and the number (or measure) of states that are consistent with the constraints is very small co mpared with the total number of states, in fact vanishing exponentially as N increases. This problem is related to that of generating possible states for an atomic fluid whose interatomic potential precludes two atoms from getting close to one a nother. The rigid-sphere interpretation relates to the challenging mathematical problem of existence and characterization of random sphere packings.

Variations on the Prototypical Problem

First Variation. Consider a random walk in a three-dimensional space that consists of N steps of unit length and random direction. Let s for i 1 be the ith step. The position

is the location of the random walker after the ith step, with r being equal to the origin. The only states of interest are those for which these positions never come close to one another. More pr ecisely,

|r - r | for all 0 i < j N

This problem is related to specifying the possible structures of a polymeric molecule. The goal is to generate random walks that satisfy these conditions. As in the previous problem, the method of generation is likely to be probabilistic. A stronger go al would be to generate states with a process such that all states that satisfy the constraint are equally likely to be generated. (More precisely, if each step s of the random walk is specified by a locati on on the surface of a unit sphere, the joint distribution function of the N steps should be zero for sets of steps that violate the constraints and a constant for sets that satisfy the constraints.)

Second Variation. In problems of interest, there may also be some additional constraints on the locations; for example, for some pairs of locations, there might be restrictions of the form

a | r - r | b

When the number of these restrictions is large enough to imply that only a small range of structures can satisfy all the constraints, the problem becomes a special case of the problems that are solved by using the methods of distance geometry, which is d iscussed in more detail in Chapter 3.

Third Variation. The various steps in the random walk of the previous problem might be correlated in the sense that the probability distribution for s might depend on the value of s or perhaps both s and s.

In these problems, there are some constraints that are "easy" to satisfy (e.g., in the first problem each particle must be inside the cubic box; in the second problem, each step must have unit length). Then there are others that are "harder" to deal wi th. Whether a constraint is hard or easy to deal with is related, in general, to whether it is concerned with just one of the basic vectors of the problem or with more than one.

Simplest Strategy

The simplest strategy for generating states that satisfy all the constraints is obviously to generate states that satisfy the easy constraints and then delete those states that violate the hard constraints. This solution is practical, if at all, only f or relatively uninteresting situations. It works for the first problem, for example, only when the density N/L is much lower than the maximum density allowed. The problem is that almost all of the stat es generated will subsequently be deleted by this process.

More precisely, in the first problem one might imagine generating sets of N points, each of which is randomly distributed in the cube, and then discarding sets that violate the conditions specified. The sets not discarded then should be unifor mly and randomly distributed among the states that do satisfy the conditions. The difficulty with this approach is that the probability that a set of randomly generated points satisfies the condition is of order exp(-aN) for large N, wh ere a is a constant that depends on the density N/L . For large N, therefore, much of the computational effort is wasted.

An obviously more efficient procedure is to generate the set of positions one at a time and test each one to ensure that it is far enough from the previous positions before generating the next position. If one violation of the conditions is found this way, no effort need be expended to generate the remaining positions in the set or to test them. Even this is not efficient enough. A significant amount of effort will be expended in generating partial sets of positions, only to find that the set must be discarded because of the value of some position generated late in the sequence. This difficulty is sometimes referred to as the problem of "exponential attrition" because of the exponentially small fraction of sets of positions that are generated succes sfully.

Metropolis Monte Carlo Method

In this strategy one first generates one state that satisfies all the constraints; then this is used as the beginning of a Markov process whose transition probabilities are such that transitions are allowed only to other states that satisfy the constrai nts. In practice, this means that each transition typically involves a change of, at most, one of the coordinates by a very small amount. The two difficulties with this method are the following:

1. The set of states that satisfy the constraints may not be connected, so that with a particular initial state it will be impossible to generate a very large fraction of the states.

2. Even if the set of states that satisfy the constraints is connected, typical Markov processes explore the range of accessible states relatively slowly.

Therefore, some entirely new ideas for dealing with this class of problems would be worthwhile.

Relationship of These Problems to More General Optimization Problems

Some special cases of these problems, especially the first and second variations described above, are closely related to optimization problems that arise in chemical calculations. Chemical optimization problems typically involve minimization of an ener gy or free energy that depends on the positions of atoms or groups of atoms. (See pages 68-77 for a discussion of optimization problems and methods.) In more general problems, the object is not to minimize the energy or free energy but to calculate the typical properties of all the states of low enough energy that they might be populated at the temperature of interest. Such functions typically are very large and positive for certain configurations in which atoms or molecules are very close. It is typi cally true that a configuration in which any one pair of atoms is too close to one another has a high enough energy to make the total energy of the configuration so high that it cannot possibly be a solution of the optimization problem (or so high that it is not thermally populated). Thus, identification of states that are consistent with constraints of the type mentioned can be a useful first step toward solving optimization problems in chemistry.


Previous Section | HTML Home Page | Next Section

NAS Home Page | NAP Home Page | Reading Room | Report Home Page