CAMPARI Tutorial 8

Tutorial 8: A Very Brief Introduction to Minimizations in Rigid-Body / Torsional Space

Preface:

Given a potential energy landscape and a starting conformation for a high-dimensional system, minimization attempts to steer the system toward nearby minima on that landscape. This is often used to relax high energy starting structures, with the relaxed structures then being used as input for molecular dynamics simulations. You may have been surprised to find that in Tutorial 5, Tutorial 6, and Tutorial 7 we circumvented the issue of minimization altogether by using a combination of different methods. This mini-tutorial attempts to show how you could replace one of the setup steps in those tutorials with a minimization run.
Before we start, there are a few general points to consider:

Typically, minimization occurs in Cartesian space, and the enforcement of constraints is often possible only when the simplest minimizer (steepest descent) is used. CAMPARI offers minimization in Cartesian space if no holonomic constraints are in use. Often, it is more appropriate to pose the minimization problem entirely in rigid-body/torsional space, and the reader is referred to the documentation for what difficulties this poses from a theoretical point of view (→ FMCSC_MINI_MODE). The benefit obviously lies in the fact that constraints such as the rigidity of water molecules are innate to the algorithm.
In dilute systems (including those obtained with a continuum model of solvation), the energy landscape is often too flat. Explicit particles at high density would introduce pronounced ruggedness that almost always guarantees that nearby minima exist whereas in dilute systems particles relax along a very smooth gradient. This means that well-defined minima are frequently not found at all, or are very far away from the starting structure, which makes the procedure inefficient and difficult to interpret. Of course, it is still possible to relax locally resolvable conflicts (such as most steric conflicts) quickly for such systems.
Minimizers are typically not meant and designed to find special classes of local minima, in particular not the global minimum. Given finite step sizes and numerical issues, this unfortunately means that there are almost no guarantees whatsoever about the accuracy of parsing the local environment. In detail, shallow and small minima may be easily missed, and the final result may depend on the algorithm used. Rigorous energy landscape exploration will therefore require a sound understanding of the numerical implementation of different minimizers.
For global minimum searches, CAMPARI offers an annealing-type method (option 4 to FMCSC_MINI_MODE) that differs fundamentally from the other minimizers. This method is not discussed further in this tutorial.

Step-by-Step:

Step 1 - Switching to minimization

Here we will modify an existing key-file in such a way that it will allow us to run a basic minimization on one of the structures generated in any of the previous three tutorials. These are suitable as those simulations employed explicit solvent (see comments above). Let us assume you completed Tutorial 5. Simply take the key-file that you used to start your last production run and copy it and any auxiliary files into a fresh directory. Most importantly, you should have a pdb file of a reasonably overlap-free conformation. Then, change the existing keyword determining the sampling method as follows:

FMCSC_DYNAMICS 6
We should also make sure that log-output is frequent (and updated frequently) and that the degrees of freedom are the rigid-body/torsional ones. Thus, add or modify existing keywords:

FMCSC_CHECKFREQ 10 FMCSC_FLUSHTIME 0.25 FMCSC_CARTINT 1
You could attempt to run this immediately without setting any further options. Note that if you accidentally minimize waters in Cartesian space, the energy tends to decrease dramatically because water molecules deform (they may even lack bonded potentials entirely, which will eventually cause a crash). Additional modifications to the key-file and to the commands below may be needed if you decided to work with a different tutorial as template. Note that if you work from tutorial 6, you cannot necessarily expect convergence: this tutorial uses a heterogeneous mixture (polymer, ions, water) with prominent long-range electrostatic forces. Thus, you might want to limit the maximum number of force evaluations to a value that allows you to proceed at a reasonable pace. The rates of energy decrease can still be measured and compared across the different runs below.
As a starter, try minimizing without setting any additional options:

campari_threads -k water.key >& log1 &
Look at the growing output in "log1" as the simulation is running. Every 10 steps (requested by FMCSC_CHECKFREQ), CAMPARI will print out current statistics concerning minimization progress. Unless you started from a structure still containing large steric overlaps, this run should finish before too long by reaching the gradient RMS convergence criterion although this depends of course on the choice of inputs. If the calculation takes too long, it is recommended to switch to a smaller system or limit the value for FMCSC_NRSTEPS, which in minimization is the maximum number of allowed energy/gradient evaluations. If a run finishes this way, the evolution of the potential energy (convergence rate) can still be analyzed.
In "log1", you may note that the changes in energy along 10 steps still seemed large even if the run converged. This is due to the large convergence criterion set by default.

Step 2 - Tuning steepest-descent

If you consult "log1", you will also be able to find information on which minimizer CAMPARI used. The default one is steepest-descent. Five parameters will be listed in the same log-file, and they can be controlled through the key-file as follows:

FMCSC_MINI_MODE 1 FMCSC_MINI_GRMS 0.02 FMCSC_MINI_STEPSIZE 0.1 FMCSC_MINI_XYZ_STEPSIZE 0.1 FMCSC_MINI_ROT_STEPSIZE 0.5 FMCSC_MINI_INT_STEPSIZE 0.5
The gradient convergence criterion is what made the previous run terminate so quickly. It is given in kcal/mol because all gradients used are unit-normalized, and we reduce it now to 0.02 kcal/mol. The last three keywords are base step sizes in units of Å (for rigid-body translation → FMCSC_MINI_XYZ_STEPSIZE) and degrees, respectively (for rigid-body rotation and dihedral angles → FMCSC_MINI_ROT_STEPSIZE and FMCSC_MINI_INT_STEPSIZE). They constitute "base" step sizes because CAMPARI will - irrespective of algorithm - use a dynamic step size protocol to make minimization efficient. This is required due to the heterogeneity between different degrees of freedom in rigid-body / torsional space (not really an issue if you do this for a simple box of water). Part of this dynamic nature comes in through the outside scaling factor given by FMCSC_MINI_STEPSIZE. It has formal units of mol/kcal, and we pick 0.01 mol/kcal above. This factor is basically what is changed dynamically (except when using the BFGS method). This means that the primary point of user control is to change the relative step sizes with which different classes of degrees of freedom are propagated. We left the three corresponding keywords (FMCSC_MINI_INT_STEPSIZE is important only if you work from Tutorial 6) at default values. After appending the key-file with the keywords above, execute again:

campari_threads -k water.key >& log2 &
You should note that this run takes longer due to the much more stringent convergence criterion. Next modify the key-file as follows:

FMCSC_MINI_XYZ_STEPSIZE 0.3 FMCSC_MINI_ROT_STEPSIZE 0.3
These modifications make the base increment for rigid-body translation large compared to that for rigid-body rotation. Do again:

campari_threads -k water.key >& log3 &
If you compare the output in "log2" and "log3", you should note that the run performed last progressed at a different pace down the gradient (ideally, plot the respective potential energy traces). This is because of different combinations of translational and rotational base step sizes imply a fundamentally different search direction. In the pure water case, the last settings should be inferior, but in a system where most of the energetic frustration comes from solutes, this may be flipped (at least initially). As one last illustration, modify the key-file as follows:

FMCSC_MINI_STEPSIZE 0.02 FMCSC_MINI_XYZ_STEPSIZE 0.1 FMCSC_MINI_ROT_STEPSIZE 0.5
Here, we scale down the outside step size scale by a factor of 5:

campari_threads -k water.key >& log4 &
Due to the dynamic step-scaling, the results here should be quite similar to those in "log2" in terms of convergence rate (note, however, that the scaling factor is needed and used whenever forced step size reductions due to overstepping occur; it is therefore not at all trivial to rigorously predict the impact of the choice for FMCSC_MINI_STEPSIZE).

Step 3 - More complex minimization techniques

CAMPARI offers two further canonical minimization algorithms, the conjugate-gradient (CG) method and a specific implementation of a quasi-Newton technique (BFGS). Conjugate-gradient minimization uses the same parameters as steepest-descent. Modifying the key-file accordingly is therefore trivial:

FMCSC_MINI_MODE 2 FMCSC_MINI_STEPSIZE 0.1
Note that we reset the outside scale factor for step sizes to 0.1 mol/kcal. Run again:

campari_threads -k water.key >& log5
You will note that the reported values for the RMS gradient are qualitatively different for similar values of the potential energy indicating that the system is moving along a different path. You may also note that the evolution of the potential energy differs qualitatively. The rate of convergence is different (should overall be faster), and the method will occasionally stall while trying to find suitable directions to search in. Because CAMPARI will persist in trying to find a suitable search direction, both the conjugate gradient and the BFGS minimizers can "waste" many energy/force evaluations on unsuccessful directions. Lastly, the CG run will probably find a lower local minimum than the previous runs. The best performance comparison would be to plot the total energy against the number of force evaluations (or CPU time).
The BFGS method attempts to compute and maintain a numerical estimate of the Hessian for the current position on the landscape. This in itself should clarify that the success of the method will be parameter- and system-dependent since an inaccurate estimate of the Hessian will actually impede minimization. The BFGS method introduces two new parameters, a tolerance setting for occasionally accepting uphill steps, and a memory length used to compute the Hessian:

FMCSC_MINI_MODE 3 FMCSC_MINI_UPTOL 0.0 FMCSC_MINI_XYZ_STEPSIZE 0.05 FMCSC_MINI_MEMORY 25
We decide not to accept any uphill propagation and set a memory length of 25 steps. Then run:

campari_threads -k water.key >& log6
It is quite challenging to predict whether a more complicated technique like BFGS will also perform better. The average curvature of the energy landscape for a dense, polar liquid is very high and consequently contains much useful information. The opposite is true in dilute systems where curvature is low. The problem is that this information is not trivially usable in a discrete scheme, and step sizes matter a lot. Just like the minimization progress itself, the added value of the Hessian estimation depends on these step sizes (and this problem is made worse by working in an internal coordinate space), and the BFGS method may easily become the least efficient of the three minimizers. It is also prone to stability issues if there are dihedral angle degrees of freedom to consider in addition to rigid-body degrees of freedom (see FMCSC_MINI_INT_STEPSIZE). If minimization is indeed the method of choice for a given problem, we recommend to try all algorithms. The BFGS method (FMCSC_MINI_MODE is 3) offers, from a theoretical vantage point, the best peak performance. The steepest descent method (FMCSC_MINI_MODE is 1) is most robust but generally slowest while the CG method (FMCSC_MINI_MODE is 2) may offer a suitable compromise between robustness and efficiency.
However, it has to be said that for a system that it is relatively far away from an energetic minimum and contains many tightly coupled degrees of freedom, the fastest minimization technique by far is the stochastic minimizer (FMCSC_MINI_MODE is 4), which is based on (temperature) annealing and molecular dynamics. This technique is also vastly superior in escaping from local minima, which are frequently uninteresting. It mainly requires setting four parameters as follows. The algorithm spends a fraction of the total steps at a high temperature set by keyword FMCSC_TEMP before cooling to a target temperature with a rate detemined by the coupling time of the thermostat in use. To give it a try for the system at hand, add/append keywords as follows:

FMCSC_MINI_MODE 4 FMCSC_MINI_SC_TBATH 1.0 # in K FMCSC_MINI_SC_HEAT 0.25 # fraction of NRSTEPS FMCSC_TSTAT_TAU 0.2 # in ps FMCSC_TEMP 330.0 # in K FMCSC_NRSTEPS 5000 # must be adjusted explicitly FMCSCS_MINI_SC_SDSTEPS 0 # no need
Here we use 25% of 5000 steps at 330K before cooling down to 1K. While the system is at the high sampling temperature, termination cannot occur (unless nonsensical values for FMCSC_TEMP or FMCSC_MINI_GRMS have been chosen). Thus, the extent of the run should always be set explicitly.

campari_threads -k water.key >& log7
While it might not always be what you want to do, this stochastic method clearly offers some favorable properties (better global optimization, predictable runtime, natural dynamics) compared to canonical minimzers.

Epilogue - What else to try

To conclude this tutorial, you could attempt three more things. First, you could take the final structure obtained in the very last run and use it as input for a secondary minimization using a canonical minimizer. The minimization of that structure should be a much better-defined process yielding similar results for all canonical minimizers. The second would be to play the game of attempting to find parameters that give you the maximal rate in potential energy relaxation over the entire progress trace (it is quite common for steepest descent to be very competitive early on but slow to converge later). The third would be to compare the final structures of all the converged runs to each other. Some of these may be mutually consistent indicating that minimization found a similar solution to the optimization problem.