CAMPARI Documentation - Introduction
Download and Install
Instructions on how to obtain a copy of CAMPARI are provided on the Download page. Installation instructions are found in the corresponding section of the documentation. Please note that we have used CAMPARI almost exclusively on commodity hardware with Intel and AMD chips running more or less standard Linux operating systems of the RedHat (CentOS), SuSE, or Ubuntu/Debian flavors. CAMPARI also runs successfully on recent CPU-only partitions at the Swiss Supercomputing Center (CSCS) (of both Piz Daint and Alps), which are or used to be Cray-based systems. If you manage to build the program on unusual hardware or other operating systems, please let us and the rest of the user base know via our SourceForge page. We have also compiled CAMPARI on both Windows and MacOS (with some limitations) but have not used it extensively there.
Start Running Simulations
Unlike most other simulation software, CAMPARI uses only a single executable (possibly in different flavors, compiling in MPI or OpenMP support or both) that is driven exclusively by a key-file. The only exception to this is the data mining workflow that obviates the notion of a simulation system (executable "camp_ncminer"). In our opinion, having all or most of the offered functionality "under the same hood" simplifies working with CAMPARI. The basic command line for running any CAMPARI simulation or analysis task in a shared memory parallel execution mode will therefore be (assuming a Unix-shell):
cd ${EXEC_DIR}/
${CAMPARI_HOME}/bin/{ARCH}/campari_threads -k ${EXEC_DIR}/sample.key > sample.log
CAMPARI always writes to the current working directory, and will overwrite output files with nonunique names. Again, this is intended, and might differ from the workflow implemented by other simulation software. In serial or OpenMP-only execution, the log-file will be written to standard out and should normally be captured for inspection ("sample.log" above). The above command line also makes it clear that a thorough understanding of the relevant keywords is what makes CAMPARI an efficient tool in the users' hands.
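Because output files are overwritten in the current working directory, one way to keep results from separate runs apart is to start each run in a fresh directory. The following is a minimal sketch of that idea and not part of CAMPARI itself: the helper name run_in_fresh_dir, the run_NNN directory naming, and the CAMPARI_EXE variable are hypothetical, and ${ARCH} is assumed to be a shell variable standing in for the {ARCH} placeholder used above.

```shell
#!/bin/sh
# Sketch: run each job in its own fresh directory so that output files
# from earlier runs are not overwritten.
# Hypothetical names (not from the CAMPARI documentation):
#   run_in_fresh_dir, the run_NNN naming scheme, CAMPARI_EXE.

run_in_fresh_dir() {
    key_file=$1      # absolute path to the key-file
    base_dir=$2      # parent directory that collects all runs
    # Executable to call; the default path assumes ARCH is set in the shell.
    exe=${CAMPARI_EXE:-${CAMPARI_HOME}/bin/${ARCH}/campari_threads}

    # Pick the first unused run_NNN directory under base_dir.
    n=1
    while [ -e "${base_dir}/run_$(printf '%03d' "$n")" ]; do
        n=$((n + 1))
    done
    run_dir="${base_dir}/run_$(printf '%03d' "$n")"
    mkdir -p "$run_dir" || return 1

    # Change directory inside a subshell so that the caller's working
    # directory is left untouched; the log is captured next to the output.
    ( cd "$run_dir" && "$exe" -k "$key_file" > sample.log ) || return 1

    # Report where the results ended up.
    printf '%s\n' "$run_dir"
}
```

A call such as run_in_fresh_dir "${EXEC_DIR}/sample.key" "${EXEC_DIR}" would then leave the first run in run_001, the second in run_002, and so on. The key-file path should be absolute, since the working directory changes before the executable starts.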
A comprehensive list of all keywords is part of the documentation, and typically this is the file users will refer to most often. In the beginning, however, it may prove useful to have a look at our tutorials. The latter have the advantage of leading the practitioner through a series of steps that construct a key-file suitable for the computation the respective tutorial is concerned with. This allows an easier point of entry as it filters the information down to the scope of a specific simulation. Keywords, input and output files, and some special topics (e.g., sections of parameter files) are html-linked throughout the documentation allowing easy navigation through sets of related content (also from the tutorials).
All CAMPARI simulations require at least two auxiliary input files: a file with a sequence specifying the system to be simulated and a parameter file. There is a section of the documentation dedicated to explaining the formats of user-generated input files (such as the sequence file). The parameter files are editable, but are shipped with CAMPARI under the assumption that they are most commonly going to be used "as is". A list is provided here, and their structure is explained in another dedicated section of the documentation.
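Since every simulation depends on at least a key-file, a sequence file, and a parameter file, it can be convenient to confirm that these are readable before launching a long run. The sketch below is only an illustration under stated assumptions: the helper name check_inputs is hypothetical, and the caller has to supply the actual paths, since the script does not parse the key-file.

```shell
#!/bin/sh
# Sketch: verify that the input files a run depends on are readable
# before starting it. The helper name check_inputs is hypothetical; the
# caller lists the paths (key-file, sequence file, parameter file).

check_inputs() {
    missing=0
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            # Report each problem on standard error and keep checking.
            printf 'missing or unreadable input: %s\n' "$f" >&2
            missing=$((missing + 1))
        fi
    done
    # Succeed only if every listed file was readable.
    [ "$missing" -eq 0 ]
}
```

The function returns a nonzero status if any file is missing, so it can be chained in front of the actual command line with &&.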
In the executable implementing the dedicated data mining workflow, the required input files are different, but the general execution pattern is the same:
cd ${EXEC_DIR}/
${CAMPARI_HOME}/bin/{ARCH}/camp_ncminer_threads -k ${EXEC_DIR}/data_mining.key > data_mining.log
You can consult the corresponding keyword section or Tutorial 14 for further details.
Look at the Output
When running simulations with it, CAMPARI can print spatial information (trajectory data) of the system evolving under some form of sampling algorithm in a number of community formats. CAMPARI will also produce a variable number of additional output files that are documented in detail elsewhere. Almost all of the analyses can be performed "on-the-fly" meaning that data are accumulated and/or averaged throughout the simulation. This is unusual but can be an advantage as it circumvents writing large amounts of data to disk. Such on-the-fly analyses can also be augmented, and new ones can be added by means of user-supplied Python code (a feature available since version 5.0). For quantities which rely on precise details of fluctuations (such as most free energy estimates) or average across many identical particles (like water-water pair correlation functions), this logic can offer a favorable trade-off since it will allow many more samples to be analyzed than conventional means would typically allow. Such a logic can even be required: for instance, for quantities that are not computable after the fact, such as estimates dependent on particle velocities. It is of course possible to use the key-file to disable all auxiliary analysis and analyze the trajectory data during post-processing. This would mimic the more conventional workflow in the simulation community. In exactly the same vein, CAMPARI can analyze a preexisting trajectory in a special analysis mode. This means it can be, with minimal modifications, used to perform mundane but tedious tasks such as the conversion of trajectories between different formats. Generally speaking, in this analysis mode, the exact same type of information can be utilized that is available during simulations. This can be a fundamental advantage over conventional trajectory analysis tools, which generally treat trajectory files as sets of coordinate vectors only.
Optimal Use of CAMPARI
The most important key toward performing meaningful simulation tasks in any molecular simulation software is a sound understanding of statistical mechanics. It is of course outside of the scope of the software and this documentation to address this. CAMPARI does, however, attempt to simplify things as follows:
- CAMPARI attempts to simplify certain auxiliary tasks such as the generation of random starting structures for running otherwise identical simulations in "batch"-mode.
- We have attempted to write the documentation in such a way that it explicitly mentions technical issues ("closet skeletons") pertaining to implemented simulation algorithms. This will not always be comprehensible to nonexperts but should allow more experienced users to make appropriate choices more easily. While reading of the corresponding literature is indispensable, even that will rarely cover all implementation details.
- CAMPARI allows a great deal of customization in crucial parts of the algorithms, such as for interaction functions (force field). It can therefore be used as an efficient exploration tool for canonical elements and assumptions of and in force field design.
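The "batch"-mode use mentioned in the first item above can be sketched as a small driver that starts several otherwise identical runs from one key-file, each in its own directory. This is an illustration only: the names run_batch, rep_K, and CAMPARI_EXE are hypothetical, ${ARCH} is assumed to be a shell variable, and whether the replicas actually start from different structures depends on the key-file settings, which the sketch does not address.

```shell
#!/bin/sh
# Sketch: launch N otherwise identical runs from one key-file, one after
# another, each in its own directory.
# Hypothetical names: run_batch, the rep_K directories, CAMPARI_EXE.

run_batch() {
    key_file=$1   # absolute path to the shared key-file
    n_runs=$2     # number of replicas to run
    base_dir=$3   # parent directory for the rep_K directories
    exe=${CAMPARI_EXE:-${CAMPARI_HOME}/bin/${ARCH}/campari_threads}

    i=1
    while [ "$i" -le "$n_runs" ]; do
        rep_dir="${base_dir}/rep_${i}"
        mkdir -p "$rep_dir" || return 1
        # Each replica writes its output and log into its own directory.
        ( cd "$rep_dir" && "$exe" -k "$key_file" > "rep_${i}.log" ) || return 1
        i=$((i + 1))
    done
}
```

For example, run_batch "${EXEC_DIR}/sample.key" 8 "${EXEC_DIR}" would produce directories rep_1 through rep_8. The runs execute sequentially here; distributing them over a queueing system is left to the local setup.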
Most speed-limiting algorithms within CAMPARI have gone through some form of performance optimization, and their workload can be distributed to all the cores of a shared memory architecture by means of OpenMP. Nonetheless, CAMPARI cannot compete with the speed of a program such as GROMACS, in particular for the types of applications GROMACS has been optimized for, i.e., explicit solvent, Cartesian-space molecular dynamics calculations with 3- or 4-point water models where the majority of the system is solvent. By contrast, CAMPARI maintains, for example, double-precision arithmetic throughout all computations. It is also written in a form that allows programmers with some Fortran experience to make modifications and extensions relatively quickly. CAMPARI also maintains an internal representation of the systems it can simulate. This is normally an advantage for the user (assuming the systems of interest are supported of course) and in many analysis-related tasks, but it does affect both performance and code size adversely. Of course, with a small developer core, CAMPARI cannot realistically be compared to software packages grown over the course of decades such as AMBER, CHARMM, or the aforementioned GROMACS. It is, however, being actively maintained and developed, is well-tested, and offers some unique features.
Modifying CAMPARI
We provide a preliminary section on development. This is meant as a first place of reference for those who wish to edit CAMPARI to add or modify features. We hope that in the future users will make greater use of the SourceForge pages, so that these pages can eventually serve as the knowledge repository for working with the code that we originally anticipated.
Citing CAMPARI in Your Work
One of the motivations for creating CAMPARI was to use Monte Carlo sampling on biomolecules with a novel implicit solvation model. This combination is one of the unique features within CAMPARI. The implicit solvent model - termed ABSINTH - is introduced in the following publication.
- Andreas Vitalis, Rohit V. Pappu: ABSINTH: A new continuum solvation model for simulations of polypeptides in aqueous solutions. Journal of Computational Chemistry, 30 (5): 673-699 (2009).
- Andreas Vitalis, Rohit V. Pappu: Methods for Monte Carlo Simulations of Biomacromolecules. Annual Reports in Computational Chemistry, 5: 49-76 (2009).
- Andreas Vitalis and Amedeo Caflisch. Efficient Construction of Mesostate Networks from Molecular Dynamics Trajectories. J. Chem. Theor. Comput. 8 (3), 1108-1120 (2012) (tree-based clustering)
- Nicolas Blöchliger, Andreas Vitalis, and Amedeo Caflisch. A scalable algorithm to order and annotate continuous observations reveals the metastable states visited by dynamical systems. Comput. Phys. Comm. 184 (11), 2446-2453 (2013) (progress index method)
- Andreas Vitalis and Amedeo Caflisch. Equilibrium sampling approach to the interpretation of electron density maps. Structure, 22 (1), 156-167 (2014) (spatial density restraints)
- Andreas Vitalis and Rohit V. Pappu. A simple molecular mechanics integrator in mixed rigid body and dihedral angle space. J. Chem. Phys., 141 (3), 034105 (2014) (molecular dynamics integrators in rigid-body/torsional space)
- Nicolas Blöchliger, Amedeo Caflisch, and Andreas Vitalis. Weighted distance functions improve analysis of high-dimensional data: Application to molecular dynamics simulations. J. Chem. Theor. Comput. 11 (11), 5481-5492 (2015) (dynamic and static weights for data mining)
- Marco Bacci, Andreas Vitalis, and Amedeo Caflisch. A molecular simulation protocol to avoid sampling redundancy and discover new states. Biochim. Biophys. Acta, 1850 (5), 889-902 (2015) (progress index-guided sampling (PIGS))
- C. Esposito and A. Vitalis. Precise estimation of transfer free energies for ionic species between similar media. Phys. Chem. Chem. Phys. 20 (42), 27003-27010 (2018) (use of compartmentalization potentials to calculate transfer free energies)
- Marco Bacci, Amedeo Caflisch, and Andreas Vitalis. On the removal of initial state bias from simulation data. J. Chem. Phys., 150 (10), 104105 (2019) (a new way of extracting thermodynamic weights from biased simulation data)
- J. R. Marchand, T. Knehans, Amedeo Caflisch, and Andreas Vitalis. An ABSINTH-based protocol for predicting binding affinities between proteins and small molecules. J. Chem. Inf. Model. 60 (10), 5188-5202 (2020) (use of molecular simulation software for virtual screening; further development of ABSINTH)