Molecular Modeling Software: OpenMM
OpenMM is "A high performance toolkit for molecular simulation. Use it as a library, or as an application. We include extensive language bindings for Python, C, C++, and even Fortran. The code is open source and actively maintained on Github, licensed under MIT and LGPL. Part of the Omnia suite of tools for predictive biomolecular simulation." Here's their GitHub repo, and Conda link, though I think they might be relocating their channel to conda-forge.
Here's my opinion, OpenMM is a very powerful, flexible engine that has integration with a variety of other MD engines, supports a variety of molecular models, excellent GPU support, active open-source development, and is the underlying molecular dynamics engine for OpenForceField efforts, but very easy to port to other MD engines via ParmEd. There's also support for enhanced sampling and integration with deep learning libraries. If there was a 21st century, best-software-practices, open-source software for molecular modelling and simulation, OpenMM (or HOOMD) would likely be it.
In reality, I'm not sure how many graduate students/academic labs opt to use OpenMM if the lab has historically used another MD engine. Also, this is a somewhat unfounded observation, but I'm curious if/how much the computer-aided drug design industry has adopted the use of OpenMM. More editorializing, but my graduate work never brought me into tight overlap with the OpenMM world/community, but it certainly seems like a vibrant community that is pushing the development and popularity of molecular modelling and simulation
The OpenMM Public API
I'm mainly summarizing and regurgitating the OpenMM documentation. These are some important terms to know within the OpenMM API:
- System - this object stores information about numbers of particles, particle masses, box information, constraints, and virtual site information. Note the lack of positions, bonding information, integrators, simulation run parameters. The System object also contains your Forces.
- Force - Force objects describe how your particles interact with each other. This is where your force field gets implemented - outlining the molecular model forces in play, the treatment of long range interactions, and even your barostat. This is, broadly, what a Force object is, but there is much more in the details of specific Force objects, like an
openmm.HarmonicBondForce
.- Upon implementation, it's interesting to note that the "Container" is the Force object, and it contains the parameters and particles the obey this force. Sort of turning this concept upside-down, Parmed's atoms and bonds are the objects that contain the interaction parameters of that force. from a Context
- Integrator - This is the integration algorithm by which you progress your particle's positions and simulation over time.
- Context - this object stores information about your particle coordinates, velocities, and specially-defined/parametrized Forces. When you run an actual simulation or produce a trajectory, you will have to start from a Context. Contexts contain information about integrators, which helps distinguish information about your molecular model of your System (forces, masses) from the things that will be used to run your simulation.
- State - this is like a single frame/snapshot/checkpoint within your simulation. It's everything that was being calculated at that particular timestep. If you want peer into your simulation, you will be looking at its State. If you want to report some information, you will be parsing information from the State.
There are numerous tutorials on running OpenMM simulations, but I want to focus on building the OpenMM objects and everything before you need to think about Integrators or States, as this is key for builting interoperability between molecular modelling software.
import simtk.unit as unit
import simtk.openmm as openmm
In this bare-bones model, we will just create an OpenMM.System
object, and the only forces interacting in the system will be the OpenMM.NonbondedForce
. After we add the force
to the system
, we are returned the index of the force
- if you wanted to find it within our system
via system.getForces()
, which is a list of force
objects. Credit to the OpenMM documentation
system = openmm.System() # Create the openmm System
nonbonded_force = openmm.NonbondedForce() # Create the Force object, specifically, a NonbondedForce object
print(system.addForce(nonbonded_force)) # Returns the index of the force we just added
print(system.getForces())
As a brief foray into python-C++ interfaces, these two objects have slightly different (python) addresses, but we will see that they refer to the same C++ object
print(system.getForce(0))
print(nonbonded_force)
Next, we will start creating our particles and nonbonded interaction parameters. This code is contrived for sake of example, but you can imagine there are more sophisticated and relevant ways to add positions, masses, or nonbonded parameters
import itertools as it
import numpy as np
positions = [] # Create a running list of positions
for x,y,z in it.product([0,1,2], repeat=3): # Looping through a 3-dimensional grid, 27 coordinates
# Add to our running list of positions
# Note that these are just ints, we will have to turn them into simtk.Quantity later
positions.append([x,y,z])
# Add the particle's mass to the System object
system.addParticle(39.95 * unit.amu)
# Add nonbonded parameters to our NonbondedForce object - charge, LJ sigma, LJ epsilon
nonbonded_force.addParticle(0*unit.elementary_charge,
0.3350 * unit.nanometer,
0.996 * unit.kilojoule_per_mole)
We can compare the two force
objects from earlier - the NonbondedForce
we created from code and the NonbondedForce
that is returned when we access our system
. Both refer to the same underlying OpenMM.NonbondedForce
object and will reflect the same information. These are just two ways of accessing this object. The system
also agrees with the number of particles we have added.
(system.getForce(0).getNumParticles(), nonbonded_force.getNumParticles(), system.getNumParticles())
The next object to deal with is the OpenMM.Context
, which specifies positions. First we need to convert our list of coordinates into a more-tractable numpy.ndarray
of coordinates, and then turn that into a simtk.Quantity
of our coordinates. Additionally, the OpenMM.Context
constructor requires an integrator (at this point we are trying to build our simulation), and then we can specify the positions within that context
np_positions = np.asarray(positions)
unit_positions = np_positions * unit.nanometer
type(np_positions), type(unit_positions)
integrator = openmm.VerletIntegrator(1.0) # 1 ps timestep
context = openmm.Context(system, integrator) # create context
context.setPositions(unit_positions) # specify positions within context
We can parse some information about our context
, and this is done by getting the state
of our context
.
Note how the time is 0.0 ps (we haven't run our simulation at all).
But we can also parse the potential energy of our context - this is the potential energy given the positions we initialized and forces we specified.
print(context.getState().getTime())
print(context.getState(getEnergy=True).getPotentialEnergy())
What happens to our state
after we've run for some amount of time? We will run for 10 time steps (or 10 ps since our timestep is 1 ps). We can see the the time
reported by our state
has changed, and so has the potentialEnergy
integrator.step(10) # Run for 10 timesteps
print(context.getState().getTime())
print(context.getState(getEnergy=True).getPotentialEnergy())
type(system), type(context), type(integrator), type(nonbonded_force)
This summarizes how system
, force
, context
, state
, and integrator
objects interact with each other within the OpenMM API. Side note, observe where in the API these are stored - at the base level openmm.XYZ
, this next section will move "up a level" to some objects and API that build off these base level API
More practical OpenMM simulations
We just talked about some of the base-layer objects within OpenMM, but often people will "wrap" those base layer objects within an OpenMM.Simulation
object, pass topological (bonding + box information) through a openmm.Topology
object, attach reporter
objects, and then run the simulation.
The Simulation
wraps the topology
, system
, integrator
, and hardware platforms and implicitly creates the Context
.
The Topology
contains information about the atoms, bonds, chains, and residues within your system, in addition to box information.
Reporter objects are used to print/save various information about the trajectory.
Here's some contrived code to quickly make an ethane molecule, atomtype, and parametrize according to OPLSAA
import mbuild as mb
import foyer
import parmed as pmd
from mbuild.examples import Ethane
cmpd = Ethane() # mbuild compound
ff = foyer.Forcefield(name='oplsaa') # foyer forcefield
structure = ff.apply(cmpd) # apply forcefield to compound to get a pmd.Structure
Now we have a parmed.Structure
that has atomtypes and force field parameters.
Conveniently, parmed.Structure
can quickly create an openmm.app.topology
object, and we can see some basic information like numbers of atoms and bonds.
It's also worth observing that this is openmm.app.topology
, within the "application layer", one level above the base layer
print(structure.topology) # the parmed structure can create the openmm topology
print(type(structure.topology))
[a for a in structure.topology.atoms()]
We can now build out some other relevant features of running a simulation
system = structure.createSystem() # the parmed structure can create the openmm system
integrator = openmm.VerletIntegrator(1.0) # create another openmm integrator
Putting it all together, we make our Simluation
object. Once again, note how this is within the app
layer
simulation = openmm.app.Simulation(structure.topology, system, integrator)
type(simulation)
After creating the Simulation
object, we have access to the Context
related to the System
and Integrator
simulation.context
Once again, we need to specify the positions. Fortunately, the parmed.Structure
already uses simtk.Quantity
for its positions.
simulation.context.setPositions(structure.positions)
Before running the simulation, we can get some State
information related to this Context
simulation.context.getState().getTime()
We can now run this simulation and observe that the State
changes
simulation.step(10)
simulation.context.getState().getTime()
The application layer to interact with OpenMM
The OpenMM application layer is largely everything you would need to build and run a simulation with OpenMM, with some compatibility with files from other MD engines. The application layer was built on top of the base library that housed the core OpenMM classes.
Summary
OpenMM is a flexible library and API for molecular modelling. It has well-designed classes wrapped in convenience API for users, while supporting hardware/GPU acceleration with minimial user effort. This may just be me, but I found learning the "vocabulary" and distinction between the base classes was a little hard to understand, but this kind of issue addresses itself over time if one plays around with the API. I am interested to investigate how well one can use OpenMM to build a variety of molecular models and how OpenMM can interface with interconversion libraries such as ParmEd to facilitate engine-flexibility. The devil is always in the details, so building your molecular model in OpenMM or ParmEd is always going to require due diligence to ensure correct output.
Using this notebook
All notebooks within this webiste/repo can be found here