Molecular Modeling Software: OpenMM

11 minute read

Published:

OpenMM

OpenMM is “A high performance toolkit for molecular simulation. Use it as a library, or as an application. We include extensive language bindings for Python, C, C++, and even Fortran. The code is open source and actively maintained on Github, licensed under MIT and LGPL. Part of the Omnia suite of tools for predictive biomolecular simulation.” Here’s their GitHub repo, and Conda link, though I think they might be relocating their channel to conda-forge.

Here’s my opinion, OpenMM is a very powerful, flexible engine that has integration with a variety of other MD engines, supports a variety of molecular models, excellent GPU support, active open-source development, and is the underlying molecular dynamics engine for OpenForceField efforts, but very easy to port to other MD engines via ParmEd. There’s also support for enhanced sampling and integration with deep learning libraries. If there was a 21st century, best-software-practices, open-source software for molecular modeling and simulation, OpenMM (or HOOMD) would likely be it.

In reality, I’m not sure how many graduate students/academic labs opt to use OpenMM if the lab has historically used another MD engine. Also, this is a somewhat unfounded observation, but I’m curious if/how much the computer-aided drug design industry has adopted the use of OpenMM. More editorializing, but my graduate work never brought me into tight overlap with the OpenMM world/community, but it certainly seems like a vibrant community that is pushing the development and popularity of molecular modeling and simulation

The OpenMM Public API

I’m mainly summarizing and regurgitating the OpenMM documentation. These are some important terms to know within the OpenMM API:

  • System - this object stores information about numbers of particles, particle masses, box information, constraints, and virtual site information. Note the lack of positions, bonding information, integrators, simulation run parameters. The System object also contains your Forces.
  • Force - Force objects describe how your particles interact with each other. This is where your force field gets implemented - outlining the molecular model forces in play, the treatment of long range interactions, and even your barostat. This is, broadly, what a Force object is, but there is much more in the details of specific Force objects, like an openmm.HarmonicBondForce.
    • Upon implementation, it’s interesting to note that the “Container” is the Force object, and it contains the parameters and particles that obey this force. Sort of turning this concept upside-down, Parmed’s atoms and bonds are the objects that contain the interaction parameters of that force.
  • Integrator - This is the integration algorithm by which you progress your particle’s positions and simulation over time.
  • Context - this object stores information about your particle coordinates, velocities, and specially-defined/parametrized Forces. When you run an actual simulation or produce a trajectory, you will have to start from a Context. Contexts contain information about integrators, which helps distinguish information about your molecular model of your System (forces, masses) from the things that will be used to run your simulation.
  • State - this is like a single frame/snapshot/checkpoint within your simulation. It’s everything that was being calculated at that particular timestep. If you want peer into your simulation, you will be looking at its State. If you want to report some information, you will be parsing information from the State.

There are numerous tutorials on running OpenMM simulations, but I want to focus on building the OpenMM objects and everything before you need to think about Integrators or States, as this is key for builting interoperability between molecular modeling software.

import simtk.unit as unit
import simtk.openmm as openmm

In this bare-bones model, we will just create an OpenMM.System object, and the only forces interacting in the system will be the OpenMM.NonbondedForce. After we add the force to the system, we are returned the index of the force - if you wanted to find it within our system via system.getForces(), which is a list of force objects. Credit to the OpenMM documentation

system = openmm.System() # Create the openmm System

nonbonded_force = openmm.NonbondedForce() # Create the Force object, specifically, a NonbondedForce object
print(system.addForce(nonbonded_force)) # Returns the index of the force we just added
print(system.getForces())
0
[<simtk.openmm.openmm.NonbondedForce; proxy of <Swig Object of type 'OpenMM::NonbondedForce *' at 0x107166510> >]

As a brief foray into python-C++ interfaces, these two objects have slightly different (python) addresses, but we will see that they refer to the same C++ object

print(system.getForce(0))
print(nonbonded_force)
<simtk.openmm.openmm.NonbondedForce; proxy of <Swig Object of type 'OpenMM::NonbondedForce *' at 0x1071664e0> >
<simtk.openmm.openmm.NonbondedForce; proxy of <Swig Object of type 'OpenMM::NonbondedForce *' at 0x103e46fc0> >

Next, we will start creating our particles and nonbonded interaction parameters. This code is contrived for sake of example, but you can imagine there are more sophisticated and relevant ways to add positions, masses, or nonbonded parameters

import itertools as it
import numpy as np
positions = [] # Create a running list of positions
for x,y,z in it.product([0,1,2], repeat=3): # Looping through a 3-dimensional grid, 27 coordinates
    # Add to our running list of positions
    # Note that these are just ints, we will have to turn them into simtk.Quantity later
    positions.append([x,y,z]) 
    
    # Add the particle's mass to the System object
    system.addParticle(39.95 * unit.amu)
    
    # Add nonbonded parameters to our NonbondedForce object - charge, LJ sigma, LJ epsilon
    nonbonded_force.addParticle(0*unit.elementary_charge, 
                                0.3350 * unit.nanometer,
                               0.996 * unit.kilojoule_per_mole)

We can compare the two force objects from earlier - the NonbondedForce we created from code and the NonbondedForce that is returned when we access our system. Both refer to the same underlying OpenMM.NonbondedForce object and will reflect the same information. These are just two ways of accessing this object. The system also agrees with the number of particles we have added.

(system.getForce(0).getNumParticles(), nonbonded_force.getNumParticles(), system.getNumParticles())
(27, 27, 27)

The next object to deal with is the OpenMM.Context, which specifies positions. First we need to convert our list of coordinates into a more-tractable numpy.ndarray of coordinates, and then turn that into a simtk.Quantity of our coordinates. Additionally, the OpenMM.Context constructor requires an integrator (at this point we are trying to build our simulation), and then we can specify the positions within that context

np_positions = np.asarray(positions)
unit_positions = np_positions * unit.nanometer
type(np_positions), type(unit_positions)
integrator = openmm.VerletIntegrator(1.0) # 1 ps timestep
context = openmm.Context(system, integrator) # create context
context.setPositions(unit_positions) # specify positions within context

We can parse some information about our context, and this is done by getting the state of our context. Note how the time is 0.0 ps (we haven’t run our simulation at all). But we can also parse the potential energy of our context - this is the potential energy given the positions we initialized and forces we specified.

print(context.getState().getTime())
print(context.getState(getEnergy=True).getPotentialEnergy())
0.0 ps
-0.3682566285133362 kJ/mol

What happens to our state after we’ve run for some amount of time? We will run for 10 time steps (or 10 ps since our timestep is 1 ps). We can see the the time reported by our state has changed, and so has the potentialEnergy

integrator.step(10) # Run for 10 timesteps
print(context.getState().getTime())
print(context.getState(getEnergy=True).getPotentialEnergy())
10.0 ps
-0.5352763533592224 kJ/mol
type(system), type(context), type(integrator), type(nonbonded_force)
(simtk.openmm.openmm.System,
 simtk.openmm.openmm.Context,
 simtk.openmm.openmm.VerletIntegrator,
 simtk.openmm.openmm.NonbondedForce)

This summarizes how system, force, context, state, and integrator objects interact with each other within the OpenMM API. Side note, observe where in the API these are stored - at the base level openmm.XYZ, this next section will move “up a level” to some objects and API that build off these base level API

More practical OpenMM simulations

We just talked about some of the base-layer objects within OpenMM, but often people will “wrap” those base layer objects within an OpenMM.Simulation object, pass topological (bonding + box information) through a openmm.Topology object, attach reporter objects, and then run the simulation.

The Simulation wraps the topology, system, integrator, and hardware platforms and implicitly creates the Context.

The Topology contains information about the atoms, bonds, chains, and residues within your system, in addition to box information.

Reporter objects are used to print/save various information about the trajectory.

Here’s some contrived code to quickly make an ethane molecule, atomtype, and parametrize according to OPLSAA

import mbuild as mb
import foyer
import parmed as pmd
from mbuild.examples import Ethane
cmpd = Ethane() # mbuild compound
ff = foyer.Forcefield(name='oplsaa') # foyer forcefield
structure = ff.apply(cmpd) # apply forcefield to compound to get a pmd.Structure
/Users/ayang41/Programs/foyer/foyer/validator.py:132: ValidationWarning: You have empty smart definition(s)
  warn("You have empty smart definition(s)", ValidationWarning)
/Users/ayang41/Programs/foyer/foyer/forcefield.py:248: UserWarning: Parameters have not been assigned to all impropers. Total system impropers: 8, Parameterized impropers: 0. Note that if your system contains torsions of Ryckaert-Bellemans functional form, all of these torsions are processed as propers
  warnings.warn(msg)

Now we have a parmed.Structure that has atomtypes and force field parameters. Conveniently, parmed.Structure can quickly create an openmm.app.topology object, and we can see some basic information like numbers of atoms and bonds. It’s also worth observing that this is openmm.app.topology, within the “application layer”, one level above the base layer

print(structure.topology) # the parmed structure can create the openmm topology
print(type(structure.topology))
[a for a in structure.topology.atoms()]
<Topology; 1 chains, 1 residues, 8 atoms, 7 bonds>
<class 'simtk.openmm.app.topology.Topology'>





[<Atom 0 (C) of chain 0 residue 0 (RES)>,
 <Atom 1 (H) of chain 0 residue 0 (RES)>,
 <Atom 2 (H) of chain 0 residue 0 (RES)>,
 <Atom 3 (H) of chain 0 residue 0 (RES)>,
 <Atom 4 (C) of chain 0 residue 0 (RES)>,
 <Atom 5 (H) of chain 0 residue 0 (RES)>,
 <Atom 6 (H) of chain 0 residue 0 (RES)>,
 <Atom 7 (H) of chain 0 residue 0 (RES)>]

We can now build out some other relevant features of running a simulation

system = structure.createSystem() # the parmed structure can create the openmm system
integrator = openmm.VerletIntegrator(1.0) # create another openmm integrator

Putting it all together, we make our Simluation object. Once again, note how this is within the app layer

simulation = openmm.app.Simulation(structure.topology, system, integrator)
type(simulation)
simtk.openmm.app.simulation.Simulation

After creating the Simulation object, we have access to the Context related to the System and Integrator

simulation.context
<simtk.openmm.openmm.Context; proxy of <Swig Object of type 'OpenMM::Context *' at 0x1153785d0> >

Once again, we need to specify the positions. Fortunately, the parmed.Structure already uses simtk.Quantity for its positions.

simulation.context.setPositions(structure.positions)

Before running the simulation, we can get some State information related to this Context

simulation.context.getState().getTime()
Quantity(value=0.0, unit=picosecond)

We can now run this simulation and observe that the State changes

simulation.step(10)
simulation.context.getState().getTime()
Quantity(value=10.0, unit=picosecond)

The application layer to interact with OpenMM

The OpenMM application layer is largely everything you would need to build and run a simulation with OpenMM, with some compatibility with files from other MD engines. The application layer was built on top of the base library that housed the core OpenMM classes.

Summary

OpenMM is a flexible library and API for molecular modeling. It has well-designed classes wrapped in convenience API for users, while supporting hardware/GPU acceleration with minimial user effort. This may just be me, but I found learning the “vocabulary” and distinction between the base classes was a little hard to understand, but this kind of issue addresses itself over time if one plays around with the API. I am interested to investigate how well one can use OpenMM to build a variety of molecular models and how OpenMM can interface with interconversion libraries such as ParmEd to facilitate engine-flexibility. The devil is always in the details, so building your molecular model in OpenMM or ParmEd is always going to require due diligence to ensure correct output.

Using this notebook

All notebooks within this website/repo can be found here