mdtraj.formats.HDF5TrajectoryFile

class mdtraj.formats.HDF5TrajectoryFile(filename, mode='r', force_overwrite=True, compression='zlib')

Interface for reading and writing to a MDTraj HDF5 molecular dynamics trajectory file, whose format is described here.

This is a file-like object, that both reading or writing depending on the mode flag. It implements the context manager protocol, so you can also use it with the python ‘with’ statement.

The format is extremely flexible and high performance. It can hold a wide variety of information about a trajectory, including fields like the temperature and energies. Because it’s built on the fantastic HDF5 library, it’s easily extensible too.

Parameters
filenamestr

Path to the file to open

mode{‘r, ‘w’}

Mode in which to open the file. ‘r’ is for reading and ‘w’ is for writing

force_overwritebool

In mode=’w’, how do you want to behave if a file by the name of filename already exists? if force_overwrite=True, it will be overwritten.

compression{‘zlib’, None}

Apply compression to the file? This will save space, and does not cost too many cpu cycles, so it’s recommended.

See also

mdtraj.load_hdf5

High-level wrapper that returns a md.Trajectory

Attributes
root

Direct access to the root group of the underlying Tables HDF5 file handle.

title

User-defined title for the data represented in the file

application

Suite of programs that created the file

topology

Get the topology out from the file

randomState

State of the creators internal random number generator at the start of the simulation

forcefield

Description of the hamiltonian used.

reference

A published reference that documents the program or parameters used to generate the data

constraints

Constraints applied to the bond lengths

Methods

close(self)

Close the HDF5 file handle

flush(self)

Write all buffered data in the to the disk file.

read(self[, n_frames, stride, atom_indices])

Read one or more frames of data from the file

read_as_traj(self[, n_frames, stride, …])

Read a trajectory from the HDF5 file

seek(self, offset[, whence])

Move to a new file position

tell(self)

Current file position

write(self, coordinates[, time, …])

Write one or more frames of data to the file

__init__(self, filename, mode='r', force_overwrite=True, compression='zlib')

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(self, filename[, mode, …])

Initialize self.

close(self)

Close the HDF5 file handle

flush(self)

Write all buffered data in the to the disk file.

read(self[, n_frames, stride, atom_indices])

Read one or more frames of data from the file

read_as_traj(self[, n_frames, stride, …])

Read a trajectory from the HDF5 file

seek(self, offset[, whence])

Move to a new file position

tell(self)

Current file position

write(self, coordinates[, time, …])

Write one or more frames of data to the file

Attributes

application

Suite of programs that created the file

constraints

Constraints applied to the bond lengths

distance_unit

forcefield

Description of the hamiltonian used.

randomState

State of the creators internal random number generator at the start of the simulation

reference

A published reference that documents the program or parameters used to generate the data

root

Direct access to the root group of the underlying Tables HDF5 file handle.

title

User-defined title for the data represented in the file

topology

Get the topology out from the file

property application

Suite of programs that created the file

close(self)

Close the HDF5 file handle

property constraints

Constraints applied to the bond lengths

Returns
constraints{None, np.array, dtype=[(‘atom1’, ‘<i4’), (‘atom2’, ‘<i4’), (‘distance’, ‘<f4’)])}

A one dimensional array with the a int, int, float type giving the index of the two atoms involved in the constraints and the distance of the constraint. If no constraint information is in the file, the return value is None.

flush(self)

Write all buffered data in the to the disk file.

property forcefield

Description of the hamiltonian used. A short, human readable string, like AMBER99sbildn.

property randomState

State of the creators internal random number generator at the start of the simulation

read(self, n_frames=None, stride=None, atom_indices=None)

Read one or more frames of data from the file

Parameters
n_frames{int, None}

The number of frames to read. If not supplied, all of the remaining frames will be read.

stride{int, None}

By default all of the frames will be read, but you can pass this flag to read a subset of of the data by grabbing only every stride-th frame from disk.

atom_indices{int, None}

By default all of the atom will be read, but you can pass this flag to read only a subsets of the atoms for the coordinates and velocities fields. Note that you will have to carefully manage the indices and the offsets, since the i-th atom in the topology will not necessarily correspond to the i-th atom in your subset.

Returns
framesnamedtuple

The returned namedtuple will have the fields “coordinates”, “time”, “cell_lengths”, “cell_angles”, “velocities”, “kineticEnergy”, “potentialEnergy”, “temperature” and “alchemicalLambda”. Each of the fields in the returned namedtuple will either be a numpy array or None, dependening on if that data was saved in the trajectory. All of the data shall be n units of “nanometers”, “picoseconds”, “kelvin”, “degrees” and “kilojoules_per_mole”.

Notes

If you’d like more flexible access to the data, that is available by using the pytables group directly, which is accessible via the root property on this class.

read_as_traj(self, n_frames=None, stride=None, atom_indices=None)

Read a trajectory from the HDF5 file

Parameters
n_frames{int, None}

The number of frames to read. If not supplied, all of the remaining frames will be read.

stride{int, None}

By default all of the frames will be read, but you can pass this flag to read a subset of of the data by grabbing only every stride-th frame from disk.

atom_indices{int, None}

By default all of the atom will be read, but you can pass this flag to read only a subsets of the atoms for the coordinates and velocities fields. Note that you will have to carefully manage the indices and the offsets, since the i-th atom in the topology will not necessarily correspond to the i-th atom in your subset.

Returns
trajectoryTrajectory

A trajectory object containing the loaded portion of the file.

property reference

A published reference that documents the program or parameters used to generate the data

property root

Direct access to the root group of the underlying Tables HDF5 file handle.

This can be used for random or specific access to the underlying arrays on disk

seek(self, offset, whence=0)

Move to a new file position

Parameters
offsetint

A number of frames.

whence{0, 1, 2}

0: offset from start of file, offset should be >=0. 1: move relative to the current position, positive or negative 2: move relative to the end of file, offset should be <= 0. Seeking beyond the end of a file is not supported

tell(self)

Current file position

Returns
offsetint

The current frame in the file.

property title

User-defined title for the data represented in the file

property topology

Get the topology out from the file

Returns
topologymdtraj.Topology

A topology object

write(self, coordinates, time=None, cell_lengths=None, cell_angles=None, velocities=None, kineticEnergy=None, potentialEnergy=None, temperature=None, alchemicalLambda=None)

Write one or more frames of data to the file

This method saves data that is associated with one or more simulation frames. Note that all of the arguments can either be raw numpy arrays or unitted arrays (with simtk.unit.Quantity). If the arrays are unittted, a unit conversion will be automatically done from the supplied units into the proper units for saving on disk. You won’t have to worry about it.

Furthermore, if you wish to save a single frame of simulation data, you can do so naturally, for instance by supplying a 2d array for the coordinates and a single float for the time. This “shape deficiency” will be recognized, and handled appropriately.

Parameters
coordinatesnp.ndarray, shape=(n_frames, n_atoms, 3)

The cartesian coordinates of the atoms to write. By convention, the lengths should be in units of nanometers.

timenp.ndarray, shape=(n_frames,), optional

You may optionally specify the simulation time, in picoseconds corresponding to each frame.

cell_lengthsnp.ndarray, shape=(n_frames, 3), dtype=float32, optional

You may optionally specify the unitcell lengths. The length of the periodic box in each frame, in each direction, a, b, c. By convention the lengths should be in units of angstroms.

cell_anglesnp.ndarray, shape=(n_frames, 3), dtype=float32, optional

You may optionally specify the unitcell angles in each frame. Organized analogously to cell_lengths. Gives the alpha, beta and gamma angles respectively. By convention, the angles should be in units of degrees.

velocitiesnp.ndarray, shape=(n_frames, n_atoms, 3), optional

You may optionally specify the cartesian components of the velocity for each atom in each frame. By convention, the velocities should be in units of nanometers / picosecond.

kineticEnergynp.ndarray, shape=(n_frames,), optional

You may optionally specify the kinetic energy in each frame. By convention the kinetic energies should b in units of kilojoules per mole.

potentialEnergynp.ndarray, shape=(n_frames,), optional

You may optionally specify the potential energy in each frame. By convention the kinetic energies should b in units of kilojoules per mole.

temperaturenp.ndarray, shape=(n_frames,), optional

You may optionally specify the temperature in each frame. By convention the temperatures should b in units of Kelvin.

alchemicalLambdanp.ndarray, shape=(n_frames,), optional

You may optionally specify the alchemical lambda in each frame. These have no units, but are generally between zero and one.