mdtraj.formats.HDF5TrajectoryFile

class mdtraj.formats.HDF5TrajectoryFile(filename, mode='r', force_overwrite=True, compression='zlib')

Interface for reading and writing to a MDTraj HDF5 molecular dynamics trajectory file, whose format is described here.

This is a file-like object, that both reading or writing depending on the mode flag. It implements the context manager protocol, so you can also use it with the python ‘with’ statement.

The format is extremely flexible and high performance. It can hold a wide variety of information about a trajectory, including fields like the temperature and energies. Because it’s built on the fantastic HDF5 library, it’s easily extensible too.

Parameters:

filename : str

Path to the file to open

mode : {‘r, ‘w’}

Mode in which to open the file. ‘r’ is for reading and ‘w’ is for writing

force_overwrite : bool

In mode=’w’, how do you want to behave if a file by the name of filename already exists? if force_overwrite=True, it will be overwritten.

compression : {‘zlib’, None}

Apply compression to the file? This will save space, and does not cost too many cpu cycles, so it’s recommended.

See also

mdtraj.load_hdf5
High-level wrapper that returns a md.Trajectory

Attributes

root Direct access to the root group of the underlying Tables HDF5 file handle.
title User-defined title for the data represented in the file
application Suite of programs that created the file
topology Get the topology out from the file
randomState State of the creators internal random number generator at the start of the simulation
forcefield Description of the hamiltonian used.
reference A published reference that documents the program or parameters used to generate the data
constraints Constraints applied to the bond lengths

Methods

close() Close the HDF5 file handle
flush() Write all buffered data in the to the disk file.
read([n_frames, stride, atom_indices]) Read one or more frames of data from the file
read_as_traj([n_frames, stride, atom_indices]) Read a trajectory from the HDF5 file
seek(offset[, whence]) Move to a new file position
tell() Current file position
write(coordinates[, time, cell_lengths, …]) Write one or more frames of data to the file
__init__(filename, mode='r', force_overwrite=True, compression='zlib')

Methods

__init__(filename[, mode, force_overwrite, …])
close() Close the HDF5 file handle
flush() Write all buffered data in the to the disk file.
read([n_frames, stride, atom_indices]) Read one or more frames of data from the file
read_as_traj([n_frames, stride, atom_indices]) Read a trajectory from the HDF5 file
seek(offset[, whence]) Move to a new file position
tell() Current file position
write(coordinates[, time, cell_lengths, …]) Write one or more frames of data to the file
application

Suite of programs that created the file

close()

Close the HDF5 file handle

constraints

Constraints applied to the bond lengths

Returns:

constraints : {None, np.array, dtype=[(‘atom1’, ‘<i4’), (‘atom2’, ‘<i4’), (‘distance’, ‘<f4’)])}

A one dimensional array with the a int, int, float type giving the index of the two atoms involved in the constraints and the distance of the constraint. If no constraint information is in the file, the return value is None.

flush()

Write all buffered data in the to the disk file.

forcefield

Description of the hamiltonian used. A short, human readable string, like AMBER99sbildn.

randomState

State of the creators internal random number generator at the start of the simulation

read(n_frames=None, stride=None, atom_indices=None)

Read one or more frames of data from the file

Parameters:

n_frames : {int, None}

The number of frames to read. If not supplied, all of the remaining frames will be read.

stride : {int, None}

By default all of the frames will be read, but you can pass this flag to read a subset of of the data by grabbing only every stride-th frame from disk.

atom_indices : {int, None}

By default all of the atom will be read, but you can pass this flag to read only a subsets of the atoms for the coordinates and velocities fields. Note that you will have to carefully manage the indices and the offsets, since the i-th atom in the topology will not necessarily correspond to the i-th atom in your subset.

Returns:

frames : namedtuple

The returned namedtuple will have the fields “coordinates”, “time”, “cell_lengths”, “cell_angles”, “velocities”, “kineticEnergy”, “potentialEnergy”, “temperature” and “alchemicalLambda”. Each of the fields in the returned namedtuple will either be a numpy array or None, dependening on if that data was saved in the trajectory. All of the data shall be n units of “nanometers”, “picoseconds”, “kelvin”, “degrees” and “kilojoules_per_mole”.

Notes

If you’d like more flexible access to the data, that is available by using the pytables group directly, which is accessible via the root property on this class.

read_as_traj(n_frames=None, stride=None, atom_indices=None)

Read a trajectory from the HDF5 file

Parameters:

n_frames : {int, None}

The number of frames to read. If not supplied, all of the remaining frames will be read.

stride : {int, None}

By default all of the frames will be read, but you can pass this flag to read a subset of of the data by grabbing only every stride-th frame from disk.

atom_indices : {int, None}

By default all of the atom will be read, but you can pass this flag to read only a subsets of the atoms for the coordinates and velocities fields. Note that you will have to carefully manage the indices and the offsets, since the i-th atom in the topology will not necessarily correspond to the i-th atom in your subset.

Returns:

trajectory : Trajectory

A trajectory object containing the loaded portion of the file.

reference

A published reference that documents the program or parameters used to generate the data

root

Direct access to the root group of the underlying Tables HDF5 file handle.

This can be used for random or specific access to the underlying arrays on disk

seek(offset, whence=0)

Move to a new file position

Parameters:

offset : int

A number of frames.

whence : {0, 1, 2}

0: offset from start of file, offset should be >=0. 1: move relative to the current position, positive or negative 2: move relative to the end of file, offset should be <= 0. Seeking beyond the end of a file is not supported

tell()

Current file position

Returns:

offset : int

The current frame in the file.

title

User-defined title for the data represented in the file

topology

Get the topology out from the file

Returns:

topology : mdtraj.Topology

A topology object

write(coordinates, time=None, cell_lengths=None, cell_angles=None, velocities=None, kineticEnergy=None, potentialEnergy=None, temperature=None, alchemicalLambda=None)

Write one or more frames of data to the file

This method saves data that is associated with one or more simulation frames. Note that all of the arguments can either be raw numpy arrays or unitted arrays (with simtk.unit.Quantity). If the arrays are unittted, a unit conversion will be automatically done from the supplied units into the proper units for saving on disk. You won’t have to worry about it.

Furthermore, if you wish to save a single frame of simulation data, you can do so naturally, for instance by supplying a 2d array for the coordinates and a single float for the time. This “shape deficiency” will be recognized, and handled appropriately.

Parameters:

coordinates : np.ndarray, shape=(n_frames, n_atoms, 3)

The cartesian coordinates of the atoms to write. By convention, the lengths should be in units of nanometers.

time : np.ndarray, shape=(n_frames,), optional

You may optionally specify the simulation time, in picoseconds corresponding to each frame.

cell_lengths : np.ndarray, shape=(n_frames, 3), dtype=float32, optional

You may optionally specify the unitcell lengths. The length of the periodic box in each frame, in each direction, a, b, c. By convention the lengths should be in units of angstroms.

cell_angles : np.ndarray, shape=(n_frames, 3), dtype=float32, optional

You may optionally specify the unitcell angles in each frame. Organized analogously to cell_lengths. Gives the alpha, beta and gamma angles respectively. By convention, the angles should be in units of degrees.

velocities : np.ndarray, shape=(n_frames, n_atoms, 3), optional

You may optionally specify the cartesian components of the velocity for each atom in each frame. By convention, the velocities should be in units of nanometers / picosecond.

kineticEnergy : np.ndarray, shape=(n_frames,), optional

You may optionally specify the kinetic energy in each frame. By convention the kinetic energies should b in units of kilojoules per mole.

potentialEnergy : np.ndarray, shape=(n_frames,), optional

You may optionally specify the potential energy in each frame. By convention the kinetic energies should b in units of kilojoules per mole.

temperature : np.ndarray, shape=(n_frames,), optional

You may optionally specify the temperature in each frame. By convention the temperatures should b in units of Kelvin.

alchemicalLambda : np.ndarray, shape=(n_frames,), optional

You may optionally specify the alchemical lambda in each frame. These have no units, but are generally between zero and one.