mdtraj.Topology

class mdtraj.Topology

Topology stores the topological information about a system.

The structure of a Topology object is similar to that of a PDB file. It consists of a set of Chains (often but not always corresponding to polymer chains). Each Chain contains a set of Residues, and each Residue contains a set of Atoms. In addition, the Topology stores a list of which atom pairs are bonded to each other.

Atom and residue names should follow the PDB 3.0 nomenclature for all molecules for which one exists.

Examples

>>> topology = md.load('example.pdb').topology
>>> print(topology)
<mdtraj.Topology with 1 chains, 3 residues, 22 atoms, 21 bonds at 0x105a98e90>
>>> table, bonds = topology.to_dataframe()
>>> print(table.head())
   serial name element  resSeq resName  chainID
0       0   H1       H       1     CYS        0
1       1  CH3       C       1     CYS        0
2       2   H2       H       1     CYS        0
3       3   H3       H       1     CYS        0
4       4    C       C       1     CYS        0
>>> # rename residue "CYS" to "CYSS"
>>> table[table['residue'] == 'CYS']['residue'] = 'CYSS'
>>> print(table.head())
   serial name element  resSeq resName   chainID
0       0   H1       H       1     CYSS        0
1       1  CH3       C       1     CYSS        0
2       2   H2       H       1     CYSS        0
3       3   H3       H       1     CYSS        0
4       4    C       C       1     CYSS        0
>>> t2 = md.Topology.from_dataframe(table, bonds)

Attributes

chains Iterator over all Chains in the Topology.
residues Iterator over all Residues in the Topology.
atoms Iterator over all Atoms in the Topology.

Methods

add_atom(name, element, residue[, serial]) Create a new Atom and add it to the Topology.
add_bond(atom1, atom2) Create a new bond and add it to the Topology.
add_chain() Create a new Chain and add it to the Topology.
add_residue(name, chain[, resSeq, segment_id]) Create a new Residue and add it to the Topology.
atom(index) Get a specific atom by index.
atoms_by_name(name) Iterator over all Atoms in the Topology with a specified name
chain(index) Get a specific chain by index.
copy() Return a copy of the topology
create_disulfide_bonds(positions) Identify disulfide bonds based on proximity and add them to the Topology.
create_standard_bonds() Create bonds based on the atom and residue names for all standard residue types.
find_molecules() Identify molecules based on bonds.
from_dataframe(atoms[, bonds]) Create a mdtraj topology from a pandas data frame
from_openmm(value) Create a mdtraj topology from an OpenMM topology
guess_anchor_molecules() Guess anchor molecules for imaging
join(other) Join two topologies together
residue(index) Get a specific residue by index.
select(selection_string) Execute a selection against the topology
select_atom_indices([selection]) Get the indices of biologically-relevant groups by name.
select_expression(selection_string) Translate a atom selection expression into a pure python expression.
select_pairs([selection1, selection2]) Generate unique pairs of atom indices.
subset(atom_indices) Create a new Topology from a subset of the atoms in an existing topology.
to_bondgraph() Create a NetworkX graph from the atoms and bonds in this topology
to_dataframe() Convert this topology into a pandas dataframe
to_fasta([chain]) Convert this topology into FASTA string
to_openmm([traj]) Convert this topology into OpenMM topology
__init__()

Create a new Topology object

Methods

__init__() Create a new Topology object
add_atom(name, element, residue[, serial]) Create a new Atom and add it to the Topology.
add_bond(atom1, atom2) Create a new bond and add it to the Topology.
add_chain() Create a new Chain and add it to the Topology.
add_residue(name, chain[, resSeq, segment_id]) Create a new Residue and add it to the Topology.
atom(index) Get a specific atom by index.
atoms_by_name(name) Iterator over all Atoms in the Topology with a specified name
chain(index) Get a specific chain by index.
copy() Return a copy of the topology
create_disulfide_bonds(positions) Identify disulfide bonds based on proximity and add them to the Topology.
create_standard_bonds() Create bonds based on the atom and residue names for all standard residue types.
find_molecules() Identify molecules based on bonds.
from_dataframe(atoms[, bonds]) Create a mdtraj topology from a pandas data frame
from_openmm(value) Create a mdtraj topology from an OpenMM topology
guess_anchor_molecules() Guess anchor molecules for imaging
join(other) Join two topologies together
residue(index) Get a specific residue by index.
select(selection_string) Execute a selection against the topology
select_atom_indices([selection]) Get the indices of biologically-relevant groups by name.
select_expression(selection_string) Translate a atom selection expression into a pure python expression.
select_pairs([selection1, selection2]) Generate unique pairs of atom indices.
subset(atom_indices) Create a new Topology from a subset of the atoms in an existing topology.
to_bondgraph() Create a NetworkX graph from the atoms and bonds in this topology
to_dataframe() Convert this topology into a pandas dataframe
to_fasta([chain]) Convert this topology into FASTA string
to_openmm([traj]) Convert this topology into OpenMM topology
add_atom(name, element, residue, serial=None)

Create a new Atom and add it to the Topology.

Parameters:

name : str

The name of the atom to add

element : mdtraj.element.Element

The element of the atom to add

residue : mdtraj.topology.Residue

The Residue to add it to

serial : int

Serial number associated with the atom.

Returns:

atom : mdtraj.topology.Atom

the newly created Atom

add_bond(atom1, atom2)

Create a new bond and add it to the Topology.

Parameters:

atom1 : mdtraj.topology.Atom

The first Atom connected by the bond

atom2 : mdtraj.topology.Atom

The second Atom connected by the bond

add_chain()

Create a new Chain and add it to the Topology.

Returns:

chain : mdtraj.topology.Chain

the newly created Chain

add_residue(name, chain, resSeq=None, segment_id='')

Create a new Residue and add it to the Topology.

Parameters:

name : str

The name of the residue to add

chain : mdtraj.topology.Chain

The Chain to add it to

resSeq : int, optional

Residue sequence number, such as from a PDB record. These sequence numbers are arbitrary, and do not necessarily start at 0 (or 1). If not supplied, the resSeq attribute will be set to the residue’s sequential (0 based) index.

segment_id : str, optional

A label for the segment to which this residue belongs

Returns:

residue : mdtraj.topology.Residue

The newly created Residue

atom(index)

Get a specific atom by index. These indices start from zero.

Parameters:

index : int

The index of the atom to select.

Returns:

atom : Atom

The index-th atom in the topology.

atoms

Iterator over all Atoms in the Topology.

Returns:

atomiter : generator

Iterator over all Atoms in the Topology.

atoms_by_name(name)

Iterator over all Atoms in the Topology with a specified name

Parameters:

name : str

The particular atom name of interest.

Returns:

atomiter : generator

Examples

>>> for atom in topology.atoms_by_name('CA'):
...     print(atom)
bonds

Iterator over all bonds (each represented as a tuple of two Atoms) in the Topology.

Returns:

atomiter : generator

Iterator over all tuple of Atoms in the Trajectory involved in a bond.

chain(index)

Get a specific chain by index. These indices start from zero.

Parameters:

index : int

The index of the chain to select.

Returns:

chain : Chain

The index-th chain in the topology.

chains

Iterator over all Chains in the Topology.

Returns:

chainiter : listiterator

Iterator over all Chains in the Topology.

copy()

Return a copy of the topology

Returns:

out : Topology

A copy of this topology

create_disulfide_bonds(positions)

Identify disulfide bonds based on proximity and add them to the Topology.

Parameters:

positions : list

The list of atomic positions based on which to identify bonded atoms

create_standard_bonds()

Create bonds based on the atom and residue names for all standard residue types.

find_molecules()

Identify molecules based on bonds.

A molecule is defined as a set of atoms that are connected to each other by bonds. This method uses the list of bonds to divide up the Topology’s atoms into molecules.

Returns:

molecules : list of sets

Each entry represents one molecule, and is the set of all Atoms in that molecule

classmethod from_dataframe(atoms, bonds=None)

Create a mdtraj topology from a pandas data frame

Parameters:

atoms : pandas.DataFrame

The atoms in the topology, represented as a data frame. This data frame should have columns “serial” (atom index), “name” (atom name), “element” (atom’s element), “resSeq” (index of the residue) “resName” (name of the residue), “chainID” (index of the chain), and optionally “segmentID”, following the same conventions as wwPDB 3.0 format.

bonds : np.ndarray, shape=(n_bonds, 2), dtype=int, optional

The bonds in the topology, represented as an n_bonds x 2 array of the indices of the atoms involved in each bond. Specifiying bonds here is optional. To create standard protein bonds, you can use create_standard_bonds to “fill in” the bonds on your newly created Topology object

classmethod from_openmm(value)

Create a mdtraj topology from an OpenMM topology

Parameters:

value : simtk.openmm.app.Topology

An OpenMM topology that you wish to convert to a mdtraj topology.

guess_anchor_molecules()

Guess anchor molecules for imaging

Returns:

anchor_molecules : list of atom sets

List of anchor molecules

join(other)

Join two topologies together

Parameters:

other : Topology

Another topology object

Returns:

out : Topology

A joint topology, with all of the atoms/residues/chains/bonds in each of the individual topologies

n_atoms

Get the number of atoms in the Topology

n_bonds

Get the number of bonds in the Topology

n_chains

Get the number of chains in the Topology

n_residues

Get the number of residues in the Topology.

residue(index)

Get a specific residue by index. These indices start from zero.

Parameters:

index : int

The index of the residue to select.

Returns:

residue : Residue

The index-th residue in the topology.

residues

Iterator over all Residues in the Topology.

Returns:

residueiter : generator

Iterator over all Residues in the Topology.

select(selection_string)

Execute a selection against the topology

Parameters:

selection_string : str

An expression in the MDTraj atom selection DSL

Returns:

indices : np.ndarray, dtype=int, ndim=1

Array of the indices of the atoms matching the selection expression.

See also

select_expression, mdtraj.core.selection.parse_selection

Examples

>>> topology.select('name O and water')
array([1, 3, 5, 10, ...])
select_atom_indices(selection='minimal')

Get the indices of biologically-relevant groups by name.

Parameters:

selection : {‘all’, ‘alpha’, ‘minimal’, ‘heavy’, ‘water’}

What types of atoms to select.

all

All atoms

alpha

Protein residue alpha carbons

minimal

Keep the atoms in protein residues with names in {CA, CB, C, N, O}

heavy

All non-hydrogen protein atoms.

water

Water oxygen atoms

Returns:

indices : np.ndarray (N,)

An array of the indices of the selected atoms.

select_expression(selection_string)

Translate a atom selection expression into a pure python expression.

Parameters:

selection_string : str

An expression in the MDTraj atom selection DSL

Returns:

python_string : str

A string containing a pure python expression, equivalent to the selection expression.

Examples

>>> topology.select_expression('name O and water')
"[atom.index for atom in topology.atoms if ((atom.name == 'O') and atom.residue.is_water)]")
select_pairs(selection1=None, selection2=None)

Generate unique pairs of atom indices.

If a selecton is a string, it will be resolved using the atom selection DSL, otherwise it is expected to be an array of atom indices.

Parameters:

selection1 : str or array-like, shape=(n_indices, ), dtype=int

A selection for select() or an array of atom indices.

selection2 : str or array-like, shape=(n_indices, ), dtype=int

A selection for select() or an array of atom indices.

Returns:

pairs : array-like, shape=(n_pairs, 2), dtype=int

Each row gives the indices of two atoms.

subset(atom_indices)

Create a new Topology from a subset of the atoms in an existing topology.

Parameters:

atom_indices : array_like

A list of the indices corresponding to the atoms in that you’d like to retain.

Notes

The existing topology will not be altered.

to_bondgraph()

Create a NetworkX graph from the atoms and bonds in this topology

Returns:

g : nx.Graph

A graph whose nodes are the Atoms in this topology, and whose edges are the bonds

See also

atoms, bonds

Notes

This method requires the NetworkX python package.

to_dataframe()

Convert this topology into a pandas dataframe

Returns:

atoms : pandas.DataFrame

The atoms in the topology, represented as a data frame.

bonds : np.ndarray

The bonds in this topology, represented as an n_bonds x 2 array of the indices of the atoms involved in each bond.

to_fasta(chain=None)

Convert this topology into FASTA string

Parameters:

chain : Integer, optional, default=None

If specified, will return the FASTA string for this chain in the Topology.

Returns:

fasta : String or list of Strings

A FASTA string for each chain specified.

to_openmm(traj=None)

Convert this topology into OpenMM topology

Parameters:

traj : MDTraj.Trajectory, optional, default=None

If specified, use the first frame from this trajectory to set the unitcell information in the openmm topology.

Returns:

topology : simtk.openmm.app.Topology

This topology, as an OpenMM topology