# mdtraj.Topology¶

class mdtraj.Topology

Topology stores the topological information about a system.

The structure of a Topology object is similar to that of a PDB file. It consists of a set of Chains (often but not always corresponding to polymer chains). Each Chain contains a set of Residues, and each Residue contains a set of Atoms. In addition, the Topology stores a list of which atom pairs are bonded to each other.

Atom and residue names should follow the PDB 3.0 nomenclature for all molecules for which one exists.

Examples

>>> topology = md.load('example.pdb').topology
>>> print(topology)
<mdtraj.Topology with 1 chains, 3 residues, 22 atoms, 21 bonds at 0x105a98e90>
>>> table, bonds = topology.to_dataframe()
serial name element  resSeq resName  chainID
0       0   H1       H       1     CYS        0
1       1  CH3       C       1     CYS        0
2       2   H2       H       1     CYS        0
3       3   H3       H       1     CYS        0
4       4    C       C       1     CYS        0
>>> # rename residue "CYS" to "CYSS"
>>> table[table['residue'] == 'CYS']['residue'] = 'CYSS'
serial name element  resSeq resName   chainID
0       0   H1       H       1     CYSS        0
1       1  CH3       C       1     CYSS        0
2       2   H2       H       1     CYSS        0
3       3   H3       H       1     CYSS        0
4       4    C       C       1     CYSS        0
>>> t2 = md.Topology.from_dataframe(table, bonds)


Attributes

 chains Iterator over all Chains in the Topology. residues Iterator over all Residues in the Topology. atoms Iterator over all Atoms in the Topology.
__init__()

Create a new Topology object

Methods

 __init__() Create a new Topology object add_atom(name, element, residue[, serial]) Create a new Atom and add it to the Topology. add_bond(atom1, atom2) Create a new bond and add it to the Topology. add_chain() Create a new Chain and add it to the Topology. add_residue(name, chain[, resSeq, segment_id]) Create a new Residue and add it to the Topology. atom(index) Get a specific atom by index. atoms_by_name(name) Iterator over all Atoms in the Topology with a specified name chain(index) Get a specific chain by index. copy() Return a copy of the topology create_disulfide_bonds(positions) Identify disulfide bonds based on proximity and add them to the Topology. create_standard_bonds() Create bonds based on the atom and residue names for all standard residue types. find_molecules() Identify molecules based on bonds. from_dataframe(atoms[, bonds]) Create a mdtraj topology from a pandas data frame from_openmm(value) Create a mdtraj topology from an OpenMM topology join(other) Join two topologies together residue(index) Get a specific residue by index. select(selection_string) Execute a selection against the topology select_atom_indices([selection]) Get the indices of biologically-relevant groups by name. select_expression(selection_string) Translate a atom selection expression into a pure python expression. select_pairs([selection1, selection2]) Generate unique pairs of atom indices. subset(atom_indices) Create a new Topology from a subset of the atoms in an existing topology. to_bondgraph() Create a NetworkX graph from the atoms and bonds in this topology to_dataframe() Convert this topology into a pandas dataframe to_fasta([chain]) Convert this topology into FASTA string to_openmm([traj]) Convert this topology into OpenMM topology

Attributes

 atoms Iterator over all Atoms in the Topology. bonds Iterator over all bonds (each represented as a tuple of two Atoms) in the Topology. chains Iterator over all Chains in the Topology. n_atoms Get the number of atoms in the Topology n_bonds Get the number of bonds in the Topology n_chains Get the number of chains in the Topology n_residues Get the number of residues in the Topology. residues Iterator over all Residues in the Topology.
copy()

Return a copy of the topology

Returns: out : Topology A copy of this topology
join(other)

Join two topologies together

Parameters: other : Topology Another topology object out : Topology A joint topology, with all of the atoms/residues/chains/bonds in each of the individual topologies
to_fasta(chain=None)

Convert this topology into FASTA string

Parameters: chain : Integer, optional, default=None If specified, will return the FASTA string for this chain in the Topology. fasta : String or list of Strings A FASTA string for each chain specified.
to_openmm(traj=None)

Convert this topology into OpenMM topology

Parameters: traj : MDTraj.Trajectory, optional, default=None If specified, use the first frame from this trajectory to set the unitcell information in the openmm topology. topology : simtk.openmm.app.Topology This topology, as an OpenMM topology
classmethod from_openmm(value)

Create a mdtraj topology from an OpenMM topology

Parameters: value : simtk.openmm.app.Topology An OpenMM topology that you wish to convert to a mdtraj topology.
to_dataframe()

Convert this topology into a pandas dataframe

Returns: atoms : pandas.DataFrame The atoms in the topology, represented as a data frame. bonds : np.ndarray The bonds in this topology, represented as an n_bonds x 2 array of the indices of the atoms involved in each bond.
classmethod from_dataframe(atoms, bonds=None)

Create a mdtraj topology from a pandas data frame

Parameters: atoms : pandas.DataFrame The atoms in the topology, represented as a data frame. This data frame should have columns “serial” (atom index), “name” (atom name), “element” (atom’s element), “resSeq” (index of the residue) “resName” (name of the residue), “chainID” (index of the chain), and optionally “segmentID”, following the same conventions as wwPDB 3.0 format. bonds : np.ndarray, shape=(n_bonds, 2), dtype=int, optional The bonds in the topology, represented as an n_bonds x 2 array of the indices of the atoms involved in each bond. Specifiying bonds here is optional. To create standard protein bonds, you can use create_standard_bonds to “fill in” the bonds on your newly created Topology object
to_bondgraph()

Create a NetworkX graph from the atoms and bonds in this topology

Returns: g : nx.Graph A graph whose nodes are the Atoms in this topology, and whose edges are the bonds

Notes

This method requires the NetworkX python package.

Create a new Chain and add it to the Topology.

Returns: chain : mdtraj.topology.Chain the newly created Chain

Create a new Residue and add it to the Topology.

Parameters: name : str The name of the residue to add chain : mdtraj.topology.Chain The Chain to add it to resSeq : int, optional Residue sequence number, such as from a PDB record. These sequence numbers are arbitrary, and do not necessarily start at 0 (or 1). If not supplied, the resSeq attribute will be set to the residue’s sequential (0 based) index. segment_id : str, optional A label for the segment to which this residue belongs residue : mdtraj.topology.Residue The newly created Residue

Create a new Atom and add it to the Topology.

Parameters: name : str The name of the atom to add element : mdtraj.element.Element The element of the atom to add residue : mdtraj.topology.Residue The Residue to add it to serial : int Serial number associated with the atom. atom : mdtraj.topology.Atom the newly created Atom

Create a new bond and add it to the Topology.

Parameters: atom1 : mdtraj.topology.Atom The first Atom connected by the bond atom2 : mdtraj.topology.Atom The second Atom connected by the bond
chain(index)

Get a specific chain by index. These indices start from zero.

Parameters: index : int The index of the chain to select. chain : Chain The index-th chain in the topology.
chains

Iterator over all Chains in the Topology.

Returns: chainiter : listiterator Iterator over all Chains in the Topology.
n_chains

Get the number of chains in the Topology

residue(index)

Get a specific residue by index. These indices start from zero.

Parameters: index : int The index of the residue to select. residue : Residue The index-th residue in the topology.
residues

Iterator over all Residues in the Topology.

Returns: residueiter : generator Iterator over all Residues in the Topology.
n_residues

Get the number of residues in the Topology.

atom(index)

Get a specific atom by index. These indices start from zero.

Parameters: index : int The index of the atom to select. atom : Atom The index-th atom in the topology.
atoms

Iterator over all Atoms in the Topology.

Returns: atomiter : generator Iterator over all Atoms in the Topology.
atoms_by_name(name)

Iterator over all Atoms in the Topology with a specified name

Parameters: name : str The particular atom name of interest. atomiter : generator

Examples

>>> for atom in topology.atoms_by_name('CA'):
...     print(atom)

n_atoms

Get the number of atoms in the Topology

bonds

Iterator over all bonds (each represented as a tuple of two Atoms) in the Topology.

Returns: atomiter : generator Iterator over all tuple of Atoms in the Trajectory involved in a bond.
n_bonds

Get the number of bonds in the Topology

create_standard_bonds()

Create bonds based on the atom and residue names for all standard residue types.

create_disulfide_bonds(positions)

Identify disulfide bonds based on proximity and add them to the Topology.

Parameters: positions : list The list of atomic positions based on which to identify bonded atoms
subset(atom_indices)

Create a new Topology from a subset of the atoms in an existing topology.

Parameters: atom_indices : array_like A list of the indices corresponding to the atoms in that you’d like to retain.

Notes

The existing topology will not be altered.

select_expression(selection_string)

Translate a atom selection expression into a pure python expression.

Parameters: selection_string : str An expression in the MDTraj atom selection DSL python_string : str A string containing a pure python expression, equivalent to the selection expression.

Examples

>>> topology.select_expression('name O and water')
"[atom.index for atom in topology.atoms if ((atom.name == 'O') and atom.residue.is_water)]")

select(selection_string)

Execute a selection against the topology

Parameters: selection_string : str An expression in the MDTraj atom selection DSL indices : np.ndarray, dtype=int, ndim=1 Array of the indices of the atoms matching the selection expression.

select_expression, mdtraj.core.selection.parse_selection

Examples

>>> topology.select('name O and water')
array([1, 3, 5, 10, ...])

select_atom_indices(selection='minimal')

Get the indices of biologically-relevant groups by name.

Parameters: selection : {‘all’, ‘alpha’, ‘minimal’, ‘heavy’, ‘water’} What types of atoms to select. all All atoms alpha Protein residue alpha carbons minimal Keep the atoms in protein residues with names in {CA, CB, C, N, O} heavy All non-hydrogen protein atoms. water Water oxygen atoms indices : np.ndarray (N,) An array of the indices of the selected atoms.
select_pairs(selection1=None, selection2=None)

Generate unique pairs of atom indices.

If a selecton is a string, it will be resolved using the atom selection DSL, otherwise it is expected to be an array of atom indices.

Parameters: selection1 : str or array-like, shape=(n_indices, ), dtype=int A selection for select() or an array of atom indices. selection2 : str or array-like, shape=(n_indices, ), dtype=int A selection for select() or an array of atom indices. pairs : array-like, shape=(n_pairs, 2), dtype=int Each row gives the indices of two atoms.
find_molecules()

Identify molecules based on bonds.

A molecule is defined as a set of atoms that are connected to each other by bonds. This method uses the list of bonds to divide up the Topology’s atoms into molecules.

Returns: molecules : list of sets Each entry represents one molecule, and is the set of all Atoms in that molecule
Versions