mdtraj.load_pdb(filename, stride=None, atom_indices=None, frame=None, no_boxchk=False, standard_names=True)

Load an RCSB Protein Data Bank file from disk.


Parameters

filename : str

Path to the PDB file on disk. The string may also be a URL; valid URL schemes include http and ftp.

stride : int, default=None

Only read every stride-th model from the file.

atom_indices : array_like, default=None

If not None, read only a subset of the atom coordinates from the file. These indices are zero-based (not one-based, as used by the PDB format). So to load only the first atom in the file, you would supply atom_indices = np.array([0]).

frame : int, default=None

Use this option to load only a single frame from a trajectory on disk. If frame is None (the default), the entire trajectory will be loaded. If frame is supplied, stride will be ignored.

no_boxchk : bool, default=False

By default, a heuristic check based on the particle density will be performed to determine if the unit cell dimensions are absurd. If the particle density is >1000 atoms per nm^3, the unit cell will be discarded. This is done because all PDB files from RCSB contain a CRYST1 record, even if there are no periodic boundaries, and dummy values are filled in instead. This check will filter out those false unit cells and avoid potential errors in geometry calculations. Set this variable to True in order to skip this heuristic check.

standard_names : bool, default=True

If True, non-standard atom names and residue names are standardized to conform with the current PDB format version. If set to False, this step is skipped.
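As a rough illustration of how frame and stride interact, the sketch below computes which model indices would be kept under the rules stated above. It is not mdtraj's implementation: the helper name select_models is hypothetical, and it assumes that striding starts at the first model (index 0) and that frame is a zero-based index.

```python
def select_models(n_models, stride=None, frame=None):
    """Return the zero-based model indices kept under the documented rules.

    Hypothetical helper for illustration only; assumes striding starts
    at model 0 and that `frame` is a zero-based index.
    """
    if frame is not None:
        # A single frame was requested, so stride is ignored.
        return [frame]
    step = 1 if stride is None else stride
    # Keep every step-th model, starting from the first one.
    return list(range(0, n_models, step))


# A 20-model file read with stride=5 keeps four models.
every_fifth = select_models(20, stride=5)

# Supplying frame overrides stride entirely.
single = select_models(20, stride=5, frame=3)
```

With mdtraj itself, the corresponding calls would pass the same stride and frame keyword arguments to load_pdb.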


Returns

The resulting trajectory, as an md.Trajectory object.

See also


mdtraj.PDBTrajectoryFile

Low level interface to PDB files


Examples

>>> import mdtraj as md
>>> pdb = md.load_pdb('2EQQ.pdb')
>>> print(pdb)
<mdtraj.Trajectory with 20 frames, 423 atoms at 0x110740a90>
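The particle-density heuristic described under no_boxchk can be sketched in a few lines. This is a minimal illustration, not mdtraj's actual code: the function name unitcell_looks_absurd is hypothetical, the cell is assumed to be orthorhombic so that the volume is simply the product of the three box lengths, and treating a non-positive volume as absurd is an added assumption.

```python
def unitcell_looks_absurd(n_atoms, box_lengths_nm, max_density=1000.0):
    """Return True if the particle density exceeds max_density atoms/nm^3.

    Hypothetical helper for illustration; assumes an orthorhombic cell
    with box lengths given in nanometers.
    """
    a, b, c = box_lengths_nm
    volume = a * b * c  # nm^3, valid only for orthorhombic cells
    if volume <= 0:
        # Assumption: a degenerate cell cannot describe a real system.
        return True
    return n_atoms / volume > max_density


# A placeholder cell of 1 x 1 x 1 Angstrom (0.1 nm per side) holding
# 423 atoms gives a density far above the threshold.
dummy_cell = unitcell_looks_absurd(423, (0.1, 0.1, 0.1))

# The same 423 atoms in a 5 x 5 x 5 nm box are well below it.
real_cell = unitcell_looks_absurd(423, (5.0, 5.0, 5.0))
```

A cell flagged this way would be discarded by the default check; passing no_boxchk=True to load_pdb skips the check and keeps whatever the CRYST1 record specifies.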