
File Formats

 


 

PDB File Format

 

The Protein Data Bank (PDB) format provides a standard representation for macromolecular structure data derived from X-ray diffraction and NMR studies. This representation was created in the 1970s, and a large body of software has been written around it.
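As an illustration of what a record in this format looks like, the following minimal Python sketch reads a single coordinate record by fixed column positions. The column ranges are those commonly documented for ATOM/HETATM records and should be checked against the wwPDB format documentation. The `Atom` class, the `parse_atom_line` function, and the sample record are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Parse one ATOM/HETATM record using fixed column positions."""
    if line[0:6] not in ("ATOM  ", "HETATM"):
        raise ValueError("not an ATOM/HETATM record")
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain_id=line[21],             # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
    )


# Hypothetical record, assembled field by field so the columns are explicit.
SAMPLE_LINE = (
    "ATOM  "    # columns 1-6: record name
    "    1"     # columns 7-11: atom serial number
    " "         # column 12: unused
    " N  "      # columns 13-16: atom name
    " "         # column 17: alternate location indicator
    "MET"       # columns 18-20: residue name
    " "         # column 21: unused
    "A"         # column 22: chain identifier
    "   1"      # columns 23-26: residue sequence number
    "    "      # columns 27-30: insertion code and padding
    "  27.340"  # columns 31-38: x coordinate
    "  24.430"  # columns 39-46: y coordinate
    "   2.614"  # columns 47-54: z coordinate
)

atom = parse_atom_line(SAMPLE_LINE)
print(atom.name, atom.res_name, atom.chain_id, atom.x, atom.y, atom.z)
```

Because the fields are located by position rather than by delimiter, a misplaced space shifts every later value. This is one reason existing parsers are usually preferable to hand-written ones for real entries.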

Documentation describing the PDB file format is available from the wwPDB at http://www.wwpdb.org/docs.html.

Historical copies of the PDB file format from 1992* and 1996* are available.

* PDF documents require Acrobat Reader



 

mmCIF File Format and PDB Exchange Dictionary

 

The Protein Data Bank (PDB) uses macromolecular Crystallographic Information File (mmCIF) data dictionaries to describe the information content of PDB entries. The PDB Exchange Dictionary consolidates content from a variety of crystallographic dictionaries, including the IUCr Core, mmCIF, Image, and Symmetry dictionaries. It also includes extensions describing NMR, Cryo-EM, and protein production data. PDB data processing, data exchange, annotation, and database management operations all rely heavily on this data format and on the content of the PDB Exchange Dictionary. Software tools are available to convert mmCIF data files to the older PDB format and to PDBML/XML.
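To give a feel for how dictionary-defined items appear in a data file, the sketch below collects simple `_category.item value` pairs from mmCIF text. It is deliberately incomplete: it skips `loop_` tables and multi-line values, and it is not a substitute for a real mmCIF parser. The function name `read_simple_items`, the data block name, and the sample values are assumptions made for illustration.

```python
from typing import Dict


def read_simple_items(text: str) -> Dict[str, Dict[str, str]]:
    """Collect single-line '_category.item value' pairs, grouped by category."""
    categories: Dict[str, Dict[str, str]] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line.startswith("_"):
            continue  # data block names, comments, loop_ keywords, loop rows
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # loop header, or a value that starts on the next line
        tag, value = parts
        category, _, item = tag[1:].partition(".")
        value = value.strip()
        # Drop a matching pair of surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        categories.setdefault(category, {})[item] = value
    return categories


# Hypothetical data block; the values are made up for this example.
SAMPLE_MMCIF = """\
data_EXAMPLE
#
_cell.length_a 61.777
_cell.length_b 61.777
_symmetry.space_group_name_H-M 'P 41 21 2'
#
loop_
_atom_site.group_PDB
_atom_site.id
ATOM 1
ATOM 2
"""

items = read_simple_items(SAMPLE_MMCIF)
print(items["cell"]["length_a"])
print(items["symmetry"]["space_group_name_H-M"])
```

Each item name pairs a category with an attribute, and the dictionary defines what each pair means. That is what allows the same content to be translated mechanically into other representations.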

Further information and related resources are available at http://mmcif.pdb.org/.



 

PDBML/XML File Format

 

The Protein Data Bank Markup Language (PDBML) provides a representation of PDB data in XML format. The format is described by the XML schema of the PDB Exchange Data Dictionary, which is produced by direct translation of the mmCIF-format PDB Exchange Data Dictionary. Other data dictionaries used by the PDB have also been electronically translated into XML/XSD schemas.
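The direct translation described above means that a dictionary category becomes a container element and its items become child elements or attributes. The following sketch reads one category from a small XML fragment with Python's standard `xml.etree.ElementTree` module. The namespace URI, the element layout of the sample, and the `read_category` helper are assumptions modeled on this convention; the published schema is the authority on actual names.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI; check the schema referenced by the file being read.
NS = {"PDBx": "http://pdbml.pdb.org/schema/pdbx-v50.xsd"}

# Hypothetical fragment; the values are made up for this example.
SAMPLE_XML = """\
<PDBx:datablock datablockName="EXAMPLE"
    xmlns:PDBx="http://pdbml.pdb.org/schema/pdbx-v50.xsd">
  <PDBx:cellCategory>
    <PDBx:cell entry_id="EXAMPLE">
      <PDBx:length_a>61.777</PDBx:length_a>
      <PDBx:length_b>61.777</PDBx:length_b>
    </PDBx:cell>
  </PDBx:cellCategory>
</PDBx:datablock>
"""


def read_category(xml_text: str, category: str) -> list:
    """Return one dict per row of the named category."""
    root = ET.fromstring(xml_text)
    rows = []
    path = f"PDBx:{category}Category/PDBx:{category}"
    for row in root.findall(path, NS):
        record = dict(row.attrib)  # key items appear as attributes
        for child in row:
            # Strip the '{namespace}' prefix ElementTree puts on tag names.
            item_name = child.tag.split("}", 1)[-1]
            record[item_name] = (child.text or "").strip()
        rows.append(record)
    return rows


for record in read_category(SAMPLE_XML, "cell"):
    print(record)
```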

Further information and related resources are available at http://pdbml.pdb.org/.



 

Chemical Component Dictionary

 

The Chemical Component Dictionary (formerly the HET Group Dictionary) is an external reference file describing all residue and small-molecule components found in PDB entries. This dictionary contains detailed chemical descriptions for standard and modified amino acids and nucleotides, small-molecule ligands, and solvent molecules. Each chemical definition includes descriptions of chemical properties such as stereochemical assignments, aromatic bond assignments, idealized coordinates, chemical descriptors (SMILES and InChI), and systematic chemical names.

The Chemical Component Dictionary is organized by the 3-character alphanumeric code that the PDB assigns to each chemical component. New chemical component definitions appear in the dictionary as the entries in which they are observed are released in the PDB archive; consequently, the dictionary is updated with each weekly PDB release.
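A lookup by component code can be sketched as follows, assuming the dictionary is distributed as mmCIF text in which each component occupies its own `data_<code>` block. That layout, the `index_components` helper, and the two-component sample are assumptions for illustration; the download page describes the files actually provided.

```python
from typing import Dict, List


def index_components(text: str) -> Dict[str, str]:
    """Map each component code to the text of its data block."""
    blocks: Dict[str, List[str]] = {}
    current = None
    for line in text.splitlines():
        if line.startswith("data_"):
            # A new data block starts; its name is the component code.
            current = line[len("data_"):].strip()
            blocks[current] = []
        elif current is not None:
            blocks[current].append(line)
    return {code: "\n".join(lines) for code, lines in blocks.items()}


# Hypothetical two-component excerpt, heavily abbreviated.
SAMPLE_DICTIONARY = """\
data_HOH
_chem_comp.id HOH
_chem_comp.name WATER
#
data_ALA
_chem_comp.id ALA
_chem_comp.name ALANINE
#
"""

index = index_components(SAMPLE_DICTIONARY)
print(sorted(index))
print(index["HOH"])
```

Keying the index by code mirrors how entries refer to components, so a residue name read from a coordinate record can be used directly as the lookup key.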

Further information, including links to download the Chemical Component Dictionary, is available at http://www.wwpdb.org/ccd.html.