#55 List for PDB atom types

Status: closed
Labels: cdk.core (5)
Priority: 5
Updated: 2012-10-08
Created: 2004-11-08
Creator: Adam Auton
Private: No

Better support for PDB files would be great. For
example, PDBs are broken down into residues. However,
residues do not exist in the CDK, and it is therefore
difficult to operate on them.

Also, the CDK seems unable to recognise the atom types
contained in PDB files.
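
To make the residue idea concrete, here is a minimal sketch that groups the atom names of PDB coordinate records by residue, using the fixed column layout of the PDB format (atom name in columns 13-16, residue name in 18-20, chain in 22, residue number in 23-26). The class `PdbResidueGrouper` and its method are hypothetical names invented for this illustration. They are not part of the CDK, and this is not how the CDK reads PDB files. It assumes Java 9 or later.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of the CDK: groups PDB atom names by residue. */
final class PdbResidueGrouper {

    /**
     * Maps a residue key such as "ALA A 1" (residue name, chain, residue number)
     * to the atom names found in that residue, in file order.
     */
    static Map<String, List<String>> groupByResidue(List<String> pdbLines) {
        Map<String, List<String>> residues = new LinkedHashMap<>();
        for (String line : pdbLines) {
            // Only coordinate records carry atom names; skip everything else
            // and any record too short to hold the residue number.
            boolean isAtom = line.startsWith("ATOM  ") || line.startsWith("HETATM");
            if (!isAtom || line.length() < 26) {
                continue;
            }
            String atomName = line.substring(12, 16).trim(); // columns 13-16
            String resName = line.substring(17, 20).trim();  // columns 18-20
            String chainId = line.substring(21, 22);         // column 22
            String resSeq = line.substring(22, 26).trim();   // columns 23-26
            String key = resName + " " + chainId + " " + resSeq;
            residues.computeIfAbsent(key, k -> new ArrayList<>()).add(atomName);
        }
        return residues;
    }

    public static void main(String[] args) {
        // Records truncated after the residue number; coordinates are not needed here.
        List<String> lines = List.of(
            "ATOM      1  N   ALA A   1",
            "ATOM      2  CA  ALA A   1",
            "ATOM      3 1HB  ALA A   1",
            "ATOM      4  N   GLY A   2",
            "ATOM      5  CA  GLY A   2");
        groupByResidue(lines).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

Keying on residue name, chain and residue number keeps two alanines in the same chain apart. A fuller version would also need the insertion code in column 27.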

Discussion

  • Egon Willighagen

    The PDBReader in CDK breaks the protein down into residues, called
    Monomer, and aggregates them into a SetOfMolecules. Can you elaborate on
    why you think it is difficult to use this setup?

    Can you explain what the PDB atom types look like, and which info is
    specific to them, so that we can then start an atom type list for PDB atom
    types?

     
  • Adam Auton

    Adam Auton - 2004-11-20

    I didn't realise Monomer was equivalent to residue. Thanks for
    that.
    Regarding a list of atom types, a detailed description can be
    found at
    http://www.rcsb.org/pdb/docs/format/pdbguide2.2/part_76.html
    with the main PDB format description at
    http://www.rcsb.org/pdb/docs/format/pdbguide2.2/guide2.2_frame.html

    However, perhaps more usefully, here is a list of all residues
    and their atom types:

    Amino
    Acid PDB
    ---- ----

    ALA H
    ALA HA
    ALA 1HB
    ALA 2HB
    ALA 3HB
    ALA C
    ALA CA
    ALA CB
    ALA N
    ALA O

    ARG H
    ARG HA
    ARG 2HB
    ARG 3HB
    ARG 2HG
    ARG 3HG
    ARG 2HD
    ARG 3HD
    ARG HE
    ARG 1HH1
    ARG 2HH1
    ARG 1HH2
    ARG 2HH2
    ARG C
    ARG CA
    ARG CB
    ARG CG
    ARG CD
    ARG CZ
    ARG N
    ARG NE
    ARG NH1
    ARG NH2
    ARG O

    ASP H
    ASP HA
    ASP 2HB
    ASP 3HB
    ASP HD2
    ASP C
    ASP CA
    ASP CB
    ASP CG
    ASP N
    ASP O
    ASP OD1
    ASP OD2

    ASN H
    ASN HA
    ASN 2HB
    ASN 3HB
    ASN 1HD2
    ASN 2HD2
    ASN C
    ASN CA
    ASN CB
    ASN CG
    ASN N
    ASN ND2
    ASN O
    ASN OD1

    CYS H
    CYS HA
    CYS 2HB
    CYS 3HB
    CYS HG
    CYS C
    CYS CA
    CYS CB
    CYS N
    CYS O
    CYS SG

    GLU H
    GLU HA
    GLU 2HB
    GLU 3HB
    GLU 2HG
    GLU 3HG
    GLU HE2
    GLU C
    GLU CA
    GLU CB
    GLU CG
    GLU CD
    GLU N
    GLU O
    GLU OE1
    GLU OE2

    GLN H
    GLN HA
    GLN 2HB
    GLN 3HB
    GLN 2HG
    GLN 3HG
    GLN 1HE2
    GLN 2HE2
    GLN C
    GLN CA
    GLN CB
    GLN CG
    GLN CD
    GLN N
    GLN NE2
    GLN O
    GLN OE1

    GLY H
    GLY 2HA
    GLY 3HA
    GLY C
    GLY CA
    GLY N
    GLY O

    HIS H
    HIS HA
    HIS 2HB
    HIS 3HB
    HIS HD1
    HIS HD2
    HIS HE1
    HIS HE2
    HIS C
    HIS CA
    HIS CB
    HIS CG
    HIS CD2
    HIS CE1
    HIS N
    HIS ND1
    HIS NE2
    HIS O

    ILE H
    ILE HA
    ILE HB
    ILE 2HG1
    ILE 3HG1
    ILE 1HG2
    ILE 2HG2
    ILE 3HG2
    ILE 1HD1
    ILE 2HD1
    ILE 3HD1
    ILE C
    ILE CA
    ILE CB
    ILE CG1
    ILE CG2
    ILE CD1
    ILE N
    ILE O

    LEU H
    LEU HA
    LEU 2HB
    LEU 3HB
    LEU HG
    LEU 1HD1
    LEU 2HD1
    LEU 3HD1
    LEU 1HD2
    LEU 2HD2
    LEU 3HD2
    LEU C
    LEU CA
    LEU CB
    LEU CG
    LEU CD1
    LEU CD2
    LEU N
    LEU O

    LYS H
    LYS HA
    LYS 2HB
    LYS 3HB
    LYS 2HG
    LYS 3HG
    LYS 2HD
    LYS 3HD
    LYS 2HE
    LYS 3HE
    LYS 1HZ
    LYS 2HZ
    LYS 3HZ
    LYS C
    LYS CA
    LYS CB
    LYS CG
    LYS CD
    LYS CE
    LYS N
    LYS NZ
    LYS O

    MET H
    MET HA
    MET 2HB
    MET 3HB
    MET 2HG
    MET 3HG
    MET 1HE
    MET 2HE
    MET 3HE
    MET C
    MET CA
    MET CB
    MET CG
    MET CE
    MET N
    MET O
    MET SD

    PHE H
    PHE HA
    PHE 1HB
    PHE 2HB
    PHE HD1
    PHE HD2
    PHE HE1
    PHE HE2
    PHE HZ
    PHE C
    PHE CA
    PHE CB
    PHE CG
    PHE CD1
    PHE CD2
    PHE CE1
    PHE CE2
    PHE CZ
    PHE N
    PHE O

    PRO H2
    PRO H3
    PRO HA
    PRO 2HB
    PRO 3HB
    PRO 2HG
    PRO 3HG
    PRO 2HD
    PRO 3HD
    PRO C
    PRO CA
    PRO CB
    PRO CG
    PRO CD
    PRO N
    PRO O

    SER H
    SER HA
    SER 2HB
    SER 3HB
    SER HG
    SER C
    SER CA
    SER CB
    SER N
    SER O
    SER OG

    THR H
    THR HA
    THR HB
    THR HG1
    THR 1HG2
    THR 2HG2
    THR 3HG2
    THR C
    THR CA
    THR CB
    THR CG2
    THR N
    THR O
    THR OG1

    TRP H
    TRP HA
    TRP 2HB
    TRP 3HB
    TRP HD1
    TRP HE1
    TRP HE3
    TRP HZ2
    TRP HZ3
    TRP HH2
    TRP C
    TRP CA
    TRP CB
    TRP CG
    TRP CD1
    TRP CD2
    TRP CE2
    TRP CE3
    TRP CZ2
    TRP CZ3
    TRP CH2
    TRP N
    TRP NE1
    TRP O

    TYR H
    TYR HA
    TYR 2HB
    TYR 3HB
    TYR HD1
    TYR HD2
    TYR HE1
    TYR HE2
    TYR HH
    TYR C
    TYR CA
    TYR CB
    TYR CG
    TYR CD1
    TYR CD2
    TYR CE1
    TYR CE2
    TYR CZ
    TYR N
    TYR O
    TYR OH

    VAL H
    VAL HA
    VAL HB
    VAL 1HG1
    VAL 2HG1
    VAL 3HG1
    VAL 1HG2
    VAL 2HG2
    VAL 3HG2
    VAL C
    VAL CA
    VAL CB
    VAL CG1
    VAL CG2
    VAL N
    VAL O

    Note that the terminal atom type OXT is also possible for all
    residues.

    Hope that helps!

     
  • Egon Willighagen

    OK, thanks, that was indeed useful. I've added the atom types in CML2
    format to CVS.
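
As an illustration of how a residue and atom-name list like the one above could be used, the following sketch checks whether an atom name is listed for a residue (treating OXT as valid for every listed residue) and derives the element symbol from the name. The class `PdbAtomTypes` and its methods are hypothetical. This is not the atom type list that was added to the CDK, and only the ALA and GLY entries from the thread are included. It assumes Java 9 or later.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical lookup, not the CDK atom type list: checks PDB atom names per residue. */
final class PdbAtomTypes {

    // Excerpt of the residue/atom-name list from the thread (ALA and GLY only).
    private static final Map<String, Set<String>> ATOMS_BY_RESIDUE = Map.of(
        "ALA", Set.of("H", "HA", "1HB", "2HB", "3HB", "C", "CA", "CB", "N", "O"),
        "GLY", Set.of("H", "2HA", "3HA", "C", "CA", "N", "O"));

    /** True if the atom name is listed for the residue, or is the terminal OXT. */
    static boolean isKnown(String residue, String atomName) {
        if ("OXT".equals(atomName)) {
            // OXT is possible for any residue, as long as the residue itself is listed.
            return ATOMS_BY_RESIDUE.containsKey(residue);
        }
        return ATOMS_BY_RESIDUE.getOrDefault(residue, Set.of()).contains(atomName);
    }

    /**
     * Element symbol for an atom name in a standard amino acid: the first
     * character after any leading digits ("1HB" -> "H", "CA" -> "C").
     * Not valid for hetero groups, where "CA" may mean calcium.
     */
    static String elementOf(String atomName) {
        int i = 0;
        while (i < atomName.length() && Character.isDigit(atomName.charAt(i))) {
            i++;
        }
        if (i == atomName.length()) {
            throw new IllegalArgumentException("No element letter in: " + atomName);
        }
        return atomName.substring(i, i + 1);
    }

    public static void main(String[] args) {
        System.out.println(isKnown("ALA", "CB"));  // true
        System.out.println(isKnown("GLY", "CB"));  // false
        System.out.println(isKnown("GLY", "OXT")); // true
        System.out.println(elementOf("1HB"));      // H
    }
}
```

The first-letter rule works here only because every atom in the twenty listed residues is H, C, N, O or S. A general PDB reader would need the element columns or a hetero-group dictionary instead.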

     
