From: Dan B. <dm...@mr...> - 2004-08-11 20:30:24
The solution (in my experience) is to just apply manual filtering of such
structures after applying statistics. When I find these things I delete them
from my psimap database. These things are so abhorrent they generally skew
any analysis by miles, allowing us to recognize the problem immediately.

This mailing list is the perfect place to keep up to date on 'problem
structures', as subtly different analyses can show up different problems.

For example I have the following code in my psimaper...

  my $st = "
    SELECT
      a.PDB, a.SUNID, a.DOMAIN, a.NEAT_SCCS
    FROM
      $scopDb.domain a                        # Domain table
    INNER JOIN
      $pdbDb.pdb b                            # PDB table
    ON
      a.PDB = b.PDB
    WHERE
    (    b.TYPE = 'X-RAY DIFFRACTION'         # PDB criteria
  #   OR b.TYPE = 'SYNCHROTON DIFFRACTION'    #
  #   OR b.TYPE = 'NMR'                       #
  #   OR b.TYPE = 'ELECTRON MICROSCOPY'       #
  #   OR b.TYPE = 'NEUTRON DIFFRACTION'       #
  #   OR b.TYPE = 'FIBER DIFFRACTION'         #
  #   OR b.TYPE = 'THEORETICAL MODEL'         #
  #   OR b.TYPE = 'FLUORESCENCE'              #
    )                                         #
  #  AND b.RESOLUTION < 2                     #
  ";

The above helps a lot with subsequent domain assignment. And...

  next if $pdb eq '1gtv';   # Multi-Model PDB entries
  next if $pdb eq '1cm4';   #
  next if $pdb eq '1cwq';   #
  next if $pdb eq '1e0p';   #
  next if $pdb eq '1gu8';   #

  next if $pdb eq '4cpa';   # Contains null chain AND
                            # named chain!
                            # PDB stubbornly refuse to fix,
                            # so you should complain!

  next if $pdb eq '1g7y';   # Contain 'significant'
  next if $pdb eq '1ygp';   # crystal overlap between
  next if $pdb eq '1hb9';   # subunits - PDB will fix.
  next if $pdb eq '1hb5';   #

I see that they don't fully overlap with yours, which is interesting. We
should make a community webpage to name and shame these PDB structures. I
call it PDBug, let's set up a Bugzilla!

I am doing a simple analysis of interface density, where I compare the
number of residues on each side of a domain-domain interface (a simple plot
shows an interesting trend). This analysis showed up this problem
structure...

  1ona

Which is a problem both in the PDB and with my domain parser...
Anyway it has an unnatural 'domain interface' mediated by a mislabeled
ligand.

Is Wan a psimapper? I bet he knows of some PDB problems.

All the above problems should be fixed by using MSD. I am planning to write
an O'Reilly book (mini) called 'MSD', basically because I like the idea of
adding a virus into the O'Reilly taxonomy (using a picture of a virus from
the 3D structure in MSD).

We should start using MSD with PSIMAP ASAP. We can write the book after
that.

MSD will ...

  Fix multimodel PDB / NMR structures.
  Fix problems mapping PDB to SwissProt.
  Automatically provide sequence / structure alignments.
  Provide clean ;) SCOP/CATH/PFAM/DALI (e-family) assignments to structures.
  Provide clean ligand assignments to structures.
  Give us 'biomolecule' data
  Filter crystal contacts
  Do away with the need for parsers in PSIMAP.
  Allow easy integration of domain-domain, domain-ligand and ligand-ligand
  information
  etc., etc.

P.S. I did an analysis of residue connectivity statistics for a second year
report. You can find this at...

http://interaction.mrc-dunn.cam.ac.uk/Scientists/Dan_Bolser/MyDoc/New%20Folder/

(I may change the folder name without warning.)

Just let me know if you want to turn this into a paper with some extra work
(actually it looks quite comprehensive). I can work closely with anyone on
this topic, as I still remember the literature and the problems in detail.

I am half planning to revive this report into a thesis chapter anyway, so
any help I can get would be great ;)

All the best,
Dan.

On Wed, 11 Aug 2004, Andreas Henschel wrote:

>Hi psimap folks,
>
>I found some pdb entries that look like superimposed structures, a
>structural alignment of very similar (they are not NMR!)
>that cause trouble to psimap. One of them is 1cm4. I found out about
>them when I counted contacts per residue and some were as high as 50,
>thus falsifying my hotspot statistics.
>
>Also, these guys have suspiciously high contact residues:
>| 1ae6 |
>| 1bwm |
>| 1cm4 |
>| 1cr9 |
>| 1cu4 |
>| 1eo8 |
>| 1ob1 |
>| 1qfu |
>For the moment, I will ignore them.
>Any idea?
>Cheers,
>Andreas
>
>_______________________________________________
>Psisoft-devl mailing list
>Psi...@li...
>https://lists.sourceforge.net/lists/listinfo/psisoft-devl
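
A possible way to keep the skip list from the reply maintainable is a lookup table instead of a chain of `next if` tests. This is only a sketch, not the actual psimaper code: the hash name `%PROBLEM_PDB`, the sub `skip_reason`, and the candidate IDs `1abc` and `2xyz` are hypothetical. The grouping of reasons is one reading of the comments in the reply (the empty `#` lines are taken as continuing the comment above them).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Problem entries named in the reply, keyed by lower-case PDB ID.
# The reason strings paraphrase the comments in the original skip list.
my %PROBLEM_PDB = (
    '1gtv' => 'multi-model entry',
    '1cm4' => 'multi-model entry',
    '1cwq' => 'multi-model entry',
    '1e0p' => 'multi-model entry',
    '1gu8' => 'multi-model entry',
    '4cpa' => 'null chain and named chain',
    '1g7y' => 'crystal overlap between subunits',
    '1ygp' => 'crystal overlap between subunits',
    '1hb9' => 'crystal overlap between subunits',
    '1hb5' => 'crystal overlap between subunits',
);

# Return the reason a PDB ID is excluded, or undef if it is not listed.
# The ID is lower-cased first, so '4CPA' and '4cpa' are treated alike.
sub skip_reason {
    my ($pdb) = @_;
    return $PROBLEM_PDB{ lc($pdb) };
}

# Example loop over some candidate IDs (1abc and 2xyz are placeholders).
my @candidates = qw(1gtv 1abc 4CPA 2xyz);
for my $pdb (@candidates) {
    if ( defined( my $why = skip_reason($pdb) ) ) {
        print "skip $pdb: $why\n";
        next;
    }
    print "keep $pdb\n";
}
```

Keeping the reason next to each ID means new problem structures reported on the list can be added as one line each, and the reason can be logged when an entry is dropped.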