From: Wan K. K. <kim...@bi...> - 2004-08-12 06:52:52
Hi,

I would like to add a few PDBugs.

# next if ($mpdb =~ /^(1cwq|1e0p|1gtw|1cm4)$/) ;   ### same model with different chain identifier

I am using PQS coordinates for X-ray crystallography structures, which
solves many such problems (e.g. 1g7y is split into 3 files without
overlap).

cheers
wan

>The solution (in my experience) is to just apply manual filtering of such
>structures after applying statistics. When I find these things I delete
>them from my psimap database.
>
>These things are so abhorrent they generally skew any analysis by miles,
>allowing us to recognize the problem immediately.
>
>This mailing list is the perfect place to keep up to date on 'problem
>structures', as subtly different analyses can show up different problems.
>
>
>For example I have the following code in my psimaper...
>
>my $st = "
>  SELECT a.PDB, a.SUNID, a.DOMAIN, a.NEAT_SCCS
>  FROM $scopDb.domain a                       # Domain table
>  INNER JOIN $pdbDb.pdb b                     # PDB table
>  ON a.PDB = b.PDB
>  WHERE (
>       b.TYPE = 'X-RAY DIFFRACTION'           # PDB criteria
>#   OR b.TYPE = 'SYNCHROTON DIFFRACTION'      #
>#   OR b.TYPE = 'NMR'                         #
>#   OR b.TYPE = 'ELECTRON MICROSCOPY'         #
>#   OR b.TYPE = 'NEUTRON DIFFRACTION'         #
>#   OR b.TYPE = 'FIBER DIFFRACTION'           #
>#   OR b.TYPE = 'THEORETICAL MODEL'           #
>#   OR b.TYPE = 'FLUORESCENCE'                #
>  )                                           #
>#  AND b.RESOLUTION < 2                       #
>  ";
>
>The above helps a lot with subsequent domain assignment.
>
>And...
>
>  next if $pdb eq '1gtv';                     # Multi-Model PDB entries
>  next if $pdb eq '1cm4';                     #
>  next if $pdb eq '1cwq';                     #
>  next if $pdb eq '1e0p';                     #
>  next if $pdb eq '1gu8';                     #
>
>  next if $pdb eq '4cpa';                     # Contains null chain AND
>                                              # named chain!
>                                              # PDB stubbornly refuses to fix,
>                                              # so you should complain!
>
>
>  next if $pdb eq '1g7y';                     # Contain 'significant'
>  next if $pdb eq '1ygp';                     # crystal overlap between
>  next if $pdb eq '1hb9';                     # subunits - PDB will fix.
>  next if $pdb eq '1hb5';                     #
>
>
>I see that they don't fully overlap with yours, which is interesting.
>We should make a community webpage to name and shame these PDB structures.
>
>I call it PDBug, let's set up a Bugzilla!
>
>
>
>I am doing a simple analysis of interface density, where I compare the
>number of residues on each side of a domain-domain interface (a simple
>plot shows an interesting trend). This analysis showed up this problem
>structure...
>
>1ona
>
>Which is a problem both in the PDB and with my domain parser... Anyway it
>has an unnatural 'domain interface' mediated by a mislabeled ligand.
>
>
>Is Wan a psimapper? I bet he knows of some PDB problems.
>
>
>All the above problems should be fixed by using MSD. I am planning to
>write an O-rily book (mini) called 'MSD', basically because I like the
>idea of adding a virus into the O-rily taxonomy (using a picture of a
>virus from the 3D structure in MSD).
>
>We should start using MSD with PSIMAP ASAP. We can write the book after
>that. MSD will ...
>
>  Fix multimodel PDB / NMR structures.
>  Fix problems mapping PDB to SwissProt.
>  Automatically provide sequence / structure alignments.
>  Provide clean ;) SCOP/CATH/PFAM/DALI (e-family) assignments to structures.
>  Provide clean ligand assignments to structures.
>  Give us 'biomolecule' data
>  Filter crystal contacts
>  Do away with the need for parsers in PSIMAP.
>  Allow easy integration of domain-domain, domain-ligand and ligand-ligand
>information
>
>  etc., etc.
>
>
>
>P.S. I did an analysis of residue connectivity statistics for a second
>year report.
>
>
>You can find this at...
>
>http://interaction.mrc-dunn.cam.ac.uk/Scientists/Dan_Bolser/MyDoc/New%20Folder/
>
>(I may change the folder name without warning.)
>
>
>Just let me know if you want to turn this into a paper with some extra
>work (actually it looks quite comprehensive). I can work closely with
>anyone on this topic, as I still remember the literature and the problems
>in detail.
>
>I am half planning to revive this report into a thesis chapter anyway, so
>any help I can get would be great ;)
>
>
>All the best,
>Dan.
>
>
>
>On Wed, 11 Aug 2004, Andreas Henschel wrote:
>
>
>
>>Hi psimap folks,
>>
>>I found some pdb entries that look like superimposed structures, a
>>structural alignment of very similar ones (they are not NMR!),
>>that cause trouble to psimap. One of them is 1cm4. I found out about
>>them when I counted contacts per residue and some were as high as 50,
>>thus falsifying my hotspot statistics.
>>
>>Also, these guys have suspiciously high contact residues:
>>| 1ae6 |
>>| 1bwm |
>>| 1cm4 |
>>| 1cr9 |
>>| 1cu4 |
>>| 1eo8 |
>>| 1ob1 |
>>| 1qfu |
>>For the moment, I will ignore them.
>>Any idea?
>>Cheers,
>>Andreas
>>
>>
>>
>>_______________________________________________
>>Psisoft-devl mailing list
>>Psi...@li...
>>https://lists.sourceforge.net/lists/listinfo/psisoft-devl
>>
>>
>>
>
>
>

-- 
**********************************************
* Wan Kyu Kim                                *
*                                            *
* Biotechnologisches Zentrum der TU-Dresden  *
* Bioinformatics Group                       *
* Tatzberg 47-51                             *
* 01307 Dresden                              *
*                                            *
* web: www.biotec.tu-dresden.de              *
* tel: 049 351 46340064                      *
* fax: 049 351 46340061                      *
**********************************************