From: Dan B. <dm...@mr...> - 2005-01-27 00:09:23
|
Sorry to keep sending emails, but I just found what I was looking for...

The most highly ranked individual in Wikipedia, who is by this measure more important than any other individual in the whole encyclopedia, is...

J. R. R. Tolkien

Note that entries such as 'US President' don't count as individuals. He ranks 548 out of every single entry in Wikipedia, which means he beat a lot of stuff. Bill Clinton comes next (I think), but a long way down the list at number 656 out of the Wikipedia universe of things.

Oops... I missed Aristotle (566), Adolf Hitler (569) and Jesus Christ (617).

I wonder where the first person you have never heard of ranks.

There are some really interesting things in this list. The most important specific date ever is March 1st, then February 11th, followed soon after by July 1st!

The days of the week go like this...

320 Sunday
330 Friday
337 Monday
338 Tuesday
351 Thursday
352 Saturday
359 Wednesday

The BBC is the highest ranking corporation (342), but comes after Christianity (62), Islam (142), Buddhism (249), Hinduism (267) and Judaism (327). It beats Jesus Christ (617), but not the Bible (215).

Jazz is the first non-time-related, non-geographic, non-academic, non-religious, non-sporting human endeavour to make the list (unless you count film, TV and actor).

Sadly none of my own wiki pages are ranked in the top 1000, but protein does pretty well (447)!

Microsoft (577) ranks just above Nazi Germany (578), Democracy (364) beats Wikipedia (365), and the sun (378) beats the moon (390).

On Wed, 26 Jan 2005, Dan Bolser wrote:

>http://www.searchmorph.com/weblog/index.php?id=43
>
>PageRank is like 'importance', and is the algorithm used by Google to
>rank webpages. Wikipedia is an encyclopedia. By ranking articles in the wiki
>by their interlinks, you find out the most important things (or biases) in
>the encyclopedia, and hence human knowledge.
>_______________________________________________
>Psisoft-devl mailing list
>Psi...@li...
>https://lists.sourceforge.net/lists/listinfo/psisoft-devl
|
From: Dan B. <dm...@mr...> - 2005-01-26 23:22:54
|
Quite frankly this is the coolest thing I have ever seen!

Film and TV rank higher than Animal! We are a strange species!

It is really interesting that individual European countries rank higher than the 'Europe' entry. Could this be a function of the 'decay' parameter used in passing weights around the links? I assume Europe isn't widely linked outside the context of particular countries, and is therefore ranked lower overall than particular countries.

For similar reasons, the fact that WW2 ranks very high, and isn't specifically about countries or years, is significant. Outside of countries and years, it is the most important page in the wiki... but I guess WW2 has a heck of a lot to do with countries and years.

It's interesting to see that the Google rank for the query 'wikipedia' is very different. I guess this measures the effects of 'external' links on the ranking (or Google trickery).

34. 0.000652 17.40 Mathematics

How ace is that!

On Wed, 26 Jan 2005, Dan Bolser wrote:

>http://www.searchmorph.com/weblog/index.php?id=43
>
>PageRank is like 'importance', and is the algorithm used by Google to
>rank webpages. Wikipedia is an encyclopedia. By ranking articles in the wiki
>by their interlinks, you find out the most important things (or biases) in
>the encyclopedia, and hence human knowledge.
|
From: Dan B. <dm...@mr...> - 2005-01-26 22:57:12
|
http://www.searchmorph.com/weblog/index.php?id=43

PageRank is like 'importance', and is the algorithm used by Google to rank webpages. Wikipedia is an encyclopedia. By ranking articles in the wiki by their interlinks, you find out the most important things (or biases) in the encyclopedia, and hence human knowledge.
|
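For readers who want to see what "ranking articles by their interlinks" involves, here is a minimal sketch of the textbook power-iteration form of PageRank in Python. It is not the code behind the linked ranking: the toy link graph, the function name `pagerank` and the damping value of 0.85 are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-article wiki: countries link to each other, and nothing links to 'Europe'.
toy_links = {
    "Europe": ["France", "Germany"],
    "France": ["Germany", "World War II"],
    "Germany": ["France", "World War II"],
    "World War II": ["France", "Germany"],
}
ranks = pagerank(toy_links)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page:14s} {score:.4f}")
```

In this toy graph the page that nothing links to ends up with only the baseline share of rank, which is one way an entry could rank below the pages it points to.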
From: <bi...@ka...> - 2004-11-28 02:59:10
|
Hi,

>Previously we have sent round emails about bad PDB files for use in
>PSIMAP (bad for various reasons).
>
>At the moment I have come across (again) the old problem of amino acids in
>the interface with multiple 'alternate location' forms (those residues or
>short stretches with some conformational shift in the crystal structure).
>If you are not careful you can end up counting such amino acids multiple
>times, and if you *are* careful which actual location should you use?

I see. This is a good problem.

>Another potential problem is that some SCOP Folds are garbage folds, and
>should not be used in the way intended (evolutionary/structural units).
>For example some split chain domains have not been properly annotated by
>Alexy, and he is totally happy because the problem lies in the PDB file.
>
>Also some domains are only fragments of a particular fold, and actually
>assume a different conformation in the 'full length' version of the
>protein. However, this is not clear from SCOP.

I agree. Fold class should be removed in the proper classification.

>Finally, we don't have a proper assessment of psimap in terms of known
>(expertly assessed and experimentally verified) oligomerization state and
>(therefore) crystal contacts.
>
>I would like to make a database of interfaces which adds all this kind of
>information.

Good.

>We can start by looking at all the papers relating to the determination of
>crystal contacts, and store all their hard work in determining (from the
>literature) the known oligomerization state for the proteins in our
>database (along with the citation for the state assigned).

OK.

>We also need a strategy to cluster and filter the interfaces so we can
>remove specific problem instances from the dataset (when a simpler
>version of the same interface can be used instead).
>
>We should remove all potential garbage from problematic SCOP entries.

I agree. I hope our ECOPS will handle that.

>This is quite a big job, and not too exciting, but the resulting dataset
>(if we can organize so we can collectively maintain it) will be very
>useful to have.

Yeah.

>For example we should each pick a different paper on the study of protein
>interfaces and copy out the list of pdb files used and the category
>assigned to the interface in the paper. Such a simple library (which is
>essential for detailed analysis of protein-protein interaction) could be
>very useful for the whole community, as currently no such manually
>created, expertly defined, machine readable database exists.

I agree.

>It would be very simple to put a web interface on a simple underlying
>mysql database.

Agree.

Cheers

Jong

>Please let me know what you think,
>
>All the best,
>Dan.
>
>_______________________________________________
>Psimap mailing list
>Ps...@sa...
>http://saju.kaist.ac.kr/mailman/listinfo/psimap
|
From: Dan B. <dm...@mr...> - 2004-11-27 23:03:48
|
On Sat, 27 Nov 2004, Dan Bolser wrote:

>Hi,
>
>Previously we have sent round emails about bad PDB files for use in
>PSIMAP (bad for various reasons).
>
>At the moment I have come across (again) the old problem of amino acids in
>the interface with multiple 'alternate location' forms (those residues or
>short stretches with some conformational shift in the crystal structure).
>If you are not careful you can end up counting such amino acids multiple
>times, and if you *are* careful which actual location should you use?
>
>Another potential problem is that some SCOP Folds are garbage folds, and
>should not be used in the way intended (evolutionary/structural units).
>For example some split chain domains have not been properly annotated by
>Alexy, and he is totally happy because the problem lies in the PDB file.
>
>Also some domains are only fragments of a particular fold, and actually
>assume a different conformation in the 'full length' version of the
>protein. However, this is not clear from SCOP.
>
>Finally, we don't have a proper assessment of psimap in terms of known
>(expertly assessed and experimentally verified) oligomerization state and
>(therefore) crystal contacts.
>
>I would like to make a database of interfaces which adds all this kind of
>information.
>
>We can start by looking at all the papers relating to the determination of
>crystal contacts, and store all their hard work in determining (from the
>literature) the known oligomerization state for the proteins in our
>database (along with the citation for the state assigned).
>
>We also need a strategy to cluster and filter the interfaces so we can
>remove specific problem instances from the dataset (when a simpler
>version of the same interface can be used instead).
>
>We should remove all potential garbage from problematic SCOP entries.
>
>This is quite a big job, and not too exciting, but the resulting dataset
>(if we can organize so we can collectively maintain it) will be very
>useful to have.

I just checked, and I found that there are only around 3000 distinct domain-domain pairs at the 40% sequence identity threshold. That means that between us we could quite quickly check each one, and manually classify the interfaces. Such a dataset would be totally unique (as far as I know) and invaluable for future work.

Now we just have to decide what to call the database ;)

>For example we should each pick a different paper on the study of protein
>interfaces and copy out the list of pdb files used and the category
>assigned to the interface in the paper. Such a simple library (which is
>essential for detailed analysis of protein-protein interaction) could be
>very useful for the whole community, as currently no such manually
>created, expertly defined, machine readable database exists.
>
>It would be very simple to put a web interface on a simple underlying
>mysql database.
>
>Please let me know what you think,
>
>All the best,
>Dan.
|
From: Dan B. <dm...@mr...> - 2004-11-27 19:00:46
|
Hi,

Previously we have sent round emails about bad PDB files for use in PSIMAP (bad for various reasons).

At the moment I have come across (again) the old problem of amino acids in the interface with multiple 'alternate location' forms (those residues or short stretches with some conformational shift in the crystal structure). If you are not careful you can end up counting such amino acids multiple times, and if you *are* careful, which actual location should you use?

Another potential problem is that some SCOP Folds are garbage folds, and should not be used in the way intended (evolutionary/structural units). For example some split chain domains have not been properly annotated by Alexy, and he is totally happy because the problem lies in the PDB file.

Also some domains are only fragments of a particular fold, and actually assume a different conformation in the 'full length' version of the protein. However, this is not clear from SCOP.

Finally, we don't have a proper assessment of psimap in terms of known (expertly assessed and experimentally verified) oligomerization state and (therefore) crystal contacts.

I would like to make a database of interfaces which adds all this kind of information.

We can start by looking at all the papers relating to the determination of crystal contacts, and store all their hard work in determining (from the literature) the known oligomerization state for the proteins in our database (along with the citation for the state assigned).

We also need a strategy to cluster and filter the interfaces so we can remove specific problem instances from the dataset (when a simpler version of the same interface can be used instead).

We should remove all potential garbage from problematic SCOP entries.

This is quite a big job, and not too exciting, but the resulting dataset (if we can organize so we can collectively maintain it) will be very useful to have.

For example we should each pick a different paper on the study of protein interfaces and copy out the list of PDB files used and the category assigned to the interface in the paper. Such a simple library (which is essential for detailed analysis of protein-protein interaction) could be very useful for the whole community, as currently no such manually created, expertly defined, machine readable database exists.

It would be very simple to put a web interface on a simple underlying MySQL database.

Please let me know what you think,

All the best,
Dan.
|
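The 'alternate location' problem raised in this message can be made concrete with a short Python sketch. It assumes the standard fixed-column PDB ATOM layout (altLoc in column 17, chain in column 22, residue number in columns 23-26, insertion code in column 27, occupancy in columns 55-60). The rule used here, keeping the conformer with the highest summed occupancy, is only one possible answer to "which actual location should you use?", and the helper names `atom_line` and `select_altloc` are made up for this example.

```python
from collections import defaultdict


def atom_line(serial, name, altloc, resname, chain, resseq, occupancy):
    """Build a fixed-column ATOM record (coordinates and B-factor are dummies)."""
    return (f"ATOM  {serial:5d} {name:<4s}{altloc}{resname:>3s} {chain}{resseq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{occupancy:6.2f}{20.0:6.2f}")


def select_altloc(pdb_lines):
    """Keep blank-altLoc atoms plus, per residue, the altLoc label with the
    highest summed occupancy (the first label seen wins a tie)."""
    occupancy = defaultdict(lambda: defaultdict(float))  # residue -> altLoc -> occupancy sum
    atoms = []
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        residue = (line[21], int(line[22:26]), line[26])  # chain, resSeq, iCode
        altloc = line[16]
        atoms.append((residue, altloc, line))
        if altloc != " ":
            occupancy[residue][altloc] += float(line[54:60])
    chosen = {res: max(alts, key=alts.get) for res, alts in occupancy.items()}
    return [line for res, alt, line in atoms if alt == " " or alt == chosen[res]]


# Synthetic records: residue 10 has two conformers (A and B) for its CA atom.
lines = [
    atom_line(1, " N  ", " ", "SER", "A", 10, 1.00),
    atom_line(2, " CA ", "A", "SER", "A", 10, 0.40),
    atom_line(3, " CA ", "B", "SER", "A", 10, 0.60),
    atom_line(4, " N  ", " ", "GLY", "A", 11, 1.00),
]
kept = select_altloc(lines)
residues = {(l[21], l[22:26], l[26]) for l in kept}
print(len(kept), "atoms kept in", len(residues), "residues")
```

Keying residues on chain, residue number and insertion code means a residue is counted once however many conformers it has.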
From: Dan B. <dm...@mr...> - 2004-11-19 13:33:18
|
Hello psimappers!

This paper looks highly relevant to our work...

Discovery of binding motif pairs from protein complex structural data and protein interaction sequence data.

% 9711271 (JID)
@Article{pmid14992513,
  Author="Li, H and Li, J and Tan, S H and Ng, S K",
  Title="{Discovery of binding motif pairs from protein complex structural data and protein interaction sequence data}",
  Journal="Pac Symp Biocomput",
  Year="2004",
  Pages="312-323"
}
|
From: Dan B. <dm...@mr...> - 2004-10-31 16:01:31
|
One example (the reason I started the last mail but forgot...)

PDB 1grh ...

This guy has a SEQRES entry of 478 residues, but an ATOM entry for only 1 residue! The PDB file is a single residue!

You can mine such fun for yourself with rafTools.

On Sun, 31 Oct 2004, Dan Bolser wrote:

>Dear PsiMappers,
>
>I have been looking at old code to parse the ASTRAL RAF files. The RAF
>files form the very core of SCOP, and allow very convenient mapping
>between seqres and atom 'sequences' and residue identifiers (in the PDB
>convention).
>
>This approach is very powerful, and avoids much of the complication of
>using the SCOP parsable files with the 'raw' PDB data. Additionally, using
>RAF allows you to generate ASTRAL sequence files automatically, which is
>very useful for analysing alignments of these sequences.
>
>If anyone is interested I will upload the 'rafTools' package into psiSoft.
>
>All the best,
>Dan.
|
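The SEQRES versus ATOM mismatch described for 1grh can also be checked straight from a PDB file, without RAF. The Python sketch below is not rafTools (its interface is not shown in this thread). It assumes the standard fixed-column SEQRES layout (chain in column 12, declared residue count in columns 14-17) and runs on two synthetic lines shaped like the reported case, not on the real entry. The helper names are hypothetical.

```python
def seqres_line(chain, num_res, residues):
    """Build one fixed-column SEQRES record (serial number fixed at 1)."""
    return f"SEQRES {1:3d} {chain} {num_res:4d}  " + " ".join(residues)


def atom_line(serial, resname, chain, resseq):
    """Build a fixed-column ATOM record for a CA atom at the origin."""
    return (f"ATOM  {serial:5d}  CA  {resname:>3s} {chain}{resseq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{20.0:6.2f}")


def seqres_vs_atom(pdb_lines):
    """Per chain: (residues declared in SEQRES, residues that have ATOM records)."""
    declared, observed = {}, {}
    for line in pdb_lines:
        if line.startswith("SEQRES"):
            declared[line[11]] = int(line[13:17])
        elif line.startswith("ATOM"):
            # A residue is identified by its number plus insertion code.
            observed.setdefault(line[21], set()).add((line[22:26], line[26]))
    return {chain: (total, len(observed.get(chain, ())))
            for chain, total in declared.items()}


# Synthetic input: 478 residues declared, coordinates for a single residue.
toy = [seqres_line("A", 478, ["MET", "ALA", "GLY"]), atom_line(1, "GLY", "A", 3)]
for chain, (n_declared, n_observed) in seqres_vs_atom(toy).items():
    print(f"chain {chain}: {n_declared} in SEQRES, {n_observed} with coordinates")
```

Residues stored as HETATM (modified amino acids, for instance) are not counted here, so a small shortfall is not necessarily an error.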
From: Dan B. <dm...@mr...> - 2004-10-31 15:58:31
|
Dear PsiMappers,

I have been looking at old code to parse the ASTRAL RAF files. The RAF files form the very core of SCOP, and allow very convenient mapping between seqres and atom 'sequences' and residue identifiers (in the PDB convention).

This approach is very powerful, and avoids much of the complication of using the SCOP parsable files with the 'raw' PDB data. Additionally, using RAF allows you to generate ASTRAL sequence files automatically, which is very useful for analysing alignments of these sequences.

If anyone is interested I will upload the 'rafTools' package into psiSoft.

All the best,
Dan.
|
From: Dan B. <dm...@mr...> - 2004-09-18 17:50:27
|
The following PDB:SUNID1:SUNID2 pairs have the most skewed interfaces due to bugs in the PDB

PDB:SUNID1:SUNID2  C1  C2  EDGE  TYPE   MODE    cRatio
1d0n:40838:40839    4   1     4  INTRA  HOMO       4.0
1el1:59452:59451    4   1     4  INTER  HOMO       4.0
1lt9:78190:78189    1   4     4  INTRA  HETERO     4.0
1lt9:78195:78194    1   4     4  INTRA  HETERO     4.0
1ltj:78200:78199    1   4     4  INTRA  HETERO     4.0
1o6v:81099:81101    1   5     5  INTER  HOMO       5.0
1ona:23972:23974    6   1     6  INTER  HOMO       6.0
4gpd:39934:30018    7   2     7  INTER  HETERO     3.5

All the best,
Dan.
|
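The message does not define cRatio, but every row above is consistent with it being the larger of C1 and C2 divided by the smaller. The Python sketch below works on that inferred definition, which is an assumption. The function name `c_ratio` and the cut-off of 3.0 are also made up for illustration.

```python
def c_ratio(c1, c2):
    """Larger residue count divided by the smaller one (inferred definition)."""
    return max(c1, c2) / min(c1, c2)


# (PDB:SUNID1:SUNID2, C1, C2) copied from three rows of the table above.
rows = [
    ("1d0n:40838:40839", 4, 1),
    ("1o6v:81099:81101", 1, 5),
    ("4gpd:39934:30018", 7, 2),
]
SKEW_CUTOFF = 3.0  # hypothetical threshold for 'suspiciously skewed'
for pair, c1, c2 in rows:
    ratio = c_ratio(c1, c2)
    flag = "check" if ratio >= SKEW_CUTOFF else "ok"
    print(f"{pair}  cRatio={ratio:.1f}  {flag}")
```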
From: Dan B. <dm...@mr...> - 2004-08-14 04:34:50
|
Regarding fixing problems with PDB files... (note that I reported a problem with PDB 1ona, not 1aon).

---------- Forwarded message ----------
Date: Fri, 13 Aug 2004 17:28:58 -0400
From: Christine Zardecki <zar...@rc...>
To: Dan Bolser <dm...@mr...>
Cc: "'in...@rc...'" <in...@rc...>
Subject: Re: pdb-l: RE: mistakes in PDB files?

Dear Dan,

Thanks for your comments -- we will fix the error in 1aon with the next update of the PDB.

PDB users wishing to contribute corrections to PDB entries should send this information to in...@rc.... We're working on a news item to describe this in greater detail, but some information is below.

Sincerely,
Christine Zardecki
RCSB Protein Data Bank

Corrections to entries originally processed by the RCSB, EBI/MSD, or PDBj are handled by the PDB annotation staff and subsequently reviewed by the author(s) depositing the structure. Any changes in released PDB entries are described in the PDB REVDAT records and in the mmCIF/XML DATABASE_PDB_REV_RECORD categories.

In certain cases replacement coordinates for an entry are provided by the depositing author. In these cases the original entry is obsoleted and the replacement coordinates are released in a new superseding PDB entry. The relationship between obsolete and superseding entries is stored in OBSLTE/SPRSDE PDB records and in the mmCIF/XML category PDBX_DATABASE_PDB_OBS_SPR. Queries of obsoleted entries on the RCSB/PDB website always produce the most recent superseding entry. Obsoleted entries remain available in a separate area of the PDB ftp site, ftp://ftp.rcsb.org/pub/pdb/data/structures/obsolete/.

For the entries deposited prior to 1998, a variety of consistency checks have been performed. This has been done as part of an ongoing project to maintain uniformity within the PDB archive. This effort is described in detail at http://www.rcsb.org/pdb/uniformity and in (ref data uniformity papers).

Examples of uniformity corrections include corrections related to atomic nomenclature for both macromolecule and ligand, sequence-coordinate consistency, and the addition of missing records (e.g. citations, synonyms, and sequence database references). Corrections in the pre-1998 entries have been made only in the mmCIF and XML data files.

The mmCIF and XML data files are download options of the RCSB PDB website and are also available via ftp from the following RCSB PDB servers:

ftp://ftp.rcsb.org/pub/pdb/data/structures/all/mmCIF
ftp://beta.rcsb.org/pub/pdb/uniformity/data/XML

The XML data files were produced as part of a joint project by all wwPDB members, and these files are in the final stage of beta testing. Both mmCIF and XML data files conform to the PDB Exchange data dictionary. This dictionary is available in both mmCIF and XML schema form at http://deposit.pdb.org/mmcif/.

On Aug 9, 2004, at 8:02 PM, Dan Bolser wrote:

> On Fri, 6 Aug 2004, Oscar Hur wrote:
>
>> Hi
>>
>> Is it common to find mistakes in published PDB files? I just found quite
>> a few mistakes in one of the proteins in this database. Is there any way
>> for Protein Data Bank to update/correct the existing file in its database?
>> Is there a form for me to submit? What is the procedure to do so?
>
> It would be really nice to have a list of known problems and a
> mechanism of reporting them. A simple 'bug tracker' would work for
> these cases and would let people easily access a list of PDB files
> with outstanding format issues.
>
> It is the little things which would be nice to 'queue up' for someone to
> check - and if necessary assign to 'not a bug' status.
>
> For example the PDB 1ona has a problematic assignment of hetero atoms
> to chains, having a Ca and an Mn assigned to chain B which clearly
> belong in chain C and another Ca and Mn assigned to chain C which
> clearly belong in chain D and another Ca and Mn assigned to chain D
> which clearly belong in chain B. The chain A Ca and Mn are correctly
> assigned.
>
>> For example, in 1KNB, all the strands are anti-parallel. But in its PDB
>> files:
>>
>> SHEET 1 V 6 THR 400 TRP 402 0
>> SHEET 2 V 6 ASP 418 LYS 427 1
>> SHEET 3 V 6 SER 430 ALA 440 1
>> SHEET 4 V 6 ASN 479 ASN 482 1
>> SHEET 5 V 6 LEU 485 THR 486 1
>> SHEET 6 V 6 TYR 573 TYR 577 1
>> SHEET 1 R 4 SER 454 PHE 461 0
>> SHEET 2 R 4 ASN 515 TYR 521 1
>> SHEET 3 R 4 LYS 528 THR 535 1
>> SHEET 4 R 4 TYR 550 TRP 556 1
>>
>> The senses of the strands are all incorrectly labelled as parallel as
>> indicated in col 39-40. They should be labelled as "-1".
>>
>> Oscar
>>
>> TO UNSUBSCRIBE OR CHANGE YOUR SUBSCRIPTION OPTIONS, please see
>> https://lists.sdsc.edu/mailman/listinfo.cgi/pdb-l .
|
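The kind of hetero-atom mislabelling reported for 1ona can be screened for automatically. One rough heuristic, sketched below in Python, is to compare each HETATM's chain label with the chain of its nearest protein atom. This is an illustration under stated assumptions, not a method from the thread: the coordinates are synthetic, the helper names are hypothetical, and 'nearest atom' is a crude proxy that will also flag ligands genuinely sitting at a chain-chain interface.

```python
import math


def coord_line(record, serial, name, resname, chain, resseq, xyz):
    """Build a fixed-column ATOM/HETATM record with the given coordinates."""
    x, y, z = xyz
    return (f"{record:<6s}{serial:5d} {name:<4s} {resname:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{20.0:6.2f}")


def parse_atom(line):
    """Return (record type, chain ID, residue name, (x, y, z))."""
    xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return line[:6].strip(), line[21], line[17:20].strip(), xyz


def misassigned_hetatms(pdb_lines):
    """List (residue name, labelled chain, nearest chain) for every non-water
    HETATM whose nearest ATOM record belongs to a different chain."""
    atoms = [parse_atom(l) for l in pdb_lines if l.startswith(("ATOM", "HETATM"))]
    protein = [a for a in atoms if a[0] == "ATOM"]
    flagged = []
    if not protein:
        return flagged
    for record, chain, resname, xyz in atoms:
        if record != "HETATM" or resname == "HOH":
            continue
        nearest = min(protein, key=lambda a: math.dist(a[3], xyz))
        if nearest[1] != chain:
            flagged.append((resname, chain, nearest[1]))
    return flagged


# Synthetic example: a calcium labelled chain B that sits next to chain C.
toy = [
    coord_line("ATOM", 1, " CA ", "GLY", "B", 1, (0.0, 0.0, 0.0)),
    coord_line("ATOM", 2, " CA ", "GLY", "C", 1, (20.0, 0.0, 0.0)),
    coord_line("HETATM", 3, "CA  ", "CA", "B", 301, (19.0, 0.0, 1.0)),
    coord_line("HETATM", 4, "MN  ", "MN", "C", 302, (21.0, 0.0, 0.0)),
]
print(misassigned_hetatms(toy))
```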
From: Dan B. <dm...@mr...> - 2004-08-12 10:00:17
|
Cheers Wan!

How are you mapping SCOP domain definitions to PQS?

On Thu, 12 Aug 2004, Wan Kyu Kim wrote:

>Hi,
>
>I would like to add a few PDBugs.
>
># next if ($mpdb =~ /^(1cwq|1e0p|1gtw|1cm4)$/) ; ### same model with different chain identifier
>
>I am using PQS coordinates for X-ray crystallography structures,
>which solves many such problems (e.g. 1g7y is split into 3 files
>without overlap).
>
>cheers
>
>wan
|
From: Wan K. K. <kim...@bi...> - 2004-08-12 06:52:52
|
Hi,

I would like to add a few PDBugs.

# next if ($mpdb =~ /^(1cwq|1e0p|1gtw|1cm4)$/) ; ### same model with different chain identifier

I am using PQS coordinates for X-ray crystallography structures, which solves many such problems (e.g. 1g7y is split into 3 files without overlap).

cheers

wan

>The solution (in my experience) is to just apply manual filtering of such
>structures after applying statistics. When I find these things I delete
>them from my psimap database.
>
>These things are so abhorrent they generally skew any analysis by miles,
>allowing us to recognize the problem immediately.
>
>This mailing list is the perfect place to keep up to date on 'problem
>structures', as subtly different analyses can show up different problems.
>
>For example I have the following code in my psimapper...
>
>my $st = "
>  SELECT a.PDB, a.SUNID, a.DOMAIN, a.NEAT_SCCS
>  FROM $scopDb.domain a                    # Domain table
>  INNER JOIN $pdbDb.pdb b                  # PDB table
>  ON a.PDB = b.PDB
>  WHERE (
>        b.TYPE = 'X-RAY DIFFRACTION'       # PDB criteria
>#    OR b.TYPE = 'SYNCHROTON DIFFRACTION'  #
>#    OR b.TYPE = 'NMR'                     #
>#    OR b.TYPE = 'ELECTRON MICROSCOPY'     #
>#    OR b.TYPE = 'NEUTRON DIFFRACTION'     #
>#    OR b.TYPE = 'FIBER DIFFRACTION'       #
>#    OR b.TYPE = 'THEORETICAL MODEL'       #
>#    OR b.TYPE = 'FLUORESCENCE'            #
>  )                                        #
>#  AND b.RESOLUTION < 2                    #
>  ";
>
>The above helps a lot with subsequent domain assignment.
>
>And...
>
>  next if $pdb eq '1gtv';  # Multi-Model PDB entries
>  next if $pdb eq '1cm4';  #
>  next if $pdb eq '1cwq';  #
>  next if $pdb eq '1e0p';  #
>  next if $pdb eq '1gu8';  #
>
>  next if $pdb eq '4cpa';  # Contains null chain AND
>                           # named chain!
>                           # PDB stubbornly refuse to fix,
>                           # so you should complain!
>
>  next if $pdb eq '1g7y';  # Contain 'significant'
>  next if $pdb eq '1ygp';  # crystal overlap between
>  next if $pdb eq '1hb9';  # subunits - PDB will fix.
>  next if $pdb eq '1hb5';  #
>
>I see that they don't fully overlap with yours, which is interesting. We
>should make a community webpage to name and shame these PDB structures.
>
>I call it PDBug, let's set up a Bugzilla!
>
>I am doing a simple analysis of interface density, where I compare the
>number of residues on each side of a domain-domain interface (a simple
>plot shows an interesting trend). This analysis showed up this problem
>structure...
>
>1ona
>
>Which is a problem both in the PDB and with my domain parser... Anyway it
>has an unnatural 'domain interface' mediated by a mis-labeled ligand.
>
>Is Wan a psimapper? I bet he knows of some PDB problems.
>
>All the above problems should be fixed by using MSD. I am planning to
>write an O'Reilly book (mini) called 'MSD', basically because I like the
>idea of adding a virus into the O'Reilly taxonomy (using a picture of a
>virus from the 3D structure in MSD).
>
>We should start using MSD with PSIMAP ASAP. We can write the book after
>that. MSD will ...
>
>  Fix multimodel PDB / NMR structures.
>  Fix problems mapping PDB to SwissProt.
>  Automatically provide sequence / structure alignments.
>  Provide clean ;) SCOP/CATH/PFAM/DALI (e-family) assignments to structures.
>  Provide clean ligand assignments to structures.
>  Give us 'biomolecule' data
>  Filter crystal contacts
>  Do away with the need for parsers in PSIMAP.
>  Allow easy integration of domain-domain, domain-ligand and ligand-ligand
>  information
>
>  etc., etc.
>
>P.S. I did an analysis of residue connectivity statistics for a second
>year report.
>
>You can find this at...
>
>http://interaction.mrc-dunn.cam.ac.uk/Scientists/Dan_Bolser/MyDoc/New%20Folder/
>
>(I may change the folder name without warning.)
>
>Just let me know if you want to turn this into a paper with some extra
>work (actually it looks quite comprehensive). I can work closely with
>anyone on this topic, as I still remember the literature and the problems
>in detail.
>
>I am half planning to revive this report into a thesis chapter anyway, so
>any help I can get would be great ;)
>
>All the best,
>Dan.
>
>On Wed, 11 Aug 2004, Andreas Henschel wrote:
>
>>Hi psimap folks,
>>
>>I found some pdb entries that look like superimposed structures, a
>>structural alignment of very similar structures (they are not NMR!),
>>that cause trouble to psimap. One of them is 1cm4. I found out about
>>them when I counted contacts per residue and some were as high as 50,
>>thus falsifying my hotspot statistics.
>>
>>Also, these guys have suspiciously high contact residues:
>>| 1ae6 |
>>| 1bwm |
>>| 1cm4 |
>>| 1cr9 |
>>| 1cu4 |
>>| 1eo8 |
>>| 1ob1 |
>>| 1qfu |
>>For the moment, I will ignore them.
>>Any idea?
>>Cheers,
>>Andreas

--
**********************************************
* Wan Kyu Kim                                *
*                                            *
* Biotechnologisches Zentrum der TU-Dresden  *
* Bioinformatics Group                       *
* Tatzberg 47-51                             *
* 01307 Dresden                              *
*                                            *
* web: www.biotec.tu-dresden.de              *
* tel: 049 351 46340064                      *
* fax: 049 351 46340061                      *
**********************************************
|
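The two filters quoted in this thread (experimental method and a hand-maintained exclusion list) could be combined in one place. The Python sketch below mirrors the quoted Perl in spirit only. The excluded IDs and their reasons are copied from the quoted message, while the function name `keep_entry` and the `demo1`/`demo2` identifiers are placeholders invented for illustration.

```python
# Problem entries and reasons, as listed in the quoted message.
EXCLUDED = {
    "1gtv": "multi-model entry",
    "1cm4": "multi-model entry",
    "1cwq": "multi-model entry",
    "1e0p": "multi-model entry",
    "1gu8": "multi-model entry",
    "4cpa": "null chain and named chain",
    "1g7y": "crystal overlap between subunits",
    "1ygp": "crystal overlap between subunits",
    "1hb9": "crystal overlap between subunits",
    "1hb5": "crystal overlap between subunits",
}


def keep_entry(pdb_id, method):
    """True for X-ray structures that are not on the exclusion list."""
    return method == "X-RAY DIFFRACTION" and pdb_id.lower() not in EXCLUDED


# Placeholder input: (identifier, experimental method) pairs.
entries = [
    ("1CM4", "X-RAY DIFFRACTION"),   # excluded by the list
    ("demo1", "NMR"),                # excluded by method
    ("demo2", "X-RAY DIFFRACTION"),  # kept
]
for pdb_id, method in entries:
    reason = EXCLUDED.get(pdb_id.lower(), "")
    print(pdb_id, "keep" if keep_entry(pdb_id, method) else "skip", reason)
```

Storing the reason next to each ID keeps the list self-documenting, which matters if several people maintain it.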
From: Dan B. <dm...@mr...> - 2004-08-11 20:30:24
|
The solution (in my experience) is to just apply manual filtering of such
structures after applying statistics. When I find these things I delete
them from my psimap database. These things are so aberrant they generally
skew any analysis by miles, allowing us to recognize the problem
immediately.

This mailing list is the perfect place to keep up to date on 'problem
structures', as subtly different analyses can show up different problems.

For example I have the following code in my psimapper...

  my $st = "
    SELECT
      a.PDB, a.SUNID, a.DOMAIN, a.NEAT_SCCS
    FROM
      $scopDb.domain a                        # Domain table
    INNER JOIN
      $pdbDb.pdb b                            # PDB table
    ON
      a.PDB = b.PDB
    WHERE
    (
         b.TYPE = 'X-RAY DIFFRACTION'         # PDB criteria
    # OR b.TYPE = 'SYNCHROTON DIFFRACTION'    #
    # OR b.TYPE = 'NMR'                       #
    # OR b.TYPE = 'ELECTRON MICROSCOPY'       #
    # OR b.TYPE = 'NEUTRON DIFFRACTION'       #
    # OR b.TYPE = 'FIBER DIFFRACTION'         #
    # OR b.TYPE = 'THEORETICAL MODEL'         #
    # OR b.TYPE = 'FLUORESCENCE'              #
    )
    #
    # AND b.RESOLUTION < 2
    #
  ";

The above helps a lot with subsequent domain assignment. And...

  next if $pdb eq '1gtv';     # Multi-Model PDB entries
  next if $pdb eq '1cm4';     #
  next if $pdb eq '1cwq';     #
  next if $pdb eq '1e0p';     #
  next if $pdb eq '1gu8';     #

  next if $pdb eq '4cpa';     # Contains null chain AND
                              # named chain!
                              # PDB stubbornly refuses to fix,
                              # so you should complain!

  next if $pdb eq '1g7y';     # Contain 'significant'
  next if $pdb eq '1ygp';     # crystal overlap between
  next if $pdb eq '1hb9';     # subunits - PDB will fix.
  next if $pdb eq '1hb5';     #

I see that they don't fully overlap with yours, which is interesting. We
should make a community webpage to name and shame these pdb structures.

I call it PDBug, let's set up a Bugzilla!

I am doing a simple analysis of interface density, where I compare the
number of residues on each side of a domain-domain interface (a simple
plot shows an interesting trend). This analysis showed up this problem
structure...

1ona

Which is a problem both in the PDB and with my domain parser... Anyway it
has an unnatural 'domain interface' mediated by a mis-labeled ligand.

Is Wan a psimapper? I bet he knows of some PDB problems.

All the above problems should be fixed by using MSD. I am planning to
write an O'Reilly book (mini) called 'MSD', basically because I like the
idea of adding a virus into the O'Reilly taxonomy (using a picture of a
virus from the 3D structure in MSD).

We should start using MSD with PSIMAP ASAP. We can write the book after
that. MSD will ...

 Fix multimodel PDB / NMR structures.
 Fix problems mapping PDB to SwissProt.
 Automatically provide sequence / structure alignments.
 Provide clean ;) SCOP/CATH/PFAM/DALI (e-family) assignments to structures.
 Provide clean ligand assignments to structures.
 Give us 'biomolecule' data
 Filter crystal contacts
 Do away with the need for parsers in PSIMAP.
 Allow easy integration of domain-domain, domain-ligand and ligand-ligand
 information

 etc., etc.

P.S. I did an analysis of residue connectivity statistics for a second
year report.

You can find this at...

http://interaction.mrc-dunn.cam.ac.uk/Scientists/Dan_Bolser/MyDoc/New%20Folder/

(I may change the folder name without warning.)

Just let me know if you want to turn this into a paper with some extra
work (actually it looks quite comprehensive). I can work closely with
anyone on this topic, as I still remember the literature and the problems
in detail.

I am half planning to revive this report into a thesis chapter anyway, so
any help I can get would be great ;)

All the best,
Dan.

On Wed, 11 Aug 2004, Andreas Henschel wrote:

>Hi psimap folks,
>
>I found some pdb entries that look like superimposed structures, a
>structural alignment of very similar structures (they are not NMR!)
>that cause trouble to psimap. One of them is 1cm4. I found out about
>them when I counted contacts per residue and some were as high as 50,
>thus distorting my hotspot statistics.
>
>Also, these guys have suspiciously high contact residues:
>| 1ae6 |
>| 1bwm |
>| 1cm4 |
>| 1cr9 |
>| 1cu4 |
>| 1eo8 |
>| 1ob1 |
>| 1qfu |
>For the moment, I will ignore them.
>Any idea?
>Cheers,
>Andreas
>
|
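A minimal sketch of how the skip list from the message above might be kept as data rather than as one `next if` line per entry, so that the reason for each exclusion travels with the ID. It is written in Python, not the Perl of the original, and is not the psimap code: `PROBLEM_STRUCTURES` and `filter_entries` are hypothetical names, and the grouping of IDs under each reason is only a reading of the comments in that message.

```python
# Reasons as read from the comments next to the skip list (an interpretation).
MULTI_MODEL = "multi-model PDB entry"
NULL_AND_NAMED_CHAIN = "contains both a null chain and a named chain"
CRYSTAL_OVERLAP = "significant crystal overlap between subunits"

# Hypothetical registry of problem structures: PDB ID -> reason for skipping.
PROBLEM_STRUCTURES = {
    "1gtv": MULTI_MODEL,
    "1cm4": MULTI_MODEL,
    "1cwq": MULTI_MODEL,
    "1e0p": MULTI_MODEL,
    "1gu8": MULTI_MODEL,
    "4cpa": NULL_AND_NAMED_CHAIN,
    "1g7y": CRYSTAL_OVERLAP,
    "1ygp": CRYSTAL_OVERLAP,
    "1hb9": CRYSTAL_OVERLAP,
    "1hb5": CRYSTAL_OVERLAP,
}


def filter_entries(pdb_ids, problems=PROBLEM_STRUCTURES):
    """Split PDB IDs into (kept, skipped).

    kept is a list in input order; skipped maps lower-cased ID -> reason.
    """
    kept, skipped = [], {}
    for pdb_id in pdb_ids:
        key = pdb_id.lower()  # PDB IDs are compared case-insensitively here
        if key in problems:
            skipped[key] = problems[key]
        else:
            kept.append(pdb_id)
    return kept, skipped


if __name__ == "__main__":
    # "1abc" and "2xyz" are made-up IDs used only to exercise the function.
    kept, skipped = filter_entries(["1abc", "1CM4", "4cpa", "2xyz"])
    print("kept:", kept)
    for pdb_id, reason in skipped.items():
        print("skipped", pdb_id, "-", reason)
```

Keeping the reasons in one table would also make it easier to publish the list on a shared page of the kind suggested in the thread.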
From: Andreas H. <hen...@mp...> - 2004-08-11 18:03:20
|
Hi psimap folks,

I found some pdb entries that look like superimposed structures, a
structural alignment of very similar structures (they are not NMR!)
that cause trouble to psimap. One of them is 1cm4. I found out about
them when I counted contacts per residue and some were as high as 50,
thus distorting my hotspot statistics.

Also, these guys have suspiciously high contact residues:
| 1ae6 |
| 1bwm |
| 1cm4 |
| 1cr9 |
| 1cu4 |
| 1eo8 |
| 1ob1 |
| 1qfu |
For the moment, I will ignore them.
Any idea?
Cheers,
Andreas
|
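A rough sketch of the check described in the message above: count contacts per residue and flag entries whose highest count looks implausible. This is Python, not psimap code. The function name, the input layout (PDB ID mapped to a list of per-residue contact counts) and the cutoff of 30 are all assumptions chosen for illustration; the message only says that some counts were as high as 50.

```python
def flag_suspicious_entries(contacts_by_entry, max_contacts=30):
    """Return {pdb_id: peak per-residue contact count} for flagged entries.

    contacts_by_entry maps a PDB ID to a list of contact counts, one per
    residue. An entry is flagged when its highest count exceeds
    max_contacts (a hypothetical cutoff, to be tuned on real data).
    """
    flagged = {}
    for pdb_id, counts in contacts_by_entry.items():
        peak = max(counts, default=0)  # entries with no residues get 0
        if peak > max_contacts:
            flagged[pdb_id] = peak
    return flagged


if __name__ == "__main__":
    # Made-up IDs and counts, only to show the call; not real measurements.
    toy = {"aaaa": [3, 5, 8], "bbbb": [4, 50, 7], "cccc": []}
    for pdb_id, peak in flag_suspicious_entries(toy).items():
        print(pdb_id, "peak contacts per residue:", peak)
```

Entries flagged this way would still need a manual look, since a high count alone does not say whether the cause is a multi-model file, superimposed copies or something else.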
From: Dan B. <dm...@mr...> - 2004-07-28 13:06:14
|
On Wed, 21 Jul 2004 pa...@so... wrote:

Did I reply to this already?

>Quoting Dan Bolser <dm...@mr...>:
>
>> This may be one mailing list too many, but what the hey...
>>
>> At least this way we know who to email regarding questions of software
>> devl... or is that devel (sorry about that).
>>
>> Should we add psimap software documentation (from biogrid) to the
>> SourceForge project?
>
>Do you mean the Java technical documentation? This can be generated very
>easily from the source code.

That would be nice to include (if it isn't a problem). I tried to add
'tags' to my comments, but they may mess things up a bit.

>> A nice gif of the database would be cool.
>
>Please find an old tech. document for the database.

Ta, this is the kind of documentation I was thinking about.

Was there a consensus on open MMS?

Ta,
Dan.

>> For relational diagrams I recommend the latest version of dia (0.93) using
>> UML diagrams - actually there are tools to automatically generate dia
>> diagrams from mysql.
>
>That's nice...
>
>OMONDO UML is also really cool for Java src. It is a plug-in for Eclipse.
>
>> Dan.
|
From: <pa...@so...> - 2004-07-21 10:10:22
|
Quoting Dan Bolser <dm...@mr...>:

> This may be one mailing list too many, but what the hey...
>
> At least this way we know who to email regarding questions of software
> devl... or is that devel (sorry about that).
>
> Should we add psimap software documentation (from biogrid) to the
> SourceForge project?

Do you mean the Java technical documentation? This can be generated very
easily from the source code.

> A nice gif of the database would be cool.

Please find an old tech. document for the database.

> For relational diagrams I recommend the latest version of dia (0.93) using
> UML diagrams - actually there are tools to automatically generate dia
> diagrams from mysql.

That's nice...

OMONDO UML is also really cool for Java src. It is a plug-in for Eclipse.

> Dan.

----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.
|
From: Dan B. <dm...@mr...> - 2004-07-20 18:27:55
|
This may be one mailing list too many, but what the hey...

At least this way we know who to email regarding questions of software
devl... or is that devel (sorry about that).

Should we add psimap software documentation (from biogrid) to the
SourceForge project?

A nice gif of the database would be cool.

For relational diagrams I recommend the latest version of dia (0.93) using
UML diagrams - actually there are tools to automatically generate dia
diagrams from mysql.

Would anyone else like to be added as a list admin?

Dan.
|