From: Andrew D. <da...@da...> - 2008-08-27 23:04:01
On Aug 21, 2008, Egon Willighagen wrote (responding to Rajarshi Guha):
> Full roundtripping is outside the scope of the CDK... a PDB file is a
> document, really, not a chemical format.

I've been thinking about this distinction, as someone who has written
several PDB parsers, and other parsers. I don't understand the nuance
you're pointing out. What do you mean by document here?

Is CML, with its ability to embed other data, also a document rather
than a chemical format? What biases a given file type more towards one
or the other? Do the tag fields in an SD file make it somewhat of a
document? Are all Gaussian output files documents?

>> I can see the CDK being used in metabolomics and hence the need for
>> PDB support
>
> Ummm... I don't often see metabolites in the PDB format... :)

That's because most people doing metabolomics treat molecules as
identifier names only and don't track atom level details, much less
use something like reaction SMILES.

Still, a quick Google search for the words "metabolites in the PDB
format" ;) found the Human Metabolome Database, and a search of it for
"Isocitric acid" found this record

  http://hmdb.ca/scripts/show_card.cgi?METABOCARD=HMDB00193.txt

with a link to the isocitric acid as a standalone/synthetic PDB and a
link to structure 1b0j, which is

  CRYSTAL STRUCTURE OF ACONITASE WITH ISOCITRATE

Out of curiosity, I also checked KEGG. There are a few PDB-linked
records, for example:

  http://www.genome.jp/dbget-bin/www_bget?compound+C00167

which links to the "PDB-CCD" record (first time I heard of that term;
"PDB Chemical Component Dictionary") for UGA at

  http://www.ebi.ac.uk/msd-srv/msdchem/cgi-bin/cgi.pl?FUNCTION=getByCode&CODE=UGA

and has a way to get the ligand as a PDB file, as well as a link to
"In PDB Entries" which lists PDB files containing that ligand, at

  http://www.ebi.ac.uk/msd-srv/msdchem/cgi-bin/cgi.pl?FUNCTION=relation&PARENTENTITY=CHEM_COMP&APPLICATION=1&ENTITY=COMP_OCCURENCES&RELATIONID=3193&PARENTINDEX=0&PARENT0=:UGA%20UGA%20:

> btw, I was not aware that BioJava did 3D structures nowadays...

According to the CVS logs for
org/biojava/bio/structure/io/PDBFileParser.java

  revision 1.3
  date: 2005/12/06 15:08:10;  author: andreas;  state: Exp;  lines: +661 -647
  added a check that ignores empty lines that some people might have
  at the end of their (local) PDB files.
  ----------------------------
  revision 1.2
  date: 2005/04/14 12:24:58;  author: andreas;  state: Exp;  lines: +1 -1
  made convert_3code_1code public
  ----------------------------
  revision 1.1
  date: 2004/10/25 20:37:08;  author: andreas;  state: Exp;
  added PDBFileParser as independent class

There's a BioJava-based structure viewer called SPICE that Andreas
Prlic (the one mentioned in the CVS logs) has been working on. It's
pretty widely used at the EBI and the Sanger Institute. See:

  http://www.efamily.org.uk/software/dasclients/spice/

I first saw it at ISMB Detroit, I think, so in 2005.

It's meant more for sequence/structure comparisons and feature
annotations, which aren't things that small molecule viewers typically
care about. Just like large molecule structure viewers don't all care
about things like bond orders. ;)

It does seem sometimes like the big molecule and small molecule people
are in mostly disjoint fields.

> The Jmol PDB reader might even be a better
> one... lot's of user community around that PDB reader, likely more
> than for the BioJava version...

and there's some integration already between the two, as for example:

  http://www.biojava.org/wiki/BioJava:CookBook:PDB:Jmol

Andrew
da...@da...