Re: [Jmol-developers] Standard syntax for pulling from remote databases

At 22:23 01/02/2004 +0100, Egon Willighagen wrote:
>On Sunday 01 February 2004 17:28, Egon Willighagen wrote:
> > dadml://nist/cas?50-00-0
> >
> > This would allow things in Java like:
> >
> > URI uri = new URI("dadml://nist/cas?50-00-0");
> > String protocol = uri.getScheme();
> > String service = uri.getAuthority();
> > String index = uri.getPath();
> > String query = uri.getQuery();
>
>Ok, CDK's cdk/internet/DADMLReader now accepts things like:
>
>dadml://any/CAS-NUMBER?50-00-0
>
>It's not really fine tuned to the syntax above, but a nice start.
>
>Note the any, which indicates that the URI should be resolved to the first
>database that could contain information...
>
>Later this week it will be possible to use services like pdb... so things like
>
>dadml://pdb/?1CRN
>
>or dadml://any/pdbid?1CRN
>
>The second will try any mirror that can return information based on the
>pdbid...
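
As a rough illustration of how the URIs quoted above decompose, here is a minimal sketch using only java.net.URI. The class name DadmlUriSketch and its field names are hypothetical and are not taken from CDK's DADMLReader; the sketch only assumes the service/index?query layout shown in the examples.

```java
import java.net.URI;

// Hypothetical helper, not part of CDK: splits a dadml:// URI into its parts.
final class DadmlUriSketch {
    final String service; // e.g. "any", "nist", "pdb"
    final String index;   // e.g. "CAS-NUMBER", "pdbid"; empty when omitted
    final String query;   // e.g. "50-00-0", "1CRN"

    private DadmlUriSketch(String service, String index, String query) {
        this.service = service;
        this.index = index;
        this.query = query;
    }

    static DadmlUriSketch parse(String text) {
        // URI.create throws IllegalArgumentException on malformed input.
        URI uri = URI.create(text);
        if (!"dadml".equals(uri.getScheme())) {
            throw new IllegalArgumentException("not a dadml URI: " + text);
        }
        // getPath() returns "/CAS-NUMBER" or "/"; strip the leading slash.
        String path = uri.getPath() == null ? "" : uri.getPath();
        String index = path.startsWith("/") ? path.substring(1) : path;
        return new DadmlUriSketch(uri.getAuthority(), index, uri.getQuery());
    }

    // "any" asks the resolver to pick the first database that can answer.
    boolean anyService() {
        return "any".equals(service);
    }

    public static void main(String[] args) {
        String[] examples = {
            "dadml://any/CAS-NUMBER?50-00-0",
            "dadml://pdb/?1CRN",
            "dadml://any/pdbid?1CRN"
        };
        for (String s : examples) {
            DadmlUriSketch d = parse(s);
            System.out.println(d.service + " | " + d.index + " | " + d.query);
        }
    }
}
```

Note that in dadml://pdb/?1CRN the index comes out empty, so a resolver would presumably have to fall back to a default index for that service.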

Presumably someone enters these mirrors and keeps their addresses and
templates up to date. Is there a cascade: if mirror 1 fails, does mirror 2
get called? And what is returned: the actual file?

If so we have something like:

User -> PDBCode -> server
server -> munged URL (format1)-> mirror1 -> success/error
success -> PDB file -> user
failure
server -> munged URL (format2)-> mirror2 -> success/error
and so on

is this the model?
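
A minimal sketch of the cascade as described above, assuming each mirror is represented by a URL template and that a failed fetch can be reported as an empty result. The template strings, the mirror host names and the class name MirrorCascadeSketch are all hypothetical; the fetch step is passed in as a function so the sketch does not depend on any real server.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch of "try mirror 1, then mirror 2, and so on".
final class MirrorCascadeSketch {
    // Made-up templates; "%s" marks where the PDB code is substituted.
    static final List<String> TEMPLATES = Arrays.asList(
        "http://mirror1.example/pdb/%s.pdb",
        "http://mirror2.example/fetch?id=%s");

    // Returns the first file any mirror delivers, or empty if all fail.
    static Optional<String> resolve(String pdbCode, List<String> templates,
                                    Function<String, Optional<String>> fetch) {
        for (String template : templates) {
            String url = String.format(template, pdbCode); // the "munged URL"
            Optional<String> file = fetch.apply(url);
            if (file.isPresent()) {
                return file; // success: hand the file back to the user
            }
            // failure: fall through to the next mirror
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in fetcher: mirror 1 always fails, mirror 2 answers.
        Function<String, Optional<String>> fake = url ->
            url.contains("mirror2")
                ? Optional.of("PDB file from " + url)
                : Optional.<String>empty();
        System.out.println(
            resolve("1CRN", TEMPLATES, fake).orElse("no mirror answered"));
    }
}
```

Whether the real DADML resolver works this way is exactly the open question; the sketch only makes the proposed model concrete.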

>The DADML system also supports retrieving information in other formats, not
>just chemical/x-pdb or chemical/x-cml, but also text/html etc..
>I'm not sure if we want to be able to do that sort of thing too, so for now
>it only supports reading chemical formats...

The attraction of chemical/x-* is that the information contained within
each is (relatively?!) consistent and structured. For an arbitrary web site
producing HTML the structure could be anything, and a separate parser has to
be written for each. (For example, we have written parsers for 2 of the main
sites offering small molecule information and they are, unsurprisingly,
completely different. Moreover the structure of the pages changes regularly.)
Likewise the *text/html* on the RCSB site will be completely different from
that on the EBI site, even though the actual PDB file is presumably the same
or closely related. It is the consistency of chemical/x-* that makes it
useful for machines to parse.
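
One way a client could act on this distinction is to accept a response only when its Content-Type is a chemical/x-* type. The following is a sketch under that assumption; the class and method names are hypothetical and nothing here is taken from the DADML or CDK code.

```java
import java.util.Locale;

// Hypothetical filter: accept only chemical/x-* media types.
final class ChemicalMimeSketch {
    static boolean isChemicalMime(String contentType) {
        if (contentType == null) {
            return false;
        }
        // Drop parameters such as "; charset=utf-8" before comparing.
        String type = contentType.split(";", 2)[0]
                                 .trim()
                                 .toLowerCase(Locale.ROOT);
        return type.startsWith("chemical/x-");
    }

    public static void main(String[] args) {
        String[] samples = {"chemical/x-pdb", "chemical/x-cml", "text/html"};
        for (String s : samples) {
            System.out.println(s + " accepted: " + isChemicalMime(s));
        }
    }
}
```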

P.

Peter Murray-Rust
Unilever Centre for Molecular Informatics
Chemistry Department, Cambridge University
Lensfield Road, CAMBRIDGE, CB2 1EW, UK
Tel: +44-1223-763069
