RE: [Digir-dev] DiGIR Protocol Documentation

Hi Donald,
Here's a quick response to the issues you mention below.  Documentation
is the weak point of DiGIR at the moment and is an issue that certainly
needs to be addressed more completely, though with the limited resources
available for this project, the primary focus has been on development of
a functional system. 

> -----Original Message-----
> From: dig...@li... [mailto:digir-
> dev...@li...] On Behalf Of Hobern, Donald
> Sent: Monday, September 23, 2002 02:46
> To: 'DiG...@li...'
> Subject: [Digir-dev] DiGIR Protocol Documentation
> 
> Hello.
> 
> Aside from the documents accessible from the http://digir.sourceforge.net
> page, are there any others which describe the expected sequence of
> exchanges for the DiGIR protocol?

Not really, although the expected sequence of exchanges is pretty
simple: a portal queries UDDI to find providers, the portal then
requests metadata from the providers to discern which ones might be of
interest, and then the portal issues a query to the relevant providers,
which send back content from their resource(s) in the format(s)
requested by the portal.
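
To make the sequence concrete, here is a minimal sketch in Python.  It is
not DiGIR code: the Provider class, its methods, the field names and the
in-memory "registry" list are all hypothetical stand-ins for the UDDI
lookup and the XML request/response documents.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    """Hypothetical stand-in for a DiGIR provider (not the real API)."""
    name: str
    resources: dict = field(default_factory=dict)  # resource name -> records

    def metadata(self):
        # Step 2: the provider describes the resources it hosts.
        return {"name": self.name, "resources": sorted(self.resources)}

    def search(self, resource, wanted_fields):
        # Step 3: return only the fields the portal asked for.
        return [{f: rec.get(f) for f in wanted_fields}
                for rec in self.resources[resource]]


def portal_search(registry, resource, wanted_fields):
    """Run the three-step exchange against an in-memory registry."""
    results = {}
    for provider in registry:              # step 1: discovery (UDDI in practice)
        meta = provider.metadata()         # step 2: metadata request
        if resource not in meta["resources"]:
            continue                       # this provider is not of interest
        results[provider.name] = provider.search(resource, wanted_fields)
    return results


# Illustrative data only.
registry = [
    Provider("museum-a", {"fish": [{"ScientificName": "Salmo salar", "Depth": 12}]}),
    Provider("museum-b", {"birds": [{"ScientificName": "Corvus corax"}]}),
]
print(portal_search(registry, "fish", ["ScientificName"]))
```

Only museum-a is queried in this example, because museum-b's metadata
does not list a "fish" resource.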

We all have notes scattered around the place; mine are at
http://chipotle.speciesanalyst.net/digir.  I endeavor to keep portions
of the notes up to date, but the primary focus has been on
implementation for the last few weeks.  There is a lot of documentation
within the code itself, although that is not particularly useful to the
non-programmers that require further information about DiGIR.

> I am trying to work out some proto-design documents for the GBIF
> Network to act as the basis for discussing how the global service
> architecture should work.  I have my own ideas on the right way to do
> everything but I would like to make sure I know where the DiGIR
> protocol is going so that I can avoid discrepancies.
> 
> The sort of issues I am thinking about include:
> 
> 1. What should be included in metadata describing provider nodes and/or
> services?

The current metadata content can be discerned by examining the DiGIR
protocol schema.  There is a general awareness that more metadata would
be useful for helping with the location of data, but the distinction
between metadata and intelligent use of data mining becomes difficult to
discern.  I believe that is the primary reason why the metadata
currently available from DiGIR providers is essentially information
about the resources and host that cannot be readily (or efficiently)
discerned from the content of the resource(s).

> 2. What 'administrative' interfaces should exist for nodes (e.g. to
> allow users to query services available from a node, or to allow
> applications to query supported options - supported protocol versions,
> support for data compression, etc.).

Resources available from a node are described in the metadata for the
provider.  Services available from a DiGIR provider outside of the scope
of the DiGIR service itself are of course not an issue.  Portions of a
DiGIR service that are implemented on a particular provider are
currently determined by the version of the provider.  It would be a
simple and sensible move to include this information within the metadata
for a provider.

> 3. What sequence of exchanges are required in registering a new node?
> (We probably need to consider how both a central UDDI registry node and
> the individual data nodes can avoid being 'spoofed' by bogus nodes.)

We currently use the public UDDI registry, which of course is wide open
for spoofing.  A private registry would significantly reduce this
possibility.  Right now the process is very simple: a new node
registers itself with the public UDDI and indicates a service that
supports the DiGIR tModel.  The portal queries UDDI for services that
support that tModel and assumes that everyone is playing the game
nicely.  The process would basically be the same for a private registry
except that there could be some element of human intervention necessary
(this could also be done with the public registry, but not as easily).
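
As a rough illustration of where human intervention could fit, the
Python sketch below filters registry entries by tModel and, optionally,
by a hand-maintained list of approved access points.  The entry layout,
the tModel key and the URLs are invented for the example; this does not
use a real UDDI client library.

```python
# Placeholder key for illustration only; not the real DiGIR tModel key.
DIGIR_TMODEL = "uuid:example-digir-tmodel"


def discover(entries, tmodel_key, approved=None):
    """Return access points of services that claim the given tModel.

    If `approved` is supplied, keep only access points that a human has
    vetted; otherwise trust every entry, as with the public registry.
    """
    claimed = [e["access_point"] for e in entries if tmodel_key in e["tmodels"]]
    if approved is None:
        return claimed
    return [url for url in claimed if url in approved]


# Illustrative registry content (hypothetical hosts).
entries = [
    {"access_point": "http://provider-a.example.org/digir", "tmodels": {DIGIR_TMODEL}},
    {"access_point": "http://bogus.example.net/digir", "tmodels": {DIGIR_TMODEL}},
    {"access_point": "http://other.example.org/ws", "tmodels": {"uuid:something-else"}},
]
approved = {"http://provider-a.example.org/digir"}

print(discover(entries, DIGIR_TMODEL))            # trusts every claimant
print(discover(entries, DIGIR_TMODEL, approved))  # vetted providers only
```

An allow-list like this only reduces the risk of a bogus node being
queried; it does nothing to authenticate the node behind the URL.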

> 4. What options should the request documents include to allow control
> over the level of detail returned?

That is controlled entirely by the record structure specified in the
request document.  Anything from a single field through to the entire
record can be returned.  We assume that all information accessible to
the provider is public information, so we are not concerned about
securing particular columns or rows.

> 5. How should we manage backward-compatibility when we introduce new
> versions of protocols and data exchange standards?

A sensible example of this is provided by the Open GIS Consortium's
approach to their various web service specifications.  The approach is
not novel, but certainly works well.  The version stamp of the provider
is the primary mechanism for identifying what capabilities and
request/response structures are supported.  
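
A minimal Python sketch of the idea, assuming a portal keeps a table
mapping provider versions to the operations it may use.  The version
numbers and the table contents are made up for illustration and do not
describe actual DiGIR releases.

```python
# Hypothetical capability table keyed by (major, minor) provider version.
CAPABILITIES = {
    (1, 0): {"metadata", "search"},
    (1, 1): {"metadata", "search", "inventory"},
}


def parse_version(stamp):
    """Turn a version stamp such as '1.1.3' into a (major, minor) tuple."""
    major, minor = stamp.split(".")[:2]
    return int(major), int(minor)


def supported_operations(stamp):
    """Use the newest known version that is not newer than the provider's."""
    version = parse_version(stamp)
    known = [v for v in CAPABILITIES if v <= version]
    if not known:
        raise ValueError(f"provider version {stamp} predates all known versions")
    return CAPABILITIES[max(known)]


print(sorted(supported_operations("1.0")))
print(sorted(supported_operations("1.1.3")))
```

Falling back to the newest known version assumes that later providers
stay backward compatible, which is a policy decision rather than
something the version stamp itself guarantees.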

> 6. Given that the data exchange standards will need extensions for
> different communities (e.g. the example of including 'Depth' for
> ichthyologists but not others), how should we manage such extensions?
> Should they be centrally administered?  Should they all be incorporated
> as optional elements within the main schema?

Schemas can be derived from other schemas.  Think of it like an
object-oriented description of data.  Various communities can derive
their own version of the federation (or "conceptual") schema.  It would
make a lot of sense for the various communities to manage these in a
central location, which is actually pretty easy since it basically comes
down to hosting an XML schema document and some documentation.
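
The object-oriented analogy can be sketched in Python with class
inheritance.  The class and field names below are hypothetical and are
not taken from any actual federation schema; in XML Schema itself the
corresponding mechanism would be deriving a complex type by extension.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class BaseRecord:
    """Fields every community shares (illustrative names only)."""
    scientific_name: str
    collector: Optional[str] = None


@dataclass
class FishRecord(BaseRecord):
    """Ichthyology extension: inherits the base fields and adds depth."""
    depth_m: Optional[float] = None


def field_names(cls):
    """List the field names of a record class, inherited ones first."""
    return [f.name for f in fields(cls)]


print(field_names(BaseRecord))
print(field_names(FishRecord))
```

A consumer that only understands BaseRecord can still handle a
FishRecord, since the derived type carries every base field.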

> 
> Are there any additional documents or discussion archives which address
> any such issues?  If so, I would very much like to see them.
> 
> Many thanks,
> 
> Donald
> 
> ---------------------------------------------------------------
> Donald Hobern
> Programme Officer for Data Access and Database Interoperability
> Global Biodiversity Information Facility Secretariat
> Zoological Museum - University of Copenhagen
> Universitetsparken 15, DK-2100 Copenhagen, Denmark
> Tel: +45-35321483 Fax: +45-35321480 E-mail: dh...@gb...
> ---------------------------------------------------------------