Re: [PyWrapper-devel] [tdwg-tapir] RE: WG: tapir: capabilities

That looks great to me. Thanks for your consideration of this request.
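
A minimal sketch, in Python, of the behaviour described in point 3 of the quoted proposal below. All names here (handle_request, run_query, usage_log, the two operation sets) are hypothetical and are not taken from PyWrapper or the TAPIR schema. The proposal only says that a log-only search or inventory gets an empty response of the usual kind, and that cheap operations can simply answer normally.

```python
from typing import Callable, Dict, List

# Hypothetical names throughout; nothing here comes from PyWrapper or the TAPIR schema.
EXPENSIVE_OPERATIONS = {"search", "inventory"}            # empty response when log-only
CHEAP_OPERATIONS = {"ping", "capabilities", "metadata"}   # answered normally either way


def handle_request(operation: str, query: str, log_only: bool,
                   run_query: Callable[[str], List[str]],
                   usage_log: List[Dict[str, str]]) -> Dict[str, object]:
    """Return a response dict; with log_only set, record the request first."""
    if operation not in EXPENSIVE_OPERATIONS | CHEAP_OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    if log_only:
        # Record what the portal would have asked the provider.
        usage_log.append({"operation": operation, "query": query})
        if operation in EXPENSIVE_OPERATIONS:
            # Reuse the ordinary response shape, just with no records in it.
            return {"operation": operation, "records": []}
    # Normal path (also taken by cheap operations when log_only is set).
    return {"operation": operation, "records": run_query(query)}
```

With log_only set, a search is recorded in usage_log and answered with an empty record list without run_query being called. The same call without log_only runs the query as usual and logs nothing.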

On 7/27/06, "Döring, Markus" <m.d...@bg...> wrote:
>
> I have implemented the log-only request now and would like to suggest the
> following:
>
> 1) add a "log-only" attribute to responeOperationGroup. I've chosen
> "log-only" because we already have an attribute "apply-xslt" there
> 2) add a mandatory boolean attribute "logRequestsDenied" to the operation
> element in a capabilities request
> 3) use existing responses for the log-only request. So if you do a search
> with log-only active, you will get an empty search response back. That's much
> easier to implement and doesn't require any change in the schema. The same
> works with inventories. Pong, Capa & Meta responses don't cost much anyway,
> so we could do a normal response (if anyone uses log-only with those
> requests at all)
>
>
> Does everyone agree to this?
> The schema is already updated for this.
>
>
> Markus
>
>
>
> -----Original Message-----
> From:   pyw...@li... on behalf of John R. WIECZOREK
> Sent:   Tue 7/25/2006 5:29 PM
> To:     Döring, Markus
> Cc:     PyWrapper Developers mailing list; ro...@td...;
> tdw...@li...
> Subject:        Re: [PyWrapper-devel] [tdwg-tapir] RE: WG: tapir: capabilities
>
> Why make this so hard? We're talking about a portal sending out one message
> to n providers and not having to wait for a response. Right now we would
> have to send out a message and wait for all of the responses (up to a
> timeout). This is a net improvement.
>
> Why do we need metadata from providers to know if they want logging
> requests? We don't ask them if they want metadata requests, or how often.
> Just send the requests. If they want to log them, they will configure their
> provider to do so. Otherwise the provider will ignore them.
>
> In the meantime, always log on the portal side. Not only will that give
> providers with flaky connections a place to see usage statistics, but it
> will also generate information that will be interesting on its own - summary
> information about how the portal is used that you wouldn't get from the
> providers.
>
> To me, this usage business is important enough that I would even go so far
> as to certify portals as being in compliance with this social contract.
> That way a provider could release access to certified portals and
> disallow access for those who don't abide by the contract. Remember, the
> semblance of control is really important to a lot of our providers. If you
> don't think so, have someone do a survey of existing providers to see if
> they would want it or not. It would be a sample biased against needing
> logging (since they are already doing without). If that survey turned up
> interest in logging anyway, then it's worth doing. My feeling is that it is
> so easy to implement (if you don't try to get unnecessarily fancy) that it
> should just be done - it would be easier than conducting a survey about it.
>
> On 7/25/06, "Döring, Markus" <m.d...@bg...> wrote:
> >
> > I agree that logging via GUID doesn't help in many cases where the
> > provider wants to know what was searched for.
> >
> > But searching on a portal cache to find data from 20 different providers
> > in 1 search and then sending off 20 log requests could also be annoying.
> > Plus the burden on the portal of checking the registry to see whether a
> > provider really wants logging.
> >
> > The most efficient is probably portal-specific logging as Donald
> > suggests. But then providers would have a hard time agglomerating the
> > logging data from several totally different portals.
> >
> > To get comparable logging across different portals, though, it seems to
> > me that Renato's suggestions are worth a try. It would definitely need
> > guidelines for portal developers to know when to use a log request and
> > how to use it. How to treat paging and map data are good examples where
> > there is no obvious correct behaviour.
> >
> >
> > -- Markus
> >
> >
> > > > -----Original Message-----
> > > > From: tdw...@li...
> > > > [mailto:tdw...@li...] On behalf of
> > > > John R. WIECZOREK
> > > > Sent: Monday, 24 July 2006 18:28
> > > > To: ro...@td...
> > > > Cc: PyWrapper Developers mailing list; tdw...@li...
> > > > Subject: Re: [PyWrapper-devel] [tdwg-tapir] RE: WG: tapir:
> > > > capabilities
> > >
> > > > Logging that a GUID was used isn't sufficient; it doesn't
> > > > tell how the data were used. What I'm after is to log the
> > > > actual query that would have had to go to the provider to
> > > > produce the results used. The situation in your second example
> > > > shouldn't happen; the query should specify a record limit per
> > > > provider.
> > >
> > >
> > > On 7/24/06, Roger Hyam <ro...@td...> wrote:
> > >
> > >
> > >       I thought this sounded like a good idea but since
> > > reading Renato's message I am now confused.
> > >
> > >       If a user does a search on a portal and gets 100
> > > results and looks at the first 10 does the provider of the
> > > 11th record get notified? Their data has been used because it
> > > has been given in the count.  Another example would be if a
> > > portal gave a distribution map to 10km squares based in data
> > > from multiple providers. Each data point is made from several
> > > suppliers data and removal of any one supplier's data may not
> > > change the map. Do we notify them all?
> > >
> > >       I could envisage a GUID based system just about. The
> > > call to the log function would basically say "Some one has
> > > accessed the data that I got from you that you tag with this
> > > GUID" but I can't see how this would work on a search based
> > > system. The log call would mean "Some one searched for
> > > something that made used of data I got from a search I did on
> > > you once".
> > >
> > >       So really the only service we need is a GUID based one.
> > > Perhaps extending the LSID resolution spec would be more appropriate?
> > >
> > >       Roger
> > >
> > >
> > >
> > >       Renato De Giovanni wrote:
> > >
> > >               Hi John,
> > >
> > >               Implementing the log request was never a
> > > problem. We discussed about
> > >               that again during the Madrid meeting, and only
> > > after that a feature
> > >               freeze was suggested. It's true that PyWrapper
> > > is being adjusted now
> > >
> > >               to conform to the new specs, and considering
> > > that DiGIR2 (or wasabi)
> > >               postponed implementation of TAPIR, I suppose it
> > > should not be a big
> > >               problem to make additional changes if necessary.
> > >
> > >               The main problem I had with the log request was
> > > that it would
> > >
> > >               probably not solve the issue behind it, which
> > > is to track usage by
> > >               data aggregators. I still have the same
> > > feeling, and I can easily
> > >               imagine situations when it would not be easy or
> > > even possible to
> > >               translate searches on top of cached databases
> > > to TAPIR requests.
> > >
> > >
> > >               But maybe I'm wrong, and if you all think it's
> > > a good feature then we
> > >               can try to include it. However, I do think that
> > > providers should be
> > >               able to advertise as the part of capabilities
> > > if they want to receive
> > >
> > >               log requests or not.
> > >
> > >               To me it also sounds like a new operation,
> > > especially if it's only
> > >               related to search. It could make sense for
> > > view, inventory and
> > >               metadata operations. Maybe capabilities too.
> > > But it doesn't make
> > >
> > >               sense for ping. Well, maybe it could make sense
> > > for ping if the data
> > >               aggregator monitors provider status and accepts
> > > similar requests on
> > >               top of its results...
> > >
> > >               So, yes, it could be a new attribute "logOnly"
> > > as part of the
> > >
> > >               operationRequestGroup with an answer
> > > </received> (just after the
> > >               response header). And we could add an attribute
> > > "acceptLogRequests"
> > >               in the <operations> element in capabilities
> > > responses. The other
> > >
> > >               option would be to include a new operation, but
> > > maybe it's better to
> > >               just have it as an optional attribute for all
> > > operations.
> > >
> > >               Best Regards,
> > >               --
> > >               Renato
> > >
> > >               On 19 Jul 2006 at 10:32, John R. WIECZOREK wrote:
> > >
> > >
> > >
> > >                       I appreciate that you will consider
> > > this request. I always thought
> > >                       it
> > >                       would be trivial to implement. Your
> > > simulation mode sounds very much
> > >                       like what I had in mind. I hadn't
> > > thought it necessary to get a
> > >
> > >                       response from a log request, but if
> > > there was a simple response, it
> > >                       could be used as a ping, or it could be
> > > used to retry logging until
> > >                       the provider did respond. So, something
> > > like <log request received>.
> > >
> > >                       I think the addition oflogOnly
> > > attribute is a good one, and could
> > >                       apply to every request type.
> > >
> > >                       Javi, I don't disagree that portals
> > > SHOULD log the data usage,
> > >                       especially to cover the situations
> > > where a provider doesn't respond.
> > >
> > >                       I also think that having the
> > > information logged at the provider is a
> > >                       responsible course of action, since
> > > they will have immediate access
> > >                       to the usage statistics that way. It
> > > will be much easier for a
> > >                       portal
> > >
> > >                       builder to send log requests than it
> > > will be to build the
> > >                       infrastructure and interfaces to logs,
> > > therefore it is more likely
> > >                       to
> > >                       actually get done.
> > >
> > >
> > >                       On 7/18/06, Javier de la Torre
> > >                       <ja...@mn...>
> > > <mailto:ja...@mn...>  wrote:
> > >                           I am not sure about this,
> > >
> > >                           I still think that portals should
> > > be gathering this data and
> > >                       making
> > >                           it available for data providers...
> > >
> > >                           But in any case if you like it then
> > > I agree with MArkus that the
> > >
> > >                       best
> > >                           is to include another parameter in
> > > the operationRequestGroup.
> > >
> > >                           I havent checked but what happens
> > > if you do an extension there
> > >                       with
> > >                           an attribute that is implementation
> > > specific? A qualified
> > >
> > >                       attribute.
> > >                           Will this still validate against
> > > our schema? You were
> > >                       discussing
> > >                           about qualification of attributes before no=
?
> > >
> > >                           Javi.
> > >
> > >                           On 18/07/2006, at 10:41, D=F6ring,
> > > Markus wrote:
> > >
> > >
> > > >                           > John,
> > > >                           > all changes going on with TAPIR right now are really only
> > > >                           > changes in terminology or removing inconsistencies we did
> > > >                           > not detect before we started the documentation and final
> > > >                           > implementation.
> > > >                           >
> > > >                           > But nevertheless I would support your request. Especially
> > > >                           > from the implementation point of view this is a trivial
> > > >                           > change to the code. So why not add it? Just some
> > > >                           > additional thoughts:
> > > >                           >
> > > >                           > - I've added a "simulation" mode already to my code where
> > > >                           > no SQL gets executed but just logged. So you can test
> > > >                           > configurations without risking sending off killer
> > > >                           > statements. That's similar to logOnly I guess, returning
> > > >                           > nothing but diagnostics. What would you suggest to be
> > > >                           > returned for a logOnly request? Just the empty TAPIR
> > > >                           > envelope? Nothing? <OK>?
> > > >                           >
> > > >                           > - would this log-only request not be needed for all
> > > >                           > requests? At least for inventories? So it would be
> > > >                           > easiest to have a new logOnly parameter in the header or
> > > >                           > "request element" just after the header? Something like
> > > >                           > <search logOnly="true">
> > >                           >
> > >                           >
> > >                           >
> > >                           > -- Markus
> > >                           >
> > >                           >
> > > >                           >> -----Original Message-----
> > > >                           >> From: pyw...@li...
> > > >                           >> [mailto:pyw...@li...] On behalf of
> > > >                           >> John R. WIECZOREK
> > > >                           >> Sent: Monday, 17 July 2006 23:30
> > > >                           >> To: Renato De Giovanni
> > > >                           >> Cc: pyw...@li...; tdwg-ta...@li...
> > > >                           >> Subject: Re: [PyWrapper-devel] [tdwg-tapir] RE: WG: tapir:
> > > >                           >> capabilities
> > >                           >>
> > > >                           >> A little off topic, but it occurs to me that a great
> > > >                           >> deal of work is still ongoing with TAPIR, which
> > > >                           >> suggests to me that it may be warranted to re-state my
> > > >                           >> request for a simple message type - a log request.
> > > >                           >> This request would be the same as a search request,
> > > >                           >> except that the caller doesn't need a response.
> > > >                           >> Providers would use this type of request to log data
> > > >                           >> usage if the data were retrieved from a cache
> > > >                           >> elsewhere. I remember talking about this in Berlin, at
> > > >                           >> which time there was supposed to be a feature freeze.
> > > >                           >> Clearly we've gone beyond that, so I'm requesting it
> > > >                           >> again.
> > >                           >>
> > >                           >>
> > > >                           >> On 7/17/06, Renato De Giovanni <re...@cr...> wrote:
> > >                           >>
> > >                           >>Hi,
> > >                           >>
> > > >                           >>If I remember well, the "view" operation was re-included
> > > >                           >>in the protocol just to handle query templates,
> > > >                           >>specifically for TapirLite providers. So if someone
> > > >                           >>wants to query a provider using some external output
> > > >                           >>model that should be dynamically parsed, then the
> > > >                           >>"search" operation must be used instead (using either
> > > >                           >>XML or simple GET request). View operations are really
> > > >                           >>bound to query templates, and they are not allowed to
> > > >                           >>specify "filter" or "partial" parameters.
> > > >                           >>
> > >                           >>--
> > >                           >>Renato
> > >                           >>
> > > >                           >>On 17 Jul 2006 at 21:26, "Döring, Markus" wrote:
> > >                           >>
> > > >                           >>> I was just about to edit the schema and realized that
> > > >                           >>> output models are only specified for searches. But
> > > >                           >>> what about views? They use query templates, yes. But
> > > >                           >>> only the ones listed in capabilities? We should have
> > > >                           >>> dynamic ones here as well I think. And they link back
> > > >                           >>> to static/dynamic models.
> > > >                           >>>
> > > >                           >>> So should models maybe become a separate section not
> > > >                           >>> tied to search/view operations? I am going to modify
> > > >                           >>> the schema nevertheless already to accommodate the
> > > >                           >>> changes below - ignoring views for now.
> > >
> > >                           >>>
> > >                           >>> Markus
> > >
> > >
> > >               _______________________________________________
> > >               tdwg-tapir mailing list
> > >
> > >               tdw...@li...
> > > <mailto:tdw...@li...>
> > >               http://lists.tdwg.org/mailman/listinfo/tdwg-tapir
> > >
> > >
> > >
> > >
> > >
> > >
> > >       --
> > >
> > >       -------------------------------------
> > >        Roger Hyam
> > >        Technical Architect
> > >        Taxonomic Databases Working Group
> > >       -------------------------------------
> > >
> > >       http://www.tdwg.org <http://www.tdwg.org>
> > >        ro...@td...
> > >        +44 1578 722782
> > >       -------------------------------------
> > >
> > >       _______________________________________________
> > >       tdwg-tapir mailing list
> > >       tdw...@li...
> > >       http://lists.tdwg.org/mailman/listinfo/tdwg-tapir
> > > <http://lists.tdwg.org/mailman/listinfo/tdwg-tapir>
> > >
> > >
> > >
> > >
> > >
> > >
> >
>
>
>
>