#47 Conn specific User/Module Storage and API

open-accepted
Jim Davidson
None
5
2003-06-01
2003-05-11
Jerry Asher
No

I would like to see an API developed that would let
modules attach data and metadata to specific connections.

Such an API would make it easier for modules to perform
work at various times in a connection's life: init,
request processing, logging, and conn tear down.

This API would also make it much easier for modules to
cooperate and delegate tasks amongst themselves.

An example (from 3.x)

Access logging is performed by the nslog module. Right
now, several of my servers receive a number of
metarequests: uptime monitoring, ns_telemetry requests,
and so on. There are also useless requests caused by
Code Red and the like. For a variety of reasons, I
choose not to log these requests at all. I do this by
creating an input request filter. When that filter
recognizes a virus probe or a metarequest, it flags the
connection so that the log module knows not to log the
request. There is no connection specific way to do this
at present, so what I do is set an X-Log False header in
the connection's input header set. This is a complete
and ugly kluge.
But it does work, and when the nslog module comes
along, it sees the input header and checks for X-Log
False. If it finds it, it returns.
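
To make the contrast concrete, here is a minimal sketch of the header kluge next to the connection specific flag this request asks for. It is a language-neutral model in Python, not AOLserver code: the Conn class, its storage dict, the "nslog.skip" key, and the /ns_telemetry URL prefix are all hypothetical. Only the X-Log False header and the Code Red probe path (/default.ida) come from real use.

```python
class Conn:
    """Hypothetical stand-in for a connection; not AOLserver's ns_conn."""

    def __init__(self, url, headers=None):
        self.url = url
        self.headers = dict(headers or {})  # models the input header set
        self.storage = {}                   # proposed conn specific storage


def is_noise(url):
    # Assumed patterns: Code Red probes hit /default.ida; the telemetry
    # prefix is an illustrative guess.
    return url.startswith("/default.ida") or url.startswith("/ns_telemetry")


def skip_filter_kluge(conn):
    # Current workaround: smuggle the flag through the input headers.
    if is_noise(conn.url):
        conn.headers["X-Log"] = "False"


def skip_filter_storage(conn):
    # Proposed approach: flag the connection itself.
    if is_noise(conn.url):
        conn.storage["nslog.skip"] = True


def log_request(conn, log):
    # The log module honors either mechanism and returns without logging.
    if conn.storage.get("nslog.skip") or conn.headers.get("X-Log") == "False":
        return
    log.append(conn.url)
```

With connection specific storage the flag never appears among the client's headers, so it cannot collide with, or be forged by, a real request header.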

Recently there was discussion of creating a mechanism
that would recognize requests that come from search
engines, and that would alter the output stream to
highlight the search engine keywords. One way of
approaching this task is with two modules. The first, on
accepting a request, examines the referrer field and, if
the request comes from a search engine, annotates the
connection with the specific words or phrases to be
highlighted. Then, as output is created,
another module can recognize the connection specific
keywords and alter the output stream accordingly.

The value of breaking this into two modules as
described is that it separates the highlight
processing from the request processing. This would
make it easier to highlight requests from external
search engines, or from internal site queries.
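
A rough sketch of that two-module split, again as a Python model rather than AOLserver code. The storage dict stands in for the connection specific storage; the "highlight.keywords" key, the q query parameter, and the <b> markup are assumptions made for illustration.

```python
import re
from urllib.parse import urlparse, parse_qs


def annotate_keywords(storage, referrer):
    """Request-side module: record search terms found in the referrer."""
    query = parse_qs(urlparse(referrer).query)
    words = query.get("q", [""])[0].split()  # "q" is an assumed parameter
    if words:
        storage["highlight.keywords"] = words


def highlight_output(storage, body):
    """Output-side module: wrap each recorded keyword in <b> tags."""
    words = storage.get("highlight.keywords")
    if not words:
        return body
    pattern = re.compile("|".join(re.escape(w) for w in words), re.IGNORECASE)
    return pattern.sub(lambda m: "<b>" + m.group(0) + "</b>", body)
```

Because the output side only reads the stored keywords, an internal site search could set the same key directly and reuse the highlighter unchanged. This naive version would also match text inside HTML tags; a real implementation would need to skip markup.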

The key requirements of this API are to:

a) provide connection specific storage
b) provide an API to enable modules to cooperate in
sharing this connection specific storage. Presumably,
modules would want to hook or daisy chain individual
pieces of storage together.
c) provide well defined times for module specific
procedures to initialize and cleanup the connection
specific storage
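
One possible shape for requirements a) through c), sketched in Python under stated assumptions: ConnStorage and its method names are hypothetical, modules share storage by agreeing on key names, and each value may carry a cleanup procedure that runs when the value is replaced or when the connection is torn down.

```python
class ConnStorage:
    """Hypothetical per-connection store; every name here is illustrative."""

    def __init__(self):
        self._slots = {}  # key -> (value, cleanup or None), insertion ordered

    def set(self, key, value, cleanup=None):
        # Replacing a value releases the old one first.
        self.delete(key)
        self._slots[key] = (value, cleanup)

    def get(self, key, default=None):
        entry = self._slots.get(key)
        return default if entry is None else entry[0]

    def delete(self, key):
        entry = self._slots.pop(key, None)
        if entry is not None and entry[1] is not None:
            entry[1](entry[0])  # run the module's cleanup procedure

    def close(self):
        # Conn tear down: release values in reverse order of creation.
        for key in reversed(list(self._slots)):
            self.delete(key)
```

Running cleanups in reverse order is a design choice borrowed from stack unwinding, so that a value created later can still rely on earlier ones while it is being released.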

Discussion

  • Jerry Asher
    2003-05-11

    • summary: Conn dpecific User/Module Storage and API --> Conn specific User/Module Storage and API
     
    • assigned_to: nobody --> vasiljevic
     
  • Logged In: YES
    user_id=95086

    I understand the need for such an API. To make things happen,
    though, a reference implementation would be ideal, followed by
    some formal API description (function names/args).
    Can you provide one of those?

     
  • Logged In: YES
    user_id=95086

    Tossing it to Jim, as Nathan suggested :)

     
    • assigned_to: vasiljevic --> jgdavidson
    • status: open --> open-accepted
     
  • Jerry Asher
    2003-06-04

    Logged In: YES
    user_id=20647

    I don't think I am enough of a Tcl/AOLserver expert to suggest
    the API, and I suspect I would leave major issues unresolved.

    My first thought was always just to attach a thread specific
    ns_set to the ns_conn struct and let modules hack away at
    the ns_set knowing that the ns_set is to be tossed when the
    connection is closed.

    What I think that leaves out is who cleans up, and how, the
    structures allocated at runtime and associated with keys or
    values in the ns_set.

    That makes me think there should be a more formal API, and
    perhaps something similar to the way filters work.

    So we need a name. Let's call them rules. (daemons?
    agents? golems? tinks? (just read peter pan to my kids))
    Their purpose is twofold: provide connection specific
    storage and provide a facility to let modules communicate
    with each other about a specific connection.

    Like filters, they might have a specific "when" event that
    determines when they run:

    Init -- initialization

    Request -- After the request is received but before it has
    been dispatched and processed (this lets me
    mung/transcribe/transform the request from bizarro language
    X to HTTP)

    Output -- After all the output has been created but before
    it has been written (when possible) (this lets me implement
    search highlighting, or compression, or change from
    HTTP/HTML back to bizarro language X)

    Cleanup -- When a connection has been closed (this helps me
    create memory leaks)

    Like a filter, they have a URL (including *) to match against.

    Additionally, they have a meta/introspection facility so
    that they can:

    A) Determine if another rule is present that provides some
    facility
    B) Position themselves in the firing stream with respect to
    another named rule or facility (aka :after <rule> or :before
    <rule> or :firstest or :lastest)

    Unlike filters, all applicable rules run and there is no
    TCL_RETURN or TCL_BREAK kind of processing.
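
The rule idea above could be modeled roughly as follows. This is a Python sketch under assumptions, not a proposed AOLserver signature: RuleRegistry, its method names, and the before/after keyword arguments are hypothetical, URL matching uses shell-style globs, and a rule positioned relative to a name that is not yet registered is simply appended.

```python
from fnmatch import fnmatchcase

EVENTS = ("init", "request", "output", "cleanup")


class RuleRegistry:
    """Hypothetical registry of rules, keyed by when-event."""

    def __init__(self):
        self._rules = {event: [] for event in EVENTS}

    def has_rule(self, name):
        # Introspection: is a rule with this name registered anywhere?
        return any(r[0] == name for rules in self._rules.values() for r in rules)

    def register(self, name, when, pattern, proc, before=None, after=None):
        rules = self._rules[when]
        names = [r[0] for r in rules]
        if before in names:
            index = names.index(before)
        elif after in names:
            index = names.index(after) + 1
        else:
            index = len(rules)  # unknown or no anchor: run last
        rules.insert(index, (name, pattern, proc))

    def fire(self, when, url, storage):
        # Unlike filters, every matching rule runs; return values are ignored.
        ran = []
        for name, pattern, proc in self._rules[when]:
            if fnmatchcase(url, pattern):
                proc(url, storage)
                ran.append(name)
        return ran
```

For example, a rule registered with before="log" on the request event can flag a connection before the logging rule sees it, and both still run.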