cweb-developer Mailing List for CognitiveWeb

Status: Beta

Brought to you by: beebs, guylukes, thompsonbry

cweb-developer — Discussion list for project developers

[Cweb-developer] cweb RELEASE-HOWTO.html,1.3,1.4

From: Bryan T. <tho...@us...> - 2006-03-08 20:34:45

Update of /cvsroot/cweb/cweb
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv24400

Modified Files:
	RELEASE-HOWTO.html 
Log Message:
Updates to the HOWTO document for releasing cweb modules on sourceforge.

Index: RELEASE-HOWTO.html
===================================================================
RCS file: /cvsroot/cweb/cweb/RELEASE-HOWTO.html,v
retrieving revision 1.3
retrieving revision 1.4
diff -C2 -d -r1.3 -r1.4
*** RELEASE-HOWTO.html	27 Feb 2005 02:03:08 -0000	1.3
--- RELEASE-HOWTO.html	8 Mar 2006 20:34:32 -0000	1.4
***************
*** 299,304 ****
  <strong>ftp upload.sourceforge.net</strong>
  Connected to osdn.dl.sourceforge.net.
! User (osdn.dl.sourceforge.net:(none)): anonymous
! 331 Anonymous login ok, send your complete e-mail address as password.
  Password:
  230 Anonymous access granted, restrictions apply.
--- 299,304 ----
  <strong>ftp upload.sourceforge.net</strong>
  Connected to osdn.dl.sourceforge.net.
! User (osdn.dl.sourceforge.net:(none)): <strong>anonymous</strong>
! 331 Anonymous login ok, <i>send your complete e-mail address as password.</i>
  Password:
  230 Anonymous access granted, restrictions apply.
***************
*** 352,356 ****
          file release tools</a> on SourceForge.  Fill in the <strong>
          "release name" </strong> from the value of the
!         &lt;currentVersion&gt; element in the POM.  <strong>The files
          are released as soon as you identify the artifacts in the
          incoming FTP directory on SourceForge. </strong> You can (and
--- 352,362 ----
          file release tools</a> on SourceForge.  Fill in the <strong>
          "release name" </strong> from the value of the
!         &lt;currentVersion&gt; element in the POM <strong>before</strong>
!         you modified it above (i.e., the version which was in development).
!         The "-dev" extension is dropped, so <code>1.1-b2-dev</code> becomes
!         <code>1.1-b2</code>.
!         
!         <p>
!         <strong>The files
          are released as soon as you identify the artifacts in the
          incoming FTP directory on SourceForge. </strong> You can (and
***************
*** 362,366 ****
          topic gets a little bit complicated (you have to wait
          overnight for the incoming directory to get wiped), so read up
!         on its at SourceForge. </li>
  
     </ol>
--- 368,374 ----
          topic gets a little bit complicated (you have to wait
          overnight for the incoming directory to get wiped), so read up
!         on its at SourceForge. </p>
!         
!         </li>
  
     </ol>

[Cweb-developer] FASD project: Online survey launched

From: <ben...@id...> - 2004-05-25 08:18:55

Dear Open Source developer

I am doing a research project on "Fun and Software Development" in which I kindly invite you to participate.
You will find the online survey under http://fasd.ethz.ch/qsf/. The questionnaire consists of 53 questions and you will need about 15 minutes to complete it.

With the FASD project (Fun and Software Development) we want to define the motivational significance of fun when software developers decide to engage in Open Source projects. What is special about our research project is that a similar survey is planned with software developers in commercial firms. This procedure allows the immediate comparison between the involved individuals and the conditions of production of these two development models. Thus we hope to obtain substantial new insights into the phenomenon of Open Source Development.


With many thanks for your participation,
Benno Luthiger


PS:
The results of the survey will be published under http://www.isu.unizh.ch/fuehrung/blprojects/FASD/.
We have set up the mailing list fa...@we... for this study. Please see http://fasd.ethz.ch/qsf/mailinglist_en.html for registration to this mailing list.

_______________________________________________________________________

Benno Luthiger
Swiss Federal Institute of Technology Zurich
8092 Zurich

Mail: benno.luthiger(at)id.ethz.ch
_______________________________________________________________________

[Cweb-developer] updated notes on the rest-ful repository request journaling mechanism

From: Bryan T. <br...@th...> - 2003-12-28 15:11:03

These notes are from the javadoc for the RequestManager class.  This class
was originally developed for a workflow executive.  You can still see this
where the narrative has not been updated.

Feedback is appreciated.  The actual RDBMS schema are in CVS if you are
interested.

-bryan

    HTTP requests may be journaled as a basic repository behavior to
    provide a mechanism for restoring history from a known database
    state, monitoring requests that result in error conditions,
    analyzing DOS attacks, etc.  Each journaled request contains the
    complete HTTP request data, including the HTTP version, request
    method, host, HTTP request and entity headers, and (normally) the
    HTTP entity.<p>

    For journaling, security, and many other reasons, it is important
    that all modifications to the representation of resource state go
    through the HTTP interface.  If you modify the state of a resource
    directly (other than as the implementation of the service
    interface for that resource) then the journaling mechanism will no
    longer function and you will be unable to restore your data from a
    prior known consistent repository snapshot.<p>

    This class manages the RDBMS tables for the <tt>requestLog</tt>,
    the <tt>requestQueue</tt>, and the <tt>errorLog</tt>, which journal
    historical HTTP requests, pending and active HTTP requests, and
    HTTP requests that have resulted in an error condition,
    respectively.<p>

    <h4>Highlights</h4>

    The following items explore some different aspects of the request
    journaling mechanism:<ul>

    <li>Synchronous vs Asynchronous HTTP Requests.<br>

        Typically HTTP services use synchronous processing.  In this
        model, the HTTP response is generated after the service has
        fulfilled the service level request, e.g., by creating a new
        resource.  In this model the life cycle of the service request
        is the same as the life cycle of the HTTP request-response,
        the HTTP status code indicates the success or failure of the
        service level request, and the response headers and the
        optional response entity often encode the service level
        response, e.g., the URI of the newly created resource.<br>

        However, HTTP provides a means using the 202 (Accepted) status
        code by which a service may indicate that it has accepted a
        request for asynchronous processing.  The use of this status
        code essentially decouples the HTTP request - response life
        cycle from the life cycle of the service request.<br>

	A service may elect to use asynchronous processing for any of
        a number of reasons, including high expected latency,
        inherently long-running processing, high server load, etc.
        However, a service that elects asynchronous processing should
        have provisions in the service contract by which the client
        may become informed about the eventual outcome of the request.
        Even if the semantics of the service do not require the client
        to become informed about the eventual outcome of the request,
        it is still good practice to specify in the service contract
        the conditions under which the service may elect asynchronous
        processing.<br>

    <li>Consistent regeneration of assigned URIs during playback.<br>

        Playback of the request journal must re-generate the same URIs
        for each request so that the managed resources may be found at
        the same URI addresses by remote agents.  Failure to do this
        will result in broken links from external resources when
        attempting to restore the repository by playing back the
        request journal starting at a prior known state.<br>

        By default, the IDENTITY column of the requestLog RDBMS table
        is used to assign a unique "local name" that will be used as
        the last path component of the URI returned to the User Agent
        for a POST request that results in the creation of a new
        resource as indicated by the 201 (Created) status code.  Some
        services support the HTTP extension header "Local-Name-Hint",
        which permits the User Agent to indicate the local name for
        the newly created resource.  Those services are free to choose
        to use the caller-specified local name over the value in the
        localName column in the requestLog RDBMS table.  However, the
        service should make this choice in a consistent manner such
        that it does not result in broken links during playback.<br>

        Services are also free to override the default generated local
        name, e.g., to use a different mechanism for assigning
        resource names, but they MUST do so at the time that the local
        name is generated, using the provided override mechanism,
        since the original requestLog entry is persisted in a separate
        transaction to ensure that it is not rolled back if the
        request fails.<br>

    <li>Safe journaling of the request entity.<br>

        If the content length is unknown or exceeds some limit, then
        the request entity MUST be journaled by the service (in a
        separate transaction so that it will not be rolled back if the
        request fails).  This is to avoid DOS attacks based on the
        size of the request entity.<br>

    <li>Journaling of idempotent requests.<br>

        While a GET request should be idempotent, journaling GET
        requests provides us with information on failed GET requests
        and on DOS attacks based on GET.<br>

    <li>Consistent timestamp assignment by the RDBMS.<br>

        In order to facilitate consistent management of the repository
	RDBMS store, all request timestamps are assigned by the RDBMS
	when the HTTP request is INSERTed into the <tt>requestLog</tt>
	table.<br>

        When truncating the journal, while you may safely DELETE all
        journal entries that are OLDER THAN the date and time of the
        most recent consistent snapshot of the repository RDBMS tables,
        it is often wise to keep more history in case there is a
        problem with your database backup.  There are methods in this
        class to facilitate the journal truncation operation.<br>

        Note: How you create a database backup depends on
        <strong>you</strong> and your RDBMS platform.  One safe method
        is to take the application server offline, take the RDBMS
        offline, and create a backup of the repository database using
        the tools provided by your platform.  However, this has the
        distinct disadvantage of taking your service offline (e.g., a
        planned service outage).  There are also technologies
        that provide streaming backups of the live database, journals
        of all database transactions, etc.<br>

        The safe backup and restoration of repository data crosses the
        levels of database administration, network administration,
        service administration, marking and isolating requests that
        should NOT be recovered, e.g., DOS attacks, and managing user
        expectations.  Administrators are encouraged to create and
        share best practices.  There are not any easy answers.<br>

    <li>Journaling of TCP/IP connection information.<br>

        Should information be collected about the TCP/IP connection,
        e.g., the IP address of the User Agent making the request?  I
        am inclined to think that DOS attack analysis and security
        measures that operate at the network level do not belong in
        the repository.<br>

    <li>Resource revision history mechanism.<br>

        There is also an optional resource version history mechanism
        that can be used to expose access to historical resource state
        versions as a service aspect for any resource.  These
        mechanisms are distinct but may be used to accomplish some of
        the same management goals.<br>

        If the resource revision history mechanism is used to restore
        the state of a resource, then it must be done using another
        PUT (walking in forward time only) otherwise the repository
        state can no longer be used for journal playback
        restoration.<br>

	<strong>

	        Always make corrections through the HTTP interface,
                whether correcting by re-establishing the state of a
                resource from the version history or by
                re-establishing the state of the repository by
                playback from a known historical state.  Failure to
                follow this contract will make it impossible to
                restore the repository from a known state by playing
                back journaled requests.

	</strong>

    <li>Operator intervention.<br>

        Requests may result in an error condition.  Responding to
        error conditions is an idiosyncratic business.  Depending on
        the semantics of the service, some errors may yield to
        operator inspection and intervention while other errors may be
        intrinsically handled by the requestor.<br>

	For example, a work item may represent error conditions in
        delegated requests within the evolving representation of the
        work item state.  In this case, the HTTP status code
        indicating the error condition is accessible to corrective
        behaviors declared by the work item.  If no such corrective
        behaviors have been declared, then an operator monitoring that
        work item can observe the error conditions that are causing
        the work item to block and take corrective actions, e.g., by
        editing the work item state.  In this case errors in the
        request journal may only provide an oversight mechanism since
        there is more application specific context available in the
        work item resource itself.<br>

	In general, intervention is more difficult for a synchronous
        request since an HTTP response has already been committed to
        the User Agent.  However, it is possible that an error for
        either a synchronous or asynchronous HTTP request can be
        corrected, the error condition cleared, and the request will
        then run to a successful completion.<br>

	<i>Everything depends on the service semantics.</i> It may or
        may not make sense to do this for a synchronous request since
        the User Agent has already received an HTTP response
        indicating an error condition.<br>

	Operators may also use the logged errors as a source of
        information about problems with the realization of the service
        contract.  If the service is failing to respect its contract,
        then that can be addressed by tracking down the source of the
        problem and updating the service realization class and/or
        repository version.  If the clients are failing to respect the
        service contract, then you should ask yourself whether or not
        the contract is clearly written and decide how to notify the
        requesting organizations concerning the proper use of the
        service contract.<br>

        Errors often indicate an underlying social problem concerning
        the communication about the service contract rather than a
        technical problem.  Such problems are best addressed through
        clear service descriptions, use cases, examples and support.
        Hopefully such errors will be caught by declarative service
        description mechanisms, a la WRLD, as those efforts evolve.<br>

    <li>Unique request identifier.<br>

        Each HTTP request is assigned a unique request identifier
        using an IDENTITY column in the requestLog table.  Several of
        the methods in this class, especially those that are
        responsible for entering a request into the requestQueue and
        errorLog tables, make use of this identifier.<br>

        Those methods actually INSERT the data into the target table
        using a SELECT on the requestLog table in order to avoid first
        bringing the journaled request into the client.  This
        efficiency is important when there are many asynchronous
        service requests and when large request entities have been
        journaled as it minimizes the local network traffic between
        the application server and the RDBMS.<br>

    </ul>
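
    The point about consistent regeneration of assigned URIs can be
    illustrated with a minimal sketch.  It is not the RequestManager
    code: the class name, the method name, and the
    <tt>serviceHonorsHint</tt> flag are hypothetical, and the request
    id simply stands in for the IDENTITY value journaled in the
    requestLog.  The only idea shown is that the local name is a pure
    function of journaled data, so playback arrives at the same name as
    the original request.

```java
import java.util.Optional;

/** Hypothetical illustration only; not part of the cweb code base. */
class LocalNameSketch {

    /**
     * Chooses the last path component of the URI for a newly created resource.
     *
     * @param requestId         stand-in for the IDENTITY value in requestLog
     * @param localNameHint     stand-in for the journaled Local-Name-Hint header
     * @param serviceHonorsHint assumed per-service policy; it must not change
     *                          between the original request and playback
     */
    static String chooseLocalName(long requestId,
                                  Optional<String> localNameHint,
                                  boolean serviceHonorsHint) {
        if (serviceHonorsHint
                && localNameHint.isPresent()
                && !localNameHint.get().isEmpty()) {
            return localNameHint.get();
        }
        // Default: the journaled request identifier becomes the local name.
        return Long.toString(requestId);
    }

    public static void main(String[] args) {
        // The original request and its playback read the same journaled
        // inputs, so both calls produce the same local name.
        String original = chooseLocalName(42L, Optional.of("report"), true);
        String playback = chooseLocalName(42L, Optional.of("report"), true);
        System.out.println(original.equals(playback));

        // Without a hint the identifier is used.
        System.out.println(chooseLocalName(43L, Optional.empty(), true));
    }
}
```

    Because nothing outside the journal (clock, random source, current
    table size) feeds the decision, replaying the journal from a prior
    snapshot cannot break external links.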

    <hr>

    <h4>TODO</h4>

    a. Now:<br>

       - refactor the WorkflowRequest class to also provide a view of
         the HTTP headers and migrate into consistent support that
         carries the request context into processing throughout the
         service behavior.  In particular, update the methods in the
         RestWebHelper class.<br>

       - Update the RDBMS schema for the request journaling tables to
         support the serialization and persistence of HTTP
         headers.<br>

       - Provide a mechanism to mark resources and requests that
         should not be re-created during playback so that the journal
         can be annotated so as not to re-create a journaled DOS
         attack during playback.<br>

       - Provide a high-level interface to replay the journal for the
         repository.  Should this happen in an offline state?  It does
         not seem necessary, but it could be important for some users.
         Of course, you can just disable the incoming network traffic
         and get the same ends.  The individual services need to be
         "online" so that the requests can be processed, but "outside"
         requests could be disabled at some level.<br>

    b. Add optional support for snapshot based version history to the
       representation table.  Consider whether this is the primary
       representation table or a secondary (and possibly a set of
       secondary) representation tables specifically for version
       histories.  Note that XML diff algorithms could be used here,
       but we are already getting most of the benefit from the
       request journal if people use XPointer and PUT to make
       changes in resource state.<p>

    c. Entries in both the version history and journal replay
       mechanisms can be truncated based on coherent repository state
       snapshots (as an optional admin task) since the repository can
       be replayed from any known state using the journal.<p>

    d. Playback can encounter stale authentication tokens that cause
       problems with both the direct manipulation of the repository
       (if there is local network security between the repository and
       the playback mechanism) and with delegated requests.  Such
       authentication can be circumvented by tunneling the playback
       requests.  However, requests that require delegating authority
       across network boundaries will always need to provide a
       challenge to the original agent.  Clearly this can only succeed
       under certain network architecture plans.<p>

    e. Expose XML view of the requestLog (journal), requestQueue
       (requests that are being processed or that are pending async
       processing), and the errorLog (requests that have resulted in
       an error condition).  This view can be used to build an aspect
       of the management interface for the repository - perhaps by
       reusing the queue mechanism.  It will also need access to a
       controller for performing common operations, e.g., database
       snapshots, trimming the requestLog, retrying requests that
       resulted in an error condition.  Be careful of grounding
       conditions here since there is an opportunity for recursion.
       For example, are management interface requests also journaled
       and if so can that lead into a downward spiral?  Also, if the
       queue service relies on the repository to represent state (not
       sure that it does really except for the existence of the queue
       resource (and queue-entry resources?)) then the management
       interface cannot operate without causing side effects on the
       repository that may be unintended.<p>
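
    The truncation rule from item c, together with the earlier advice
    to keep more history than strictly necessary, can be sketched as
    follows.  This is an illustration under assumptions, not one of the
    truncation methods of this class: the class and method names are
    hypothetical, the safety margin is an invented parameter, and an
    in-memory list of timestamps stands in for the requestLog rows.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not part of the cweb code base. */
class TruncationSketch {

    /**
     * Journal entries strictly older than the returned instant may be deleted.
     * The safety margin keeps extra history in case the backup is bad.
     */
    static Instant cutoff(Instant lastConsistentSnapshot, Duration safetyMargin) {
        return lastConsistentSnapshot.minus(safetyMargin);
    }

    /** Returns the timestamps that survive truncation (those not before the cutoff). */
    static List<Instant> retain(List<Instant> journalTimestamps, Instant cutoff) {
        List<Instant> kept = new ArrayList<>();
        for (Instant t : journalTimestamps) {
            if (!t.isBefore(cutoff)) {
                kept.add(t);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Instant snapshot = Instant.parse("2003-12-28T00:00:00Z");
        Instant limit = cutoff(snapshot, Duration.ofDays(7));
        List<Instant> journal = Arrays.asList(
                Instant.parse("2003-12-20T12:00:00Z"),
                Instant.parse("2003-12-24T12:00:00Z"),
                Instant.parse("2003-12-28T15:11:03Z"));
        System.out.println(retain(journal, limit));
    }
}
```

    In the repository itself the comparison would be made against the
    timestamps assigned by the RDBMS, so that the cutoff and the journal
    entries share one clock.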

    <hr>

    <h4>RDBMS Schema for Request Journaling Mechanism</h4>

    There are three RDBMS tables that are used to realize the request
    journaling mechanism.  These tables have essentially the same
    columns, though the <tt>errorLog</tt> contains some additional
    information about the error condition.<ul>

    <li><tt>requestLog</tt>

        Provides a journaled entry for <strong>every</strong> request
        submitted to the repository.

    <li><tt>requestQueue</tt>

        Contains <strong>only</strong> those requests marked for
        asynchronous processing (by the service), including those that
        are in an error condition.  The access methods in this class
        provide a view onto this table that shows only those requests
        that are not in an error state.  This view is used to support
        a request dispatcher for pending requests.

    <li><tt>errorLog</tt>

        Contains <strong>only</strong> requests that resulted in an
        error condition and are pending operator intervention.<br>

    </ul>
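
    Since the three tables share essentially the same columns, a
    request can be copied from the requestLog into the requestQueue
    with a single INSERT ... SELECT keyed on the request identifier, as
    described under "Unique request identifier" above.  The sketch
    below shows the shape of such a statement through plain JDBC.  The
    column names (<tt>requestId</tt>, <tt>method</tt>, <tt>uri</tt>,
    <tt>headers</tt>, <tt>entity</tt>) and the class and method names
    are assumptions made for illustration; the actual schema is the one
    in CVS.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical illustration only; not part of the cweb code base. */
class EnqueueSketch {

    // Assumed column names; only the table names come from the notes above.
    static final String ENQUEUE_SQL =
          "INSERT INTO requestQueue (requestId, method, uri, headers, entity) "
        + "SELECT requestId, method, uri, headers, entity "
        + "FROM requestLog WHERE requestId = ?";

    /**
     * Copies one journaled request into requestQueue inside the RDBMS.
     * Only the identifier crosses the network; the (possibly large)
     * request entity never reaches the application server.
     *
     * @return the number of rows inserted
     */
    static int enqueue(Connection conn, long requestId) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(ENQUEUE_SQL)) {
            ps.setLong(1, requestId);
            return ps.executeUpdate();
        }
    }

    public static void main(String[] args) {
        // No database is needed just to inspect the statement text.
        System.out.println(ENQUEUE_SQL);
    }
}
```

    The same pattern would apply to the errorLog, with the additional
    error columns supplied as further bind parameters in the SELECT
    list.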

    <h4>Contract for asynchronous processing</h4>

    Note: This section needs a rewrite.  Also, since this contract
    (and code) was originally developed solely for asynchronous
    request processing, we need to review this in order to develop the
    contract for synchronous request processing.  One of the key
    questions is whether synchronous requests are entered into the
    requestQueue.  If not, then the next key question is how to mark
    synchronous vs asynchronous requests (in the database), how to
    get error conditions for synchronous requests into the error log,
    and what to do if the error log entry for a synchronous request is
    cleared.  (My expectation is that we will do nothing.  I.e., if a
    synchronous request fails then it fails.  Clearing the error
    condition should not cause the request to be retried since the
    User Agent will not be notified of the success/failure of the
    retried request.)<p>

    ...<br>

    When a workflow executive begins to process a workflow request, it
    opens a database transaction, reads the next record from the
    request queue for which no error condition exists and immediately
    deletes that record from the request queue.  If the delete fails,
    then some other executive has already begun processing.  At this
    point you MUST roll back the transaction.  You MAY read again
    <strong>in a different transaction</strong> to get another
    record.<p>

    If the executive succeeds, then the transaction is committed.  At
    this point the request may be deleted from the <tt>requestLog</tt>
    by web services that do not desire to preserve their request
    history.<p>

    If the workflow executive fails, then it MUST open
    <strong>another</strong> {@link Connection} and record the error
    condition in the <tt>errorLog</tt>.  Once the error condition has
    been successfully committed, the workflow executive transaction is
    rolled back -- this restores the request to the queue.  At this
    point some compensating action needs to be taken, e.g., operator
    intervention.<p>

    If an error condition can not be written into the
    <tt>errorLog</tt>, e.g., owing to database access failure, then
    the workflow executive MUST record an urgent message in the system
    log.<p>

    If the client dies instantly while the workflow executive is
    running, then a dirty read will still bring back the data from the
    uncommitted transaction.  At this level identifying and correcting
    dropped requests becomes an issue for the network and database
    administrators.  For example, you might monitor the database for
    long-running open transactions and use network administration
    tools to inform you when the client dies.<p>
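
    The claim-by-delete contract above can be modelled in memory to
    make the control flow concrete.  This is a sketch under stated
    assumptions, not the workflow executive: two maps stand in for the
    <tt>requestQueue</tt> and <tt>errorLog</tt> tables, removing a key
    stands in for the DELETE inside the executive's transaction, and
    putting it back stands in for the rollback.  All class and method
    names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Consumer;

/** Hypothetical in-memory model of the contract; not the cweb implementation. */
class QueueContractSketch {

    // Stand-ins for the requestQueue and errorLog tables, keyed by request id.
    final Map<Long, String> requestQueue = new LinkedHashMap<>();
    final Map<Long, String> errorLog = new LinkedHashMap<>();

    /** Dispatcher view: the next queued request for which no error condition exists. */
    synchronized Optional<Long> nextPending() {
        for (Long id : requestQueue.keySet()) {
            if (!errorLog.containsKey(id)) {
                return Optional.of(id);
            }
        }
        return Optional.empty();
    }

    /** Claim by delete: returns the request if this caller won the race, else null. */
    synchronized String claim(long id) {
        return requestQueue.remove(id);
    }

    /** Processes at most one request; returns false when nothing could be claimed. */
    boolean processOne(Consumer<String> executive) {
        Optional<Long> next = nextPending();
        if (!next.isPresent()) {
            return false;
        }
        long id = next.get();
        String request = claim(id);
        if (request == null) {
            // Another executive deleted the record first: give up on this
            // attempt; the caller may read again in a new "transaction".
            return false;
        }
        try {
            executive.accept(request); // normal return models COMMIT
        } catch (RuntimeException e) {
            synchronized (this) {
                // First record the error (models the separate Connection) ...
                errorLog.put(id, String.valueOf(e.getMessage()));
                // ... then roll back, which restores the request to the queue.
                requestQueue.put(id, request);
            }
        }
        return true;
    }

    public static void main(String[] args) {
        QueueContractSketch q = new QueueContractSketch();
        q.requestQueue.put(1L, "POST /a");
        q.requestQueue.put(2L, "POST /b");
        Consumer<String> executive = r -> {
            if (r.endsWith("/a")) {
                throw new IllegalStateException("boom");
            }
        };
        while (q.processOne(executive)) {
            // keep dispatching until no pending request remains
        }
        System.out.println("queue=" + q.requestQueue + " errors=" + q.errorLog);
    }
}
```

    The failed request stays in the queue but is hidden from the
    dispatcher view until an operator clears its errorLog entry.  The
    model does not cover the case where the client dies mid-transaction;
    that still has to be handled at the database and network level as
    described above.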

[Cweb-developer] Repository versioning and journaling

From: Bryan T. <br...@th...> - 2003-12-25 14:30:44

All,

I was just thinking about how to reuse the workflow request journaling
mechanisms that Guy and I developed last year.  It occurred to me that we
could be journaling all requests as a repository behavior, not only for
workflow concerns.  This gives us one mechanism for restoring history,
analyzing DOS attacks, etc.  We can also resurrect the resource versioning
mechanism that I did originally and expose access to version histories as a
service aspect for any resource.

   a. Migrate workflow request, error, log mechanisms into the core
      repository to provide optional journaling for all requests.  The
      journal entry should include the request headers, should
      optionally include various information about the HTTP
      Connection, especially the source IP, etc. for use not only in
      being able to playback the journal to re-create a repository
      state but also in analyzing DOS attacks.  Use the journal
      IDENTITY column to assign a unique local name to resources
      created during POST requests (unless overridden by
      Local-Name-Hint).  This will make it possible to replay the
      journal without breaking any dependent external URIs.

   b. Add optional support for snapshot based version history to the
      representation table.  Consider whether this is the primary
      representation table or a secondary (and possibly a set of
      secondary) representation tables specifically for version
      histories.  Note that XML diff algorithms could be used here,
      but we are already getting most of the benefit from the
      request journal if people use XPointer and PUT to make
      changes in resource state.

   c. Entries in both the version history and journal replay
      mechanisms can be sunset based on coherent repository state
      snapshots (as an optional admin task) since the repository can
      always be replayed from either the zero state or any other known
      state using the journal.  If the version history mechanism is
      used to restore the state of a resource, then it must be done
      using another PUT (walking in forward time only) otherwise the
      repository state can no longer be used for journal playback
      restoration.  That is: always make corrections through the HTTP
      interface, whether correcting by re-establishing the state of a
      resource from the version history or by re-establishing the
      state of the repository by playback from a known historical
      state.  Provide a mechanism to mark resources and requests that
      should not be re-created during playback so that the journal can
      be annotated so as not to re-create a journaled DOS attack
      during playback.

   d. Playback can encounter stale authentication tokens that cause
      problems with both the direct manipulation of the repository
      (if there is local network security between the repository and
      the playback mechanism) and with delegated requests.  Such
      authentication can be circumvented by tunneling the playback
      requests.  However, requests that require delegating authority
      across network boundaries will always need to provide a
      challenge to the original agent.  Clearly this can only succeed
      under certain network architecture plans.

Happy holidays to all,

-bryan

[Cweb-developer] Welcome

From: Bryan T. <br...@th...> - 2003-12-21 23:02:17

Welcome.

This email list is for developers on the Cognitive Web Open Source project.
The project is divided into a number of CVS modules.  Some of these modules
provide the support that is used to realize the REST-ful repository
(cweb-rest module) while other modules implement specific REST Web Services
and have a dependency on the repository.

For more information, please see:

	http://www.cognitiveweb.org (main web site)
	http://www.cognitiveweb.org/technology/projects (project documentation)
	http://www.sourceforge.net/projects/cweb (administration page on SourceForge)
and	http://wiki.cognitiveweb.org (wiki)

Thanks,

-bryan
