We have run into a codebase management problem: we have created a
servlet codebase that we want to be shared as gracefully as possible by
Right now, it's very easy for WebKit applications to share non-servlet
modules and packages through the standard Pythonic ways (as with MiddleKit,
cStringIO, PIL, etc.). It's NOT correspondingly easy to have WebKit
applications share servlets. Consequently, it's quite difficult to manage
complex shared servlet codebases.
Imagine a WebKit app that provides an "on-line storefront". We want the same
code to be used by one company that sells fastening hardware and another
that sells plush toys. Each of these companies has their own context (or
even their own appserver). If we get a third customer (one that sells tea,
for instance), we want to be able to reuse the same codebase a third time.
And we want to be able to continue to make improvements to that codebase
that are enjoyed by all three customers. Given the requirement of separate
app servers (so, no, we don't want to use mod_rewrite), stock WebKit provides
only two real options for making this happen:
1) Physically replicate all the servlet files for each deployment. Each
deployment has a complete set of code and will work fine, but updating the
codebase becomes unpleasant because the changes have to be made manually in
each and every deployment. The more deployments, the more pain. And any
customizations unique to a deployment make it hurt even more, because you
have to manually edit around the customizations.
2) Create abstract "master" servlets somewhere, and then import and subclass
them in a published context. This works, but now you have a plethora of
servlet subclasses littering your context that, for the most part, don't do
anything useful themselves. Most are just there for AppServer's convenience,
because it expects to find a file for each requested URI. Furthermore, some
of these servlets may implement customizations (overriding methods or
whatever), and there's no way of telling which ones do without opening them
up (which means there's no way to know whether a file does something
important or not).
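
To make the problem with option 2 concrete, here is a rough sketch of what
such stub servlets tend to look like. All class names are made up for
illustration, and the "master" class is defined inline (standing in for an
import from the master package) so that the snippet runs by itself.

```python
# Stand-in for a servlet defined in the shared "master" tree.
# In a real deployment this class would be imported from the master package.
class MasterIndex:
    def title(self):
        return "Storefront"

    def render(self):
        return "<h1>%s</h1>" % self.title()


# Pass-through stub: adds nothing, and exists only so that the app server
# finds a file for the requested URI.
class index(MasterIndex):
    pass


# Customized stub: from a directory listing it looks just like the one
# above; you only learn that it matters by opening the file.
class catalog(MasterIndex):
    def title(self):
        return "Plush Toys"
```

The first kind of stub is the one we would like to get rid of entirely.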
Option one is what we have been doing. It sucks, empirically. Option two is
what we're facing, and while it buys us that easily-managed central codebase
in the form of the master tree, all those subclassing servlets in the
respective deployments continue to cause trouble. What we WANT to do is
banish all those subclassing servlets that contain no custom code. This is
the third option we are looking for.
The solution to our problem might be in the way WebKit maps URIs to objects
(servlets). Right now, it's a very strict mapping:
URI --> <file>.py --> class instance of same name as <file>
Where <file>.py is a Python module (perhaps inside a "package") that
contains a class definition called <file>. The app server then makes an
instance of this class, which is the running servlet.
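
As a rough illustration of that strict mapping (this is NOT WebKit's actual
servlet factory code, just a sketch in current Python using importlib; the
function name load_servlet_class and its arguments are assumptions made up
for this example):

```python
import importlib.util
import os


def load_servlet_class(context_dir, uri_path):
    """Map a URI path such as '/shop/index' to <context_dir>/shop/index.py
    and return the class named 'index' from that module, or None."""
    parts = uri_path.strip("/").split("/")
    name = parts[-1]
    filename = os.path.join(context_dir, *parts) + ".py"
    if not os.path.isfile(filename):
        return None  # no file for this URI in the context
    # Load the module directly from its file location.
    spec = importlib.util.spec_from_file_location(name, filename)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # The servlet class must carry the same name as the file.
    return getattr(module, name, None)
```

The point to notice is that there is exactly one place the file can live:
under the context directory.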
If a URI request maps to a file that isn't there, WebKit currently returns a
If, however, we could provide one (or more) ALTERNATIVE paths (packages) for
the app server to find files (modules) in, we would probably be set. If the
URI mapped to index.py, and there was no index.py file in the context where
it was expected, the app server could look for the module in the
corresponding relative location in an alternate package tree (the master, as
it happens) and instantiate a class from there.
    context "foo"                 alternate path
    [missing index.py]   --->     index.py
The critical point is that the alternate path is simply an alternate
PHYSICAL LOCATION for a module; it has absolutely no namespace implications.
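
The lookup we have in mind could be sketched roughly as follows. The
function name find_servlet_file and the directory names in the usage note
are hypothetical; the only idea taken from the description above is "try
the context first, then the same relative location in the alternate tree".

```python
import os


def find_servlet_file(search_dirs, relative_path):
    """Return the first existing file for relative_path, trying each
    directory in order. Return None if no directory has it."""
    for base in search_dirs:
        candidate = os.path.join(base, relative_path)
        if os.path.isfile(candidate):
            return candidate
    return None
```

For example, find_servlet_file(["/deploy/toys", "/shared/master"],
"index.py") would return the deployment's own index.py if it exists, and
the master copy otherwise. Only the physical location changes; the name
under which the module is loaded can stay the same either way.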
Going back to our "online storefront" example, we could hypothetically have
a completely EMPTY directory tree for each of these different deployments,
save for one servlet subclass in each that provides the right database
connection for that customer. And if the plush toys store needed a deviation
from the master on one of its servlets, we could subclass that servlet for
that deployment. Otherwise, all the code would be loaded from the shared
pool. No files or code would exist in the deployment directories except what
was truly unique from the master.
From a cursory inspection of the servlet factory code in WebKit, a patch to
search additional physical paths doesn't look too terribly complicated. The
main question is how the list of paths would get in there in the first
place. The most elegant approach we could think of to date is to modify the
parsing of application.config so that if a context is mapped to a string, it
works exactly as now; if the context is mapped to a list of strings, it will
iterate through that list of strings as necessary to find a mapping that
works. In theory with a list, this could be (n) levels deep, but in our
case, we just need two.
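
Something along these lines is what we are picturing for the config side.
This is only a sketch: it assumes the contexts end up in a dict mapping
context name to path, and the helper name context_search_paths and the
paths shown are invented for the example.

```python
def context_search_paths(contexts, name):
    """Return the list of directories to search for a context.

    A plain string behaves exactly as it does now (a one-element list);
    a list of strings is searched in the order given.
    """
    value = contexts[name]
    if isinstance(value, str):
        return [value]
    return list(value)


# Hypothetical context mapping: one old-style entry, one with a fallback.
contexts = {
    "hardware": "/deploy/hardware",
    "toys": ["/deploy/toys", "/shared/master"],
}
```

With that, context_search_paths(contexts, "hardware") gives a single
directory as before, while "toys" gives the deployment directory followed
by the master tree.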
Would this implementation work? How can we ensure that the module/namespaces
stay "correct" (in the local namespace, not the master)? What do we need to
take into account to patch the code? Can we do the same thing for
non-servlet URIs like .jpg images and .css files? Are we crazy?
We know that some of you are working on other mappings, but they're not in
Webware yet (and the Wiki is down again).