From: Ian B. <ia...@co...> - 2002-04-16 01:59:19
On Mon, 2002-04-15 at 16:03, Jeffrey P Shell wrote:
> A pattern I've been using for text handling has migrated through a few
> refactorings to become a bag of Handlers. The main interfaces are::
>
>     from Interface import Base, Attribute
>     class IHandlerResult(Base):
>         """
>         IHandler objects return IHandlerResult objects, which are
>         simple records that any text managing utility can use.
>         """
>         source = Attribute("The web-editable source code.")
>         cooked = Attribute("Rendered (cooked) code, used for display")
>         fullsource = Attribute(("The full source of the text, to be presented "
>                                 "to non-web clients"))
>         headers = Attribute("A mapping object of headers/values")
>
>     class IHandler(Base):
>         """\
>         IHandler objects process text and return an IHandlerResult object.
>         """
>         def handle(text):
>             """\
>             Processes the incoming text and returns an IHandlerResult
>             object.
>             """

I'm not entirely clear on the interface -- what are Base and Attribute?
Oh... the new interface definition stuff? I assume all the meta-data
like title and such is just kept as headers?

> I have a couple of existing singletons for common uses -- editing text
> directly in a text area with little or no HTML knowledge, and uploading HTML
> exported from monsters like WordPerfect and other tools. They parse out the
> meta-tags and title tag (and put them in the headers result); they parse out
> the contents of the <body> tags. Then the "SafeHTMLHandler" uses an SGML
> parser coupled with a table of valid tags to rewrite the incoming code to
> something deemed as safe - basically it gets rid of a lot of the extra crap
> that Word, Wordperfect, Composer, etc, puts in. The original full source is
> kept (at the handler's clients discretion) for FTP/DAV editing without too
> much surprise on the authors part, while a <textarea> friendly version of
> the source (the parts between the body tags) is also kept for web based
> editing. And, there's the cooked content - basically this is where the
> handler transforms the input into HTML to be used within the standard look
> of the site. This is computed only on upload time.

This is very interesting... I've been doing stuff a little like this
with another project, converting from Word doc format to HTML. And I
find I've been working towards this with the Wiki stuff as well.

How do you deal with the conflict when person A edits a document via
Word, then person B edits the same document via textarea, and then
person A goes back? I assume the original Word-style HTML document gets
junked (or archived) and person A edits HTML generated from person B's
edits? I guess that's fine as long as the Word-style HTML document is a
convenience for the Word user, not a necessity.

You mentioned DAV; in what way are you using that? I've used mod_dav
successfully enough, but I haven't seen how you'd take control of it in
the same way I like to keep control over the other documents.

If you're willing to make your code public, I'd like to help flesh this
out into something more reusable and library-like, and start adding in
different input/output types.

(I figure CMSKit is actually too ambitious -- rather, we'll have lots of
CMS-related modules which will, over time, make creating CMSes easier.
Actually, this is more document management, but it's all just
terminology.)

  Ian
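
For anyone following the thread who wants something concrete to poke at,
here is a minimal sketch of the handler pattern described in the quoted
message. It is not Jeffrey's code. It assumes modern Python and swaps
the standard library's html.parser in for the SGML parser he mentions.
The HandlerResult dataclass, the ALLOWED_TAGS and DROP_CONTENT tables,
and the _CookedBuilder helper are all hypothetical names made up for
illustration; only the four result fields and the handle(text) method
come from the quoted interfaces.

```python
import re
from dataclasses import dataclass, field
from html import escape
from html.parser import HTMLParser

# Hypothetical tables; the original post does not list its valid tags.
ALLOWED_TAGS = {"p", "b", "i", "em", "strong", "ul", "ol", "li", "h1", "h2"}
DROP_CONTENT = {"script", "style"}  # drop these tags and their contents

BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)


@dataclass
class HandlerResult:
    """Plain record mirroring the four attributes of IHandlerResult."""
    source: str       # web-editable source (the part between the body tags)
    cooked: str       # rendered output, tags filtered against ALLOWED_TAGS
    fullsource: str   # untouched upload, for non-web clients
    headers: dict = field(default_factory=dict)


class _CookedBuilder(HTMLParser):
    """Collects title/meta headers and a whitelisted copy of the body."""

    def __init__(self, whole_text_is_body):
        super().__init__(convert_charrefs=True)
        self.headers = {}
        self.cooked = []
        self._in_body = whole_text_is_body
        self._in_title = False
        self._skipping = False

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self._in_body = True
        elif tag == "title":
            self._in_title = True
        elif tag == "meta":
            attrs = dict(attrs)
            if attrs.get("name") and attrs.get("content") is not None:
                self.headers[attrs["name"].lower()] = attrs["content"]
        elif self._in_body:
            if tag in DROP_CONTENT:
                self._skipping = True
            elif tag in ALLOWED_TAGS:
                self.cooked.append(f"<{tag}>")  # attributes are discarded

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False
        elif tag == "title":
            self._in_title = False
        elif self._in_body:
            if tag in DROP_CONTENT:
                self._skipping = False
            elif tag in ALLOWED_TAGS:
                self.cooked.append(f"</{tag}>")

    def handle_data(self, data):
        if self._in_title:
            self.headers["title"] = self.headers.get("title", "") + data
        elif self._in_body and not self._skipping:
            self.cooked.append(escape(data, quote=False))


class SafeHTMLHandler:
    """Sketch only: it does not rebalance unclosed or mismatched tags."""

    def handle(self, text):
        match = BODY_RE.search(text)
        # A fragment typed into a textarea has no <body>; treat it all as body.
        source = match.group(1) if match else text
        builder = _CookedBuilder(whole_text_is_body=match is None)
        builder.feed(text)
        builder.close()
        return HandlerResult(source=source, cooked="".join(builder.cooked),
                             fullsource=text, headers=builder.headers)


doc = ('<html><head><title>Hello</title>'
       '<meta name="author" content="A. Writer"></head>'
       '<body><p class="MsoNormal">Hi <font color="red">there</font></p>'
       '<script>alert(1)</script></body></html>')
result = SafeHTMLHandler().handle(doc)
fragment = SafeHTMLHandler().handle("Plain <b>bold</b> <blink>text</blink>")
```

With this split, result.fullsource is what an FTP/DAV client would get
back, result.source is what would go into a textarea, and result.cooked
is what the site template would display. A whitelist this simple should
not be taken as a complete defence against hostile HTML.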