[Xweb-developers] Re: Changes to XWeb

[moved to the dev list -- for those tuning in: Hendrik submitted a few 
patches and I plan to make some changes to XWeb together with Jon 
(mentioned below). I tried to add some hints about the preceding 
discussion; it was only two mails back and forth]

Hendrik Lipka wrote:

>Wednesday, November 5, 2003, 2:51:30 PM, you wrote:
>
>  
>
>>I have thought about making big architectural changes for a long time by
>>now, up to a complex processing model with streams, meta-data, 
>>multi-plexing and others. But a first step would be getting closer to 
>>the model of Lagoon or Transmorpher with simple one input, one output 
>>processors in a queue and implicit SAX- to ASCII-stream conversions.
>>    
>>
>
>Personally, I'm happy with the current model. It's easy to understand, and
>fulfills most needs, I think.
>  
>
It is limiting, e.g. you can't do things like first XSLT, then SVG 
transformation. Same for FOP. And you can't combine the power of UNIX 
tools into the process by running stuff like sed or scripting languages 
at some stages. Freemarker would be another thing I'd like to support 
(http://freemarker.sourceforge.net/).

To do this properly, XWeb needs a general notion of a toolchain, not 
just for XSLT.
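
As a rough illustration of the "one input, one output processors in a 
queue" idea, here is a minimal Java sketch. Everything in it is an 
assumption made for illustration: the names Toolchain and Processor do 
not exist in XWeb, and plain strings stand in for the SAX events or 
byte streams a real processor would consume and produce.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a generic toolchain; not XWeb code. */
final class Toolchain {

    /** One stage: exactly one input, exactly one output. */
    interface Processor {
        String process(String input);
    }

    private final List<Processor> stages = new ArrayList<>();

    /** Appends a stage to the end of the queue. */
    Toolchain add(Processor stage) {
        stages.add(stage);
        return this;
    }

    /** Feeds the input through every stage in the order it was added. */
    String run(String input) {
        String current = input;
        for (Processor stage : stages) {
            current = stage.process(current);
        }
        return current;
    }

    public static void main(String[] args) {
        Toolchain chain = new Toolchain()
                .add(text -> text.trim())                  // stand-in for an XSLT step
                .add(text -> text.replace("foo", "bar"));  // stand-in for a sed-like step
        System.out.println(chain.run("  foo page  "));
    }
}
```

With such a queue, "first XSLT, then SVG transformation" would just be 
two stages added in that order; an external tool like sed could be 
wrapped as one more Processor.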

>>Some pointers:
>>- http://transmorpher.inrialpes.fr/docs/compare.html
>>- http://meganesia.int.gu.edu.au/~pbecker/xweb/processingModel.html
>>    
>>
>
>Interesting read.
>
>  
>
>>But it would probably be handy for people who want to manage the whole
>>process around this with Ant -- like the ftp or http uploads, maybe
>>    
>>
>
>Sounds like me :)
>
>  
>
>>My suspicion is that you will end up rendering again quite often. But at
>>least in the situation of a typo-fix and similar local changes it should 
>>be ok. We would need a way to define extra dependencies, though -- at 
>>    
>>
>
>Something like a <dependson> element below the entries should do. When a
>style changes, everything is rebuilt; otherwise the dependencies are used.
>  
>
Yes, that was what I was thinking of.

>>Libraries needed in 1.3, which are part of 1.4:
>>- RegExp library like ORO
>>    
>>
>
>I think ORO is the most complete RegEx library around, and it's only 65k.
>  
>
The main thing I'd like to get rid of is Xerces since it is so huge. Do 
you know how close the ORO API is to the JDK 1.4 RegEx approach?

>>- something for logging like log4j
>>    
>>
>
>Log4J is also one of the most complete loggers, and not _that_ large.
>  
>
It is not just about size, it is also about being mainstream. But of 
course Jakarta is reasonably mainstream and there are other things I 
might want to use from them, e.g. the CLI or FileUpload stuff from 
Commons. The Commons Logging package might be a good idea, too -- esp. 
in the case we want to go a mixed 1.3/1.4 route.

>>- something for image output like JIMI (another bigger lib in the 
>>current XWeb)
>>    
>>
>
>JDK1.4 has some image processing classes, but the ImageIO lib is still
>required :(
>  
>
ImageIO is part of the JDK -- I use it in some other programs to export 
PNGs and JPGs. No extra libraries needed. And the API is a lot nicer 
than JIMI or JAI (the latter being incredibly bad in design).
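
For reference, a small sketch of what exporting an image with ImageIO 
can look like. The class name ImageExport and its write helper are 
made up for this example and are not XWeb code; the only library call 
relied on is javax.imageio.ImageIO.write, which returns false when no 
writer is registered for the requested format name.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

/** Hypothetical helper; shows image export using only JDK classes. */
final class ImageExport {

    /** Writes the image in the given format ("png", "jpg", ...). */
    static void write(BufferedImage image, String format, File target) {
        try {
            // ImageIO.write reports a missing writer by returning false.
            if (!ImageIO.write(image, format, target)) {
                throw new IllegalArgumentException("No ImageIO writer for: " + format);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // A small opaque RGB image with one red pixel.
        BufferedImage image = new BufferedImage(64, 32, BufferedImage.TYPE_INT_RGB);
        image.setRGB(10, 10, 0xFF0000);

        File png = File.createTempFile("xweb-demo", ".png");
        write(image, "png", png);
        System.out.println("Wrote " + png.length() + " bytes to " + png);
    }
}
```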

[discussing the idea of an <entryset>, which uses regular expressions or 
globbing and gets expanded to a number of <entry>s]

>>globbing bit might require writing some matcher of our own, but that is
>>easy, too.
>>    
>>
>
>You are really optimistic :) My first test case would be something like
>source='??some*.?htm?' target='$1next$2.html'
>Renaming just the extensions would be easier...
>  
>
You could always map it to something like "..some.*\..htm." and run the 
RegExp machinery. I find the glob format a lot easier for simple things 
like matching file names and I think there are many people who use XWeb 
but don't know much about RegExp. Forgetting to escape the dot would be 
a first problem.
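
The mapping described above can be sketched in a few lines of Java on 
top of java.util.regex. This is only an illustration under stated 
assumptions: the class GlobToRegex and its methods are hypothetical 
names, only "?" and "*" are treated as wildcards, and no capture 
groups are produced, so the $1/$2 part of the target pattern is not 
covered here.

```java
import java.util.regex.Pattern;

/** Hypothetical helper; translates file-name globs to regular expressions. */
final class GlobToRegex {

    // Characters that have a special meaning in a regular expression
    // and therefore must be escaped when they appear literally in a glob.
    private static final String META = "\\.[]{}()+^$|";

    /** Maps "?" to ".", "*" to ".*", and escapes regex metacharacters. */
    static String toRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < glob.length(); i++) {
            char c = glob.charAt(i);
            if (c == '?') {
                regex.append('.');
            } else if (c == '*') {
                regex.append(".*");
            } else if (META.indexOf(c) >= 0) {
                regex.append('\\').append(c);
            } else {
                regex.append(c);
            }
        }
        return regex.toString();
    }

    /** True if the whole file name matches the glob. */
    static boolean matches(String glob, String fileName) {
        return Pattern.matches(toRegex(glob), fileName);
    }

    public static void main(String[] args) {
        System.out.println(toRegex("??some*.?htm?"));
        System.out.println(matches("??some*.?htm?", "mysomefile.shtml"));
    }
}
```

Because the dot is escaped by the translation itself, users writing 
globs would not run into the forgotten-escape problem.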

[adding the file name also as id]

>>>My website has a list of downloads, and with such generated IDs I could
>>>make sure all links are indeed correct.
>>>      
>>>
>>Couldn't you just id the section (or better directory) and do a match on
>>"\\directory[id='downloads']\file"?
>>    
>>
>
>I wanted to generate the links just by giving the ID.
>  
>
I am a bit afraid of namespace pollution. If you just use a file name as 
ID, there are lots of IDs generated. And if you want to use the file 
name anyway, you can just put it into a URL. The internal linking 
feature of the generic stylesheet could be extended to evaluate 
something like href="!downloads/myFile.pdf". Admittedly a bit more 
typing than just "!myFile.pdf", but as I said: the other option is a 
huge collection of IDs around.

Another option would be doing both with optional ID generation, possibly 
with an id pattern attribute on the <entryset>. This could look like this:

  <fileset sourceFiles="*.xml" targetFiles="$1.pdf" type="docbookPDF"
           ids="pdf_$1"/>

The @ids would be optional and no ids would be generated if it is missing.
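
To make the expansion concrete, here is a hedged Java sketch of how 
such an element might be turned into individual entries. All names 
(EntrySetExpander, Entry, expand) are invented for illustration and 
nothing here is existing XWeb code. It assumes that every "*" or "?" 
in sourceFiles becomes one capture group, numbered from 1 in order of 
appearance, and that $n in targetFiles and ids refers to group n.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: expands one pattern-based element into entries. */
final class EntrySetExpander {

    /** One expanded entry; id is null when no id pattern was given. */
    static final class Entry {
        final String source;
        final String target;
        final String id;

        Entry(String source, String target, String id) {
            this.source = source;
            this.target = target;
            this.id = id;
        }
    }

    /** Turns a glob into a regex in which each wildcard is a capture group. */
    static Pattern compileGlob(String glob) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                regex.append("(.*)");
            } else if (c == '?') {
                regex.append("(.)");
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** Replaces $n (single digit) in the template with capture group n. */
    static String fill(String template, Matcher match) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < template.length(); i++) {
            char c = template.charAt(i);
            boolean isReference = c == '$' && i + 1 < template.length()
                    && Character.isDigit(template.charAt(i + 1));
            if (isReference) {
                out.append(match.group(template.charAt(i + 1) - '0'));
                i++; // skip the digit
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** Creates one entry per matching file; ids may be null (no ids generated). */
    static List<Entry> expand(List<String> files, String sourceFiles,
                              String targetFiles, String ids) {
        Pattern pattern = compileGlob(sourceFiles);
        List<Entry> entries = new ArrayList<>();
        for (String file : files) {
            Matcher match = pattern.matcher(file);
            if (match.matches()) {
                String id = (ids == null) ? null : fill(ids, match);
                entries.add(new Entry(file, fill(targetFiles, match), id));
            }
        }
        return entries;
    }
}
```

Passing null for the ids argument corresponds to leaving out @ids: the 
entries are still created, but without generated ids.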

>>Ok -- I don't think Jon is subscribed yet. Jon: can you make sure you
>>are on the list so we can all take it there? I suspect some of the old 
>>    
>>
>
>He was not on CC...
>  
>
My sentbox claims so. He should get this and we'll meet tomorrow anyway.

>>I'll go through the changes and might commit some to the trunk. After
>>that I'll post some update on the list.
>>    
>>
>
>I'll stay tuned.
>  
>
Back soon.

  Peter