From: William F. D. <wil...@th...> - 2005-04-15 16:14:04
On Sat, 2005-03-26 at 16:38 +0100, Martin Quinson wrote:
> On Fri, Mar 25, 2005 at 03:44:23PM -0500, William F. Dowling wrote:
> > Hi Martin,
> >
> > I have a question for you. Have you ever used xslt? What do you think
> > of it? I am working on a presentation (sort of a sales pitch) for some
> > people I work with, on flexml. Part of my talk will be a comparison
> > with xslt.

FWIW, I did a comparison between flexml and xsltproc. The test was to
extract the pcdata from a couple of different elements. The DTD has about
100 elements, a similar total number of attributes, and (I don't have a
specific measurement here) is not very complex -- generally simple content
models. The "document" was 128MB; the root element in the DTD is

  <!ELEMENT db (header | issue | item | ref)+>

and there are about 150K (header | issue | item | ref) elements in the
test document.

Two points: xsltproc was more than 200 times slower on my box for this
task than the flexml-generated program, and the flexml-generated program
was 20 times slower than an egrep scan. Regarding the latter point, I
suspect the problem is inherent in the flexml parse model. That *I assume*
is LL(1), correct? Why else would I have had to increase my stack size to
20M (10M was not big enough for my file)? Is there an LR or LALR parser
for XML out there?

Will

Stats
-----

# get a baseline -- scan the file with egrep to find authors
egrep '<(primary)?author[^s]' /proj/data/WoS.2004000109 > xxx
0.38s user 0.38s system 93% cpu 0.814 total

# use the flexml-generated 'authors' program
./authors < /proj/data/WoS.2004000109 > xxx
7.15s user 0.85s system 93% cpu 8.521 total

# use a comparable xslt stylesheet
xsltproc -o xxx testauth.xsl /proj/data/WoS.2004000109
1618.41s user 11.26s system 63% cpu 42:52.44 total

# How big was that input, anyway?
wc /proj/data/WoS.2004000109
3098636 10465719 128457959 /proj/data/WoS.2004000109

--
William F Dowling
wil...@th...
www.isinet.com
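[Editor's note: the testauth.xsl stylesheet used for the xsltproc timing
above is not included in the message. As a rough, hypothetical sketch of
what a comparable stylesheet could look like, the following XSLT 1.0
stylesheet prints the pcdata of author elements, one per line; the element
names 'author' and 'primaryauthor' are only inferred from the egrep
pattern in the stats, not confirmed by the poster.]

<?xml version="1.0"?>
<!-- Hypothetical sketch; not the actual testauth.xsl from the message. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Print the character data of each (primary)author element,
       followed by a newline. Element names are assumptions inferred
       from the egrep pattern '<(primary)?author[^s]'. -->
  <xsl:template match="author | primaryauthor">
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Suppress the built-in rule that would copy all other text nodes,
       so only the matched pcdata reaches the output. -->
  <xsl:template match="text()"/>
</xsl:stylesheet>

[Run, for example, as: xsltproc -o xxx authors.xsl input.xml -- where
authors.xsl and input.xml are placeholder file names.]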