[perl-xml-cvs] CVS: perl-xml-faq perl-xml-faq.xml,1.26,1.27

Update of /cvsroot/perl-xml/perl-xml-faq
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv26347

Modified Files:
	perl-xml-faq.xml 
Log Message:
- add 'quick answer - XML::LibXML' and freshen up various other answers

Index: perl-xml-faq.xml
===================================================================
RCS file: /cvsroot/perl-xml/perl-xml-faq/perl-xml-faq.xml,v
retrieving revision 1.26
retrieving revision 1.27
diff -u -d -r1.26 -r1.27

--- perl-xml-faq.xml	14 Nov 2006 08:39:28 -0000	1.26
+++ perl-xml-faq.xml	18 Mar 2008 09:11:37 -0000	1.27
@@ -15,11 +15,7 @@
     </para></authorblurb>
   </author>
   <copyright>
-    <year>2002</year>
-    <year>2003</year>
-    <year>2004</year>
-    <year>2005</year>
-    <year>2006</year>
+    <year>2002 - 2008</year>
     <holder>Grant McLean</holder>
   </copyright>
 
@@ -224,6 +220,22 @@
     </answer>
   </qandaentry>
 
+  <qandaentry id="quick_choice">
+    <question>
+      <para>The Quick Answer</para>
+    </question>
+    <answer>
+
+      <para>For general purpose XML processing with Perl, <classname>XML::LibXML</classname>
+      is usually the best choice.  It is stable, fast and powerful.  To make the most of the
+      module you need to learn and use XPath expressions.  The documentation for XML::LibXML is
+      its biggest weakness.</para>
+
+      <para>Other modules may be better suited to particular niches - as discussed below.</para>
+
+    </answer>
+  </qandaentry>
+
   <qandaentry id="tree_vs_stream">
     <question>
       <para>Tree versus stream parsers</para>
@@ -347,14 +359,13 @@
       <para>If your needs are simple, try <classname>XML::Simple</classname>.
       It's loosely classified as a tree based parser although the 'tree' is
       really just nested Perl hashes and arrays.  You may need to swot up on
-      Perl references (<command>perldoc perlreftut</command>) to take advantage
-      of this module.</para>
+      Perl references (see: <command>perldoc perlreftut</command>) to take
+      advantage of this module.</para>
 
       <para>If you're looking for a more powerful tree based approach, try
-      <classname>XML::XPath</classname>.  This module offers a DOM style API
-      with the added bonus of XPath support.  If speed is critical, you'll
-      find that <classname>XML::LibXML</classname> is much faster but a bit
-      more 'bleeding edge'.</para>
+      <classname>XML::LibXML</classname> for a standards compliant DOM or
+      <classname>XML::Twig</classname> for a more 'Perl-like' API.  Both of
+      these modules support XPath.</para>
 
       <para>If you've decided to use a stream based approach, head
       directly for SAX.  The <classname>XML::SAX</classname> distribution
@@ -365,12 +376,6 @@
       C-based parser library ('expat' by James Clark) as
       <classname>XML::Parser</classname>, for faster parsing.</para>
 
-      <para>Another option worthy of investigation is
-      <classname>XML::Twig</classname>.  This hybrid module combines the
-      convenience of a tree approach with the lower memory demands of the
-      stream style.  You configure the parser and it gives you the document in
-      chunks (bits of the tree or 'twigs').</para>
-
       <para>Finally, the latest trendy buzzword in Java and C# circles is
       'pull' parsing (see <ulink url="http://www.xmlpull.org/"
       >www.xmlpull.org</ulink>).  Unlike SAX, which 'pushes' events at your
@@ -481,26 +486,29 @@
 
       <para><classname>XML::LibXML</classname> provides a Perl wrapper around
       the GNOME Project's libxml2 library.  This module was originally written
-      by Matt Sergeant and is now actively maintained by Christian Glahn.  It
-      is very fast, complete and stable.  It can run in validating or
-      non-validating modes and offers a DOM with XPath support.  The DOM and
-      associated memory management is implemented in C which offers significant
-      performance advantages over DOM trees built from Perl datatypes.  The
-      <classname>XML::LibXML::SAX::Builder</classname> module allows a libxml2
-      DOM to be constructed from SAX events.
+      by Matt Sergeant and Christian Glahn and is now actively maintained by
+      Petr Pajas.  It is very fast, complete and stable.  It can run in
+      validating or non-validating modes and offers a DOM with XPath support.
+      The DOM and associated memory management is implemented in C which offers
+      significant performance advantages over DOM trees built from Perl
+      datatypes.  The <classname>XML::LibXML::SAX::Builder</classname> module
+      allows a libxml2 DOM to be constructed from SAX events.
       <classname>XML::LibXML::SAX</classname> is a SAX parser based on the
       libxml2 library.</para>
       
-      <para><classname>XML::LibXML</classname> can be used to parse HTML (4.0
-      strict) and SGML files into DOM structures - which is especially useful
-      when converting other formats to XML.</para>
+      <para><classname>XML::LibXML</classname> can also be used to parse HTML
+      files into DOM structures - which is especially useful when converting
+      other formats to XML or using XPath to 'scrape' data from web
+      pages.</para>
 
       <para>The libxml2 library is not part of the
-      <classname>XML::LibXML</classname> distribution.  The source is available
-      for download from <ulink url="http://xmlsoft.org">xmlsoft.org</ulink>;
-      it is a standard package in most Linux distributions; it can be compiled
-      on numerous other platforms; and it is bundled with PPM packages of
-      <classname>XML::LibXML</classname> for Windows.</para>
+      <classname>XML::LibXML</classname> distribution.  Precompiled
+      distributions of the libxml2 library and the
+      <classname>XML::LibXML</classname> Perl wrapper are available for most
+      operating systems.  The library is a standard package in most Linux
+      distributions; it can be compiled on numerous other platforms; and it is
+      bundled with PPM packages of <classname>XML::LibXML</classname> for
+      Windows.</para>
 
       <para>For early access to upcoming features such as W3C Schema and RelaxNG
       validation, you can access the CVS version of <classname>XML::LibXML</classname> at:</para>
@@ -518,13 +526,10 @@
     </question>
     <answer>
 
-      <para>Matt Sergeant's <classname>XML::XPath</classname> module provides a
-      DOM implementation (in Perl) which supports XPath queries.  It can't
-      rival <classname>XML::LibXML::SAX</classname> for speed but it may be
-      easier to install - especially if you don't have a compiler.  Parsing XML
-      documents is performed by the expat library via
-      <classname>XML::Parser</classname>.  You can serialise the DOM to SAX
-      events.</para>
+      <para>Matt Sergeant's <classname>XML::XPath</classname> module was the first
+      Perl DOM implementation to support XPath.  It has largely been supplanted by
+      <classname>XML::LibXML</classname> which is better maintained and more
+      powerful.</para>
 
     </answer>
   </qandaentry>
@@ -566,6 +571,16 @@
       module also supports building the tree from SAX events or using a simple
       Perl data structure to drive a SAX pipeline.</para>
 
+      <para>If you are using <classname>XML::Simple</classname>, you should
+      read "<ulink url="http://www.perlmonks.org/index.pl?node_id=218480">Does
+      your XML::Simple code pass the strict test?</ulink>" for a discussion of
+      common pitfalls and ways to avoid them.</para>
+
+      <para>If you are becoming frustrated by the limitations of
+      <classname>XML::Simple</classname>, see: "<ulink
+      url="http://www.perlmonks.org/index.pl?node_id=490846">Stepping up from
+      XML::Simple to XML::LibXML</ulink>".</para>
+
     </answer>
   </qandaentry>
 
@@ -588,8 +603,8 @@
       <para>Another advantage of <classname>XML::Twig</classname> is that it is
       not constrained by the tyranny of DOM compliance.  Instead, it offers a
       number of conveniences to help the experienced Perl programmer feel right
-      at home.  The official home page for <classname>XML::Twig</classname> is
-      <ulink
+      at home.  <classname>XML::Twig</classname> also supports XPath
+      expressions.  The module's official home page is <ulink
       url="http://www.xmltwig.com/">http://www.xmltwig.com/</ulink>.</para>
 
     </answer>
@@ -643,13 +658,13 @@
     </question>
     <answer>
 
-      <para>Matt Sergeant's <classname>XML::PYX</classname> comes with
-      some wrapper scripts for working with XML files using command line
-      pipelines.  The PYX notation allows you to apply commands like
-      <command>grep</command> and <command>sed</command> to specific parts of
-      the XML document (eg: element names, attribute values, text content).
-      For example, this one-liner provides a report of how many times each
-      type of element is used in a document:</para>
+      <para>Although written in Perl, Matt Sergeant's
+      <classname>XML::PYX</classname> is really designed for working with XML
+      files using shell command pipelines.  The PYX notation allows you to
+      apply commands like <command>grep</command> and <command>sed</command> to
+      specific parts of the XML document (eg: element names, attribute values,
+      text content).  For example, this one-liner provides a report of how many
+      times each type of element is used in a document:</para>
 
       <programlisting><![CDATA[
 pyx doc.xml | sed -n 's/^(//p' | sort | uniq -c