My problem is that there is not just one HTML source document (my extension already cleans up such cases fairly well using Firefox's own DOM parser and my own code), nor is the source known to the application beforehand. What I was hoping to do is allow any number of runtime XQuery calls from the user, such as doc('http://yahoo.com'), and have those HTML documents cleaned up, ideally while the XQuery is being evaluated. I'm sorry if it should be obvious how to do this, but I'm pretty new to Java, etc.

thanks,
Brett

Michael Kay wrote:
I mentioned how to do it via the command line, but of course there's underlying support in the Java API as well. The approach that gives most control is to supply the Source of the transformation as a SAXSource; this contains an XMLReader which can of course be an instance of the TagSoup XMLReader. Using this approach you can configure the TagSoup XMLReader any way you like before invoking Saxon to do the transformation.
 
Michael Kay
http://www.saxonica.com/
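
A minimal sketch of the SAXSource approach described above, using only standard JAXP classes so that it runs on a plain JDK. The class name SaxSourceSketch and the helper newReader() are hypothetical names chosen for illustration. The sketch assumes that, with TagSoup on the classpath, newReader() would return an org.ccil.cowan.tagsoup.Parser instead of the JDK's strict XML parser, and that Saxon would be selected as the TransformerFactory; neither is required for the example to run, and an identity transformation stands in for a real stylesheet.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

final class SaxSourceSketch {

    // Hypothetical helper: supplies the XMLReader that will parse the input.
    // Assumption: to clean up bad HTML, this would instead return
    //   new org.ccil.cowan.tagsoup.Parser()
    // configured however you like before it is handed to the transformer.
    static XMLReader newReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newSAXParser().getXMLReader();
    }

    // Parses the markup with the chosen XMLReader and serializes it again
    // through an identity transformation.
    static String transform(String markup) throws Exception {
        XMLReader reader = newReader();

        // The SAXSource pairs the parser with the input it should read.
        SAXSource source =
                new SAXSource(reader, new InputSource(new StringReader(markup)));

        // No stylesheet is given, so this Transformer copies input to output.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.METHOD, "xml");
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        identity.transform(source, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform("<doc><item>one</item></doc>"));
    }
}
```

Because the transformer only sees the SAXSource, swapping the parser is confined to newReader(); the rest of the code does not need to know whether the input was well-formed XML or HTML repaired on the fly.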


From: saxon-help-bounces@lists.sourceforge.net [mailto:saxon-help-bounces@lists.sourceforge.net] On Behalf Of Brett Zamir
Sent: 18 January 2008 11:05
To: Mailing list for SAXON XSLT queries
Subject: Re: [saxon] Feature request

Hello all,

Thanks for the help.

My interest is in integrating TagSoup/Tidy with Saxon-B in an open-source application (a Firefox extension), and I have no idea how, or whether, I could script the command line alongside using the API.

I do see that the TagSoup site has a combined version, but it says that due to a bug in Java 5 or 6, one must use Saxon 6.5.5... Not sure what trade-offs that entails, though I'd of course like to incorporate the latest version if possible...

thanks,
Brett

Michael Kay wrote:
TagSoup has a version of Saxon which incorporates the TagSoup
parser; you can use 'saxon' as normal, and get the benefits
of treating bad HTML as well-formed XML.
    

In fact, you can use the standard Saxon distribution: it's possible to
nominate TagSoup as the XML source parser using the -x option on the Saxon
command line.

Michael Kay
http://www.saxonica.com/


-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
saxon-help mailing list
saxon-help@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/saxon-help

  

