From: Vincent L. <vin...@an...> - 2001-02-08 11:49:57
-----Original Message-----
From: dai...@eu... [mailto:dai...@eu...]
Sent: Thursday, February 08, 2001 9:13 AM
To: Vincent Loach
Cc: xml...@li...
Subject: Re: [XML::XSLT] getting XML declaration and DOCTYPE to work

Although I was the one that wrote the initial versions, I cannot help
with the function serve. (Who wrote that one!!??)

Forward this message to xml...@li..., maybe one of the current
developers has time to look into your problem.

I can at least tell this. The last version I delivered didn't support
the <xsl:output> tag...

Greets,
Geert

Quoting Vincent Loach <vin...@an...>:

>
> A note about my recent experiences with XML::XSLT.
>
> Ananova wants to run arbitrary translations on XML documents, but unlike
> most XSL users she wants to transform XML from one format to another (i.e.
> two different DTDs with a common subset of conceptual items). She also wants
> 'non-programmers' to do the transformations, so XSL seemed a good choice. A
> programmer (me) will write a high level mechanism to process the
> transformations. Ananova prefers to do this kind of thing in Perl.
>
> Here are some very noddy test documents (yes, I know they produce invalid
> XML):
>
> p.xml:
> <?xml version="1.0" encoding="UTF-8"?>
> <test><p>test para</p></test>
>
> x.xsl:
> <?xml version="1.0" ?>
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>     version="1.0">
>   <xsl:output method="xml" omit-xml-declaration="no"
>       doctype-system="http://content.ananova.com/dtd/story.dtd" />
>   <xsl:template match="/">
>     <story>
>       <xsl:value-of select="/test/p"/>
>     </story>
>   </xsl:template>
> </xsl:stylesheet>
>
> If we run that using a Xalan XSLT mechanism:
> java org.apache.xalan.xslt.Process -in p.xml -xsl x.xsl -out ./test
>
> we get:
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE story SYSTEM "http://content.ananova.com/dtd/story.dtd">
> <story>test para</story>
>
> which is reasonable.
>
>
> I then write a simple Perl program (txslt.pl):
> use XML::XSLT;
>
> $xsl = shift( @ARGV );
> $xmlfile = shift( @ARGV );
>
> my $xslt = XML::XSLT->new( $xsl );
>
> $xslt->transform( $xmlfile );
> print $xslt->toString;
>
> $xslt->dispose();
>
> to test things out. This produces:
>
> <story>
> test para
> </story>
>
> with no XML header or DOCTYPE declaration, which is a problem for Ananova.
>
>
> Eventually, I work out that if txslt.pl uses:
>
> print $xslt->serve( $xmlfile, xml_declaration => 1, http_headers => 0 );
>
> Then I can get:
>
> <?xml version="1.0" encoding="UTF-8"?>
>
> <story>
> test para
> </story>
>
> which still does not have a DOCTYPE.
>
> I looked at the examples, mailing list archive and tried various things
> (using CDATA), and I *could* get a DOCTYPE out, as long as I didn't mind it
> being "&lt;!DOCTYPE", which still meant post-processing the output.
>
> I looked at the source code and noticed that a deprecated subroutine
> "print_output" seemed to do something similar to what I wanted. I could not
> see what looked like equivalent code in "serve" though.
>
> So, eventually, I hacked around with the "serve" subroutine to create
> something like what I want. This will work for Ananova, but I haven't really
> considered the overall effects of this on *other* types (i.e. HTML) of
> processing.
> Here's my hacked version:
>
> sub serve {
>     my $self = shift;
>     my $class = ref $self || croak "Not a method call";
>     my %args = $self->__parse_args(@_);
>     my $ret;
>     my $extra;
>
>     $args{http_headers} = 1 unless defined $args{http_headers};
>     $args{xml_declaration} = 1 unless defined $args{xml_declaration};
>     $args{xml_version} = "1.0" unless defined $args{xml_version};
>     $args{doctype} = "SYSTEM" unless defined $args{doctype};
>     $args{clean} = 0 unless defined $args{clean};
>
>     $ret = $self->transform($args{Source})->toString;
>
>     if($args{clean}) {
>         eval {require HTML::Clean};
>
>         if($@) {
>             CORE::warn("Not passing through HTML::Clean -- install the module");
>         } else {
>             my $hold = HTML::Clean->new(\$ret);
>             $hold->strip;
>             $ret = ${$hold->data};
>         }
>     }
>
>     if($args{xml_declaration}) {
>         $extra = '<?xml version="' . $args{xml_version} .
>                  '" encoding="UTF-8"?>' . "\n";
>     }
>
>     if($args{http_headers}) {
>         $extra = "Content-Type: " . $self->media_type . "\n" .
>                  "Content-Length: " . length($ret) . "\n\n";
>     }
>
>     if ( $self->{METHOD} eq 'xml' ) {
>         if ($self->{DOCTYPE_SYSTEM}) {
>             my $root_name =
>                 $self->{RESULT_DOCUMENT}->getElementsByTagName('*',0)->item(0)->getTagName;
>             if ($self->{DOCTYPE_PUBLIC}) {
>                 $extra .= qq{<!DOCTYPE $root_name PUBLIC "} . $self->{DOCTYPE_PUBLIC} .
>                           qq{" "} . $self->{DOCTYPE_SYSTEM} . qq{">\n};
>             } else {
>                 $extra .= qq{<!DOCTYPE $root_name SYSTEM "} . $self->{DOCTYPE_SYSTEM} .
>                           qq{">\n};
>             }
>         }
>     }
>
>     return $extra . $ret;
> }
>
> (I can't quite see what $args{doctype} really achieves in the above).
>
> So now I get:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE story SYSTEM "http://content.ananova.com/dtd/story.dtd">
>
> <story>
> test para
> </story>
>
> which Ananova likes.
>
> I don't know if anyone is interested in the above, but I need to write it up
> anyway to add to notes here to explain why the local version of XML::XSLT
> has been modified.
>
> The only thing bothering me now is that I may have missed something really
> obvious and worked around the houses, when all I needed to know was one
> extra option to pass in somewhere. I'm sure someone will let me know :-)
>
> Vincent Loach
> Programmer
>
> Direct Line: +44 (0)113 367 4513
> Switchboard: +44 (0)845 121 6060
> Fax: +44 (0)845 121 5060
> Mobile: +44 (0)7977 171135
> Email: vin...@an...
>
> http://www.ananova.com
>
> Ananova Ltd
> Marshall Mill
> Marshall Street
> Leeds LS11 9YJ
>
> Registered Office:
> St James Court
> Great Park Road
> Almondsbury Park
> Bradley Stoke
> Bristol BS32 4QJ
> Registered in England No.2858918
>
> The information transmitted is intended only for the person or entity to
> which it is addressed
> and may contain confidential and/or privileged material. Any review,
> retransmission,
> dissemination or other use of, or taking of any action in reliance upon,
> this information by
> persons or entities other than the intended recipient is prohibited. If you
> receive this in
> error, please contact the sender and delete the material from any computer.
>
>
> _______________________________________________
> Xmlxslt-discuss mailing list
> Xml...@li...
> http://lists.sourceforge.net/lists/listinfo/xmlxslt-discuss
>
>
>
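
For anyone who would rather not carry a locally patched XML::XSLT, the post-processing route mentioned in the quoted message can be sketched in plain Perl. This is only a rough sketch under stated assumptions: add_prolog is a hypothetical helper, not part of XML::XSLT, and it assumes the serialised output is already in hand as a string (for example from toString) and does not yet carry a declaration or DOCTYPE of its own. It takes the root element name from the first start tag in the output, much as the hacked serve takes it from the result document.

```perl
use strict;
use warnings;

# Hypothetical helper (NOT part of XML::XSLT): prepend an XML declaration
# and, if a system identifier is given, a DOCTYPE to serialised output.
sub add_prolog {
    my ($body, %opt) = @_;
    my $version  = defined $opt{version}  ? $opt{version}  : '1.0';
    my $encoding = defined $opt{encoding} ? $opt{encoding} : 'UTF-8';

    # The root element name is taken from the first start tag in the output.
    my ($root) = $body =~ /<([A-Za-z_][\w.:-]*)/
        or die "no root element found in output\n";

    my $prolog = qq{<?xml version="$version" encoding="$encoding"?>\n};
    if (defined $opt{system}) {
        $prolog .= defined $opt{public}
            ? qq{<!DOCTYPE $root PUBLIC "$opt{public}" "$opt{system}">\n}
            : qq{<!DOCTYPE $root SYSTEM "$opt{system}">\n};
    }
    return $prolog . $body;
}

# Usage with the output shown in the message above.
print add_prolog(
    "<story>\ntest para\n</story>\n",
    system => 'http://content.ananova.com/dtd/story.dtd',
);
```

The trade-off is that the DOCTYPE values are passed in by the caller rather than read from the stylesheet's <xsl:output>, so they have to be kept in step with the stylesheet by hand.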