When evaluating this profile,
<?xml version="1.0" encoding="UTF-8"?>
<config>
<var-def name="foo">
<template><![CDATA[<html><head><link src="foo.css"></head><body></body><br></html>]]></template>
</var-def>
<xpath expression="/">
<html-to-xml>
<var name="foo" />
</html-to-xml>
</xpath>
</config>
html-to-xml returns this:
<?xml version="1.0"?>
<html><head><link src="foo.css"/></head><body><br/></body></html>
but xpath returns this:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<link src="foo.css">
</head>
<body><br></body>
</html>
How come? I'd expect the XPath Processor to return this (shouldn't it also return the <?xml version="1.0"?> declaration?):
<html><head><link src="foo.css"/></head><body><br/></body></html>
It doesn't even return XML anymore...
In case anyone is interested:
I fixed this issue by adding:
> props.setProperty(OutputKeys.METHOD, "xml");
to the list of properties in org.webharvest.utils.CommonUtils.serializeItem().
This forces Saxon's Transformer to output XML, regardless of which input tags it encounters.
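For anyone who wants to see the effect outside Web-Harvest, here is a minimal sketch that uses only the standard javax.xml.transform API. It is not the actual serializeItem() code: the class name SerializeAsXml and the helper serializeAsXml() are made up for illustration, and it runs against whatever Transformer implementation is on the classpath rather than Saxon specifically. As far as I can tell, the behaviour above comes from the XSLT output rules: when no output method is set and the document element is named html, the serializer defaults to the html method, which leaves empty tags unclosed and adds the Content-Type meta tag. Setting the method explicitly avoids that.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Properties;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class SerializeAsXml {

    // Hypothetical helper (not Web-Harvest code): serializes well-formed XML
    // through an identity Transformer with the output method forced to "xml".
    static String serializeAsXml(String xml) {
        try {
            Properties props = new Properties();
            // The same property the fix adds in serializeItem().
            props.setProperty(OutputKeys.METHOD, "xml");

            // newTransformer() without a stylesheet gives an identity transform.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperties(props);

            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Serialization failed", e);
        }
    }

    public static void main(String[] args) {
        String input =
            "<html><head><link src=\"foo.css\"/></head><body><br/></body></html>";
        System.out.println(serializeAsXml(input));
    }
}
```

With the method set to "xml", the link and br elements should come out self-closed and no meta tag should be added. Commenting out the setProperty line is a quick way to compare against the default behaviour of your Transformer implementation.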
Thanks, fuero.