Hi check team,
I wanted to report that strings written to the XML test output are not correctly sanitized for XML markup. This results in an invalid XML file any time a string contains markup. For example, if "sip:foo@bar.com" is written to the XML, it will break any XML parsing.
Thanks for your interest in Check!
What version of Check are you using? There is a unit test
http://sourceforge.net/p/check/code/HEAD/tree/trunk/tests/test_xml_output.sh
in Check that attempts to verify that using any of the characters that
should be escaped in XML does result in valid XML. The test dates back to
2012, so I would expect versions of Check prior to 0.9.9 may not properly
escape XML.
According to the XML 1.1 specification http://www.w3.org/TR/xml11/#syntax
the following characters must be escaped: " ' < > &. The part of the unit
test mentioned above that outputs characters which must be escaped is the
following:
s = suite_create("XML escape \" ' < > & tests");
tc = tcase_create("description \" ' < > &");
...
START_TEST(test_xml_esc_fail_msg)
{
}
END_TEST
And this results in the following XML:
...
<description>description &quot; &apos; &lt; &gt; &amp;</description>
<message>fail sip:foo@bar.com &quot; &apos; &lt; &gt; &amp;
message</message>
...
If the example you mention, sip:foo@bar.com, were added to these strings,
the result would be the following XML:
...
<description>description &quot; &apos; &lt; &gt; &amp; sip:foo@bar.com
</description>
<message>fail &quot; &apos; &lt; &gt; &amp; message sip:foo@bar.com
</message>
...
which according to xmllint is valid. Note that none of the characters
mentioned in your email address needed to be escaped.
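For reference, escaping the five characters listed above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not Check's actual implementation: the names xml_entity_for and xml_escape are hypothetical, and the input is assumed to be a NUL-terminated ASCII string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Check's API: returns the predefined XML
 * entity for one of the five special characters, or NULL if c can be
 * written as-is. */
static const char *xml_entity_for(char c)
{
    switch (c) {
    case '"':  return "&quot;";
    case '\'': return "&apos;";
    case '<':  return "&lt;";
    case '>':  return "&gt;";
    case '&':  return "&amp;";
    default:   return NULL;
    }
}

/* Writes an escaped copy of src into dst, whose capacity dst_size includes
 * the terminating NUL. Returns 0 on success, -1 if dst is too small. */
static int xml_escape(const char *src, char *dst, size_t dst_size)
{
    size_t used = 0;

    if (dst_size == 0)
        return -1;

    for (; *src != '\0'; src++) {
        const char *entity = xml_entity_for(*src);
        size_t len = entity ? strlen(entity) : 1;

        if (used + len + 1 > dst_size)
            return -1;          /* no room for this piece plus the NUL */
        if (entity)
            memcpy(dst + used, entity, len);
        else
            dst[used] = *src;
        used += len;
    }
    dst[used] = '\0';
    return 0;
}
```

With this sketch, "sip:foo@bar.com" passes through unchanged, since none of its characters is one of the five that need escaping.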
Do you have a minimal example which shows that Check is able to produce
invalid XML? If so, kindly send it over, as it would be helpful in reproducing
and resolving the issue.
Thanks!
Branden
On Thu, Feb 26, 2015 at 3:24 AM, Jonathan Martin homeroth@users.sf.net
wrote:
Related Bugs: #103
I'm seeing very similar behaviour related to illegal characters. As part of an assert, we log both the expected value and the actual value. Occasionally, this results in illegal characters being printed in the xml.
According to the XML 1.1 spec, the following characters are illegal:
http://www.w3.org/TR/xml11/#charsets
Is there any support for stripping these characters?
Thanks,
Ben
Perhaps this is really a request for Check to support XML 1.1. Check
currently supports XML 1.0.
If there is interest in adding 1.1 support, we would gladly accept a patch
against Check to do so. Otherwise, a feature request can be opened for
adding XML 1.1 support. Or, do you believe that Check's support of XML 1.0
does not conform to the spec?
On Tue, Jul 28, 2015 at 11:10 AM, InfernoZeus infernozeus@users.sf.net
wrote:
(Sorry for the slow response, I've been on holiday the past week!)
I believe the same restriction is present in the XML 1.0 spec, although that doesn't explicitly list illegal characters. Here's what it lists as legal:
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
The same characters (below #x20 at least) are illegal, they're just not explicitly listed.
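To make the rule concrete, the set of legal characters in XML 1.0 (the Char production in the spec) can be written as a small predicate. This is a sketch for illustration only; is_xml10_char is a hypothetical name, not a function in Check, and it takes a Unicode code point rather than a raw byte.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of Check: returns true if the Unicode code
 * point cp is allowed by the XML 1.0 Char production, i.e. tab, LF, CR, or
 * a code point in one of the three permitted ranges. */
static bool is_xml10_char(uint32_t cp)
{
    return cp == 0x9 || cp == 0xA || cp == 0xD
        || (cp >= 0x20 && cp <= 0xD7FF)
        || (cp >= 0xE000 && cp <= 0xFFFD)
        || (cp >= 0x10000 && cp <= 0x10FFFF);
}
```

Under this predicate every control character below #x20 other than tab, LF and CR is rejected, which matches the behaviour described above.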
Think I understand what you mean now. As of r1225 Check should now either
properly escape XML characters or omit them if they are not valid. Check
assumes that the strings being logged are ASCII (as opposed to UTF-8,
UTF-16, UTF-32, etc). That is, if the strings being logged are not ASCII
they will be encoded in an unexpected way.
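For anyone who wants to sanitize strings on their side before handing them to an assert message, the stripping step could look roughly like the following. This is a sketch and not the code from r1225; strip_non_xml_ascii is a hypothetical name. It assumes ASCII input and, as a simplification of its own, drops every byte outside tab, LF, CR and printable ASCII, including any non-ASCII bytes.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Check: removes, in place, every byte
 * that is not tab, LF, CR, or printable ASCII (0x20..0x7E).
 * Returns the length of the resulting string. */
static size_t strip_non_xml_ascii(char *s)
{
    size_t out = 0;

    for (size_t in = 0; s[in] != '\0'; in++) {
        unsigned char c = (unsigned char)s[in];

        if (c == 0x9 || c == 0xA || c == 0xD || (c >= 0x20 && c <= 0x7E))
            s[out++] = s[in];   /* keep: legal in XML 1.0 and plain ASCII */
        /* anything else is silently dropped */
    }
    s[out] = '\0';
    return out;
}
```

The five markup characters are kept by this step and would still need to be escaped when the string is written to the XML.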
Does this help resolve the issue you are having? If so, great! If not, can
you provide a minimal example where Check is not properly outputting valid
XML?
On Wed, Aug 12, 2015 at 5:25 AM, InfernoZeus infernozeus@users.sf.net
wrote:
Thanks Branden, that looks perfect!
Ben