#103 XML output generated is not valid

Group: v1.0 (example)
Status: closed-works-for-me
Owner: nobody
Labels: None
Priority: 5
Updated: 2015-09-14
Created: 2015-02-26
Private: No

Hi check team,

I wanted to report that strings written to the XML test output are not correctly sanitized for XML markup. This results in an invalid XML file any time a string contains markup. For example, if "sip:foo@bar.com" is written to the XML, it will break XML parsing.

Related

Bugs: #103

Discussion

  • Branden Archer

    Branden Archer - 2015-03-15

    Thanks for your interest in Check!

    What version of Check are you using? There is a unit test
    http://sourceforge.net/p/check/code/HEAD/tree/trunk/tests/test_xml_output.sh
    in Check that attempts to verify that using any of the characters that
    should be escaped in XML does result in valid XML. The test dates back to
    2012, so I would expect versions of Check prior to 0.9.9 may not properly
    escape XML.

    According to the XML 1.1 specification http://www.w3.org/TR/xml11/#syntax
    the following characters must be escaped: " ' < > &. The part of the unit
    test mentioned above that outputs characters which must be escaped is the
    following:

    s = suite_create("XML escape \" ' < > & tests");

    tc = tcase_create("description \" ' < > &");
    ...

    START_TEST(test_xml_esc_fail_msg)
    {
        ck_abort_msg("fail \" ' < > & message");
    }
    END_TEST

    And this results in the following XML:

    <title>XML escape &quot; &apos; &lt; &gt; &amp; tests</title>
    ...

    <description>description &quot; &apos; &lt; &gt; &amp;</description>

    <message>fail &quot; &apos; &lt; &gt; &amp; message</message>
    ...

    If the example you mention, sip:foo@bar.com, were added to these strings,
    the result would be the following XML:

    <title>XML escape &quot; &apos; &lt; &gt; &amp; tests sip:foo@bar.com</title>
    ...

    <description>description &quot; &apos; &lt; &gt; &amp; sip:foo@bar.com</description>

    <message>fail &quot; &apos; &lt; &gt; &amp; message sip:foo@bar.com</message>
    ...

    which according to xmllint is valid. Note that none of the characters
    in the address you mention need to be escaped.

    Do you have a minimal example which shows that Check is able to produce
    invalid XML? If so, kindly send it over, as it would be helpful in
    reproducing and resolving the issue.
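
The escaping described above can be sketched in a few lines of C. This is only an illustration of the technique, not Check's actual implementation: the names xml_entity and xml_escape are hypothetical, and the sketch assumes ASCII input and a caller-supplied output buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Check's API: returns the entity for a
 * character that must be escaped, or NULL if it can be copied as-is. */
static const char *xml_entity(char c)
{
    switch (c) {
    case '"':  return "&quot;";
    case '\'': return "&apos;";
    case '<':  return "&lt;";
    case '>':  return "&gt;";
    case '&':  return "&amp;";
    default:   return NULL;
    }
}

/* Copies src into dst, replacing the five special characters with their
 * entities. Returns the number of bytes written (excluding the terminator),
 * or (size_t)-1 if dst is too small. */
static size_t xml_escape(const char *src, char *dst, size_t dst_size)
{
    size_t used = 0;

    assert(src != NULL && dst != NULL);

    for (; *src != '\0'; src++) {
        const char *entity = xml_entity(*src);
        size_t n = entity ? strlen(entity) : 1;

        if (used + n >= dst_size)
            return (size_t)-1;      /* no room for this piece plus '\0' */

        if (entity)
            memcpy(dst + used, entity, n);
        else
            dst[used] = *src;
        used += n;
    }

    if (used >= dst_size)
        return (size_t)-1;          /* no room for the terminator */
    dst[used] = '\0';
    return used;
}
```

With this sketch, a string such as sip:foo@bar.com passes through unchanged, because it contains none of the five characters.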

    Thanks!

    Branden

    On Thu, Feb 26, 2015 at 3:24 AM, Jonathan Martin homeroth@users.sf.net
    wrote:


    Status: open
    Group: v1.0 (example)
    Created: Thu Feb 26, 2015 08:24 AM UTC by Jonathan Martin
    Last Updated: Thu Feb 26, 2015 08:24 AM UTC
    Owner: nobody


  • Branden Archer

    Branden Archer - 2015-06-14
    • status: open --> closed-works-for-me
     
  • InfernoZeus

    InfernoZeus - 2015-07-28

    I'm seeing very similar behaviour related to illegal characters. As part of an assert, we log both the expected value and the actual value. Occasionally, this results in illegal characters being printed in the xml.

    According to the XML 1.1 spec, the following characters are illegal:

    [#x1-#x8] | [#xB-#xC] | [#xE-#x1F] | [#x7F-#x84] | [#x86-#x9F]

    http://www.w3.org/TR/xml11/#charsets

    Is there any support for stripping these characters?

    Thanks,
    Ben

     
    • Branden Archer

      Branden Archer - 2015-07-31

      Perhaps this is really a request for Check to support XML 1.1. Check
      currently supports XML 1.0.

      If there is interest in adding 1.1 support, we would gladly accept a patch
      against Check to do so. Otherwise, a feature request can be opened for
      adding XML 1.1 support. Or, do you believe that Check's support of XML 1.0
      does not conform to the spec?

      - Branden


  • InfernoZeus

    InfernoZeus - 2015-08-12

    (Sorry for the slow response, I've been on holiday the past week!)

    I believe the same restriction is present in the XML 1.0 spec, although that doesn't explicitly list illegal characters. Here's what it lists as legal:

    Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

    The same characters (below #x20 at least) are illegal, they're just not explicitly listed.
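
As a rough illustration, the Char production quoted above translates directly into a range check in C. The function names below are hypothetical, and count_illegal_bytes treats each byte as one code point, which only holds for ASCII or Latin-1 strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if cp is matched by the XML 1.0 Char production:
 * #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] */
static bool is_xml10_char(uint32_t cp)
{
    return cp == 0x9 || cp == 0xA || cp == 0xD
        || (cp >= 0x20 && cp <= 0xD7FF)
        || (cp >= 0xE000 && cp <= 0xFFFD)
        || (cp >= 0x10000 && cp <= 0x10FFFF);
}

/* Counts the bytes in s that are not legal XML 1.0 characters.
 * Assumption: one byte is one code point (ASCII or Latin-1 input). */
static size_t count_illegal_bytes(const char *s)
{
    size_t count = 0;

    assert(s != NULL);

    for (; *s != '\0'; s++) {
        if (!is_xml10_char((unsigned char)*s))
            count++;
    }
    return count;
}
```

Everything below #x20 other than tab, line feed and carriage return fails the check, which matches the point made above.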

     
    • Branden Archer

      Branden Archer - 2015-09-14

      Think I understand what you mean now. As of r1225, Check should either
      properly escape XML characters or omit them if they are not valid. Check
      assumes that the strings being logged are ASCII (as opposed to UTF-8,
      UTF-16, UTF-32, etc). That is, if the strings being logged are not ASCII,
      they will be encoded in an unexpected way.
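
For readers who want to pre-filter strings themselves, a minimal sketch of dropping invalid characters under the same ASCII assumption might look like the following. It is not the code from r1225; strip_invalid_xml_ascii is a hypothetical name, and the filter is deliberately stricter than XML 1.0 requires, since it also drops DEL and every byte above 0x7E.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: removes, in place, every byte that is not a tab,
 * line feed, carriage return or printable ASCII character (0x20-0x7E).
 * Returns the new length of the string. */
static size_t strip_invalid_xml_ascii(char *s)
{
    size_t r, w = 0;

    assert(s != NULL);

    for (r = 0; s[r] != '\0'; r++) {
        unsigned char c = (unsigned char)s[r];

        if (c == 0x9 || c == 0xA || c == 0xD || (c >= 0x20 && c <= 0x7E))
            s[w++] = s[r];          /* keep: legal and plain ASCII */
        /* otherwise the byte is skipped */
    }
    s[w] = '\0';
    return w;
}
```

The filtered string would still need the five special characters escaped before being written into an XML document.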

      Does this help resolve the issue you are having? If so, great! If not, can
      you provide a minimal example where Check is not properly outputting valid
      XML?

      - Branden


  • InfernoZeus

    InfernoZeus - 2015-09-14

    Thanks Branden, that looks perfect!

    Ben

     
