
Another parse bug (similar to 1307032)?

igorv007
2005-10-15
2013-05-20
  • igorv007

    igorv007 - 2005-10-15

    The following snippet shows how I create xml:

    TiXmlDocument xmlDoc;
    TiXmlElement * pElement;
    TiXmlElement * pItem;
    TiXmlText * pText;

    TiXmlElement * pRoot = new TiXmlElement("ROOT");
    xmlDoc.LinkEndChild(pRoot);

    for (int i=0; i<2; i++)
    {
        // Add a <BOOK> node
        pElement = new TiXmlElement("BOOK");
        pRoot->LinkEndChild(pElement);
        // Add an ID attribute
        pElement->SetAttribute("ID", "1001");

        // Add a <TITLE> sub-node
        pItem = new TiXmlElement("TITLE");
        pElement->LinkEndChild(pItem);
        pText = new TiXmlText("title");
        pItem->LinkEndChild(pText);

        // Add a <DESCRIPTION> sub-node
        pItem = new TiXmlElement("DESCRIPTION");
        pElement->LinkEndChild(pItem);
        pText = new TiXmlText("CDATA test");
        pText->SetCDATA(true);
        pItem->LinkEndChild(pText);
    }

    // Save the file
    bool bRes = xmlDoc.SaveFile("MyTest.xml");

    This is how it was saved:

    <ROOT VERSION="1">
        <BOOK ID="1001">
            <TITLE>title</TITLE>
            <DESCRIPTION>
                <![CDATA[
    CDATA test
                ]]>
    </DESCRIPTION>
        </BOOK>
    </ROOT>

    Looks like extra \r\n are added inside CDATA section.

    Thanks.

     
    • Lee Thomason

      Lee Thomason - 2005-10-18

      ...I'm not sure that is a bug, even by the XML spec. EOL is normalized on reading the file *before parsing.* TinyXML writes the XML file with the current system EOL, but this is guaranteed to be correct because the EOL is normalized on input. (The W3C spec can be circular.)

      The point being, there is no guarantee about the EOL and I don't think there is any way to preserve them in CDATAs.

      It is very easy for me to make a mistake interpreting that spec, and I would welcome comment.

      lee
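
The end-of-line rule referred to above (XML 1.0, section 2.11) can be sketched as follows. This is a minimal illustration, not TinyXML's code: `normalizeEol` is a hypothetical helper name, and it assumes the whole input is already in memory as a byte string. It maps every CRLF pair and every lone CR to a single LF, which is what a conforming processor does before parsing.

```cpp
#include <string>

// Hypothetical helper, not part of TinyXML: applies the XML 1.0
// end-of-line rule (section 2.11) to raw input before parsing.
std::string normalizeEol(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        if (in[i] == '\r') {
            out += '\n';
            // Skip the LF of a CRLF pair so the pair maps to one '\n'.
            if (i + 1 < in.size() && in[i + 1] == '\n') {
                ++i;
            }
        } else {
            out += in[i];
        }
    }
    return out;
}
```

Under this rule a writer may emit the platform's EOL, but text that contains no line break at all, such as `10/13/2005 04:36 PM`, passes through unchanged.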

       
    • igorv007

      igorv007 - 2005-10-19

      I am not an expert, but thinking logically:
      Let's say I want to save the string "10/13/2005 04:36 PM" in a CDATA section. Obviously I'd want to save it like <![CDATA[10/13/2005 04:36 PM]]>, without any extra characters, right? More importantly, I'd want to retrieve it as "10/13/2005 04:36 PM" without any extra EOL. But that does not seem to be happening.
      It's retrieved as

      10/13/2005 04:36 PM
               

      You said "The point being, there is no guarantee about the EOL and I don't think there is any way to preserve them in CDATAs", but you misinterpreted my point. My point is that I want to retrieve the string exactly the same way I saved it, regardless of what characters it contains. If it has an EOL then please keep it; if it does not, then please do not add one like TinyXml does. Btw, MS XML behaves this way: it does not add or remove EOL characters, but it keeps them if there are any (I understand that MS apps are hardly the standard; it's just to show that it can be done).
      What do you think?

      Thanks,
      Igor.
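
The round-trip expectation described in this post can be stated as a small self-contained sketch. `writeCData` and `readCData` are hypothetical helpers written for illustration only; they are not TinyXML functions and say nothing about how TinyXML is implemented. The writer places the text directly between the CDATA delimiters with no indentation or line breaks added, and splits any literal `]]>` across two sections, since that sequence cannot appear inside one. The reader concatenates the section contents byte for byte.

```cpp
#include <string>

// Hypothetical helpers for illustration; these are not TinyXML functions.
static const std::string kCDataOpen = "<![CDATA[";
static const std::string kCDataClose = "]]>";

// Wrap text in CDATA with nothing added around it. A literal "]]>"
// cannot appear inside a section, so it is split across two sections.
std::string writeCData(const std::string& text)
{
    std::string out = kCDataOpen;
    std::string::size_type pos = 0;
    std::string::size_type hit;
    while ((hit = text.find(kCDataClose, pos)) != std::string::npos) {
        // Emit everything up to and including "]]", then close the
        // section, reopen it, and continue from the ">".
        out.append(text, pos, hit - pos + 2);
        out += kCDataClose + kCDataOpen;
        pos = hit + 2;
    }
    out.append(text, pos, std::string::npos);
    return out + kCDataClose;
}

// Concatenate the contents of all CDATA sections in xml, unchanged.
std::string readCData(const std::string& xml)
{
    std::string out;
    std::string::size_type pos = 0;
    while ((pos = xml.find(kCDataOpen, pos)) != std::string::npos) {
        pos += kCDataOpen.size();
        const std::string::size_type end = xml.find(kCDataClose, pos);
        if (end == std::string::npos) {
            break;  // unterminated section: stop rather than guess
        }
        out.append(xml, pos, end - pos);
        pos = end + kCDataClose.size();
    }
    return out;
}
```

With these helpers, `readCData(writeCData(s))` gives back `s` for any string `s`: an EOL already present in the text is kept, and none is added when the text has none.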

       
    • Lee Thomason

      Lee Thomason - 2005-10-19

      Ah -- I misinterpreted your last post. Yes, that's a bug.

      lee

       
    • Lee Thomason

      Lee Thomason - 2005-10-19

      Fixed in 2.4.1, I hope. At least it doesn't add extra newlines now; whether the CDATA is formatted quite the way it should be remains to be seen.

      lee

       
    • igorv007

      igorv007 - 2005-10-19

      Great, thanks a lot. I did not even know that 2.4.2 is out already. I'll check it out in a day or two.

      Thanks for the nice product,
      Igor.

       
