
#112 Whitespace only element

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2019-12-12
Created: 2010-10-11
Creator: Anonymous
Private: No

The XML standard defines the attribute 'xml:space', which specifies how whitespace is treated for the current node and all nodes below it. See http://www.w3.org/TR/REC-xml/#sec-white-space
If an element contains only whitespace and no other text, TinyXml skips the whitespace. The following suggestion creates a text node containing the whitespace instead, provided whitespace condensing is turned off. The change goes in TiXmlElement::ReadValue():

// We hit a '<'
// Have we hit a new element or an end tag? This could also be
// a TiXmlText in the "CDATA" style.
if ( StringEqual( p, "</", false, encoding ) )
{
    // If whitespace exists, add a text node containing whitespace
    if ( !TiXmlBase::IsWhiteSpaceCondensed() )
    {
        if (p != pWithWhiteSpace)
        {
            TiXmlText* textNode = new TiXmlText( "" );
            p = textNode->Parse( pWithWhiteSpace, data, encoding );
            LinkEndChild( textNode );
        }
    }

    return p;
}
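
The decision made by the patched branch can be sketched outside of TinyXml. The following is a minimal, self-contained illustration only: the function name whitespaceOnlyText and its parameters are hypothetical, it models just the "whitespace followed directly by an end tag" case, and it uses std::isspace as an assumed stand-in for TinyXml's own whitespace test.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for the patched branch of TiXmlElement::ReadValue().
// Returns the text that would become a child text node of the element whose
// content starts at `content`; an empty string means "no text node created".
std::string whitespaceOnlyText(const char* content, bool condenseWhiteSpace)
{
    assert(content != nullptr);

    const char* pWithWhiteSpace = content;  // position before skipping
    const char* p = content;
    while (*p && std::isspace(static_cast<unsigned char>(*p)))
        ++p;                                // models the whitespace skip

    // Only the case "optional whitespace, then an end tag" is modelled here.
    if (p[0] == '<' && p[1] == '/')
    {
        // Same two conditions as in the patch: condensing is off and
        // some whitespace was actually skipped.
        if (!condenseWhiteSpace && p != pWithWhiteSpace)
            return std::string(pWithWhiteSpace, p);
    }
    return std::string();
}
```

With condensing off, the content "   </a>" yields a three-space text; with condensing on, or when no whitespace precedes the end tag, nothing is produced.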

Discussion

  • Andreas Schönle

    Patch worked for me, thanks.

     
  • Andreas Schönle

    Small addition: this still fails for elements whose content begins with e.g. a CR or LF character written as a character reference. The initial SkipWhiteSpace() will NOT skip "&#x0D; ", but the GetChar() called during parsing converts the reference back to the raw character ('\r' in this case), so Blank() later returns true and the text node is not linked.
    I solved it by changing the code of Blank() as follows:

    bool TiXmlText::Blank() const
    {
        if ( !TiXmlBase::IsWhiteSpaceCondensed() )
        {
            return value.length() == 0;
        }
        else
        {
            for ( unsigned i=0; i<value.length(); i++ )
                if ( !IsWhiteSpace( value[i] ) )
                    return false;
            return true;
        }
    }
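
The effect of this change can be seen in a small standalone sketch. The function names below are hypothetical and are not part of TinyXml. The sketch assumes that the unpatched Blank() is the loop that now sits in the else branch, and it uses std::isspace as a stand-in for IsWhiteSpace().

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: true if every character is whitespace
// (also true for the empty string).
bool isAllWhiteSpace(const std::string& value)
{
    for (unsigned char c : value)
        if (!std::isspace(c))
            return false;
    return true;
}

// Assumed original behaviour: whitespace-only text always counts as blank.
bool blankOriginal(const std::string& value)
{
    return isAllWhiteSpace(value);
}

// Patched behaviour: with condensing off, only empty text counts as blank,
// so a text node holding just "\r" or "\n" is kept.
bool blankPatched(const std::string& value, bool condenseWhiteSpace)
{
    if (!condenseWhiteSpace)
        return value.empty();
    return isAllWhiteSpace(value);
}
```

For a text value of "\n", blankOriginal returns true, whereas blankPatched returns false when condensing is off and true when it is on.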

     
  • Gina Gorgog

    Gina Gorgog - 2019-12-12

    Hello!

    I am trying to build an xml message where some tags need to have a value that includes CR LF chars.

    I have made the patch change in ReadValue function and I have the following:

    TiXmlDocument XMLdoc;
    TiXmlPrinter XMLprinter;
    TiXmlElement *pRoot, *pData;
    TiXmlText *pText;
    char XMLBuffer[624] = { 0 };
    
    TiXmlBase::SetCondenseWhiteSpace(false);
    
    pRoot = new TiXmlElement("MyTest");
    XMLdoc.LinkEndChild(pRoot);
    
    pData = new TiXmlElement("MyTag");
    
    // Tag Content
    char receipt_line[512] = { 0 };
    strcpy(receipt_line, "this is line 1");
    strcat(receipt_line, "\x0D\x0A");
    strcat(receipt_line, "this is line 2");
    
    pText = new TiXmlText(receipt_line);
    pData->LinkEndChild(pText);
    pRoot->LinkEndChild(pData);
    
    
    /* Save XML to buffer */
    XMLprinter.SetIndent("  ");
    XMLprinter.SetLineBreak("\x0D\x0A");
    XMLdoc.Accept(&XMLprinter);
    strncpy(XMLBuffer, XMLprinter.CStr(), sizeof(XMLBuffer) - 1);
    
    XMLdoc.SaveFile("text.xml");
    
    --
    What happens in the output file is the following:
    

    <MyTest>
    <MyTag>this is line 1&#x0D;&#x0A;this is line 2</MyTag>
    </MyTest>

    The \r\n characters within the tag value are transformed into &#x0D;&#x0A;, while they appear unchanged at the end of each line above.
    I have also tried to add these characters in the tag value as follows:
    "\r\n"
    " "
    "0x0D0x0A"
    " "

    No success so far!
    I also added the second patch proposed here, still same result.
    Any help would be greatly appreciated!

    Thank you!

     
