Good morning,
I wrote a small program to read an XML file, and I found a problem that occurs when the XML file is not well-formed and
is not terminated by a CRLF.
Example:
<annuaire>
<personne class = "etudiant">
<nom>Pillou</nom>
<prenom>Jean-Francois</prenom>
<telephone>555-123456</telephone>
<email>webm
Here the last character is m and there is no CRLF.
The program is very simple:
#include "tinyxml.h"
const char* inputFileName = "e:\\temp\\test2.xml"; // test2.xml is the XML file shown above
TiXmlDocument doc;
TiXmlEncoding encoding = TIXML_DEFAULT_ENCODING;
doc.SetTabSize(3);
bool resu = doc.LoadFile(inputFileName, encoding);
This program never terminates and consumes all available memory.
Debugging suggests that the problem comes from TiXmlElement::ReadValue in the file tinyxmlparser.cpp.
I tried a small addition that seems to work:
const char* TiXmlElement::ReadValue( const char* p, TiXmlParsingData* data, TiXmlEncoding encoding )
{
    TiXmlDocument* document = GetDocument();

    // Read in text and elements in any order.
    const char* pWithWhiteSpace = p;
    p = SkipWhiteSpace( p, encoding );

    // Added: count consecutive iterations in which p did not advance.
    int iter = 0, maxiter = 100;
    const char* pini = p;

    while ( ( p && *p ) && ( iter < maxiter ) )
    {
        // Added: reset the counter whenever p has moved, otherwise count the stall.
        if ( p != pini ) { pini = p; iter = 0; }
        else iter++;

        if ( *p != '<' )
        {
            …..
    }

    // Added: the iter >= maxiter test also reports an error when the loop stalled.
    if ( ( !p ) || ( iter >= maxiter ) )
    {
        if ( document ) document->SetError( TIXML_ERROR_READING_ELEMENT_VALUE, 0, 0, encoding );
    }
    return p;
}
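The idea behind this addition, stripped of the TinyXML specifics, can be sketched on its own. This is only an illustration under assumptions: parseWithStallGuard and its step callback are hypothetical names, not part of TinyXML, and the cursor is an index rather than a char pointer.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper, not part of TinyXML.
// Calls `step` repeatedly; `step` receives the text and the current position
// and returns the new position. If the position fails to advance for
// `maxStalls` consecutive calls, the loop gives up.
// Returns true if the end of the input was reached, false if the loop stalled.
bool parseWithStallGuard(
    const std::string& text,
    const std::function<std::size_t(const std::string&, std::size_t)>& step,
    int maxStalls = 100)
{
    std::size_t pos = 0;
    int stalls = 0;
    while (pos < text.size()) {
        const std::size_t next = step(text, pos);
        if (next <= pos) {
            // No forward progress: count the stall and bail out at the limit.
            if (++stalls >= maxStalls) return false;
        } else {
            // Progress was made: reset the counter and move the cursor.
            stalls = 0;
            pos = next;
        }
    }
    return true;
}
```

A step function that always returns pos + 1 reaches the end and yields true; one that returns pos unchanged is stopped after maxStalls calls and yields false. In the patch above the same role is played by comparing p with pini.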
So, it is unfortunate that there is no function that verifies the validity of the document.
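As a stopgap, a rough pre-check could be run on the file contents before handing them to the parser. The sketch below is an assumption-laden illustration, not a TinyXML function and not a real XML validator: looksBalanced is a hypothetical name, and it only checks that tags are balanced and that the input does not end inside a tag. It does not handle comments, CDATA sections, or a '>' inside an attribute value correctly.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical pre-check, not part of TinyXML.
// Returns true only if every start tag is closed by a matching end tag in
// the right order and the input does not end in the middle of a tag.
bool looksBalanced(const std::string& xml)
{
    const char* ws = " \t\r\n";
    std::vector<std::string> open;  // names of the currently open elements
    std::size_t pos = 0;
    while ((pos = xml.find('<', pos)) != std::string::npos) {
        const std::size_t end = xml.find('>', pos);
        if (end == std::string::npos) return false;    // input ends inside a tag
        const std::string tag = xml.substr(pos + 1, end - pos - 1);
        pos = end + 1;
        if (tag.empty()) return false;                  // "<>" is never valid
        if (tag[0] == '?' || tag[0] == '!') continue;   // skipped, not checked
        if (tag.back() == '/') continue;                // self-closing element
        if (tag[0] == '/') {
            // End tag: it must match the most recently opened element.
            const std::string name = tag.substr(1, tag.find_first_of(ws, 1) - 1);
            if (open.empty() || open.back() != name) return false;
            open.pop_back();
        } else {
            // Start tag: remember the element name without its attributes.
            open.push_back(tag.substr(0, tag.find_first_of(ws)));
        }
    }
    return open.empty();                                // anything left open?
}
```

For the truncated example above, annuaire, personne and email are still open when the input ends, so the function returns false and the file can be rejected before LoadFile is called.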
Sincerely,
JP Zimmer
It should definitely fail when it hits EOF (LF or CRLF is irrelevant, but EOF is not). Actually, I'm surprised it doesn't fail at EOF.
With that exception, I'm not sure what you're expecting. The library is called *tiny* XML. Not big-and-does-everything XML!
There are plenty of alternatives if you want a validating all-singing-all-dancing parser.
E