Possibly the most requested feature for TinyXML is the ability to print to memory in non-STL mode. The proposed API is:
/** Write the document to a string using formatted printing ("pretty print"). This
will allocate a character array (new char[]) and return it as a pointer. The
calling code must call delete[] on the returned char* to avoid a memory leak.
*/
char* PrintToMemory() const;
The slightly more daring can use the slightly more efficient:
virtual void Print( FILE* cfile, int depth = 0, TIXML_STRING* str = 0 ) const;
Again, suggestions / comments / improvements appreciated!
lee
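A minimal sketch of the ownership contract in the proposed comment, for anyone unsure what the caller has to do. DemoDocument and PrintedLength are hypothetical stand-ins made up for this example (the real document class would render its tree instead of copying a stored string); only the new char[] / delete[] pairing is taken from the proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a document; NOT the real TiXmlDocument.
struct DemoDocument {
    const char* text;  // pre-rendered XML, to keep the sketch short

    // Mirrors the proposed contract: allocates with new char[] and
    // hands ownership of the array to the caller.
    char* PrintToMemory() const {
        assert(text != 0);
        const std::size_t len = std::strlen(text);
        char* out = new char[len + 1];
        std::memcpy(out, text, len + 1);  // copies the terminating '\0' too
        return out;
    }
};

// Caller side (hypothetical helper): every call is paired with delete[].
inline std::size_t PrintedLength(const DemoDocument& doc) {
    char* xml = doc.PrintToMemory();
    const std::size_t len = std::strlen(xml);
    delete[] xml;  // must be delete[], not delete or free()
    return len;
}
```

Forgetting the delete[] leaks the whole printed document, which is the risk the comment warns about.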
Why non-STL? For which platforms/compilers is it needed?
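The friend of TiXmlNode
std::string& operator<<( std::string& out, const TiXmlNode& base )
is also not needed, because a string stream already covers it:
std::ostringstream os;
os << node;
os.str() - std::string object with the dump
os.str().c_str() - const char* with the dump (valid only within the same expression, because os.str() returns a temporary)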
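The stream approach above can be sketched without TinyXML. DemoNode and DumpToString are hypothetical names invented for this example; the only assumption carried over from the post is that the node type has a stream operator<<.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a node type that supports operator<<.
struct DemoNode {
    std::string name;
};

std::ostream& operator<<(std::ostream& out, const DemoNode& node) {
    return out << '<' << node.name << " />";
}

// Returns the dump as a std::string that the caller owns.
std::string DumpToString(const DemoNode& node) {
    std::ostringstream os;
    os << node;
    assert(!os.fail());  // writing to an in-memory stream should not fail
    return os.str();
}
```

Keeping the returned std::string in a named variable before calling c_str() on it avoids holding a pointer into a temporary that has already been destroyed.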
"Why non-STL? For which platforms/compilers is it needed?"
Good question.
In STL mode this API is pretty pointless (and is already fully supported using the stream operators). But it is the most requested feature, hands down (with "fix white space" a distant second). Let us take it for granted that there are many cases where non-STL is useful, else people wouldn't request the feature.
This is typically something you'd use the Visitor pattern for.
Don't clutter up the TiXml API; just add a TiXmlVisitor and the proposed Visit function, and implement this feature on top of them.
I.e. write a TiXmlVisitor class that keeps this pointer internally and pass it to document::Visit(), which will fill the pointer, just as you would implement the proposed PrintToMemory. You can then access the pointer with some get() function, with the added bonus that the Visitor object can manage the memory (i.e. free it in the destructor), so you won't have memory leaks from people not deleting the pointer.
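A rough sketch of the idea, written without STL to match the non-STL case being discussed. MemoryPrintVisitor, VisitElement() and Get() are hypothetical names chosen for illustration and are not part of TinyXML; a real visitor would be driven by the document rather than called by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical visitor that owns its output buffer (illustrative only).
class MemoryPrintVisitor {
public:
    MemoryPrintVisitor() : buf_(new char[16]), len_(0), cap_(16) { buf_[0] = '\0'; }
    ~MemoryPrintVisitor() { delete[] buf_; }  // the visitor frees the memory

    // Called once per element in this sketch; appends "<name/>".
    void VisitElement(const char* name) {
        Append("<");
        Append(name);
        Append("/>");
    }

    // Read-only view; valid until the next VisitElement() or destruction.
    const char* Get() const { return buf_; }

private:
    MemoryPrintVisitor(const MemoryPrintVisitor&);             // non-copyable
    MemoryPrintVisitor& operator=(const MemoryPrintVisitor&);  // non-copyable

    void Append(const char* s) {
        assert(s != 0);
        const std::size_t n = std::strlen(s);
        if (len_ + n + 1 > cap_) {  // grow geometrically when out of room
            std::size_t newCap = cap_ * 2;
            while (newCap < len_ + n + 1) newCap *= 2;
            char* grown = new char[newCap];
            std::memcpy(grown, buf_, len_ + 1);
            delete[] buf_;
            buf_ = grown;
            cap_ = newCap;
        }
        std::memcpy(buf_ + len_, s, n + 1);  // includes the '\0'
        len_ += n;
    }

    char* buf_;
    std::size_t len_;
    std::size_t cap_;
};
```

Because the destructor releases the buffer, callers never see a raw owning pointer, which is the leak-avoidance bonus described above.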
John-Philip --
I like the suggestion.
However, the performance impact of creating the text form of the string for every call to Visit() is probably not the best thing. (And TinyXML doesn't keep such a thing internally.)
One option is to pass the string for the current tag in the TiXmlVisitor. So, StartElement() would pass:
"<foo bar="something">"
back to the user. This has the advantage of not pushing memory management to the user (a good idea) but does force the user to implement the string concatenation (not great).
Thoughts on this? Did I interpret your idea correctly?
lee
Sorry for not answering earlier, login was disabled here at SF.
I see what you mean: you meant for the Visitor to get parsed data (different TiXmlNodes) rather than the raw data that PrintToMemory is supposed to have. Passing that data in StartElement() is an idea, yes, but you'll still do the parsing, only the PrintToMemoryVisitor won't care about it, which causes the performance hit.
But if Print() already can do this, why bother? Or maybe I misunderstood you?
When you talk about the user, do you mean the user of the library or the PrintToMemoryVisitor class author? The memory should be handled by that class in my suggestion. But why would they have to implement string concatenation? Doesn't the TinyXML string class have that? And STL does too. It can just return the char* on user request.
Also, you might want to have a Visitor folder or something in the repository where people can submit and find useful visitors, like the PrintToMemoryVisitor.
"But if Print() already can do this, why bother? Or maybe I misunderstood you?"
Print() in non-STL mode prints to a FILE, or an internal representation of a string. There is no convenient way for a developer using TinyXML to get a character array representation of the XML.
"...PrintToMemoryVisitor class author?"
It turns out that PrintToMemoryVisitor is a more involved thing to write than it appears (at least to me). It possibly makes sense if you need output formatted in a particular way, but it seems like a simple thing like "print to memory" should be simple in the API as well.
"Doesn't the TinyXML string class have that?"
Yes, TinyXML has a string class. But it isn't exposed through the API. The world does not need another string class. And for STL, print to memory already works fine. This is all about the non-STL case. (See earlier post.)
"...Visitor folder or something in the repository ..."
A very good idea. I'll add that as I get submissions.
lee
"It turns out that PrintToMemoryVisitor is a more involved thing to write than it appears"
I have to agree with you there. Although it is possible and not too hard, and the performance hit might be negligible, it might still be worth it to keep the main XML tree interface clean. If you implement the visitor pattern, I'm willing to write the PrintToMemoryVisitor class so you can try it out and make a decision based on that. I'll send you a mail so you can get ahold of me more easily.
"Yes, TinyXML has a string class. But it isn't exposed through the API."
Oh ok. I've never used the non-stl version of TiXml. :)
I have added this functionality recently. I want to add a patch. Our functions are called GetAsString() and GetAsCharBuffer( char* buffer, size_t bufferSize ).
This is for a "pretty-print" version in both STL and non-STL, but we opted to have the buffer size passed into the function and to fail the function call if your buffer was too small.
What do you think?
Regards,
Ryan
RJP Computing
I wrestled with something similar, but it's hard to get right: If you pass a bufferSize, how to know if it is big enough? If you do a first pass to get the size, and a 2nd pass to print, that's very inefficient. (Printing is an intensive operation for TinyXML.)
In the end, I opted for allocating an array of memory and returning it. The caller is responsible for deletion. It is somewhat dangerous, but wildly more efficient.
lee
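To make the trade-off concrete, here is a small sketch of the caller-supplied-buffer style. CopyToCharBuffer is a hypothetical free function, not Ryan's actual GetAsCharBuffer(); it assumes the text has already been rendered, which is exactly the step that is expensive in practice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies already-rendered text into a caller buffer.
// Returns false, writing nothing, when the buffer is missing or too small.
bool CopyToCharBuffer(const char* rendered, char* buffer, std::size_t bufferSize) {
    assert(rendered != 0);
    const std::size_t needed = std::strlen(rendered) + 1;  // include '\0'
    if (buffer == 0 || bufferSize < needed) {
        return false;
    }
    std::memcpy(buffer, rendered, needed);
    return true;
}
```

The caller cannot know the needed size without rendering once, so a failed call means guessing a bigger size and printing again. Returning a freshly allocated array avoids the second pass at the cost of making the caller responsible for delete[].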
Well, what about the STL version called GetAsString()? Did you want to look at that? It required us to add a function to every node, and it mimics the current Print() method. I really wouldn't mind if you took what we have done and made it the way you see fit.
Regards,
Ryan
RJP Computing
I've implemented a PrintToMemory visitor, so it works without adding anything to the standard TiXml interface. Internally it uses either a std::string or a TiXmlString, depending on the STL switch defined in TinyXML. It then has a const char* Buffer() const; that allows one to look at the buffer, or copy it, or whatever. It supports both pretty print and single-line output, with full flexibility in the pretty print. The code looks something like this:
PrintToMemoryVisitor printer("\t", "\n");
myXmlDoc.Accept( printer );
cout << printer.Buffer();
As it takes strings, you can make the indentation 4 spaces or whatever you like. The same goes for the newline. There's also a ctor that takes a number of spaces, so you can say printer(4, "\n") to get four spaces per indentation level.
The code has been submitted to Lee. But until the Visitor discussions are fully resolved it's useless :)
Cheers,
JP
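For readers who want to see how the two constructors described above could fit together, here is a self-contained sketch. DemoPrinter, EnterElement() and ExitElement() are hypothetical names and this is not JP's submitted code; only the (indent, newline) and (spaces, newline) constructors and the Buffer() accessor follow the description in the post. It uses std::string for brevity, so it corresponds to the STL build.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical printer with configurable indentation and newline strings.
class DemoPrinter {
public:
    DemoPrinter(const std::string& indent, const std::string& newline)
        : indent_(indent), newline_(newline), depth_(0) {}

    // Convenience ctor: indentation given as a number of spaces.
    DemoPrinter(std::size_t spaces, const std::string& newline)
        : indent_(spaces, ' '), newline_(newline), depth_(0) {}

    void EnterElement(const std::string& name) {
        Pad();
        buffer_ += "<" + name + ">" + newline_;
        ++depth_;
    }

    void ExitElement(const std::string& name) {
        assert(depth_ > 0);  // every exit must match an earlier enter
        --depth_;
        Pad();
        buffer_ += "</" + name + ">" + newline_;
    }

    // Read-only view of the text printed so far.
    const char* Buffer() const { return buffer_.c_str(); }

private:
    // Emits one copy of the indent string per nesting level.
    void Pad() {
        for (std::size_t i = 0; i < depth_; ++i) buffer_ += indent_;
    }

    std::string indent_, newline_, buffer_;
    std::size_t depth_;
};
```

Passing empty strings for both the indent and the newline gives the single-line form from the same code path.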
TiXmlPrinter has been added to the Beta, and has subsumed the PrintToMemory() API.
Thanks JP!
lee