TiXmlNode has a number of downcasting functions that I'm not very happy with: ToDocument(), ToElement(), and several others, with both a const and a non-const version of each. For example, here is ToDocument(), after a little reformatting:
TiXmlDocument* ToDocument()
{
return ( this && type == DOCUMENT )
? (TiXmlDocument*) this : 0;
}
The purpose of these functions is to provide a safe way to convert a pointer to TiXmlNode to a pointer to one of the derived classes. They are equivalent to the dynamic_cast operator, except for the test of "this". If you call this function through a null pointer, it will presumably return null.
I'm not convinced that the "this" test does anyone any favors. Calling a member function through a null pointer is always a bad idea, and probably invokes undefined behavior even if you can get away with it in practice for a given implementation. If I ever do such a thing I want to know about it as soon as possible so that I can fix my mistake. Covering up the problem will only make it harder to debug.
Assuming that you call it with a valid pointer, ToDocument() does the downcast with a C-style cast -- a practice frowned upon by all the most enlightened C++ pundits. One of the compilers I use gives me a long series of stern warnings, saying that it is implementing the cast as a reinterpret_cast, which is dangerous and non-portable (though in this case it's probably portable enough). Actually the intended semantics are those of a dynamic_cast and should probably be coded that way:
TiXmlDocument* ToDocument()
{
return this ? dynamic_cast< TiXmlDocument * >( this ) : 0;
}
If even a proper dynamic_cast is too ugly for your tastes, and you don't mind abandoning the "this" test, you could accomplish the same thing with virtual functions. The base class TiXmlNode would implement it as { return 0; }, and TiXmlDocument would override it with { return this; }.
Granted, the virtual function call would incur some overhead. On the other hand the present implementation incurs other overhead by testing the "this" pointer, fetching the "type" member, and comparing it to a constant, whereas the virtual version (once entered) would return a constant immediately. The performance tradeoff is likely to be insignificant either way.
Finally, these functions are completely unnecessary. The client code can always do a safe downcast with dynamic_cast. If it is considered convenient to wrap the cast in a function, the client code can readily do so, with or without the "this" test, using either the Type() member function to access the "type" member, or dynamic_casts as described above. There's no point in cluttering the interface with member functions that can be trivially replaced by non-member functions in the client code.
By now, of course, there is no doubt some client code out there that calls these functions, and one hates to break working code just to clean up an ill-considered interface. Perhaps these functions could be marked as deprecated, to give people fair warning, and removed from some future release.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I never got the 'this' test; I expect its an evolutionary leftover that Lee wouldn't mind erasing.
But the dynamic_casts get into a different realm.
Definitely, I see your point, and in my own apps I use dynamic_cast when appropriate.
However...
(a) That one compiler gives warnings on a technique is not grounds for changing an API.
One java IDE I know gives warnings on the switch statement, saying that "use of switch is an indication of bad OO style".
That is well into OO purist fundamentalist architecture-astronaut 'everyone should use vi' kind of territory.
Now, your suggestion is not nearly in that area, but wanting to change an API based on a compiler warning is, well, not a good argument :)
(b) Making TinyXml use dynamic_cast forces every user to enable that option or, if required, move to a compiler that supports dyn cast. For the benefit of ... oh wait, the same functionality? No, not worth it.
(c) The current technique is simple, clean (depending on your POV), and works.
Consider...
switch (pNode->type) {
case DOCUMENT: processDoc( pNode->ToDocument()); break;
case ELEMENT: processElem( pNode->ToElement()); break;
...
}
vs
if (TiXmlDocument* pDoc = dynamic_cast<TiXmlDocument*>(pNode)) {
// do stuff
} else if (TiXmlElement* pElem = dynamic_cast<TiXmlElement*>(pNode)) {
// do stuff
}
or
if (pNode->type == DOCUMENT) {
TiXmlDocument* pDoc = pNode->ToDocument()
// do stuff
}
vs
if (TiXmlDocument* pDoc = dynamic_cast<TiXmlDocument*>(pNode)) {
// do stuff
}
The dyncast style above is AFAIK not supported on all compilers too. I may be wrong, but I think some compilers only support the older style:
TiXmlDocument* pDoc = dynamic_cast<TiXmlDocument*>(pNode);
if (pDoc) {
// do stuff
}
Mainly, the dynamic cast approach is, IMO, clunky/ugly and changes the API without any net benefit to the client code.
"If it ain't broke, don't fix it"
(c) virtual functions can be avoided.
As you say, virtual functions do incur an overhead. But I'm a big believer that you can do all sorts of activity and not even make the CPU blink. So, in that frame of mind, making the ToDocument() (etc) method virtual would be fine and arguably cleaner. Again, in my own stuff I tend to use this approach, and the visitor pattern I advocated in another post is based largely on virtual funcs.
But again it doesn't really fix a real problem.
In short:
I think this is the sort of thing that suits a #define option. For example, a tinyxml_setup.h could be added and various defines put in there. One would be STL, and another could be TINYXML_DYN_CAST.
Then its 100% identical to the current implementation if you're happy with C-style casts, and other users have the option of changing to a virtual function approach to ToDocument() if they want.
If you want the dynamic_cast approach - you can do that right now, with the code as-is :)
FWIW
Ellers
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
That is an odd 'this' check. There isn't any reason for it to be there - some strange artifact of 6 year old code. I'm certainly not advocating calling member functions on a null pointer. :)
TinyXML specifically, intentionally, and documents not requiring support for RTTI or Exceptions. Many systems in the embedded space disable these features for real performance or memory consumption requirements, and many programs aimed at the desktop envirement simply don't use them.
While I certainly do have opinions on RTTI and exceptions, that's not really the point. The point is to make TinyXML as portable and friendly to application code as possible, and that means staying away from both features.
And, as Ellers states, developers are free to ignore the ToStuff() functions and just use dynamic_cast<>.
lee
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
(a) Changing from a C-style cast to a dynamic_cast would *not* change the API. It would merely change the internal implementation of the same API. (Dropping the "this" test would subtly change the API, but that is a different issue.)
I am not asking for a change in order to avoid compiler warnings, annoying as the warnings may be. However it does worry me that the compiler implements the cast as a reinterpret_cast rather than as the more appropriate dynamic_cast.
(b) I know that some compilers don't support dynamic_cast. However version 2.4.0 uses static_cast, even for the TiXmlString class, which is supposedly aimed at more primitive compilers. Are there many compilers that support static_cast but not dynamic_cast?
(c) I'm not sure what your point is in this section. Nothing I've proposed would force anyone to use if/else if instead of switch/case. The Type() function would still be available.
Perhaps you are arguing that the syntax for dynamic casts is klunky and ugly. Yes, it is. Stroustrup has written more than once that the ugliness is intentional, because he wants to discourage people from using casts in the first place.
If it is convenient or aesthetically pleasing to bury the ugliness in some little inline function, that function can be in the client code. It doesn't need to be a member function because it needs no privileged access to the class internals.
(d) I think we agree that the overhead of virtual function calls is not an issue.
It might make sense to use conditional compilation to accommodate compilers that don't support the C++ style casts, provided that other C++ style casts such as static_cast receive the same treatment. Now that TinyXml uses static_cast, let's see there are any complaints from the users of such compilers.
(I suspect that older compilers will have more serious problems anyway. The compiler I'm stuck with at work (aCC on HP-UX) can't compile TinyXml without extensive changes because, among other things, it doesn't have <sstream>, or any form of ostringstream.)
I understand your reluctance to remove the downcasting functions altogether, even though they are unnecessary when considered in isolation. If they are kept, I'd like to see them made virtual (which implies eliminating the "this" test) so that the casts are eliminated. Failing that, I'd like to see the C-style casts replaced with dynamic_casts. Of course these preferences are a matter of style rather than function.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
- the 'this' test has nothing to do with anything; just a leftover to be removed. I have no idea what you mean by removing it could subtly affect the API.
- you're not prevented from using RTTI if you want to. I don't want to, and I won't be forced to because someone or an author has his/her style/religious bias, Moreover it really doesn't offer any advantage in this case.
- A virt function approach, e.g. TiXmlDocument::ToDocument() { return this }, will be good. Its basically letting the compiler do the switch statement for you, which OO purists get off on, and hey, it does the job. But my point is it isn't strictly "necessary" for this scenario - it works fine as-is.
- its not broken! Focus attention and effort in areas that need improvement.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I was not aware, or had forgotten, that TinyXml was committed to a policy of not requiring RTTI. I have no desire to overturn that policy. Given that constraint, the ToStuff functions should remain in the public interface in some form.
Lee has indicated that he plans to make these functions virtual, and thereby eliminate both the cast and the "this" test. I'm more than satisfied with that result.
My remark about the "this" test subtly changing the API refers to what will happen if you call the function through a null pointer. With the "this" test the function will probably return null. Without it, it will probably produce some obvious problem such as an access violation.
As for the side issue of <sstream.h>: the compiler I use at work supports <string> and the other usual STL features, so I planned to compile with STL turned on. But it doesn't support <sstream>. Only after I had developed my aargh utility at home under Linux did I try to port it to HP-UX at work, where the problem showed up. I expect to work around it somehow, but I was frustrated because I really wanted to use aargh on my job. TinyXml is not at fault, and I'm not complaining.
Thank you for being so responsive.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I had more time tonight than I expected and it was a simple change so I just commited a fix bug 1333451 (virtual for To*() functions). You'll get it if you get the latest CVS version.
STL: would it be worth trying stlport? It might be better than trusting the HP implementation of STL anyway..
HTH
Ellers
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Stlport would be a last resort. I would have to use a separate STL implementation for a single program, or else rewrite a bunch of code that's already written for the HP implementation (which is actually an old Rogue Wave implementation).
I haven't yet exhausted the simpler options. I can compile with STL turned off for TinyXml, or I can rewrite the one function that uses ostringstream. I may decide to wait until the next release of aargh, which will generate C code as well as C++ code.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Actually -- the <sstream> thing is probably a (subtle) bug in TinyXML come to think of it. It is easy with STL headers to include the wrong thing.
I'd like to compile TinyXML against both VC2005 with its very aggressive safety checking, and STLPort with its very correct STL, and clean up all the warnings and problems. I have both environments, just haven't taken the time yet.
On the "todo" list.
lee
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
TiXmlNode has a number of downcasting functions that I'm not very happy with: ToDocument(), ToElement(), and several others, with both a const and a non-const version of each. For example, here is ToDocument(), after a little reformatting:
TiXmlDocument* ToDocument()
{
return ( this && type == DOCUMENT )
? (TiXmlDocument*) this : 0;
}
The purpose of these functions is to provide a safe way to convert a pointer to TiXmlNode to a pointer to one of the derived classes. They are equivalent to the dynamic_cast operator, except for the test of "this". If you call this function through a null pointer, it will presumably return null.
I'm not convinced that the "this" test does anyone any favors. Calling a member function through a null pointer is always a bad idea, and probably invokes undefined behavior even if you can get away with it in practice for a given implementation. If I ever do such a thing I want to know about it as soon as possible so that I can fix my mistake. Covering up the problem will only make it harder to debug.
Assuming that you call it with a valid pointer, ToDocument() does the downcast with a C-style cast -- a practice frowned upon by all the most enlightened C++ pundits. One of the compilers I use gives me a long series of stern warnings, saying that it is implementing the cast as a reinterpret_cast, which is dangerous and non-portable (though in this case it's probably portable enough). Actually the intended semantics are those of a dynamic_cast and should probably be coded that way:
TiXmlDocument* ToDocument()
{
return this ? dynamic_cast< TiXmlDocument * >( this ) : 0;
}
If even a proper dynamic_cast is too ugly for your tastes, and you don't mind abandoning the "this" test, you could accomplish the same thing with virtual functions. The base class TiXmlNode would implement it as { return 0; }, and TiXmlDocument would override it with { return this; }.
Granted, the virtual function call would incur some overhead. On the other hand the present implementation incurs other overhead by testing the "this" pointer, fetching the "type" member, and comparing it to a constant, whereas the virtual version (once entered) would return a constant immediately. The performance tradeoff is likely to be insignificant either way.
Finally, these functions are completely unnecessary. The client code can always do a safe downcast with dynamic_cast. If it is considered convenient to wrap the cast in a function, the client code can readily do so, with or without the "this" test, using either the Type() member function to access the "type" member, or dynamic_casts as described above. There's no point in cluttering the interface with member functions that can be trivially replaced by non-member functions in the client code.
By now, of course, there is no doubt some client code out there that calls these functions, and one hates to break working code just to clean up an ill-considered interface. Perhaps these functions could be marked as deprecated, to give people fair warning, and removed from some future release.
I think the points you raise are valid and well put.
Certainly, ToDocument would be better written:
TiXmlDocument* ToDocument()
{
return (type == DOCUMENT ) ? (TiXmlDocument*) this : 0;
}
I never got the 'this' test; I expect its an evolutionary leftover that Lee wouldn't mind erasing.
But the dynamic_casts get into a different realm.
Definitely, I see your point, and in my own apps I use dynamic_cast when appropriate.
However...
(a) That one compiler gives warnings on a technique is not grounds for changing an API.
One java IDE I know gives warnings on the switch statement, saying that "use of switch is an indication of bad OO style".
That is well into OO purist fundamentalist architecture-astronaut 'everyone should use vi' kind of territory.
Now, your suggestion is not nearly in that area, but wanting to change an API based on a compiler warning is, well, not a good argument :)
(b) Making TinyXml use dynamic_cast forces every user to enable that option or, if required, move to a compiler that supports dyn cast. For the benefit of ... oh wait, the same functionality? No, not worth it.
(c) The current technique is simple, clean (depending on your POV), and works.
Consider...
switch (pNode->type) {
case DOCUMENT: processDoc( pNode->ToDocument()); break;
case ELEMENT: processElem( pNode->ToElement()); break;
...
}
vs
if (TiXmlDocument* pDoc = dynamic_cast<TiXmlDocument*>(pNode)) {
// do stuff
} else if (TiXmlElement* pElem = dynamic_cast<TiXmlElement*>(pNode)) {
// do stuff
}
or
if (pNode->type == DOCUMENT) {
TiXmlDocument* pDoc = pNode->ToDocument()
// do stuff
}
vs
if (TiXmlDocument* pDoc = dynamic_cast<TiXmlDocument*>(pNode)) {
// do stuff
}
The dyncast style above is AFAIK not supported on all compilers too. I may be wrong, but I think some compilers only support the older style:
TiXmlDocument* pDoc = dynamic_cast<TiXmlDocument*>(pNode);
if (pDoc) {
// do stuff
}
Mainly, the dynamic cast approach is, IMO, clunky/ugly and changes the API without any net benefit to the client code.
"If it ain't broke, don't fix it"
(c) virtual functions can be avoided.
As you say, virtual functions do incur an overhead. But I'm a big believer that you can do all sorts of activity and not even make the CPU blink. So, in that frame of mind, making the ToDocument() (etc) method virtual would be fine and arguably cleaner. Again, in my own stuff I tend to use this approach, and the visitor pattern I advocated in another post is based largely on virtual funcs.
But again it doesn't really fix a real problem.
In short:
I think this is the sort of thing that suits a #define option. For example, a tinyxml_setup.h could be added and various defines put in there. One would be STL, and another could be TINYXML_DYN_CAST.
Then its 100% identical to the current implementation if you're happy with C-style casts, and other users have the option of changing to a virtual function approach to ToDocument() if they want.
If you want the dynamic_cast approach - you can do that right now, with the code as-is :)
FWIW
Ellers
That is an odd 'this' check. There isn't any reason for it to be there - some strange artifact of 6 year old code. I'm certainly not advocating calling member functions on a null pointer. :)
TinyXML specifically, intentionally, and documents not requiring support for RTTI or Exceptions. Many systems in the embedded space disable these features for real performance or memory consumption requirements, and many programs aimed at the desktop envirement simply don't use them.
While I certainly do have opinions on RTTI and exceptions, that's not really the point. The point is to make TinyXML as portable and friendly to application code as possible, and that means staying away from both features.
And, as Ellers states, developers are free to ignore the ToStuff() functions and just use dynamic_cast<>.
lee
(a) Changing from a C-style cast to a dynamic_cast would *not* change the API. It would merely change the internal implementation of the same API. (Dropping the "this" test would subtly change the API, but that is a different issue.)
I am not asking for a change in order to avoid compiler warnings, annoying as the warnings may be. However it does worry me that the compiler implements the cast as a reinterpret_cast rather than as the more appropriate dynamic_cast.
(b) I know that some compilers don't support dynamic_cast. However version 2.4.0 uses static_cast, even for the TiXmlString class, which is supposedly aimed at more primitive compilers. Are there many compilers that support static_cast but not dynamic_cast?
(c) I'm not sure what your point is in this section. Nothing I've proposed would force anyone to use if/else if instead of switch/case. The Type() function would still be available.
Perhaps you are arguing that the syntax for dynamic casts is klunky and ugly. Yes, it is. Stroustrup has written more than once that the ugliness is intentional, because he wants to discourage people from using casts in the first place.
If it is convenient or aesthetically pleasing to bury the ugliness in some little inline function, that function can be in the client code. It doesn't need to be a member function because it needs no privileged access to the class internals.
(d) I think we agree that the overhead of virtual function calls is not an issue.
It might make sense to use conditional compilation to accommodate compilers that don't support the C++ style casts, provided that other C++ style casts such as static_cast receive the same treatment. Now that TinyXml uses static_cast, let's see there are any complaints from the users of such compilers.
(I suspect that older compilers will have more serious problems anyway. The compiler I'm stuck with at work (aCC on HP-UX) can't compile TinyXml without extensive changes because, among other things, it doesn't have <sstream>, or any form of ostringstream.)
I understand your reluctance to remove the downcasting functions altogether, even though they are unnecessary when considered in isolation. If they are kept, I'd like to see them made virtual (which implies eliminating the "this" test) so that the casts are eliminated. Failing that, I'd like to see the C-style casts replaced with dynamic_casts. Of course these preferences are a matter of style rather than function.
Lee has answered all the main points.
My remaining points are
- the 'this' test has nothing to do with anything; just a leftover to be removed. I have no idea what you mean by removing it could subtly affect the API.
- you're not prevented from using RTTI if you want to. I don't want to, and I won't be forced to because someone or an author has his/her style/religious bias, Moreover it really doesn't offer any advantage in this case.
- A virt function approach, e.g. TiXmlDocument::ToDocument() { return this }, will be good. Its basically letting the compiler do the switch statement for you, which OO purists get off on, and hey, it does the job. But my point is it isn't strictly "necessary" for this scenario - it works fine as-is.
- its not broken! Focus attention and effort in areas that need improvement.
Key point on the casting: TinyXML does not require RTTI. RTTI is required for dynamic_cast<> but not static_cast<> or reintepret_cast<>.
As far as your concerns about old compilers not being able to handle <sstream> etc. that's why STL can be turned off.
The virtual functions are probably a better way to implement it. I filed a bug.
lee
I was not aware, or had forgotten, that TinyXml was committed to a policy of not requiring RTTI. I have no desire to overturn that policy. Given that constraint, the ToStuff functions should remain in the public interface in some form.
Lee has indicated that he plans to make these functions virtual, and thereby eliminate both the cast and the "this" test. I'm more than satisfied with that result.
My remark about the "this" test subtly changing the API refers to what will happen if you call the function through a null pointer. With the "this" test the function will probably return null. Without it, it will probably produce some obvious problem such as an access violation.
As for the side issue of <sstream.h>: the compiler I use at work supports <string> and the other usual STL features, so I planned to compile with STL turned on. But it doesn't support <sstream>. Only after I had developed my aargh utility at home under Linux did I try to port it to HP-UX at work, where the problem showed up. I expect to work around it somehow, but I was frustrated because I really wanted to use aargh on my job. TinyXml is not at fault, and I'm not complaining.
Thank you for being so responsive.
I had more time tonight than I expected and it was a simple change so I just commited a fix bug 1333451 (virtual for To*() functions). You'll get it if you get the latest CVS version.
STL: would it be worth trying stlport? It might be better than trusting the HP implementation of STL anyway..
HTH
Ellers
Stlport would be a last resort. I would have to use a separate STL implementation for a single program, or else rewrite a bunch of code that's already written for the HP implementation (which is actually an old Rogue Wave implementation).
I haven't yet exhausted the simpler options. I can compile with STL turned off for TinyXml, or I can rewrite the one function that uses ostringstream. I may decide to wait until the next release of aargh, which will generate C code as well as C++ code.
Actually -- the <sstream> thing is probably a (subtle) bug in TinyXML come to think of it. It is easy with STL headers to include the wrong thing.
I'd like to compile TinyXML against both VC2005 with its very aggressive safety checking, and STLPort with its very correct STL, and clean up all the warnings and problems. I have both environments, just haven't taken the time yet.
On the "todo" list.
lee