I've been using tinyxml for a while. I've just noticed that every time Find is called, it creates a string object, does the comparison, and then destroys the string object. This causes an unnecessary malloc, which slows things down.
This is avoided with the addition of a new comparison in TiXmlString:
bool operator == (const char* compare) const;
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
The bug entry got closed without any changes, however the issue still remains. Possibly not enough info in the bug report?
I've since noticed that not only does it alloc/free inside the loop multiple times per Find, but since Find is called when loading, its doing extra ones during loading as well.
Also as this new operator function boils down to a strcmp, it should probably be inlined for extra goodness. :-)
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
No, it's my fault; Lee's put it down as tixmlstring::Find in the bug report which is what I looked at, but I really should have looked at the subject of your post. :) Apologies.
Just to clarify this to save Lee some time: when using TiXmlStrings, there's no native equality operator for const char*. Therefore the char* on the right hand side of 'if ( node->name == name )' in TiXmlAttributeSet :: Find() is implicitly converted to a TiXmlString so that they can be compared via TiXmlString::operator ==(const TiXmlString&). Adding a copy of that operator that took a const char* would eliminate this.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Similar considerations apply to operator !=, operator>, operator<, and operator+=.
Furthermore: I routinely add the explicit keyword to constructors with single arguments (other than the copy constructor), or when all but the first argument are optional. This habit prevents me from creating temporaries inadvertently -- especially when the constructors are non-trivial.
Unfortunately if you adopted this approach for TinyXml at this point you would probably break somebody's code somewhere. On the other hand such breakage is likely to lead to better code once it is fixed. Perhaps you could provide an option to disable the explicit keyword by conditional compilation.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Oh, that explicit suggestion reminds me, apart from Find, this issue also exists in FirstChild, LastChild, etc.
However there it uses an explicit string temporary so that code must also be changed. It currently uses:
node->SValue() == TIXML_STRING(_value)
Not sure why it does - perhaps the explicit cast is needed for stl::string? Anyway, it either needs to go, or be replaced with a new macro:
node->SValue() == TIXML_CAST_STRING(_value)
This can be defined as nothing for tinystr so the raw char* comes through, while still allowing the explicit string cast for stl::string (if indeed that is necessary).
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
..unless someone wants to submit a patch, of course. :) I ask that it be limited to this fix, and not break or do damage to th STL path.
That said, I have a pretty short list of targeted fixes (more const cleanup, 'unsigned char' / 'signed char' problems, security on MSVC) that I can put in pretty quickly. Possibly this weekend.
lee
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I added 'explicit' to the TinyStr constructors. That shouldn't break any user code, because nothing should be using the TinyStr outside of the TinyXml core. (Which is why there is the SValue() cast, by the way -- only char* is ever returned, but the value is stored is the STL or Tiny Str).
That quickly cleaned up the Find issue. (I added operator==(const char*)) If anyone wants to check it out in CVS, it's checked in, and I should have a new release pretty soon. So I'm going to close the existing bugs, but reopen (or open a new bug) if there are other conversion problems.
lee
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Wow, strange extra superfluous cast. Also removed.
Part of the fun of a 6 year old code base is that you know you wrote it, and you had some reason why, but since I don't remember it all seems fresh. :)
lee
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I've been using tinyxml for a while. I've just noticed that every time Find is called, it creates a string object, does the comparison, and then destroys the string object. This causes an unnecessary malloc, which slows things down.
This is avoided with the addition of a new comparison in TiXmlString:
bool operator == (const char* compare) const;
Good catch. Bug filed.
lee
The bug entry got closed without any changes, however the issue still remains. Possibly not enough info in the bug report?
I've since noticed that not only does it alloc/free inside the loop multiple times per Find, but since Find is called when loading, its doing extra ones during loading as well.
Also as this new operator function boils down to a strcmp, it should probably be inlined for extra goodness. :-)
There's nothing in there which could cause frees or allocs. The function in my copy looks like this:
char * lookup; (4 bytes on the stack)
if (offset >= length ()) (just a call to strlen)
return (unsigned) notfound; (return a number)
for (lookup = cstring + offset; * lookup; lookup++) (this is just pointer arithmetic)
if (* lookup == tofind) (comparing 2 chars)
return lookup - cstring; (more pointer arithmetic)
return (unsigned) notfound; (again, just returning a number)
Sorry, I didn't clarify the function properly. I meant an alloc from the function:
TiXmlAttribute* TiXmlAttributeSet::Find( const char * name ) const
No, it's my fault; Lee's put it down as tixmlstring::Find in the bug report which is what I looked at, but I really should have looked at the subject of your post. :) Apologies.
Just to clarify this to save Lee some time: when using TiXmlStrings, there's no native equality operator for const char*. Therefore the char* on the right hand side of 'if ( node->name == name )' in TiXmlAttributeSet :: Find() is implicitly converted to a TiXmlString so that they can be compared via TiXmlString::operator ==(const TiXmlString&). Adding a copy of that operator that took a const char* would eliminate this.
Original post said:
>>
This is avoided with the addition of a new comparison in TiXmlString:
bool operator == (const char* compare) const;
<<
In addition, I suggest a second version to test an equality in the opposite direction. This would be a non-member inline function:
inline bool operator==(const char * str, const TiXmlString & tistrl)
{ return ( tistr == str ); }
(Warning: uncompiled and untested)
Similar considerations apply to operator !=, operator>, operator<, and operator+=.
Furthermore: I routinely add the explicit keyword to constructors with single arguments (other than the copy constructor), or when all but the first argument are optional. This habit prevents me from creating temporaries inadvertently -- especially when the constructors are non-trivial.
Unfortunately if you adopted this approach for TinyXml at this point you would probably break somebody's code somewhere. On the other hand such breakage is likely to lead to better code once it is fixed. Perhaps you could provide an option to disable the explicit keyword by conditional compilation.
Geoff wrote me and explained the issue. Rather obvious once it was pointed out. The bug is open again - I'll get it fixed in the next release.
lee
Oh, that explicit suggestion reminds me, apart from Find, this issue also exists in FirstChild, LastChild, etc.
However there it uses an explicit string temporary so that code must also be changed. It currently uses:
node->SValue() == TIXML_STRING(_value)
Not sure why it does - perhaps the explicit cast is needed for stl::string? Anyway, it either needs to go, or be replaced with a new macro:
node->SValue() == TIXML_CAST_STRING(_value)
This can be defined as nothing for tinystr so the raw char* comes through, while still allowing the explicit string cast for stl::string (if indeed that is necessary).
..unless someone wants to submit a patch, of course. :) I ask that it be limited to this fix, and not break or do damage to th STL path.
That said, I have a pretty short list of targeted fixes (more const cleanup, 'unsigned char' / 'signed char' problems, security on MSVC) that I can put in pretty quickly. Possibly this weekend.
lee
I added 'explicit' to the TinyStr constructors. That shouldn't break any user code, because nothing should be using the TinyStr outside of the TinyXml core. (Which is why there is the SValue() cast, by the way -- only char* is ever returned, but the value is stored is the STL or Tiny Str).
That quickly cleaned up the Find issue. (I added operator==(const char*)) If anyone wants to check it out in CVS, it's checked in, and I should have a new release pretty soon. So I'm going to close the existing bugs, but reopen (or open a new bug) if there are other conversion problems.
lee
Strike the comment on the cast. Should have read more carefully - there is an extra cast in there.
lee
Wow, strange extra superfluous cast. Also removed.
Part of the fun of a 6 year old code base is that you know you wrote it, and you had some reason why, but since I don't remember it all seems fresh. :)
lee