
#26 Poor performance with UTF-8 in RKEnumerator

open
5
2009-01-29
2009-01-29
Anonymous
No

When RKEnumerator is used with a UTF-8 NSString, performance drops dramatically. The problem lies in RKStringBufferWithString() and its heavy use throughout the enumerator methods. RKStringBufferWithString() calls CFStringGetFastestEncoding(), which returns UTF-16 when called on a UTF-8 NSString. As a result, the string is needlessly converted on every call, which is terribly slow.

To make matters worse, the resulting string buffer is not cached for the lifetime of the RKEnumerator, so it is recreated over and over. We've tweaked the code to cache this buffer (and changed calls to RKutf8to16 to RKConvertUTF8ToUTF16RangeForStringBuffer, etc.) and saw a speed increase of more than 500%.
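
To illustrate the caching idea in isolation, here is a minimal sketch in plain C. It is not the RegexKit patch: the Enumerator struct, utf16_to_utf8() and enumerator_utf8() are hypothetical names invented for this example, and the converter is a simplified stand-in for whatever conversion the real string buffer performs. The point is only that the conversion runs once and every later call reuses the stored buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an enumerator; not RegexKit's real layout. */
typedef struct {
    const uint16_t *utf16;       /* source text as UTF-16 code units */
    size_t          utf16_len;   /* number of code units */
    char           *utf8;        /* cached UTF-8 copy, NULL until first use */
    size_t          utf8_len;    /* byte length of the cached copy */
    unsigned        conversions; /* how many conversions actually ran */
} Enumerator;

/* Simplified UTF-16 to UTF-8 conversion. Unpaired surrogates are not
 * rejected here; a production converter would have to handle them. */
static char *utf16_to_utf8(const uint16_t *src, size_t n, size_t *out_len)
{
    /* Worst case is 3 bytes per code unit (a surrogate pair is 2 units, 4 bytes). */
    char *dst = malloc(n * 3 + 1);
    size_t j = 0;
    if (dst == NULL) return NULL;
    for (size_t i = 0; i < n; i++) {
        uint32_t cp = src[i];
        /* Combine a high/low surrogate pair into one code point. */
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < n &&
            src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (src[i + 1] - 0xDC00);
            i++;
        }
        if (cp < 0x80) {
            dst[j++] = (char)cp;
        } else if (cp < 0x800) {
            dst[j++] = (char)(0xC0 | (cp >> 6));
            dst[j++] = (char)(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            dst[j++] = (char)(0xE0 | (cp >> 12));
            dst[j++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[j++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[j++] = (char)(0xF0 | (cp >> 18));
            dst[j++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[j++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[j++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    dst[j] = '\0';
    *out_len = j;
    return dst;
}

/* Returns the UTF-8 buffer, converting only on the first call. */
const char *enumerator_utf8(Enumerator *e)
{
    assert(e != NULL);
    if (e->utf8 == NULL) {
        e->utf8 = utf16_to_utf8(e->utf16, e->utf16_len, &e->utf8_len);
        if (e->utf8 != NULL) e->conversions++;
    }
    return e->utf8;
}

/* Releases the cached buffer when the enumerator goes away. */
void enumerator_free(Enumerator *e)
{
    free(e->utf8);
    e->utf8 = NULL;
    e->utf8_len = 0;
}
```

Calling enumerator_utf8() repeatedly on the same Enumerator returns the same pointer each time, and the conversions counter stays at 1. The cache is only valid if the source string does not change while the enumerator is alive.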

Furthermore, it may be useful to let users explicitly pass the string encoding to the various RegexKit methods, avoiding the very expensive conversion to UTF-8 when it isn't needed.
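
As a rough sketch of what such an encoding parameter could look like, the plain C example below borrows the caller's bytes when they are declared to be UTF-8 and converts only otherwise. Everything here is hypothetical and not part of RegexKit: TextEncoding, Utf8View, utf8_view_make() and utf8_view_free() are names made up for illustration, and Latin-1 is used only as a simple stand-in for "some encoding that is not UTF-8".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical encoding hint supplied by the caller. */
typedef enum { ENC_UTF8, ENC_LATIN1 } TextEncoding;

/* A view of text as UTF-8. `owned` is non-NULL only when a conversion
 * had to allocate a new buffer. */
typedef struct {
    const char *utf8;
    size_t      len;
    char       *owned;
} Utf8View;

Utf8View utf8_view_make(const void *bytes, size_t len, TextEncoding enc)
{
    Utf8View v = { NULL, 0, NULL };
    assert(bytes != NULL);

    if (enc == ENC_UTF8) {
        /* Caller says the bytes are already UTF-8: borrow them, no copy. */
        v.utf8 = (const char *)bytes;
        v.len = len;
        return v;
    }

    /* Latin-1 path: each input byte becomes at most two UTF-8 bytes. */
    const unsigned char *src = (const unsigned char *)bytes;
    char *dst = malloc(len * 2 + 1);
    size_t j = 0;
    if (dst == NULL) return v;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            dst[j++] = (char)c;
        } else {
            dst[j++] = (char)(0xC0 | (c >> 6));
            dst[j++] = (char)(0x80 | (c & 0x3F));
        }
    }
    dst[j] = '\0';
    v.utf8 = dst;
    v.len = j;
    v.owned = dst;
    return v;
}

/* Frees the buffer only if the view allocated one. */
void utf8_view_free(Utf8View *v)
{
    free(v->owned);
    v->utf8 = NULL;
    v->owned = NULL;
    v->len = 0;
}
```

With the hint set to ENC_UTF8 the function does no allocation at all, which is the saving the suggestion is after. The trade-off is that the library then trusts the caller's claim about the encoding and would need to validate it, or document that it does not.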

Thanks,
Will

will@panic.com
