Thread: [Camelbones-devel] CamelBones && UTF-8
From: Sherm P. <sh...@do...> - 2004-02-04 21:15:53
I've converted all of the cString, stringWithCString, and other such calls to their UTF8 equivalents. Nothing appears to blow up - both my test apps, ShuX and RendezvousBrowser, continue to work normally under both Panther and Jaguar.

Anyone want to check out the latest CVS code and give it a whirl? Keep in mind that it takes sourceforge a while to propagate changes from the developers' CVS to the anonymous CVS... None of today's checkins have shown up there yet. ;-(

sherm--
From: Tom I. <to...@je...> - 2004-02-04 22:53:15
On Feb 4, 2004, at 21:15, Sherm Pendley wrote:

> Anyone want to check out the latest CVS code and give it a whirl? Keep
> in mind that it takes sourceforge a while to propagate changes from
> the developers' CVS to the anonymous CVS... None of today's checkins
> have shown up there yet. ;-(

Love to. How? Heh. I can check the CVS out, but the CamelBones.pbproj folder is empty, 'make' won't work as I have no GNUstep files, and I run out of ideas at that point.

And no, there's no overloading stuff in there as yet, certainly...

.tom
From: Sherm P. <sh...@do...> - 2004-02-05 00:39:30
On Feb 4, 2004, at 5:53 PM, Tom Insam wrote:

> Love to. How? Heh. I can check the CVS out, but the CamelBones.pbproj
> folder is empty

If you use the 'checkout -P' option, cvs will prune the empty directories.

> I have no GNUstep files

They're included in the CBDevInstall package, but they're optional. You can install them separately, without having to re-install the whole thing - just look in the Packages directory for GNUStep-Makefiles.pkg.

sherm--
From: Tom I. <to...@je...> - 2004-02-05 10:51:41
On Feb 5, 2004, at 0:39, Sherm Pendley wrote:

> They're included in the CBDevInstall package, but they're optional.
> You can install them separately, without having to re-install the
> whole thing - just look in the Packages directory for
> GNUStep-Makefiles.pkg.

Ah, happy happy. Build works, overloading works, great.

.tom
From: Sherm P. <sh...@do...> - 2004-02-05 13:11:42
On Feb 5, 2004, at 5:51 AM, Tom Insam wrote:

> Ah, happy happy. Build works, overloading works, great.

Music to my ears!

Were you reading this list when folks were trying to use PB to build the 0.3.0-pre releases on Jaguar? It wasn't pretty. I honestly don't know if anyone managed it - and some very smart people tried pretty hard.

sherm--
From: Sherm P. <sh...@do...> - 2004-02-06 14:46:54
On Feb 6, 2004, at 6:54 AM, Thilo Planz wrote:

> For me, this crashes like this:

I think I've found and fixed the problem.

The problem occurred when newSVpv() was called to create a new Perl scalar from an NSString. The second parameter needs to be the length of the string in bytes, not characters, and what was passed was the result of calling the -length method on the NSString - which returns characters. That worked fine with C strings, because characters and bytes are the same, but with UTF8 strings they're not always the same, so it crashed.

I've checked the fix into CVS. If you don't want to wait for anonymous CVS to catch up, open up Conversions.m and find this line (102):

    SV *newSV = newSVpv([(NSString *)target UTF8String], [(NSString *)target length]);

Replace it with:

    const char *u = [(NSString *)target UTF8String];
    int len = strlen(u);
    SV *newSV = newSVpv(u, len);

As you can see, the solution is to use the C function strlen() on the UTF8String, instead of NSString's -length method. The test now works, and I was able to concatenate and output even strings that use completely different character sets, such as Japanese and Russian.

It appears that Perl's length() function is broken with respect to UTF8 strings, even in 5.8. It returns the length in bytes, not characters. It appeared to give the correct results before, but that was the result of the string being incorrectly truncated, so that the Perl string's size in bytes was the original string's size in characters.

This might also be a consequence of the way I'm creating the UTF8 strings. There appear to be a number of new functions listed in perlapi that relate to the handling of UTF8 strings, specifically flagging them as such so that other functions can correctly differentiate between bytes and characters. Perhaps I can use the newer api when CB is being compiled against 5.8, to improve the level of support for that version.

sherm--
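For anyone following along who wants to see the bytes-versus-characters mismatch in isolation, here is a minimal plain C sketch. It is not code from CamelBones; the function names utf8_byte_length and utf8_char_count are made up for illustration. It assumes the input is valid, NUL-terminated UTF-8, and it counts code points, which should match what NSString's -length reports for text in the Basic Multilingual Plane.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length in bytes: this is the kind of value newSVpv() expects
   as its second argument. */
size_t utf8_byte_length(const char *s) {
    assert(s != NULL);
    return strlen(s);
}

/* Length in code points, assuming valid UTF-8 input.
   Continuation bytes have the bit pattern 10xxxxxx, so every
   byte that does NOT match that pattern starts a new code point. */
size_t utf8_char_count(const char *s) {
    assert(s != NULL);
    size_t count = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; ++p) {
        if ((*p & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For pure ASCII the two functions agree, which is why the old code looked fine with C strings. For a two-character Japanese string encoded as six bytes, the first returns 6 and the second returns 2, so passing the character count as a byte length would hand Perl only the first third of the buffer.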
From: Sherm P. <sh...@do...> - 2004-02-06 15:44:20
On Feb 6, 2004, at 9:46 AM, Sherm Pendley wrote:

> This might also be a consequence of the way I'm creating the UTF8
> strings. There appear to be a number of new functions listed in
> perlapi that relate to the handling of UTF8 strings, specifically
> flagging them as such so that other functions can correctly
> differentiate between bytes and characters. Perhaps I can use the
> newer api when CB is being compiled against 5.8, to improve the level
> of support for that version.

The above is definitely the case. I've found the call that will correctly flag the Perl string as UTF8, so that Perl's length() function (and presumably other Perl functions) once again returns the correct results.

As soon as I figure out the proper version-checking #ifdefs to wrap the code in, so that it continues to compile against Perl 5.6, I'll check it in to CVS.

sherm--