[Afpfs-ng-devel] precompose and decompose
From: HAT <ha...@fa...> - 2008-03-27 18:52:49
Hi.

I'm reading the source of afpfs-ng 0.8.1 for the first time. I think there are problems in precomposing and decomposing:

1) precomposing exactly two characters: supported
2) precomposing more than two characters: unsupported
3) decomposing: sample only
4) Hangul: unsupported
5) Unicode U+010000 and above: unsupported
6) Mac codepage for AFP2: unsupported

Items 2), 3) and 4) can be implemented comparatively easily, because I did them for netatalk.

There are two methods to support U+010000 and above:

a) using surrogate pairs
b) using UCS4 instead of UCS2

Surrogate pairs are dirty and complex. Because netatalk 2.1dev uses surrogate pairs, it is difficult to support U+010000 and above there. If we use UCS4 instead of UCS2, the implementation will be easy.

Replace:

from
    char16 *UTF8toUCS2(str)
to
    u_int32_t *UTF8toUCS4(str)

from
    int UCS2precompose(first, second)
to
    u_int64_t UCS4precompose(first, second)

from
    // worst case: 3 bytes of UTF8 per UCS2 char + terminal 0
to
    // worst case: 4 bytes of UTF8 per UCS4 char + terminal 0

The size of table[] doubles, because each field becomes twice as wide:

    static struct {
        u_int32_t precomposed;  // one UCS4 code point
        u_int64_t pattern;      // two UCS4 code points, first in the high 32 bits
    } table[] = {
        { 0x00000000, 0x0000000000000000},  // Dummy entry table[0]
        { 0x000000C0, 0x0000004100000300},
        { 0x000000C1, 0x0000004100000301},
        { 0x000000C2, 0x0000004100000302},
        (snip)
        { 0x0001D1BF, 0x0001D1BB0001D16F},
        { 0x0001D1BE, 0x0001D1BC0001D16E},
        { 0x0001D1C0, 0x0001D1BC0001D16F},
    };

PS. Don't trust Apple's documents.

http://developer.apple.com/technotes/tn/tn1150table.html
This table is based on Unicode 2.x.

http://developer.apple.com/documentation/Networking/Conceptual/AFP/AFP3_1.pdf
This document is based on Unicode 3.2.

Mac OS X 10.5.2 Leopard uses a newer Unicode version. For example, 0x1B06 decomposes to 0x1B05 0x1B35. This is not in Unicode 3.2.

-- 
HAT <ha...@fa...>
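
As an illustration of the UCS4 approach described above, here is a minimal sketch in C. It is not the afpfs-ng or netatalk code. The names `pack_pair`, `hangul_compose`, `ucs4_precompose_sketch` and `sample_table` are hypothetical, the table holds only the six entries quoted in the post, and the function returns a 32-bit code point (0 for "no composition"), which is an assumption rather than the signature proposed above. The sample table is kept sorted by pattern so that a binary search works. Hangul is handled with the standard arithmetic composition from the Unicode Standard instead of table entries.

```c
#include <stddef.h>
#include <stdint.h>

/* Hangul composition constants from the Unicode Standard. */
#define S_BASE  0xAC00u
#define L_BASE  0x1100u
#define V_BASE  0x1161u
#define T_BASE  0x11A7u
#define L_COUNT 19u
#define V_COUNT 21u
#define T_COUNT 28u
#define S_COUNT (L_COUNT * V_COUNT * T_COUNT)

/* Pack two UCS4 code points into one 64-bit key, first in the high half. */
static uint64_t pack_pair(uint32_t first, uint32_t second)
{
    return ((uint64_t)first << 32) | (uint64_t)second;
}

/* Sample entries only (hypothetical subset), sorted by pattern. */
static const struct {
    uint32_t precomposed;
    uint64_t pattern;
} sample_table[] = {
    { 0x000000C0, 0x0000004100000300ULL },
    { 0x000000C1, 0x0000004100000301ULL },
    { 0x000000C2, 0x0000004100000302ULL },
    { 0x0001D1BF, 0x0001D1BB0001D16FULL },
    { 0x0001D1BE, 0x0001D1BC0001D16EULL },
    { 0x0001D1C0, 0x0001D1BC0001D16FULL },
};

/* Compose a Hangul pair arithmetically; returns 0 if the pair is not Hangul. */
static uint32_t hangul_compose(uint32_t first, uint32_t second)
{
    /* Leading consonant + vowel -> LV syllable. */
    if (first >= L_BASE && first < L_BASE + L_COUNT &&
        second >= V_BASE && second < V_BASE + V_COUNT) {
        return S_BASE + ((first - L_BASE) * V_COUNT + (second - V_BASE)) * T_COUNT;
    }
    /* LV syllable + trailing consonant -> LVT syllable. */
    if (first >= S_BASE && first < S_BASE + S_COUNT &&
        (first - S_BASE) % T_COUNT == 0 &&
        second > T_BASE && second < T_BASE + T_COUNT) {
        return first + (second - T_BASE);
    }
    return 0;
}

/* Return the precomposed code point for (first, second), or 0 if none. */
uint32_t ucs4_precompose_sketch(uint32_t first, uint32_t second)
{
    uint32_t hangul = hangul_compose(first, second);
    if (hangul != 0)
        return hangul;

    uint64_t key = pack_pair(first, second);
    size_t lo = 0;
    size_t hi = sizeof sample_table / sizeof sample_table[0];

    /* Binary search over the pattern column. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (sample_table[mid].pattern == key)
            return sample_table[mid].precomposed;
        if (sample_table[mid].pattern < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}
```

To precompose more than two characters with a pairwise function like this, the caller can feed each result back in as `first` together with the next code point, and stop when 0 is returned.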