[KoCo-CVS] [Commit] cjkcodecs NOTES.cp932 CHANGES NOTES.big5
From: Hye-Shik C. <pe...@us...> - 2003-06-20 09:04:54
perky       03/06/20 02:04:52

Modified:    .        CHANGES NOTES.big5
Added:       .        NOTES.cp932
Log:
- Tweaked some mapping for cp932 and cp950 to make more consistency
  with MS Windows.
  - CP932: Added single byte "UNDEFINED" characters 0x80, 0xa0, 0xfd,
    0xfe, 0xff (documented on NOTES.cp932)
  - CP950: Changed encode mappings to another more popular for
    duplicated unicode points: 5341 -> A451, 5345 -> A4CA
- A unittest for big5 mapping is added.
- Fixed a bug that cp932 codec couldn't decode half-width katakana.

Revision  Changes    Path
1.3       +11 -0     cjkcodecs/CHANGES

Index: CHANGES
===================================================================
RCS file: /cvsroot/koco/cjkcodecs/CHANGES,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -r1.2 -r1.3
--- CHANGES	19 Jun 2003 19:12:58 -0000	1.2
+++ CHANGES	20 Jun 2003 09:04:52 -0000	1.3
@@ -5,3 +5,14 @@
   *) Fixed a bug that JIS X 0201 routine doesn't encode and decode 0x7f.

+  *) Tweaked some mapping for cp932 and cp950 to make more consistency
+     with MS Windows.
+     - CP932: Added single byte "UNDEFINED" characters 0x80, 0xa0, 0xfd,
+       0xfe, 0xff (documented on NOTES.cp932)
+     - CP950: Changed encode mappings to another more popular for
+       duplicated unicode points: 5341 -> A451, 5345 -> A4CA
+
+  *) A unittest for big5 mapping is added.
+
+  *) Fixed a bug that cp932 codec couldn't decode half-width katakana.
+

1.3       +11 -10    cjkcodecs/NOTES.big5

Index: NOTES.big5
===================================================================
RCS file: /cvsroot/koco/cjkcodecs/NOTES.big5,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -r1.2 -r1.3
--- NOTES.big5	19 Jun 2003 18:02:11 -0000	1.2
+++ NOTES.big5	20 Jun 2003 09:04:52 -0000	1.3
@@ -1,15 +1,16 @@
 big5 codec maps the following characters as cp950 does rather than
 conforming Unicode.org's that maps to 0xFFFD.

-BIG5        Unicode     Description
+    BIG5        Unicode     Description

-0xA15A      0x2574      SPACING UNDERSCORE
-0xA1C3      0xFFE3      SPACING HEAVY OVERSCORE
-0xA1C5      0x02CD      SPACING HEAVY UNDERSCORE
-0xA1FE      0xFF0F      LT DIAG UP RIGHT TO LOW LEFT
-0xA240      0xFF3C      LT DIAG UP LEFT TO LOW RIGHT
-0xA2CC      0x5341      HANGZHOU NUMERAL TEN
-0xA2CE      0x5345      HANGZHOU NUMERAL THIRTY
+    0xA15A      0x2574      SPACING UNDERSCORE
+    0xA1C3      0xFFE3      SPACING HEAVY OVERSCORE
+    0xA1C5      0x02CD      SPACING HEAVY UNDERSCORE
+    0xA1FE      0xFF0F      LT DIAG UP RIGHT TO LOW LEFT
+    0xA240      0xFF3C      LT DIAG UP LEFT TO LOW RIGHT
+    0xA2CC      0x5341      HANGZHOU NUMERAL TEN
+    0xA2CE      0x5345      HANGZHOU NUMERAL THIRTY

-Because unicode 0x5341, 0x5345 is mapped to another big5 codes already,
-a roundtrip compatibility is not guaranteed for them.
+Because unicode 0x5341, 0x5345, 0xFF0F, 0xFF3C is mapped to another
+big5 codes already, a roundtrip compatibility is not guaranteed for
+them.

1.1                  cjkcodecs/NOTES.cp932

Index: NOTES.cp932
===================================================================
To conform to Windows's real mapping, cp932 codec maps the following
codepoints in addition of the official cp932 mapping.

    CP932     Unicode     Description

    0x80      0x80        UNDEFINED
    0xA0      0xF8F0      UNDEFINED
    0xFD      0xF8F1      UNDEFINED
    0xFE      0xF8F2      UNDEFINED
    0xFF      0xF8F3      UNDEFINED
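
The round-trip caveat in NOTES.big5 and the extra single-byte entries in NOTES.cp932 can be illustrated with a small Python sketch. The tables hold only values quoted in this commit, plus the assumption that 0xA451 and 0xA4CA are the ordinary Big5 codes that already decode to U+5341 and U+5345. The function names are hypothetical and are not part of the cjkcodecs API. The encode targets for U+FF0F and U+FF3C are not given in the commit, so they are left out.

```python
# Illustrative tables only; not the cjkcodecs implementation.

# NOTES.cp932: extra single-byte decode mappings (cp932 byte -> Unicode).
CP932_UNDEFINED = {
    0x80: 0x0080,
    0xA0: 0xF8F0,
    0xFD: 0xF8F1,
    0xFE: 0xF8F2,
    0xFF: 0xF8F3,
}

# NOTES.big5: two Big5 codes decode to each of U+5341 and U+5345.
# Assumption: 0xA451 and 0xA4CA are the pre-existing codes for them.
BIG5_DECODE = {
    0xA2CC: 0x5341,  # HANGZHOU NUMERAL TEN
    0xA2CE: 0x5345,  # HANGZHOU NUMERAL THIRTY
    0xA451: 0x5341,
    0xA4CA: 0x5345,
}

# CHANGES: the encoder now prefers the more popular code.
BIG5_ENCODE = {
    0x5341: 0xA451,
    0x5345: 0xA4CA,
}


def decode_cp932_single(byte):
    """Decode one of the extra single-byte cp932 values (hypothetical helper)."""
    return chr(CP932_UNDEFINED[byte])


def roundtrips(big5_code):
    """Return True if decoding then re-encoding yields the same Big5 code."""
    return BIG5_ENCODE.get(BIG5_DECODE[big5_code]) == big5_code


if __name__ == "__main__":
    for code in sorted(BIG5_DECODE):
        print(hex(code), "round-trips:", roundtrips(code))
```

With these tables, 0xA2CC and 0xA2CE decode successfully but re-encode as 0xA451 and 0xA4CA, which is the lost round trip the note describes.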