[KoCo-CVS] [Commit] cjkcodecs NOTES.utf CHANGES MANIFEST.in ROADMAP setup.py
From: Hye-Shik C. <pe...@us...> - 2003-07-19 10:46:22
perky       03/07/19 03:46:08

Modified:    .  CHANGES MANIFEST.in ROADMAP setup.py
Added:       .  NOTES.utf
Log:
Remove UTF-16 codec and explain why we still keep utf-7 and utf-8 codecs.

Revision  Changes    Path
1.18      +8 -0      cjkcodecs/CHANGES

Index: CHANGES
===================================================================
RCS file: /cvsroot/koco/cjkcodecs/CHANGES,v
retrieving revision 1.17
retrieving revision 1.18
diff -u -r1.17 -r1.18
--- CHANGES 12 Jul 2003 15:09:47 -0000 1.17
+++ CHANGES 19 Jul 2003 10:46:08 -0000 1.18
@@ -1,3 +1,11 @@
+Changes with CJKCodecs 1.0
+
+  *) UTF-16, UTF-8 codec is removed from distribution.
+
+  *) Fixed UTF-7 codec's bug that fails to decode surrogate pair on
+     ucs4-python
+
+
 Changes with CJKCodecs 1.0b1
 
   *) SHIFT-JISX0213, EUC-JISX0213, ISO-2022-JP-2 and ISO-2022-JP-3

1.4       +2 -2      cjkcodecs/MANIFEST.in

Index: MANIFEST.in
===================================================================
RCS file: /cvsroot/koco/cjkcodecs/MANIFEST.in,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -r1.3 -r1.4
--- MANIFEST.in 10 Jun 2003 11:25:52 -0000 1.3
+++ MANIFEST.in 19 Jul 2003 10:46:08 -0000 1.4
@@ -1,7 +1,7 @@
-# $Id: MANIFEST.in,v 1.3 2003/06/10 11:25:52 perky Exp $
+# $Id: MANIFEST.in,v 1.4 2003/07/19 10:46:08 perky Exp $
 include README ROADMAP AUTHORS COPYRIGHT THANKS
-include MANIFEST.in
+include MANIFEST.in NOTES.*
 recursive-include src *.h *.c
 recursive-include tests *.py *.txt *.utf8 *.sh

1.6       +1 -2      cjkcodecs/ROADMAP

Index: ROADMAP
===================================================================
RCS file: /cvsroot/koco/cjkcodecs/ROADMAP,v
retrieving revision 1.5
retrieving revision 1.6
diff -u -r1.5 -r1.6
--- ROADMAP 12 Jul 2003 03:55:42 -0000 1.5
+++ ROADMAP 19 Jul 2003 10:46:08 -0000 1.6
@@ -27,8 +27,7 @@
 euc-tw
 Unicode.org
 utf-8
 utf-7
- utf-16
-# $Id: ROADMAP,v 1.5 2003/07/12 03:55:42 perky Exp $
+# $Id: ROADMAP,v 1.6 2003/07/19 10:46:08 perky Exp $
 # ex: ts=8 sts=4 et

1.33      +2 -2      cjkcodecs/setup.py

Index: setup.py
===================================================================
RCS file: /cvsroot/koco/cjkcodecs/setup.py,v
retrieving revision 1.32
retrieving revision 1.33
diff -u -r1.32 -r1.33
--- setup.py 12 Jul 2003 19:12:13 -0000 1.32
+++ setup.py 19 Jul 2003 10:46:08 -0000 1.33
@@ -27,7 +27,7 @@
 # IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
 # POSSIBILITY OF SUCH DAMAGE.
 #
-# $Id: setup.py,v 1.32 2003/07/12 19:12:13 perky Exp $
+# $Id: setup.py,v 1.33 2003/07/19 10:46:08 perky Exp $
 #
 import sys
@@ -43,7 +43,7 @@
 'ko_KR': ['euc_kr', 'cp949', 'johab', 'iso_2022_kr'],
 'zh_CN': ['gb2312', 'gbk', 'gb18030', 'hz'],
 'zh_TW': ['big5', 'cp950'],
-'': ['utf_7', 'utf_8', 'utf_16', 'utf_16be', 'utf_16le'],
+'': ['utf_7', 'utf_8'],
 }
 locales = encodings.keys()

1.1                  cjkcodecs/NOTES.utf

Index: NOTES.utf
===================================================================
CJKCodecs is distributed with utf-7 and utf-8 codec in spite of Python
has their own already. Here're my cowardly rationales for that.

 - Python UTF-7 codec can't encode and/or decode surrogate pair.
 - Python UTF-7 codec can't decode long shifted sequence.
 - Python UTF-7 codec isn't stateful, so its StreamReader and StreamWriter
   can't work correctly.
 - Python UTF-8 codec is slightly broken for StreamReader's readline and
   readlines method calls. For example,

   >>> import StringIO, codecs
   >>> c = codecs.getreader('utf-8')(StringIO.StringIO("Python\xed\x8c\x8c\xec\x9d\xb4\xec\x8d\xac"))
   >>> c.readline(1), c.readline(1), c.readline(1)
   u'P', u'y', u't'
   >>> c.readline(1), c.readline(1), c.readline(1)
   (u'h', u'o', u'n')
   >>> c.readline(1), c.readline(1), c.readline(1)
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
     File "/usr/local/lib/python2.2/codecs.py", line 252, in readline
       return self.decode(line, self.errors)[0]
   UnicodeError: UTF-8 decoding error: unexpected end of data
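For readers who want to see what "stateful" means in the NOTES.utf rationale, here is a minimal sketch. It assumes a modern Python 3 interpreter and the standard library's `codecs.getincrementaldecoder`, neither of which is part of this commit, and it is not the CJKCodecs implementation. The helper name `read_chars_bytewise` is made up for illustration. It feeds the same byte string used in the note to the decoder one byte at a time. A stateful decoder keeps an incomplete multibyte sequence as internal state, so it does not fail with "unexpected end of data".

```python
import codecs
import io

# Same bytes as in NOTES.utf: "Python" followed by three Hangul syllables,
# each encoded as a three-byte UTF-8 sequence.
DATA = b"Python\xed\x8c\x8c\xec\x9d\xb4\xec\x8d\xac"


def read_chars_bytewise(raw):
    """Decode a UTF-8 byte string while reading only one byte per step.

    Hypothetical helper for illustration; it mimics a reader that may be
    handed a chunk ending in the middle of a multibyte sequence.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    stream = io.BytesIO(raw)
    chars = []
    while True:
        chunk = stream.read(1)
        if not chunk:
            break
        # For an incomplete sequence the decoder returns "" and remembers
        # the pending bytes; the character comes out once it is complete.
        chars.extend(decoder.decode(chunk))
    # final=True makes the decoder raise if the input stopped mid-sequence.
    chars.extend(decoder.decode(b"", final=True))
    return chars


chars = read_chars_bytewise(DATA)
print(len(chars), "characters decoded")
```

A stateless decoder handed `b"\xed"` alone has no way to wait for the remaining two bytes. That is the failure shown in the traceback in the note.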