[KoCo-CVS] [Commit] cjkcodecs NOTES.utf CHANGES MANIFEST.in ROADMAP setup.py
From: Hye-Shik C. <pe...@us...> - 2003-07-19 10:46:22
perky 03/07/19 03:46:08
Modified: . CHANGES MANIFEST.in ROADMAP setup.py
Added: . NOTES.utf
Log:
Remove UTF-16 codec and explain why we still keep utf-7 and utf-8 codecs.
Revision Changes Path
1.18 +8 -0 cjkcodecs/CHANGES
Index: CHANGES
===================================================================
RCS file: /cvsroot/koco/cjkcodecs/CHANGES,v
retrieving revision 1.17
retrieving revision 1.18
diff -u -r1.17 -r1.18
--- CHANGES 12 Jul 2003 15:09:47 -0000 1.17
+++ CHANGES 19 Jul 2003 10:46:08 -0000 1.18
@@ -1,3 +1,11 @@
+Changes with CJKCodecs 1.0
+
+ *) UTF-16, UTF-8 codec is removed from distribution.
+
+ *) Fixed UTF-7 codec's bug that fails to decode surrogate pair on
+ ucs4-python
+
+
Changes with CJKCodecs 1.0b1
*) SHIFT-JISX0213, EUC-JISX0213, ISO-2022-JP-2 and ISO-2022-JP-3
1.4 +2 -2 cjkcodecs/MANIFEST.in
Index: MANIFEST.in
===================================================================
RCS file: /cvsroot/koco/cjkcodecs/MANIFEST.in,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -r1.3 -r1.4
--- MANIFEST.in 10 Jun 2003 11:25:52 -0000 1.3
+++ MANIFEST.in 19 Jul 2003 10:46:08 -0000 1.4
@@ -1,7 +1,7 @@
-# $Id: MANIFEST.in,v 1.3 2003/06/10 11:25:52 perky Exp $
+# $Id: MANIFEST.in,v 1.4 2003/07/19 10:46:08 perky Exp $
include README ROADMAP AUTHORS COPYRIGHT THANKS
-include MANIFEST.in
+include MANIFEST.in NOTES.*
recursive-include src *.h *.c
recursive-include tests *.py *.txt *.utf8 *.sh
1.6 +1 -2 cjkcodecs/ROADMAP
Index: ROADMAP
===================================================================
RCS file: /cvsroot/koco/cjkcodecs/ROADMAP,v
retrieving revision 1.5
retrieving revision 1.6
diff -u -r1.5 -r1.6
--- ROADMAP 12 Jul 2003 03:55:42 -0000 1.5
+++ ROADMAP 19 Jul 2003 10:46:08 -0000 1.6
@@ -27,8 +27,7 @@
euc-tw
Unicode.org utf-8 utf-7
- utf-16
-# $Id: ROADMAP,v 1.5 2003/07/12 03:55:42 perky Exp $
+# $Id: ROADMAP,v 1.6 2003/07/19 10:46:08 perky Exp $
# ex: ts=8 sts=4 et
1.33 +2 -2 cjkcodecs/setup.py
Index: setup.py
===================================================================
RCS file: /cvsroot/koco/cjkcodecs/setup.py,v
retrieving revision 1.32
retrieving revision 1.33
diff -u -r1.32 -r1.33
--- setup.py 12 Jul 2003 19:12:13 -0000 1.32
+++ setup.py 19 Jul 2003 10:46:08 -0000 1.33
@@ -27,7 +27,7 @@
# IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
#
-# $Id: setup.py,v 1.32 2003/07/12 19:12:13 perky Exp $
+# $Id: setup.py,v 1.33 2003/07/19 10:46:08 perky Exp $
#
import sys
@@ -43,7 +43,7 @@
'ko_KR': ['euc_kr', 'cp949', 'johab', 'iso_2022_kr'],
'zh_CN': ['gb2312', 'gbk', 'gb18030', 'hz'],
'zh_TW': ['big5', 'cp950'],
-'': ['utf_7', 'utf_8', 'utf_16', 'utf_16be', 'utf_16le'],
+'': ['utf_7', 'utf_8'],
}
locales = encodings.keys()
1.1 cjkcodecs/NOTES.utf
Index: NOTES.utf
===================================================================
CJKCodecs is distributed with its own utf-7 and utf-8 codecs even though
Python already ships both. Here are my cowardly rationales for that:
- Python's UTF-7 codec can't encode or decode surrogate pairs.
- Python's UTF-7 codec can't decode long shifted sequences.
- Python's UTF-7 codec isn't stateful, so its StreamReader and
StreamWriter can't work correctly.
- Python's UTF-8 codec is slightly broken for StreamReader's readline
and readlines method calls. For example,
>>> import StringIO, codecs
>>> c = codecs.getreader('utf-8')(StringIO.StringIO("Python\xed\x8c\x8c\xec\x9d\xb4\xec\x8d\xac"))
>>> c.readline(1), c.readline(1), c.readline(1)
(u'P', u'y', u't')
>>> c.readline(1), c.readline(1), c.readline(1)
(u'h', u'o', u'n')
>>> c.readline(1), c.readline(1), c.readline(1)
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "/usr/local/lib/python2.2/codecs.py", line 252, in readline
return self.decode(line, self.errors)[0]
UnicodeError: UTF-8 decoding error: unexpected end of data
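
The failure above happens because a size-limited read can stop in the middle of a multi-byte UTF-8 sequence, and a stateless decoder has nowhere to keep the leftover bytes. As a rough illustration of what "stateful" means here, the sketch below feeds the same byte string to a decoder one byte at a time. It is written for a modern Python 3 interpreter, not the Python 2.2 shown in the traceback, and it is not the CJKCodecs implementation; the helper name decode_in_chunks is hypothetical and exists only for this example.

```python
import codecs


def decode_in_chunks(data, encoding="utf-8", chunk_size=1):
    """Decode `data` piece by piece, keeping partial sequences between calls.

    Hypothetical helper for illustration only. The incremental decoder
    buffers an incomplete multi-byte sequence internally instead of
    raising an "unexpected end of data" error.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    pieces = []
    for start in range(0, len(data), chunk_size):
        text = decoder.decode(data[start:start + chunk_size])
        if text:  # empty string means the decoder is waiting for more bytes
            pieces.append(text)
    # final=True makes a truly truncated input fail loudly at the end.
    tail = decoder.decode(b"", final=True)
    if tail:
        pieces.append(tail)
    return pieces


# Same bytes as in the session above: "Python" followed by three Hangul
# syllables, each encoded as three bytes.
raw = b"Python\xed\x8c\x8c\xec\x9d\xb4\xec\x8d\xac"
pieces = decode_in_chunks(raw, chunk_size=1)
print(pieces)
```

With chunk_size=1 each Hangul syllable is emitted only after its third byte arrives, which is the behaviour a StreamReader needs in order to honour small size hints safely.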