From: Jeff W. <js...@fa...> - 2009-01-29 19:01:54
Christopher Barker wrote:
> Jeff Whitaker wrote:
>
>> John: 'rU' apparently doesn't work for gzipped text files (at least
>> in python 2.5.2). I had to change the default in back to 'r' when
>> using gzip.open (r6846 in trunk).
>
> darn -- sounds like a bug/missing feature in the gzip module. Strange,
> though, unknown flags seem to be ignored by file(), and gzip.open
> seems to ignore the 'U' too, in my tests (see below).
>
> I think having the 'U' ignored is less than optimal, but doesn't make
> anything worse than it is. What problems did you have?

Chris: Here's a self-contained example of the problem (data file attached):

>>> import gzip
>>> f = gzip.open('etopo20lats.gz','rU')
>>> print f.readline()
Traceback (most recent call last):
  File "testread.py", line 3, in <module>
    print f.readline()
  File "/sw/lib/python2.5/gzip.py", line 399, in readline
    c = self.read(readsize)
  File "/sw/lib/python2.5/gzip.py", line 227, in read
    self._read(readsize)
  File "/sw/lib/python2.5/gzip.py", line 279, in _read
    uncompress = self.decompress.decompress(buf)
zlib.error: Error -3 while decompressing: invalid distance too far back

>>> import gzip
>>> f = gzip.open('etopo20lats.gz','r')
>>> print f.readline()
'-8.983333330000000672e+01'

-Jeff

>
> tests (on an OS-X system - native unix newlines):
> (python 2.5.2)
>
> >>> file('test_newlines.txt', 'rb').read()
> 'line 1: unix \nline 2: dos \r\nline 3: mac \rline 4: unix \n'
> >>> file('test_newlines.txt', 'r').read()
> 'line 1: unix \nline 2: dos \r\nline 3: mac \rline 4: unix \n'
> >>> file('test_newlines.txt', 'U').read()
> 'line 1: unix \nline 2: dos \nline 3: mac \nline 4: unix \n'
>
> # so file() does the right thing
>
> >>> gzip.open('test_newlines.txt.gz', 'rb').read()
> 'line 1: unix \nline 2: dos \r\nline 3: mac \rline 4: unix \n'
> >>> gzip.open('test_newlines.txt.gz', 'r').read()
> 'line 1: unix \nline 2: dos \r\nline 3: mac \rline 4: unix \n'
> >>> gzip.open('test_newlines.txt.gz', 'rU').read()
> 'line 1: unix \nline 2: dos \r\nline 3: mac \rline 4: unix \n'
>
> # gzip.open() appears to ignore the 'U' flag -- too bad!
>
> should we post a bug report/feature request to Python?
>
> -Chris

--
Jeffrey S. Whitaker         Phone  : (303)497-6313
Meteorologist               FAX    : (303)497-6449
NOAA/OAR/PSD  R/PSD1        Email  : Jef...@no...
325 Broadway                Office : Skaggs Research Cntr 1D-113
Boulder, CO, USA 80303-3328 Web    : http://tinyurl.com/5telg
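
For readers on newer interpreters, here is a minimal sketch of universal-newline reading from a gzipped text file. It assumes Python 3.3 or later, where gzip.open accepts a text mode ('rt') and a newline argument; it does not change the Python 2.5 behaviour discussed in this thread. The helper name read_text_universal and the temporary test file are illustrative only and do not come from the thread.

```python
import gzip
import os
import tempfile


def read_text_universal(path, encoding="ascii"):
    """Read a gzipped text file, translating \\r\\n and \\r to \\n.

    Hypothetical helper: newline=None asks the text layer that
    gzip.open wraps around the decompressed stream to apply
    universal-newline translation on read.
    """
    with gzip.open(path, "rt", encoding=encoding, newline=None) as f:
        return f.read()


# Build a small gzipped file with mixed line endings, mirroring the
# test_newlines.txt content used in the tests quoted above.
raw = b"line 1: unix \nline 2: dos \r\nline 3: mac \rline 4: unix \n"
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "test_newlines.txt.gz")
with gzip.open(path, "wb") as f:
    f.write(raw)

text = read_text_universal(path)
print(repr(text))
```

Opening the same file with 'rb' instead returns the bytes unchanged, carriage returns included, so the translation is something the caller has to ask for through text mode.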