From: Mathew Y. <ma...@fu...> - 2002-02-18 20:23:29
Has anyone checked out VMaps at http://snafu.freedom.org/Vmaps/ ??
This might be what you're looking for.

Mathew

> (I thought I had sent this mail on January 30, but I guess I was
> mistaken.)
>
> Eric Nodwell writes:
> > Since I have a 2.4GB data file handy, I thought I'd try this
> > package with it. (Normally I process this data file by reading
> > it in a chunk at a time, which is perfectly adequate.) Not
> > surprisingly, it chokes:
>
> Yep, that's pretty much what I expected. I think that adding code to
> support mapping some arbitrary part of a file should be fairly
> straightforward --- do you want to run the tests if I write the code?
>
> > File "/home/eric/lib/python2.2/site-packages/maparray.py", line 15,
> > in maparray
> > m = mmap.mmap(fn, os.fstat(fn)[stat.ST_SIZE])
> > OverflowError: memory mapped size is too large (limited by C int)
>
> This error message's wording led me to something that was *not* what I
> expected.
>
> That's a sort of alarming message --- it suggests that it won't work
> on >2G files even on LP64 systems, where longs and pointers are 64
> bits but ints are 32 bits. The comments in the mmap module say:
>
> The map size is restricted to [0, INT_MAX] because this is the current
> Python limitation on object sizes. Although the mmap object *could* handle
> a larger map size, there is no point because all the useful operations
> (len(), slicing(), sequence indexing) are limited by a C int.
>
> Horrifyingly, this is true. Even the buffer interface function
> arrayfrombuffer uses to get the size of the buffer return int sizes,
> not size_t sizes. This is a serious bug in the buffer interface, IMO,
> and I doubt it will be fixed --- the buffer interface is apparently
> due for a revamp soon at any rate, so little changes won't be
> welcomed, especially if they break binary backwards compatibility, as
> this one would on LP64 platforms.
>
> Fixing this, so that LP64 Pythons can mmap >2G files (their
> birthright!), is a bit of work --- probably a matter of writing a
> modified mmap() module that supports a saner version of the buffer
> interface (with named methods instead of a type object slot), and
> can't be close()d, to boot.
>
> Until then, this module only lets you memory-map files up to two gigs.
>
> > (details: Python 2.2, numpy 20.3, Pentium III, Debian Woody, Linux
> > kernel 2.4.13, gcc 2.95.4)
>
> My kernel is 2.4.13 too, but I don't have any large files, and I don't
> know whether any of my kernel, my libc, or my Python even support
> them.
>
> > I'm not a big C programmer, but I wonder if there is some way for
> > this package to overcome the 2GB limit on 32-bit systems. That
> > could be useful in some situations.
>
> I don't know, but I think it would probably require extensive code
> changes throughout Numpy.
>
> --
> <kr...@po...> Kragen Sitaker <http://www.pobox.com/~kragen/>
> The sages do not believe that making no mistakes is a blessing. They believe,
> rather, that the great virtue of man lies in his ability to correct his
> mistakes and continually make a new man of himself. -- Wang Yang-Ming
>
> _______________________________________________
> Numpy-discussion mailing list
> Num...@li...
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
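
For anyone reading this thread later: the "mapping some arbitrary part of a file" idea from the quoted message can be sketched roughly as below. This is only an illustration, not the maparray code. It assumes a much newer Python than the 2.2 discussed here, one where mmap.mmap accepts an offset argument and exposes mmap.ALLOCATIONGRANULARITY. The helper names map_window and iter_windows are made up for this example.

```python
import mmap
import os


def map_window(path, offset, length):
    """Map `length` bytes of `path` starting at byte `offset`, read-only.

    Returns (m, delta): the mmap object and the index inside it where
    the requested bytes begin. Hypothetical helper, for illustration.
    """
    # The offset passed to mmap must be a multiple of the allocation
    # granularity, so round down and remember the difference.
    gran = mmap.ALLOCATIONGRANULARITY
    aligned = (offset // gran) * gran
    delta = offset - aligned
    with open(path, "rb") as f:
        m = mmap.mmap(f.fileno(), delta + length,
                      access=mmap.ACCESS_READ, offset=aligned)
    return m, delta


def iter_windows(path, window_size):
    """Yield (start, data) pairs covering the file one window at a time."""
    size = os.path.getsize(path)
    for start in range(0, size, window_size):
        length = min(window_size, size - start)
        m, delta = map_window(path, start, length)
        try:
            # Slicing copies the bytes out, so the map can be closed.
            yield start, m[delta:delta + length]
        finally:
            m.close()
```

Each window stays far below the int-sized limit in the error message above, so this is closer to Eric's chunk-at-a-time reading than to a single map of the whole 2.4GB file. Whether offsets past 2GB work still depends on large-file support in the kernel, libc, and Python build, as noted in the quoted reply.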