From: Chris B. <Chr...@no...> - 2004-07-08 23:21:00
Chris Barker wrote:
>> can't you just preallocate the array and read your data directly into it?
>
> The short answer is that I'm not very smart! The longer answer is that
> this is because at first I misunderstood what PyArray_FromDimsAndData
> was for. For ScanFileN, I'll re-do it as you suggest.

I've re-done it. Now I don't double allocate storage for ScanFileN. There was no noticeable difference in performance, but why use memory you don't have to?

For ScanFile, it is unknown at the beginning how big the final array is, so I now have two versions. One is what I had before: it allocates memory in blocks of some Buffersize as it reads the file (now set to 1024 elements). Once it's all read in, it creates an appropriately sized PyArray and copies the data to it. This means two copies of all the data exist until the temporary memory is freed.

I now also have a ScanFile2, which scans the whole file first, then creates a PyArray, and re-reads the file to fill it up. This version takes about twice as long, confirming my expectation that the time to allocate and copy data is tiny compared to reading and parsing the file.

Here's a simple benchmark:

Reading with Standard Python methods
(62936, 2)
it took 2.824013 seconds to read the file with standard Python methods

Reading with FileScan
(62936, 2)
it took 0.400936 seconds to read the file with FileScan

Reading with FileScan2
(62936, 2)
it took 0.752649 seconds to read the file with FileScan2

Reading with FileScanN
(62936, 2)
it took 0.441714 seconds to read the file with FileScanN

So it takes twice as long to count the numbers first, but it's still three times as fast as just doing all this with Python. However, I usually don't think it's worth all this effort for a 3 times speed up, and I tend to make copies of my arrays all over the place with NumPy anyway, so I'm inclined to stick with the first method. Also, if you are really that tight on memory, you could always read it in chunks with ScanFileN.
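For anyone who wants to see the two strategies side by side, here is a rough pure-Python sketch. It is not the actual C code: the function names scan_single_pass and scan_two_pass are made up for illustration, the standard-library array type stands in for the PyArray, and it assumes the file holds whitespace-separated numbers.

```python
import io
from array import array

BUFFER_SIZE = 1024  # elements per temporary block, as mentioned in the message


def scan_single_pass(f, buffer_size=BUFFER_SIZE):
    """Read the file once into fixed-size blocks, then copy into one array.

    Hypothetical stand-in for the first strategy: the data briefly exists
    twice (in the blocks and in the final array).
    """
    blocks = []
    current = array("d")
    for line in f:
        for token in line.split():  # assumption: whitespace-separated numbers
            current.append(float(token))
            if len(current) == buffer_size:
                blocks.append(current)
                current = array("d")
    blocks.append(current)

    result = array("d")
    for block in blocks:
        result.extend(block)  # the extra copy
    return result


def scan_two_pass(f):
    """Count the values first, preallocate, then rewind and fill.

    Hypothetical stand-in for the second strategy: no temporary storage,
    but the file is read and tokenized twice.
    """
    count = sum(len(line.split()) for line in f)
    result = array("d", [0.0]) * count  # preallocated to the final size
    f.seek(0)
    i = 0
    for line in f:
        for token in line.split():
            result[i] = float(token)
            i += 1
    return result


if __name__ == "__main__":
    sample = io.StringIO("1 2\n3.5 4\n5 6\n")
    a = scan_single_pass(sample, buffer_size=2)
    sample.seek(0)
    b = scan_two_pass(sample)
    print(a == b)
```

The trade-off is the same one described above: the single-pass version pays for temporary blocks and one extra copy, while the two-pass version pays for parsing the file twice.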
Any feedback anyone wants to give is very welcome.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chr...@no...