|
From: Anthony S. <sc...@gm...> - 2013-01-03 23:31:29
|
Josh is right that you can just edit the code by hand (which works but sucks). However, on Windows -- on the rare occasion when I also have to develop on it -- I typically use a distribution that includes a compiler, cython, hdf5, and pytables already and then I install my development version from github OVER this. I recommend either EPD or Anaconda, though other distributions listed here [1] might also work.

Be well
Anthony

1. http://numfocus.org/projects-2/software-distributions/

On Thu, Jan 3, 2013 at 3:46 PM, Josh Ayers <jos...@gm...> wrote:

> The change was in pure Python code, so you should be able to just paste in
> the changes to your local copy. Start with the table.Column.__iter__
> method (lines 3296-3310) here.
>
> https://github.com/PyTables/PyTables/blob/b479ed025f4636f7f4744ac83a89bc947808907c/tables/table.py
>
> It needs to be modified slightly because it uses some additional features
> that aren't available in the released version (the out=buf_slice argument
> to table.read). The following should work.
>
>     def __iter__(self):
>         table = self.table
>         itemsize = self.dtype.itemsize
>         nrowsinbuf = table._v_file.params['IO_BUFFER_SIZE'] // itemsize
>         max_row = len(self)
>         for start_row in xrange(0, len(self), nrowsinbuf):
>             end_row = min([start_row + nrowsinbuf, max_row])
>             buf = table.read(start_row, end_row, 1, field=self.pathname)
>             for row in buf:
>                 yield row
>
> I haven't tested this, but I think it will work.
>
> Josh
>
> On Thu, Jan 3, 2013 at 1:25 PM, David Reed <dav...@gm...> wrote:
>
>> I apologize if I'm starting to sound helpless, but I'm forced to work on
>> Windows 7 at work and have never had luck compiling python source
>> successfully. I have had to rely on precompiled binaries and now it's
>> biting me in the butt.
>>
>> Is there any quick fix I can do to improve this iteration using v2.4.0?
>>
>> On Thu, Jan 3, 2013 at 3:17 PM, <pyt...@li...> wrote:
>>
>>> Send Pytables-users mailing list submissions to
>>> pyt...@li...
>>>
>>> To subscribe or unsubscribe via the World Wide Web, visit
>>> https://lists.sourceforge.net/lists/listinfo/pytables-users
>>> or, via email, send a message with subject or body 'help' to
>>> pyt...@li...
>>>
>>> You can reach the person managing the list at
>>> pyt...@li...
>>>
>>> When replying, please edit your Subject line so it is more specific
>>> than "Re: Contents of Pytables-users digest..."
>>>
>>> Today's Topics:
>>>
>>>    1. Re: Pytables-users Digest, Vol 80, Issue 2 (David Reed)
>>>    2. Re: Pytables-users Digest, Vol 80, Issue 3 (David Reed)
>>>
>>> ----------------------------------------------------------------------
>>>
>>> Message: 1
>>> Date: Thu, 3 Jan 2013 13:44:29 -0500
>>> From: David Reed <dav...@gm...>
>>> Subject: Re: [Pytables-users] Pytables-users Digest, Vol 80, Issue 2
>>> To: pyt...@li...
>>> Message-ID:
>>> <CAM6XA7=8ocg5WPD4KLSvLhSw-3BCvq5u7MRxq3Ajd6ha=
>>> ev...@ma...>
>>> Content-Type: text/plain; charset="iso-8859-1"
>>>
>>> Thanks Anthony, but unless I'm missing something I don't think that method
>>> will work, since it will only compare the ith element with the (i+1)th
>>> element. I still need 2 for loops, right?
>>>
>>> Using itertools might speed things up though. I've never used them, so I
>>> will give it a shot and let you know how it goes. Looks like I need to
>>> download the latest release before I do that too. Thanks for the help.
>>>
>>> -Dave
>>>
>>> On Thu, Jan 3, 2013 at 12:12 PM, <pyt...@li...> wrote:
>>>
>>> > Today's Topics:
>>> >
>>> >    1. Re: Nested Iteration of HDF5 using PyTables (Anthony Scopatz)
>>> >
>>> > ----------------------------------------------------------------------
>>> >
>>> > Message: 1
>>> > Date: Thu, 3 Jan 2013 11:11:47 -0600
>>> > From: Anthony Scopatz <sc...@gm...>
>>> > Subject: Re: [Pytables-users] Nested Iteration of HDF5 using PyTables
>>> > To: Discussion list for PyTables
>>> > <pyt...@li...>
>>> > Message-ID:
>>> > <CAPk-6T5b=
>>> > 1EG...@ma...>
>>> > Content-Type: text/plain; charset="iso-8859-1"
>>> >
>>> > Hi David,
>>> >
>>> > Tables and table column iteration have been overhauled fairly recently [1].
>>> > So you might try creating two iterators, offset by one, and then doing the
>>> > comparison. I am hacking this out super quick so please forgive me:
>>> >
>>> >     from itertools import izip
>>> >
>>> >     with tb.openFile(...) as f:
>>> >         data = f.root.data
>>> >         data_i = iter(data)
>>> >         data_j = iter(data)
>>> >         data_i.next()  # throw the first value away
>>> >         for i, j in izip(data_i, data_j):
>>> >             compare(i, j)
>>> >
>>> > You get the idea ;)
>>> >
>>> > Be Well
>>> > Anthony
>>> >
>>> > 1. https://github.com/PyTables/PyTables/issues/27
>>> >
>>> > On Thu, Jan 3, 2013 at 9:25 AM, David Reed <dav...@gm...> wrote:
>>> >
>>> > > I was hoping someone could help me out here.
>>> > >
>>> > > This is from a post I put up on StackOverflow,
>>> > >
>>> > > I have a fairly large dataset that I store in HDF5 and access using
>>> > > PyTables. One operation I need to do on this dataset is pairwise
>>> > > comparisons between each of the elements. This requires 2 loops, one to
>>> > > iterate over each element, and an inner loop to iterate over every other
>>> > > element. This operation thus looks at N(N-1)/2 comparisons.
>>> > >
>>> > > For fairly small sets I found it to be faster to dump the contents into a
>>> > > multidimensional numpy array and then do my iteration. I run into problems
>>> > > with large sets because of memory issues and need to access each element of
>>> > > the dataset at run time.
>>> > >
>>> > > Putting the elements into an array gives me about 600 comparisons per
>>> > > second, while operating on hdf5 data itself gives me about 300 comparisons
>>> > > per second.
>>> > >
>>> > > Is there a way to speed this process up?
>>> > >
>>> > > Example follows (this is not my real code, just an example):
>>> > >
>>> > > *Small Set*:
>>> > >
>>> > >     with tb.openFile(h5_file, 'r') as f:
>>> > >         data = f.root.data
>>> > >
>>> > >         N_elements = len(data)
>>> > >         elements = np.empty((N_elements, int(1e5)))
>>> > >
>>> > >         # copy each row's 'element' field into memory
>>> > >         for ii, d in enumerate(data):
>>> > >             elements[ii] = d['element']
>>> > >
>>> > >     D = np.empty((N_elements, N_elements))
>>> > >     for ii in xrange(N_elements):
>>> > >         for jj in xrange(ii+1, N_elements):
>>> > >             D[ii, jj] = compare(elements[ii], elements[jj])
>>> > >
>>> > > *Large Set*:
>>> > >
>>> > >     with tb.openFile(h5_file, 'r') as f:
>>> > >         data = f.root.data
>>> > >
>>> > >         N_elements = len(data)
>>> > >
>>> > >         D = np.empty((N_elements, N_elements))
>>> > >         for ii in xrange(N_elements):
>>> > >             for jj in xrange(ii+1, N_elements):
>>> > >                 D[ii, jj] = compare(data['element'][ii],
>>> > >                                     data['element'][jj])
>>> > >
>>> > > _______________________________________________
>>> > > Pytables-users mailing list
>>> > > Pyt...@li...
>>> > > https://lists.sourceforge.net/lists/listinfo/pytables-users
>>> >
>>> > End of Pytables-users Digest, Vol 80, Issue 2
>>> > *********************************************
>>>
>>> ------------------------------
>>>
>>> Message: 2
>>> Date: Thu, 3 Jan 2013 15:17:01 -0500
>>> From: David Reed <dav...@gm...>
>>> Subject: Re: [Pytables-users] Pytables-users Digest, Vol 80, Issue 3
>>> To: pyt...@li...
>>> Message-ID:
>>> <CAM...@ma...>
>>> Content-Type: text/plain; charset="iso-8859-1"
>>>
>>> Thanks a lot for the help so far guys!
>>>
>>> Looking at itertools, I found what I believe to be the perfect function for
>>> what I need, itertools.combinations. This appears to be a valid replacement
>>> for the method proposed.
>>>
>>> One small problem I didn't mention is that my compare function
>>> actually takes 2 columns from the table as inputs.
>>> Like so:
>>>
>>>     D = np.empty((N_elements, N_elements))
>>>     for ii in xrange(N_elements):
>>>         for jj in xrange(ii+1, N_elements):
>>>             D[ii, jj] = compare(data['element1'][ii],
>>>                                 data['element1'][jj],
>>>                                 data['element2'][ii],
>>>                                 data['element2'][jj])
>>>
>>> Is there an efficient way of using itertools with this structure?
>>>
>>> On Thu, Jan 3, 2013 at 1:29 PM, <pyt...@li...> wrote:
>>>
>>> > Today's Topics:
>>> >
>>> >    1. Re: Nested Iteration of HDF5 using PyTables (Josh Ayers)
>>> >
>>> > ----------------------------------------------------------------------
>>> >
>>> > Message: 1
>>> > Date: Thu, 3 Jan 2013 10:29:33 -0800
>>> > From: Josh Ayers <jos...@gm...>
>>> > Subject: Re: [Pytables-users] Nested Iteration of HDF5 using PyTables
>>> > To: Discussion list for PyTables
>>> > <pyt...@li...>
>>> > Message-ID:
>>> > <CAC...@ma...>
>>> > Content-Type: text/plain; charset="iso-8859-1"
>>> >
>>> > David,
>>> >
>>> > The change in issue 27 was only for iteration over a tables.Column
>>> > instance. To use it, tweak Anthony's code as follows. This will iterate
>>> > over the "element" column, as in your original example.
>>> >
>>> > Note also that this will only work with the development version of PyTables
>>> > available on github. It will be very slow using the released v2.4.0.
>>> >
>>> >     from itertools import izip
>>> >
>>> >     with tb.openFile(...) as f:
>>> >         data = f.root.data.cols.element
>>> >         data_i = iter(data)
>>> >         data_j = iter(data)
>>> >         data_i.next()  # throw the first value away
>>> >         for i, j in izip(data_i, data_j):
>>> >             compare(i, j)
>>> >
>>> > Hope that helps,
>>> > Josh
>>> >
>>> > End of Pytables-users Digest, Vol 80, Issue 3
>>> > *********************************************
>>>
>>> End of Pytables-users Digest, Vol 80, Issue 4
>>> *********************************************
|
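For the two-column question in the quoted thread (compare taking values from both 'element1' and 'element2'), here is a minimal sketch of how itertools.combinations might be applied. It is an illustration under assumptions, not something tested against PyTables: the name pairwise_compare, the toy lists, and toy_compare are all hypothetical, and the argument order passed to compare simply mirrors the call in the quoted example. It is written for Python 3, where zip is lazy. One caveat: itertools.combinations copies its input into a tuple before yielding pairs, so this form holds every row in memory and only suits columns that fit in RAM.

```python
from itertools import combinations


def pairwise_compare(col1, col2, compare):
    """Yield (i, j, result) for every unordered pair of rows with i < j.

    col1 and col2 are equal-length iterables holding the two columns.
    """
    # Pair each row index with that row's values from both columns.
    rows = enumerate(zip(col1, col2))
    # combinations() stores its input, then yields each pair exactly once.
    for (i, (a1, a2)), (j, (b1, b2)) in combinations(rows, 2):
        yield i, j, compare(a1, b1, a2, b2)


# Toy stand-ins for two table columns and a compare function (hypothetical).
element1 = [1, 2, 3]
element2 = [10, 20, 30]


def toy_compare(a1, b1, a2, b2):
    return (b1 - a1) + (b2 - a2)


results = list(pairwise_compare(element1, element2, toy_compare))
```

With a real table, the two iterables might be column objects such as f.root.data.cols.element1 and f.root.data.cols.element2, though whether iterating them is fast enough depends on the PyTables version, as discussed in the thread.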