On Fri, Feb 1, 2013 at 3:27 PM, David Reed <david.reed.c@...> wrote:
> at the error:
>
> result = numpy.empty(shape=nrows, dtype=dtypeField)
>
> nrows = 4620 and dtypeField is ('bool', (17, 9600))
>
> I'm not sure what that means as a dtype, but that's what it is.
>
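On what that dtype means: as far as I know, NumPy reads a (base, shape)
tuple such as ('bool', (17, 9600)) as a sub-array dtype, so
numpy.empty(shape=4620, dtype=('bool', (17, 9600))) asks for one
4620 x 17 x 9600 block of one-byte booleans in a single allocation. A
back-of-the-envelope check in plain Python 3 (the numbers come from this
thread; the variable names are only for illustration):

```python
from math import prod

nrows = 4620                 # rows in the table, from the error report
subarray_shape = (17, 9600)  # shape part of the ('bool', (17, 9600)) dtype
bytes_per_bool = 1           # NumPy stores one boolean per byte

# Shape of the array that the single numpy.empty call would have to allocate.
result_shape = (nrows,) + subarray_shape
result_bytes = prod(result_shape) * bytes_per_bool

print(result_shape)                    # (4620, 17, 9600)
print(result_bytes)                    # 753984000
print(round(result_bytes / 2**20, 1))  # 719.1 (MiB)
```

Since nrows equals the number of rows in the table, the whole column is
being requested in one go, not a small per-row temporary.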
> Forgive me if I'm being totally naive, but I thought the whole point of
> __iter__ with PyTables was to do iteration on the fly, so there is no
> preallocation.
>
Nope you are not being naive at all. That is the point.
> If you have any ideas on this I'm all ears.
>
If you could send a minimal script which reproduces this error, that would
help a lot.
Be Well
Anthony
>
>
> Thanks again.
>
> Dave
>
>
> On Fri, Feb 1, 2013 at 3:45 PM, <
> pytables-users-request@...> wrote:
>
>> Send Pytables-users mailing list submissions to
>> pytables-users@...
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>> https://lists.sourceforge.net/lists/listinfo/pytables-users
>> or, via email, send a message with subject or body 'help' to
>> pytables-users-request@...
>>
>> You can reach the person managing the list at
>> pytables-users-owner@...
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of Pytables-users digest..."
>>
>>
>> Today's Topics:
>>
>> 1. Re: Pytables-users Digest, Vol 81, Issue 2 (Anthony Scopatz)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Fri, 1 Feb 2013 14:44:40 -0600
>> From: Anthony Scopatz <scopatz@...>
>> Subject: Re: [Pytables-users] Pytables-users Digest, Vol 81, Issue 2
>> To: Discussion list for PyTables
>> <pytables-users@...>
>> Message-ID:
>> <
>> CAPk-6T5Fq6gELfNaZQk98jEd7AevCyU7w317qhugQD7njzF_Kw@...>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> On Fri, Feb 1, 2013 at 12:43 PM, David Reed <david.reed.c@...>
>> wrote:
>>
>> > Hi Anthony,
>> >
>> > Thanks for the reply.
>> >
>> > I honestly don't know how to monitor my Python memory usage, but I'm
>> > sure that it's caused by running out of memory.
>> >
>>
>> Well, I would just run top or process monitor or something while running
>> the python script to see what happens to memory usage as the script chugs
>> along...
>>
>>
>> > I'm just trying to find out how to fix it. My HDF5 table has 4620 rows
>> > and the column I'm iterating over is a 17x9600 boolean matrix. The
>> > __iter__ method is preallocating an array of this size, which appears
>> > to be the root of the error. I was hoping there is a fix somewhere in
>> > here so that it doesn't have to do this preallocation.
>> >
>>
>> So a 17x9600 boolean matrix is 163,200 bytes, or about 0.16 MB. 4620 of
>> these is ~754 MB. If you have 2 GB of memory and you are iterating over 2
>> of these (templates & masks), it is conceivable that you are just running
>> out of memory. Maybe there is a way that __iter__ could avoid preallocating
>> something that is basically a temporary. What is the dtype of the
>> templates array?
>>
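To put rough numbers on that estimate, here is a small sketch in plain
Python 3. It assumes one byte per boolean and 8-byte floats for the N x N
result matrix D from the comparison loop; the 2 GB figure is only the
hypothetical machine size mentioned above, not a measured limit.

```python
rows = 4620
bytes_per_matrix = 17 * 9600 * 1  # one 17x9600 boolean matrix, 1 byte each
bytes_per_column = rows * bytes_per_matrix

# Two columns (templates and masks) held in memory at the same time.
two_columns = 2 * bytes_per_column

# The N x N float64 result matrix D.
result_matrix = rows * rows * 8

total = two_columns + result_matrix
budget = 2 * 10**9  # hypothetical 2 GB machine

print(bytes_per_matrix)  # 163200
print(bytes_per_column)  # 753984000
print(two_columns)       # 1507968000
print(result_matrix)     # 170755200
print(total)             # 1678723200
print(total / budget)    # a bit under 0.84 of the budget
```

That leaves little headroom for the interpreter and any temporaries, so
running out of memory with both columns resident looks plausible.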
>> Be Well
>> Anthony
>>
>>
>> >
>> > Thanks again.
>> >
>> >
>> >
>> >
>> > On Fri, Feb 1, 2013 at 11:12 AM, <
>> > pytables-users-request@...> wrote:
>> >
>> >> Message: 1
>> >> Date: Fri, 1 Feb 2013 10:11:47 -0600
>> >> From: Anthony Scopatz <scopatz@...>
>> >> Subject: Re: [Pytables-users] Pytables-users Digest, Vol 80, Issue 9
>> >> To: Discussion list for PyTables
>> >> <pytables-users@...>
>> >> Message-ID:
>> >> <
>> >> CAPk-6T56_7bSsE9BSNLL7P3TuyeZnoe5BVWiO86zJdO02GPEgQ@...>
>> >> Content-Type: text/plain; charset="iso-8859-1"
>> >>
>> >> Hi David,
>> >>
>> >> Sorry, I haven't had a ton of time recently. You seem to be getting a
>> >> memory error on creating a numpy array. This kind of thing typically
>> >> happens when you are out of memory. Does this seem to be the case with
>> >> you? When this dies, is your memory usage at 100%? If so, this algorithm
>> >> might require a little tweaking...
>> >>
>> >> Be Well
>> >> Anthony
>> >>
>> >>
>> >> On Fri, Feb 1, 2013 at 6:15 AM, David Reed <david.reed.c@...>
>> >> wrote:
>> >>
>> >> > I'm still having problems with this one. I can't tell if this is
>> >> > something dumb I'm doing with itertools, or if it's something in
>> >> > PyTables.
>> >> >
>> >> > Would appreciate any help.
>> >> >
>> >> > Thanks
>> >> >
>> >> >
>> >> > On Wed, Jan 30, 2013 at 5:00 PM, David Reed <david.reed.c@...
>> >> >wrote:
>> >> >
>> >> >> I think I have to reopen this issue. I have been running fine for a
>> >> >> while using the combinations method from itertools, but have recently
>> >> >> run into a memory error since I quadrupled the size of the HDF5 file.
>> >> >>
>> >> >> Here is my code again:
>> >> >>
>> >> >> import numpy as np
>> >> >> import tables as tb
>> >> >> from itertools import combinations, izip
>> >> >>
>> >> >> with tb.openFile(h5_all, 'r') as f:
>> >> >>     irises = f.root.irises
>> >> >>
>> >> >>     templates = f.root.irises.cols.templates
>> >> >>     masks = f.root.irises.cols.masks1
>> >> >>
>> >> >>     N_irises = len(irises)
>> >> >>     index = np.ones((20 * 480), np.bool)
>> >> >>
>> >> >>     print '%i Comparisons' % (N_irises*(N_irises - 1)/2)
>> >> >>     D = np.empty((N_irises, N_irises))
>> >> >>     for (t1, m1, ii), (t2, m2, jj) in combinations(
>> >> >>             izip(templates, masks, range(N_irises)), 2):
>> >> >>         # print ii
>> >> >>         D[ii, jj] = ham_dist(
>> >> >>             t1[8, index],
>> >> >>             t2[:, index],
>> >> >>             m1[8, index],
>> >> >>             m2[:, index],
>> >> >>         )
>> >> >>
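One thing that may matter for memory in this loop: in CPython,
itertools.combinations copies its whole input into a tuple before it
yields the first pair, so every row produced by izip stays alive for the
duration of the loop. A small Python 3 sketch (zip replaces izip there;
the counting generator is purely illustrative and not part of PyTables):

```python
from itertools import combinations

consumed = []

def rows(n):
    """Stand-in for a lazy column iterator; records each row it hands out."""
    for i in range(n):
        consumed.append(i)
        yield i

pairs = combinations(rows(5), 2)
first = next(pairs)

print(first)          # (0, 1)
print(len(consumed))  # 5: every row was pulled before the first pair came out
```

If the same holds for the real table, lazy iteration inside __iter__ would
not by itself keep both columns from ending up fully in memory.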
>> >> >> And here is the error:
>> >> >>
>> >> >> In [10]: get_hd3()
>> >> >> 10669890 Comparisons
>> >> >>
>> >> >>
>> >> >> ---------------------------------------------------------------------------
>> >> >> MemoryError                               Traceback (most recent call last)
>> >> >> <ipython-input-10-cfb255ce7bd1> in <module>()
>> >> >> ----> 1 get_hd3()
>> >> >>
>> >> >>     118     print '%i Comparisons' % (N_irises*(N_irises - 1)/2)
>> >> >>     119     D = np.empty((N_irises, N_irises))
>> >> >> --> 120     for (t1, m1, ii), (t2, m2, jj) in combinations(izip(templates, masks, range(N_irises)), 2):
>> >> >>     121         # print ii
>> >> >>     122         D[ii, jj] = ham_dist(
>> >> >>
>> >> >> c:\python27\lib\site-packages\tables\table.pyc in __iter__(self)
>> >> >>    3274         for start_row in xrange(0, len(self), nrowsinbuf):
>> >> >>    3275             end_row = min([start_row + nrowsinbuf, max_row])
>> >> >> -> 3276             buf = table.read(start_row, end_row, 1, field=self.pathname)
>> >> >>    3277             for row in buf:
>> >> >>    3278                 yield row
>> >> >>
>> >> >> c:\python27\lib\site-packages\tables\table.pyc in read(self, start, stop, step, field)
>> >> >>    1772         (start, stop, step) = self._processRangeRead(start, stop, step)
>> >> >>    1773
>> >> >> -> 1774         arr = self._read(start, stop, step, field)
>> >> >>    1775         return internal_to_flavor(arr, self.flavor)
>> >> >>    1776
>> >> >>
>> >> >> c:\python27\lib\site-packages\tables\table.pyc in _read(self, start, stop, step, field)
>> >> >>    1719         if field:
>> >> >>    1720             # Create a container for the results
>> >> >> -> 1721             result = numpy.empty(shape=nrows, dtype=dtypeField)
>> >> >>    1722         else:
>> >> >>    1723             # Recarray case
>> >> >>
>> >> >> MemoryError:
>> >> >> > c:\python27\lib\site-packages\tables\table.py(1721)_read()
>> >> >>    1720             # Create a container for the results
>> >> >> -> 1721             result = numpy.empty(shape=nrows, dtype=dtypeField)
>> >> >>    1722         else:
>> >> >>
>> >> >> Also, if you guys see any performance problems in my code, please let
>> >> >> me know.
>> >> >>
>> >> >> Thank you so much for the help.
>> >> >>
>> >> >> -Dave
>> >> >>
>> >> >>
>> >> >> On Fri, Jan 4, 2013 at 8:57 AM, <
>> >> >> pytables-users-request@...> wrote:
>> >> >>
>> >> >>>
>> >> >>> Message: 1
>> >> >>> Date: Fri, 4 Jan 2013 08:56:28 -0500
>> >> >>> From: David Reed <david.reed.c@...>
>> >> >>> Subject: Re: [Pytables-users] Pytables-users Digest, Vol 80, Issue
>> 8
>> >> >>> To: pytables-users@...
>> >> >>> Message-ID:
>> >> >>> <
>> >> >>> CAM6XA7nnz-S+BiCS5Eh85S794_88i4ZSyiWqtb-T0WrCfaqS8A@...
>> >
>> >> >>> Content-Type: text/plain; charset="iso-8859-1"
>> >> >>>
>> >> >>> I can't thank you guys enough for the help. I was able to add the
>> >> >>> __iter__ function to the table.py file and everything seems to be
>> >> >>> working great! I'm not quite as fast as I was when iterating right off
>> >> >>> a matrix, but pretty close. I was at 555 comparisons per second, and
>> >> >>> now I'm at 420.
>> >> >>>
>> >> >>> I handled the problem I mentioned earlier by doing this, and it seems
>> >> >>> to work great:
>> >> >>>
>> >> >>> A = f.root.data.cols.A
>> >> >>> B = f.root.data.cols.B
>> >> >>>
>> >> >>> D = np.empty((len(A), len(A)))
>> >> >>> for (a1, b1, ii), (a2, b2, jj) in combinations(
>> >> >>>         izip(A, B, range(len(A))), 2):
>> >> >>>     D[ii, jj] = compare(a1, a2, b1, b2)
>> >> >>>
>> >> >>> Again, thanks a lot.
>> >> >>>
>> >> >>> -Dave
>> >> >>>
>> >> >>>
>> >> >>> On Thu, Jan 3, 2013 at 6:31 PM, <
>> >> >>> pytables-users-request@...> wrote:
>> >> >>>
>> >> >>> >
>> >> >>> > Message: 1
>> >> >>> > Date: Thu, 3 Jan 2013 17:26:55 -0600
>> >> >>> > From: Anthony Scopatz <scopatz@...>
>> >> >>> > Subject: Re: [Pytables-users] Pytables-users Digest, Vol 80,
>> Issue 3
>> >> >>> > To: Discussion list for PyTables
>> >> >>> > <pytables-users@...>
>> >> >>> > Message-ID:
>> >> >>> > <CAPk-6T6sz=J5ay_a9YGLPe_yBLGa9c+XgxG0CRNs6fJ=
>> >> >>> > Gzi3fA@...>
>> >> >>> > Content-Type: text/plain; charset="iso-8859-1"
>> >> >>> >
>> >> >>> > On Thu, Jan 3, 2013 at 2:17 PM, David Reed <
>> david.reed.c@...>
>> >> >>> wrote:
>> >> >>> >
>> >> >>> > > Thanks a lot for the help so far guys!
>> >> >>> > >
>> >> >>> > > Looking at itertools, I found what I believe to be the perfect
>> >> >>> > > function for what I need, itertools.combinations. This appears to
>> >> >>> > > be a valid replacement for the method proposed.
>> >> >>> > >
>> >> >>> >
>> >> >>> > Yes, combinations is awesome!
>> >> >>> >
>> >> >>> >
>> >> >>> > >
>> >> >>> > > One small problem I didn't mention is that my compare function
>> >> >>> > > actually takes 2 columns from the table as inputs, like so:
>> >> >>> > >
>> >> >>> > > D = np.empty((N_elements, N_elements))
>> >> >>> > > for ii in xrange(N_elements):
>> >> >>> > >     for jj in xrange(ii+1, N_elements):
>> >> >>> > >         D[ii, jj] = compare(data['element1'][ii],
>> >> >>> > >                             data['element1'][jj],
>> >> >>> > >                             data['element2'][ii],
>> >> >>> > >                             data['element2'][jj])
>> >> >>> > >
>> >> >>> > > Is there an efficient way of using itertools with this structure?
>> >> >>> > >
>> >> >>> >
>> >> >>> > You can always make two other iterators for each column. Since you
>> >> >>> > have two columns you would have 4 iterators. I am not sure how fast
>> >> >>> > this is going to be, but I am confident that there is a way to do
>> >> >>> > this in one for-loop, which is going to be way faster than nested
>> >> >>> > loops.
>> >> >>> >
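One way to get a single for-loop over two columns is to zip the columns
together with the row index and feed that to combinations. The sketch
below is plain Python 3 on made-up in-memory lists (element1 and element2
stand in for the two table columns) and only checks that the single loop
visits the same index pairs, in the same order, as the nested loops:

```python
from itertools import combinations

# Two small in-memory "columns" standing in for element1 and element2.
element1 = ["a0", "a1", "a2", "a3"]
element2 = ["b0", "b1", "b2", "b3"]
n = len(element1)

# Nested-loop version: every (ii, jj) with ii < jj.
nested = [
    (ii, jj, element1[ii], element1[jj], element2[ii], element2[jj])
    for ii in range(n)
    for jj in range(ii + 1, n)
]

# Single-loop version: zip the columns with their row index, then pair them up.
single = [
    (ii, jj, a1, a2, b1, b2)
    for (a1, b1, ii), (a2, b2, jj) in combinations(
        zip(element1, element2, range(n)), 2)
]

print(len(single))       # 6, i.e. 4 * 3 / 2
print(nested == single)  # True
```

Whether this is faster than nested loops on a real table is a separate
question that the sketch does not answer.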
>> >> >>> > Be Well
>> >> >>> > Anthony
>> >> >>> >
>> >> >>> >
>> >> >>> > >
>> >> >>> > >
>> >> >>> > > On Thu, Jan 3, 2013 at 1:29 PM, <
>> >> >>> > > pytables-users-request@...> wrote:
>> >> >>> > >
>> >> >>> > >>
>> >> >>> > >> Message: 1
>> >> >>> > >> Date: Thu, 3 Jan 2013 10:29:33 -0800
>> >> >>> > >> From: Josh Ayers <josh.ayers@...>
>> >> >>> > >> Subject: Re: [Pytables-users] Nested Iteration of HDF5 using
>> >> >>> PyTables
>> >> >>> > >> To: Discussion list for PyTables
>> >> >>> > >> <pytables-users@...>
>> >> >>> > >> Message-ID:
>> >> >>> > >> <
>> >> >>> > >>
>> >> CACOB4aNozYD7dafoS7SxS07MCHZb8ZbripbBRVbaZRV4weqtXA@...>
>> >> >>> > >> Content-Type: text/plain; charset="iso-8859-1"
>> >> >>> > >>
>> >> >>> > >> David,
>> >> >>> > >>
>> >> >>> > >> The change in issue 27 was only for iteration over a tables.Column
>> >> >>> > >> instance. To use it, tweak Anthony's code as follows. This will
>> >> >>> > >> iterate over the "element" column, as in your original example.
>> >> >>> > >>
>> >> >>> > >> Note also that this will only work with the development version of
>> >> >>> > >> PyTables available on GitHub. It will be very slow using the
>> >> >>> > >> released v2.4.0.
>> >> >>> > >>
>> >> >>> > >>
>> >> >>> > >> from itertools import izip
>> >> >>> > >>
>> >> >>> > >> with tb.openFile(...) as f:
>> >> >>> > >>     data = f.root.data.cols.element
>> >> >>> > >>     data_i = iter(data)
>> >> >>> > >>     data_j = iter(data)
>> >> >>> > >>     data_i.next()  # throw the first value away
>> >> >>> > >>     for i, j in izip(data_i, data_j):
>> >> >>> > >>         compare(i, j)
>> >> >>> > >>
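A quick check on a plain Python 3 list (zip and next() replace izip and
.next(); the row labels are made up) shows which pairs this offset
pattern actually visits:

```python
data = ["r0", "r1", "r2", "r3"]

data_i = iter(data)
data_j = iter(data)
next(data_i)  # throw the first value away, as in the snippet above

pairs = list(zip(data_i, data_j))
print(pairs)  # [('r1', 'r0'), ('r2', 'r1'), ('r3', 'r2')]
```

That is N-1 neighbouring pairs, not all N(N-1)/2, so it fits a sliding
comparison but not an all-pairs one.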
>> >> >>> > >>
>> >> >>> > >> Hope that helps,
>> >> >>> > >> Josh
>> >> >>> > >>
>> >> >>> > >>
>> >> >>> > >>
>> >> >>> > >> On Thu, Jan 3, 2013 at 9:11 AM, Anthony Scopatz <
>> >> scopatz@...>
>> >> >>> > >> wrote:
>> >> >>> > >>
>> >> >>> > >> > Hi David,
>> >> >>> > >> >
>> >> >>> > >> > Tables and table column iteration have been overhauled fairly
>> >> >>> > >> > recently [1]. So you might try creating two iterators, offset by
>> >> >>> > >> > one, and then doing the comparison. I am hacking this out super
>> >> >>> > >> > quick so please forgive me:
>> >> >>> > >> >
>> >> >>> > >> > from itertools import izip
>> >> >>> > >> >
>> >> >>> > >> > with tb.openFile(...) as f:
>> >> >>> > >> >     data = f.root.data
>> >> >>> > >> >     data_i = iter(data)
>> >> >>> > >> >     data_j = iter(data)
>> >> >>> > >> >     data_i.next()  # throw the first value away
>> >> >>> > >> >     for i, j in izip(data_i, data_j):
>> >> >>> > >> >         compare(i, j)
>> >> >>> > >> >
>> >> >>> > >> > You get the idea ;)
>> >> >>> > >> >
>> >> >>> > >> > Be Well
>> >> >>> > >> > Anthony
>> >> >>> > >> >
>> >> >>> > >> > 1. https://github.com/PyTables/PyTables/issues/27
>> >> >>> > >> >
>> >> >>> > >> >
>> >> >>> > >> > On Thu, Jan 3, 2013 at 9:25 AM, David Reed <
>> >> >>> david.reed.c@...>
>> >> >>> > >> wrote:
>> >> >>> > >> >
>> >> >>> > >> >> I was hoping someone could help me out here.
>> >> >>> > >> >>
>> >> >>> > >> >> This is from a post I put up on StackOverflow,
>> >> >>> > >> >>
>> >> >>> > >> >> I have a fairly large dataset that I store in HDF5 and access
>> >> >>> > >> >> using PyTables. One operation I need to do on this dataset is
>> >> >>> > >> >> pairwise comparison between each of the elements. This requires 2
>> >> >>> > >> >> loops, one to iterate over each element, and an inner loop to
>> >> >>> > >> >> iterate over every other element. This operation thus looks at
>> >> >>> > >> >> N(N-1)/2 comparisons.
>> >> >>> > >> >>
>> >> >>> > >> >> For fairly small sets I found it to be faster to dump the contents
>> >> >>> > >> >> into a multidimensional numpy array and then do my iteration. I run
>> >> >>> > >> >> into problems with large sets because of memory issues and need to
>> >> >>> > >> >> access each element of the dataset at run time.
>> >> >>> > >> >>
>> >> >>> > >> >> Putting the elements into an array gives me about 600 comparisons
>> >> >>> > >> >> per second, while operating on the HDF5 data itself gives me about
>> >> >>> > >> >> 300 comparisons per second.
>> >> >>> > >> >>
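To get a feel for the scale, N(N-1)/2 and the two rates above give a
rough running time. The sketch uses N = 4620, the row count mentioned
elsewhere in this thread, and assumes the rates stay constant as the
table grows, which may well not hold:

```python
def n_pairs(n):
    """Number of unordered pairs among n elements: N(N-1)/2."""
    return n * (n - 1) // 2

n = 4620  # row count taken from elsewhere in the thread, for illustration
pairs = n_pairs(n)
print(pairs)  # 10669890

# Comparisons per second quoted above: on-disk (300) and in-memory (600).
hours = {rate: round(pairs / rate / 3600, 1) for rate in (300, 600)}
print(hours)  # {300: 9.9, 600: 4.9}
```

So even the in-memory rate means hours of work at this size, which is why
the per-comparison cost of reading rows matters.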
>> >> >>> > >> >> Is there a way to speed this process up?
>> >> >>> > >> >>
>> >> >>> > >> >> Example follows (this is not my real code, just an example):
>> >> >>> > >> >>
>> >> >>> > >> >> *Small Set*:
>> >> >>> > >> >>
>> >> >>> > >> >>
>> >> >>> > >> >> with tb.openFile(h5_file, 'r') as f:
>> >> >>> > >> >>     data = f.root.data
>> >> >>> > >> >>
>> >> >>> > >> >>     N_elements = len(data)
>> >> >>> > >> >>     elements = np.empty((N_elements, 100000))
>> >> >>> > >> >>
>> >> >>> > >> >>     # Copy every row's 'element' field into memory up front.
>> >> >>> > >> >>     for ii, d in enumerate(data):
>> >> >>> > >> >>         elements[ii] = d['element']
>> >> >>> > >> >>
>> >> >>> > >> >> D = np.empty((N_elements, N_elements))
>> >> >>> > >> >> for ii in xrange(N_elements):
>> >> >>> > >> >>     for jj in xrange(ii+1, N_elements):
>> >> >>> > >> >>         D[ii, jj] = compare(elements[ii], elements[jj])
>> >> >>> > >> >>
>> >> >>> > >> >> *Large Set*:
>> >> >>> > >> >>
>> >> >>> > >> >>
>> >> >>> > >> >> with tb.openFile(h5_file, 'r') as f:
>> >> >>> > >> >>     data = f.root.data
>> >> >>> > >> >>
>> >> >>> > >> >>     N_elements = len(data)
>> >> >>> > >> >>
>> >> >>> > >> >>     D = np.empty((N_elements, N_elements))
>> >> >>> > >> >>     for ii in xrange(N_elements):
>> >> >>> > >> >>         for jj in xrange(ii+1, N_elements):
>> >> >>> > >> >>             D[ii, jj] = compare(data['element'][ii],
>> >> >>> > >> >>                                 data['element'][jj])
>> >> >>> > >> >>
>> >> >>> > >> >>
>> >> >>> > >> >>
>> >> >>> > >> >>
>> >> >>> > >>
>> >> >>> >
>> >> >>>
>> >>
>> ------------------------------------------------------------------------------
>> >> >>> > >> >> Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012,
>> >> HTML5,
>> >> >>> CSS,
>> >> >>> > >> >> MVC, Windows 8 Apps, JavaScript and much more. Keep your
>> >> skills
>> >> >>> > current
>> >> >>> > >> >> with LearnDevNow - 3,200 step-by-step video tutorials by
>> >> >>> Microsoft
>> >> >>> > >> >> MVPs and experts. ON SALE this month only -- learn more at:
>> >> >>> > >> >> http://p.sf.net/sfu/learnmore_122712
>> >> >>> > >> >> _______________________________________________
>> >> >>> > >> >> Pytables-users mailing list
>> >> >>> > >> >> Pytables-users@...
>> >> >>> > >> >>
>> https://lists.sourceforge.net/lists/listinfo/pytables-users
>> >> >>> > >> >>
>> >> >>> > >> >>
>> >> >>> > >> >
>> >> >>> > >> >
>> >> >>> > >> >
>> >> >>> >
>> >> >>> > ------------------------------
>> >> >>> >
>> >> >>> > Message: 2
>> >> >>> > Date: Thu, 3 Jan 2013 17:30:59 -0600
>> >> >>> > From: Anthony Scopatz <scopatz@...>
>> >> >>> > Subject: Re: [Pytables-users] Pytables-users Digest, Vol 80,
>> Issue 4
>> >> >>> > To: Discussion list for PyTables
>> >> >>> > <pytables-users@...>
>> >> >>> > Message-ID:
>> >> >>> > <
>> >> >>> >
>> CAPk-6T475yZrWLtmGbaYrG8RMDd2AYyujb4-Zij+FTKKwwYsXg@...>
>> >> >>> > Content-Type: text/plain; charset="iso-8859-1"
>> >> >>> >
>> >> >>> > Josh is right that you can just edit the code by hand (which works
>> >> >>> > but sucks).
>> >> >>> >
>> >> >>> > However, on Windows -- on the rare occasion when I also have to
>> >> >>> > develop on it -- I typically use a distribution that includes a
>> >> >>> > compiler, cython, hdf5, and pytables already, and then I install my
>> >> >>> > development version from github OVER this. I recommend either EPD or
>> >> >>> > Anaconda, though other distributions listed here [1] might also work.
>> >> >>> >
>> >> >>> > Be well
>> >> >>> > Anthony
>> >> >>> >
>> >> >>> > 1. http://numfocus.org/projects-2/software-distributions/
>> >> >>> >
>> >> >>> >
>> >> >>> > On Thu, Jan 3, 2013 at 3:46 PM, Josh Ayers <josh.ayers@...
>> >
>> >> >>> wrote:
>> >> >>> >
>> >> >>> > > The change was in pure Python code, so you should be able to just
>> >> >>> > > paste in the changes to your local copy. Start with the
>> >> >>> > > table.Column.__iter__ method (lines 3296-3310) here:
>> >> >>> > >
>> >> >>> > > https://github.com/PyTables/PyTables/blob/b479ed025f4636f7f4744ac83a89bc947808907c/tables/table.py
>> >> >>> > >
>> >> >>> > > It needs to be modified slightly because it uses some additional
>> >> >>> > > features that aren't available in the released version (the
>> >> >>> > > out=buf_slice argument to table.read). The following should work.
>> >> >>> > >
>> >> >>> > > def __iter__(self):
>> >> >>> > >     table = self.table
>> >> >>> > >     itemsize = self.dtype.itemsize
>> >> >>> > >     nrowsinbuf = table._v_file.params['IO_BUFFER_SIZE'] // itemsize
>> >> >>> > >     max_row = len(self)
>> >> >>> > >     for start_row in xrange(0, len(self), nrowsinbuf):
>> >> >>> > >         end_row = min([start_row + nrowsinbuf, max_row])
>> >> >>> > >         buf = table.read(start_row, end_row, 1, field=self.pathname)
>> >> >>> > >         for row in buf:
>> >> >>> > >             yield row
>> >> >>> > >
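The buffering pattern in this method can be tried out without PyTables.
The sketch below is plain Python 3, with a hypothetical read(start, stop)
callable standing in for table.read and an assumed 1 MiB buffer; neither
is taken from the library. It also shows how strongly the number of rows
per read depends on what itemsize counts.

```python
def iter_in_chunks(read, nrows, nrowsinbuf):
    """Yield rows one at a time while reading at most nrowsinbuf rows per call.

    `read(start, stop)` is a hypothetical callable returning rows [start, stop).
    """
    for start in range(0, nrows, nrowsinbuf):
        stop = min(start + nrowsinbuf, nrows)
        for row in read(start, stop):
            yield row

# Track how many rows each read call asks for.
rows = list(range(10))
read_sizes = []

def read(start, stop):
    read_sizes.append(stop - start)
    return rows[start:stop]

out = list(iter_in_chunks(read, len(rows), 4))
print(out == rows)  # True
print(read_sizes)   # [4, 4, 2]

# Rows per buffer for an assumed 1 MiB buffer and two candidate item sizes.
buffer_bytes = 1024 * 1024              # assumption, not taken from PyTables
rows_if_full_row = buffer_bytes // (17 * 9600)
rows_if_one_bool = buffer_bytes // 1
print(rows_if_full_row)  # 6 rows if itemsize covers a whole 17x9600 row
print(rows_if_one_bool)  # 1048576 rows if itemsize is a single boolean
```

If itemsize turned out to be the size of one boolean and not of a full
row, a single read would span every row of a 4620-row table. Whether that
is what self.dtype.itemsize returns for a multidimensional column is not
verified here.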
>> >> >>> > >
>> >> >>> > > I haven't tested this, but I think it will work.
>> >> >>> > >
>> >> >>> > > Josh
>> >> >>> > >
>> >> >>> > >
>> >> >>> > >
>> >> >>> > > On Thu, Jan 3, 2013 at 1:25 PM, David Reed <
>> >> david.reed.c@...>
>> >> >>> > wrote:
>> >> >>> > >
>> >> >>> > >> I apologize if I'm starting to sound helpless, but I'm forced to
>> >> >>> > >> work on Windows 7 at work and have never had luck compiling Python
>> >> >>> > >> source successfully. I have had to rely on precompiled binaries and
>> >> >>> > >> now it's biting me in the butt.
>> >> >>> > >>
>> >> >>> > >> Is there any quick fix I can do to improve this iteration using
>> >> >>> > >> v2.4.0?
>> >> >>> > >>
>> >> >>> > >>
>> >> >>> > >> On Thu, Jan 3, 2013 at 3:17 PM, <
>> >> >>> > >> pytables-users-request@...> wrote:
>> >> >>> > >>
>> >> >>> > >>>
>> >> >>> > >>> Message: 1
>> >> >>> > >>> Date: Thu, 3 Jan 2013 13:44:29 -0500
>> >> >>> > >>> From: David Reed <david.reed.c@...>
>> >> >>> > >>> Subject: Re: [Pytables-users] Pytables-users Digest, Vol 80,
>> >> Issue
>> >> >>> 2
>> >> >>> > >>> To: pytables-users@...
>> >> >>> > >>> Message-ID:
>> >> >>> > >>> <CAM6XA7=8ocg5WPD4KLSvLhSw-3BCvq5u7MRxq3Ajd6ha=
>> >> >>> > >>> ev8Eg@...>
>> >> >>> > >>> Content-Type: text/plain; charset="iso-8859-1"
>> >> >>> > >>>
>> >> >>> > >>> Thanks Anthony, but unless I'm missing something I don't think that
>> >> >>> > >>> method will work, since it will only be comparing the ith element
>> >> >>> > >>> with the (i+1)th element. I still need 2 for loops, right?
>> >> >>> > >>>
>> >> >>> > >>> Using itertools might speed things up though. I've never used it,
>> >> >>> > >>> so I will give it a shot and let you know how it goes. Looks like I
>> >> >>> > >>> need to download the latest release before I do that too. Thanks
>> >> >>> > >>> for the help.
>> >> >>> > >>>
>> >> >>> > >>> -Dave
>> >> >>> > >>>
>> >> >>> > >>>
>> >> >>> > >>>
>> >> >>> > >>> On Thu, Jan 3, 2013 at 12:12 PM, <
>> >> >>> > >>> pytables-users-request@...> wrote:
>> >> >>> > >>>
>> >> >>> > >>> >
>> >> >>> > >>> > Message: 1
>> >> >>> > >>> > Date: Thu, 3 Jan 2013 11:11:47 -0600
>> >> >>> > >>> > From: Anthony Scopatz <scopatz@...>
>> >> >>> > >>> > Subject: Re: [Pytables-users] Nested Iteration of HDF5
>> using
>> >> >>> PyTables
>> >> >>> > >>> > To: Discussion list for PyTables
>> >> >>> > >>> > <pytables-users@...>
>> >> >>> > >>> > Message-ID:
>> >> >>> > >>> > <CAPk-6T5b=
>> >> >>> > >>> > 1EGAGp4+jhJcD3_4fNVbXROb2jBHaY45RWDqZyq2A@...>
>> >> >>> > >>> > Content-Type: text/plain; charset="iso-8859-1"
>> >> >>> > >>> >
>> >> >>> > >>> > Hi David,
>> >> >>> > >>> >
>> >> >>> > >>> > Tables and table column iteration have been overhauled fairly
>> >> >>> > >>> > recently [1]. So you might try creating two iterators, offset by
>> >> >>> > >>> > one, and then doing the comparison. I am hacking this out super
>> >> >>> > >>> > quick so please forgive me:
>> >> >>> > >>> >
>> >> >>> > >>> > from itertools import izip
>> >> >>> > >>> > import tables as tb
>> >> >>> > >>> >
>> >> >>> > >>> > with tb.openFile(...) as f:
>> >> >>> > >>> >     data = f.root.data
>> >> >>> > >>> >     data_i = iter(data)
>> >> >>> > >>> >     data_j = iter(data)
>> >> >>> > >>> >     data_i.next() # throw the first value away
>> >> >>> > >>> >     for i, j in izip(data_i, data_j):
>> >> >>> > >>> >         compare(i, j)
>> >> >>> > >>> >
>> >> >>> > >>> > You get the idea ;)
>> >> >>> > >>> >
>> >> >>> > >>> > Be Well
>> >> >>> > >>> > Anthony
>> >> >>> > >>> >
>> >> >>> > >>> > 1. https://github.com/PyTables/PyTables/issues/27
>> >> >>> > >>> >
>> >> >>> > >>> >
>> >> >>> > >>> > On Thu, Jan 3, 2013 at 9:25 AM, David Reed <david.reed.c@...> wrote:
>> >> >>> > >>> >
>> >> >>> > >>> > > I was hoping someone could help me out here.
>> >> >>> > >>> > >
>> >> >>> > >>> > > This is from a post I put up on StackOverflow,
>> >> >>> > >>> > >
>> >> >>> > >>> > > I have a fairly large dataset that I store in HDF5 and access
>> >> >>> > >>> > > using PyTables. One operation I need to do on this dataset is
>> >> >>> > >>> > > pairwise comparisons between each of the elements. This requires
>> >> >>> > >>> > > 2 loops, one to iterate over each element, and an inner loop to
>> >> >>> > >>> > > iterate over every other element. This operation thus looks at
>> >> >>> > >>> > > N(N-1)/2 comparisons.
>> >> >>> > >>> > >
>> >> >>> > >>> > > For fairly small sets I found it to be faster to dump the
>> >> >>> > >>> > > contents into a multidimensional numpy array and then do my
>> >> >>> > >>> > > iteration. I run into problems with large sets because of memory
>> >> >>> > >>> > > issues and need to access each element of the dataset at run
>> >> >>> > >>> > > time.
>> >> >>> > >>> > >
>> >> >>> > >>> > > Putting the elements into an array gives me about 600
>> >> >>> > >>> > > comparisons per second, while operating on hdf5 data itself
>> >> >>> > >>> > > gives me about 300 comparisons per second.
>> >> >>> > >>> > >
>> >> >>> > >>> > > Is there a way to speed this process up?
>> >> >>> > >>> > >
>> >> >>> > >>> > > Example follows (this is not my real code, just an example):
>> >> >>> > >>> > >
>> >> >>> > >>> > > *Small Set*:
>> >> >>> > >>> > >
>> >> >>> > >>> > >
>> >> >>> > >>> > > import numpy as np
>> >> >>> > >>> > > import tables as tb
>> >> >>> > >>> > >
>> >> >>> > >>> > > with tb.openFile(h5_file, 'r') as f:
>> >> >>> > >>> > >     data = f.root.data
>> >> >>> > >>> > >
>> >> >>> > >>> > >     N_elements = len(data)
>> >> >>> > >>> > >     elements = np.empty((N_elements, int(1e5)))
>> >> >>> > >>> > >
>> >> >>> > >>> > >     # copy every row's 'element' field into memory
>> >> >>> > >>> > >     for ii, d in enumerate(data):
>> >> >>> > >>> > >         elements[ii] = d['element']
>> >> >>> > >>> > >
>> >> >>> > >>> > >     D = np.empty((N_elements, N_elements))
>> >> >>> > >>> > >     for ii in xrange(N_elements):
>> >> >>> > >>> > >         for jj in xrange(ii+1, N_elements):
>> >> >>> > >>> > >             D[ii, jj] = compare(elements[ii], elements[jj])
>> >> >>> > >>> > >
>> >> >>> > >>> > > *Large Set*:
>> >> >>> > >>> > >
>> >> >>> > >>> > >
>> >> >>> > >>> > > import numpy as np
>> >> >>> > >>> > > import tables as tb
>> >> >>> > >>> > >
>> >> >>> > >>> > > with tb.openFile(h5_file, 'r') as f:
>> >> >>> > >>> > >     data = f.root.data
>> >> >>> > >>> > >
>> >> >>> > >>> > >     N_elements = len(data)
>> >> >>> > >>> > >
>> >> >>> > >>> > >     D = np.empty((N_elements, N_elements))
>> >> >>> > >>> > >     # read each pair from the file at comparison time
>> >> >>> > >>> > >     for ii in xrange(N_elements):
>> >> >>> > >>> > >         for jj in xrange(ii+1, N_elements):
>> >> >>> > >>> > >             D[ii, jj] = compare(data['element'][ii],
>> >> >>> > >>> > >                                 data['element'][jj])
>> >> >>> > >>> ------------------------------
>> >> >>> > >>>
>> >> >>> > >>> Message: 2
>> >> >>> > >>> Date: Thu, 3 Jan 2013 15:17:01 -0500
>> >> >>> > >>> From: David Reed <david.reed.c@...>
>> >> >>> > >>> Subject: Re: [Pytables-users] Pytables-users Digest, Vol 80, Issue 3
>> >> >>> > >>> To: pytables-users@...
>> >> >>> > >>> Message-ID:
>> >> >>> > >>> <CAM6XA7m4d9TrKuc79ifp+nft0QkQDsaVfH5AhCdLJhTvjr9UVA@...>
>> >> >>> > >>> Content-Type: text/plain; charset="iso-8859-1"
>> >> >>> > >>>
>> >> >>> > >>> Thanks a lot for the help so far guys!
>> >> >>> > >>>
>> >> >>> > >>> Looking at itertools, I found what I believe to be the perfect
>> >> >>> > >>> function for what I need, itertools.combinations. This appears to
>> >> >>> > >>> be a valid replacement for the method proposed.
>> >> >>> > >>>
>> >> >>> > >>> There is a small problem that I didn't mention: my compare
>> >> >>> > >>> function actually takes as inputs 2 columns from the table. Like so:
>> >> >>> > >>>
>> >> >>> > >>> D = np.empty((N_elements, N_elements))
>> >> >>> > >>> for ii in xrange(N_elements):
>> >> >>> > >>>     for jj in xrange(ii+1, N_elements):
>> >> >>> > >>>         D[ii, jj] = compare(data['element1'][ii],
>> >> >>> > >>>                             data['element1'][jj],
>> >> >>> > >>>                             data['element2'][ii],
>> >> >>> > >>>                             data['element2'][jj])
>> >> >>> > >>>
>> >> >>> > >>> Is there an efficient way of using itertools with this structure?
>> >> >>> > >>>
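One possible reading of the question above, sketched in Python 3 rather than the Python 2 used in the thread. itertools.combinations copies its input into a tuple before yielding anything, so handing it the rows themselves would pull the whole column into memory. Running it over row indices avoids that and leaves the reads to whatever object is being indexed. The names pairwise_compare and toy_compare are made up for illustration, the plain lists stand in for the two table columns, and nothing here is taken from the PyTables API.

```python
from itertools import combinations


def pairwise_compare(col1, col2, compare):
    """Fill the upper triangle of D with compare() over every pair of rows.

    col1 and col2 only need len() and integer indexing; they are
    hypothetical stand-ins for the 'element1' and 'element2' columns.
    """
    n = len(col1)
    D = [[0.0] * n for _ in range(n)]
    # combinations over indices yields each (ii, jj) with ii < jj exactly once
    for ii, jj in combinations(range(n), 2):
        D[ii][jj] = compare(col1[ii], col1[jj], col2[ii], col2[jj])
    return D


def toy_compare(a1, b1, a2, b2):
    # made-up distance: sum of absolute differences across both columns
    return abs(a1 - b1) + abs(a2 - b2)


element1 = [1.0, 2.0, 4.0]
element2 = [0.0, 0.0, 1.0]
D = pairwise_compare(element1, element2, toy_compare)
```

This still performs one read per index pair, so it should not be expected to run faster than the nested xrange loops; it only flattens them into a single loop.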
>> >> >>> > >>>
>> >> >>> > >>> On Thu, Jan 3, 2013 at 1:29 PM, <
>> >> >>> > >>> pytables-users-request@...> wrote:
>> >> >>> > >>>
>> >> >>> > >>> > Message: 1
>> >> >>> > >>> > Date: Thu, 3 Jan 2013 10:29:33 -0800
>> >> >>> > >>> > From: Josh Ayers <josh.ayers@...>
>> >> >>> > >>> > Subject: Re: [Pytables-users] Nested Iteration of HDF5 using PyTables
>> >> >>> > >>> > To: Discussion list for PyTables
>> >> >>> > >>> > <pytables-users@...>
>> >> >>> > >>> > Message-ID:
>> >> >>> > >>> > <CACOB4aNozYD7dafoS7SxS07MCHZb8ZbripbBRVbaZRV4weqtXA@...>
>> >> >>> > >>> > Content-Type: text/plain; charset="iso-8859-1"
>> >> >>> > >>> >
>> >> >>> > >>> > David,
>> >> >>> > >>> >
>> >> >>> > >>> > The change in issue 27 was only for iteration over a
>> >> >>> > >>> > tables.Column instance. To use it, tweak Anthony's code as
>> >> >>> > >>> > follows. This will iterate over the "element" column, as in your
>> >> >>> > >>> > original example.
>> >> >>> > >>> >
>> >> >>> > >>> > Note also that this will only work with the development version
>> >> >>> > >>> > of PyTables available on github. It will be very slow using the
>> >> >>> > >>> > released v2.4.0.
>> >> >>> > >>> >
>> >> >>> > >>> >
>> >> >>> > >>> > from itertools import izip
>> >> >>> > >>> > import tables as tb
>> >> >>> > >>> >
>> >> >>> > >>> > with tb.openFile(...) as f:
>> >> >>> > >>> >     data = f.root.data.cols.element
>> >> >>> > >>> >     data_i = iter(data)
>> >> >>> > >>> >     data_j = iter(data)
>> >> >>> > >>> >     data_i.next() # throw the first value away
>> >> >>> > >>> >     for i, j in izip(data_i, data_j):
>> >> >>> > >>> >         compare(i, j)
>> >> >>> > >>> >
>> >> >>> > >>> >
>> >> >>> > >>> > Hope that helps,
>> >> >>> > >>> > Josh
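For anyone trying the snippet above on Python 3, where itertools.izip and the .next() method no longer exist, the same offset-by-one pattern can be written with the built-in zip() and next(). This is only a sketch: the function name offset_pairs is invented here, and a plain list stands in for the "element" column. Note that the pattern pairs each value with its immediate predecessor, which gives N-1 pairs rather than all N(N-1)/2.

```python
def offset_pairs(column):
    """Return an iterator of (later, earlier) pairs of neighbouring values.

    Assumes iter(column) hands back an independent iterator on each call,
    which holds for the list used below.
    """
    data_i = iter(column)
    data_j = iter(column)
    next(data_i, None)  # throw the first value away
    # zip stops as soon as the shorter iterator (data_i) runs out
    return zip(data_i, data_j)


pairs = list(offset_pairs([10, 20, 30]))
```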
>> >> >>> > >>> >
>> >> >>> > >>> >
>> >> >>> > >>> >