From: David R. <dav...@gm...> - 2013-02-04 19:41:54
I didn't have any luck. I replaced that __iter__ function, which led to me
replacing the read function, which led to me replacing the _read function,
and I eventually got another error.

Below are 2 functions and my HDF5 Table class declaration. They should be
self-explanatory. I wasn't sure if attachments would go through and this
is pretty small, so I figured it would be OK just to post. I apologize if
this is a bit cluttered. I would also appreciate any comments on how I
assign the results to the matrix D; this does not seem very Pythonic at all
and I could use some advice there if it's easy. (The ii*jj is just a
placeholder for a more sophisticated measure.) Thanks again!

import numpy as np
import tables as tb


class Iris(tb.IsDescription):
    subject_id = tb.IntCol()
    iris_id = tb.IntCol()
    database = tb.StringCol(5)
    is_left = tb.BoolCol()
    is_flipped = tb.BoolCol()
    templates = tb.BoolCol(shape=(17, 20*480))
    masks1 = tb.BoolCol(shape=(17, 20*480))
    phasors = tb.ComplexCol(itemsize=8, shape=(17, 20*240))
    masks2 = tb.BoolCol(shape=(17, 20*240))


def create_hdf5():
    """Write test.h5 containing 4620 rows of dummy iris data."""
    with tb.openFile('test.h5', 'w') as f:
        # Create and fill the table of irises
        irises = f.createTable(f.root, 'irises', Iris, 'Irises',
                               filters=tb.Filters(1))
        for ii in range(4620):
            r = irises.row
            r['subject_id'] = ii
            r['iris_id'] = 0
            r['database'] = 'test'
            r['is_left'] = True
            r['is_flipped'] = False
            r['templates'] = np.empty((17, 20*480), np.bool8)
            r['masks1'] = np.empty((17, 20*480), np.bool8)
            r['phasors'] = np.empty((17, 20*240)) + 1j*np.empty((17, 20*240))
            r['masks2'] = np.empty((17, 20*240), np.bool8)
            r.append()
        irises.flush()


def get_hd():
    """Fill D with a placeholder measure for every pair of irises."""
    from itertools import combinations, izip
    with tb.openFile('test.h5') as f:
        irises = f.root.irises
        templates = f.root.irises.cols.templates
        masks = f.root.irises.cols.masks1
        N_irises = len(irises)
        print '%i Comparisons' % (N_irises*(N_irises - 1)/2)
        D = np.empty((N_irises, N_irises))
        # Each item is (template, mask, row index); combinations pairs them up.
        for (t1, m1, ii), (t2, m2, jj) in combinations(
                izip(templates, masks, range(N_irises)), 2):
            D[ii, jj] = ii*jj
        np.save('test', D)

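One thing that may be worth checking on the memory side: itertools.combinations copies its whole input into a tuple before it yields the first pair, so combinations(izip(templates, masks, ...), 2) would hold every row of both columns at once no matter how __iter__ buffers. The sketch below (Python 3, standard library only) makes that visible. counting_rows is a made-up stand-in for a table column, not anything from PyTables.

```python
from itertools import combinations


def counting_rows(n, consumed):
    """Yield n fake rows, recording each one as it is pulled from the source."""
    for i in range(n):
        consumed.append(i)
        yield i


consumed = []
pairs = combinations(counting_rows(5, consumed), 2)

# combinations() has already drained the generator at this point,
# before a single pair has been requested.
rows_read_before_first_pair = len(consumed)

first_pair = next(pairs)

# Iterating over index pairs instead keeps only two small integers per step;
# the rows themselves would then be fetched by index inside the loop.
index_pairs = list(combinations(range(4), 2))
```

If that is what is happening here, looping over combinations(range(N_irises), 2) and reading rows ii and jj inside the loop would keep memory flat, at the cost of more reads. It would also give a tidier D[ii, jj] assignment, since the indices no longer need to be zipped in with the data.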
On Mon, Feb 4, 2013 at 11:16 AM, <
pyt...@li...> wrote:
> Send Pytables-users mailing list submissions to
> pyt...@li...
>
> To subscribe or unsubscribe via the World Wide Web, visit
> https://lists.sourceforge.net/lists/listinfo/pytables-users
> or, via email, send a message with subject or body 'help' to
> pyt...@li...
>
> You can reach the person managing the list at
> pyt...@li...
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Pytables-users digest..."
>
>
> Today's Topics:
>
> 1. Re: Pytables-users Digest, Vol 81, Issue 7 (Anthony Scopatz)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 4 Feb 2013 10:16:24 -0600
> From: Anthony Scopatz <sc...@gm...>
> Subject: Re: [Pytables-users] Pytables-users Digest, Vol 81, Issue 7
> To: Discussion list for PyTables
> <pyt...@li...>
> Message-ID:
> <
> CAP...@ma...>
> Content-Type: text/plain; charset="iso-8859-1"
>
> On Mon, Feb 4, 2013 at 9:53 AM, David Reed <dav...@gm...> wrote:
>
> > Hi Josh,
> >
> > Here is my __iter__ code:
> >
> > def __iter__(self):
> >     table = self.table
> >     itemsize = self.dtype.itemsize
> >     nrowsinbuf = table._v_file.params['IO_BUFFER_SIZE'] // itemsize
> >     max_row = len(self)
> >     for start_row in xrange(0, len(self), nrowsinbuf):
> >         end_row = min([start_row + nrowsinbuf, max_row])
> >         buf = table.read(start_row, end_row, 1, field=self.pathname)
> >         for row in buf:
> >             yield row
> >
> > It does look different, I will try swapping in the code from github and
> > see what happens.
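As a rough sketch of what the __iter__ above is meant to do, here is the same buffered pattern as standalone Python 3. The names iter_buffered and fake_read are made up for illustration, and fake_read only stands in for table.read; this is not the PyTables implementation.

```python
def iter_buffered(read, nrows, row_bytes, io_buffer_size=1024 * 1024):
    """Yield rows one at a time while fetching them in buffer-sized blocks."""
    # Number of whole rows that fit in one buffer (at least one).
    nrowsinbuf = max(1, io_buffer_size // row_bytes)
    for start in range(0, nrows, nrowsinbuf):
        stop = min(start + nrowsinbuf, nrows)
        for row in read(start, stop):
            yield row


# Hypothetical stand-in for table.read(start, stop): slices a plain list
# and records which ranges were requested.
rows = list(range(20))
calls = []


def fake_read(start, stop):
    calls.append((start, stop))
    return rows[start:stop]


# 163200 bytes per row is the 17 x 9600 boolean matrix from this thread.
out = list(iter_buffered(fake_read, len(rows), row_bytes=163200))
```

With a 1 MB buffer and 163200-byte rows, each read covers 6 rows, so only a few rows are in memory at any time.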
> >
>
> Yes, please let us know how that goes! Otherwise send the list both the
> test data generator script and the script that fails.
>
> Be Well
> Anthony
>
>
> >
> >
> > On Mon, Feb 4, 2013 at 9:59 AM, <
> > pyt...@li...> wrote:
> >
> >>
> >> Today's Topics:
> >>
> >> 1. Re: Pytables-users Digest, Vol 81, Issue 4 (Josh Ayers)
> >> 2. Re: Pytables-users Digest, Vol 81, Issue 6 (David Reed)
> >>
> >>
> >> ----------------------------------------------------------------------
> >>
> >> Message: 1
> >> Date: Fri, 1 Feb 2013 14:08:47 -0800
> >> From: Josh Ayers <jos...@gm...>
> >> Subject: Re: [Pytables-users] Pytables-users Digest, Vol 81, Issue 4
> >> To: Discussion list for PyTables
> >> <pyt...@li...>
> >> Message-ID:
> >> <CACOB4aPG4NZ6b2a3v=
> >> 1Ue...@ma...>
> >> Content-Type: text/plain; charset="iso-8859-1"
> >>
> >> David,
> >>
> >> You added a custom version of table.Column.__iter__, correct? Could you
> >> also include that along with the script to reproduce the error?
> >>
> >> It seems like the problem may be in the 'nrowsinbuf' calculation - see
> >> [1]. Each of your rows is 17 x 9600 = 163200 bytes. If you're using the
> >> default 1MB value for IO_BUFFER_SIZE, it should be reading in chunks of
> >> 6 rows. Instead, it's reading the entire table.
> >>
> >> [1]:
> >> https://github.com/PyTables/PyTables/blob/develop/tables/table.py#L3296
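The arithmetic behind the 6-row figure can be checked in a few lines. The second half of the sketch is only an assumption about how the calculation could go wrong (dividing by the size of one element instead of one row). Nothing in the thread confirms that this is the actual cause.

```python
IO_BUFFER_SIZE = 1024 * 1024      # 1 MB default mentioned in the thread
row_bytes = 17 * 9600             # one row of the templates column, 1 byte per bool
n_rows = 4620                     # rows in the table

rows_per_buffer = IO_BUFFER_SIZE // row_bytes

# ASSUMPTION (not confirmed in the thread): if the divisor were the size of a
# single bool element rather than a whole row, the buffer would hold far more
# rows than the table has, and one read would cover the entire table.
element_bytes = 1
rows_per_buffer_if_element = IO_BUFFER_SIZE // element_bytes
reads_whole_table = rows_per_buffer_if_element >= n_rows
```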
> >>
> >>
> >>
> >> On Fri, Feb 1, 2013 at 1:50 PM, Anthony Scopatz <sc...@gm...>
> >> wrote:
> >>
> >> >
> >> >
> >> > On Fri, Feb 1, 2013 at 3:27 PM, David Reed <dav...@gm...>
> >> wrote:
> >> >
> >> >> at the error:
> >> >>
> >> >> result = numpy.empty(shape=nrows, dtype=dtypeField)
> >> >>
> >> >> nrows = 4620 and dtypeField is ('bool', (17, 9600))
> >> >>
> >> >> I'm not sure what that means as a dtype, but that's what it is.
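For reference, ('bool', (17, 9600)) is a NumPy sub-array dtype: each element is itself a 17 x 9600 block of booleans. A small sketch, with nrows cut down to 3 so it does not allocate the full table:

```python
import numpy as np

# Base type bool with a sub-array shape of (17, 9600).
dtype_field = np.dtype(('bool', (17, 9600)))
bytes_per_row = dtype_field.itemsize

# Using it as the dtype of a new array folds the sub-array shape into the
# result, so shape=nrows gives an (nrows, 17, 9600) boolean array.
# nrows is kept small here; the thread's value was 4620.
result = np.empty(shape=3, dtype=dtype_field)
```

So numpy.empty(shape=4620, dtype=dtypeField) asks for all 4620 matrices in one block, which is consistent with the whole column being read at once.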
> >> >>
> >> >> Forgive me if I'm being totally naive, but I thought the whole point of
> >> >> __iter__ with PyTables was to do iteration on the fly, so there is no
> >> >> preallocation.
> >> >>
> >> >
> >> > Nope you are not being naive at all. That is the point.
> >> >
> >> >
> >> >> If you have any ideas on this I'm all ears.
> >> >>
> >> >
> >> > If you could send a minimal script which reproduces this error, that
> >> > would help a lot.
> >> >
> >> > Be Well
> >> > Anthony
> >> >
> >> >
> >> >>
> >> >>
> >> >> Thanks again.
> >> >>
> >> >> Dave
> >> >>
> >> >>
> >> >> On Fri, Feb 1, 2013 at 3:45 PM, <
> >> >> pyt...@li...> wrote:
> >> >>
> >> >>>
> >> >>> Today's Topics:
> >> >>>
> >> >>> 1. Re: Pytables-users Digest, Vol 81, Issue 2 (Anthony Scopatz)
> >> >>>
> >> >>>
> >> >>>
> ----------------------------------------------------------------------
> >> >>>
> >> >>> Message: 1
> >> >>> Date: Fri, 1 Feb 2013 14:44:40 -0600
> >> >>> From: Anthony Scopatz <sc...@gm...>
> >> >>> Subject: Re: [Pytables-users] Pytables-users Digest, Vol 81, Issue 2
> >> >>> To: Discussion list for PyTables
> >> >>> <pyt...@li...>
> >> >>> Message-ID:
> >> >>> <
> >> >>> CAP...@ma...>
> >> >>> Content-Type: text/plain; charset="iso-8859-1"
> >> >>>
> >> >>> On Fri, Feb 1, 2013 at 12:43 PM, David Reed <dav...@gm...
> >
> >> >>> wrote:
> >> >>>
> >> >>> > Hi Anthony,
> >> >>> >
> >> >>> > Thanks for the reply.
> >> >>> >
> >> >>> > I honestly don't know how to monitor my Python memory usage, but I'm
> >> >>> > sure that it's caused by running out of memory.
> >> >>> >
> >> >>>
> >> >>> Well, I would just run top or process monitor or something while
> >> >>> running the python script to see what happens to memory usage as the
> >> >>> script chugs along...
> >> >>>
> >> >>>
> >> >>> > I'm just trying to find out how to fix it. My HDF5 table has 4620
> >> >>> > rows and the column I'm iterating over is a 17x9600 boolean matrix.
> >> >>> > The __iter__ method is preallocating an array that is this size, which
> >> >>> > appears to be the root of the error. I was hoping there is a fix
> >> >>> > somewhere in here to not have to do this preallocation.
> >> >>> >
> >> >>>
> >> >>> So a 17x9600 boolean matrix should only be 0.155 MB in space. 4620 of
> >> >>> these is ~760 MB. If you have 2 GB of memory and you are iterating over
> >> >>> 2 of these (templates & masks) it is conceivable that you are just
> >> >>> running out of memory. Maybe there is a way that __iter__ could not
> >> >>> preallocate something that is basically a temporary. What is the dtype
> >> >>> of the templates array?
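The estimate can be reproduced with plain arithmetic, assuming one byte per boolean element:

```python
row_bytes = 17 * 9600                 # one boolean matrix, 1 byte per element
n_rows = 4620
MIB = 1024 * 1024

row_mib = row_bytes / MIB             # size of one matrix in MiB (about 0.156)
column_bytes = n_rows * row_bytes     # one whole column held in memory
two_columns_bytes = 2 * column_bytes  # templates and masks together
```

One column comes to roughly 754 million bytes, and two columns to about 1.5 billion, which is most of a 2 GB machine before anything else is counted.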
> >> >>>
> >> >>> Be Well
> >> >>> Anthony
> >> >>>
> >> >>>
> >> >>> >
> >> >>> > Thanks again.
> >> >>>
> >> >>>
> >> ------------------------------
> >>
> >> Message: 2
> >> Date: Mon, 4 Feb 2013 09:58:53 -0500
> >> From: David Reed <dav...@gm...>
> >> Subject: Re: [Pytables-users] Pytables-users Digest, Vol 81, Issue 6
> >> To: pyt...@li...
> >> Message-ID:
> >> <CAM6XA7=
> >> h50...@ma...>
> >> Content-Type: text/plain; charset="iso-8859-1"
> >>
> >> Hi Anthony,
> >>
> >> Sorry to just get back to you. I can send a script. Should I also send
> >> one that creates some fake data?
> >>
> >> -Dave
> >>
> >>
> >> > >> > On Fri, Feb 1, 2013 at 11:12 AM, <
> >> > >> > pyt...@li...> wrote:
> >> > >> >
> >> > >> >>
> >> > >> >> Today's Topics:
> >> > >> >>
> >> > >> >> 1. Re: Pytables-users Digest, Vol 80, Issue 9 (Anthony
> Scopatz)
> >> > >> >>
> >> > >> >>
> >> > >> >>
> >> > ----------------------------------------------------------------------
> >> > >> >>
> >> > >> >> Message: 1
> >> > >> >> Date: Fri, 1 Feb 2013 10:11:47 -0600
> >> > >> >> From: Anthony Scopatz <sc...@gm...>
> >> > >> >> Subject: Re: [Pytables-users] Pytables-users Digest, Vol 80,
> >> Issue 9
> >> > >> >> To: Discussion list for PyTables
> >> > >> >> <pyt...@li...>
> >> > >> >> Message-ID:
> >> > >> >> <
> >> > >> >>
> >> CAP...@ma...>
> >> > >> >> Content-Type: text/plain; charset="iso-8859-1"
> >> > >> >>
> >> > >> >> Hi David,
> >> > >> >>
> >> > >> >> Sorry, I haven't had a ton of time recently. You seem to be getting
> >> > >> >> a memory error on creating a numpy array. This kind of thing
> >> > >> >> typically happens when you are out of memory. Does this seem to be
> >> > >> >> the case with you? When this dies, is your memory usage at 100%?
> >> > >> >> If so, this algorithm might require a little tweaking...
> >> > >> >>
> >> > >> >> Be Well
> >> > >> >> Anthony
> >> > >> >>
> >> > >> >>
> >> > >> >> On Fri, Feb 1, 2013 at 6:15 AM, David Reed <
> >> dav...@gm...>
> >> > >> >> wrote:
> >> > >> >>
> >> > >> >> > I'm still having problems with this one. I can't tell if this is
> >> > >> >> > something dumb I'm doing with itertools, or if it's something in
> >> > >> >> > PyTables.
> >> > >> >> >
> >> > >> >> > Would appreciate any help.
> >> > >> >> >
> >> > >> >> > Thanks
> >> > >> >> >
> >> > >> >> >
> >> > >> >> > On Wed, Jan 30, 2013 at 5:00 PM, David Reed <
> >> > dav...@gm...
> >> > >> >> >wrote:
> >> > >> >> >
> >> > >> >> >> I think I have to reopen this issue. I have been running fine for
> >> > >> >> >> a while using the combinations method from itertools, but have
> >> > >> >> >> recently run into a memory error since I quadrupled the size of
> >> > >> >> >> the HDF5 file.
> >> > >> >> >>
> >> > >> >> >> Here is my code again:
> >> > >> >> >>
> >> > >> >> >> from itertools import combinations, izip
> >> > >> >> >> with tb.openFile(h5_all, 'r') as f:
> >> > >> >> >>     irises = f.root.irises
> >> > >> >> >>
> >> > >> >> >>     templates = f.root.irises.cols.templates
> >> > >> >> >>     masks = f.root.irises.cols.masks1
> >> > >> >> >>
> >> > >> >> >>     N_irises = len(irises)
> >> > >> >> >>     index = np.ones((20 * 480), np.bool)
> >> > >> >> >>
> >> > >> >> >>     print '%i Comparisons' % (N_irises*(N_irises - 1)/2)
> >> > >> >> >>     D = np.empty((N_irises, N_irises))
> >> > >> >> >>     for (t1, m1, ii), (t2, m2, jj) in combinations(
> >> > >> >> >>             izip(templates, masks, range(N_irises)), 2):
> >> > >> >> >>         # print ii
> >> > >> >> >>         D[ii, jj] = ham_dist(
> >> > >> >> >>             t1[8, index],
> >> > >> >> >>             t2[:, index],
> >> > >> >> >>             m1[8, index],
> >> > >> >> >>             m2[:, index],
> >> > >> >> >>         )
> >> > >> >> >>
> >> > >> >> >> And here is the error:
> >> > >> >> >>
> >> > >> >> >> In [10]: get_hd3()
> >> > >> >> >> 10669890 Comparisons
> >> > >> >> >> ---------------------------------------------------------------------------
> >> > >> >> >> MemoryError                               Traceback (most recent call last)
> >> > >> >> >> <ipython-input-10-cfb255ce7bd1> in <module>()
> >> > >> >> >> ----> 1 get_hd3()
> >> > >> >> >>
> >> > >> >> >>     118     print '%i Comparisons' % (N_irises*(N_irises - 1)/2)
> >> > >> >> >>     119     D = np.empty((N_irises, N_irises))
> >> > >> >> >> --> 120     for (t1, m1, ii), (t2, m2, jj) in combinations(izip(templates, masks, range(N_irises)), 2):
> >> > >> >> >>     121         # print ii
> >> > >> >> >>     122         D[ii, jj] = ham_dist(
> >> > >> >> >>
> >> > >> >> >> c:\python27\lib\site-packages\tables\table.pyc in __iter__(self)
> >> > >> >> >>    3274     for start_row in xrange(0, len(self), nrowsinbuf):
> >> > >> >> >>    3275         end_row = min([start_row + nrowsinbuf, max_row])
> >> > >> >> >> -> 3276         buf = table.read(start_row, end_row, 1, field=self.pathname)
> >> > >> >> >>    3277         for row in buf:
> >> > >> >> >>    3278             yield row
> >> > >> >> >>
> >> > >> >> >> c:\python27\lib\site-packages\tables\table.pyc in read(self, start, stop, step, field)
> >> > >> >> >>    1772         (start, stop, step) = self._processRangeRead(start, stop, step)
> >> > >> >> >>    1773
> >> > >> >> >> -> 1774         arr = self._read(start, stop, step, field)
> >> > >> >> >>    1775         return internal_to_flavor(arr, self.flavor)
> >> > >> >> >>    1776
> >> > >> >> >>
> >> > >> >> >> c:\python27\lib\site-packages\tables\table.pyc in _read(self, start, stop, step, field)
> >> > >> >> >>    1719         if field:
> >> > >> >> >>    1720             # Create a container for the results
> >> > >> >> >> -> 1721             result = numpy.empty(shape=nrows, dtype=dtypeField)
> >> > >> >> >>    1722         else:
> >> > >> >> >>    1723             # Recarray case
> >> > >> >> >>
> >> > >> >> >> MemoryError:
> >> > >> >> >> > c:\python27\lib\site-packages\tables\table.py(1721)_read()
> >> > >> >> >>    1720             # Create a container for the results
> >> > >> >> >> -> 1721             result = numpy.empty(shape=nrows, dtype=dtypeField)
> >> > >> >> >>    1722         else:
> >> > >> >> >> Also, if you guys see any performance problems in my code, please
> >> > >> >> >> let me know.
> >> > >> >> >>
> >> > >> >> >> Thank you so much for the help.
> >> > >> >> >>
> >> > >> >> >> -Dave
> >> > >> >> >>
> >> > >> >> >>
> >> > >> >> >> On Fri, Jan 4, 2013 at 8:57 AM, <
> >> > >> >> >> pyt...@li...> wrote:
> >> > >> >> >>
> >> > >> >> >>>
> >> > >> >> >>> Today's Topics:
> >> > >> >> >>>
> >> > >> >> >>> 1. Re: Pytables-users Digest, Vol 80, Issue 8 (David
> Reed)
> >> > >> >> >>>
> >> > >> >> >>>
> >> > >> >> >>>
> >> > >>
> >> ----------------------------------------------------------------------
> >> > >> >> >>>
> >> > >> >> >>> Message: 1
> >> > >> >> >>> Date: Fri, 4 Jan 2013 08:56:28 -0500
> >> > >> >> >>> From: David Reed <dav...@gm...>
> >> > >> >> >>> Subject: Re: [Pytables-users] Pytables-users Digest, Vol 80,
> >> > Issue
> >> > >> 8
> >> > >> >> >>> To: pyt...@li...
> >> > >> >> >>> Message-ID:
> >> > >> >> >>> <
> >> > >> >> >>>
> >> > CAM...@ma...
> >> > >> >
> >> > >> >> >>> Content-Type: text/plain; charset="iso-8859-1"
> >> > >> >> >>>
> >> > >> >> >>> I can't thank you guys enough for the help. I was able to add the
> >> > >> >> >>> __iter__ function to the table.py file and everything seems to be
> >> > >> >> >>> working great! I'm not quite as fast as I was with iterating right
> >> > >> >> >>> off a matrix, but pretty close. I was at 555 comparisons per
> >> > >> >> >>> second, and now I'm at 420.
> >> > >> >> >>>
> >> > >> >> >>> I handled the problem I mentioned earlier by doing this, and it
> >> > >> >> >>> seems to work great:
> >> > >> >> >>>
> >> > >> >> >>> A = f.root.data.cols.A
> >> > >> >> >>> B = f.root.data.cols.B
> >> > >> >> >>>
> >> > >> >> >>> D = np.empty((len(A), len(A)))
> >> > >> >> >>> for (a1, b1, ii), (a2, b2, jj) in combinations(
> >> > >> >> >>>         izip(A, B, range(len(A))), 2):
> >> > >> >> >>>     D[ii, jj] = compare(a1, a2, b1, b2)
> >> > >> >> >>>
> >> > >> >> >>> Again, thanks a lot.
> >> > >> >> >>>
> >> > >> >> >>> -Dave
> >> > >> >> >>>
> >> > >> >> >>>
> >> > >> >> >>> On Thu, Jan 3, 2013 at 6:31 PM, <
> >> > >> >> >>> pyt...@li...> wrote:
> >> > >> >> >>>
> >> > >> >> >>> >
> >> > >> >> >>> > Today's Topics:
> >> > >> >> >>> >
> >> > >> >> >>> > 1. Re: Pytables-users Digest, Vol 80, Issue 3 (Anthony
> >> > >> Scopatz)
> >> > >> >> >>> > 2. Re: Pytables-users Digest, Vol 80, Issue 4 (Anthony
> >> > >> Scopatz)
> >> > >> >> >>> >
> >> > >> >> >>> >
> >> > >> >> >>> >
> >> > >> >>
> >> > ----------------------------------------------------------------------
> >> > >> >> >>> >
> >> > >> >> >>> > Message: 1
> >> > >> >> >>> > Date: Thu, 3 Jan 2013 17:26:55 -0600
> >> > >> >> >>> > From: Anthony Scopatz <sc...@gm...>
> >> > >> >> >>> > Subject: Re: [Pytables-users] Pytables-users Digest, Vol
> 80,
> >> > >> Issue 3
> >> > >> >> >>> > To: Discussion list for PyTables
> >> > >> >> >>> > <pyt...@li...>
> >> > >> >> >>> > Message-ID:
> >> > >> >> >>> > <CAPk-6T6sz=J5ay_a9YGLPe_yBLGa9c+XgxG0CRNs6fJ=
> >> > >> >> >>> > Gz...@ma...>
> >> > >> >> >>> > Content-Type: text/plain; charset="iso-8859-1"
> >> > >> >> >>> >
> >> > >> >> >>> > On Thu, Jan 3, 2013 at 2:17 PM, David Reed <
> >> > >> dav...@gm...>
> >> > >> >> >>> wrote:
> >> > >> >> >>> >
> >> > >> >> >>> > > Thanks a lot for the help so far guys!
> >> > >> >> >>> > >
> >> > >> >> >>> > > Looking at itertools, I found what I believe to be the perfect
> >> > >> >> >>> > > function for what I need, itertools.combinations. This appears
> >> > >> >> >>> > > to be a valid replacement for the method proposed.
> >> > >> >> >>> > >
> >> > >> >> >>> >
> >> > >> >> >>> > Yes, combinations is awesome!
> >> > >> >> >>> >
> >> > >> >> >>> >
> >> > >> >> >>> > >
> >> > >> >> >>> > > There is a small problem that I didn't mention: my compare
> >> > >> >> >>> > > function actually takes as inputs 2 columns from the table,
> >> > >> >> >>> > > like so:
> >> > >> >> >>> > >
> >> > >> >> >>> > > D = np.empty((N_elements, N_elements))
> >> > >> >> >>> > > for ii in xrange(N_elements):
> >> > >> >> >>> > >     for jj in xrange(ii+1, N_elements):
> >> > >> >> >>> > >         D[ii, jj] = compare(data['element1'][ii],
> >> > >> >> >>> > >                             data['element1'][jj],
> >> > >> >> >>> > >                             data['element2'][ii],
> >> > >> >> >>> > >                             data['element2'][jj])
> >> > >> >> >>> > >
> >> > >> >> >>> > > Is there an efficient way of using itertools with this structure?
> >> > >> >> >>> > >
> >> > >> >> >>> >
> >> > >> >> >>> > You can always make two other iterators for each column. Since
> >> > >> >> >>> > you have two columns you would have 4 iterators. I am not sure
> >> > >> >> >>> > how fast this is going to be, but I am confident that there is
> >> > >> >> >>> > a way to do this in one for-loop, which is going to be way
> >> > >> >> >>> > faster than nested loops.
> >> > >> >> >>> >
> >> > >> >> >>> > Be Well
> >> > >> >> >>> > Anthony
> >> > >> >> >>> >
> >> > >> >> >>> >
> >> > >> >> >>> > >
> >> > >> >> >>> > >
> >> > >> >> >>> > > On Thu, Jan 3, 2013 at 1:29 PM, <
> >> > >> >> >>> > > pyt...@li...> wrote:
> >> > >> >> >>> > >
> >> > >> >> >>> > >>
> >> > >> >> >>> > >> Today's Topics:
> >> > >> >> >>> > >>
> >> > >> >> >>> > >> 1. Re: Nested Iteration of HDF5 using PyTables (Josh
> >> > Ayers)
> >> > >> >> >>> > >>
> >> > >> >> >>> > >>
> >> > >> >> >>> > >>
> >> > >> >> >>>
> >> > >>
> >> ----------------------------------------------------------------------
> >> > >> >> >>> > >>
> >> > >> >> >>> > >> Message: 1
> >> > >> >> >>> > >> Date: Thu, 3 Jan 2013 10:29:33 -0800
> >> > >> >> >>> > >> From: Josh Ayers <jos...@gm...>
> >> > >> >> >>> > >> Subject: Re: [Pytables-users] Nested Iteration of HDF5
> >> using
> >> > >> >> >>> PyTables
> >> > >> >> >>> > >> To: Discussion list for PyTables
> >> > >> >> >>> > >> <pyt...@li...>
> >> > >> >> >>> > >> Message-ID:
> >> > >> >> >>> > >> <
> >> > >> >> >>> > >>
> >> > >> >>
> >> CAC...@ma...>
> >> > >> >> >>> > >> Content-Type: text/plain; charset="iso-8859-1"
> >> > >> >> >>> > >>
> >> > >> >> >>> > >> David,
> >> > >> >> >>> > >>
> >> > >> >> >>> > >> The change in issue 27 was only for iteration over a
> >> > >> >> >>> > >> tables.Column instance. To use it, tweak Anthony's code as
> >> > >> >> >>> > >> follows. This will iterate over the "element" column, as in
> >> > >> >> >>> > >> your original example.
> >> > >> >> >>> > >>
> >> > >> >> >>> > >> Note also that this will only work with the development version
> >> > >> >> >>> > >> of PyTables available on GitHub. It will be very slow using the
> >> > >> >> >>> > >> released v2.4.0.
> >> > >> >> >>> > >>
> >> > >> >> >>> > >> from itertools import izip
> >> > >> >> >>> > >>
> >> > >> >> >>> > >> with tb.openFile(...) as f:
> >> > >> >> >>> > >>     data = f.root.data.cols.element
> >> > >> >> >>> > >>     data_i = iter(data)
> >> > >> >> >>> > >>     data_j = iter(data)
> >> > >> >> >>> > >>     data_i.next()  # throw the first value away
> >> > >> >> >>> > >>     for i, j in izip(data_i, data_j):
> >> > >> >> >>> > >>         compare(i, j)
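For readers on Python 3, the offset-by-one idea can be tried on a plain list. This is only a sketch of the iterator pattern; the list stands in for the table column and no PyTables call is involved.

```python
data = [10, 20, 30, 40]       # stand-in for the rows of a table column

data_i = iter(data)
data_j = iter(data)
next(data_i)                  # throw the first value away (Python 3 spelling of data_i.next())

# zip is lazy in Python 3, so it plays the role of izip here.
pairs = list(zip(data_i, data_j))
```

Note that this pairs each row only with its neighbour, giving N-1 comparisons rather than all N(N-1)/2 pairs.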
> >> > >> >> >>> > >>
> >> > >> >> >>> > >>
> >> > >> >> >>> > >> Hope that helps,
> >> > >> >> >>> > >> Josh
> >> > >> >> >>> > >>
> >> > >> >> >>> > >>
> >> > >> >> >>> > >>
> >> > >> >> >>> > >> On Thu, Jan 3, 2013 at 9:11 AM, Anthony Scopatz <
> >> > >> >> sc...@gm...>
> >> > >> >> >>> > >> wrote:
> >> > >> >> >>> > >>
> >> > >> >> >>> > >> > Hi David,
> >> > >> >> >>> > >> >
> >> > >> >> >>> > >> > Tables and table column iteration have been overhauled fairly
> >> > >> >> >>> > >> > recently [1]. So you might try creating two iterators, offset
> >> > >> >> >>> > >> > by one, and then doing the comparison. I am hacking this out
> >> > >> >> >>> > >> > super quick so please forgive me:
> >> > >> >> >>> > >> >
> >> > >> >> >>> > >> > from itertools import izip
> >> > >> >> >>> > >> >
> >> > >> >> >>> > >> > with tb.openFile(...) as f:
> >> > >> >> >>> > >> >     data = f.root.data
> >> > >> >> >>> > >> >     data_i = iter(data)
> >> > >> >> >>> > >> >     data_j = iter(data)
> >> > >> >> >>> > >> >     data_i.next()  # throw the first value away
> >> > >> >> >>> > >> >     for i, j in izip(data_i, data_j):
> >> > >> >> >>> > >> >         compare(i, j)
> >> > >> >> >>> > >> >
> >> > >> >> >>> > >> > You get the idea ;)
> >> > >> >> >>> > >> >
> >> > >> >> >>> > >> > Be Well
> >> > >> >> >>> > >> > Anthony
> >> > >> >> >>> > >> >
> >> > >> >> >>> > >> > 1. https://github.com/PyTables/PyTables/issues/27
> >> > >> >> >>> > >> >
> >> > >> >> >>> > >> >
> >> > >> >> >>> > >> > On Thu, Jan 3, 2013 at 9:25 AM, David Reed <dav...@gm...> wrote:
> >> > >> >> >>> > >> >
> >> > >> >> >>> > >> >> I was hoping someone could help me out here.
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >> This is from a post I put up on StackOverflow,
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >> I have a fairly large dataset that I store in HDF5 and access using
> >> > >> >> >>> > >> >> PyTables. One operation I need to do on this dataset is a pairwise
> >> > >> >> >>> > >> >> comparison between each of the elements. This requires 2 loops, one to
> >> > >> >> >>> > >> >> iterate over each element, and an inner loop to iterate over every other
> >> > >> >> >>> > >> >> element. This operation thus looks at N(N-1)/2 comparisons.
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >> For fairly small sets I found it to be faster to dump the contents into a
> >> > >> >> >>> > >> >> multidimensional numpy array and then do my iteration. I run into problems
> >> > >> >> >>> > >> >> with large sets because of memory issues and need to access each element of
> >> > >> >> >>> > >> >> the dataset at run time.
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >> Putting the elements into an array gives me about 600 comparisons per
> >> > >> >> >>> > >> >> second, while operating on hdf5 data itself gives me about 300 comparisons
> >> > >> >> >>> > >> >> per second.
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >> Is there a way to speed this process up?
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >> Example follows (this is not my real code, just an example):
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >> *Small Set*:
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >> with tb.openFile(h5_file, 'r') as f:
> >> > >> >> >>> > >> >>     data = f.root.data
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >>     N_elements = len(data)
> >> > >> >> >>> > >> >>     elements = np.empty((N_elements, int(1e5)))
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >>     # copy each row's 'element' field into memory
> >> > >> >> >>> > >> >>     for ii, d in enumerate(data):
> >> > >> >> >>> > >> >>         elements[ii] = d['element']
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >> D = np.empty((N_elements, N_elements))
> >> > >> >> >>> > >> >> for ii in xrange(N_elements):
> >> > >> >> >>> > >> >>     for jj in xrange(ii+1, N_elements):
> >> > >> >> >>> > >> >>         D[ii, jj] = compare(elements[ii], elements[jj])
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >> *Large Set*:
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >> with tb.openFile(h5_file, 'r') as f:
> >> > >> >> >>> > >> >>     data = f.root.data
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >>     N_elements = len(data)
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >>     D = np.empty((N_elements, N_elements))
> >> > >> >> >>> > >> >>     for ii in xrange(N_elements):
> >> > >> >> >>> > >> >>         for jj in xrange(ii+1, N_elements):
> >> > >> >> >>> > >> >>             D[ii, jj] = compare(data['element'][ii], data['element'][jj])
> >> > >> >> >>> > >> >>
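One middle ground between the two versions above is to work in blocks: read a slice of rows once, then run the inner loops on the in-memory slice. What follows is only a sketch under assumptions, not something proposed in the thread. `read_block(start, stop)` is a hypothetical callable standing in for whatever returns rows [start, stop) from the HDF5 column, `compare` is a placeholder metric, and plain lists stand in for the table so the example runs without a file. It is written with `range` so it runs on Python 3.

```python
def pairwise_blocked(read_block, n_rows, block_size, compare):
    """Fill the upper triangle of D, holding at most two blocks in memory.

    read_block(start, stop) is a hypothetical callable that returns rows
    [start, stop) as an in-memory sequence.
    """
    D = [[0.0] * n_rows for _ in range(n_rows)]
    for i0 in range(0, n_rows, block_size):
        i1 = min(i0 + block_size, n_rows)
        block_i = read_block(i0, i1)
        for j0 in range(i0, n_rows, block_size):
            j1 = min(j0 + block_size, n_rows)
            # Reuse the block already in memory on the diagonal.
            block_j = block_i if j0 == i0 else read_block(j0, j1)
            for ii in range(i0, i1):
                # Only visit pairs with jj > ii (upper triangle).
                for jj in range(max(ii + 1, j0), j1):
                    D[ii][jj] = compare(block_i[ii - i0], block_j[jj - j0])
    return D


# Stand-in data: a list of rows plays the role of the on-disk column.
rows = [[float(k), float(k * k)] for k in range(10)]


def read_block(start, stop):
    return rows[start:stop]


def compare(x, y):
    # Placeholder metric: sum of absolute differences.
    return sum(abs(a - b) for a, b in zip(x, y))


D_blocked = pairwise_blocked(read_block, len(rows), 4, compare)
```

The block size trades memory for the number of reads: each block pair is read once, rather than each element pair.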
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >>
> >> > >> >> >>> > >> >
> >> > >> >> >>> > >> >
> >> > >> >> >>> > >> >
> >> > >> >> >>> > >>
> >> > >> >> >>> > >
> >> > >> >> >>> > >
> >> > >> >> >>> > >
> >> > >> >> >>> > >
> >> > >> >> >>> > Message: 2
> >> > >> >> >>> > Date: Thu, 3 Jan 2013 17:30:59 -0600
> >> > >> >> >>> > From: Anthony Scopatz <sc...@gm...>
> >> > >> >> >>> > Subject: Re: [Pytables-users] Pytables-users Digest, Vol 80, Issue 4
> >> > >> >> >>> > To: Discussion list for PyTables
> >> > >> >> >>> > <pyt...@li...>
> >> > >> >> >>> > Message-ID: <CAP...@ma...>
> >> > >> >> >>> > Content-Type: text/plain; charset="iso-8859-1"
> >> > >> >> >>> >
> >> > >> >> >>> > Josh is right that you can just edit the code by hand (which works but
> >> > >> >> >>> > sucks).
> >> > >> >> >>> >
> >> > >> >> >>> > However, on Windows -- on the rare occasion when I also have to develop on
> >> > >> >> >>> > it -- I typically use a distribution that includes a compiler, cython,
> >> > >> >> >>> > hdf5, and pytables already and then I install my development version from
> >> > >> >> >>> > github OVER this. I recommend either EPD or Anaconda, though other
> >> > >> >> >>> > distributions listed here [1] might also work.
> >> > >> >> >>> >
> >> > >> >> >>> > Be well
> >> > >> >> >>> > Anthony
> >> > >> >> >>> >
> >> > >> >> >>> > 1. http://numfocus.org/projects-2/software-distributions/
> >> > >> >> >>> >
> >> > >> >> >>> >
> >> > >> >> >>> > On Thu, Jan 3, 2013 at 3:46 PM, Josh Ayers <jos...@gm...> wrote:
> >> > >> >> >>> >
> >> > >> >> >>> > > The change was in pure Python code, so you should be able to just paste in
> >> > >> >> >>> > > the changes to your local copy. Start with the table.Column.__iter__
> >> > >> >> >>> > > method (lines 3296-3310) here.
> >> > >> >> >>> > >
> >> > >> >> >>> > > https://github.com/PyTables/PyTables/blob/b479ed025f4636f7f4744ac83a89bc947808907c/tables/table.py
> >> > >> >> >>> > >
> >> > >> >> >>> > > It needs to be modified slightly because it uses some additional features
> >> > >> >> >>> > > that aren't available in the released version (the out=buf_slice argument
> >> > >> >> >>> > > to table.read). The following should work.
> >> > >> >> >>> > >
> >> > >> >> >>> > > def __iter__(self):
> >> > >> >> >>> > >     table = self.table
> >> > >> >> >>> > >     itemsize = self.dtype.itemsize
> >> > >> >> >>> > >     nrowsinbuf = table._v_file.params['IO_BUFFER_SIZE'] // itemsize
> >> > >> >> >>> > >     max_row = len(self)
> >> > >> >> >>> > >     for start_row in xrange(0, len(self), nrowsinbuf):
> >> > >> >> >>> > >         end_row = min([start_row + nrowsinbuf, max_row])
> >> > >> >> >>> > >         buf = table.read(start_row, end_row, 1, field=self.pathname)
> >> > >> >> >>> > >         for row in buf:
> >> > >> >> >>> > >             yield row
> >> > >> >> >>> > >
> >> > >> >> >>> > >
> >> > >> >> >>> > > I haven't tested this, but I think it will work.
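The buffering idea in the method quoted above can be tried without a PyTables file. This is only a sketch under assumptions: `read(start, stop)` is a hypothetical stand-in for the `table.read(start_row, end_row, 1, field=...)` call, and a plain list plays the role of the column. It is not PyTables code, and it uses `range` so it runs on Python 3.

```python
def iter_buffered(read, nrows, nrowsinbuf):
    """Yield rows one at a time while fetching them nrowsinbuf at a time.

    read(start, stop) is a hypothetical callable returning rows
    [start, stop); in the quoted method that role is played by table.read.
    """
    for start_row in range(0, nrows, nrowsinbuf):
        end_row = min(start_row + nrowsinbuf, nrows)
        # One bulk read per buffer instead of one read per row.
        for row in read(start_row, end_row):
            yield row


# Stand-in data source that records which slices were requested.
data = list(range(10))
calls = []


def read(start, stop):
    calls.append((start, stop))
    return data[start:stop]


result = list(iter_buffered(read, len(data), 4))
```

With 10 rows and a buffer of 4, the source is hit three times rather than ten, which is the whole point of the change.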
> >> > >> >> >>> > >
> >> > >> >> >>> > > Josh
> >> > >> >> >>> > >
> >> > >> >> >>> > >
> >> > >> >> >>> > >
> >> > >> >> >>> > > On Thu, Jan 3, 2013 at 1:25 PM, David Reed <dav...@gm...> wrote:
> >> > >> >> >>> > >
> >> > >> >> >>> > >> I apologize if I'm starting to sound helpless, but I'm forced to work on
> >> > >> >> >>> > >> Windows 7 at work and have never had luck compiling python source
> >> > >> >> >>> > >> successfully. I have had to rely on precompiled binaries and now it's
> >> > >> >> >>> > >> biting me in the butt.
> >> > >> >> >>> > >>
> >> > >> >> >>> > >> Is there any quick fix I can do to improve this iteration using v2.4.0?
> >> > >> >> >>> > >>
> >> > >> >> >>> > >>
> >> > >> >> >>> > >> On Thu, Jan 3, 2013 at 3:17 PM, <
> >> > >> >> >>> > >> pyt...@li...> wrote:
> >> > >> >> >>> > >>
> >> > >> >> >>> > >>> Send Pytables-users mailing list submissions to
> >> > >> >> >>> > >>> pyt...@li...
> >> > >> >> >>> > >>>
> >> > >> >> >>> > >>> To subscribe or unsubscribe via the World Wide Web,
> >> visit
> >> > >> >> >>> > >>>
> >> > >> >> >>> https://lists.sourceforge.net/lists/listinfo/pytables-users
> >> > >> >> >>> > >>> or, via email, send a message with subject or body
> >> 'help'
> >> > to
> >> > >> >> >>> > >>> pyt...@li...
> >> > >> >> >>> > >>>
> >> > >> >> >>> > >>> You can reach the person managing the list at
> >> > >> >> >>> > >>> pyt...@li...
> >> > >> >> >>> > >>>
> >> > >> >> >>> > >>> When replying, please edit your Subject line so it is
> >> more
> >> > >> >> specific
> >> > >> >> >>> > >>> than "Re: Contents of Pytables-users digest..."
> >> > >> >> >>> > >>>
> >> > >> >> >>> > >>>
> >> > >> >> >>> > >>> Today's Topics:
> >> > >> >> >>> > >>>
> >> > >> >> >>> > >>> 1. Re: Pytables-users Digest, Vol 80, Issue 2
> (David
> >> > Reed)
> >> > >> >> >>> > >>> 2. Re: Pytables-users Digest, Vol 80, Issue 3
> (David
> >> > Reed)
> >> > >> >> >>> > >>>
> >> > >> >> >>> > >>>
> >> > >> >> >>> > >>>
> >> > >> >> >>>
> >> > >>
> >> ----------------------------------------------------------------------
> >> > >> >> >>> > >>>
> >> > >> >> >>> > >>> Message: 1
> >> > >> >> >>> > >>> Date: Thu, 3 Jan 2013 13:44:29 -0500
> >> > >> >> >>> > >>> From: David Reed <dav...@gm...>
> >> > >> >> >>> > >>> Subject: Re: [Pytables-users] Pytables-users Digest, Vol 80, Issue 2
> >> > >> >> >>> > >>> To: pyt...@li...
> >> > >> >> >>> > >>> Message-ID:
> >> > >> >> >>> > >>>
> <CAM6XA7=8ocg5WPD4KLSvLhSw-3BCvq5u7MRxq3Ajd6ha=
> >> > >> >> >>> > >>> ev...@ma...>
> >> > >> >> >>> > >>> Content-Type: text/plain; charset="iso-8859-1"
> >> > >> >> >>> > >>>
> >> > >> >> >>> > >>> Thanks Anthony, but unless I'm missing something I don't think that method
> >> > >> >> >>> > >>> will work since this will only be comparing the ith element with the
> >> > >> >> >>> > >>> (i+1)th element. I still need 2 for loops, right?
> >> > >> >> >>> > >>>
> >> > >> >> >>> > >>> Using itertools might speed things up though. I've never used it, so I
> >> > >> >> >>> > >>> will give it a shot and let you know how it goes. Looks like I need to
> >> > >> >> >>> > >>> download the latest release before I do that too. Thanks for the help.
> >> > >> >> >>> > >>>
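The difference between the offset-by-one iterators and the full N(N-1)/2 pairing is easy to see on a toy list. The following is a sketch only: `elements` is a stand-in for the rows of a table column, and `zip` is used in place of `izip` so it runs on Python 3.

```python
from itertools import combinations

elements = ['a', 'b', 'c', 'd']   # stand-in for the rows of a table column
n = len(elements)

# Two iterators offset by one only pair each element with its neighbour,
# which gives N-1 pairs.
adjacent = list(zip(elements[1:], elements))

# combinations() yields every unordered pair exactly once: N(N-1)/2 pairs.
# Enumerating first keeps the row indices needed to fill D[ii, jj].
all_pairs = list(combinations(enumerate(elements), 2))
pair_indices = [(ii, jj) for (ii, _), (jj, _) in all_pairs]
```

One caveat: `itertools.combinations` first copies its input into a tuple, so passing it a table column holds every row in memory at once. That is fine for small sets, but it does not avoid the memory problem on large ones.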
> >> > >> >> >>> > >>> -Dave
> >> > >> >> >>> > >>>
> >> > >> >> >>> > >>>
> >> > >> >> >>> > >>>
> >> > >> >> >>> > >>> On Thu, Jan 3, 2013 at 12:12 PM, <
> >> > >> >> >>> > >>> pyt...@li...> wrote:
> >> > >> >> >>> > >>>
> >> > >> >> >>> > >>> > Today's Topics:
> >> > >> >> >>> > >>> >
> >> > >> >> >>> > >>> > 1. Re: Nested Iteration of HDF5 using PyTables
> >> > (Anthony
> >> > >> >> >>> Scopatz)
> >> > >> >> >>> > >>> >
> >> > >> >> >>> > >>> >
> >> > >> >> >>> > >>> >
> >> > >> >> >>> >
> >> > >> >>
> >> > ----------------------------------------------------------------------
> >> > >> >> >>> > >>> >
> >> > >> >> >>> > >>> > Message: 1
> >> > >> >> >>> > >>> > Date: Thu, 3 Jan 2013 11:11:47 -0600
> >> > >> >> >>> > >>> > From: Anthony Scopatz <sc...@gm...>
> >> > >> >> >>> > >>> > Subject: Re: [Pytables-users] Nested Iteration of HDF5 using PyTables
> >> > >> >> >>> > >>> > To: Discussion list for PyTables
> >> > >> >> >>> > >>> > <pyt...@li...>
> >> > >> >> >>> > >>> > Message-ID:
> >> > >> >> >>> > >>> > <CAPk-6T5b=
> >> > >> >> >>> > >>> >
> >> 1EG...@ma...
> >> > >
> >> > >> >> >>> > >>> > Content-Type: text/plain; charset="iso-8859-1"
> >> > >> >> >>> > >>> >
> >> > >> >> >>> > >>> > Hi David,
> >> > >> >> >>> > >>> >
> >> > >> >> >>> > >>> > Tables and table column iteration have been overhauled fairly recently
> >> > >> >> >>> > >>> > [1]. So you might try creating two iterators, offset by one, and then
> >> > >> >> >>> > >>> > doing the comparison. I am hacking this out super quick so please
> >> > >> >> >>> > >>> > forgive me:
> >> > >> >> >>> > >>> >
> >> > >> >> >>> > >>> > from itertools import izip
> >> > >> >> >>> > >>> >
> >> > >> >> >>> > >>> > with tb.openFile(...) as f:
> >> > >> >> >>> > >>> >     data = f.root.data
> >> > >> >> >>> > >>> >     data_i = iter(data)
> >> > >> >> >>> > >>> >     data_j = iter(data)
> >> > >> >> >>> > >>> >     data_i.next() # throw the first value away
> >> > >> >> >>> > >>> >     for i, j in izip(data_i, data_j):
> >> > >> >> >>> > >>> >         compare(i, j)
> >> > >> >> >>> > >>> >
> >> > >> >> >>> > >>> > You get the idea ;)
> >> > >> >> >>> > >>> >
> >> > >> >> >>> > >>> > Be Well
> >> > >> >> >>> > >>> > Anthony
> >> > >> >> >>> > >>> >
> >> > >> >> >>> > >>> > 1. https://github.com/PyTables/PyTables/issues/27
> >> > >> >> >>> > >>> >
> >> > >> >> >>> > >>> >
> >> > >> >> >>> > >>> > On Thu, Jan 3, 2013 at 9:25 AM, David Reed <dav...@gm...> wrote:
> >> > >> >> >>> > >>> >
> >> > >> >> >>> > >>> > > I was hoping someone could help me out here.
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > > This is from a post I put up on StackOverflow,
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > > I have a fairly large dataset that I store in HDF5 and access using
> >> > >> >> >>> > >>> > > PyTables. One operation I need to do on this dataset is a pairwise
> >> > >> >> >>> > >>> > > comparison between each of the elements. This requires 2 loops, one to
> >> > >> >> >>> > >>> > > iterate over each element, and an inner loop to iterate over every
> >> > >> >> >>> > >>> > > other element. This operation thus looks at N(N-1)/2 comparisons.
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > > For fairly small sets I found it to be faster to dump the contents into
> >> > >> >> >>> > >>> > > a multidimensional numpy array and then do my iteration. I run into
> >> > >> >> >>> > >>> > > problems with large sets because of memory issues and need to access
> >> > >> >> >>> > >>> > > each element of the dataset at run time.
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > > Putting the elements into an array gives me about 600 comparisons per
> >> > >> >> >>> > >>> > > second, while operating on hdf5 data itself gives me about 300
> >> > >> >> >>> > >>> > > comparisons per second.
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > > Is there a way to speed this process up?
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > > Example follows (this is not my real code, just an example):
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > > *Small Set*:
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > > with tb.openFile(h5_file, 'r') as f:
> >> > >> >> >>> > >>> > >     data = f.root.data
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > >     N_elements = len(data)
> >> > >> >> >>> > >>> > >     elements = np.empty((N_elements, int(1e5)))
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > >     # copy each row's 'element' field into memory
> >> > >> >> >>> > >>> > >     for ii, d in enumerate(data):
> >> > >> >> >>> > >>> > >         elements[ii] = d['element']
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > > D = np.empty((N_elements, N_elements))
> >> > >> >> >>> > >>> > > for ii in xrange(N_elements):
> >> > >> >> >>> > >>> > >     for jj in xrange(ii+1, N_elements):
> >> > >> >> >>> > >>> > >         D[ii, jj] = compare(elements[ii], elements[jj])
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > > *Large Set*:
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > > with tb.openFile(h5_file, 'r') as f:
> >> > >> >> >>> > >>> > >     data = f.root.data
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > >     N_elements = len(data)
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > >     D = np.empty((N_elements, N_elements))
> >> > >> >> >>> > >>> > >     for ii in xrange(N_elements):
> >> > >> >> >>> > >>> > >         for jj in xrange(ii+1, N_elements):
> >> > >> >> >>> > >>> > >             D[ii, jj] = compare(data['element'][ii], data['element'][jj])
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > >
> >> > >> >> >>> > >>> > >
[truncated message content] |