From: Antonio V. <ant...@ti...> - 2013-06-25 18:02:57
Hi Sebastian,

On 25/06/2013 09:36, Wagner Sebastian wrote:
> Hi Anthony and Antonio,
>
> Thanks for your fast responses. It's great to hear all features are now free to use, though I needed a week and a half to find that out.
>
> The first reference I read to learn the usage of PyTables was Hints for SQL Users [1], where it is stated several times, for example in the section 'Creating an index':
>> Indexing is supported in the commercial version of PyTables (PyTablesPro).
> I would suggest that these texts be updated.
> Being convinced it was only available in the Pro version after reading that so often, I also overlooked the warning on the PyTables Pro page [2] (as I was only interested in the features not available in the free version, I just scrolled down immediately, reading diagonally...). So the next suggestion is to give the warning text there a color :)
>
> [1]
> http://www.pytables.org/moin/HintsForSQLUsers#Creatinganindex
> http://www.pytables.org/moin/HintsForSQLUsers#Selectingdata
> [2]
> http://www.pytables.org/moin/PyTablesPro
>
> regards,
> Sebastian
>

thank you for reporting the issue, I will fix it ASAP.
The same problem also affects the corresponding cookbook page [1].
Anyway, please feel free to update the wiki if you find outdated material.

[1] http://pytables.github.io/cookbook/hints_for_sql_users.html

> On Mon, Jun 24, 2013 at 4:25 AM, Wagner Sebastian <Seb...@ai...> wrote:
>
>> Dear PyTables-Users,
>>
>> For testing purposes I use a PyTables DB with 4 columns (1x Uint8 and
>> 3x Float) with 750k rows, the total file size about 90MB. As the free
>> version does not support indexing, I thought that a search (full-table)
>> on this database would last at least one or two seconds, because the
>> file has to be loaded first (I/O bottleneck), and then the search
>> over ~20k rows can begin. But PyTables took only 0.05 seconds for a
>> full table search (in-kernel, so near C-speed, but nevertheless full
>> table), while my bisecting algorithm with a precomputed sorted list
>> wrapped around PyTables (but saved in there) took about 0.5
>> seconds.
>>
>> So the thing I don't understand: How can PyTables be so fast without
>> any indexing?
>>
>
> Hi Sebastian,
>
> First, there is no longer a non-free version of PyTables and v3.0 *does* have indexing capabilities. However, you have to enable them, so you probably weren't using them.
>
> PyTables is fast because HDF5 is a binary format, it uses pthreads under the covers to parallelize some tasks, and it uses numexpr (which is also
> parallel) to evaluate many expressions. All of these things help make PyTables great!
>
> Be Well
> Anthony
>
>
> On 24/06/2013 11:25, Wagner Sebastian wrote:
>> Dear PyTables-Users,
>>
>> For testing purposes I use a PyTables DB with 4 columns (1x Uint8 and 3x Float) with 750k rows, the total file size about 90MB. As the free version does not support indexing, I thought that a search (full-table) on this database would last at least one or two seconds, because the file has to be loaded first (I/O bottleneck), and then the search over ~20k rows can begin. But PyTables took only 0.05 seconds for a full table search (in-kernel, so near C-speed, but nevertheless full table), while my bisecting algorithm with a precomputed sorted list wrapped around PyTables (but saved in there) took about 0.5 seconds.
>>
>> So the thing I don't understand: How can PyTables be so fast without any indexing?
>>
>> I'm using 3.0.0rc2 coming with WinPython
>>
>> Regards,
>> Sebastian
>
> The indexing features of PyTables Pro have been available in the open source version of PyTables since version 2.3 (please see [1]).
>
> [1]
> http://pytables.github.io/release-notes/RELEASE_NOTES_v2.3.x.html#changes-from-2-2-1-to-2-3
>
> ciao
>
> --
> Antonio Valentino
>

--
Antonio Valentino
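
For readers following Anthony's point that indexing has to be enabled explicitly, here is a minimal sketch, assuming PyTables 3.x and NumPy are installed. The function name `build_and_query`, the table name `samples`, the column names and the random data are all made up for illustration and are not Sebastian's actual file. It runs the same in-kernel query once as a full scan and once after calling `create_index()` on one column, and checks that both return the same rows. It does not attempt to reproduce the timings quoted in the thread.

```python
import os
import tempfile

import numpy as np
import tables


def build_and_query(path, nrows=10000):
    """Write a small table, then query it before and after indexing column 'a'."""
    # Hypothetical layout mirroring the thread: one UInt8 and three float columns.
    dtype = np.dtype([("flag", np.uint8), ("a", np.float64),
                      ("b", np.float64), ("c", np.float64)])
    rng = np.random.RandomState(0)
    data = np.zeros(nrows, dtype=dtype)
    data["flag"] = rng.randint(0, 2, size=nrows)
    for name in ("a", "b", "c"):
        data[name] = rng.random_sample(nrows)

    condition = "a > 0.99"
    with tables.open_file(path, mode="w") as h5:
        table = h5.create_table(h5.root, "samples", description=dtype)
        table.append(data)
        table.flush()

        # In-kernel query with no index: a full-table scan of the condition.
        scanned = table.read_where(condition)

        # Indexing is opt-in: the column is not indexed until this call.
        table.cols.a.create_index()
        indexed = table.read_where(condition)

        return {
            "expected_hits": int((data["a"] > 0.99).sum()),
            "scan_hits": len(scanned),
            "index_hits": len(indexed),
            "is_indexed": bool(table.cols.a.is_indexed),
        }


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")
    print(build_and_query(demo_path))
```

Both queries use the same condition string, so the only difference between them is whether an index exists on column `a` at query time.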