From: Francesc A. <fa...@py...> - 2012-05-10 15:36:39
|
Hey List,

Just a few words to let you know that yesterday I gave a fairly extensive talk about PyTables at the Austin Python Meetup. I explained not only the basics of PyTables, but also its most advanced features (compression, out-of-core computation and querying). People were quite responsive and asked a lot of questions, especially about the compression (Blosc) and query features.

You can find the slides here:

http://www.pytables.org/docs/PUG-Austin-2012-v3.pdf

Cheers,

--
Francesc Alted |
From: Anthony S. <sc...@gm...> - 2012-05-10 16:11:04
|
Thanks for sharing Francesc! |
From: Alvaro T. C. <al...@mi...> - 2012-05-10 17:14:43
|
The graphical explanation of the different containers is masterly and, I believe, supersedes the table that we had talked about for the documentation.

I think the schematics deserve a prominent place on the web page. They are a very good symbolic explanation of the basics of PyTables.

As for the tables.Expr example of an in-kernel query,

[ r['c1'] for r in table.where('(c2>2.1)&(c3==True)') ]

now that, thanks to Josh, there is a facility to obtain dataset sizes, perhaps some interesting things become possible:

a) I have always wondered why tables.Expr 'must' be used in an iterative context, i.e. pay the price of building the Python list, which is not the best container to iterate over afterwards. My explanation for it is that you don't know how big the result set will be, and thus want to avoid returning a big object in memory. But now it would be possible that, if the size of the columns involved fits in memory (or, let's say, in a configurable fraction of the total RAM), PyTables returns a numpy mask or an index array, which are certainly very useful for further numpy work. A new function name could be provided for this functionality.

b) More generally, expanding on this: knowing the size of the datasets and the available memory, PyTables could eventually decide whether to perform operations in memory or in kernel.

What do you think?

-á. |
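For readers following point (a), here is a minimal sketch of the query quoted above, next to two `Table` methods that return NumPy arrays instead of an iterator. It uses the PyTables 3.x method names (`open_file`, `get_where_list`, `read_where`); the 2.x series current at the time of this thread spelled them in camelCase. The schema, file name and data are made up for illustration and are not taken from the slides.

```python
import os
import tempfile

import tables as tb


class Row(tb.IsDescription):
    # Hypothetical schema; only the column names come from the quoted query.
    c1 = tb.Int32Col()
    c2 = tb.Float64Col()
    c3 = tb.BoolCol()


# Illustrative file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.h5")

with tb.open_file(path, mode="w") as h5:
    table = h5.create_table("/", "t", Row)
    # Ten made-up rows: c1 = i, c2 = i / 2, c3 = (i is even).
    table.append([(i, i * 0.5, i % 2 == 0) for i in range(10)])
    table.flush()

    cond = "(c2 > 2.1) & (c3 == True)"

    # Iterator form, as in the message: the result is a Python list.
    as_list = [r["c1"] for r in table.where(cond)]

    # Row numbers of the matching rows, as a NumPy integer array.
    coords = table.get_where_list(cond)

    # Values of column c1 for the matching rows, as a NumPy array.
    c1_hits = table.read_where(cond, field="c1")
```

Both array-returning calls load the full result set into memory, so the concern raised in (a) about the size of the result still applies to them.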
From: Francesc A. <fa...@py...> - 2012-05-10 19:47:25
|
On 5/10/12 12:14 PM, Alvaro Tejero Cantero wrote:
> The graphical explanation of the different containers is masterly, and
> I believe, supersedes the table that we had talked about for the
> documentation.
>
> I think the schematics deserve a prominent place in the web page.
> They are a very good symbolic explanation of the basics of PyTables.

Glad that you like it. In fact I think you are right: this is perhaps the first time that some schematics have been used for describing the basic objects in PyTables. And my impression from the talk yesterday is that people really get the gist of PyTables very quickly.

> As for the tables.Expr example of an in-kernel query,
>
> [ r['c1'] for r in table.where('(c2>2.1)&(c3==True)') ]
>
> now that there exists thanks to Josh a facility to obtain dataset
> sizes, perhaps some interesting things become possible

I think you are mixing concepts here. tables.Expr is for out-of-core operations. I suppose you mean Numexpr here.

> a) I have always wondered why tables.Expr 'must' be used in an
> iterative context, i.e. pay the price of building the Python list,
> which is not the best container to iterate on afterwards. My
> explanation for it is that you don't know how big the result set will
> be, and thus want to avoid returning a big object in memory. But now
> it would be possible that if the size of the columns that are involved
> fits in memory (or, let's say a fraction of the total RAM that is
> configurable), PyTables returns a numpy mask, or an index array, which
> are certainly very useful for further numpy work. A new function name
> could be provided for this functionality.

Hmm, the Table.where() iterator is very fast already (I can assure you that a lot of optimizations and caching stuff is there), but I agree that, for the indexed case, there would be situations where returning a mask or an index array would be better (read: faster).

> b) more generally, expanding on this, knowing the size of datasets and
> the available memory, PyTables could eventually decide whether to
> perform operations in memory or in kernel.

In-memory or in-kernel? You probably mean indexed or in-kernel, right? Yes, that's certainly another nice place for further optimizations.

--
Francesc Alted |
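To illustrate the distinction drawn above, the following is a small sketch of what `tables.Expr` is for: evaluating an arithmetic expression over on-disk arrays and writing the result to another on-disk array. It is not a query like `Table.where()`. The array names, sizes and file name are assumptions chosen for the example, and the calls use the PyTables 3.x spelling.

```python
import os
import tempfile

import numpy as np
import tables as tb

# Illustrative file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "expr_demo.h5")

with tb.open_file(path, mode="w") as h5:
    # Two small on-disk operands (tiny here; real use cases exceed RAM).
    a = h5.create_carray("/", "a", obj=np.arange(8, dtype="float64"))
    b = h5.create_carray("/", "b", obj=np.full(8, 2.0))

    # On-disk container that receives the result.
    out = h5.create_carray("/", "out", atom=tb.Float64Atom(), shape=(8,))

    # Operands are passed explicitly by name instead of being looked up
    # in the calling scope.
    expr = tb.Expr("a * b + 1", uservars={"a": a, "b": b})
    expr.set_output(out)
    expr.eval()

    # Read the result back only to inspect it in this small example.
    result = out[:]
```

With operands of this size everything would fit in memory anyway; the point of the out-of-core form is that the operands and the output are read and written on disk rather than held as whole in-memory arrays.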