From: Antonio V. <ant...@ti...> - 2013-07-17 19:13:02
Hi Pushkar,

On 17/07/2013 19:28, Pushkar Raj Pande wrote:
> Hi all,
>
> I am trying to figure out the best way to bulk load data into PyTables.
> This question may have been answered already, but I couldn't find what
> I was looking for.
>
> The source data is in CSV form, which may require parsing, type
> checking, and setting default values if a field doesn't conform to the
> type of the column. There are over 100 columns in a record. Doing this
> in a Python loop for each row is very slow compared to just fetching
> the rows from one PyTables file and writing them to another; the
> difference is almost a factor of ~50.
>
> I believe that if I load the data using a C procedure that does the
> parsing and builds the records to write into PyTables, I can get close
> to the speed of just copying rows from one PyTables file to another.
> But maybe there is something simpler and better that already exists.
> Can someone please advise? If a C procedure is what I should write, can
> someone point me to some examples or snippets that I can refer to in
> order to put this together?
>
> Thanks,
> Pushkar

NumPy has some tools for loading data from CSV files, such as loadtxt [1],
genfromtxt [2] and other variants. Are none of them OK for you?

[1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.loadtxt.html#numpy.loadtxt
[2] http://docs.scipy.org/doc/numpy/reference/generated/numpy.genfromtxt.html#numpy.genfromtxt

cheers

-- 
Antonio Valentino