From: Bernard K. <cha...@be...> - 2004-03-08 09:08:46
Dear community,

I have to develop a program that performs numerical analysis on data coming from a fab production line. Every month I can count on approximately 100 000 new entries. Each entry consists of general information (date, machine, ...) and the raw data that we measure (a matrix of size 2000x1000 or more).

So far I store the general information in a relational database (Firebird with kinterbasdb), and the raw data are simply kept in individual files. I appreciate the database because I can sort my data on the different columns of my table and run fast searches to organize my huge number of entries. But I also realize that the numerical treatment that follows will become quite cumbersome. This is why I am interested in PyTables (to be honest, I am also interested in PyTables because I truly hate SQL and love Python).

Here are my questions:

- Can I replace my database with PyTables?
- Is it possible to sort a table in PyTables efficiently (meaning fast) along a specific column? How?
- Does the concept of a primary key exist in PyTables? I use a primary key to avoid inserting the same row twice in my table. Is there an equivalent way to do this in PyTables?
- How does PyTables compare with relational databases such as Firebird, SQLite, ... in terms of performance?
- Are my questions relevant, or would you instead advise me to stick with a relational database?

Thanks a lot for your answers.

Bernard Kaplan
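
To make the primary-key and sorting questions concrete, here is a minimal sketch of the kind of setup described above. It is only an illustration under stated assumptions: Python's built-in sqlite3 stands in for Firebird/kinterbasdb, and the table name, column names, and the `add_entry` helper are hypothetical, not taken from the actual database.

```python
import sqlite3

# Assumption: SQLite stands in for Firebird, and the schema below is
# hypothetical. Only the general information lives in the database;
# the raw matrix stays in a separate file referenced by path.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE entries (
        entry_id  TEXT PRIMARY KEY,   -- uniqueness enforced by the database
        measured  TEXT NOT NULL,      -- ISO date of the measurement
        machine   TEXT NOT NULL,
        data_file TEXT NOT NULL       -- path to the raw matrix on disk
    )
    """
)


def add_entry(entry_id, measured, machine, data_file):
    """Insert one entry; return False if the primary key already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO entries VALUES (?, ?, ?, ?)",
                (entry_id, measured, machine, data_file),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = add_entry("A-0001", "2004-03-01", "M2", "raw/A-0001.dat")
second = add_entry("A-0002", "2004-02-27", "M1", "raw/A-0002.dat")
duplicate = add_entry("A-0001", "2004-03-01", "M2", "raw/A-0001.dat")

# Sorting along a column is delegated to the database engine.
by_date = [row[0] for row in conn.execute(
    "SELECT entry_id FROM entries ORDER BY measured")]
```

In this sketch the second attempt to insert `A-0001` is rejected by the primary-key constraint, and `ORDER BY` handles the sort; these are the two behaviours the questions above ask about for PyTables.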