I'm having a really big problem. When I send a query to my database using MySQLdb, the result takes up about 15 times more RAM than the data it holds, and I have no idea what I could be doing wrong. The data I query is about 8MB in the DB, yet RAM usage rises to 140MB just for this script. In this case 140MB is not such a big problem, but I need to use all the data in the table, which would mean roughly 150MB of data * 15 = 2250MB; in fact it seems to rise exponentially, so 8GB of RAM is not enough for all the data.
Here is what I'm doing:
self.conn = MySQLdb.connect(...)
self.c = self.conn.cursor(MySQLdb.cursors.DictCursor)
query = 'SELECT * from table LIMIT 50000'
self.c.execute(query) # when I execute this line, the RAM use rises to 140MB
self.data = self.c.fetchall()
self.c.close()
self.conn.close()
I also tried del self.c (self is the DB object) afterwards to see if it helps, but it didn't change anything. Does anyone have an idea what is wrong here and what I can do to solve my problem? When I run about 100 queries one after another instead of one big query, it seems to need even more RAM. I would appreciate any attempt to help.
Thanks in advance. Regards, Nico
I'd recommend that you start by not using DictCursor. Python objects come with some overhead (a Python string is larger than its underlying char*), so you can expect some difference in size. However, creating a dict per row for a very large data set is going to be expensive.
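Just to give a feel for that per-row overhead, here is a tiny self-contained sketch (the row contents are made up):
import sys
# One row as a tuple (what a plain cursor returns) vs. the same row
# as a dict (what DictCursor builds for every single row).
row_tuple = (42, 'some name', 3.14)
row_dict = {'id': 42, 'name': 'some name', 'value': 3.14}
# getsizeof counts only the container itself, not the values inside,
# so this understates the total -- but the dict is already a few times
# larger than the tuple, and you pay that price for every row.
print(sys.getsizeof(row_tuple))
print(sys.getsizeof(row_dict))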
In high-performance MySQL work, most people consider SELECT * harmful. You should name the columns you want and access them positionally (using a cursor that returns tuples). You may also want to check out SSCursor (which doesn't pull the full result set over to the client) and go a row or a group of rows at a time (using fetchone or fetchmany).
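For example, a rough sketch of what that could look like (the connection details and column names are placeholders, not from your schema):
import MySQLdb
import MySQLdb.cursors
# Placeholder connection details -- substitute your own.
conn = MySQLdb.connect(host='localhost', user='me', passwd='secret', db='mydb')
# SSCursor streams results from the server instead of buffering the whole
# result set on the client, and returns plain tuples per row.
c = conn.cursor(MySQLdb.cursors.SSCursor)
# Name the columns you need instead of SELECT *.
c.execute('SELECT col_a, col_b, col_c FROM table LIMIT 50000')
while True:
    rows = c.fetchmany(1000)              # pull the result a chunk at a time
    if not rows:
        break
    for col_a, col_b, col_c in rows:      # positional access, no dict per row
        pass                              # ... process one row here ...
c.close()
conn.close()
That way only one chunk of rows is in client memory at any time, instead of the whole 50000-row result plus a dict for each row.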