[Sqlalchemy-tickets] Issue #3176: faster KeyedTuple (zzzeek/sqlalchemy)
From: Mike B. <iss...@bi...> - 2014-08-28 15:53:21
New issue 3176: faster KeyedTuple
https://bitbucket.org/zzzeek/sqlalchemy/issue/3176/faster-keyedtuple

Mike Bayer:

collections.namedtuple() is much, much faster at creating tuple instances, and KeyedTuple is actually slower than a plain object. But namedtuple() is very slow at creating new types, because the code that does so uses eval(), stack frame inspection, and other overbuilt things that slow us down too much on a per-query basis.

What to do? We can hedge between both approaches. Here is a speed breakdown:

```
#!python
import collections
import timeit
from sqlalchemy.util import KeyedTuple, lightweight_named_tuple


def go1(size):
    # one new namedtuple type per call, then `size` instances of it
    nt = collections.namedtuple('a', ['x', 'y', 'z'])

    result = [
        nt(1, 2, 3) for i in range(size)
    ]


def go2(size):
    # no new type; the labels are passed to every KeyedTuple instance
    labels = ['x', 'y', 'z']

    result = [
        KeyedTuple([1, 2, 3], labels) for i in range(size)
    ]


def go3(size):
    # one new lightweight type per call, then `size` instances of it
    nt = lightweight_named_tuple('a', ['x', 'y', 'z'])

    result = [
        nt([1, 2, 3]) for i in range(size)
    ]


# (instances per call, number of calls)
for size, num in [
    (10, 10000),
    (100, 1000),
    (10000, 100),
    (1000000, 10),
]:
    print "-----------------"
    print "size=%d num=%d" % (size, num)
    print "namedtuple:", timeit.timeit("go1(%s)" % size, "from __main__ import go1", number=num)
    print "keyedtuple:", timeit.timeit("go2(%s)" % size, "from __main__ import go2", number=num)
    print "lw keyed tuple:", timeit.timeit("go3(%s)" % size, "from __main__ import go3", number=num)
```

output:

```
#!
-----------------
size=10 num=10000
namedtuple: 3.60116696358
keyedtuple: 0.257042884827
lw keyed tuple: 0.571335792542
-----------------
size=100 num=1000
namedtuple: 0.362484931946
keyedtuple: 0.24974322319
lw keyed tuple: 0.0887930393219
-----------------
size=10000 num=100
namedtuple: 0.562417030334
keyedtuple: 2.53507685661
lw keyed tuple: 0.607440948486
-----------------
size=1000000 num=10
namedtuple: 5.84964299202
keyedtuple: 28.8070271015
lw keyed tuple: 6.69921588898
```

We can see that namedtuple is very slow for lots of distinct types.
But keyedtuple is *really* slow for a lot of instances of a small number of types. The new lw_tuple is almost as fast as namedtuple on instance creation and just a bit slower than keyedtuple on making new types. It is definitely the best option of the three.
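
To make the hedge concrete, here is a minimal sketch of how a lightweight named tuple type could be built without eval() or stack frame inspection. It is an illustration only and not necessarily how `lightweight_named_tuple` is implemented in SQLAlchemy; the helper name `make_light_tuple_type` is hypothetical. The idea is to call `type()` directly on a tuple subclass and attach one `property` per field, so type creation stays cheap and instances remain plain tuples.

```python
import operator


def make_light_tuple_type(name, fields):
    """Hypothetical helper: build a tuple subclass with named accessors.

    Uses type() directly instead of generating and eval()-ing source code.
    """
    namespace = {
        "__slots__": (),            # no per-instance __dict__
        "_fields": tuple(fields),   # keep the field names for introspection
    }
    for index, field in enumerate(fields):
        # each attribute simply reads the tuple element at its position
        namespace[field] = property(operator.itemgetter(index))
    return type(name, (tuple,), namespace)


# Build the type once, then create many instances from plain sequences.
Row = make_light_tuple_type("Row", ["x", "y", "z"])
row = Row([1, 2, 3])
assert row.x == 1 and row.z == 3
assert row[1] == 2
```

Because instance creation goes straight through the built-in tuple constructor, this sketch should behave like namedtuple for large numbers of instances, while the per-type cost is only a dictionary and a handful of property objects.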