From: <bm...@ca...> - 2007-08-30 08:47:26
|
Quoting Stéphane Charette <ste...@gm...>:

>>
>> Filters are easy. In this case, you give a list of media handles to the
>> filter and see if it is present in the backlinks, something like
>>
>> for backlink in database.find_backlink_handles(media_handle):
>>     # count number of backlinks
>
> Hmm... :)  Slightly more complicated, but thanks for the hint.
>
> If anyone wants to try it out, this is now committed to svn on the
> trunk/3.0 branch.  For example:
>
> Events tab -> Edit -> Event Filter Editor -> Add a new filter -> Add
> another rule -> General Filters -> Events with a reference count ->
> ...
>
> Stéphane

Stephane,

I think you can improve the performance of your filter if you break out of
the for loop as soon as the result is known to be false, instead of running
over all backlinks for every handle.

So e.g.:

    for item in db.find_backlink_handles(handle):
        count += 1
        if count > value:
            return False

would greatly improve performance, especially if value=0.

Also, if you put

    value = int(self.list[1])

in the prepare method, as

    def prepare(...):
        self.value = int(self.list[1])

you avoid doing the str to int conversion for every handle that is passed.
In the same way, you can store self.list[0] as an integer in prepare,
avoiding the need to compare against a translated _('lesser than'). This
removes the gettext call on every handle and makes the equality check cheaper.

For filters, all these performance issues are important, especially with
things like FilterProxyDb and people complaining about it being slower ;-)
As few unnecessary computations as possible should be done in the apply
method.

Benny

(Changed to devel list as users have no interest in this mail)

----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.
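The two suggestions in this message (stop counting as soon as the answer is known, and convert the argument once in prepare) can be sketched as follows. This is only an illustration under stated assumptions: FakeDb, AtMostNReferences and the single-element argument list are hypothetical names invented for the example, and the method signatures are not taken from the Gramps source.

```python
class FakeDb:
    """Hypothetical stand-in for the database; not the Gramps API."""

    def __init__(self, backlinks):
        # Maps an object handle to the handles that refer to it.
        self.backlinks = backlinks

    def find_backlink_handles(self, handle):
        # Yield lazily so that a caller is able to stop early.
        for ref in self.backlinks.get(handle, []):
            yield ref


class AtMostNReferences:
    """Hypothetical rule: matches handles with at most `value` backlinks."""

    def __init__(self, arg_list):
        self.list = arg_list  # rule arguments arrive as strings

    def prepare(self, db):
        # Convert once here, not once per handle in apply().
        self.value = int(self.list[0])

    def apply(self, db, handle):
        count = 0
        for _item in db.find_backlink_handles(handle):
            count += 1
            if count > self.value:
                return False  # too many references, stop counting
        return True


db = FakeDb({"E1": ["P1", "P2", "P3"], "E2": ["P1"], "E3": []})
rule = AtMostNReferences(["1"])
rule.prepare(db)
matches = [h for h in ("E1", "E2", "E3") if rule.apply(db, h)]
print(matches)  # ['E2', 'E3']
```

With value set to 0 the loop returns on the first backlink it sees, which is the case singled out above as benefiting most.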
From: <ste...@gm...> - 2007-08-30 16:09:29
|
> Stephane,
>
> i think you can improve the performance of your filter if you break the
> for loop
> if not ok, instead of for all handles running over all backlinks.
>
> So eg :
>
> for item in db.find_backlink_handles(handle):
>     count += 1
>     if count > value :
>         return False

But what if the filter is to see if count > value -- then I'd have to
return True.  This means putting the return logic code inside the
"for" loop.

Are you certain putting the logic return code inside the for loop and
executing it for every handle would really be faster?

> would greatly improve performance, especially if value=0.
> Also, if you put
>
> value = int(self.list[1])
>
> in the prepare method, as
> def prepare(...)
>     self.value = int(self.list[1])
>
> you avoid doing the str to int conversion for every handle that is passed.
> In the same way, you can store the self.list[0] as an integer in the
> prepare,
> avoiding the need to check with a translated _('lesser than')
> This removes the gettext call on every handle, and improves the equality
> checking.

Thanks!  I didn't know about prepare().  Now I see other examples of it.
I'll take care of it today.

Stephane
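One way to read the question about "count > value" is that a single in-loop check can serve all three comparisons: once the count exceeds the value, "greater than" is settled as true and the other two as false. The sketch below assumes hypothetical comparison codes "lt", "eq" and "gt" and a helper named reference_count_matches; neither comes from the thread. The extra work per iteration is one integer comparison.

```python
def reference_count_matches(backlinks, comparison, value):
    """Compare the number of items in `backlinks` with `value`, stopping early.

    `comparison` is one of the hypothetical codes "lt", "eq" or "gt".
    """
    count = 0
    for _ in backlinks:
        count += 1
        if count > value:
            # The outcome is fixed from here on: only "gt" can match.
            return comparison == "gt"
    # The iterable is exhausted, so count <= value.
    if comparison == "lt":
        return count < value
    if comparison == "eq":
        return count == value
    return False  # "gt" never exceeded value


def tracked(items, log):
    """Yield items while recording how many were actually consumed."""
    for item in items:
        log.append(item)
        yield item


consumed = []
result = reference_count_matches(
    tracked(["P1", "P2", "P3", "P4"], consumed), "gt", 1
)
print(result, len(consumed))  # True 2
```

Only two of the four backlinks are read before the function returns. Whether this saves real time depends on whether the backlinks are produced lazily, which the thread leaves open.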
From: <bm...@ca...> - 2007-08-30 22:05:54
|
Quoting Stéphane Charette <ste...@gm...>:

>> Stephane,
>>
>> i think you can improve the performance of your filter if you break the
>> for loop
>> if not ok, instead of for all handles running over all backlinks.
>>
>> So eg :
>>
>> for item in db.find_backlink_handles(handle):
>>     count += 1
>>     if count > value :
>>         return False
>
> But what if the filter is to see if count > value -- then I'd have to
> return True.  This means putting the return logic code inside the
> "for" loop.
>
> Are you certain putting the logic return code inside the for loop and
> executing it for every handle would really be faster?

Well, it depends on the database structure and how the pages are hit to look
up backreferences. No idea there. If you have
db.find_backlink_handles(handle), probably everything is looked up before you
even start, so

    count = len(db.find_backlink_handles(handle))

is just as good.

If there is a db method for obtaining only the first backlink handle, then
yes, it will be faster, but it needs a different way of accessing the data,
namely with iterators. I don't know if that is implemented.

Note that if you worry about the check in the for loop to see if equal, you
could write three loops in three different functions, and in prepare set the
function that must run. You have the best of both worlds, so

    def prepare():
        if 'equal':
            self.func = self._equal_func
        else ...

and then in apply:

    return self.func()

However, as I said, db.find_backlink_handles(handle) has probably hit the
database already, so there is no use optimizing, as that is the slow part
(hitting your hard disk).

I just wrote my first idea down when I saw your code. I just want to make it
very clear that with filters, thinking hard about performance pays, as you
know beforehand that all handles in the database will be checked, so the
faster the check, the better. Checks that themselves need a lot of db access
or calculation will be slow.

Benny
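The idea of choosing the comparison function once in prepare and calling it from apply could look roughly like this. It is a sketch under assumptions: FakeDb and ReferenceCountRule are hypothetical, the keys "equal to" and "greater than" are invented for the example (only 'lesser than' appears in the thread), and the count is passed to the selected function as an argument, which differs slightly from the bare self.func() written above.

```python
class FakeDb:
    """Hypothetical stand-in for the database; not the Gramps API."""

    def __init__(self, backlinks):
        self.backlinks = backlinks  # handle -> list of referring handles

    def find_backlink_handles(self, handle):
        return iter(self.backlinks.get(handle, []))


class ReferenceCountRule:
    """Sketch of a rule that picks its comparison once, in prepare()."""

    def __init__(self, arg_list):
        self.list = arg_list  # [comparison, value], both strings

    def prepare(self, db):
        self.value = int(self.list[1])
        # Illustrative keys; a real rule would resolve its own
        # (possibly translated) labels here, once.
        checks = {
            "lesser than": self._lesser_func,
            "equal to": self._equal_func,
            "greater than": self._greater_func,
        }
        self.func = checks[self.list[0]]

    def _lesser_func(self, count):
        return count < self.value

    def _equal_func(self, count):
        return count == self.value

    def _greater_func(self, count):
        return count > self.value

    def apply(self, db, handle):
        # list() makes len() work whether the db returns a list or an iterator.
        count = len(list(db.find_backlink_handles(handle)))
        return self.func(count)


db = FakeDb({"E1": ["P1", "P2", "P3"], "E2": ["P1"], "E3": []})
rule = ReferenceCountRule(["greater than", "0"])
rule.prepare(db)
result = [h for h in ("E1", "E2", "E3") if rule.apply(db, h)]
print(result)  # ['E1', 'E2']
```

apply then contains no string comparison and no branching on the comparison type; all of that happens once per filter run in prepare.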