From: Yury K. <kat...@gm...> - 2012-11-22 11:23:00
The premises are clear: (1) the current implementation of the max parser function is slow, and (2) there is a workaround that makes max queries quicker. The conclusion is less clear: "let's drop the max ASAP". It's not that hard to replace the current implementation of the MAX format with the faster one and preserve backward compatibility.

-----
Yury Katkov, WikiVote

On Thu, Nov 22, 2012 at 1:23 PM, Markus Krötzsch <ma...@se...> wrote:
> Hi,
>
> I would like to ask about this:
>
> http://semantic-mediawiki.org/wiki/Help:Max_format
>
> I am afraid to say that this idea seems to be fundamentally broken. The
> above page seriously suggests to find the largest population number in
> the wiki by querying for a list of *all cities with and without
> population* and invoke PHP code that scans through this list to find the
> maximum (this is what format=max does, AFAIK). The query to do this is:
>
> {{#ask: [[Category:City]]
> | ?Population
> | format=max
> }}
>
> This is an extremely slow method of producing wrong results (the results
> will be wrong as soon as there are enough pages in the wiki so that the
> one with the maximum value is after the default query limit when
> ordering results alphabetically).
>
> What one would do instead is to ask for the one result that has the
> largest value right away, like this:
>
> {{#ask: [[Category:City]]
> | ?Population
> | sort=population
> | order=DESC
> | limit=1
> | format=max
> }}
>
> The max format in this case is obsolete, since one could also just do
>
> {{#ask: [[Category:City]]
> | ?Population=
> | mainlabel=-
> | sort=population
> | order=DESC
> | limit=1
> }}
>
> This has the big advantage that one can also use further output
> formatting on the resulting number, e.g., to get it in a plain format
> without any beautification.
>
>
> I just noted these problems since there seem to be cases where PHP runs
> out of time/memory due to users following the above query anti-pattern
> [1].
> My conclusion would be: let's drop max/min as soon as possible and
> change the documentation to give the efficient query pattern I gave above.
>
> Markus
>
> [1] https://bugzilla.wikimedia.org/show_bug.cgi?id=42347
>
> _______________________________________________
> Semediawiki-devel mailing list
> Sem...@li...
> https://lists.sourceforge.net/lists/listinfo/semediawiki-devel
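
For illustration, the failure mode described in the quoted message can be reproduced with a small Python sketch. This is not SMW code: the city data, the `DEFAULT_LIMIT` value of 3, and the function names `max_via_scan` and `max_via_sort` are all made up for the example. The first function mimics scanning the first `limit` results in alphabetical order; the second mimics the sort/order=DESC/limit=1 pattern, under the simplifying assumption that pages without a value are skipped.

```python
from typing import Optional

# Hypothetical data: page title -> population (None means no value set).
CITIES: dict[str, Optional[int]] = {
    "Aachen": 250_000,
    "Berlin": 3_500_000,
    "Cologne": 1_000_000,
    "Tokyo": 13_000_000,
    "Zurich": None,
}

DEFAULT_LIMIT = 3  # assumed value, for illustration only


def max_via_scan(cities: dict[str, Optional[int]],
                 limit: int = DEFAULT_LIMIT) -> Optional[int]:
    """Mimic the slow pattern: take the first `limit` pages in
    alphabetical order, then scan that page of results for the maximum."""
    page = sorted(cities.items())[:limit]
    values = [value for _, value in page if value is not None]
    return max(values) if values else None


def max_via_sort(cities: dict[str, Optional[int]]) -> Optional[int]:
    """Mimic the efficient pattern: order by value, descending, and
    keep only the top row. Pages without a value are skipped here."""
    with_values = [(value, title) for title, value in cities.items()
                   if value is not None]
    if not with_values:
        return None
    return sorted(with_values, reverse=True)[0][0]


print(max_via_scan(CITIES))  # 3500000: Tokyo falls outside the limit
print(max_via_sort(CITIES))  # 13000000
```

With these made-up numbers the scan reports Berlin's population because Tokyo sorts alphabetically after the cut-off, while the sorted query returns the true maximum regardless of how many pages exist.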