From: Jeroen De D. <jer...@gm...> - 2011-11-09 05:45:44
Hey,

> Can it also be used with regular formats like csv or table (resulting in the value for the distribution to be displayed?)

Each format needs to add support for this functionality, so at the moment you cannot use value distribution with these formats. However, it's relatively easy to add this support. I'm not sure that adding it to all formats makes much sense though. For a lot of them the usefulness seems limited, and it's a bunch of work to do. If it were wanted for all formats, it might be worth reworking the query result class a little so it's possible to modify query results (in a sane manner) before they get passed to the actual result printer, which would also allow for other kinds of post-query processing.

> One way to do this would be to automatically give each property some special properties, such that the property itself could be queried for its set of unique values, and the number of times each value has been used.

Interesting, I had not thought of this. Implementing this would be completely different from what I did though, and it'd be, as you say, more powerful. If any system to handle this is created, it could probably easily be made more generic and support all kinds of computations, not just the occurrence count of the values of a property. It might even go hand in hand with query management functionality (which allows for automatic invalidation of query caches when their source data is modified). This will not be trivial to implement, and is out of scope of what I want to do here. If such functionality is created, it might make the value distribution feature a bit obsolete, but I don't see this happening soon (unless someone throws money or devs at it).

I'm curious about your ideas on this though, and have some questions:

* Where/when would this property metadata be computed? Doing so on every change of any occurrence of the property might be quite expensive.
* Where would you define how to compute this metadata?
  If possible, it'd be neat to have control over this in the wiki itself.

> although I suppose the most general solution of all would be to implement aggregation queries.
> ..
> I guess GROUP BY and COUNT() functionality are the bits that would jeopardize sanity? :)

I actually discussed this at length with Yaron, and we concluded that generic group by functionality would not be terribly useful, since it's hard to imagine cases where you would not just want to count the occurrences. My current implementation is, I think, pretty much equivalent to doing a group by with a count (not sure, as I'm not that familiar with the SQL GROUP BY statement).

> For the discussion: What about something like this:
> {{#ask: [[Category:Locations]] [[Has location type::City]]
> | ?Located in
> |?count(*)
> | group by=Located in
> | format=jqplotpie
> | mainlabel=-
> | limit=500
> }}

What would the advantage of this syntax be? I suspect it's less clear to most users, and it's definitely harder to implement, since you'd need to recognize ?count(*) as a special printout.

Cheers

--
Jeroen De Dauw
http://www.bn2vs.com
Don't panic. Don't be evil.
--
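
For reference, a minimal Python sketch of the "group by with a count" equivalence mentioned above. This is not the actual implementation: the rows, the property values and the function name value_distribution are all made up for illustration, and it only assumes that a query result can be reduced to (page, property value) pairs.

```python
from collections import Counter

# Hypothetical query result: (page, value of "Located in") pairs.
rows = [
    ("Ghent", "Belgium"),
    ("Antwerp", "Belgium"),
    ("Utrecht", "Netherlands"),
    ("Berlin", "Germany"),
    ("Bruges", "Belgium"),
]


def value_distribution(rows):
    """Count how often each property value occurs.

    Roughly what SQL would express as:
    SELECT value, COUNT(*) FROM rows GROUP BY value
    """
    return Counter(value for _page, value in rows)


# Maps each distinct value to its number of occurrences.
distribution = value_distribution(rows)
```

A chart format would then plot the counts per value, while the pages themselves are dropped from the output.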