Very slow super aggregation
jAgg - Java Aggregation Operations
Hello,
I tried to aggregate a CSV file of about 700,000 lines and was very surprised that it took more than 30 minutes. Looking at the code, it seems that the FunctionCache grows as fast as the lines are grouped, filling up with functions that are already in use. As a consequence, every time we need a function, we iterate over many used functions in the cache. Removing the use of the cache reduced the aggregation time from 30 minutes to 4 seconds.
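To illustrate the suspected behavior, here is a minimal sketch. It is not jAgg's actual code: the class, method, and field names (`FunctionCacheSketch`, `Cache`, `PooledFunction`, `acquire`, `release`) are hypothetical, and it assumes the cache keeps one list of function instances per specification and scans that list for a free instance on every request.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FunctionCacheSketch {
    /** Hypothetical stand-in for an aggregate function instance. */
    static final class PooledFunction {
        boolean inUse;
    }

    /** Hypothetical cache: one list of instances per function specification. */
    static final class Cache {
        private final Map<String, List<PooledFunction>> pool = new HashMap<>();
        long inspected; // total number of cache entries examined by all lookups

        PooledFunction acquire(String spec) {
            List<PooledFunction> candidates =
                    pool.computeIfAbsent(spec, k -> new ArrayList<>());
            // Linear scan for an instance that is not currently in use.
            for (PooledFunction f : candidates) {
                inspected++;
                if (!f.inUse) {
                    f.inUse = true;
                    return f;
                }
            }
            // Nothing free: create a new instance, which makes the list longer.
            PooledFunction fresh = new PooledFunction();
            fresh.inUse = true;
            candidates.add(fresh);
            return fresh;
        }

        void release(PooledFunction f) {
            f.inUse = false;
        }
    }

    /** Simulates one function request per group and returns the scan cost. */
    static long entriesInspected(int groups, boolean releaseAfterUse) {
        Cache cache = new Cache();
        for (int g = 0; g < groups; g++) {
            PooledFunction f = cache.acquire("Sum(value)");
            if (releaseAfterUse) {
                cache.release(f);
            }
        }
        return cache.inspected;
    }

    public static void main(String[] args) {
        int groups = 10_000;
        System.out.println("never released: " + entriesInspected(groups, false));
        System.out.println("released:       " + entriesInspected(groups, true));
    }
}
```

Under these assumptions, when functions are never handed back, the request for group n scans the n - 1 instances already in use, so the total work grows quadratically with the number of groups. When each function is released after use, every request after the first inspects a single entry.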
Thanks
--
Francois
Francois,
Please supply your code that calls jAgg, along with the code for the object that stores the data from the CSV, and a sample of your data.
With that information, I would be able to analyze why your super-aggregation was so slow.
Thanks,
Randy Gettman