Menu

#90 Implement partition algorithm and rewrite median and pct to use it

feature_request
open
nobody
None
5
2016-10-07
2016-10-06
Diab Jerius
No

Currently percentile (including median) operators completely sort the
data. This is not necessary for these operations. Instead, the data
need only be partitioned:

Partitioning: reorder the array so that all elements with values less
than the pivot come before the pivot, while all elements with values
greater than the pivot come after it (equal values can go either
way).

(From the wikipedia quicksort article.)

Partitioning (the basis for quicksort, which performs it recursively) is much faster, and can be done incrementally. This can speed up generating percentiles, generating partitions optimized for the percentiles requested (e.g. using the same approach to dividing the data as used by the quicksort or binary search algorithms).

Discussion

  • Diab Jerius

    Diab Jerius - 2016-10-06

    I should note that this is a bit more complicated than simply using partitions, as the pivot value needs to be passed to the partition routine. However, it should be possible to come up with an iterative approach with a worst case performance equivalent to quicksort's.

     
  • Diab Jerius

    Diab Jerius - 2016-10-06

    I need improve my web search skills. Quickselect is what I was thinking of.

     
  • Chris Marshall

    Chris Marshall - 2016-10-07

    Ticket moved from /p/pdl/bugs/428/

     
  • Chris Marshall

    Chris Marshall - 2016-10-07

    Patches and code contributions are always welcome. Drop-in but more performant functions are particularly easy to verify correctness of...

     

Log in to post a comment.