Perl Data Language / Feature Requests / #90 Implement partition algorithm and rewrite median and pct to use it

#90 Implement partition algorithm and rewrite median and pct to use it

Milestone: feature_request

Status: open

Owner: nobody

Labels: None

Priority: 5

Updated: 2016-10-07

Created: 2016-10-06

Creator: Diab Jerius

Private: No

Currently percentile (including median) operators completely sort the
data. This is not necessary for these operations. Instead, the data
need only be partitioned:

Partitioning: reorder the array so that all elements with values less
than the pivot come before the pivot, while all elements with values
greater than the pivot come after it (equal values can go either
way).

(From the wikipedia quicksort article.)

Partitioning (the basis for quicksort, which performs it recursively) is much faster, and can be done incrementally. This can speed up generating percentiles, generating partitions optimized for the percentiles requested (e.g. using the same approach to dividing the data as used by the quicksort or binary search algorithms).

Discussion

Diab Jerius - 2016-10-06

I should note that this is a bit more complicated than simply using partitions, as the pivot value needs to be passed to the partition routine. However, it should be possible to come up with an iterative approach with a worst case performance equivalent to quicksort's.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Diab Jerius - 2016-10-06

I need improve my web search skills. Quickselect is what I was thinking of.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Chris Marshall - 2016-10-07

Ticket moved from /p/pdl/bugs/428/

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Chris Marshall - 2016-10-07

Patches and code contributions are always welcome. Drop-in but more performant functions are particularly easy to verify correctness of...

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Implement partition algorithm and rewrite median and pct to use it

Group

Searches

Help

#90 Implement partition algorithm and rewrite median and pct to use it

Discussion