The documentation for vsearch indicates it returns the index of the "least larger member". This doesn't seem to be standard nomenclature (cf. http://en.wikipedia.org/wiki/Least-upper-bound_property).
I can easily parse its meaning as "the smallest element of the set which is larger than the searched-for element". That's not what vsearch is returning, however. It returns the smallest element of the set which is larger than or equal to the searched-for element.
pdl> $xs = sequence(10); say $xs->index( vsearch( pdl( 4.5, 8), $xs ) )
[5 8]
Note the "8", rather than "9".
So, while "least larger member" is very similar to "least upper bound", it isn't using the exact magic words, so I suggest clarifying the documentation to say:
Returns for each value of $vals the index of the smallest member of $xs which is greater than or equal to it. $xs must be in increasing order.
While the term "least upper bound" may be well defined mathematically, it's not obvious from its words that the comparison includes equality. I think being explicit about what is happening would be much clearer, an approach backed by my random sample of one colleague.
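For readers without PDL at hand, the difference between the two readings can be sketched in plain Python. This is only an analogue built on the standard `bisect` module, not PDL code; the function names `least_member_at_or_above` and `least_member_strictly_above` are made up for this illustration.

```python
from bisect import bisect_left, bisect_right

def least_member_at_or_above(vals, xs):
    """Index of the smallest element of sorted xs that is >= each value."""
    return [bisect_left(xs, v) for v in vals]

def least_member_strictly_above(vals, xs):
    """Index of the smallest element of sorted xs that is > each value."""
    return [bisect_right(xs, v) for v in vals]

xs = list(range(10))   # same data as sequence(10) in the report
vals = [4.5, 8]

# "Greater than or equal" reading: matches the [5 8] shown in the report.
print([xs[i] for i in least_member_at_or_above(vals, xs)])      # [5, 8]

# "Strictly greater" reading: what "least larger member" suggests.
print([xs[i] for i in least_member_strictly_above(vals, xs)])   # [5, 9]
```

Both helpers return `len(xs)` for a value beyond the last element, so indexing back into `xs` as done here is only safe for values inside the range.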
Thanks!
Diab
Chris Marshall
2014-12-20
Nice to fix for PDL-2.008, upping the priority as a reminder.
Chris Marshall
2015-02-22
The current pdldoc vsearch returns:

Module PDL::Primitive
  vsearch
    Signature: ( vals(); xs(n); [o] indx(); [\%options] )

    Efficiently search for values in a sorted piddle.

      $idx = vsearch( $vals, $x, [\%options] );
      vsearch( $vals, $x, $idx, [\%options ] );

    vsearch performs a binary search for the values from $vals piddle in the ordered piddle $x, returning indices into $x. It is a front end to a set of routines which differ in how matches are determined and the meaning of the returned indices. The "mode" option indicates which method of searching to use, and may be one of:

    "sample"
        invoke vsearch_sample, returning indices appropriate for sampling within a distribution.

    "insert_leftmost"
        invoke vsearch_insert_leftmost, returning the left-most possible insertion point.

    "insert_rightmost"
        invoke vsearch_insert_rightmost, returning the right-most possible insertion point.

    "insert_match"
        invoke vsearch_match, returning the index of a matching element, else -(insertion point + 1)

    "insert_bin_inclusive"
        invoke vsearch_bin_inclusive, returning an index appropriate for binning on a grid where the left bin edges are *inclusive* of the bin.

    "insert_bin_exclusive"
        invoke vsearch_bin_exclusive, returning an index appropriate for binning on a grid where the left bin edges are *exclusive* of the bin.

    The default value of "mode" is "sample".
which I do not understand. Maybe this could be made more clear pre-2.008?
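As a rough aid to reading the mode list, here is how "left-most insertion point", "right-most insertion point", and "index of a match, else -(insertion point + 1)" behave in the usual binary-search sense, sketched with Python's standard `bisect` module. This is not PDL code. The helper `match_or_encoded_insertion` is hypothetical, and its choice of the left-most match among duplicates is an assumption; whether the vsearch_* routines handle duplicates and out-of-range values the same way has not been checked here.

```python
from bisect import bisect_left, bisect_right

xs = [1, 2, 2, 2, 5]   # sorted, with duplicates

# Left-most insertion point: the new value would go before any equal elements.
print(bisect_left(xs, 2))    # 1

# Right-most insertion point: the new value would go after any equal elements.
print(bisect_right(xs, 2))   # 4

def match_or_encoded_insertion(xs, v):
    """Index of a matching element, else -(insertion point + 1).

    Assumption: among duplicates the left-most match is reported.
    """
    i = bisect_left(xs, v)
    if i < len(xs) and xs[i] == v:
        return i
    return -(i + 1)

print(match_or_encoded_insertion(xs, 2))   # 1  (found)
print(match_or_encoded_insertion(xs, 3))   # -5 (not found; insertion point is 4)
```

The negative encoding lets a caller distinguish "found at index 0" from "not found, would insert at 0", which is returned as -1.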
mohawk
2015-02-23
I have had a go at explaining a bit more, in https://sourceforge.net/p/pdl/code/merge-requests/30/
In order to better understand the workings of vsearch, one should read a little further on for the docs of vsearch_bin_(in,ex)clusive. It made sense to me. If that isn't considered enough, Diab will need to liaise with Chris to clarify what is still unclear.
Diab Jerius
2015-02-23
Sorry for the tardy reply. This bug report was submitted before I reworked the vsearch code and docs, so the current context is somewhat muddled. I'm happy with the current docs (as I wrote them).
Chris, could you be more explicit about what's not clear about the vsearch docs? I used language which seemed consistent with what I found in the documentation for other implementations of vsearch (primarily the Java one).
Chris Marshall
2015-02-23
If you do 'pdldoc vsearch' you get a description that is not particularly useful, since all of the real explanation is in the vsearch_xxx routines and not in the vsearch POD.
What would help:
- put links in the POD for the specific vsearch_xxx routines
- add a =for example to show what happens
  (this should make it clear what the 'sample' mode is)
- put 'see also' in the mode-specific routines as well
--Chris
Zakariyya Mughal
2015-02-25
Should probably be set to pending-fixed or fixed now that https://sourceforge.net/p/pdl/code/merge-requests/30/ is merged.
Chris Marshall
2015-02-25
MR30 does not include the additional clarifications I mentioned. 'pdldoc vsearch' is not self-explanatory to me. The additions proposed would make it so.
Zakariyya Mughal
2015-03-03
How is this for a start to the example?
https://sourceforge.net/p/pdl/code/ci/doc/vsearch-example/~/tree/
Zakariyya Mughal
2015-03-06
On 2015-03-05 at 22:39:59 +0000, Chris Marshall wrote:
- status: open --> closed
- assigned_to: Diab Jerius
- Comment:
Looking at the docs again, I think the clarity on vsearch() really needs a tutorial rather than just a few line example. Thanks for the improved docs and implementation. Fixed in git and should appear in PDL-2.007_12 and later.
I can write up a tutorial as well and I'll add it on to the
doc/vsearch-example branch https://github.com/PDLPorters/pdl/pull/58.
I'll update this issue when I do.
Chris Marshall
2015-03-05
Looking at the docs again, I think the clarity on vsearch() really needs a tutorial rather than just a few line example. Thanks for the improved docs and implementation. Fixed in git and should appear in PDL-2.007_12 and later.