#121 Allow substring() to take only two parameters

Tony Balinski

This patch allows the macro programmer to omit the last
parameter of the three-parameter built-in function
substring(). Normally substring() is called as
substring(string, start, end)

This leads to heavy duty calls such as
substring(string, start, length(string))
just to retrieve the end of a string. It would be easier
just to say
substring(string, start)
and make the function itself provide the end position.
This is the first thing this patch does. Note that this
change in itself has no adverse effects on existing macro
code since, until now, all three parameters were required.

The patch does change some behaviour. The changes
seem sensible and useful to me, as well as better than
the current behaviour. This concerns the actions of
substring() when the given end marker is negative or
less than the start position.

Former behaviour: if end is negative, it is set to zero; if
it is less than start, the values of start and end are
swapped. This means that the following are true:
substring("hello world", 5, -1) == "hello"
substring("hello world", 5, 1) == "ello"
This behaviour seems very counterintuitive. The
result "should" be "".
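
To make the former behaviour concrete, here is a small Python model of the rules just described. It is only a sketch of the semantics, not NEdit's C source; the name substring_old is made up for illustration, and clamping of out-of-range positions (other than a negative end) is an assumption.

```python
def substring_old(string, start, end):
    """Model of the pre-patch rules described above (not NEdit's code)."""
    n = len(string)
    # Assumption: positions outside the string are clamped to its bounds.
    # The description only states this for a negative end (set to zero).
    start = max(0, min(start, n))
    end = max(0, min(end, n))
    # If end is less than start, the two values are swapped.
    if end < start:
        start, end = end, start
    return string[start:end]


print(repr(substring_old("hello world", 5, -1)))
print(repr(substring_old("hello world", 5, 1)))
```

With this model, the two calls give "hello" and "ello", matching the examples above.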

New behaviour: if end is negative or smaller than start,
the empty string is returned. If it is not given, the end
position used is the end of the string. So the following are true:
substring("hello world", 5, -1) == ""
substring("hello world", 5, 1) == ""
substring("hello world", 5) == " world"
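
The same rules can be sketched in Python as a behavioural model. This is not the patch itself; the name substring_v1 is hypothetical, and the clamping of positions that lie beyond the string is an assumption not stated above.

```python
def substring_v1(string, start, end=None):
    """Model of the first version of the patch (not NEdit's code)."""
    n = len(string)
    # A missing end position means "up to the end of the string".
    if end is None:
        end = n
    # A negative end, or an end before start, yields the empty string.
    if end < 0 or end < start:
        return ""
    # Assumption: positions outside the string are clamped to its bounds.
    start = max(0, min(start, n))
    end = min(end, n)
    return string[start:end]


print(repr(substring_v1("hello world", 5, -1)))
print(repr(substring_v1("hello world", 5, 1)))
print(repr(substring_v1("hello world", 5)))
```

Because the two-parameter form was previously an error, only calls that relied on the old swapping are affected by the change.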

The new behaviour for a negative or too-small end appears
justified, since there does not appear (to me at least) to be
any reason for a program to want the "flip the positions"
behaviour, and existing, non-buggy macro code will already make
certain that end is neither negative nor less than start.


  • Tony Balinski

    patch/diff -ur file

  • Thorsten Haude

    Overall I think the patch makes a lot of sense. One minor
    thing though: by invalidating all substrings with negative
    indices, we lose a possible piece of functionality. Negative
    positions could be counted from the end:
    substring("hello world", 5, -1) -> " worl"
    substring("hello world", -4, -1) -> "orl"

  • Tony Balinski

    I actually considered using a negative end-pos as specifying
    a "length" to extract rather than a position, and in fact
    had code to do this. This was more difficult to describe for
    the help documentation and I found it a bit counterintuitive.

    Your proposal, keeping both numbers as offsets, but from the
    end, is a lot clearer. I think this could be done. I also
    think it could be just as useful - I have had cases where I
    wanted to strip off the first and last character from a
    string: this, using your scheme, would be
    substring(string, 1, -1)

  • Tony Balinski

    Added suggested handling of negative position values: if
    either of the position indicators is less than zero, it will
    be treated as relative to the end of the string. Also, if
    the end position is missing, the end of the string is used.
    Finally, if the start position is beyond the end position,
    an empty string is returned. So the following are true:
    substring("hello world", 5, -1) == " worl"
    substring("hello world", -5, -4) == "w"
    substring("hello world", -5) == "world"
    substring("hello world", 6, -6) == ""
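
As a rough illustration of these final rules, here is a Python model. It is a sketch of the semantics only, not the code that went into CVS; the name substring_final is made up, and the clamping of positions that still fall outside the string after adjustment is an assumption.

```python
def substring_final(string, start, end=None):
    """Model of the final rules described above (not NEdit's code)."""
    n = len(string)
    # A missing end position means "up to the end of the string".
    if end is None:
        end = n
    # Negative positions are taken relative to the end of the string.
    if start < 0:
        start += n
    if end < 0:
        end += n
    # Assumption: anything still out of range is clamped to the bounds.
    start = max(0, min(start, n))
    end = max(0, min(end, n))
    # A start beyond the end position yields the empty string.
    if start > end:
        return ""
    return string[start:end]


print(repr(substring_final("hello world", 5, -1)))
print(repr(substring_final("hello world", -5, -4)))
print(repr(substring_final("hello world", -5)))
print(repr(substring_final("hello world", 6, -6)))
print(repr(substring_final("hello", 1, -1)))
```

The last call shows the "strip the first and last character" case mentioned earlier, which gives "ell" for "hello".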

  • Tony Balinski

    Integrated into CVS, allowing negative offsets.

  • Tony Balinski

    • status: open --> closed-accepted