Thread: [Libclc-developers] new function: clc_strrev
Status: Planning
Brought to you by:
augestad
|
From: <bo...@me...> - 2003-03-23 11:09:39
|
We can't have a string module without a strrev function, can we?
I could not find a proposal for one on google, so here is my version.
q1: Do we need it?
q2: any bugs?
q3: other comments?
Enjoy.
Bjørn
/* $id$
* Copyright(c) 2003, B. Augestad (bo...@me...)
*/
/* documentation start:
SYNOPSIS
#include <clc_string.h>
char* clc_strrev(char* s);
DESCRIPTION
clc_strrev() - reverses the contents of a string.
PARAMETERS
s - The string to reverse
RETURN VALUE
Returns the parameter s
ERROR HANDLING
The release version of libclc performs no error checking. The debug
version asserts that s isn't a NULL pointer.
EXAMPLE
char buf[100];
strcpy(buf, "Hello, World");
clc_strrev(buf);
printf("%s\n", buf);
AUTHOR
B. Augestad, bo...@me...
BUGS
None known
* documentation end:
*/
#include <string.h>
#include "clc_assert.h"
#include "clc_string.h"
char* clc_strrev(char* s)
{
size_t left, right, middle, cb;
clc_assert_not_null(clc_strrev, s);
cb = strlen(s);
middle = cb / 2;
for(left = 0; left < middle; left++) {
char tmp;
right = cb - left - 1;
tmp = s[left];
s[left] = s[right];
s[right] = tmp;
}
return s;
}
#ifdef CLC_TEST
#include <stdio.h>
int main(void)
{
char odd[100], even[100], empty[100];
strcpy(odd, "12345");
strcpy(even, "123456");
strcpy(empty, "");
printf("odd :%s\n", clc_strrev(odd));
printf("even :%s\n", clc_strrev(even));
printf("empty:%s\n", clc_strrev(empty));
return 0;
}
#endif
|
|
From: Hallvard B F. <h.b...@us...> - 2003-03-23 11:30:06
|
Bjørn Augestad writes:
> We can't have a string module without a strrev function, can we?
> I could not find a proposal for one on google, so here is my version.
> q1: Do we need it?
I don't think so. I can only remember needing a rev function once or
twice in my life. I have heroically refrained from posting a few nice
libclc functions I've rarely if ever needed myself:-)
--
Hallvard
|
|
From: <bo...@me...> - 2003-03-23 12:25:02
|
Hallvard B Furuseth wrote:
> Bjørn Augestad writes:
>
>>We can't have a string module without a strrev function, can we?
>>I could not find a proposal for one on google, so here is my version.
>>q1: Do we need it?
>
>
> I don't think so. I can only remember needing a rev function once or
> twice in my life. I have heroically refrained from posting a few nice
> libclc functions I've rarely if ever needed myself:-)
>
Some sales pitches for clc_strrev. :-)
1. It seems to be a common problem for newbies to reverse a string, lots
of people have asked about this over the years. We can help them out by
providing it.
2. Microsoft has one (_strrev), ANSI C does not. clc_strrev can aid
portability.
3. Some functions can be implemented in a clearer way if we have a
clc_strrev. Consider the clc_ultostr() which in the latest version
writes from the end of buffer to the beginning and then calls memmove to
adjust the buffer. clc_ultostr() could be changed to writing from the
beginning of the buffer and then just reverse the output before
returning. Here's a quick&dirty implementation :
int clc_ultostr(char *ptr, size_t size, unsigned long num, int base)
{
const char *sym = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
char* sp;
sp = ptr;
size--; /* reserve one for \0 */
do {
*sp++ = sym[num % base];
num /= base;
} while (num > 0 && size-- > 0);
if(num > 0)
return 0;
*sp = '\0';
clc_strrev(ptr);
return 1;
}
I ran my test program with both versions. The clc_strrev version is 5%
faster on one of my machines than the original which uses memmove. ;-)
I think you should post the nice functions as well. Never know if
someone needs them or suddenly finds a new way of doing things. :-)
--
boa
Please join the libclc-developers list
at http://lists.sourceforge.net/lists/listinfo/libclc-developers
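The write-forward-then-reverse idea from point 3 can be tried as a
standalone file. This is only a sketch, not the libclc source: the names
sketch_strrev and sketch_ultostr are made up for illustration, and the
bounds check is written so that at most size - 1 digits are stored
before the terminating '\0'.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for clc_strrev(): reverse a string in place. */
char *sketch_strrev(char *s)
{
    size_t len = strlen(s);
    size_t i;

    for (i = 0; i < len / 2; i++) {
        char tmp = s[i];
        s[i] = s[len - 1 - i];
        s[len - 1 - i] = tmp;
    }
    return s;
}

/* Hypothetical stand-in for clc_ultostr(): returns 1 on success, or 0
 * if the arguments are invalid or the digits do not fit in size bytes. */
int sketch_ultostr(char *ptr, size_t size, unsigned long num, int base)
{
    static const char sym[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    size_t used = 0;

    if (size < 2 || base < 2 || base > 36)
        return 0;

    do {
        if (used == size - 1)   /* keep one byte for '\0' */
            return 0;
        ptr[used++] = sym[num % (unsigned long)base];
        num /= (unsigned long)base;
    } while (num > 0);

    ptr[used] = '\0';
    sketch_strrev(ptr);         /* digits were written least significant first */
    return 1;
}

#ifdef CLC_TEST
int main(void)
{
    char buf[12];

    if (sketch_ultostr(buf, sizeof buf, 170UL, 2))
        printf("%s\n", buf);
    if (sketch_ultostr(buf, sizeof buf, 255UL, 16))
        printf("%s\n", buf);
    return 0;
}
#endif
```

When the sketch returns 0 the buffer contents are unspecified, so a
caller should check the return value before using the string.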
|
|
From: Hallvard B F. <h.b...@us...> - 2003-03-23 13:40:41
|
Bjørn Augestad writes:
> Some sales pitches for clc_strrev. :-)
OK.
In that case, here is a smaller version. I added clc_strnrev() too
since I imagine that can be useful sometimes, or at least faster when
we already know strlen.
BTW, I did as you said and replaced #include "clc_*" with <clc_*>:-)
Note: The 'if(len)' is a very slight optimization, it can be omitted
(and the end-- moved) at the price of an extra loop for uneven-length
strings.
/* $Id$ */
/*
* Copyright(c) 2003, Hallvard B Furuseth <h.b...@us...>
*/
#include <string.h>
#include <clc_assert.h>
#include <clc_string.h>
char *
clc_strrev(char *s)
{
clc_assert_not_null(clc_strrev, s);
return clc_strnrev(s, strlen(s));
}
char *
clc_strnrev(char *s, size_t len)
{
clc_assert_not_null(clc_strnrev, s);
if(len) {
char *beg = s;
char *end = s + len - 1;
while (beg < end) {
char tmp = *end;
*end-- = *beg;
*beg++ = tmp;
}
}
return s;
}
--
Hallvard
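The note about 'if(len)' can be illustrated with a variant that drops
the test. This is a sketch under a made-up name (sketch_memrev), not the
checked-in code: end starts one past the last byte and is decremented
before each swap, so a zero length never forms a pointer before s, and
an odd length costs one extra iteration that swaps the middle byte with
itself.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical variant of clc_strnrev() without the if(len) test. */
char *sketch_memrev(char *s, size_t len)
{
    char *beg = s;
    char *end = s + len;    /* one past the last byte */

    while (beg < end) {
        char tmp;

        --end;              /* the decrement moved here from the swap */
        tmp = *end;
        *end = *beg;
        *beg++ = tmp;
    }
    return s;
}

#ifdef CLC_TEST
int main(void)
{
    char odd[] = "12345", even[] = "123456", empty[] = "";

    printf("odd  :%s\n", sketch_memrev(odd, 5));
    printf("even :%s\n", sketch_memrev(even, 6));
    printf("empty:%s\n", sketch_memrev(empty, 0));
    return 0;
}
#endif
```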
|
|
From: <bo...@me...> - 2003-03-23 14:50:59
|
Hallvard B Furuseth wrote:
> Bjørn Augestad writes:
>
>>Some sales pitches for clc_strrev. :-)
>
>
> OK.
>
> In that case, here is a smaller version. I added clc_strnrev() too
> since I imagine that can be useful sometimes, or at least faster when
> we already know strlen.
How about renaming clc_strnrev() to clc_memrev()? That's what it does
and suddenly we have a general purpose memory reverser. It can even be
used to swap bytes in integers.
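As a sketch of the byte-swapping use, assuming clc_memrev() keeps the
(char *, size_t) signature of clc_strnrev() quoted above. Both names
below are stand-ins invented for this example, not library functions.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the proposed clc_memrev(): reverse len bytes in place. */
char *sketch_memrev(char *s, size_t len)
{
    if (len) {
        char *beg = s;
        char *end = s + len - 1;

        while (beg < end) {
            char tmp = *end;
            *end-- = *beg;
            *beg++ = tmp;
        }
    }
    return s;
}

/* Reverse the byte order of an unsigned long by treating the object
 * as an array of char, which C allows for any object type. */
unsigned long sketch_swap_ulong(unsigned long v)
{
    sketch_memrev((char *)&v, sizeof v);
    return v;
}

#ifdef CLC_TEST
int main(void)
{
    unsigned long v = 0x12345678UL;

    printf("%lx -> %lx\n", v, sketch_swap_ulong(v));
    return 0;
}
#endif
```

The printed result depends on the width of unsigned long on the
platform, since all sizeof(unsigned long) bytes are reversed.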
>
> BTW, I did as you said and replaced #include "clc_*" with <clc_*>:-)
That was in the user documentation. We must use "" to avoid that we
include headers from a previous version already installed. Nice try,
though ;-)
>
> Note: The 'if(len)' is a very slight optimization, it can be omitted
> (and the end-- moved) at the price of an extra loop for uneven-length
> strings.
>
> /* $Id$ */
> /*
> * Copyright(c) 2003, Hallvard B Furuseth <h.b...@us...>
I compared the two versions and optimized mine a little. Turned out that
mine was slightly faster on a 600MHz Pentium running Linux. Then I tried
it on an Athlon 1800+ running Windows XP. Then your version was faster.
Finally Windows *crashed*. :-( Oh well, your version looks better so we
keep that one. Please update libclc/src/string/clc_strrev.c
Speaking of optimizations, any opinions on code like this?
int clc_ultostr(char *ptr, size_t size, unsigned long num, int base)
{
const char *sym = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
char* sp;
#ifdef CLC_FAST
if(base == 2) {
sp = ptr;
size--;
do {
*sp++ = num & 0x01 ? '1' : '0';
num >>= 1;
} while(num > 0 && size-- > 0) ;
if(num == 0) {
*sp = '\0';
clc_strrev(ptr);
}
return num == 0;
}
#endif
.... /* regular version goes here */
}
The CLC_FAST version is 4 times faster than the regular one, but the
library will be bigger. Nice for those who likes to print bitpatterns?
Strangely enough division by bitshifting is still faster than / 2. (gcc
3.2.1 -O3 -DNDEBUG)
--
boa
|
|
From: Hallvard B F. <h.b...@us...> - 2003-03-23 17:19:32
|
Bjørn Augestad writes:
> How about renaming clc_strnrev() to clc_memrev()?
Done.
> Oh well, your version looks better so we
> keep that one. Please update libclc/src/string/clc_strrev.c
Done.
> Speaking of optimizations, any opinions on code like this?
>
> int clc_ultostr(char *ptr, size_t size, unsigned long num, int base)
> {
> const char *sym = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
> char* sp;
>
> #ifdef CLC_FAST
> if(base == 2) { (...) return num == 0; }
> #endif
> .... /* regular version goes here */
> }
Since the _whole_ base2 function is special-cased, I think it's better
to provide a separate base2 function. Then the library still gets
bigger like you say, but not the application using it.
--
Hallvard
|
|
From: <bo...@me...> - 2003-03-23 17:56:32
|
Hallvard B Furuseth wrote:
> Bjørn Augestad writes:
>>Speaking of optimizations, any opinions on code like this?
>>
>>int clc_ultostr(char *ptr, size_t size, unsigned long num, int base)
>>{
>> const char *sym = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
>> char* sp;
>>
>>#ifdef CLC_FAST
>> if(base == 2) { (...) return num == 0; }
>>#endif
>> .... /* regular version goes here */
>>}
>
>
> Since the _whole_ base2 function is special-cased, I think it's better
> to provide a separate base2 function. Then the library still gets
> bigger like you say, but not the application using it.
>
Hmm. A separate base2 function means that the user must call different
functions depending on the speed/size tradeoff he's willing to make. We
discussed the concept of CLC_FAST early in the project, where the
intention of CLC_FAST was to allow for speedier implementations if the
user/builder of libclc wanted that.
I used base2 as an easy-to-implement example, pretty sure that if the
idea catches on, others can provide optimizations for e.g. base 16. We
don't want to call base2, base16, base10 or clc_ultostr() depending on
base, do we?
What if we add clc_ltostr(), clc_lltostr(), clc_itostr() and
clc_ulltostr() for other integer types? Do we want separate base2
functions for them as well?
--
boa
libclc home: http://libclc.sourceforge.net
|
|
From: Hallvard B F. <h.b...@us...> - 2003-03-27 23:17:30
|
Bjørn Augestad writes:
>Hallvard B Furuseth wrote:
>> Bjørn Augestad writes:
>
>>>Speaking of optimizations, any opinions on code like this?
>>>
>>>int clc_ultostr(char *ptr, size_t size, unsigned long num, int base)
>>>{
>>> const char *sym = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
>>> char* sp;
>>>
>>>#ifdef CLC_FAST
>>> if(base == 2) { (...) return num == 0; }
>>>#endif
>>> .... /* regular version goes here */
>>>}
>>
>>
>> Since the _whole_ base2 function is special-cased, I think it's better
>> to provide a separate base2 function. Then the library still gets
>> bigger like you say, but not the application using it.
>
> Hmm. A separate base2 function means that the user must call different
> functions depending on the speed/size tradeoff he's willing to make. We
> discussed the concept of CLC_FAST early in the project, where the
> intention of CLC_FAST was to allow for speedier implementations if the
> user/builder of libclc wanted that.
Sorry, I forgot about that. Yes, I think that may be a good idea if we
implement CLC_FAST. I'm not sure if CLC_FAST is a good idea though...
> I used base2 as an easy-to-implement example, pretty sure that if the
> idea catches on, others can provide optimizations for e.g. base 16.
Here is one which covers all power-of-2 bases.
In the case of base 2, it's 10-20% slower than hardcoded base 2 on
Solaris, I think that's acceptable.
BTW, your claim that CLC_FAST is 4 times as fast as without CLC_FAST
depends on how large NUM is. With NUM=0, this function is 15% faster,
with random NUM, 50% faster on Solaris.
Anyway, shall I check it in? We can always take CLC_FAST out later
if c.l.c doesn't like it.
int clc_ultostr(char *ptr, size_t size, unsigned long num, int base)
{
const char *sym = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
char *sp;
clc_assert_not_null(clc_ultostr, ptr);
clc_assert_arg(clc_ultostr, size > 1);
clc_assert_arg(clc_ultostr, base >= 2 && base <= 36);
--size;
sp = ptr;
#ifdef CLC_FAST
if((base & (base-1)) == 0) {
static const unsigned int log[] = { 1,2,3,0, 4,0,0,0, 5 };
unsigned int shift = log[base / 4];
unsigned int mask = base - 1U;
do {
*sp++ = sym[num & mask];
num >>= shift;
} while(num && --size);
}
else
#endif /* CLC_FAST */
{
do {
*sp++ = sym[num % base];
num /= base;
} while (num && --size);
}
*sp = '\0';
clc_memrev(ptr, sp-ptr);
return !num;
}
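The power-of-2 branch relies on two identities: when base == 1 << shift,
num % base equals num & (base - 1) and num / base equals num >> shift.
The table indexed by base / 4 supplies the shift for bases 2, 4, 8, 16
and 32. A small sketch that isolates the lookup (the name
sketch_shift_for_base is invented here, and the range check is an
addition the posted code does not need because it asserts on base):

```c
#include <stdio.h>

/* Return the shift count for a power-of-2 base in 2..32, else 0. */
unsigned int sketch_shift_for_base(unsigned int base)
{
    /* index = base / 4: 2->0, 4->1, 8->2, 16->4, 32->8 */
    static const unsigned int log2tab[] = { 1,2,3,0, 4,0,0,0, 5 };

    if (base < 2 || base > 32 || (base & (base - 1)) != 0)
        return 0;
    return log2tab[base / 4];
}

#ifdef CLC_TEST
int main(void)
{
    unsigned int base;

    for (base = 2; base <= 32; base *= 2)
        printf("base %2u: shift %u\n", base, sketch_shift_for_base(base));
    return 0;
}
#endif
```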
> What if we add clc_ltostr(), clc_lltostr(), clc_itostr() and
> clc_ulltostr() for other integer types?
Not clc_itostr() I hope. clc_ltostr() can handle that. As for
clc_{u}lltostr(), I guess that depends on whether or not we want to add
a bunch of long long functions when compiled with C99 compilers.
--
Hallvard
|
|
From: <bo...@me...> - 2003-03-28 17:55:28
|
Hallvard B Furuseth wrote:
> Bjørn Augestad writes:
> Anyway, shall I check it in?
Yes.
> We can always take CLC_FAST out later
> if c.l.c doesn't like it.
They'll love it. :-)
--
boa
libclc home: http://libclc.sourceforge.net
|
|
From: Hallvard B F. <h.b...@us...> - 2003-03-28 18:32:16
|
Bjørn Augestad writes:
>Hallvard B Furuseth wrote:
>> Anyway, shall I check it in?
> Yes.
Done.
I also made it handle size == 1, since it took no extra object code.
Also, here are some fixes to the documentation.
NAME
clc_ultostr - convert an unsigned long integer to a string
SYNOPSIS
int clc_ultostr(char *ptr, size_t size, unsigned long num,
int base);
DESCRIPTION
The clc_ultostr() function converts the number NUM to a string, in
base BASE notation. BASE must be between 2 and 36 inclusive. SIZE
must be at least 1. To summarize it, this function does (nearly)
exactly the opposite of strtoul().
The output and a trailing '\0' are stored in a buffer pointed to by
PTR, whose total length is passed in SIZE. If SIZE is too small,
the result is truncated and only the least significant digits are
stored.
RETURN VALUE
clc_ultostr() returns 1 on success, or 0 if truncation happened.
EXAMPLE
char buf[12];
/* print 10101010 */
clc_ultostr(buf, sizeof(buf), 170, 2);
puts(buf);
/* print 1Y */
clc_ultostr(buf, sizeof(buf), 70, 36);
puts(buf);
--
Hallvard
|
|
From: Jan E. <je...@li...> - 2003-04-02 11:52:24
|
>Also, here are some fixes to the documentation.
>
>NAME
> clc_ultostr - convert an unsigned long integer to a string
>
>SYNOPSIS
> char *clc_ultostr(char *ptr, size_t size, unsigned long num
> int base);
>
>DESCRIPTION
> The clc_ultostr() function converts the number NUM to a string, in
> base BASE notation. BASE must be between 2 and 36 inclusive. SIZE
> must be at least 1. To summarize it, this function does (nearly)
> exactly the opposite of strtoul().
Now we have to discuss what we gain by two spaces.
- Jan Engelhardt
|
|
From: Hallvard B F. <h.b...@us...> - 2003-04-02 14:47:04
|
Jan Engelhardt writes:
>> DESCRIPTION
>> The clc_ultostr() function converts the number NUM to a string, in
>> base BASE notation. BASE must be between 2 and 36 inclusive. SIZE
>> must be at least 1. To summarize it, this function does (nearly)
>> exactly the opposite of strtoul().
>
> Now we have to discuss what we gain by two spaces.
Er, what?
BTW, something else to discuss - on comp.lang.c - is whether or not to
include this function at all. There is just one vote there now, and
that is 'no', so the majority is against:-)
The libclc discussion there now doesn't look promising for libclc
anyway - it won't *be* libclc if c.l.c doesn't discuss it.
--
Hallvard
|
|
From: Jan E. <je...@li...> - 2003-04-03 13:57:09
|
>>> DESCRIPTION
>>> The clc_ultostr() function converts the number NUM to a string, in
>>> base BASE notation. BASE must be between 2 and 36 inclusive. SIZE
>>> must be at least 1. To summarize it, this function does (nearly)
>>> exactly the opposite of strtoul().
>>
>> Now we have to discuss what we gain by two spaces.
>
>Er, what?
"Why is there two spaces between a sentence end and the next one
like "notation. BASE"? Probably just the ugly man<->txt formatter.
>BTW, something else to discuss - on comp.lang.c - is whether or not to
>include this function at all. There is just one vote there now, and
>that is 'no', so the majority is against:-)
who is that?
the other, who says 'yes' is me, so we 're equal
Oh yeah indeed, few people really need ultostr, as;all they
ever need will be %o, %d and %x, which are in printf.
But, does glibc contain only functions the mass needs?
- Jan Engelhardt
|
|
From: regis <re...@in...> - 2003-04-03 16:18:34
|
Jan Engelhardt wrote:
>
> >>> DESCRIPTION
> >>> The clc_ultostr() function converts the number NUM to a string, in
> >>> base BASE notation. BASE must be between 2 and 36 inclusive. SIZE
> >>> must be at least 1. To summarize it, this function does (nearly)
> >>> exactly the opposite of strtoul().
> >>
> >> Now we have to discuss what we gain by two spaces.
> >
> >Er, what?
>
> "Why is there two spaces between a sentence end and the next one
> like "notation. BASE"? Probably just the ugly man<->txt formatter.
Typographic rules change from languages to languages.
In French:
a) for simple marks => no space before and one after
(comma, period)
b) for double marks => one insecable space before and one space after,
(colon, exclamation, question mark)
In English:
a) same as in French except for the period (two spaces after)
b) no space before double marks but one after
In both languages:
c) suspension mark (...) follows the same rule as for the period.
> Oh yeah indeed, few people really need ultostr, as;all they
> ever need will be %o, %d and %x, which are in printf.
> But, does glibc contain only functions the mass needs?
base 2 is more useful than base 8 and is missing...
--
Regis
|
|
From: Jan E. <je...@li...> - 2003-04-03 16:40:20
|
>> "Why is there two spaces between a sentence end and the next one
>> like "notation. BASE"? Probably just the ugly man<->txt formatter.
>
>Typographic rules change from languages to languages.
>
>In French:
>a) for simple marks => no space before and one after
>(comma, period)
>
>b) for double marks => one insecable space before and one space after,
>(colon, exclamation, question mark)
... ah j'ai compris.
>> Oh yeah indeed, few people really need ultostr, as;all they
>> ever need will be %o, %d and %x, which are in printf.
>> But, does glibc contain only functions the mass needs?
>
>base 2 is more useful than base 8 and is missing...
so that is a vote for ultostr?
- Jan Engelhardt
|
|
From: regis <re...@in...> - 2003-04-03 17:33:06
|
Jan Engelhardt wrote:
>
> >> "Why is there two spaces between a sentence end and the next one
> >> like "notation. BASE"? Probably just the ugly man<->txt formatter.
> >
> >Typographic rules change from languages to languages.
> >
> >In French:
> >a) for simple marks => no space before and one after
> >(comma, period)
> >
> >b) for double marks => one insecable space before and one space after,
> >(colon, exclamation, question mark)
>
> ... ah j'ai compris.
>
> >> Oh yeah indeed, few people really need ultostr, as;all they
> >> ever need will be %o, %d and %x, which are in printf.
> >> But, does glibc contain only functions the mass needs?
> >
> >base 2 is more useful than base 8 and is missing...
>
> so that is a vote for ultostr?
Mhh Yes,
except that the mangled names of all the string functions suck.
|
|
From: Jan E. <je...@li...> - 2003-04-03 17:44:54
|
>> so that is a vote for ultostr?
>
>Mhh Yes,
>except that the mangled names of all the string functions suck.
tell us a better name for uh ultostr()...
strfromul?
- Jan Engelhardt
|
|
From: Hallvard B F. <h.b...@us...> - 2003-04-03 17:53:05
|
regis writes:
>> so that is a vote for ultostr?
>
> Mhh Yes,
> except that the mangled names of all the string functions suck.
Say it on c.l.c. With suggestions for other names. Then I'll probably
speak up against you, I think most names are fine:-)
--
Hallvard
|
|
From: Hallvard B F. <h.b...@us...> - 2003-04-03 17:38:15
|
Jan Engelhardt writes:
>>> Now we have to discuss what we gain by two spaces.
>>
>>Er, what?
>
> "Why is there two spaces between a sentence end and the next one
> like "notation. BASE"? Probably just the ugly man<->txt formatter.
Because that's the way I learned to write it. And it helps formatting
tools to detect a sentence end which is not at the end of the line.
>> BTW, something else to discuss - on comp.lang.c - is whether or not to
>> include this function at all. There is just one vote there now, and
>> that is 'no', so the majority is against:-)
>
> who is that?
Dan Pop, message ID <b6bqgt$fq2$1...@su...>.
> the other, who says 'yes' is me, so we 're equal
But you didn't say it on c.l.c (unless my news server missed it), so it
doesn't count yet. This is lib*clc* after all.
--
Hallvard
|
|
From: Hallvard B F. <h.b...@us...> - 2003-03-23 14:33:14
|
> I think you should post the nice functions as well. Never know if
> someone needs them or suddenly finds a new way of doing things. :-)
OK, here are the clc_str{n}{case}comm() functions. Does anyone need
them? I don't:-)
NAME
clc_strcomm, clc_strncomm, clc_strcasecomm, clc_strncasecomm
- find common prefix of strings
SYNOPSIS
#include <clc_string.h>
long clc_strcomm(const char *s1, const char *s2);
long clc_strncomm(const char *s1, const char *s2, long len);
long clc_strcasecomm(const char *s1, const char *s2);
long clc_strncasecomm(const char *s1, const char *s2, long len);
DESCRIPTION
These functions find the length of the common prefix of two strings.
clc_strcomm() and clc_strncomm() do case-sensitive compare,
clc_strcasecomm() and clc_strncasecomm() case-insensitive.
clc_strcomm() and clc_strcasecomm() compare until the terminating
null characters. clc_strncomm() and clc_strncasecomm() also stop
after len characters compared equal.
If the strings are longer than LONG_MAX, the behaviors of
clc_strncomm() and clc_strncasecomm() are undefined.
RETURN VALUES
The functions return the common length of the strings, or -1 if the
strings compare equal.
CAVEATS
Note that the functions treat the size as long, not size_t.
SEE ALSO
clc(3), strcmp(3), strncmp(3), clc_strcasecmp(3), clc_strncasecmp(3)
Implementation choices:
- I used long instead of size_t as a length parameter to emphasize that
a long length is returned.
- I could have returned an error condition (-2?) from clc_str{case}comm
if the strings were equal for more than LONG_MAX characters, but
didn't bother. I can't imagine these functions being used on other
than fairly short strings.
Source code for clc_strncasecomm() follows. The others are for the time
being left as an exercise for the reader.
/* $Id$ */
/*
* Copyright(c) 2003 Hallvard B Furuseth <h.b...@us...>
*/
#include <ctype.h>
#include <clc_assert.h>
#include <clc_string.h>
long
clc_strncasecomm(const char *s1, const char *s2, long len)
{
const char *start = s1;
clc_assert_arg(clc_strncasecomm, s1 != NULL && s2 != NULL);
for (; --len >= 0; ++s1, ++s2) {
if (tolower((unsigned char)*s1) != tolower((unsigned char)*s2))
return s1 - start;
if (*s1 == '\0')
break;
}
return -1;
}
#ifdef CLC_TEST
#include <stdio.h>
int
main()
{
if(clc_strncasecomm("abc", "ABD", 3) != 2 ||
clc_strncasecomm("abc", "ABD", 2) != -1 ||
clc_strncasecomm("abc", "ABD", 1) != -1 ||
clc_strncasecomm("ab", "ABD", 3) != 2 ||
clc_strncasecomm("abc", "DEF", 3) != 0 ) {
puts("Sorry.");
return 1;
}
return 0;
}
#endif /* CLC_TEST */
--
Hallvard
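For readers attempting the exercise, a case-sensitive counterpart might
look like the sketch below. It simply drops the tolower() calls from
clc_strncasecomm() and keeps the same return convention. The sketch_
prefix marks it as an illustration, not the libclc implementation.

```c
#include <stdio.h>

/* Hypothetical clc_strncomm(): length of the common prefix of s1 and
 * s2, or -1 if they compare equal within the first len characters. */
long sketch_strncomm(const char *s1, const char *s2, long len)
{
    const char *start = s1;

    for (; --len >= 0; ++s1, ++s2) {
        if (*s1 != *s2)
            return (long)(s1 - start);
        if (*s1 == '\0')    /* both strings ended together */
            break;
    }
    return -1;
}

#ifdef CLC_TEST
int main(void)
{
    printf("%ld\n", sketch_strncomm("abc", "abd", 3));
    printf("%ld\n", sketch_strncomm("abc", "abd", 2));
    return 0;
}
#endif
```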
|
|
From: <bo...@me...> - 2003-03-23 15:35:33
|
Hallvard B Furuseth wrote:
>>I think you should post the nice functions as well. Never know if
>>someone needs them or suddenly finds a new way of doing things. :-)
>
>
> OK, here are the clc_str{n}{case}comm() functions. Does anyone need
> them? I don't:-)
I have had use for functions like this. If you want to implement
strtod() you must support the text "infinity", which can be abbreviated
to "inf" and should be treated case insensitive as well.
n = clc_strcasecomm(s, "infinity");
would have been nice to have in such cases.
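A sketch of that strtod() use case, assuming the return convention from
the proposal (common prefix length, or -1 if the strings compare
equal). Both function names are invented for the example, and
sketch_strcasecomm() is only a guess at how the unposted
clc_strcasecomm() would behave.

```c
#include <ctype.h>
#include <stdio.h>

/* Guess at clc_strcasecomm(): case-insensitive common prefix length,
 * or -1 if the two strings compare equal. */
long sketch_strcasecomm(const char *s1, const char *s2)
{
    const char *start = s1;

    for (;; ++s1, ++s2) {
        if (tolower((unsigned char)*s1) != tolower((unsigned char)*s2))
            return (long)(s1 - start);
        if (*s1 == '\0')
            return -1;
    }
}

/* Number of leading characters of s that spell "infinity" or "inf",
 * ignoring case, or 0 if s starts with neither. */
int sketch_match_infinity(const char *s)
{
    long n = sketch_strcasecomm(s, "infinity");

    if (n == -1 || n == 8)  /* all of "infinity" matched */
        return 8;
    if (n >= 3)             /* at least "inf" matched */
        return 3;
    return 0;
}

#ifdef CLC_TEST
int main(void)
{
    printf("%d\n", sketch_match_infinity("INF"));
    printf("%d\n", sketch_match_infinity("Infinity"));
    printf("%d\n", sketch_match_infinity("1.5"));
    return 0;
}
#endif
```

A partial match such as "infi" counts as "inf" here, leaving the
remaining characters for the caller to deal with.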
I have some questions about the design. Allow for stupid questions here,
hard to grasp everything at first try.
1. Why do you return -1 if s1 is shorter than s2? IMO that means that
the length of the common part == strlen(s1).
2. The implementation crashes if s2 is shorter than s1 and shorter than
len, doesn't it?
3. Why have a len at all? If you just want to see if the first n
characters are common, can't memcmp, strncasecmp or others be used instead?
4. Given 1) and 3), why not return size_t? What can go wrong? If either
of the strings are "", then return 0. When a char differs, return count
so far.
>
> NAME
> clc_strcomm, clc_strncomm, clc_strcasecomm, clc_strncasecomm
> - find common prefix of strings
>
> SYNOPSIS
> #include <clc_string.h>
> long clc_strcomm(const char *s1, const char *s2);
> long clc_strncomm(const char *s1, const char *s2, long len);
> long clc_strcasecomm(const char *s1, const char *s2);
> long clc_strncasecomm(const char *s1, const char *s2, long len);
>
> DESCRIPTION
> These functions find the length of the common prefix of two strings.
>
> clc_strcomm() and clc_strncomm() do case-sensitive compare,
> clc_strcasecomm() and clc_strncasecomm() case-insensitive.
>
> clc_strcomm() and clc_strcasecomm() compare until the terminating
> null characters. clc_strncomm() and clc_strncasecomm() also stop
> after len characters compared equal.
>
> If the strings are longer than LONG_MAX, the behaviors of
> clc_strncomm() and clc_strncasecomm() are undefined.
>
> RETURN VALUES
> The functions return the common length of the strings, or -1 if the
> strings compare equal.
>
> CAVEATS
> Note that the functions treat the size as long, not size_t.
>
> SEE ALSO
> clc(3), strcmp(3), strncmp(3), clc_strcasecmp(3), clc_strncasecmp(3)
>
>
> Implementation choices:
> - I used long instead of size_t as a length parameter to emphasize that
> a long length is returned.
> - I could have returned an error condition (-2?) from clc_str{case}cmp
> if the strings were equal for more than LONG_MAX characters, but
> didn't bother. I can't imagine these functions being used on other
> than fairly short strings.
>
> Source code for clc_strncasecomm() follows. The others are for the time
> being left as an exercise for the reader.
>
>
> /* $Id$ */
> /*
> * Copyright(c) 2003 Hallvard B Furuseth <h.b...@us...>
> */
>
> #include <ctype.h>
> #include <clc_assert.h>
> #include <clc_string.h>
>
> long
> clc_strncasecomm(const char *s1, const char *s2, long len)
> {
> const char *start = s1;
>
> clc_assert_arg(clc_strncasecomm, s1 != NULL && s2 != NULL);
>
> for (; --len >= 0; ++s1, ++s2) {
> if (tolower((unsigned char)*s1) != tolower((unsigned char)*s2))
> return s1 - start;
> if (*s1 == '\0')
> break;
> }
> return -1;
> }
>
[snip]
--
boa
|
|
From: Hallvard B F. <h.b...@us...> - 2003-03-23 15:47:54
|
Bjørn Augestad writes:
> 1. Why do you return -1 if s1 is shorter than s2?
I don't. the *s1 == '\0' test only hits if *s1 == *s2.
> 2. The implementation crashes if s2 is shorter than s1 and shorter than
> len, doesn't it?
Then the tolower test will stop the loop. Well, unless *s2 == '\0'
and tolower(*s2) != '\0', but if anyone has broken tolower that badly
I don't care if they lose.
> 3. Why have a len at all? If you just want to see if the first n
> characters are common, can't memcmp, strncasecmp or others be used instead?
One could test with strncasecmp first and then call strcasecomm() if the
strings are not equal, but that's more work and more code.
> 4. Given 1) and 3), why not return size_t? What can go wrong?
The problem is what to return if the strings are equal and shorter than
the length parameter.
strlen(s1)? Then the caller must test if either of s1[return value] or
s2[return value] is non-'\0' if he wants to know if the strings are
equal.
strlen(s1)+1? Then we can't inspect s1[return value], which might be
out of bounds, and we must test if the return value is nonzero before
inspecting s1[return value-1].
So I gave up and returned -1 instead.
--
Hallvard
|
|
From: regis <re...@in...> - 2003-04-03 17:58:36
|
Jan Engelhardt wrote:
>
> >> so that is a vote for ultostr?
> >
> >Mhh Yes,
> >except that the mangled names of all the string functions suck.
>
> tell us a better name for uh ultostr()...
> strfromul?
clc_string_from_unsigned_long()
--
Regis
|
|
From: regis <re...@in...> - 2003-04-03 18:35:50
|
Hallvard B Furuseth wrote:
>
> regis writes:
> >> so that is a vote for ultostr?
> >
> > Mhh Yes,
> > except that the mangled names of all the string functions suck.
>
> Say it on c.l.c. With suggestions for other names. Then I'll probably
> speak up against you, I think most names are fine:-)
- Given a function name, It's hard to guess what it is supposed to do.
- Given a wanted feature by the user, it is hard for him to guess what
is the corresponding function.
- Given a function already used by the user, it is hard to remember it
for the next use.
What clc_stpcpy is supposed to do? What does the 'p' stand?
What clc_strtok_r is supposed to do? What does the '_r' mean?
What clc_strlcpy is supposed to do? What does the 'l' mean?
These are as many questions the new user asks at the first sight
of these functions...
Even for the standard strspn, I still don't know what spn stands for...
Clearer (and unforgetable) names
(with prefix clc_str_. Thinking of it, it would be better
to reserve the long prefix clc_string_ for a possible future
ADT for strings that expand/shrink automatically...)
from_unsigned_long()
to_uppercase()
trim_left()
trim_right()
trim() or trim_both()
split()
normalize()
reverse()
compare_case_insensitive()
etc.
( I would merge clc_stpcpy() and clc_strdup() to the single function
clc_str_copy() with a possibly NULL extra output arg. )
|
|
From: Hallvard B F. <h.b...@us...> - 2003-04-03 19:06:08
|
I still think this should go to comp.lang.c, but anyway:
regis writes:
>Hallvard B Furuseth wrote:
>>regis writes:
>>>> so that is a vote for ultostr?
>>>
>>> Mhh Yes,
>>> except that the mangled names of all the string functions suck.
>>
>> Say it on c.l.c. With suggestions for other names. Then I'll probably
>> speak up against you, I think most names are fine:-)
>
> - Given a function name, It's hard to guess what it is supposed to do.
That's true, but unfortunately typical for C.
> - Given a wanted feature by the user, it is hard for him to guess what
> is the corresponding function.
The documentation should point the user in the right direction. See
cvs/doc/string/string.txt. Maybe that should be extended to include
brief (hopefully one-line) summaries for each function.
> - Given a function already used by the user, it is hard to remember it
> for the next use.
Not sure I really agree with that, but it's all the same of your first
point anyway.
> What clc_stpcpy is supposed to do? What does the 'p' stand?
> What clc_strtok_r is supposed to do? What does the '_r' mean?
> What clc_strlcpy is supposed to do? What does the 'l' mean?
I think these function names should stay, because they are copied from
already existing functions.
<http://www.linuxcentral.com/linux/man-pages/stpcpy.3.html>
<http://www.courtesan.com/todd/papers/strlcpy.html>
<http://www.mkssoftware.com/docs/man3/strtok_r.3.asp>
(I have no idea what the 'p' is. '_r' is standard for reentrant
versions of non-reentrant functions. Maybe 'l' is 'length'.)
> Even for the standard strspn, I still don't know what spn stands for...
"span"?
> Clearer (and unforgetable) names
> (with prefix clc_str_. Thinking of it, it would be better
> to reserve the long prefix clc_string_ for a possible future
> ADT for strings that expand/shrink automatically...)
>
> from_unsigned_long()
> to_uppercase()
> trim_left()
> trim_right()
> trim() or trim_both()
> split()
> normalize()
> reverse()
> compare_case_insensitive()
> etc.
It might have been nice if Standard C used such names, but it doesn't.
So I think we should at least partly stick to standard C's way of naming
things. Though I do think clc_str<lwr/upr> should be <lower/upper>,
like tolower/toupper().
#define long_name(...) short_name(...) might be an idea, though.
> ( I would merge clc_stpcpy() and clc_strdup() to the single function
> clc_str_copy() with a possibly NULL extra output arg. )
No way. Then it would have to return a pointer to the end of the string
if the argument is NULL and to the beginning if it is non-NULL. That's
_really_ confusing. Even more than you probably were when you
misremembered what clc_stpcpy does:-)
--
Hallvard
|