Thread: [Libclc-developers] contribution(?): string function ultostr()
Status: Planning
|
From: Jan E. <je...@li...> - 2003-03-17 16:16:35
|
Hi,
copied this from my personal lib as I thought
this might be worth sharing, no?
/*=============================================================================
ultostr
by Jan "Hirogen2" Engelhardt <hirogen2 at gmx de>, 2003
-- distributed under the Frontier Artistic License and GNU General Public
-- License. See doc/FAL1.txt and doc/GPL2.txt for details.
-------------------------------------------------------------------------------
NAME
ultostr - convert an unsigned long integer to a string
SYNOPSIS
unsigned char *ultostr(unsigned long num, unsigned long base,
unsigned char *ptr, size_t size);
DESCRIPTION
The ultostr() function converts the number NUM to a string, in
base BASE notation. BASE must be between 2 and 36 inclusive. To
summarize it, this function does (nearly) exactly the opposite
of strtoul().
The output is written to the location pointed to by PTR, whose
length is passed in SIZE. SIZE is the number of bytes, including
space for a trailing '\0'. No more bytes than (SIZE - 1) are
written. The output is null-padded in front.
RETURN VALUE
ultostr() returns PTR on success, or NULL on error. ERANGE is
returned if BASE is < 2 or > 36. EFAULT is returned if PTR is
NULL.
EXAMPLE
unsigned char buf[12];
memset(buf, 0, 12);
printf("%s\n", ultostr(170, 2, 12));
Prints 00010101010 (11 chars).
=============================================================================*/
#include <errno.h>
#include <stdio.h>
unsigned char *ultostr(unsigned long num, unsigned long base,
unsigned char *ptr, size_t size) {
unsigned char *sym = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ",
*startp = ptr + size - 2;
if(base < 2 || base > 36) { errno = ERANGE; return NULL; }
if(ptr == NULL) { errno = EFAULT; return NULL; }
while(--size > 0) {
*ptr = sym[num % base];
num /= base;
--ptr;
}
return ptr;
}
//==[ End of file ]============================================================
- Jan Engelhardt
|
|
From: <bo...@me...> - 2003-03-17 16:46:51
|
Jan Engelhardt wrote:
> Hi,
>
> copied this from my personal lib as I thought
> this might be worth sharing, no?
Thanks for the code, Jan. Here are my comments.
>
> /*=============================================================================
> ultostr
> by Jan "Hirogen2" Engelhardt <hirogen2 at gmx de>, 2003
> -- distributed under the Frontier Artistic License and GNU General Public
> -- License. See doc/FAL1.txt and doc/GPL2.txt for details.
The license must be changed to BSD.
> -------------------------------------------------------------------------------
> NAME
> ultostr - convert an unsigned long integer to a string
The name must be prefixed with clc_ .
>
> SYNOPSIS
> unsigned char *ultostr(unsigned long num, unsigned long base,
> unsigned char *ptr, size_t size);
unsigned long base is pretty long for the range 2..36 :-) strtoul uses
int for base.
Unsigned char* are very uncommon in C libraries, and its buddy function
strtoul uses char*.
>
> DESCRIPTION
> The ultostr() function converts the number NUM to a string, in
> base BASE notation. BASE must be between 2 and 36 inclusive. To
> summarize it, this function does (nearly) exactly the opposite
> of strtoul().
>
> The output is written to the location pointed to by PTR, whose
> length is passed in SIZE. SIZE is the number of bytes, including
> space for a trailing '\0'. No more bytes than (SIZE - 1) are
> written. The output is null-padded in front.
>
> RETURN VALUE
> ultostr() returns PTR on success, or NULL on error. ERANGE is
> returned if BASE is < 2 or > 36. EFAULT is returned if PTR is
> NULL.
I assume that you mean that errno is set to ERANGE? I guess we won't
have to test for legal base. strtoul sets errno to EINVAL if base is out
of range according to my man page. That page also says that C99 does
*not* set errno if base is out of range, so we shouldn't do it either?
EFAULT does not exist in ANSI C and the libclc doesn't require that we
test for NULL args in release versions. The debug version should have a
clc_assert_not_null(function_name, ptr);
It should also have a
clc_assert_arg(function_name, base >= 2 && base <= 36);
>
> EXAMPLE
> unsigned char buf[12];
> memset(buf, 0, 12);
> printf("%s\n", ultostr(170, 2, 12));
>
> Prints 00010101010 (11 chars).
> =============================================================================*/
> #include <errno.h>
> #include <stdio.h>
>
> unsigned char *ultostr(unsigned long num, unsigned long base,
> unsigned char *ptr, size_t size) {
> unsigned char *sym = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ",
> *startp = ptr + size - 2;
> if(base < 2 || base > 36) { errno = ERANGE; return NULL; }
> if(ptr == NULL) { errno = EFAULT; return NULL; }
> while(--size > 0) {
> *ptr = sym[num % base];
> num /= base;
> --ptr;
> }
> return ptr;
> }
>
/* Proper libclc format guidelines (unpublished :-)) applied */
unsigned char *ultostr(
    unsigned long num,
    unsigned long base,
    unsigned char *ptr,
    size_t size)
{
    unsigned char *sym = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ",

    clc_assert_not_null(ultostr, ptr);
    clc_assert_arg(ultostr, base >= 2 && base <= 36);

    *startp = ptr + size - 2;
    if(base < 2 || base > 36) {
        errno = ERANGE;
        return NULL;
    }

    while(--size > 0) {
        *ptr = sym[num % base];
        num /= base;
        --ptr;
    }

    return ptr;
}
I fail to see where startp is defined.
Very good documentation!
I think we need a function like this, especially if it converts numbers
to bit patterns. Lots of people want to convert integers to bit patterns.
If we add this function, should we add ltostr, lltostr and ulltostr as well?
--
boa
Please join the libclc-developers list
at http://lists.sourceforge.net/lists/listinfo/libclc-developers
|
|
From: Jan E. <je...@li...> - 2003-03-17 16:56:13
|
>> /*=============================================================================
>> ultostr
>> by Jan "Hirogen2" Engelhardt <hirogen2 at gmx de>, 2003
>> -- distributed under the Frontier Artistic License and GNU General Public
>> -- License. See doc/FAL1.txt and doc/GPL2.txt for details.
>
>The license must be changed to BSD.
What's the thing about BSD?
>> NAME
>> ultostr - convert an unsigned long integer to a string
>The name must be prefixed with clc_ .
Was just copied straight from my lib, so there is nothing clc-related
in it ;)
>> SYNOPSIS
>> unsigned char *ultostr(unsigned long num, unsigned long base,
>> unsigned char *ptr, size_t size);
>unsigned long base is pretty long for the range 2..36 :-) strtoul uses
>int for base.
woho... heh right, that probably should have been unsigned char.
>Unsigned char* are very uncommon in C libraries, and its buddy function
>strtoul uses char*.
And? I like it. It's probably because I cannot feel comfortable when they
assign the 'ö' a value of -10.
>I assume that you mean that errno is set to ERANGE? I guess we won't
Whatever. Take EDOM.
>have to test for legal base. strtoul sets errno to EINVAL if base is out
Oh yeah I know that, but it looks like I swapped EINVAL and EFAULT.
Everywhere I looked.
>/* Proper libclc format guidelines (unpublished :-)) applied */
this is the thing nobody can really agree to, the discussion for such is way
too long.
The major aspects of my style are
>unsigned char *ultostr(
> unsigned long num,
> unsigned long base,
> unsigned char *ptr,
> size_t size)
- one row, or indent-by-1 if longer than 79 chars
>{
- as well as keeping any { on the line
> unsigned char *sym = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ",
>
> clc_assert_not_null(ultostr, ptr);
> clc_assert_arg(ultostr, base >= 2 && base <= 36);
>
> *startp = ptr + size - 2;
> if(base < 2 || base > 36) {
> errno = ERANGE;
> return NULL;
> }
>
> while(--size > 0) {
> *ptr = sym[num % base];
> num /= base;
> --ptr;
> }
>
> return ptr;
>}
>
>I fail to see where startp is defined.
imagine ptr is startp
>Very good documentation!
Indeed stolen from strtoul
- Jan Engelhardt
|
|
From: Hallvard B F. <h.b...@us...> - 2003-03-18 18:48:10
|
Michael B.Allen writes:
>Jan Engelhardt <je...@li...> wrote:
>
>>> *startp = ptr + size - 2;
>
> What would happen if size were 2, 1, or even 0? Those are legitimate
> values.
2 is OK with that code. 0 and 1 don't need to be, we can document
that size must be at least 2 and add clc_assert_arg(ultostr, size>1).
>>> clc_assert_arg(ultostr, base >= 2 && base <= 36);
>>> (...)
>>> if(base < 2 || base > 36) {
>>> errno = ERANGE;
>>> return NULL;
>>> }
I hope this doesn't spawn too big an error-handling thread again, but I
think if the function fails with ERANGE when base is out of range, that
should be documented, well-defined behaviour - so there should not be an
assert which makes such an argument crash the program.
> What about negative values?
Not in an _unsigned_ long.
Some other points:
I think the output argument should be first, followed by the size
argument, similar to fread (and strcpy for the output argument).
You forgot to \0-terminate the output.
I'd prefer to get as few digits as possible returned instead of
'0'-padding the output. Even though this means doing memmove()
at the end. (Or we could return a pointer into the middle of the
string, but that invites bugs in the user's program.)
It should fail if the output buffer is too small.
... fail with ERANGE? Or is this our chance to invent our first CLC_E*
code - CLC_ENOSPC "not enough space"?
#include <string.h>
#include <errno.h>
#include "clc_assert.h"

char *
clc_ultostr(char *ptr, size_t size, unsigned long num, int base)
{
    char *end, *end2;
    const char *sym = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    clc_assert_not_null(clc_ultostr, ptr);
    clc_assert_arg(clc_ultostr, size > 1);
    if(base < 2 || base > 36) { /* or assert 2 <= base && base <= 36 */
        errno = ERANGE;
        return NULL;
    }

    end = end2 = ptr + size;
    *--end = '\0';
    do {
        if (end == ptr) {
            errno = ERANGE;
            return NULL;
        }
        *--end = sym[num % base];
        num /= base;
    } while (num);

    if (end != ptr)
        memmove(ptr, end, end2 - end);
    return ptr;
}
|
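A minimal, self-contained sketch of how the clc_ultostr() posted above might be exercised. The clc_assert_* macros are not shown anywhere in this thread, so plain assert() from <assert.h> stands in for them here; that substitution, and the explicit casts, are assumptions of this sketch rather than libclc code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Same logic as the posted function; assert() is a stand-in for the
   clc_assert_* macros, whose definitions are not part of this thread. */
char *
clc_ultostr(char *ptr, size_t size, unsigned long num, int base)
{
    char *end, *end2;
    const char *sym = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    assert(ptr != NULL);
    assert(size > 1);
    if (base < 2 || base > 36) {
        errno = ERANGE;
        return NULL;
    }

    /* Build the digits backwards from the end of the buffer. */
    end = end2 = ptr + size;
    *--end = '\0';
    do {
        if (end == ptr) {           /* no room left for another digit */
            errno = ERANGE;
            return NULL;
        }
        *--end = sym[num % (unsigned long)base];
        num /= (unsigned long)base;
    } while (num);

    /* Shift the digits and the terminating '\0' to the front. */
    if (end != ptr)
        memmove(ptr, end, (size_t)(end2 - end));
    return ptr;
}
```

With a 12-byte buffer, 170 in base 2 comes back as "10101010" and 255 in base 16 as "FF", with no leading zeros. Asking for 2345 in base 10 with size 3 returns NULL and sets errno to ERANGE, since only two digits fit next to the terminator.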
|
From: Hallvard B F. <h.b...@us...> - 2003-03-19 14:26:45
|
Jan Engelhardt writes:
>>> What would happen if size were 2, 1, or even 0? Those are legitimate
>>> values.
>>
>> 2 is OK with that code. 0 and 1 don't need to be, we can document
>> that size must be at least 2 and add clc_assert_arg(ultostr, size>1).
>
> Since SIZE is the length of the /whole/ (incoming) string including \0 (hehe)
> we need to subtract 2.
Which still makes 2 OK.
>>I think the output argument should be first, followed by the size
>>argument, similar to fread (and strcpy for the output argument).
>
> Play.
Eh? Was that agreement or disagreement?
>> You forgot to \0-terminate the output.
> Did I? Possibly. I thought the user takes care of that:
Oh. I think that's a very bad idea. All other string functions I
know return a \0-terminated string, except strncpy which I think
everyone agrees is an abomination.
>> I'd prefer to get as few digits as possible returned instead of
>> '0'-padding the output. Even though this means doing memmove()
>> at the end. (Or we could return a pointer into the middle of the
>> string, but that invites bugs in the user's program.)
>
> Strip preceding '0's, that is make them ' '. Otherwise, we would
> need to allocate a buffer inside ultostr(), depending on how big
> the output will be.
No, we just need to do memmove like my posted code did. It writes at
the end of the buffer at first, but moves it to the front before
returning. This means that part of the buffer 'behind' the returned
string in the output buffer is modified, but I don't care. If you don't
like that, we can instead generate a reversed string at first (starting
from the beginning of the buffer) and reverse it again before returning.
>> It should fail if the output buffer is too small.
>> ... fail with ERANGE? Or is this our chance to invent our first CLC_E*
>> code - CLC_ENOSPC "not enough space"?
>
> ENOSPC is for writing files.
CLC_ENOROOM?
> Where is the point in choosing between ERANGE and CLC_ERANGE (if such
> exists)... I totally wonder about the clc errno code though.
I see no point in introducing CLC_ERANGE, since the standard specifies
that ERANGE must exist. We only need CLC_ codes for meanings that
the standard does not define.
> BTW: Most mailing-list-readers (at least, mine) does, when I hit
> reply, want to compose a message to the author who wrote it, plus a CC
> to li...@sf.... I would like to send the msgs to li...@sf... only,
If you mean the list should generate 'reply-to: <list>', I disagree.
Then I can't easily send private replies when that's what I want.
> as otherwise I get 2x the same message for no reason. TIA
Log in with [Edit Options] at the bottom of
http://lists.sourceforge.net/lists/listinfo/libclc-developers
with the address you are subscribed as, and turn off
'Receive posts you send to the list'.
--
Hallvard
|
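The reversed-string alternative mentioned above (write the digits from the front of the buffer, least significant first, then reverse them in place) could look roughly like the sketch below. The name ultostr_reversed is hypothetical and not part of libclc, and reporting a NULL pointer or a too-small size through ERANGE is a simplification made only for this example.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical name; not part of libclc. Digits are written from the
   front of the buffer, least significant first, then reversed in place,
   so bytes after the terminating '\0' are never touched. */
char *
ultostr_reversed(char *ptr, size_t size, unsigned long num, int base)
{
    static const char sym[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    size_t len = 0, i, j;

    /* Simplification for this sketch: all bad arguments give ERANGE. */
    if (ptr == NULL || size < 2 || base < 2 || base > 36) {
        errno = ERANGE;
        return NULL;
    }

    do {
        if (len + 1 >= size) {      /* keep one byte for the '\0' */
            errno = ERANGE;
            return NULL;
        }
        ptr[len++] = sym[num % (unsigned long)base];
        num /= (unsigned long)base;
    } while (num != 0);
    ptr[len] = '\0';

    /* Reverse the digits so the most significant one comes first. */
    for (i = 0, j = len - 1; i < j; i++, j--) {
        char tmp = ptr[i];
        ptr[i] = ptr[j];
        ptr[j] = tmp;
    }
    return ptr;
}
```

Compared with the memmove() version, this leaves the tail of the caller's buffer untouched on success, at the cost of a second pass over the digits.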
|
From: Michael B.A. <mb...@io...> - 2003-03-18 07:55:35
|
On Mon, 17 Mar 2003 17:56:05 +0100 (MET)
Jan Engelhardt <je...@li...> wrote:
> >> /*=============================================================================
> >> ultostr
> >> by Jan "Hirogen2" Engelhardt <hirogen2 at gmx de>, 2003
> >> -- distributed under the Frontier Artistic License and GNU General Public
> >> -- License. See doc/FAL1.txt and doc/GPL2.txt for details.
> >
> >The license must be changed to BSD.
> What's the thing about BSD?
http://www.opensource.org/licenses/bsd-license.php
> >> SYNOPSIS
> >> unsigned char *ultostr(unsigned long num, unsigned long base,
> >> unsigned char *ptr, size_t size);
> >unsigned long base is pretty long for the range 2..36 :-) strtoul uses
> >int for base.
> woho... heh right, that probably should have been unsigned char.
If you were very space-conscious that would indeed be the correct
datatype. However, I think I would use strtoul as a guide here and just
use int.
> >Unsigned char* are very uncommon in C libraries, and its buddy function
> >strtoul uses char*.
> And? I like it. It's probably because I cannot feel comfortable when they
> assign the 'ö' a value of -10.
Functions that take char * for historical reasons will take care to
properly convert each character as necessary, taking into consideration
what the locale encoding is. Using unsigned char * is appropriate for
encoding and decoding types of function. However, in this context,
considering we're only expecting ASCII characters, coupled with the fact
that again strtoul should probably be used as a model, I suggest char *.
So just:
char *
clc_ultostr(unsigned long num, int base, char *ptr, size_t size)
{
...
> The major aspects of my style are
>
> >unsigned char *ultostr(
> > unsigned long num,
> > unsigned long base,
> > unsigned char *ptr,
> > size_t size)
> - one row, or indent-by-1 if longer than 79 chars
Sounds good here too.
>
> >{
> - as well as keeping any { on the line
The prevailing technique is to place the bracket on a new line I
believe. Some editors actually key on these (vi for one).
> > unsigned char *sym = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ",
> >
> > clc_assert_not_null(ultostr, ptr);
> > clc_assert_arg(ultostr, base >= 2 && base <= 36);
> >
> > *startp = ptr + size - 2;
What would happen if size were 2, 1, or even 0? Those are legitimate
values.
> > if(base < 2 || base > 36) {
> > errno = ERANGE;
> > return NULL;
> > }
> >
> > while(--size > 0) {
If size were originally 0, size would roll over to a very high number.
> > *ptr = sym[num % base];
> > num /= base;
> > --ptr;
> > }
What about negative values?
Mike
--
A program should be written to model the concepts of the task it
performs rather than the physical world or a process because this
maximizes the potential for it to be applied to tasks that are
conceptually similar and, more important, to tasks that have not
yet been conceived.
|
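The rollover concern about `while(--size > 0)` can be made concrete with a small sketch. Both helper names below are hypothetical and exist only for illustration; the point is that size_t is unsigned, so pre-decrementing 0 wraps around to SIZE_MAX instead of going negative.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the value `--size` yields, as evaluated in the
   loop condition `while(--size > 0)`. */
size_t
after_predecrement(size_t size)
{
    return --size;      /* unsigned arithmetic: 0 wraps to SIZE_MAX */
}

/* Hypothetical helper: a guarded count of digit positions available in a
   buffer of `size` bytes once one byte is reserved for the '\0'. */
size_t
digit_slots(size_t size)
{
    return size > 0 ? size - 1 : 0;
}
```

With size 0 the unguarded loop condition compares SIZE_MAX against 0 and the loop keeps running, whereas the guarded count reports zero slots. A size of 12 gives 11 slots in either form.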
|
From: Jan E. <je...@li...> - 2003-03-19 14:00:25
|
>>>> *startp = ptr + size - 2;
>>
>> What would happen if size were 2, 1, or even 0? Those are legitimate
>> values.
>
>2 is OK with that code. 0 and 1 don't need to be, we can document
>that size must be at least 2 and add clc_assert_arg(ultostr, size>1).
Since SIZE is the length of the /whole/ (incoming) string including \0 (hehe)
we need to subtract 2.
>> What about negative values?
>Not in an _unsigned_ long.
Tell me (you, the one with >>) what output shall be expected if a negative
base is given (and there would be no errors)? Like, tell me, what is 6561
(decimal) in Base -13 notation?
>Some other points:
>
>I think the output argument should be first, followed by the size
>argument, similar to fread (and strcpy for the output argument).
Play.
>You forgot to \0-terminate the output.
Did I? Possibly. I thought the user takes care of that:
char bla[12];
memset(bla,0,12)
ultostr(base, bla, 12)
>I'd prefer to get as few digits as possible returned instead of
>'0'-padding the output. Even though this means doing memmove()
>at the end. (Or we could return a pointer into the middle of the
>string, but that invites bugs in the user's program.)
Strip preceding '0's, that is make them ' '. Otherwise, we would
need to allocate a buffer inside ultostr(), depending on how big
the output will be.
>It should fail if the output buffer is too small.
>... fail with ERANGE? Or is this our chance to invent our first CLC_E*
>code - CLC_ENOSPC "not enough space"?
ENOSPC is for writing files.
Where is the point in choosing between ERANGE and CLC_ERANGE (if such
exists)... I totally wonder about the clc errno code though.
BTW: Most mailing-list-readers (at least, mine) does, when I hit
reply, want to compose a message to the author who wrote it, plus a CC
to li...@sf.... I would like to send the msgs to li...@sf... only,
as otherwise I get 2x the same message for no reason. TIA
|
|
From: Jan E. <je...@li...> - 2003-03-19 14:45:47
|
>>>I think the output argument should be first, followed by the size
>>>argument, similar to fread (and strcpy for the output argument).
>> Play.
>
>Eh? Was that agreement or disagreement?
Play with the arguments :) Eh well, any argument why to do that?
In the worst case, we can do so, if really everyone of clc
wants so :)
Hm currently has it
char *clc_ultostr(unsigned long num, char base,
char *ptr, size_t size);
You mean
size_t fread( void *ptr, size_t size, size_t nmemb, FILE
*stream);
char *clc_ultostr(char *ptr, size_t size, unsigned long num
char base); ??
>>> You forgot to \0-terminate the output.
>> Did I? Possibly. I thought the user takes care of that:
>
>Oh. I think that's a very bad idea. All other string functions I
>know return a \0-terminated string, except strncpy which I think
>everyone agrees is an abomination.
Yeah right... small addition: memcpy
>>> I'd prefer to get as few digits as possible returned instead of
>>> '0'-padding the output. Even though this means doing memmove()
>>> at the end. (Or we could return a pointer into the middle of the
>>> string, but that invites bugs in the user's program.)
>>
>> Strip preceding '0's, that is make them ' '. Otherwise, we would
>> need to allocate a buffer inside ultostr(), depending on how big
>> the output will be.
>
>No, we just need to do memmove like my posted code did. It writes at
>the end of the buffer at first, but moves it to the front before
>returning. This means that part of the buffer 'behind' the returned
>string in the output buffer is modified, but I don't care. If you don't
>like that, we can instead generate a reversed string at first (starting
>from the beginning of the buffer) and reverse it again before returning.
>
>>> It should fail if the output buffer is too small.
>>> ... fail with ERANGE? Or is this our chance to invent our first CLC_E*
>>> code - CLC_ENOSPC "not enough space"?
>> ENOSPC is for writing files.
>CLC_ENOROOM?
I would stick to, push as much into the output buffer as can be.
Even snprintf does so:
#include <stdio.h>
int main(void) {
    char buf[3];
    snprintf(buf, 3, "%d", 2345);
    printf("%s\n", buf); // prints your 23
}
But it is considerable to think about this... maybe we return
the usual thing, plus set errno?
>> Where is the point in choosing between ERANGE and CLC_ERANGE (if such
>> exists)... I totally wonder about the clc errno code though.
>
>I see no point in introducing CLC_ERANGE, since the standard specifies
>that ERANGE must exist. We only need CLC_ codes for meanings that
>the standard does not define.
thought so...
- Jan Engelhardt
|
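On the snprintf() comparison above: a short sketch of how truncation can be detected from the return value. Under C99, snprintf() returns the number of characters that would have been written had the buffer been large enough, so a result greater than or equal to the buffer size signals truncation; some older, pre-C99 implementations returned -1 instead. The helper name format_int_checked is hypothetical, and the sketch assumes a C99-conforming library.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats `value` into `buf` and reports whether the
   output was cut short. Assumes C99 snprintf() semantics; the n < 0 test
   also covers older libraries that return -1 on truncation. */
int
format_int_checked(char *buf, size_t size, int value, int *truncated)
{
    int n = snprintf(buf, size, "%d", value);

    *truncated = (n < 0 || (size_t)n >= size);
    return n;
}
```

For a 3-byte buffer and the value 2345, a C99 library stores "23" and returns 4, so the truncated flag is set. The buffer still holds a valid \0-terminated string, which is the "push as much as fits" behaviour described above, but the caller can tell that digits were lost.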
|
From: Hallvard B F. <h.b...@us...> - 2003-03-19 15:10:13
|
Jan Engelhardt writes:
> Play with the arguments :) Eh well, any argument why to do that?
Like I said, because that's the usual way - and therefore the way I'd
expect, and the way I'd expect users to expect (I expect that's the right
number of expects:-)
> You mean
> size_t fread( void *ptr, size_t size, size_t nmemb, FILE
> *stream);
>
> char *clc_ultostr(char *ptr, size_t size, unsigned long num
> char base); ??
Yes, like I did in the code I sent a few messages back. Except 'int base'.
>>>> You forgot to \0-terminate the output.
>>> Did I? Possibly. I thought the user takes care of that:
>>
>> Oh. I think that's a very bad idea. All other string functions I
>> know return a \0-terminated string, except strncpy which I think
>> everyone agrees is an abomination.
>
> Yeah right... small addition: memcpy
Well, I don't think of that as a string function, even though it is in
string.h. It operates on raw memory, not strings.
>>>> It should fail if the output buffer is too small.
>>>> ... fail with ERANGE? Or is this our chance to invent our first CLC_E*
>>>> code - CLC_ENOSPC "not enough space"?
>>> ENOSPC is for writing files.
>>CLC_ENOROOM?
>
> I would stick to, push as much into the output buffer as can be.
> Even snprintf does so:
Note that snprintf returns failure (-1) on truncation.
> But it is considerable to think about this... maybe we return
> the usual thing, plus set errno?
If so it must also return NULL to indicate that errno is meaningful.
If we do this, it might be prettier to return an integer length of the
result string on success if the function doesn't pad, or just 1 on
success if it does. At least I think that looks better than returning a
pointer vs NULL. I can't quite explain why.
--
Hallvard
|
|
From: Michael B.A. <mb...@io...> - 2003-03-17 21:28:39
|
On Mon, 17 Mar 2003 17:16:23 +0100 (MET)
Jan Engelhardt <je...@li...> wrote:
> Hi,
>
> copied this from my personal lib as I thought
> this might be worth sharing, no?
>
> /*=============================================================================
> ultostr
> by Jan "Hirogen2" Engelhardt <hirogen2 at gmx de>, 2003
> -- distributed under the Frontier Artistic License and GNU General Public
> -- License. See doc/FAL1.txt and doc/GPL2.txt for details.
> -------------------------------------------------------------------------------
> NAME
> ultostr - convert an unsigned long integer to a string
>
> SYNOPSIS
> unsigned char *ultostr(unsigned long num, unsigned long base,
> unsigned char *ptr, size_t size);
>
> DESCRIPTION
> The ultostr() function converts the number NUM to a string, in
> base BASE notation. BASE must be between 2 and 36 inclusive. To
> summarize it, this function does (nearly) exactly the opposite
> of strtoul().
I like it! No point in invoking sprintf just to convert a number like
this.
Mike
--
A program should be written to model the concepts of the task it
performs rather than the physical world or a process because this
maximizes the potential for it to be applied to tasks that are
conceptually similar and, more important, to tasks that have not
yet been conceived.
|