libclc-developers Mailing List for libclc (Page 8)
Status: Planning
Brought to you by:
augestad
You can subscribe to this list here.
| 2003 |
Jan
|
Feb
(1) |
Mar
(170) |
Apr
(73) |
May
(3) |
Jun
(1) |
Jul
(1) |
Aug
|
Sep
(2) |
Oct
|
Nov
|
Dec
|
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2004 |
Jan
|
Feb
|
Mar
|
Apr
|
May
(1) |
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
|
Dec
|
| 2008 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
|
Jul
|
Aug
|
Sep
|
Oct
|
Nov
(1) |
Dec
|
|
From: <bo...@me...> - 2003-03-20 19:45:08
|
Hallvard B Furuseth wrote: > Bjørn Augestad writes: > >>Hallvard B Furuseth wrote: >> >>>2) what to do if the buffer is not large enough: >>> Silently truncate, or generate a truncated string but return failure? >>> If the latter, should it return NULL or should we change the return >>> type so it returns the length of the string or -1 at truncation? (I >>> think it's ugly to return NULL at failure but still have a meaningful >>> output, but maybe that's just me?) >> >>We can add an assert for the debug version, but let the release version >>invoke UB. That's the error handling we have agreed upon. > > > We have? I didn't see anyone say that. And I don't like that variant. > What I have seen so far is silent truncation, which I also don't like. IMO we did in the "libclc: error handling policy" thread. That was a long thread and lasted for almost 4 weeks, IIRC. The pros and cons of all alternatives were discussed to death. In the wrapup I wrote: > - The standard non-debug build of libclc acts just like the standard C > library, regarding invalid arguments and UB. We will not test the > validity of arguments. > > - We will create a set of macros which replaces the standard assert > macro. The main purpose of these macros is to output better messages > than the standard assert. These messages will explain the problem to the > user of the library in a good way. This will reduce both support > requests and bug reports. > > - We will provide an optional debug version of libclc. Functions in that > version will use our assert macros to test for null pointer arguments > and will perform other assertions if the group finds it necessary. Please stick to these rules. [snip] > >>>5) BTW, I'd like (2 <= base && base < 64): characters '0-9A-Za-z_=' >>> or something. That's because I expect I'd usually use *printf to >>> print normal numbers, but could use clc_ultostr with a large base >>> when I want to generate textual IDs. Base 64 is nice for that, which >>> is why I added "_" and "=" (two characters that hopefully have no >>> special meaning in filesystems and shell commands). >> >>How will you (or other users of libclc) convert it back to unsigned long? > > > I won't. Just like strtol can parse things the library can generate > today. Just a little extra functionality which doesn't cost anything. I have no problems with that, but slightly dislike that it no longer does the complete opposite of strtoul. base64 is also used a lot and I even need it myself, but both ways. If we end up having a base64 module, maybe we can put conversion functions there? -- boa Please join the libclc-developers list at http://lists.sourceforge.net/lists/listinfo/libclc-developers |
|
From: Hallvard B F. <h.b...@us...> - 2003-03-20 19:25:40
|
Jan Engelhardt writes: >> 2) what to do if the buffer is not large enough: > > already said that... like snprintf(buf, 3, "%d", 234567); And as I said, snprintf doesn't _silently_ truncate like your latest clc_ultostr does. I want an error indication. Or a descision not to return an error, but I hope not. See below: >> Silently truncate, or generate a truncated string but return failure? >> If the latter, should it return NULL or should we change the return >> type so it returns the length of the string or -1 at truncation? (I >> think it's ugly to return NULL at failure but still have a meaningful >> output, but maybe that's just me?) >> 5) BTW, I'd like (2 <= base && base < 64): characters '0-9A-Za-z_=' >> or something. That's because I expect I'd usually use *printf to >> print normal numbers, but could use clc_ultostr with a large base >> when I want to generate textual IDs. Base 64 is nice for that, which >> is why I added "_" and "=" (two characters that hopefully have no >> special meaning in filesystems and shell commands). > > No, as we _may_ want to express 254 (decimal) in uh... (0x)FE as > well as (0x)fe. You mean to give a flag for uppercase/lowercase in the output as an argument to the function? Otherwise that is not relevant: The function will always produce uppercase when the base argument <= 36, the lowercase variant will only be produced for base > 36. -- Hallvard |
|
From: Michael B.A. <mb...@io...> - 2003-03-20 19:21:27
|
On Thu, 20 Mar 2003 17:07:06 +0100 Bj=F8rn Augestad <bo...@me...> wrote: > I wrote: > > I'm trying to implement a map, but need some help/advice.=20 >=20 > Everyone seem to focus on a map implemented as a hash, but it aint=20 > necessarily so. >=20 > The sgi link, http://www.sgi.com/tech/stl/table_of_contents.html,=20 > describes 4 different maps: > map > multimap > hash_map > hash_multimap >=20 > These 4 types share some properties, but not all. A map is not=20 > implemented as a hash, which means that the interface problem remains... >=20 > So let me rephrase my question to see if you can forget about the hash=20 > for a while. :-) >=20 > What kind of interface is suitable for (a set of) functions which > - must accept more than one datatype void * is the obvious choice really. I don't see any practical alternative. Requiring the user to setup and use some abstract data container isn't very attractive. > - void* cannot be used > - we don't want to allocate memory That's pretty restrictive. As the map grows where will you store it's data pointers? > - we can't take the address of the argument, e.g. foo(&123); I'm not sure I understand what you mean here. > - variadic functions cannot be used as it requires a format > string No it doesn't. It requires a parameter for which the type is know. It does not have to be a string. Mike --=20 A program should be written to model the concepts of the task it performs rather than the physical world or a process because this maximizes the potential for it to be applied to tasks that are conceptually similar and, more important, to tasks that have not yet been conceived.=20 |
|
From: Hallvard B F. <h.b...@us...> - 2003-03-20 19:20:29
|
Bj=F8rn Augestad writes: >Hallvard B Furuseth wrote: >> >> 2) what to do if the buffer is not large enough: >> Silently truncate, or generate a truncated string but return fail= ure? >> If the latter, should it return NULL or should we change the retu= rn >> type so it returns the length of the string or -1 at truncation? = (I >> think it's ugly to return NULL at failure but still have a meanin= gful >> output, but maybe that's just me?) >=20 > We can add an assert for the debug version, but let the release versi= on=20 > invoke UB. That's the error handling we have agreed upon. We have? I didn't see anyone say that. And I don't like that variant.= What I have seen so far is silent truncation, which I also don't like. >> 3) malloc a buffer to be returned if the input ptr =3D=3D NULL? >> If so, must return ptr and not length in point (2) above. >> Could let size=3D=3D0 mean allocate whatever space is necessary. >=20 > Do we need more Swiss Army Knife functions? No opinion. >> 5) BTW, I'd like (2 <=3D base && base < 64): characters '0-9A-Za-z_=3D= ' >> or something. That's because I expect I'd usually use *printf to= >> print normal numbers, but could use clc_ultostr with a large base= >> when I want to generate textual IDs. Base 64 is nice for that, w= hich >> is why I added "_" and "=3D" (two characters that hopefully have = no >> special meaning in filesystems and shell commands). >=20 > How will you (or other users of libclc) convert it back to unsigned l= ong? I won't. Just like strtol can parse things the library can generate today. Just a little extra functionality which doesn't cost anything. --=20 Hallvard |
|
From: <bo...@me...> - 2003-03-20 19:10:17
|
Hallvard B Furuseth wrote: > Bjørn Augestad writes: > >> char *sp = ptr + size - 1; >> >> clc_assert_not_null(clc_ultostr, ptr); >> clc_assert_arg(clc_ultostr, size > 1); > > > The assignment must be after the asserts: If the arguments are incorrect, > the assignment can crash the program. Fixed. > > Except for that, maybe we (including me, I know) should decide just what > the function should do before posting more code. Here are some points > that have been mentioned, half of them mine:-) > > 1) no padding, space-pad, '0'-pad, or make it an option: > a 'char padding' parameter where '\0' means no padding? No padding. A function should do one thing, and one thing well. A char padding parameter with e.g. the value '1' will also create 'interesting' output. > > 2) what to do if the buffer is not large enough: > Silently truncate, or generate a truncated string but return failure? > If the latter, should it return NULL or should we change the return > type so it returns the length of the string or -1 at truncation? (I > think it's ugly to return NULL at failure but still have a meaningful > output, but maybe that's just me?) We can add an assert for the debug version, but let the release version invoke UB. That's the error handling we have agreed upon. > > 3) malloc a buffer to be returned if the input ptr == NULL? > If so, must return ptr and not length in point (2) above. > Could let size==0 mean allocate whatever space is necessary. Do we need more Swiss Army Knife functions? [snip] > > 5) BTW, I'd like (2 <= base && base < 64): characters '0-9A-Za-z_=' > or something. That's because I expect I'd usually use *printf to > print normal numbers, but could use clc_ultostr with a large base > when I want to generate textual IDs. Base 64 is nice for that, which > is why I added "_" and "=" (two characters that hopefully have no > special meaning in filesystems and shell commands). How will you (or other users of libclc) convert it back to unsigned long? Just my 0.02 NOK... -- boa Please join the libclc-developers list at http://lists.sourceforge.net/lists/listinfo/libclc-developers |
|
From: Jan E. <je...@li...> - 2003-03-20 19:06:15
|
>> char *sp = ptr + size - 1; >> >> clc_assert_not_null(clc_ultostr, ptr); >> clc_assert_arg(clc_ultostr, size > 1); > >The assignment must be after the asserts: If the arguments are incorrect, >the assignment can crash the program. > >Except for that, maybe we (including me, I know) should decide just what >the function should do before posting more code. Here are some points >that have been mentioned, half of them mine:-) > >2) what to do if the buffer is not large enough: already said that... like snprintf(buf, 3, "%d", 234567); > Silently truncate, or generate a truncated string but return failure? > If the latter, should it return NULL or should we change the return > type so it returns the length of the string or -1 at truncation? (I > think it's ugly to return NULL at failure but still have a meaningful > output, but maybe that's just me?) > >3) malloc a buffer to be returned if the input ptr == NULL? > If so, must return ptr and not length in point (2) above. > Could let size==0 mean allocate whatever space is necessary. > >4) errno: > Base is invalid: ERANGE. > Truncation: EDOM? CLC_ENOSPC? CLC_ENOROOM? CLC_ETOOLARGE? > Malloc failed, if we use malloc: CLC_ENOMEM? > > (CLC_ENOSPC doesn't need to mean no _file_ space even though ENOSPC > does.) > >5) BTW, I'd like (2 <= base && base < 64): characters '0-9A-Za-z_=' > or something. That's because I expect I'd usually use *printf to > print normal numbers, but could use clc_ultostr with a large base > when I want to generate textual IDs. Base 64 is nice for that, which > is why I added "_" and "=" (two characters that hopefully have no > special meaning in filesystems and shell commands). No, as we _may_ want to express 254 (decimal) in uh... (0x)FE as well as (0x)fe. To get around this, there is something in my mind: Opening up my doom2.wad ;) I see that they counted frames from A-Z and then if that wasnot enough, continue which 'Z' + 1 (I guess [ then \ and then ]) but that's ugly. Stick to base 36 max. - Jan Engelhardt |
|
From: Hallvard B F. <h.b...@us...> - 2003-03-20 18:46:06
|
I wrote: > 4) errno: > Base is invalid: ERANGE. Or assertion failure, I missed that latest version had reverted to that. -- Hallvard |
|
From: Hallvard B F. <h.b...@us...> - 2003-03-20 18:41:17
|
Bj=F8rn Augestad writes: > char *sp =3D ptr + size - 1; > > clc_assert_not_null(clc_ultostr, ptr); > clc_assert_arg(clc_ultostr, size > 1); The assignment must be after the asserts: If the arguments are incorrec= t, the assignment can crash the program. Except for that, maybe we (including me, I know) should decide just wha= t the function should do before posting more code. Here are some points that have been mentioned, half of them mine:-) 1) no padding, space-pad, '0'-pad, or make it an option: a 'char padding' parameter where '\0' means no padding? 2) what to do if the buffer is not large enough: Silently truncate, or generate a truncated string but return failure= ? If the latter, should it return NULL or should we change the return type so it returns the length of the string or -1 at truncation? (I= think it's ugly to return NULL at failure but still have a meaningfu= l output, but maybe that's just me?) 3) malloc a buffer to be returned if the input ptr =3D=3D NULL? If so, must return ptr and not length in point (2) above. Could let size=3D=3D0 mean allocate whatever space is necessary. 4) errno: Base is invalid: ERANGE. Truncation: EDOM? CLC_ENOSPC? CLC_ENOROOM? CLC_ETOOLARGE? Malloc failed, if we use malloc: CLC_ENOMEM? (CLC_ENOSPC doesn't need to mean no _file_ space even though ENOSPC does.) 5) BTW, I'd like (2 <=3D base && base < 64): characters '0-9A-Za-z_=3D'= or something. That's because I expect I'd usually use *printf to print normal numbers, but could use clc_ultostr with a large base when I want to generate textual IDs. Base 64 is nice for that, whic= h is why I added "_" and "=3D" (two characters that hopefully have no special meaning in filesystems and shell commands). --=20 Hallvard |
|
From: <bo...@me...> - 2003-03-20 17:18:33
|
Hallvard B Furuseth wrote:
>
>>How about space padding (see above) it? Oh well, memmove if you like
>>or apply any performance changes, to move the output string to the left.
>>(If it's that what was intended.)
>
>
> I still prefer no padding. Waiting to see what others say.
Padding is not a good idea. Consider this example:
unsigned long ul = 1;
char buf[10000000];
clc_ultostr(buf, sizeof buf, ul, 10);
printf("%s", buf);
Use sprintf() for padding, but clc_ultostr() should IMO be used to
convert an integer to a string, not convert *and* format.
Other opinions?
--
boa
Please join the libclc-developers list
at http://lists.sourceforge.net/lists/listinfo/libclc-developers
|
|
From: <bo...@me...> - 2003-03-20 17:14:04
|
Sorry for the odd reply, email problems.
Jan Engelhardt wrote:
> #include <errno.h>
> #include <stdio.h>
>
> char *clc_ultostr(char *ptr, size_t size, unsigned long num, int base) {
> char *sym = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ",
> *sp = ptr + size - 1;
> clc_assert_not_null(clc_ultostr, ptr);
> clc_assert_arg(clc_ultostr, size > 1);
> clc_assert_arg(clc_ultostr, base < 2 || base > 36);
> *sp = '\0';
> while(size-- > 0 && num > 0) {
> *--sp = sym[num % base];
> num /= base;
> }
> while(--size > 0) { *--sp = ' '; }
> return ptr;
> }
>
> //==[ End of file ]====================================================
>
>
> Bj/orn: YOU are _now_ responsible for the indent style! :P
Thanks. :-)
>
> I bet there is still errors in there...any?
Hmm, let's see...
1. The assert(base < 2 || base > 36) is incorrect.
Should have been >= 2 && base <= 36
2. stddef.h and clc_assert.h is missing and stdio.h and errno.h
is not needed
3. nit pick. char* sym can be const char* sym.
Always a few lints :-) Here's my version.
#include <stddef.h>
#include "clc_assert.h"
#include "clc_string.h"
char *clc_ultostr(char *ptr, size_t size, unsigned long num, int base)
{
const char *sym = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
char *sp = ptr + size - 1;
clc_assert_not_null(clc_ultostr, ptr);
clc_assert_arg(clc_ultostr, size > 1);
clc_assert_arg(clc_ultostr, base >= 2 && base <= 36);
*sp = '\0';
while (size-- > 0 && num > 0) {
*--sp = sym[num % base];
num /= base;
}
while (--size > 0) {
*--sp = ' ';
}
return ptr;
}
--
boa
Please join the libclc-developers list
at http://lists.sourceforge.net/lists/listinfo/libclc-developers
|
|
From: <bo...@me...> - 2003-03-20 16:06:40
|
I wrote: > I'm trying to implement a map, but need some help/advice. Everyone seem to focus on a map implemented as a hash, but it aint necessarily so. The sgi link, http://www.sgi.com/tech/stl/table_of_contents.html, describes 4 different maps: map multimap hash_map hash_multimap These 4 types share some properties, but not all. A map is not implemented as a hash, which means that the interface problem remains... So let me rephrase my question to see if you can forget about the hash for a while. :-) What kind of interface is suitable for (a set of) functions which - must accept more than one datatype - void* cannot be used - we don't want to allocate memory - we can't take the address of the argument, e.g. foo(&123); - variadic functions cannot be used as it requires a format string - It has to be a reusable concept ? BTW, I haven't received mail from this list the last couple of days. No idea why, but don't expect rapid responses right now :-( -- boa Please join the libclc-developers list at http://lists.sourceforge.net/lists/listinfo/libclc-developers |
|
From: Bertrand M. T. <ber...@en...> - 2003-03-19 22:14:47
|
This is to let you guys know that I have published the first draft of the documentation policy on the projects main page (http://libclc.sourceforge.net) under the development section. Please review and let me know your opinion. Bertrand |
|
From: Michael B.A. <mb...@io...> - 2003-03-19 21:46:00
|
On Wed, 19 Mar 2003 21:28:07 +0100 Thomas Stegen <ts...@ci...> wrote: > Michael B.Allen wrote: > > > > It's not entirely obvious to me why the comparator is really necessay. If > > the user supplies a hash function you can just compare hashes yes? Yes, > > there could be a collision but with a suitible hash function the chances > > of that are roughly 1 in ULONG_MAX which on my machine appears to be 1 > > in 4,294,967,295. > > That is a small chance. But can you make a hash function which produces > hashes over the whole range? I don't think you can. And if a collision That's why you need a suitable hash function for the data in question. > does happen it is going to be a rather subtle bug I would imagine. And True. > if the table has less than in 4,294,967,295 buckets then the chance of > collision increases. And equality can mean different things to different Well I don't mean bucket collisions. You can have more than one datum in a bucket. The only way to loose a datum is if the hashes match exactly and not just hash to the same bucket. So provided the hashing function gave you a remotely even distribution the chances of loosing a datum is really 1 in ULONG_MAX. Also I think as the hashtable gets *smaller* the chances of such a collision are less because the divide by size crontibutes to the hashing. > people. Same value, same object, matching only part of a string. Again a suitable hash function is required. You might provide a generic string hashing function though. Incedentally here's a nice page with a bit on hashing: http://burtleburtle.net/bob/ > In short, I think a compare function will improve the map/hashtable > and make it useful in more situations. I don't see how a comparator would "make it more useful in more situations" but it would eliminate the miniscule uncertaintly I described. I suppose you're right though. As slim as it is that there would be a collision resulting in a datum being removed I think it would be technically incorrect not to have it. Mike -- A program should be written to model the concepts of the task it performs rather than the physical world or a process because this maximizes the potential for it to be applied to tasks that are conceptually similar and, more important, to tasks that have not yet been conceived. |
|
From: Thomas S. <ts...@ci...> - 2003-03-19 21:30:42
|
Michael B.Allen wrote: > > It's not entirely obvious to me why the comparator is really necessay. If > the user supplies a hash function you can just compare hashes yes? Yes, > there could be a collision but with a suitible hash function the chances > of that are roughly 1 in ULONG_MAX which on my machine appears to be 1 > in 4,294,967,295. That is a small chance. But can you make a hash function which produces hashes over the whole range? I don't think you can. And if a collision does happen it is going to be a rather subtle bug I would imagine. And if the table has less than in 4,294,967,295 buckets then the chance of collision increases. And equality can mean different things to different people. Same value, same object, matching only part of a string. In short, I think a compare function will improve the map/hashtable and make it useful in more situations. -- Thomas. |
|
From: Michael B.A. <mb...@io...> - 2003-03-19 21:18:16
|
On Wed, 19 Mar 2003 20:43:23 +0100 Thomas Stegen <ts...@ci...> wrote: > > > > void* won't work with structs. Or anything else. The compiler can add padding > > bits - we can't know what's in them without using a proper comparator > > function. > > If it is possible to customize datastructures properly (without > recompilation) then the datastructure can easily be totally oblivious to > what the void* pointers really point to. I can see using void* for both > values and keys (and it does work). That means you can map anything to > anything. Proper customization means supplying a compare function among > other things. It's not entirely obvious to me why the comparator is really necessay. If the user supplies a hash function you can just compare hashes yes? Yes, there could be a collision but with a suitible hash function the chances of that are roughly 1 in ULONG_MAX which on my machine appears to be 1 in 4,294,967,295. Mike -- A program should be written to model the concepts of the task it performs rather than the physical world or a process because this maximizes the potential for it to be applied to tasks that are conceptually similar and, more important, to tasks that have not yet been conceived. |
|
From: Thomas S. <ts...@ci...> - 2003-03-19 21:13:36
|
Bryan Donlan wrote:
>>If it is possible to customize datastructures properly (without
>>recompilation) then the datastructure can easily be totally oblivious to
>>what the void* pointers really point to. I can see using void* for both
>>values and keys (and it does work). That means you can map anything to
>>anything. Proper customization means supplying a compare function among
>>other things.
>
> No it dosen't. You can use void *s but you need a hashing routine and a
> comparator. Example:
> struct key {
> char x;
> int y;
> };
>
> If int requires alignment, and sizeof(char) != sizeof(int), there'll be
> some padding bytes. memcmp() will not correctly compare the structure in
> this case. This applies to any other type;
Except for unsigned char. But this sounds to much like c.l.c ;)
> implementations are free to insert an
> arbitrary number of padding bits which can be set to anything whatsoever.
I am aware, but it is not an issue if things are done properly.
I have in fact done this myself. And I know the code will work anywhere
(unless there is a bug I haven't found). The hashtable itself does not
need do any copying whatsoever if it is possible to properly customize
it. The hashtable can be completely oblivious as to what the void* is
pointing to.
--
Thomas.
Re: Fwd: Re: [Libclc-developers] Introducing the double linked list
interface
|
|
From: Michael B.A. <mb...@io...> - 2003-03-19 21:09:31
|
On Wed, 19 Mar 2003 20:36:57 +0100 Thomas Stegen <ts...@ci...> wrote: > > If someone really needs to put null pointers into a hashtable > they can always wrap the pointers up in structs. Agreed. -- A program should be written to model the concepts of the task it performs rather than the physical world or a process because this maximizes the potential for it to be applied to tasks that are conceptually similar and, more important, to tasks that have not yet been conceived. |
|
From: Bryan D. <bd...@bd...> - 2003-03-19 20:50:18
|
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On Wednesday 19 March 2003 02:43 pm, Thomas Stegen wrote:
> Bryan Donlan (by way of Bryan Donlan ) wrote:
> >>>>void* is OK for data, I guess. The hard part will be to find a *good*
> >>>>way to represent keys. Does anyone have a good concept for keys?
> >>>
> >>>I'm not sure what you mean. What is a key? How about void *.
> >>
> >>void* is OK, but we need length for some data types. We may even want to
> >>know the data type itself, for optimization purposes[1]. It would also
> >>be nice to avoid lots of casts...
> >
> > void* won't work with structs. Or anything else. The compiler can add
> > padding bits - we can't know what's in them without using a proper
> > comparator function.
>
> If it is possible to customize datastructures properly (without
> recompilation) then the datastructure can easily be totally oblivious to
> what the void* pointers really point to. I can see using void* for both
> values and keys (and it does work). That means you can map anything to
> anything. Proper customization means supplying a compare function among
> other things.
No it dosen't. You can use void *s but you need a hashing routine and a
comparator. Example:
struct key {
char x;
int y;
};
If int requires alignment, and sizeof(char) != sizeof(int), there'll be some
padding bytes. memcmp() will not correctly compare the structure in this
case. This applies to any other type; implementations are free to insert an
arbitrary number of padding bits which can be set to anything whatsoever.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (GNU/Linux)
iD8DBQE+eNf3x533NjVSos4RAhURAJ4tkGfVmRZJs/ZckrpV4spv6mmUdgCeLEQ/
LUmPJifSsH1pedMNj6AR0Zc=
=8Xp5
-----END PGP SIGNATURE-----
|
|
From: Thomas S. <ts...@ci...> - 2003-03-19 20:45:59
|
Bryan Donlan (by way of Bryan Donlan ) wrote: >> >>>>void* is OK for data, I guess. The hard part will be to find a *good* >>>>way to represent keys. Does anyone have a good concept for keys? >>> >>>I'm not sure what you mean. What is a key? How about void *. >> >>void* is OK, but we need length for some data types. We may even want to >>know the data type itself, for optimization purposes[1]. It would also >>be nice to avoid lots of casts... > > void* won't work with structs. Or anything else. The compiler can add padding > bits - we can't know what's in them without using a proper comparator > function. If it is possible to customize datastructures properly (without recompilation) then the datastructure can easily be totally oblivious to what the void* pointers really point to. I can see using void* for both values and keys (and it does work). That means you can map anything to anything. Proper customization means supplying a compare function among other things. -- Thomas. |
|
From: Thomas S. <ts...@ci...> - 2003-03-19 20:39:40
|
Michael B.Allen wrote: > > That's all you need really. At the very least you don't really need > signed keys where the unsigned functions work fine. > > Is NULL a permitted data value? Can anything go wrong in the get function > that would cause an error? If so how do you indicate to the caller that > an error occured? What I do in my hashtable is that the user have to provide functions which either copies or uses the value being inserted (giving the user the choice). The same goes for keys. I haven't tested my hashtable to that effect, but I think that having keys or values which is NULL shouldn't be a problem as long the functions provided by the user knows how to handle them. At least 4 functions are needed to properly customize a hashtable I think, copy the key, copy the value, compare keys and hashing the keys. The copy functions does not need to actually copy, but can return the original elements. Also functions for freeing keys and values are convenient. If you allow NULL as a value then you do have a problem in the get function. Easiest solution would be to set some flag in the hashtable if there was no element mapped to the key. But I don't think that is nice. If someone really needs to put null pointers into a hashtable they can always wrap the pointers up in structs. -- Thomas. |
|
From: Bryan D. <bd...@bd...> - 2003-03-19 20:24:29
|
=2D----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Wednesday 19 March 2003 10:35 am, Bj=F8rn Augestad wrote: > Michael B.Allen wrote: > > On Wed, 19 Mar 2003 06:59:32 +0100 > > > > Bj=F8rn Augestad <bo...@me...> wrote: > >># Sequences > >> 1. vector > >> 2. deque > >> 3. list > >> 4. slist > >> 5. bit_vector > >> > >># Associative Containers > >> 1. set > >> 2. map > >> 3. multiset > >> 4. multimap > >> 5. hash_set > >> 6. hash_map > >> 7. hash_multiset > >> 8. hash_multimap > >> 9. hash > >> > >>Need more containers? :-) > > > > I think we should start with something simple, perfect it, debate it, > > and let it stew a little. Then we can move on and think about > > transcendent ideas that facilitate exotic functionality. The linked list > > is a good start. > > Agree, but we need to agree on a few things and the STL works well as a > specification. STL gives us concepts, naming conventions and semantics > for e.g. iterators and e.g. push_back() and push_front(). So instead of > discussing these issues to death, why not adapt what's usable from STL? > (and then discuss it to death :-)) > > >>void* is OK for data, I guess. The hard part will be to find a *good* > >>way to represent keys. Does anyone have a good concept for keys? > > > > I'm not sure what you mean. What is a key? How about void *. > > void* is OK, but we need length for some data types. We may even want to > know the data type itself, for optimization purposes[1]. It would also > be nice to avoid lots of casts... void* won't work with structs. Or anything else. The compiler can add paddi= ng bits - we can't know what's in them without using a proper comparator function. =2D----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQE+eNH5x533NjVSos4RAhJGAJ9dB9+kgJBzcICniDcAmQ6WbkiO1ACeIOdN 8S3tVZVGHhXe4Z0ZyxXutO0=3D =3D+I0F =2D----END PGP SIGNATURE----- |
|
From: Michael B.A. <mb...@io...> - 2003-03-19 20:17:31
|
On Wed, 19 Mar 2003 17:32:51 +0100
Bj=F8rn Augestad <bo...@me...> wrote:
> I'm trying to implement a map, but need some help/advice. Here's the=20
> problem.
>=20
> 1) The map should handle keys of different datatypes.
> 2) The C99 standard, in =A76.3.2.3 Pointers, paragraph 5 and 6, describe=
=20
> integer->void*->integer conversion rules. IMHO it rules out the use of=20
> void* as a general key parameter, which means that we cannot have just=20
> one clc_map_set(clc_map m, void* key, void* data). Is my interpretation=20
> correct?
It's a rare thing to need to use an integer (or real) as a hash key
directly. Usually the map implementation provides a mechanism for the user
to supply their own hash function in which case if they want to hash by
integer there is usually an opportunity to pass the struct to which it
is a member. So the user supplied hash function might just look like:
unsigned long
hash_foo(void *ptr)
{
struct foo *foo =3D ptr;
return foo->theint * LARGE_PRIME_NUMBER;
}
> 3) I ended up with this interface for set/get functions and need=20
> opinions. Please let me know how you all feel about it. Thanks in advance.
>=20
> int clc_map_set(clc_map m, const char* key, void* data);
<snip>
> void* clc_map_get(clc_map m, const char* key);
<snip>
That's all you need really. At the very least you don't really need
signed keys where the unsigned functions work fine.
Is NULL a permitted data value? Can anything go wrong in the get function
that would cause an error? If so how do you indicate to the caller that
an error occured?
Mike
--=20
A program should be written to model the concepts of the task it
performs rather than the physical world or a process because this
maximizes the potential for it to be applied to tasks that are
conceptually similar and, more important, to tasks that have not
yet been conceived.=20
|
|
From: Hallvard B F. <h.b...@us...> - 2003-03-19 20:07:58
|
Bj=F8rn Augestad writes: > Me too. I'm currently thinking about something like: > clc_anytype clc_int2any(int); > int clc_any2int(clc_anytype); I'm not sure I understand what these functions do, but anyway: > Do you know of an alterative way? The normal way is for the user to store these in the hash table: - key size, - hash function (of a key argument), - compare function to compare two keys. That way we won't need a general type definition description to walk through. > Fine, but that doesn't solve the interface problem dealing with keys = of=20 > different types. No, not if we want an interface for common types. It might be enough of an interface to supply the functions mentioned above for a few common types, though. --=20 Hallvard |
|
From: Bryan D. <bd...@bd...> - 2003-03-19 20:03:05
|
=2D----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 =2D -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Wednesday 19 March 2003 12:59 am, Bj=F8rn Augestad wrote: > Bryan Donlan wrote: > > What about maps/dictionaries/whatever they're called? It should have > > a hashing > > > function on the items... > > Here's a list of the container classes in STL. That should be sufficient > for most needs and is also a good specification to work from. [snip] > Need more containers? :-) > > void* is OK for data, I guess. The hard part will be to find a *good* > way to represent keys. Does anyone have a good concept for keys? > [snip] A key type struct, with size and a comparator function. Pass data as a stru= ct of void* to the data and a pointer to a key type struct. =2D -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQE+eMwyx533NjVSos4RAioqAKDBAaAvuR3elWuD1X0ORKLDezt38wCglFnh W22qLf0QC2O8VVplWBoWO1g=3D =3DhY8Y =2D -----END PGP SIGNATURE----- =2D----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQE+eMzvx533NjVSos4RAos0AJ9B0YvHPEJqTZM19U4F8/qY+aDJIQCgxjwF =46bI8C9HyQ/wA+88cJFxTJds=3D =3DNqQT =2D----END PGP SIGNATURE----- |
|
From: <bo...@me...> - 2003-03-19 19:53:12
|
Hallvard B Furuseth wrote:
> Bjørn Augestad writes:
>
>>I'm trying to implement a map (...)
>>
>>int clc_map_set(clc_map m, const char* key, void* data);
>>int clc_map_iset(clc_map m, int key, void* data);
>>(...)
>>void* clc_map_get(clc_map m, const char* key);
>>(...)
>
>
> I think this is pretty backwards.
Me too. I'm currently thinking about something like:
clc_anytype clc_int2any(int);
int clc_any2int(clc_anytype);
with functions for each datatype, of course.
Then the set/get protos for a map could be
int clc_map_set(clc_map m, clc_anytype key, clc_anytype value);
clc_anytype clc_map_get(clc_map m, clc_anytype key);
and a call: clc_map_set(clc_int2any(123), clc_void2any(foo));
Implemented as a struct containing typeinfo and a union to store stuff.
I really don't want such a solution, as it isn't very idiomatic. May be
nice for private stuff, but IMHO not for libclc.
Do you know of an alterative way?
Let's pick a good hash table
> implementation first (which implementation is best is seems like a good
> thing to ask about on comp.lang.c), and you can implement maps on top of
> that if you want to.
Fine, but that doesn't solve the interface problem dealing with keys of
different types. BTW, the protos I posted was just for setting and
getting values using keys, other protos exist. :-)
[snip]
--
boa
Please join the libclc-developers list
at http://lists.sourceforge.net/lists/listinfo/libclc-developers
|