|
From: Sam S. <sd...@gn...> - 2005-04-21 16:23:11
|
Hi Bruno,
I think we are having a problem with I_to_LEbytes/LEbytes_to_I.
(berkeley-db tests fail).
The problems boil down to the following issues:
1. I_to_LEbytes() interface is counterintuitive: the size argument is
the bit size of the object (which can be computed by I_to_LEbytes() -
actually, it must be 1+I_integer_length(obj) for positive obj!),
not the byte size of the buffer (as for LEbytes_to_I()) which means
that the caller has to call memset() or equivalent on the buffer.
This is a simple interface bug and I hope you will fix this soon.
2. It appears that berkeley-db _in some cases_ requires that integers
are encoded like this:
uint32_t value = I_to_uint32(obj);
key->data = my_malloc(sizeof(uint32_t));
key->ulen = key->size = sizeof(uint32_t);
*(uint32_t*)key->data = value;
IIUC, this is CPU byte order dependent and
thus I_to_LEbytes() cannot guarantee this.
Correct?
--
Sam Steingold (http://www.podval.org/~sds) running w2k
<http://www.jihadwatch.org/> <http://www.dhimmi.com/> <http://www.iris.org.il>
<http://www.camera.org> <http://www.honestreporting.com>
Never underestimate the power of stupid people in large groups.
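
Note on point 2 above: a minimal sketch of why the pointer-cast store depends on the CPU, while an explicit little-endian encoding does not. The helper names (put_u32_native, put_u32_le, host_is_little_endian) are hypothetical and are not part of CLISP or Berkeley DB.

```c
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value in the CPU's native byte order. This yields the
   same bytes as the *(uint32_t*)key->data = value; idiom, but memcpy
   avoids alignment and aliasing problems. */
static void put_u32_native(uint32_t value, unsigned char out[4]) {
    memcpy(out, &value, sizeof value);
}

/* Store a 32-bit value least significant byte first, on any CPU. */
static void put_u32_le(uint32_t value, unsigned char out[4]) {
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned char)(value >> (8 * i));
}

/* Returns 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void) {
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

The two encoders produce identical bytes only when host_is_little_endian() returns 1; on a big-endian host the native store comes out reversed.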
|
|
From: Bruno H. <br...@cl...> - 2005-05-12 16:28:00
|
Sam wrote:
> I think we are having a problem with I_to_LEbytes/LEbytes_to_I.
> (berkeley-db tests fail).
> The problems boil down to the following issues:
>
> 1. I_to_LEbytes() interface is counterintuitive: the size argument is
> the bit size of the object (which can be computed by I_to_LEbytes() -
> actually, it must be 1+I_integer_length(obj) for positive obj!),
The I_to_LEbytes() interface is made for the cases where you know in
advance the maximum size of the integers. This is, I assume, the frequent
case when you deal with C or database world. If you want to use this function
with arbitrary size integers, yes you need 1+I_integer_length(obj).
Or, in the case of UI_to_LEbytes(), I_integer_length(obj).
> not the byte size of the buffer (as for LEbytes_to_I()) which means
> that the caller has to call memset() or equivalent on the buffer.
Show me a test case where you pass 8*N as 2nd argument to I_to_LEbytes()
and it does not fill exactly N bytes.
> This is a simple interface bug and I hope you will fix this soon.
I cannot see an interface bug, and you have yet to show that the
implementation has a bug.
> 2. It appears that berkeley-db _in some cases_ requires that integers
> are encoded like this:
>
> uint32_t value = I_to_uint32(obj);
> key->data = my_malloc(sizeof(uint32_t));
> key->ulen = key->size = sizeof(uint32_t);
> *(uint32_t*)key->data = value;
>
> IIUC, this is CPU byte order dependent and
> thus I_to_LEbytes() cannot guarantee this.
> Correct?
Correct. If you need the bytes in some endianness-dependent order, you need
to byte-swap them yourself. For uint64 values, for example, it's interesting
to know whether it's stored as 8 bytes in big-endian order:
7 6 5 4 3 2 1 0
or as 2 words in little-endian order, each being in big-endian order:
3 2 1 0 7 6 5 4
This is knowledge from BDB that you must bring in. Your "in some cases"
citation does not indicate that you have found reliable reference material
about it yet...
Bruno
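
Note: the two uint64 layouts mentioned above can be sketched as follows. This is only an illustration with hypothetical function names; which layout (if either) Berkeley DB expects is not established in this thread.

```c
#include <stdint.h>

/* Layout A: 8 bytes, most significant byte first (7 6 5 4 3 2 1 0). */
static void put_u64_be(uint64_t v, unsigned char out[8]) {
    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char)(v >> (8 * (7 - i)));
}

/* Layout B: two 32-bit words, low word first, each word stored
   most significant byte first (3 2 1 0 7 6 5 4). */
static void put_u64_words_le_bytes_be(uint64_t v, unsigned char out[8]) {
    const uint32_t words[2] = { (uint32_t)v, (uint32_t)(v >> 32) };
    for (int w = 0; w < 2; w++)
        for (int i = 0; i < 4; i++)
            out[4 * w + i] = (unsigned char)(words[w] >> (8 * (3 - i)));
}
```

Feeding both functions the value 0x0706050403020100, whose bytes equal their own significance index, reproduces the two digit rows shown in the message.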
|
|
From: Sam S. <sd...@gn...> - 2005-05-12 17:29:00
|
> * Bruno Haible <oe...@py...> [2005-05-12 18:17:09 +0200]:
>
> Sam wrote:
>> I think we are having a problem with I_to_LEbytes/LEbytes_to_I.
>> (berkeley-db tests fail).
>> The problems boil down to the following issues:
>>
>> 1. I_to_LEbytes() interface is counterintuitive: the size argument is
>> the bit size of the object (which can be computed by I_to_LEbytes() -
>> actually, it must be 1+I_integer_length(obj) for positive obj!),
>
> The I_to_LEbytes() interface is made for the cases where you know in
> advance the maximum size of the integers
no, I have no idea about the maximum size of the integers.
>> not the byte size of the buffer (as for LEbytes_to_I()) which means
>> that the caller has to call memset() or equivalent on the buffer.
>
> Show me a test case where you pass 8*N as 2nd argument to I_to_LEbytes()
> and it does not fill exactly N bytes.
suppose I have fixed size records in the database, e.g., 64 bytes.
suppose I want to write number 7 (3 bits) there.
right now I have to do this:
byte buffer[64];
memset(buffer,0,64);
I_to_LEbytes(fixnum(7),1+I_integer_length(fixnum(7)),buffer,64);
I want to be able to simply do
byte buffer[64];
I_to_LEbytes(fixnum(7),buffer,64);
which is simple and symmetric wrt LEbytes_to_I.
>> This is a simple interface bug and I hope you will fix this soon.
>
> I cannot see an interface bug, and you have yet to show that the
> implementation has a bug.
the normal way to interface with a serializing facility is:
int serialize_foo (foo_t object, int buffer_size, byte* buffer);
int deserialize_foo (foo_t *object, int buffer_size, byte* buffer);
int foo_size (foo_t object);
and serialize_foo() should take care of padding.
so the normal use case is:
foo_t object;
byte buffer[64];
serialize_foo(object,64,buffer);
....
deserialize_foo(&object,64,buffer);
....
or
foo_t object;
int size = foo_size(object);
byte *buffer = alloca(size);
serialize_foo(object,size,buffer);
....
deserialize_foo(&object,size,buffer);
....
the current interface is highly unusual:
1. an extra argument to serialize_foo()
2. user has to do the padding, not serialize_foo()
--
Sam Steingold (http://www.podval.org/~sds) running w2k
<http://www.memri.org/> <http://www.iris.org.il>
<http://ffii.org/> <http://www.honestreporting.com> <http://pmw.org.il/>
The plural of "anecdote" is not "data".
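
Note: a toy sketch of the serialize/deserialize/size triple described above, using int64_t in place of a Lisp integer. The names i64_size, i64_serialize and i64_deserialize are made up for illustration; this is not the CLISP API, only one way the requested padding and error-code behaviour could look.

```c
#include <stddef.h>
#include <stdint.h>

/* Smallest number of bytes that holds v in two's complement. */
static size_t i64_size(int64_t v) {
    const uint64_t u = (uint64_t)v;
    size_t n = 8;
    while (n > 1) {
        unsigned top  = (unsigned)(u >> (8 * (n - 1))) & 0xFFu;
        unsigned next = (unsigned)(u >> (8 * (n - 2))) & 0xFFu;
        unsigned fill = (next & 0x80u) ? 0xFFu : 0x00u;
        if (top != fill)
            break;          /* top byte carries information */
        n--;                /* top byte is pure sign extension */
    }
    return n;
}

/* Write v little-endian into buf[0..buf_size), sign-extending into the
   unused tail so the caller never has to memset. Returns 0 on success,
   -1 if the buffer is too small. */
static int i64_serialize(int64_t v, size_t buf_size, unsigned char *buf) {
    if (buf_size < i64_size(v))
        return -1;
    const uint64_t u = (uint64_t)v;
    const unsigned char fill = (v < 0) ? 0xFF : 0x00;
    for (size_t i = 0; i < buf_size; i++)
        buf[i] = (i < 8) ? (unsigned char)(u >> (8 * i)) : fill;
    return 0;
}

/* Read a little-endian two's complement value back. Bytes past the
   eighth are assumed to be sign extension. The final conversion to
   int64_t assumes a two's complement host. */
static int64_t i64_deserialize(size_t buf_size, const unsigned char *buf) {
    uint64_t u = (buf_size > 0 && (buf[buf_size - 1] & 0x80))
                     ? ~(uint64_t)0 : 0;
    for (size_t i = 0; i < buf_size && i < 8; i++) {
        u &= ~((uint64_t)0xFF << (8 * i));
        u |= (uint64_t)buf[i] << (8 * i);
    }
    return (int64_t)u;
}
```

With this shape, writing 7 into a 64-byte record is a single i64_serialize(7, 64, buffer) call, and a too-small buffer is reported through the return value instead of being a caller error.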
|
|
From: Bruno H. <br...@cl...> - 2005-05-30 12:48:25
|
Sam wrote:
> >> 1. I_to_LEbytes() interface is counterintuitive: the size argument is
> >> the bit size of the object (which can be computed by I_to_LEbytes() -
> >> actually, it must be 1+I_integer_length(obj) for positive obj!),
> >
> > The I_to_LEbytes() interface is made for the cases where you know in
> > advance the maximum size of the integers
>
> no, I have no idea about the maximum size of the integers.
> ...
> I want to be able to simply do
>
> byte buffer[64];
> I_to_LEbytes(fixnum(7),buffer,64);
Where do you know that a buffer of size 64 will be sufficient, if
you don't know about the maximum size of the integers?
> suppose I have fixed size records in the database, e.g., 64 bytes.
> suppose I want to write number 7 (3 bits) there.
> right now I have to do this:
>
> byte buffer[64];
> memset(buffer,0,64);
> I_to_LEbytes(fixnum(7),1+I_integer_length(fixnum(7)),buffer,64);
>
> I want to be able to simply do
>
> byte buffer[64];
> I_to_LEbytes(fixnum(7),buffer,64);
The current interface allows you to do this through
byte buffer[64];
I_to_LEbytes(fixnum(7),8*64,buffer);
Which is - except for the multiplication by 8 - exactly what you are asking
for.
> the normal way to interface with a serializing facility is:
>
> int serialize_foo (foo_t object, int buffer_size, byte* buffer);
> int deserialize_foo (foo_t *object, int buffer_size, byte* buffer);
> int foo_size (foo_t object);
>
> and serialize_foo() should take care of padding.
Yes. This is why I_to_LEbytes()'s implementation ends with a memset.
> the current interface is highly unusual:
>
> 1. an extra argument to serialize_foo()
No. It takes the usual 3 arguments: the object, the buffer, and the buffer's
size.
> 2. user has to do the padding, not serialize_foo()
No.
Bruno
|
|
From: Sam S. <sd...@gn...> - 2005-06-06 18:13:51
|
> * Bruno Haible <oe...@py...> [2005-05-30 14:47:28 +0200]:
>
> Sam wrote:
>> >> 1. I_to_LEbytes() interface is counterintuitive: the size argument is
>> >> the bit size of the object (which can be computed by I_to_LEbytes() -
>> >> actually, it must be 1+I_integer_length(obj) for positive obj!),
>> >
>> > The I_to_LEbytes() interface is made for the cases where you know in
>> > advance the maximum size of the integers
>>
>> no, I have no idea about the maximum size of the integers.
>> ...
>> I want to be able to simply do
>>
>> byte buffer[64];
>> I_to_LEbytes(fixnum(7),buffer,64);
>
> Where do you know that a buffer of size 64 will be sufficient, if
> you don't know about the maximum size of the integers?
I don't - I want I_to_LEbytes() to find that out and return an error
code if the buffer space is insufficient.
>> suppose I have fixed size records in the database, e.g., 64 bytes.
>> suppose I want to write number 7 (3 bits) there.
>> right now I have to do this:
>>
>> byte buffer[64];
>> memset(buffer,0,64);
>> I_to_LEbytes(fixnum(7),1+I_integer_length(fixnum(7)),buffer,64);
>>
>> I want to be able to simply do
>>
>> byte buffer[64];
>> I_to_LEbytes(fixnum(7),buffer,64);
>
> The current interface allows you to do this through
>
> byte buffer[64];
> I_to_LEbytes(fixnum(7),8*64,buffer);
if you remove memset() from modules/berkeley-db/bdb.c:fill_dbt(), the
tests will fail.
> Which is - except for the multiplication by 8 - exactly what you are
> asking for.
what is "8"?
how do I know it's not "3" or "17"?
>> the normal way to interface with a serializing facility is:
>>
>> int serialize_foo (foo_t object, int buffer_size, byte* buffer);
>> int deserialize_foo (foo_t *object, int buffer_size, byte* buffer);
>> int foo_size (foo_t object);
>>
>> and serialize_foo() should take care of padding.
>
> Yes. This is why I_to_LEbytes()'s implementation ends with a memset.
no I_to_LEbytes does not pad, see the bdb above.
>> the current interface is highly unusual:
>>
>> 1. an extra argument to serialize_foo()
>
> No. It takes the usual 3 arguments: the object, the buffer, and the
> buffer's size.
the size is in the wrong units.
there is no function that returns the size, like foo_size() above.
how do I write
int size = foo_size(x);
byte *buf = malloc(size);
serialize_foo(x,size,buf);
>> 2. user has to do the padding, not serialize_foo()
> No.
see above.
if the I_to_LEbytes does padding, it is broken.
--
Sam Steingold (http://www.podval.org/~sds) running w2k
<http://www.mideasttruth.com/> <http://www.openvotingconsortium.org/>
<http://www.jihadwatch.org/> <http://pmw.org.il/>
cogito cogito ergo cogito sum
|
|
From: Bruno H. <br...@cl...> - 2005-06-06 19:44:24
|
Sam wrote:
> > Where do you know that a buffer of size 64 will be sufficient, if
> > you don't know about the maximum size of the integers?
>
> I don't - I want I_to_LEbytes() to find that out and return an error
> code if the buffer space is insufficient.
Actually, in bdb.c you are using both cases at the same time: limited-size
integers (re_len > 0) and unlimited size (re_len == 0).
> if you remove memset() from modules/berkeley-db/bdb.c:fill_dbt(), the
> tests will fail.
Sure, because when, say, bitsize = 53,
I_to_LEbytes(obj,bitsize,...)
will fill 53 bits, i.e. 6 bytes and 5 bits. Whereas you are expecting it
to fill 7 bytes. That's why you think you need to clear 7 bytes in advance.
Does this patch work?
*** bdb.c 14 May 2005 13:45:38 -0000 1.86
--- bdb.c 6 Jun 2005 19:41:32 -0000
***************
*** 1040,1047 ****
}
key->ulen = key->size = bytesize;
key->data = my_malloc(bytesize);
! begin_system_call(); memset(key->data,0,bytesize); end_system_call();
! if (I_to_LEbytes(obj,bitsize,(uintB*)key->data))
NOTREACHED; /* there must not be an overflow! */
# if defined(DEBUG)
ASSERT(eql(LEbytes_to_I(bytesize,(uintB*)key->data),obj));
--- 1040,1046 ----
}
key->ulen = key->size = bytesize;
key->data = my_malloc(bytesize);
! if (I_to_LEbytes(obj,8*bytesize,(uintB*)key->data))
NOTREACHED; /* there must not be an overflow! */
# if defined(DEBUG)
ASSERT(eql(LEbytes_to_I(bytesize,(uintB*)key->data),obj));
> what is "8"?
"8" is an arbitrary number occurring in the specification of the
I_to_LEbytes function.
Bruno
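
Note: the bit-size versus byte-size arithmetic behind the patch can be illustrated with a toy model. put_bits_le below is hypothetical and is not I_to_LEbytes; it assumes, for illustration only, that a writer given a bit count stores exactly that many bits and leaves every other bit of the buffer as it was.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to hold bitsize bits, rounding up. */
static size_t bytes_for_bits(size_t bitsize) {
    return (bitsize + 7) / 8;
}

/* Toy writer: stores the low `bitsize` bits of v (at most 64) in
   little-endian order and leaves all other bits of buf unchanged. */
static void put_bits_le(uint64_t v, size_t bitsize, unsigned char *buf) {
    for (size_t bit = 0; bit < bitsize && bit < 64; bit++) {
        unsigned char mask = (unsigned char)(1u << (bit % 8));
        if ((v >> bit) & 1u)
            buf[bit / 8] |= mask;
        else
            buf[bit / 8] &= (unsigned char)~mask;
    }
}
```

Under that assumption, a 53-bit write into a freshly allocated 7-byte buffer leaves the top 3 bits of the last byte holding stale data, which is what the memset was compensating for. Passing 8 * bytes_for_bits(53), i.e. 56 bits, covers the whole buffer and makes the memset unnecessary.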
|
|
From: Sam S. <sd...@gn...> - 2005-06-06 21:31:16
|
> * Bruno Haible <oe...@py...> [2005-06-06 21:43:11 +0200]:
>
>> if you remove memset() from modules/berkeley-db/bdb.c:fill_dbt(), the
>> tests will fail.
>
> Sure, because when, say, bitsize = 53,
> I_to_LEbytes(obj,bitsize,...)
> will fill 53 bits, i.e. 6 bytes and 5 bits. Whereas you are expecting it
> to fill 7 bytes. That's why you think you need to clear 7 bytes in advance.
>
> Does this patch work?
yes, please put it in.
--
Sam Steingold (http://www.podval.org/~sds) running w2k
<http://www.jihadwatch.org/> <http://www.iris.org.il> <http://pmw.org.il/>
<http://www.camera.org> <http://www.memri.org/>
As a computer, I find your faith in technology amusing.
|