Hi,
On Wed, 2004-02-25 at 15:55, Andras Erdei wrote:
> [sorry for the long post]
No problem.
> i'm thinking about contributing some effort to your project;
> a good first step can be getting familiar with already existing
> code, thus i picked mkntfs.c and started reading it
Great!
> there seem to be a few bugs in the code, so while reading it
> i might as well try to fix them, but i don't know what kind
Good idea. (-:
> of fixes would you prefer, so i selected a trivial function
> with a trivial bug, and written three fixes -- please let me
> know which one would you prefer, and flame me a bit, so that
> my future fixes can get through more easily (if you want them
> at all :O)
Ok, see below...
> the function in question is stoucs(); it gets the size of
> the output buffer in bytes (as maxlen), but in fact it can
> write (maxlen-2)*2+2 bytes into it (e.g. into an output
> buffer of 10 bytes it writes up to 18 bytes):
Eeek! You are quite right. It is somewhat broken... Luckily maxlen is
always made big enough by the callers of stoucs() which allocate dest =
malloc((strlen(src) + 1) * 2); So we have never seen a problem...
> /**
> * stoucs - convert ASCII string to unicode-character string
> * @dest: points to buffer to receive the converted string
> * @src: points to string to convert
> * @maxlen: size of @dest buffer in bytes
> *
> * Return the number of characters written to @dest, not including the
> * terminating null unicode character.
> */
> int stoucs(uchar_t *dest, const char *src, int maxlen)
> {
> char c;
> int i;
>
> /* Need two bytes for null terminator. */
> maxlen -= 2;
> for (i = 0; i < maxlen; i++) {
> c = src[i];
> if (!c)
> break;
> dest[i] = cpu_to_le16(c);
> }
> dest[i] = cpu_to_le16('\0');
> return i;
> }
>
> the first fix is the minimalistic one:
>
> --- mkntfs.c.orig 2004-02-25 15:29:28.000000000 +0100
> +++ mkntfs.c 2004-02-25 16:23:38.000000000 +0100
> @@ -569,7 +569,7 @@
>
> /* Need two bytes for null terminator. */
> maxlen -= 2;
> - for (i = 0; i < maxlen; i++) {
> + for (i = 0; i < maxlen/2; i++) {
> c = src[i];
> if (!c)
> break;
Yes that would be my preferred fix, except that I wouldn't trust gcc to
optimize away the maxlen/2. I assume it would just re-evaluate it on
every i < maxlen/2 test which is silly so I committed this fix instead:
diff -Nru a/ntfsprogs/mkntfs.c b/ntfsprogs/mkntfs.c
--- a/ntfsprogs/mkntfs.c Thu Feb 26 09:19:33 2004
+++ b/ntfsprogs/mkntfs.c Thu Feb 26 09:19:33 2004
@@ -567,8 +567,10 @@
char c;
int i;
- /* Need two bytes for null terminator. */
- maxlen -= 2;
+ /* Convert maxlen from bytes to unicode characters. */
+ maxlen /= sizeof(uchar_t);
+ /* Need space for null terminator. */
+ maxlen--;
for (i = 0; i < maxlen; i++) {
c = src[i];
if (!c)
Note I didn't write maxlen = maxlen / 2 - 1; as both forms generate the
same compiler output (gcc 3.3.1 from SuSE 9.0 Pro) but merging the
comments makes the comment go multi line which I thought was uglier.
(-:
> the second fix preserves the original method, with a
> little cleanup:
>
> int stoucs( uchar_t * dst, char const * src, int size )
> {
> int const uni_size = size / sizeof( *dst ) ;
why not uni_size = size / sizeof(uchar_t) - 1; ?
> int cp ;
>
> for ( cp = 0 ; src[ cp ] && cp < uni_size - 1 ; ++cp )
then you don't need the -1 in the test although I guess gcc ought to be
able to optimize this away as you declared uni_size as const so it
shouldn't make much difference.
> dst[ cp ] = cpu_to_le16( src[ cp ] ) ;
>
> dst[ cp ] = cpu_to_le16( '0' ) ;
>
> return cp ;
> }
>
> the third fix is an imho more c-ish complete rewrite
> (+ assert + comments):
>
> #include <assert.h>
>
> unsigned stoucs( uchar_t * dst, char const * src, unsigned const size )
> {
> char const * const begin = src ;
> unsigned uni_size = size / sizeof( *dst ) ;
>
> assert( uni_size ) ;
I don't like assert() very much because it is too easy to create
Heisenbugs by using things in inside the assert() that have side
effects...
> // while there's both input and output left
> while ( *src && --uni_size )
> *dst++ = cpu_to_le16( *src++ ) ; // ascii -> unicode
Again, I don't trust gcc to do the right thing. (-: I prefer to cache
*src in a temporary variable for reuse. I guess gcc would do the right
thing as you made src const...
You might notice I (perhaps unfairly) don't trust gcc to do anything
well so I tend to write what I consider optimized code instead of hoping
that gcc would optimize it... I know the performance of stoucs() in
mkntfs isn't important at all but hey. (-8
> /**
> * stoucs - convert ASCIIZ string to unicode string
> *
> * @param dst [out] points the buffer to receive the unicode string
> * @param src [in] points the char string to be converted
> * @param size [in] size of the @dst buffer in bytes
Please stick with DocBook style for parameters, i.e.
@paramname: description
@paramname2: description2
etc...
If nothing else I find the style we currently use a hell of a lot more
readable than what you used... My eyes are automatically drawn to the @
character and I want to then find the parameter name there rather than
having to scan to the next work. Also having the descriptions aligned
looks neater and again helps readability IMHO.
Another comment about your style is can we please have C style comments:
/* comment */
Instead of C++ style:
//
I only use C++ style for TODO or FIXME comments with the intention of
them going away when the TODO or FIXME has been done...
In case you didn't know we keep the latest sources in BitKeeper on
linux-ntfs.bkbits.net. Here is what I wrote (somewhat modified) in a
previous email on how to use BK with ntfsprogs. You might find it
useful...
A small howto. Set parent to: bk://linux-ntfs.bkbits.net/ntfsprogs
(or ntfsprogs-devel obviously), i.e. just "bk clone -q
bk://linux-ntfs.bbkits.net/ntfsprogs" which creates ntfsprogs directory
in your current working directory. Preferably work on ntfsprogs not
ntfsprogs-devel, then "bk citool" (need X running!) then, generate patch
with bk export -tpatch -r1.865 (1.865 is the changeset number you can
get that by looking typing "bk changes". If there is only one changeset
commit you can use -r+ instead of the number). And email the patch to
lin...@li.... To get updated code you just go
into your ntfsprogs/ntfsprogs-devel repositories and type "bk pull". To
look at changes use "bk revtool" and "bk csettool" (csettool is invoked
automatically bu revtool when needed) and of course "bk helptool" is the
generic BK help system...
And after all that, welcome to the Linux-NTFS project! I hope I haven't
flamed you too much. (-:
Best regards,
Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ &
http://www-stu.christs.cam.ac.uk/~aia21/
|