Re: [Linux-NTFS-Dev] how to fix bugs

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

On Thu, 2004-02-26 at 12:31, Andras Erdei wrote:
> On Thu, 26 Feb 2004 09:57:00 +0000, Anton Altaparmakov <ai...@ca...> wrote:
> >I committed this fix instead:
> >-       /* Need two bytes for null terminator. */
> >-       maxlen -= 2;
> >+       /* Convert maxlen from bytes to unicode characters. */
> >+       maxlen /= sizeof(uchar_t);
> 
> just nitpicking, but maybe getting rid of the magic number 2
> and using sizeof(uchar_t), or even more sizeof(*dest) at both 
> places is a bit better

Of course. You will notice that the above is a unified diff and the
lines prefaced by '-' characters are the lines removed!  So yes, the
magic 2 has now disappeared.  We both agree.  (-;  I prefer
sizeof(uchar_t) in this case because we are talking about a file system
and hence the types cannot change as they are fixed by the NTFS
specifications and on-disk structures.  And sizeof(uchar_t) is a lot
more obvious than sizeof(*dest) where one has to check in the function
declaration for what type dest is.

A few comments I have been meaning to make and forgetting:  Note that
the functions in mkntfs were written for mkntfs only.  For example
stoucs() is mkntfs only.  In fact mkntfs ought to use the functions
provided by libntfs (unistr.[hc]::ntfs_mbstoucs() and friends.  But
alas, I wrote mkntfs before unistr.[hc] existed and never revisited
mkntfs to fix it...

The general goal is that any useful functionality from all the utilities
should one day be moved in a generalized form to the library itself and
the utilities modified to use the library instead of having their own
versions.  That will make the utilities smaller and even more
importantly, hopefully one day mkntfs() itself will just be a function
provided by libntfs and mkntfs.c would be just a wrapper calling the
library mkntfs() function (plus option parsing and error handling).

> >I don't like assert() very much because it is too easy to create
> >Heisenbugs by using things in inside the assert() that have side
> >effects...
> 
> bugs are easy to create with almost every language construct :O)

True.  But Szaka's point not to use assert() in a library is still quite
a valid one...  A library should _never_ abort, it should always return
an error code instead.

> ok, no asserts will be used

Thanks.  (-:

> do you feel that an if() is needed?
> or the comment added in my version of the fn description is enough?

For library functions() and kernel functions and any functions that
parse external data (from user input, another program, from the disk
partition, etc) you cannot rely on comments.  You need heavy sanity
checking otherwise the software will either crash or worse corrupt data
or even create security holes in the kernel or SUID applications so I
tend to try and catch all input parameter errors and return error with
errno set to EINVAL (or in kernel return -EINVAL;).

> >Please stick with DocBook style for parameters
> 
> can you give me a pointer to some DocBook intro ?

Um no, but since I use the same style as the kernel so here is the
except from the kernel docs:

The format of the block comment is like this:

/**
 * function_name(:)? (- short description)?
(* @parameterx: (description of parameter x)?)*
(* a blank line)?
 * (Description:)? (Description of function)?
 * (section header: (section description)? )*
(*)?*/

The short function description cannot be multiline, but the other
descriptions can be (and they can contain blank lines). Avoid putting a
spurious blank line after the function name, or else the description
will be repeated!

All descriptive text is further processed, scanning for the following
special patterns, which are highlighted appropriately.

'funcname()' - function
'$ENVVAR' - environment variable
'&struct_name' - name of a structure (up to two words including
'struct')
'@parameter' - name of a parameter
'%CONST' - name of a constant.

And I prefer:

/**
 * function_name - short description
(* @parameterx:	description of parameter x)*
 * a blank line
 * Description of function.
 * (section header: (section description)? )*
 */

> >Another comment about your style is can we please have C style comments:
> >	/* comment */
> >Instead of C++ style:
> >	//
> >
> >I only use C++ style for TODO or FIXME comments with the intention of
> >them going away when the TODO or FIXME has been done...
> 
> i've only seen that // was used extensively, but you are right,
> only for this specific purpose; if it's important i'll try to
> get back into the habit of using /* */ instead

That would be good as kernel hackers scream on sight of C++ comments and
as Szaka said code can be copied from the library to the kernel driver
so it would be nice if one didn't have to change all the comments in the
process...

> this is the only point where i feel i have to make a stand
> -- but if you'll insist, i'll do my best to optimize :O)
> 
> - i'd prefer maintainability over efficiency; if can be
>   understood, then it can be changed (fixed!), so it can 
>   be optimized; if it's uncomprehensible, it cannnot be 
>   fixed, updated or optimized, in the rare cases where 
>   that's needed, and imho this (or any other i/o-centric code) 
>   isn't that case

I would agree with this.  I would prefer readable code over optimized
code everywhere, except for performance critical functions.  An example
where performance is important is libntfs/attrib.c::ntfs_attr_lookup()
and friends which are called all the time before anything at all can
happen so if they are slow the whole of ntfs would be slowed down. 
Whether it actually makes any noticeable speed difference when compared
to the slowdowns imposed by needing to read something from disk I don't
know but in an ideal world intelligent read-ahead in the kernel will
have read all the data already so that we only deal with memory
accesses... (-;

> - today it's extremely hard to tell which code is faster,
>   even with asm code (consider pipelines, cache etc); with C 
>   basically the only way to tell is trial and error (or a lot 
>   of experience); for example your original (fixed) stoucs() 
>   and the last version generate the same code:

That is very true.  Nowadays noone knows what compilers get up to and
what different cpus get up to so the same code could run at different
speeds on different cpus, etc...

> should i still try to optimize?

You don't have to.  I just like to do it.  To my personal way of
thinking having to re-evaluate things repeatedly is bad epsecially in a
multithreaded / kernel environment as you are potentially using shared
variables which can be changed under you.  But in libntfs that is hardly
important but nevertheless IMHO it is cleaner to not re-evaluate.  But
the bottom line is that as long as code not performance critical and it
is readable I don't mind if you optimize or not.

Best regards,

	Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ &
http://www-stu.christs.cam.ac.uk/~aia21/

Re: [Linux-NTFS-Dev] how to fix bugs

Development moved to https://sourceforge.net/projects/ntfs-3g/

Re: [Linux-NTFS-Dev] how to fix bugs