#22 Check "collector_t" and "frame_t"

Status: open
Owner: nobody
Priority: 5
Updated: 2006-08-16
Created: 2006-08-16
Private: No

http://felix.cvs.sourceforge.net/felix/lpsrc/flx_gc.pak?revision=1.8&view=markup
Would you like to replace type specifications such as
"unsigned long" with "size_t"?

Discussion

  • John Skaller

    John Skaller - 2006-08-17

    Hard to say. size_t has a few problems:

    * we don't know which function to call when passing an argument

    * we don't know if it is signed, which makes address
    calculations indeterminate

    The Felix configuration script does know of course.

    In C, only operators are overloaded, so the main problem is
    things like printf("%????",(size_t)n): no way to tell what
    format code to use.
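For reference, C99 defines the `z` length modifier for this case, and C++ adopted it with C++11. Where that is unavailable, a common workaround is to convert to a definite type such as `unsigned long` and print with `%lu`. A minimal sketch of both, assuming a C++11 standard library; the helper names `format_size` and `format_size_portable` are hypothetical and not part of Felix:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format a size_t with the C99/C++11 "%zu" conversion.
std::string format_size(std::size_t n) {
    char buf[32];
    int len = std::snprintf(buf, sizeof buf, "%zu", n);
    assert(len > 0 && static_cast<std::size_t>(len) < sizeof buf);  // not truncated
    return std::string(buf);
}

// Hypothetical fallback: convert to a definite type first, then use "%lu".
// Assumes the value fits in unsigned long, which does not hold on every
// platform (for example where long is 32 bits and size_t is 64 bits).
std::string format_size_portable(std::size_t n) {
    char buf[32];
    int len = std::snprintf(buf, sizeof buf, "%lu", static_cast<unsigned long>(n));
    assert(len > 0 && static_cast<std::size_t>(len) < sizeof buf);  // not truncated
    return std::string(buf);
}
```

The second form is essentially the "use unsigned long as the size type" position argued in this comment, applied at the call site.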

    In C++ it is more serious, one might even end up with an
    ambiguous overload:

    void f(int,size_t);
    void f(long,int);

    sort of thing (not sure if that one is ambiguous or not, but
    you get the idea).
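To make the overload question concrete, here is a small sketch using the two signatures above, with return values added (an assumption for illustration) so the chosen overload is visible. Calls that match one overload exactly resolve fine; a call with two plain `int` arguments is ambiguous, so it is left commented out:

```cpp
#include <cassert>
#include <cstddef>

// The two overloads from the comment, returning a tag to show which one ran.
int f(int, std::size_t) { return 1; }
int f(long, int)        { return 2; }

void check_resolution() {
    // Both arguments match f(int, size_t) exactly.
    assert(f(0, std::size_t(0)) == 1);
    // Both arguments match f(long, int) exactly.
    assert(f(0L, 0) == 2);

    // f(0, 0);
    // Does not compile: f(int, size_t) is the better match for argument 1,
    // f(long, int) is the better match for argument 2, so neither overload
    // is at least as good on every argument and the call is ambiguous.
}
```

So the pair is not ambiguous as a declaration, but the most natural call, `f(0, 0)`, is rejected.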

    In both languages, the fact you don't know the sign of
    size_t has devastating consequences for low level
    calculations, since the results of signed arithmetic
    (overflow in particular) aren't fully specified.
    In practice this isn't so bad: we know it's 2's complement,
    and size_t is usually unsigned anyhow. But then, we also know
    it's usually the same type as unsigned long -- since C++
    technically doesn't actually have long long. Using 'unsigned
    long' as a size is therefore reasonably safe and portable.

    In Felix the problem is even worse, because in Felix there
    are no implicit conversions. To use an alias like size_t you
    would have to forcibly convert it to another type or provide
    a combinatorial number of overloads. The latter is
    untenable. The former defeats the purpose of size_t. In fact
    the latest Felix supports mixed mode arithmetic using
    constrained polymorphism, but that only works for binary
    operators. The technique can be applied to user functions,
    including models of C functions .. but it doesn't work for
    pointers.

    Ok. Specifics: for frame_t, the whole struct is under
    review. It uses up to 6 machine words, which is 48 bytes
    on an AMD64. This is the overhead paid for every heap
    allocation .. it is much too high. In fact it is likely I
    will reduce the size with low level hackery like packing
    flags into unused bits of pointers .. rather than make it
    more conforming.
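As an illustration of the kind of hackery meant here (not the actual frame_t layout, which the comment does not show), the sketch below packs two flag bits into the low bits of a pointer-sized word. It assumes the pointee is aligned to at least 4 bytes and that pointers round-trip through `std::uintptr_t`, which is implementation-defined behaviour; `tagged_ptr`, `pack`, `pointer_of` and `flags_of` are hypothetical names:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical header word: a pointer with two flag bits stored in its low bits.
struct tagged_ptr {
    std::uintptr_t bits;
};

// With at least 4-byte alignment, the two lowest address bits are always zero.
const std::uintptr_t FLAG_MASK = 0x3;

tagged_ptr pack(void* p, unsigned flags) {
    std::uintptr_t raw = reinterpret_cast<std::uintptr_t>(p);
    assert((raw & FLAG_MASK) == 0);  // the alignment assumption holds
    assert(flags <= FLAG_MASK);      // only two flag bits are available
    tagged_ptr t = { raw | flags };
    return t;
}

// Mask the flags off to recover the original pointer.
void* pointer_of(tagged_ptr t) {
    return reinterpret_cast<void*>(t.bits & ~FLAG_MASK);
}

// Extract just the flag bits.
unsigned flags_of(tagged_ptr t) {
    return static_cast<unsigned>(t.bits & FLAG_MASK);
}
```

This saves a separate flags word per allocation, at the cost of relying on guarantees the language standards do not give.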

    The collector generally uses unsigned long for counting,
    because it is hopefully big enough and a definite type, and
    it may need to be exposed to Felix so the programmer can
    control the GC.

    This is all very messy. Felix tries to fix some of the
    problems here, but the results are somewhat dubious. The
    preferred model is actually exact sizes .. unfortunately that
    makes binding to C/C++ hard.

    Anyhow, the bottom line is that the choice of unsigned long
    as a counter at the GC level is deliberate. It may be the wrong
    choice but it isn't an oversight.

    Following discussions on comp.std.c++, most of the GC cannot
    be fully conforming anyhow. C/C++ doesn't make enough
    guarantees to write working code without additional
    implementation specific details.

    This impacts in many places. For example Felix array access
    is given by:

    fun subscript[T]: ptr[T] * int -> T = "$1[$2]";

    Note the index is an 'int'. If I used instead

    fun subscript[T]: ptr[T] * size -> T = "$1[$2]";

    then subscript(a,0) may not work, because 0 is type int
    and there are no automatic conversions. You'd have to write

    subscript(a,cast[size](0))

    or change the definition to use constrained generics:

    fun subscript[I:ints, T]: ptr[T] * I -> T = "$1[$2]";

    where 'ints' is the set of all integer types.
    This is a new feature .. it allows C to do the implicit
    conversions. But note of course it can't work with
    pointers because it doesn't in C either.
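For readers more at home in C++, a rough analogue of the constrained signature is a template restricted to integral index types. This is only a sketch of the idea, assuming C++11 `<type_traits>`; it is not how Felix implements `ints`, and `check_subscript` is a hypothetical test helper:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Accept any integral type I as the index and let the built-in p[i]
// perform whatever conversion C++ applies to it.
template <typename I, typename T>
typename std::enable_if<std::is_integral<I>::value, T>::type
subscript(const T* p, I i) {
    return p[i];
}

void check_subscript() {
    const int a[3] = {10, 20, 30};
    assert(subscript(a, 0) == 10);               // int index
    assert(subscript(a, std::size_t(1)) == 20);  // size_t index
    assert(subscript(a, 2L) == 30);              // long index
    // subscript(a, 1.5) would be rejected: double is not an integral type.
}
```

The caller no longer needs a cast for the literal `0`, which is the same effect the `I:ints` constraint gives in Felix.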

    The bottom line here is that Felix does NOT model the C/C++
    type system directly .. because it sucks: the point is to
    fix it. The lack of implicit conversions is one of those
    things that are different to C++: overloads require an exact
    match. Implicit conversions are widely regarded as a very
    bad idea.

    Unfortunately the use of type aliases (typedefs) in C is
    predicated on availability of implicit conversions and
    generic operators. The effect on C++ is bad. Some Standard
    Library functions are misdesigned because of this: in particular

    stream << T

    doesn't work for polymorphic streams (because characters and
    integers get mixed up unpredictably).
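One way to read this remark: generic code that writes a value of some aliased integer type to a stream cannot know whether it will be printed as a number or as a character. A small sketch, where `show` and the typedef `small_count` are hypothetical names standing in for a library alias the caller does not control:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Generic formatter: looks type-agnostic, but operator<< gives character
// types a different meaning from every other integer type.
template <typename T>
std::string show(T value) {
    std::ostringstream os;
    os << value;
    return os.str();
}

// Hypothetical alias whose underlying type happens to be a character type.
typedef unsigned char small_count;

void check_show() {
    // An int is written as decimal digits.
    assert(show(65) == "65");
    // The same value held in the alias is written as a single character.
    assert(show(small_count(65)) == std::string(1, static_cast<char>(65)));
}
```

Nothing at the call site distinguishes the two cases; the result depends entirely on what the typedef happens to name.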

     
  • John Skaller

    John Skaller - 2006-08-17

    "By the way: the signedness is clear. Please look at the
    section "The sizeof operator".
    - 3.3.3.4
    http://www.lysator.liu.se/c/rat/c3.html#size-95t-3-3-3-4"

    Accepted, thanks. Heard other rumours .. typically the
    Standard is sloppy in specifying things: this one is
    probably historical (size_t was added after sizeof).
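With the signedness settled, the arithmetic concern from the earlier comment can be checked directly: size_t is unsigned, so subtraction wraps modulo 2^N rather than overflowing. A minimal sketch; `size_diff` and `check_size_t_is_unsigned` are hypothetical helper names:

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Difference of two sizes computed in size_t: always well defined.
std::size_t size_diff(std::size_t a, std::size_t b) {
    return a - b;
}

void check_size_t_is_unsigned() {
    // size_t cannot hold a negative value.
    assert(!std::numeric_limits<std::size_t>::is_signed);
    // 0 - 1 wraps around to the largest representable value.
    assert(size_diff(0, 1) == std::numeric_limits<std::size_t>::max());
}
```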

    "I am curious how your type development will evolve."

    Try it out then :) The type *system* is unrelated to low
    level issues of integral types. None of the integral types
    are part of the type system: they're ALL defined by the end
    user in terms of C/C++. "Felix has no integers".

    Felix does have a type *system*: a method for combining
    types. It is based on ML, and Ocaml in particular, with
    extensions .. so it should work out just fine.

    The main hassle is interfacing to C/C++ type systems ..
    since the latter are rather broken, and because Felix has
    garbage collection. It looks a little like Microsoft's
    "Managed C++".

     
