From: Hoehle, Joerg-C. <Joe...@t-...> - 2002-03-12 16:07:15
Hi,

I have summarized some known deficiencies of the current FFI, together with possible ways out (sketches of proposals). Please contribute!

o Arrays of variable size are not supported.
  C-ARRAY-MAX helps but doesn't cover all needs. IDL/COM's size_is() annotation would cover many situations. In Haskell (and COM), one writes something along the lines of:
    gethostname([out,size_is(len)] cstring name, [] int len);    TODO in/out??
  Examples of use: COM of course, regerror's errbuf_size, regexec's nmatch, TODO others
  TODO: exact specification when used with :out or :in-out, or possibly a different mode for the length and the array parameters
  TODO: interaction with *foreign-encoding*
  PROPOSAL: Add a size_is ability to function declarations. Needs working out (but exists in Haskell, COM etc.). Quite some work (typical danger of the "better is worse" 100% solution).
  Short-term workaround: my redefine-foreign-function hack

o Kevin Rosenberg is currently releasing the universal FFI (UFFI): a single API that supports AllegroCL, Lispworks and CMUCL.
    http://www.med-info.com/uffi.shtml
  One may consider it a threat to CLISP, as it operates at a low level of abstraction that the CLISP FFI does not provide. Quite the contrary: both the CLISP and CMUCL FFIs try to abstract away from the low level that e.g. AllegroCL and UFFI use. CMUCL provides access to both levels. CLISP's and CMUCL's FFIs provide :out etc. and are rather declarative in nature, whereas other interfaces let you feel like a C programmer in Lisp syntax, working on pointers.
  This is IMHO a "worse is better" dilemma: low-level interfaces are easy to write, and programmers used to C know how to use them. OTOH, high-level interfaces must be *designed* first, and the designer is constantly at risk that his/her design is unusable and hides away required flexibility by capturing too few usage scenarios (cf. variable-sized arrays).
  Instinctively, I cannot recommend providing a low-level FFI for CLISP that UFFI could map to directly.
  My feeling is that the more declarative, the better. It should be the other way round: a UFFI should provide a very high-level, declarative interface. That interface could then be expanded into a set of macros and functions for every implementation. Mapping UFFI or Allegro's FFI to CLISP's would require involved program analysis and even information about the behaviour of the foreign functions that is not statically visible. Yet maybe nobody will come up with such a high-level FFI API for Lisp.
  A true case of worse is better (cf. the success of T and the failure of NIL in http://www.paulgraham.com/thist.html).

o VALIDP could be made SETF'able. Maybe this should be restricted to allow invalidation only.
  Currently, some code uses
    (when (validp compiled-pattern) (mregfree compiled-pattern))
  PROPOSAL: code could benefit from the addition of
    (setf (validp compiled-pattern) nil)
  or
    (mark-invalid compiled-pattern)
  I feel (setf (validp #) T) is quite unsafe. But again, in whose name am I limiting people's freedom? The current FFI does too much of this already.

Observation:
+ The following 3 limitations come from trying to provide access to C idioms. C++, CORBA, COM, Java or other language interfaces don't have such weird requirements and use very regular interfaces. COM requires variable-sized arrays though (via the size_is annotation).

o No choice for initialization of c-union types
  Example: sigaction slot (TODO) can either be an int or a pointer to a handler.
  PROPOSAL: use FOREIGN-VARIABLE objects and CAST as needed. FOREIGN-VARIABLE is IMHO nearly unusable in the current FFI. Invent functions or macros to make them usable.

o Variable argument types
  e.g. ioctl or sigaction (SIG_IGN vs handler)

o No malloc or free directly available
  Consider the following function
    int switch_port(char **name);
  On entry, name points to a malloc'ed buffer. The buffer contents must stay valid until another name is selected.
  On (successful) return, name points to another buffer containing the previous port name. This old buffer may then be free'ed.
  PROPOSAL: I may be able to provide hacks (mostly Lisp-level) that provide malloc and free, so people could experiment. Yet the current FFI is designed so that they should rarely be needed. OTOH, they may be needed more with delayed dereferencing (see below), which puts more work under the programmer's control.

o FFI dereferences everything a function returns.
  Only C-POINTER is available to prevent this (e.g. module regexp), but then the foreign structure is really opaque.
  PROPOSAL: would IMHO involve FOREIGN-VARIABLE types, as they feature exactly what is needed: a pair of FOREIGN-ADDRESS and type information. Access can be made as needed.
  TODO: how to declare this "mode"?

o Danger of dereferencing out parameters when the function returns failure
  (cf. various mails from me)
  Same problem as above, same solution: delay

o FFI is too picky about some types. E.g. type C-POINTER only accepts a FOREIGN-ADDRESS, but not a FOREIGN-VARIABLE or FOREIGN-FUNCTION, although these are built upon FOREIGN-ADDRESS.
  TODO: investigate need

o FFI copies everything on the stack, or via malloc
  TODO how would allocation :none work with strings?
  E.g. each call to regexp-exec duplicates the string to be searched. It can be huge (MB range)!
  A more performant interface would attempt to work in place, on Lisp objects, which is not possible with the FFI. TODO Is it actually possible to work in place with Unicode strings?
  (For module regex, working in place would need to use re_search() or re_search_2() instead of regexec(), since the latter doesn't allow specifying a stop position and calls strlen.)

Thanks for sharing comments,
Regards,
	Jörg Höhle.
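
P.S. To make the switch_port() item above concrete, here is a minimal C sketch of the buffer hand-over it describes. Only the prototype comes from the item. The body of switch_port(), the return convention (0 on success, -1 on failure) and the helpers select_port() and current_port_name() are assumptions invented for illustration, not any real library's code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical library state: the currently selected port name.
   The library keeps the caller's buffer instead of copying it. */
static char *current_port = NULL;

/* Assumed semantics, following the description in the message:
   on entry *name is a malloc'ed buffer that the library takes over;
   on success *name is replaced by the previous name (NULL if none),
   which the caller may then free. On failure *name is untouched. */
int switch_port(char **name)
{
    char *previous;

    if (name == NULL || *name == NULL)
        return -1;
    previous = current_port;
    current_port = *name;   /* library now owns the new buffer */
    *name = previous;       /* old buffer goes back to the caller */
    return 0;
}

/* Hypothetical accessor, only so the effect can be observed. */
const char *current_port_name(void)
{
    return current_port;
}

/* Caller side: shows who allocates and who frees. */
int select_port(const char *wanted)
{
    char *buf;

    assert(wanted != NULL);
    buf = malloc(strlen(wanted) + 1);
    if (buf == NULL)
        return -1;
    strcpy(buf, wanted);

    if (switch_port(&buf) != 0) {
        free(buf);          /* call failed: the buffer is still ours */
        return -1;
    }
    free(buf);              /* success: buf is the OLD name; free(NULL) is harmless */
    return 0;
}
```

The caller has to malloc the buffer it passes in and free a different buffer that comes back, which is why the item asks for malloc and free to be reachable from Lisp. In this sketch the last selected name is never freed; a real library would need some way to release it.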