From: Christophe R. <cs...@ca...> - 2003-06-30 08:44:43
The challenge was (from #lisp denizens) to write enough of an ls clone
to see how hard it was to do in sbcl; my conclusion, thanks to some
unexpected behaviour by sb-grovel, was "too hard".

Firstly, let me post my solution; then, I'll discuss the problems I
had getting there:

--- begin constants.lisp (sb-grovel:grovel-constants-file) ---
("sys/types.h" "dirent.h")

((:structure dirent ("struct dirent"
                     ((* t) name "char *" "d_name")))
 (:function opendir ("opendir" (* t) (name c-string)))
 (:function readdir ("readdir" (* t) (dir (* t))))
 (:function closedir ("closedir" integer (dir (* t)))))
--- end constants.lisp ---

--- begin list-files.lisp ---
(defun list-files (directory)
  (loop with dir = (opendir directory)
        for dirent = (readdir dir) then (readdir dir)
        until (sb-grovel::foreign-nullp dirent)
        do (princ (sb-alien::%naturalize-c-string
                   (sb-sys:sap+ (alien-sap dirent) 11)))
        finally (closedir dir)))
--- end list-files.lisp ---

So SB-GROVEL certainly helped in writing the alien functions, at least
partly. OPENDIR would seem to be a complete success; it's completely
in accordance with the man page; likewise CLOSEDIR (well, if we accept
that the C type "DIR *" is best treated as a void * by lisp, on the
basis that it's a completely opaque type to C in any case).

Where things start going wrong is in the definition of READDIR.
Because we don't have a full alien definition of "struct dirent", we
can't declare READDIR as returning a pointer to one of these, so we
have to resort to declaring the return as a void pointer. I don't see
any way around this without fully parsing C header files, though,
because there are non-standard extra fields on Linux, so to get a full
alien definition for struct dirent we'd have to read header files
ourselves. Fair enough.

Where it really falls down, though, is in the definition of struct
dirent. On SunOS, it's probably not too bad:

  struct dirent {
    ino_t d_ino;
    off_t d_off;
    unsigned short d_reclen;
    char d_name[1];
  };

So there's a simple bug in sb-grovel which involves taking CARs of
things that aren't necessarily lists; that's why the above definition
has (* T), not C-STRING, for the DIRENT name. But it gets worse,
because on Linux the definition of the d_name field isn't

  char d_name[1];

but

  char d_name[256];

at which point the logic in define-c-struct merrily deduces that it
wants to use the mystical operator SB-SYS:SAP-REF-2048 to access this
datum (because sizeof(t.d_name) returns 256 :-). Needless to say, this
operator does not exist.

Any ideas?

Cheers,

Christophe
-- 
http://www-jcsu.jesus.cam.ac.uk/~csr21/       +44 1223 510 299/+44 7729 383 757
(set-pprint-dispatch 'number (lambda (s o) (declare (special b)) (format s b)))
(defvar b "~&Just another Lisp hacker~%")    (pprint #36rJesusCollegeCambridge)
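For comparison, here is a rough sketch of the same listing written
directly in C. It is an illustration only, not part of the original
post: list_files and dirent_name_offset are hypothetical names. It
shows that d_name is an array embedded in struct dirent rather than a
pointer, and that the offset hard-coded as 11 in list-files.lisp is a
platform-specific value which offsetof can report.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: print the name of every entry in `path`, one per
   line, to `out`. Returns the number of entries printed, or -1 if the
   directory cannot be opened. */
int list_files(const char *path, FILE *out)
{
    DIR *dir;
    struct dirent *entry;
    int count = 0;

    assert(path != NULL && out != NULL);
    dir = opendir(path);
    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL) {
        /* d_name is a char array inside the struct, so the string
           starts at a fixed offset from the struct's address. */
        fprintf(out, "%s\n", entry->d_name);
        count++;
    }
    closedir(dir);
    return count;
}

/* Hypothetical helper: byte offset of d_name within struct dirent on
   the platform this is compiled for. */
size_t dirent_name_offset(void)
{
    return offsetof(struct dirent, d_name);
}
```

Calling list_files(".", stdout) prints the current directory's
entries; the value returned by dirent_name_offset() is what the Lisp
code would need in place of the literal 11, and it varies between
platforms.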
From: <sa...@bl...> - 2003-06-30 15:13:31
Christophe Rhodes writes:

> So SB-GROVEL certainly helped in writing the alien functions, at least
> partly. OPENDIR would seem to be a complete success; it's completely
> in accordance with the man page; likewise CLOSEDIR (well, if we accept
> that the C type "DIR *" is best treated as a void * by lisp, on the
> basis that it's a completely opaque type to C in any case).

When I hacked SB-GROVEL to support POSIX for UFFI, I added a new
OPAQUE-TYPE declaration:

  (:opaque-type dir-ptr-t "DIR *"
   "Opaque type that represents an open directory stream")

The "opendir" declaration then becomes:

  (:function opendir ("opendir" dir-ptr-t (name :cstring)))

> Where things start going wrong is in the definition of READDIR.
> Because we don't have a full alien definition of "struct dirent", we
> can't declare READDIR as returning a pointer to one of these, so we
> have to resort to declaring the return as a void pointer. I don't see
> any way around this without fully parsing C header files, though,
> because there are non-standard extra fields on Linux, so to get a full
> alien definition for struct dirent we'd have to read header files
> ourselves. Fair enough.

Don't try parsing the C header files. Change SB-GROVEL to determine
the exact start and length of each of the explicitly listed fields.
For a DIRENT, the declaration looks like:

  (:structure dirent-t ("struct dirent"
                        (d_name :cstring "d_name" "char *")))

The auxiliary C program will generate a declaration like:

  (define-sparse-struct dirent-t 268 ((d_name :cstring 11 256)))

This indicates that any DIRENT-T is 268 bytes long, with a "d_name"
field 256 bytes long starting at byte 11. It uses the standard
technique of "sizeof"s and subtracting field addresses. Note that
there may be other fields, but we don't care about them. We generated
the declaration from the POSIX spec.

DEFINE-SPARSE-STRUCT is a macro that turns this into an actual UFFI
declaration; under the hood it creates "_DUMMY1", "_DUMMY2", ...
fields (unsigned character arrays of the appropriate length) to force
field alignments. You end up with a structure declaration with
exactly the right length, and all the fields you declared in the right
spot; non-standard fields don't exist as far as the program is
concerned. And because you're defining a real structure, you don't
have to use the DEFINE-C-ACCESSOR memcpy hack to get data in and out
of the fields. My declaration of "readdir" ends up being:

  (:function readdir ("readdir" (* dirent-t) (dir dir-ptr-t)))

Derek

-- 
Derek Upham
sa...@bl...

"Ha! Your Leaping Tiger Kung Fu is no match for my Frightened Piglet style!"
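The auxiliary C program described in this message could be sketched
roughly as follows, assuming a single field of interest. The names
emit_dirent_decl, struct padding and dirent_padding are hypothetical,
and the output form only imitates the define-sparse-struct example
quoted above; none of this is taken from the actual SB-GROVEL or UFFI
sources. The sketch obtains the struct size, field offset and field
length from sizeof and offsetof, and computes how many filler bytes
(the "_DUMMY" fields) would be needed before and after the field.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical: sizes of the filler fields needed around one named
   field so that the sparse struct keeps the real struct's length. */
struct padding {
    size_t before;  /* bytes ahead of d_name */
    size_t after;   /* bytes behind d_name   */
};

struct padding dirent_padding(void)
{
    struct padding p;
    size_t total  = sizeof(struct dirent);
    size_t offset = offsetof(struct dirent, d_name);
    /* sizeof does not evaluate its operand, so the null pointer is
       never dereferenced here. */
    size_t length = sizeof(((struct dirent *)0)->d_name);

    assert(offset + length <= total);
    p.before = offset;
    p.after  = total - offset - length;
    return p;
}

/* Hypothetical: write one sparse-struct form for struct dirent into
   `buf`. Returns what snprintf returns: the length of the full form,
   which is smaller than `buflen` when nothing was truncated. */
int emit_dirent_decl(char *buf, size_t buflen)
{
    return snprintf(buf, buflen,
                    "(define-sparse-struct dirent-t %zu "
                    "((d_name :cstring %zu %zu)))",
                    sizeof(struct dirent),
                    offsetof(struct dirent, d_name),
                    sizeof(((struct dirent *)0)->d_name));
}
```

The three numbers in the emitted form depend on the platform's
headers, so they need not match the 268, 11 and 256 of the example.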
From: Daniel B. <da...@te...> - 2003-06-30 16:39:51
sa...@bl... writes:

> When I hacked SB-GROVEL to support POSIX for UFFI, I added a new
> OPAQUE-TYPE declaration:
>
>   (:opaque-type dir-ptr-t "DIR *"
>    "Opaque type that represents an open directory stream")
>
> The "opendir" declaration then becomes:
>
>   (:function opendir ("opendir" dir-ptr-t (name :cstring)))

I think this is basically the same as

  (sb-alien:define-alien-type dirent-t (* t))

though obviously if you're using uffi you probably don't want to use
sb-alien directly, so I can see why you did this.

> For a DIRENT, the declaration looks like:
>
>   (:structure dirent-t ("struct dirent"
>                         (d_name :cstring "d_name" "char *")))
>
> The auxiliary C program will generate a declaration like:
>
>   (define-sparse-struct dirent-t 268 ((d_name :cstring 11 256)))
>
> This indicates that any DIRENT-T is 268 bytes long, with a "d_name"
> field 256 bytes long starting at byte 11. It uses the standard
> technique of "sizeof"s and subtracting field addresses. Note that
> there may be other fields, but we don't care about them. We generated
> the declaration from the POSIX spec.

For what it's worth, the usual sb-grovel processor will cope with
partial struct definitions too. The new feature in Derek's version is
that it recognises :cstring as a type - and obviously, that it
generates uffi definitions, not sb-alien stuff directly.

> concerned. And because you're defining a real structure, you don't
> have to use the DEFINE-C-ACCESSOR memcpy hack to get data in and out
> of the fields. My declaration of "readdir" ends up being:

Yeah, this is a problem for sb-grovel in its current form. There are,
uh, three excuses:

1) There's no exported functional interface (as opposed to macro-based
   interface) to ALIEN, and I didn't want to have user code grovelling
   around extensively in the alien type system innards. (This was
   back in the days when I first wrote db-sockets, for CMUCL, and I
   was in even less of a position to unilaterally define new CMUCL
   interfaces then than I am for SBCL now)

2) I wanted structures that would get GCed without having to worry
   about creating finalisers for everything

3) As the eulogist said at the funeral of the dealer in stolen
   property, it was a long time ago, and besides, the fence is dead.

On the subject of UFFI and POSIX more generally: I would like to
borrow/steal some of your POSIX work, but I'd really rather not have
SBCL contrib all depending on UFFI. Partly this is for size reasons,
partly because UFFI is a cross-implementation (if not portable, then
at least widely ported) project that's developed independently of
SBCL. (If I'm honest it's partly also a taste/prejudice issue: IMO
UFFI suffers from OAOOM issues - probably inescapable given the job it
does, but I don't want to get any on me)

-dan

-- 
http://www.cliki.net/ - Link farm for free CL-on-Unix resources
From: Daniel B. <da...@te...> - 2003-06-30 16:10:12
Christophe Rhodes <cs...@ca...> writes:

> ((:structure dirent ("struct dirent"
>                      ((* t) name "char *" "d_name")))

That actually wouldn't work even if you declared it in C: a character
pointer and a character array are not the same thing.

There's a similar issue with sockaddr_un (sockaddr for local-domain
sockets). In sb-bsd-sockets, we do something like

  (:structure sockaddr-un ("struct sockaddr_un"
                           (integer family "sa_family_t" "sun_family")
                           ((array (unsigned 8) 108) path "char" "sun_path")))

- yes, ugly constant number there that needs fixing, I know. The code
to deal with this doesn't get any more beautiful either

  (loop for c across filename
        ;; XXX magic constant ew ew ew.  should grovel this from
        ;; system headers
        for i from 0 to (min 107 (1- (length filename)))
        do (setf (sockint::sockaddr-un-path sockaddr i) (char-code c))
        finally
        (setf (sockint::sockaddr-un-path sockaddr (1+ i)) 0))

I'm not sure why I used (unsigned 8) instead of base-char; that would
at least save doing the char-code thing

-dan

-- 
http://www.cliki.net/ - Link farm for free CL-on-Unix resources
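To illustrate what "grovel this from system headers" could mean for
the 108/107 constants, here is a small C sketch. It is an assumption
about one way to do it, not the sb-bsd-sockets implementation:
sun_path_capacity and set_sockaddr_un_path are hypothetical names. The
array length comes from sizeof on the sun_path member, which works
precisely because sun_path is an array and not a pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: capacity of sun_path in bytes, taken from the
   system headers rather than written as a literal. */
size_t sun_path_capacity(void)
{
    /* sizeof does not evaluate its operand, so the null pointer is
       never dereferenced. */
    return sizeof(((struct sockaddr_un *)0)->sun_path);
}

/* Hypothetical helper: fill `addr` for a local-domain socket named
   `filename`. Returns 0 on success, or -1 if the name plus its
   terminating NUL does not fit in sun_path. */
int set_sockaddr_un_path(struct sockaddr_un *addr, const char *filename)
{
    size_t len;

    assert(addr != NULL && filename != NULL);
    len = strlen(filename);
    if (len >= sizeof(addr->sun_path))
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    /* Copy the name together with its terminating NUL byte. */
    memcpy(addr->sun_path, filename, len + 1);
    return 0;
}
```

Unlike the Lisp loop above, which stops copying at the limit, this
sketch rejects names that are too long; that is a design choice made
for the example, not something the thread prescribes.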