Daniel Barlow <dan@...> writes:
> I was thinking about the whole foreign-memory issue on Monday, though
> from the perspective of a "standard" interface rather than SBCL
> specifically. This is marginally related.
> Some implementations have efficient means for letting foreign code
> write into Lisp space - e.g. a character or (unsigned-byte 8)
> array. Some (e.g. OpenMCL) can't.
> We _could_ say that posix:read always takes a Lisp array as argument,
> but this would suck if all we wanted to do with the array was write it
> out again, because we'd have to do all that copying for no reason. So,
> there should be some way to say "leave this in foreign space" for
> implementations that would make this faster.
> Sometimes we have to do format conversions and stuff as well. For
> example, whatever we read from a file is going to be in some
> native-to-the-system encoding (probably utf8) and will need converting
> into unicode or iso-8859-15 or whatever else the Lisp uses.
This is what simple-streams does, essentially (among other things).
The device layer of simple-streams works with buffer objects. It
would be cool to unify this with sb-posix.
The relevant parts of buffers, as seen by simple-streams (code
courtesy of Paul Foley):
(deftype simple-stream-buffer ()
  '(or sb-sys:system-area-pointer (sb-kernel:simple-unboxed-array (*))))

(defun buffer-sap (thing &optional offset)
  (declare (type simple-stream-buffer thing) (type (or fixnum null) offset)
           (optimize (speed 3) (space 2) (debug 0) (safety 0)
                     ;; Suppress the note about having to box up the return:
                     (sb-ext:inhibit-warnings 3)))
  (let ((sap (if (vectorp thing) (sb-sys:vector-sap thing) thing)))
    (if offset (sb-sys:sap+ sap offset) sap)))

(defun bref (buffer index)
  (declare (type simple-stream-buffer buffer)
           (type (integer 0 #.most-positive-fixnum) index))
  (sb-sys:sap-ref-8 (buffer-sap buffer) index))

(defun (setf bref) (octet buffer index)
  (declare (type (unsigned-byte 8) octet)
           (type simple-stream-buffer buffer)
           (type (integer 0 #.most-positive-fixnum) index))
  (setf (sb-sys:sap-ref-8 (buffer-sap buffer) index) octet))
See also: buffer-copy, allocate-buffer, free-buffer, all in
> So: a buffer is an opaque object which has a start address and
> optionally also a size.
> A buffer-designator is
> - NIL, meaning either #<buffer :start 0 :length 0> or "allocate your
> own buffer", depending on context. (yuck, context-dependence. I
> hope it should be obvious which behaviour is appropriate for any
> given call but haven't thought about it too hard yet)
> - a buffer, as returned by
> (allocate-buffer :length l) ; allocates new buffer somewhere appropriate
> (get-buffer :start s :length l)
> ; creates a buffer pointing to existing VM. length is optional
> [ Open to better names for that latter ]
fabricate-buffer perhaps? The name must indicate to the dumb
programmer (i.e., me) that nothing is allocated.
buffer-from-memory-location? (awkward to type, which might be a win
in this case :)
I think we need a protocol for buffer-length in this case, and specify
that it can return NIL (unknown) for buffers created from memory
locations. We also want a function buffer-length-or-lose; we don't
need no buffer overruns in this town!
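
A minimal sketch of that length protocol in portable Common Lisp might
look like the following. The struct layout, the %make-buffer
constructor and the buffer-length-unknown condition are assumptions
made up for illustration; only the name buffer-length-or-lose comes
from the text above, and none of this is existing sb-posix code.

```lisp
;; Hypothetical sketch: a buffer is a start address plus an optional length.
(defstruct (buffer (:constructor %make-buffer (start length)))
  (start 0 :type unsigned-byte :read-only t)
  ;; NIL means "length unknown", e.g. a buffer made from a bare address.
  (length nil :type (or null unsigned-byte) :read-only t))

;; Hypothetical condition type, signalled when a length is required
;; but the buffer does not carry one.
(define-condition buffer-length-unknown (error)
  ((buffer :initarg :buffer :reader buffer-length-unknown-buffer))
  (:report (lambda (condition stream)
             (format stream "Buffer ~S has no known length."
                     (buffer-length-unknown-buffer condition)))))

(defun buffer-length-or-lose (buffer)
  "Return BUFFER's length, or signal an error if it is unknown (NIL)."
  ;; A length of 0 is still a true value in Lisp, so OR only falls
  ;; through to ERROR when the slot really is NIL.
  (or (buffer-length buffer)
      (error 'buffer-length-unknown :buffer buffer)))
```

With this, (buffer-length-or-lose (%make-buffer 4096 nil)) signals an
error at run-time instead of letting a syscall wrapper guess a size.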
> - a vector of (unsigned-byte 8), whose contents are made available
> as a buffer of appropriate start and length, using an identity
> mapping. This may or may not involve creating a copy of the data.
> - a vector of character, which is turned into a buffer using some
> implementation-defined transformation that obviously depends on the
> implementation's representation of characters. This may or may not
> involve creating a copy of the data.
> (perhaps this wants to be a string designator, not exclusively a string)
> - implementations may optionally extend this if they want [e.g. SBCL
> could also take a SAP]
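
The designator rules quoted above can be modelled in portable Common
Lisp, with a plain octet vector standing in for foreign memory. This
is only a sketch: the names octet and designated-octets and the
:default-length keyword are hypothetical, and the char-code mapping
assumes every character code fits in 8 bits, where a real
implementation would pick its own encoding.

```lisp
;; Hypothetical helper type for this sketch.
(deftype octet () '(unsigned-byte 8))

(defun designated-octets (designator &key (default-length 0))
  "Model of buffer-designator resolution, using an octet vector as the buffer."
  (etypecase designator
    ;; NIL: read here as "allocate your own buffer".
    (null (make-array default-length :element-type 'octet
                                     :initial-element 0))
    ;; Octet vector: identity mapping, no copy in this model.
    ((vector octet) designator)
    ;; String: implementation-defined transformation; this sketch uses
    ;; CHAR-CODE and therefore always copies.
    (string (map '(vector octet) #'char-code designator))))
```

In this model an octet vector is passed through untouched, while the
string case always copies, so writes to the result never reach the
original string.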
> 1) optional size? yes, because the 'size' arguments to system calls
> are optional too, when the size can be inferred from the object passed.
See above; a sap or buffer-from-memory-location has no size, so the
stupid programmer should see an error at run-time instead of memory
corruption, if he is stupid enough to pass a SAP or
buffer-from-memory-location without a length argument.
> If you're mmaping at fixed addresses you probably want to write
> (posix:mmap (get-buffer 1000) 4096 posix:prot-read 0 -1 0) so that
> your argument list matches the C binding more closely
The alternative being keyword arguments to posix:mmap. I don't think
this is appropriate for such a low-level function, and would violate
the sb-posix design guidelines as I understand them (lispy arglist
follows manpage of syscall).
> 2) Remember that this is a one-way transformation: you can't expect to
> portably do
> (let ((v (make-string 10))) (posix:read 0 v))
> because the mutated buffer isn't automatically transformed back again.
> You have to pass posix:read a buffer. This is slightly ungainly but
> not so philosophically different from what the rest of the interface
> does: calling (posix:dup *standard-input*) returns an integer, not a
> We do need some functions to convert buffers back into these lispy
> objects. In fact, this goes for all the designators: we also need
> functions that convert all other kinds of C objects that we have
> designators for (currently file descriptors, pathnames) into usefully
> Lisp-like objects.
How system-independent can this part of the posix interface be kept?
Pathnames can be converted via parse-namestring; fds need some
make-stream-from-fd function. I would hesitate to convert a returned
buffer into a string, other than with the trivial
(map 'string #'code-char buffer) variant. As mentioned,
simple-streams deal with the byte-array-to-lisp-object question, but
at a higher level (streams). The issue will perhaps come up when I
begin testing string-simple-streams.
> "Buffer" is not a name I'm particularly attached to, though in the
> sense that it's a holding area for results passed back from C
> functions that may need further processing before Lisp sees them, I'm
> not sure it's actually such a bad one. It avoids the whole "should
> we call it _foreign_ memory or _native_ memory?" issue, at least.
It is a workable proposal, I think. I would convert sb-simple-streams to it,
to see how well it stands up to actual use. Should not be too much
work, and makes the simple-stream codebase a bit smaller (and less
Random thought: it would be nice to be able to express the notion of
unmovable-by-gc buffers somehow, for buffers that need to stay valid
for the duration of multiple syscalls.
Another random thought: the proposal needs to differentiate between
buffers and buffer-designators. The former could be acceptable for a
call to, say, mlock, while the latter (e.g. Lisp strings) clearly are
not. Or do we need a portable way of saying (void *)0xdeadbeef too?
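
One way to spell out that distinction, as a sketch only: give buffers
their own type, define buffer-designator as a wider type, and have
calls like mlock insist on the former. The struct layout, the
%make-buffer constructor and the ensure-real-buffer function are
hypothetical names for illustration, not part of any existing
interface.

```lisp
;; Hypothetical buffer representation: start address, optional length.
(defstruct (buffer (:constructor %make-buffer (start length)))
  (start 0 :type unsigned-byte :read-only t)
  (length nil :type (or null unsigned-byte) :read-only t))

;; Everything a call like posix:read might accept.
(deftype buffer-designator ()
  '(or null buffer (vector (unsigned-byte 8)) string))

(defun ensure-real-buffer (thing)
  "Return THING if it is an actual buffer; signal TYPE-ERROR otherwise."
  ;; A wrapper for something like mlock would call this instead of
  ;; resolving a designator, so strings and vectors are rejected.
  (check-type thing buffer)
  thing)
```

Under these assumptions (%make-buffer #xdeadbeef nil) would be the
spelling of a bare address with unknown length.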
whois DRS1020334-NICAT http://constantly.at/pubkey.gpg.asc
Key fingerprint = C182 F738 6B9A 83AF 9C25 62D9 EFAE 45A6 9A69 0867