From: Fred C. <fc...@al...> - 2009-11-24 17:51:16
Of course I am far less of an expert at it than you are, but to me it seems problematic every time I deal with anything other than the standard lisp character set. While the encoding trick applies to input and output streams (sort of), it does not permit all lisp operations to work on all encodings. It also requires a different compilation (e.g., for unicode), which is to say that I cannot adapt it to the circumstances I encounter when dealing with different input sources.

For example, if I want to operate in (UNSIGNED-BYTE 8) throughout my lisp program, there is no way to do it. I cannot do a "(setf " using an (unsigned-byte 8) name, or a sixbit name, or whatever other coding I may choose. I cannot have a list in which the elements are (unsigned-byte 8) things and have them properly handled by all of the lisp operations, etc.

I guess what I am trying to say is that for much of my work (which is largely focussed on dealing with raw datagrams and forensic images), I want to live in a world where all of the byte values can be present in all operations performed. If I input a sequence of bytes, I want to be able to treat them as if they were strings for the purposes of doing searches, applying regular expressions, etc. If I want to use them in arrays, all of the operations should operate on them in the same way as if they were regular things (e.g., strings) placed in arrays (except of course that they are byte values not restricted to those of strings). Sort, cons, eval, hashes, sequence operations, and everything else should operate exactly as they do with the normal character set, and I should not have to use special escape sequences in order to represent things like a " character or an "ESC" (byte value 27).
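To make the gap concrete, here is a minimal sketch in standard Common Lisp. The variable names and the sample bytes are made up for illustration. It suggests that the generic sequence functions and an EQUALP hash table accept an (unsigned-byte 8) vector as-is, while anything string-specific does not apply, because the vector is not a string.

```lisp
;; Hypothetical sample data: eight raw bytes, several outside printable ASCII.
(defparameter *datagram*
  (make-array 8 :element-type '(unsigned-byte 8)
                :initial-contents '(0 255 27 34 127 27 34 200)))

;; Generic sequence functions take the byte vector directly.
(defparameter *match* (search #(27 34) *datagram*))       ; first ESC followed by "
(defparameter *del* (position 127 *datagram*))            ; first byte with value 127
(defparameter *sorted* (sort (copy-seq *datagram*) #'<))  ; sorted copy, original untouched

;; An EQUALP hash table compares vectors element by element,
;; so a byte vector can serve as a key.
(defparameter *seen* (make-hash-table :test #'equalp))
(setf (gethash *datagram* *seen*) :seen)

;; The gap: the vector is not a string, so string-only operations
;; (string comparison, symbol names, and the like) do not cover it.
(defparameter *is-string* (stringp *datagram*))
```

The sketch covers only the portable part of the language; regular expressions and symbol names are exactly the places where a byte vector stops being interchangeable with a string.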
While it is relatively easy to input a sequence of bytes into a lisp structure (e.g., an array), once it's in there, it has to be treated specially in every way for every operation, and I always seem to risk a translation that will turn my bytes into some other bytes, which I cannot tolerate for my applications. So I am sort of looking for a generalization of the original lisp character set, one that allows any desired character set to be used for all purposes.

Now I realize that this is an unrealistic thing to ask for, and that it won't happen any time soon, but then I also want:

- a built-in GUI that works across all platforms (like java, only better),
- the ability to use unlimited precision arithmetic directly on byte sequences without any special translations,
- the ability to directly place objects into files (i.e., open a file, write a series of objects into it, and be able to read them all back, with random access),
- the ability to use a file or set of files as cache for memory structures (so that as my programs grow to enormous sizes, lisp automatically does paging to allow me to get to huge calculations and storage sizes without having to do special file IO),
- a direct database interface for lisp-operated database functions (rather than having to go to MySQL or some such thing),
- a built-in command interpreter (a bash equivalent, but operating in lisp),
- emacs running lisp rather than lisp running in emacs,
- and of course, world peace.

So far, I just keep on writing C interfaces and making everything locally customized, but I would far prefer to live in a lisp environment all the time.

I hope this answers your question with regard to the limits on the use of bytes in lisp, and that my response isn't taken in an unfriendly spirit. I recognize the fantastic amount of effort already undertaken and the high quality of clisp as it exists. It cannot be all things to all people, and I am likely the rare exception rather than the rule with respect to all of these things.
But since you asked what I want, I figured I would tell you. Clearly I don't expect to get it.

FC

On Nov 24, 2009, at 8:51 AM, Sam Steingold wrote:

> Fred Cohen wrote:
>> Since binary I/O is coming up, I have long wanted to be able to do
>> all of the operations I do on ASCII characters and sequences on
>> bytes... Somehow it would be nice if we could identify a character
>> set (e.g., ASCII / EBCDIC / BYTE / UNICODE / etc.) and have
>> everything work in that set instead of in what is something like
>> ASCII-7 today.
>
> I am confused. We already have encodings.
> http://clisp.cons.org/impnotes/encoding.html
> Is that not what you want?

- This communication is confidential to the parties it is intended to serve -
Fred Cohen & Associates tel/fax: 925-454-0171
http://all.net/ 572 Leona Drive Livermore, CA 94550
Join http://tech.groups.yahoo.com/group/FCA-announce/join for our mailing list