From: Robert M. <bob...@bo...> - 2004-04-19 10:49:44

Hi all.

I've got a multi-threaded app running under SBCL and somewhere in it,
it calls (read queue nil 0). It runs quite nicely except eventually it
spits out this:

"unhandled condition (of type TYPE-ERROR): The value NIL is not of type NUMBER."

The value it should be reading from the file is a number, and I can
see no reason it would be NIL when it's trying to read it here. The
thread immediately after the crashing thread picks up the ball and
successfully reads the number without complaint, and all read/write
accesses to the file are mutex'd.

Any suggestions on what I can try or fixes I can perform would be
appreciated. The SBCL version I'm running is 0.8.8 with multithreaded
support, running on a single x86 CPU on GNU/Linux.

Here's the debugging output SBCL gives (it seems the error is happening
in good old foreign function call land, which I have no idea how to
debug, and I can't find any obvious FF calls in the SBCL source to
figure out what's going on):

unhandled condition (of type TYPE-ERROR): The value NIL is not of type NUMBER.
0: ("hairy arg processor for top level local call SB!DEBUG:BACKTRACE" 128 #<SYNONYM-STREAM :SYMBOL SB-SYS:*STDERR* {504F771}>)
1: (SB-DEBUG::DEBUGGER-DISABLED-HOOK 2 #<TYPE-ERROR {92F9DE9}> #<unavailable argument>)[:EXTERNAL]
2: (INVOKE-DEBUGGER 1 #<TYPE-ERROR {92F9DE9}>)[:EXTERNAL]
3: (ERROR 5 TYPE-ERROR)[:EXTERNAL]
4: (SB-KERNEL::OBJECT-NOT-TYPE-ERROR-HANDLER 4 #<unavailable argument> #.(SB-SYS:INT-SAP #X41085D40) #<SB-ALIEN-INTERNALS:ALIEN-VALUE :SAP #X41085A10 :TYPE (* (STRUCT SB-VM::OS-CONTEXT-T-STRUCT))> (398 14))[:EXTERNAL]
5: (SB-KERNEL:INTERNAL-ERROR 2 #.(SB-SYS:INT-SAP #X41085A10) #<unavailable argument>)[:EXTERNAL]
6: ("foreign function call land: ra=#x80579F1")
7: ("foreign function call land: ra=#x805787D")
8: ("foreign function call land: ra=#x805236B")
9: ("foreign function call land: ra=#x80574EB")
10: (SB-IMPL::MAKE-INTEGER 0)[:EXTERNAL]
11: ("hairy arg processor for top level local call READ-PRESERVING-WHITESPACE" #<FILE-STREAM for "file \"/home/pisces/psmsd/spool/.queue\"" {92F8929}> NIL 0 T)
12: ("hairy arg processor for top level local call READ-PRESERVING-WHITESPACE" #<FILE-STREAM for "file \"/home/pisces/psmsd/spool/.queue\"" {92F8929}> NIL 0 NIL)
13: ("hairy arg processor for top level local call READ" #<FILE-STREAM for "file \"/home/pisces/psmsd/spool/.queue\"" {92F8929}> NIL 0 NIL)
14: (PSMSD::DEQUEUE-TICKET)
15: ("hairy arg processor for PSMSD::DEQUEUE-MESSAGE" #<unavailable argument>)
16: (PSMSD::SENDER)
17: ("XEP for SENDER" 0)[:EXTERNAL]
18: ("#'(LAMBDA NIL (LET # # ...))")
19: ("foreign function call land: ra=#x80579F1")
20: ("foreign function call land: ra=#x805780D")

--
Regards,
Robert Marlow
From: Christophe R. <cs...@ca...> - 2004-04-19 15:14:17

Robert Marlow <bob...@bo...> writes:

> 10: (SB-IMPL::MAKE-INTEGER 0)[:EXTERNAL]

This is where the error is occurring... and I suspect I know what's
going on.

The tokeniser uses global buffers to do its tokenization (have a look
at the use of *read-buffer* in src/code/reader.lisp). This means that
if two threads attempt to tokenise at once, something much like what
you've observed will happen, for instance when DIGIT-CHAR-P (in
MAKE-INTEGER) gets called on a non-numeric character.

I don't have a ready fix right now. The quick fix, but with possible
performance implications, is to bind *read-buffer* to a fresh array
somewhere in src/code/reader.lisp (and *read-buffer-length*, too). If
that turns out to be too consy, then maybe a pool of buffers needs to
be implemented.

Cheers,

Christophe
--
http://www-jcsu.jesus.cam.ac.uk/~csr21/ +44 1223 510 299/+44 7729 383 757
(set-pprint-dispatch 'number (lambda (s o) (declare (special b)) (format s b)))
(defvar b "~&Just another Lisp hacker~%")
(pprint #36rJesusCollegeCambridge)
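
A minimal sketch of the "bind the buffer to a fresh array" idea from the message above, in portable Common Lisp. Everything named *toy-...* or ...-toy-... is a hypothetical stand-in invented for illustration; none of it is the real code in src/code/reader.lisp. It assumes, as SBCL's threading model is generally described, that a LET binding of a special variable is private to the thread that made it, while the global value is shared.

```lisp
;; Hypothetical stand-ins for the reader's global token buffer. These are
;; NOT the real definitions from src/code/reader.lisp.
(defvar *toy-buffer-length* 512)
(defvar *toy-buffer* (make-string *toy-buffer-length*))
(defvar *toy-fill* 0)

(defun toy-read-integer (string)
  "Copy STRING's leading digits into *TOY-BUFFER*, then parse them."
  (setf *toy-fill* 0)
  (loop for ch across string
        while (digit-char-p ch)
        do (setf (schar *toy-buffer* *toy-fill*) ch)
           (incf *toy-fill*))
  ;; If another thread shared *TOY-BUFFER* and *TOY-FILL*, it could
  ;; overwrite them between the loop above and the parse below.
  (parse-integer *toy-buffer* :end *toy-fill*))

(defmacro with-fresh-toy-buffer (&body body)
  "Run BODY with private bindings of the buffer state."
  `(let* ((*toy-buffer-length* 512)
          (*toy-buffer* (make-string *toy-buffer-length*))
          (*toy-fill* 0))
     ,@body))

(defun safe-toy-read-integer (string)
  "Like TOY-READ-INTEGER, but never touches the global buffer."
  (with-fresh-toy-buffer
    (toy-read-integer string)))
```

Calling (safe-toy-read-integer "42 rest") allocates a new 512-character string on every call, which is the consing cost the message warns about; a pool of buffers would trade that allocation for some bookkeeping. The sketch does not grow the buffer, so tokens longer than 512 characters are out of scope here.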
From: Robert M. <bob...@bo...> - 2004-05-11 12:12:03
Attachments:
reader.diff

Thanks for the pointer.

I've set about trying to rewrite reader.lisp to be thread friendly, but
my attempts seem to not be playing nicely. I've possibly not chosen the
best way to do it: what I've done is rewrite *read-buffer* as a hash
table, with buffers extracted from the hash table using
#'current-thread-id as the key. Perhaps I could have done this more
efficiently?

Anyway, that's required quite a lot of rewriting (most of it just
moving the global read-buffer stuff into function arguments for many
of the functions in the file), but now I find that during my attempts
to compile SBCL I get buffer overflows. grow-read-buffer seems like it
should work OK, but the compile dies on the first attempt to parse a
>512 character documentation string.

I've attached a diff for what I've done.

Actually, a sort of unrelated question - what's an easy way to speed up
code testing for SBCL development? There's a lot of cross compilation
stuff in reader.lisp so it doesn't make it easy for me to just load up
the file and start messing with it interactively as lisp hackers are
wont to enjoy. Is there a trick you guys have to make SBCL source
hacking quicker than a hack->full-recompile->debug cycle?

On Mon, 2004-04-19 at 23:11, Christophe Rhodes wrote:
> Robert Marlow <bob...@bo...> writes:
>
> > 10: (SB-IMPL::MAKE-INTEGER 0)[:EXTERNAL]
>
> This is where the error is occurring... and I suspect I know what's
> going on.
>
> The tokeniser uses global buffers to do its tokenization (have a look
> at the use of *read-buffer* in src/code/reader.lisp). This means that
> if two threads attempt to tokenise at once, something much like what
> you've observed will happen, for instance when DIGIT-CHAR-P (in
> MAKE-INTEGER) gets called on a non-numeric character.
>
> I don't have a ready fix right now. The quick fix, but with possible
> performance implications, is to bind *read-buffer* to a fresh array
> somewhere in src/code/reader.lisp (and *read-buffer-length*, too). If
> that turns out to be too consy, then maybe a pool of buffers needs to
> be implemented.
>
> Cheers,
>
> Christophe

--
Regards,
Robert Marlow
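
A rough sketch of the hash-table-per-thread design described in the message above, in portable Common Lisp. The attached reader.diff is not reproduced in this thread, so this is not that patch: the names *toy-buffers*, toy-buffer-for, toy-grow-buffer and toy-ouch are hypothetical, and the thread id is passed in as a plain argument rather than obtained from #'current-thread-id. One thing any such design has to get right is that a grown buffer must be stored back under the same key, otherwise later writes still see the old 512-character string.

```lisp
;; THREAD-ID is a plain argument here; the message above uses
;; #'current-thread-id for this, which is not called in this sketch.
(defvar *toy-buffers* (make-hash-table)
  "Hypothetical table mapping a thread id to that thread's token buffer.")

(defun toy-buffer-for (thread-id)
  "Return the buffer for THREAD-ID, creating a 512-character one if needed."
  (or (gethash thread-id *toy-buffers*)
      (setf (gethash thread-id *toy-buffers*) (make-string 512))))

(defun toy-grow-buffer (thread-id)
  "Double the buffer for THREAD-ID, keeping its contents, and store it back."
  (let* ((old (toy-buffer-for thread-id))
         (new (make-string (* 2 (length old)))))
    (replace new old)
    (setf (gethash thread-id *toy-buffers*) new)))

(defun toy-ouch (thread-id index char)
  "Store CHAR at INDEX in THREAD-ID's buffer, growing it first if it is full."
  (loop while (>= index (length (toy-buffer-for thread-id)))
        do (toy-grow-buffer thread-id))
  (setf (schar (toy-buffer-for thread-id) index) char))
```

With real threads the table itself becomes shared state, so access to it would need its own lock; that is left out here to keep the sketch short.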
From: Robert M. <bob...@bo...> - 2004-05-16 06:49:30
Attachments:
target-thread.diff

I think I've fixed this bug. I took dan`b's advice and locally bound
*read-buffer*, *ouch-ptr* and *inch-ptr* during thread creation in
make-thread. I've attached the diff.

I've run my test case and got no read errors where I used to get 2-3
during the run, so I think it's fixed the problem. I have still been
getting one other error though:

mmap: Cannot allocate memory
unhandled condition (of type SB-INT:SIMPLE-CONTROL-ERROR): attempt to RETURN-FROM a block or GO to a tag that no longer exists

Is this just SBCL running out of memory, or is it possibly to do with
thread contention for memory addresses?

On Mon, 2004-04-19 at 23:11, Christophe Rhodes wrote:
> Robert Marlow <bob...@bo...> writes:
>
> > 10: (SB-IMPL::MAKE-INTEGER 0)[:EXTERNAL]
>
> This is where the error is occurring... and I suspect I know what's
> going on.
>
> The tokeniser uses global buffers to do its tokenization (have a look
> at the use of *read-buffer* in src/code/reader.lisp). This means that
> if two threads attempt to tokenise at once, something much like what
> you've observed will happen, for instance when DIGIT-CHAR-P (in
> MAKE-INTEGER) gets called on a non-numeric character.
>
> I don't have a ready fix right now. The quick fix, but with possible
> performance implications, is to bind *read-buffer* to a fresh array
> somewhere in src/code/reader.lisp (and *read-buffer-length*, too). If
> that turns out to be too consy, then maybe a pool of buffers needs to
> be implemented.
>
> Cheers,
>
> Christophe

--
Regards,
Robert Marlow
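
The approach described above, establishing fresh bindings once when a thread starts rather than around every read, can be sketched as a wrapper around the thread's entry function. The attached target-thread.diff is not shown in this thread, so this is only an illustration of the idea: the *my-...* variables are hypothetical stand-ins named after the ones mentioned in the message, and with-private-reader-state is an invented helper.

```lisp
;; Hypothetical stand-ins named after the variables mentioned in the thread;
;; the real ones live in SBCL's internals and are not reproduced here.
(defvar *my-read-buffer* (make-string 512))
(defvar *my-ouch-ptr* 0)
(defvar *my-inch-ptr* 0)

(defun with-private-reader-state (function)
  "Return a closure that calls FUNCTION inside fresh bindings."
  (lambda ()
    (let ((*my-read-buffer* (make-string 512))
          (*my-ouch-ptr* 0)
          (*my-inch-ptr* 0))
      (funcall function))))

;; Example: the wrapped function sees its own buffer and pointers.
(defparameter *my-wrapped*
  (with-private-reader-state
    (lambda ()
      (setf *my-ouch-ptr* 17)          ; changes only the private binding
      (list *my-ouch-ptr* (length *my-read-buffer*)))))
```

The returned closure is what would be handed to the thread-creation call, so each thread pays for one buffer allocation at startup instead of one per read. Calling (funcall *my-wrapped*) in a single thread already shows the isolation: the global *my-ouch-ptr* stays at 0 afterwards.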
From: Nikodemus S. <tsi...@cc...> - 2004-05-11 12:32:26

On Tue, 11 May 2004, Robert Marlow wrote:

> Actually, a sort of unrelated question - what's an easy way to speed up
> code testing for SBCL development? There's a lot of cross compilation
> stuff in reader.lisp so it doesn't make it easy for me to just load up
> the file and start messing with it interactively as lisp hackers are
> wont to enjoy. Is there a trick you guys have to make SBCL source
> hacking quicker than a hack->full-recompile->debug cycle?

The ways I'm aware of:

 * Just load the code into a running system. May work as long as the code
   contains no def!foo, sb!foo or #!+stuff.

 * Load src/cold/chill.lisp -- this allows loading _some_ parts
   of the SBCL sources into a running image, including sending
   stuff from Slime with C-c C-c and friends. Though it seems that
   hacking on a chilled system is a good way to get Slime confused
   eventually, at least when tinkering with PCL.

 * Modify sources, cross your fingers and run "sh slam.sh <original build
   host>". Works for only some changes, though. After you get your
   stuff working you better do a full build still, since it's possible to
   slam some stuff that will break a full build -- mostly things that
   affect the cross-compilation.

   Requires having :sb-after-xc-core in features for the original
   build. This is the reason that I usually use CMUCL as a build host,
   since then I can safely install the freshly built SBCL without
   invalidating old after-xc-cores.

 * If you really want to live dangerously, see my slam-host hack in
   the archives (around last december, iirc): just like slam, but
   occasionally able to incorporate changes that normal slam can't.
   Can result in spectacular *CRASH* *BOOM* *BANG* sound effects,
   though...

Cheers,

 -- Nikodemus
From: Robert M. <bob...@bo...> - 2004-05-11 12:52:56

Excellent, thanks very much Nikodemus. Those pointers should save me a
lot of time with this kind of stuff :)

On Tue, 2004-05-11 at 20:32, Nikodemus Siivola wrote:
> On Tue, 11 May 2004, Robert Marlow wrote:
>
> > Actually, a sort of unrelated question - what's an easy way to speed up
> > code testing for SBCL development? There's a lot of cross compilation
> > stuff in reader.lisp so it doesn't make it easy for me to just load up
> > the file and start messing with it interactively as lisp hackers are
> > wont to enjoy. Is there a trick you guys have to make SBCL source
> > hacking quicker than a hack->full-recompile->debug cycle?
>
> The ways I'm aware of:
>
>  * Just load the code into a running system. May work as long as the code
>    contains no def!foo, sb!foo or #!+stuff.
>
>  * Load src/cold/chill.lisp -- this allows loading _some_ parts
>    of the SBCL sources into a running image, including sending
>    stuff from Slime with C-c C-c and friends. Though it seems that
>    hacking on a chilled system is a good way to get Slime confused
>    eventually, at least when tinkering with PCL.
>
>  * Modify sources, cross your fingers and run "sh slam.sh <original build
>    host>". Works for only some changes, though. After you get your
>    stuff working you better do a full build still, since it's possible to
>    slam some stuff that will break a full build -- mostly things that
>    affect the cross-compilation.
>
>    Requires having :sb-after-xc-core in features for the original
>    build. This is the reason that I usually use CMUCL as a build host,
>    since then I can safely install the freshly built SBCL without
>    invalidating old after-xc-cores.
>
>  * If you really want to live dangerously, see my slam-host hack in
>    the archives (around last december, iirc): just like slam, but
>    occasionally able to incorporate changes that normal slam can't.
>    Can result in spectacular *CRASH* *BOOM* *BANG* sound effects,
>    though...
>
> Cheers,
>
>  -- Nikodemus

--
Regards,
Robert Marlow