From: Vladimir T. <vtz...@gm...> - 2009-09-15 13:22:25
|
On 9/15/09, Sam Steingold <sd...@gn...> wrote: > PS. Vladimir, did you decide not to replace my package.d patch? yes, I decided not to replace it. It has more "consistent" behavior now - package can not change while being iterated. |
From: Sam S. <sd...@gn...> - 2009-09-16 20:16:33
|
Vladimir Tzankov wrote: > On 9/15/09, Sam Steingold <sd...@gn...> wrote: >> PS. Vladimir, did you decide not to replace my package.d patch? > > yes, I decided not to replace it. > It has more "consistent" behavior now - package can not change while > being iterated. good. so, the only missing item now is clos/mop safety, right? (aside from bug fixing). Sam. |
From: Vladimir T. <vtz...@gm...> - 2009-09-16 21:28:51
|
On 9/16/09, Sam Steingold <sd...@gn...> wrote: > Vladimir Tzankov wrote: >> On 9/15/09, Sam Steingold <sd...@gn...> wrote: >>> PS. Vladimir, did you decide not to replace my package.d patch? >> >> yes, I decided not to replace it. >> It has more "consistent" behavior now - package can not change while >> being iterated. > > good. > > so, the only missing item now is clos/mop safety, right? yes > (aside from bug fixing). Which is not trivial at all :(. The problem I reported - about floating point test - appears to be quite general (and may cause various unrelated/surprising errors). I am still not sure what's going on exactly. |
From: Sam S. <sd...@gn...> - 2009-09-16 21:52:01
|
Vladimir Tzankov wrote: >> (aside from bug fixing). > Which is not trivial at all :(. The problem I reported - about > floating point test - appears to be quite general (and may cause > various unrelated/surprising errors). I am still not sure what's going > on exactly. I did not mean it was trivial. in fact, just now I encountered this "unkillable clisp" problem again: (make-thread (lambda () (do-symbols (s "CL-USER") (sleep 5) (print s)))) does not return and does not print anything, loadavg is at 100%, and the only way out is kill -9. |
From: Vladimir T. <vtz...@gm...> - 2009-09-16 22:01:44
|
On 9/17/09, Sam Steingold <sd...@gn...> wrote: > Vladimir Tzankov wrote: >>> (aside from bug fixing). >> Which is not trivial at all :(. The problem I reported - about >> floating point test - appears to be quite general (and may cause >> various unrelated/surprising errors). I am still not sure what's going >> on exactly. > > I did not mean it was trivial. > in fact, just now I encountered this "unkillable clisp" problem again: > (make-thread (lambda () (do-symbols (s "CL-USER") (sleep 5) (print s)))) > does not return and does not print anything, loadavg is at 100%, and the > only > way out is kill -9. Can you check where it spins (ctrl-z in gdb)? I just tested it on i386 linux and it's fine. 100% cpu usage is strange since no spinlocks are involved (besides the heap one). |
From: Sam S. <sd...@gn...> - 2009-09-16 22:09:56
|
Vladimir Tzankov wrote: >> does not return and does not print anything, loadavg is at 100%, and the >> only way out is kill -9. > > Can you check where it spins (ctrl-z in gdb)? > I just tested it on i386 linux and it's fine. 100% cpu usage is > strange since no spinlocks are involved (besides the heap one). sure - as soon as I get it again under gdb :-) alas, this is not reliably reproducible. |
From: Vladimir T. <vtz...@gm...> - 2009-09-17 15:10:49
|
On 9/17/09, Sam Steingold <sd...@gn...> wrote: > Vladimir Tzankov wrote: >>> (aside from bug fixing). >> Which is not trivial at all :(. The problem I reported - about >> floating point test - appears to be quite general (and may cause >> various unrelated/surprising errors). I am still not sure what's going >> on exactly. > > I did not mean it was trivial. > in fact, just now I encountered this "unkillable clisp" problem again: > (make-thread (lambda () (do-symbols (s "CL-USER") (sleep 5) (print s)))) > does not return and does not print anything, loadavg is at 100%, and the > only way out is kill -9. I fixed very serious bug in spinlock_release() on i386 and AMD64. Due to it gcc builds with -O2 were very likely to misbehave (in strange and different ways) with many threads. Problems happened when thread was waiting for GC to finish (triggered from another thread). Due to gcc optimization and instruction reordering - mem.varobject.heap_end was not reloaded after the end of gc - new object was allocated after the end of the heap (but usually before heap_limit - so no sigsegv at that time). All this was causing quite different and unrelated problems later. I've totally missed this in the past! (as an excuse I may say that most of the allocations/gc development/testing was done on old osx ppc :( ). Vladimir |
From: Sam S. <sd...@gn...> - 2009-09-16 21:07:54
|
Vladimir Tzankov wrote: > On 9/15/09, Sam Steingold <sd...@gn...> wrote: >> PS. Vladimir, did you decide not to replace my package.d patch? > > yes, I decided not to replace it. > It has more "consistent" behavior now - package can not change while > being iterated. good. in that case I will also lock the package in MAP-EXTERNAL-SYMBOLS & MAP-SYMBOLS. also, I will lock individual packages in MAP-ALL-SYMBOLS instead of locking all_packages_lock. objections? |
From: Vladimir T. <vtz...@gm...> - 2009-09-16 22:41:49
|
On 9/17/09, Sam Steingold <sd...@gn...> wrote: > in that case I will also lock the package in > MAP-EXTERNAL-SYMBOLS & MAP-SYMBOLS. Actually I've not missed them. As they are implemented/left now - concurrent modifications of symtabs will not interfere with traversal but modified data may not be processed (or be processed in case deletion). With lock - this will be consistent - for the time of traversal there will be no modifications (it's the same issue as with-package-iterator) > also, I will lock individual packages in MAP-ALL-SYMBOLS instead of locking > all_packages_lock. the same here - but lock all packages. I am wondering whether it's not better to go without any locking in these functions and leave it to the user? |
From: Sam S. <sd...@gn...> - 2009-09-17 15:32:02
|
Vladimir Tzankov wrote: > On 9/17/09, Sam Steingold <sd...@gn...> wrote: >> in that case I will also lock the package in >> MAP-EXTERNAL-SYMBOLS & MAP-SYMBOLS. > > Actually I've not missed them. As they are implemented/left now - > concurrent modifications of symtabs will not interfere with traversal > but modified data may not be processed (or be processed in case > deletion). what if adding new data triggers a rehash? >> also, I will lock individual packages in MAP-ALL-SYMBOLS instead of locking >> all_packages_lock. > the same here - but lock all packages. > > I am wondering whether it's not better to go without any locking in > these functions and leave it to the user? how safe is this? the clisp philosophy is that a segfault is clisp fault while everything else is a user error (more or less). i.e., if a simultaneous iteration and rehashing cannot trigger a crash, I am not worried about it. however, what if the package is dramatically shrinks during a rehash? can the iteration crash? also, if we roll back my 2009-08-28 patch, we should also remove the lock from map-symbols, for consistency. Sam. |
From: Vladimir T. <vtz...@gm...> - 2009-09-17 15:57:43
|
On 9/17/09, Sam Steingold <sd...@gn...> wrote: > Vladimir Tzankov wrote: >> On 9/17/09, Sam Steingold <sd...@gn...> wrote: >>> in that case I will also lock the package in >>> MAP-EXTERNAL-SYMBOLS & MAP-SYMBOLS. >> >> Actually I've not missed them. As they are implemented/left now - >> concurrent modifications of symtabs will not interfere with traversal >> but modified data may not be processed (or be processed in case >> deletion). > > what if adding new data triggers a rehash? In MT builds on rehash - new symtab is allocated and cons cells (from collision lists) are shared. This is done in order to have find-symbol without any locking and will help here as well. >>> also, I will lock individual packages in MAP-ALL-SYMBOLS instead of >>> locking >>> all_packages_lock. >> the same here - but lock all packages. >> >> I am wondering whether it's not better to go without any locking in >> these functions and leave it to the user? > > how safe is this? > > the clisp philosophy is that a segfault is clisp fault while everything else > is > a user error (more or less). > i.e., if a simultaneous iteration and rehashing cannot trigger a crash, I am > not worried about it. > however, what if the package is dramatically shrinks during a rehash? > can the iteration crash? I think there will not be segfauls due to the way symbols are added and removed from symtabs. > also, if we roll back my 2009-08-28 patch, we should also remove the lock > from > map-symbols, for consistency. I'll roll back it locally and test it thoroughly. |
From: Vladimir T. <vtz...@gm...> - 2009-09-29 16:55:36
|
Sam, On 9/17/09, Sam Steingold <sd...@gn...> wrote: > the clisp philosophy is that a segfault is clisp fault while everything else > is a user error (more or less). > i.e., if a simultaneous iteration and rehashing cannot trigger a crash, I am > not worried about it. > however, what if the package is dramatically shrinks during a rehash? > can the iteration crash? > > also, if we roll back my 2009-08-28 patch, we should also remove the lock > from > map-symbols, for consistency. After a lot of testing I am confident it is safe not to lock anything (by "safe" I mean there will be no segfault). Unless you have objections I'll roll back the patch and remove locking in WITH-PACKAGE-ITERATOR (should I restore the ABI as well?). Still open issue is package locks (not mutex). Currently single bit is used to mark package as locked and this easily breaks when two concurrent threads change it (WITHOUT-PACKAGE-LOCKS). I think this "flag" should be per thread - modeled in the same way as per thread special variable bindings. I think about adding special variable to package structure - WITHOUT-PACKAGE-LOCK will create per thread binding temporarily (this will work in single thread build as well). What do you think? Vladimir |
From: Sam S. <sd...@gn...> - 2009-09-29 17:55:06
|
Vladimir Vladimir Tzankov wrote: > > On 9/17/09, Sam Steingold <sd...@gn...> wrote: >> the clisp philosophy is that a segfault is clisp fault while everything else >> is a user error (more or less). >> i.e., if a simultaneous iteration and rehashing cannot trigger a crash, I am >> not worried about it. >> however, what if the package is dramatically shrinks during a rehash? >> can the iteration crash? >> >> also, if we roll back my 2009-08-28 patch, we should also remove the lock >> from >> map-symbols, for consistency. > > After a lot of testing I am confident it is safe not to lock anything > (by "safe" I mean there will be no segfault). Unless you have > objections I'll roll back the patch and remove locking in > WITH-PACKAGE-ITERATOR (should I restore the ABI as well?). yes. you need to restore constobj.d:version and remove the recompilation note in NEWS. you might want to keep the loop.lisp:wrap-unwind-protect scaffolding for future extensibility. > Still open issue is package locks (not mutex). Currently single bit is > used to mark package as locked and this easily breaks when two > concurrent threads change it (WITHOUT-PACKAGE-LOCKS). I think this > "flag" should be per thread - modeled in the same way as per thread > special variable bindings. I think about adding special variable to > package structure - WITHOUT-PACKAGE-LOCK will create per thread > binding temporarily (this will work in single thread build as well). > What do you think? I think package locking is not something modified often, it is pretty much a software delivery facility as opposed to a user-level functionality. I think it is best left to the user: if the user wants to lock/unlock(as in package-lock) a package, he is probably going to add/remove symbols and/or definitions, so he will probably be locking (as in mutex-lock) the package anyway, so I don't think we need to worry about this. Sam. |
From: Vladimir T. <vtz...@gm...> - 2009-09-30 07:14:46
|
On 9/29/09, Sam Steingold <sd...@gn...> wrote: >> Still open issue is package locks (not mutex). Currently single bit is >> used to mark package as locked and this easily breaks when two >> concurrent threads change it (WITHOUT-PACKAGE-LOCKS). I think this >> "flag" should be per thread - modeled in the same way as per thread >> special variable bindings. I think about adding special variable to >> package structure - WITHOUT-PACKAGE-LOCK will create per thread >> binding temporarily (this will work in single thread build as well). >> What do you think? > > I think package locking is not something modified often, it is pretty much a > software delivery facility as opposed to a user-level functionality. > I think it is best left to the user: if the user wants to lock/unlock(as in > package-lock) a package, he is probably going to add/remove symbols and/or > definitions, so he will probably be locking (as in mutex-lock) the package > anyway, so I don't think we need to worry about this. Sometimes while running make check-tests-parallel I get errors like this: Form: (LET ((FILE "clos-tst.lisp") C) (UNWIND-PROTECT (PROGN (MAKUNBOUND '*FOO*) (WITH-OPEN-FILE (F FILE :DIRECTION :OUTPUT) (FORMAT F "(eval-when (compile load eval) (defstruct foo slot))~@ (defparameter *foo* #.(make-foo))~%")) (LOAD (SETQ C (COMPILE-FILE FILE))) *FOO*) (POST-COMPILE-FILE-CLEANUP FILE))) CORRECT: #S(FOO :SLOT NIL) CLISP : ERROR INTERN("MISSING-LOAD-FORM-OBJECT-PRELIMINARY"): #1=#<PACKAGE CLOS> is locked OUT: ";; Compiling file /Users/vtz/clisp/dev/m/tests/clos-tst.lisp ... [SIMPLE-PACKAGE-ERROR]: INTERN(\"MISSING-LOAD-FORM-OBJECT-PRELIMINARY\"): #1=#<PACKAGE CLOS> is locked " ERR: "WARNING: (SETF FIND-CLASS): redefining class FOO in /Users/vtz/clisp/dev/m/tests/clos-tst.lisp, was defined in top-level 0 errors, 0 warnings " clos-custom.lisp uses WITHOUT-PACKAGE-LOCK and interferes with other threads - but I guess you mean this will/should be fixed by adding clos lock(s)? |
From: Sam S. <sd...@gn...> - 2009-09-30 15:40:22
|
Vladimir Tzankov wrote: > > Sometimes while running make check-tests-parallel I get errors like this: > > Form: (LET ((FILE "clos-tst.lisp") C) (UNWIND-PROTECT (PROGN > (MAKUNBOUND '*FOO*) (WITH-OPEN-FILE (F FILE :DIRECTION :OUTPUT) > (FORMAT F "(eval-when (compile load eval) (defstruct foo slot))~@ > (defparameter *foo* #.(make-foo))~%")) (LOAD > (SETQ C (COMPILE-FILE FILE))) *FOO*) (POST-COMPILE-FILE-CLEANUP > FILE))) > CORRECT: #S(FOO :SLOT NIL) > CLISP : ERROR > INTERN("MISSING-LOAD-FORM-OBJECT-PRELIMINARY"): #1=#<PACKAGE CLOS> is locked > OUT: > ";; Compiling file /Users/vtz/clisp/dev/m/tests/clos-tst.lisp ... > [SIMPLE-PACKAGE-ERROR]: > INTERN(\"MISSING-LOAD-FORM-OBJECT-PRELIMINARY\"): #1=#<PACKAGE CLOS> > is locked > " > ERR: > "WARNING: (SETF FIND-CLASS): redefining class FOO in > /Users/vtz/clisp/dev/m/tests/clos-tst.lisp, was defined in top-level > > 0 errors, 0 warnings > " it is conceivable (although, IMO, unlikely) that running the tests in parallel indiscriminately, like we do now, is not OK under the MT-safety rules (i.e., if two test files try to define and use a function of the same name - this actually cannot happen at this time because all test files run in their own packages). if this is the case, we should see errors even after CLOS is made MT-safe, so we will need to investigate this further. > clos-custom.lisp uses WITHOUT-PACKAGE-LOCK and interferes with other > threads - but I guess you mean this will/should be fixed by adding > clos lock(s)? yes. |
From: Vladimir T. <vtz...@gm...> - 2009-09-16 21:39:40
|
On 9/17/09, Sam Steingold <sd...@gn...> wrote: > in that case I will also lock the package in > MAP-EXTERNAL-SYMBOLS & MAP-SYMBOLS. I missed these two. > also, I will lock individual packages in MAP-ALL-SYMBOLS instead of locking > all_packages_lock. > > objections? Yes. Both locks should be held: all_packages_lock and package lock in all three cases. Let's say we have two threads (T1 and T2) that call MAP-SYMBOLS simultaneously on two different packages P1 and P2 with functions F1 and F2. If F1 tries to modify P2 (intern a symbol) and F2 tries to modify P1 we will get deadlock (T1 will hold package lock of P1, T2 will hold package lock of P2 and each one will try to get the other lock). Since there is no way to establish some order for locking multiple packages at once - we should use another lock - all_packages_lock (we may create another one for exactly this kind of usage but in the past I decided to use all_packages_locks since anyway it is held during MAKE-PACKAGE, DELETE-PACKAGE and introducing yet another one would complicate the code more). |
From: <don...@is...> - 2009-09-16 22:19:05
|
Vladimir Tzankov writes: > Yes. Both locks should be held: all_packages_lock and package lock in > all three cases. > Let's say we have two threads (T1 and T2) that call MAP-SYMBOLS > simultaneously on two different packages P1 and P2 with functions F1 > and F2. If F1 tries to modify P2 (intern a symbol) and F2 tries to > modify P1 we will get deadlock (T1 will hold package lock of P1, T2 > will hold package lock of P2 and each one will try to get the other > lock). Since there is no way to establish some order for locking > multiple packages at once - we should use another lock - > all_packages_lock (we may create another one for exactly this kind of > usage but in the past I decided to use all_packages_locks since anyway > it is held during MAKE-PACKAGE, DELETE-PACKAGE and introducing yet > another one would complicate the code more). It looks like your objective is that a user not be able to get into a deadlock situation without introducing his own locks. Any time you have two different (system) functions that each get their own locks and allow user code to execute while holding those locks (such as map-symbols) the user can get into a deadlock: suppose system function (s1 u) calls user function u while holding system lock l1 while system function (s2 u) calls user function u while holding system lock l2 thread1: (s1 (lambda() (s2 (lambda ())))) thread2: (s2 (lambda() (s1 (lambda ())))) So in order to satisfy the requirement above all such functions must share the same lock. As Sam pointed out, the trade off is - don't require the user to know anything about internal locking but restrict what can be done concurrently more than the alternative vs - require the user to understand the locks used by system functions (which requires that you document the locks used internally!), allow him to deadlock if he makes mistakes, but allow him to do things concurrently that he wouldn't be able to in the alternative The first alternative will surely lead to people who want to further optimize. The second will clearly be better for beginners. Perhaps there could be a run time switch. I'd recommend starting in safe mode where all of the internal locks are the same. |
From: Sam S. <sd...@gn...> - 2009-09-16 21:56:34
|
Vladimir Tzankov wrote: > On 9/17/09, Sam Steingold <sd...@gn...> wrote: >> in that case I will also lock the package in >> MAP-EXTERNAL-SYMBOLS & MAP-SYMBOLS. > > I missed these two. > >> also, I will lock individual packages in MAP-ALL-SYMBOLS instead of locking >> all_packages_lock. >> >> objections? > > Yes. Both locks should be held: all_packages_lock and package lock in > all three cases. > Let's say we have two threads (T1 and T2) that call MAP-SYMBOLS > simultaneously on two different packages P1 and P2 with functions F1 > and F2. If F1 tries to modify P2 (intern a symbol) and F2 tries to > modify P1 we will get deadlock (T1 will hold package lock of P1, T2 > will hold package lock of P2 and each one will try to get the other > lock). Since there is no way to establish some order for locking > multiple packages at once - we should use another lock - > all_packages_lock (we may create another one for exactly this kind of > usage but in the past I decided to use all_packages_locks since anyway > it is held during MAKE-PACKAGE, DELETE-PACKAGE and introducing yet > another one would complicate the code more). but this means that two threads cannot iterate over two different packages at the same time! is this a good idea? or, rather, is this the correct trade off? the situation you are trying to prevent by locking all_packages_lock before the individual package lock is a user error and it is not clear to me that it is clisp's responsibility to prevent the user from doing that. also, locking all_packages_lock before the individual package lock makes the latter useless. |
From: Vladimir T. <vtz...@gm...> - 2009-09-16 22:19:38
|
On 9/17/09, Sam Steingold <sd...@gn...> wrote: > Vladimir Tzankov wrote: >> On 9/17/09, Sam Steingold <sd...@gn...> wrote: >>> in that case I will also lock the package in >>> MAP-EXTERNAL-SYMBOLS & MAP-SYMBOLS. >> >> I missed these two. >> >>> also, I will lock individual packages in MAP-ALL-SYMBOLS instead of >>> locking >>> all_packages_lock. >>> >>> objections? >> >> Yes. Both locks should be held: all_packages_lock and package lock in >> all three cases. >> Let's say we have two threads (T1 and T2) that call MAP-SYMBOLS >> simultaneously on two different packages P1 and P2 with functions F1 >> and F2. If F1 tries to modify P2 (intern a symbol) and F2 tries to >> modify P1 we will get deadlock (T1 will hold package lock of P1, T2 >> will hold package lock of P2 and each one will try to get the other >> lock). Since there is no way to establish some order for locking >> multiple packages at once - we should use another lock - >> all_packages_lock (we may create another one for exactly this kind of >> usage but in the past I decided to use all_packages_locks since anyway >> it is held during MAKE-PACKAGE, DELETE-PACKAGE and introducing yet >> another one would complicate the code more). > > but this means that two threads cannot iterate over two different packages > at > the same time! > is this a good idea? > or, rather, is this the correct trade off? > the situation you are trying to prevent by locking all_packages_lock before > the > individual package lock is a user error and it is not clear to me that it is > clisp's responsibility to prevent the user from doing that. For this particular cases more or less I agree. However there are cases like USE/UNUSE-PACKAGE, DELETE-PACKAGE, etc which internally should acquire locks of several packages at the same time - user cannot control this. Providing this in MAP-SYMBOLS (and similar) will prevent user errors/confusion. Of course it may hurt performance in some use cases. > also, locking all_packages_lock before the individual package lock makes the > latter useless. Individual package locks are useful for all operations that involve just single package - mostly INTERN. If we use single global lock for it - interning in different packages will be serialized. |
From: Sam S. <sd...@gn...> - 2009-09-17 15:22:18
|
Vladimir Tzankov wrote: > > I fixed very serious bug in spinlock_release() on i386 and AMD64. great! > Due > to it gcc builds with -O2 were very likely to misbehave (in strange > and different ways) with many threads. alas, I get crashes with a debug (-O0) build, so this fix should not affect them. |
From: Vladimir T. <vtz...@gm...> - 2009-09-17 15:52:38
|
On 9/17/09, Sam Steingold <sd...@gn...> wrote: > Vladimir Tzankov wrote: >> >> I fixed very serious bug in spinlock_release() on i386 and AMD64. > > great! > >> Due >> to it gcc builds with -O2 were very likely to misbehave (in strange >> and different ways) with many threads. > > alas, I get crashes with a debug (-O0) build, so this fix should not affect > them. Will investigate them. Last two weeks I am mostly testing. |