From: <don...@is...> - 2017-12-23 19:10:42
|
Bruno Haible writes:

 > > Consider (macroexpand-1'
 > >   (LOOP for (i j k) of-type (fixnum) across #())
 >
 > Who says that this should expand to something reasonable?
 >
 > Citing 6.1.1.7 "Destructuring"
 >   "When aligning the trees, an atom in the tree of type specifiers
 >    that matches a cons in the variable tree declares the same type
 >    for each variable in the subtree rooted at the cons."
 >
 > So this form declares I as being of type FIXNUM and J and K as
 > being of type NIL.

My interpretation of "an atom in the type specifier matches a cons in the
variable tree" would be fixnum matching (i j k) in
  (LOOP for (i j k) of-type fixnum ...)
so in that case i, j, k would all be declared fixnum, but in
  (LOOP for (i j k) of-type (fixnum) ...)
I would expect the atom fixnum to match only i, and I would NOT expect
of-type (fixnum) to mean the same as of-type (fixnum nil nil), so I would
expect j and k to be of unspecified type. |
From: Bruno H. <br...@cl...> - 2017-12-23 18:20:36
|
Jörg wrote on 2017-12-05:
> NIL is a valid type, cf. http://clhs.lisp.se/Body/t_nil.htm but, as useful as
> it is for reasoning about type hierarchies or (VECTOR NIL #), you don't want
> it as type declaration for a variable, because then you can't bind anything
> to that variable!

Correct.

> Consider (macroexpand-1'
>   (LOOP for (i j k) of-type (fixnum) across #())

Who says that this should expand to something reasonable?

Citing 6.1.1.7 "Destructuring":
  "When aligning the trees, an atom in the tree of type specifiers
   that matches a cons in the variable tree declares the same type
   for each variable in the subtree rooted at the cons."

So this form declares I as being of type FIXNUM and J and K as being of
type NIL.

> Both CLISP and the CMU loop code from the CMU AI repository generate forms like:
> -> (declare (type NIL j)) ; actually (type (or null NIL) j) in clisp
> That's useless.

Correct, that's useless, because the input was useless. The programmer
should have written
  (LOOP for (i j k) of-type (fixnum t t) across #())
or
  (LOOP for (i j k) of-type (fixnum . t) across #())

> NIL has a special meaning in destructuring:
> - as a variable name, that position is ignored (explicitly in CLtL2 and ANSI-CL);
> - as a type, both CLtL2 and ANSI-CL are silent.

Yes.

> So here's a patch for clisp to do the same (even though clisp mostly ignores types).
> With it, clisp and MIT LOOP destructuring look strikingly similar.

What's the point of changing clisp's LOOP implementation, in an aspect
where it is ANSI CL compliant, to be more like an older implementation
that predates ANSI CL?

Bruno |
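[Illustrative sketch: the tree-alignment rule from CLHS 6.1.1.7 that both messages above cite can be pictured as a small walker that pairs a variable tree with a type-specifier tree. The function name and output format are invented for this example; this is not clisp's implementation.]

(defun destructure-type-decls (var-tree type-tree)
  "Collect (TYPE type var) declaration specifiers by aligning VAR-TREE
with TYPE-TREE as in CLHS 6.1.1.7: an atomic type specifier matched
against a cons of variables applies to every variable in that subtree,
and where the type tree runs out the declared type is NIL."
  (cond ((null var-tree) '())
        ((symbolp var-tree)            ; a single variable
         (list `(type ,type-tree ,var-tree)))
        ((atom type-tree)              ; atom vs. cons: spread the atom
         (append (destructure-type-decls (car var-tree) type-tree)
                 (destructure-type-decls (cdr var-tree) type-tree)))
        (t                             ; cons vs. cons: align element-wise
         (append (destructure-type-decls (car var-tree) (car type-tree))
                 (destructure-type-decls (cdr var-tree) (cdr type-tree))))))

;; (destructure-type-decls '(i j k) 'fixnum)
;;   => ((TYPE FIXNUM I) (TYPE FIXNUM J) (TYPE FIXNUM K))
;; (destructure-type-decls '(i j k) '(fixnum))
;;   => ((TYPE FIXNUM I) (TYPE NIL J) (TYPE NIL K))   ; the reading cited above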
From: <don...@is...> - 2017-12-22 17:16:03
|
> That's OK, and I believe exactly this feature has been on the clisp > wish-list for as long as threads exist. > > My idea A was that, until the time when somebody will fulfill this > wish, we can live with minor updates to the current hash table code > just good enough to terminate iteration and to not crash. Ok, but that requires a lot of understanding of the current implementation. I was not even aware that the current implementation could crash in this case. > >Then an iteration would only have to keep track of the current key, > >right? It could lock the table in order to find the next key and > >then unlock the table. > Might work. Even if that key has been deleted meanwhile, the CDR of > the list can still point to the former remaining elements > (some of those may well have been deleted since ...). You're imagining holding on to the tail of the list. I don't think that would work, but it might depend on other details of how things are inserted and removed from the list. What I had in mind was much simpler - just remember which key you got last. In order to get the next one, compute its bucket, search that bucket in order until you find something ordered after the previous key. > Alternatively, the hash-table may weakly link to all iterators, so > their current index can be checked. Meaning that if you have a lot of iterations in progress any update can take a long time (while holding a lock) to adjust their states. > Maybe that's more in line with what Bruno wants: > Check during update that nobody is iterating at another position - > even when single-threaded. That implies that iterating through > *PACKAGE*, calling READ/INTERN is likely to fail, ouch! You mean if you just signal an error when there's another thread holding a lock you want. In addition to the huge cost of locking (I think of CAR as one instruction so I don't want to spend even one more instruction locking the cons cell and another one unlocking it), this seems to place an unreasonable burden on the user, unless he wants to get these errors all the time. I suppose you could run everything inside a handler that waits for the lock to become available and thereby end up with something close to what we expect, other than in performance. > >My specification says there should be some ordering of the keys that > >were in the table at the start of iteration along with those added > >during the iteration. > Why not: Push new at start, remove anywhere? > That too implies that once you are mid-list, you won't see new elements. The time the key was last added to the table makes a reasonable order. We can actually maintain a list of keys in this order - each entry in the table contains not only a key and value but also the next entry in this ordering. So we could always visit keys in the order they were last added (most recently added are visited first). The fact that this order changes during an iteration seems ok. We can define the order the iteration visits keys as: for those keys that were in the table at the start of the iteration and either never deleted or visited before they were first deleted, the inverse of the order in which they were added before the iteration started, and for those keys that were either in the table initially and then deleted before being visited in that order or those keys that were added without being in the table initially, the ordering by the time they were last added or deleted as of the end of the iteration (so they never get visited). 
This seems like it would even avoid any GC related problems. |
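[Illustrative sketch of the bookkeeping described in the message above: every entry also remembers the next entry in most-recently-added order, and stale links are skipped during traversal. All structure and function names here are invented and layered on a plain hash table purely for illustration; this is not clisp's internal representation, and it ignores locking and GC.]

(defstruct (ht-entry (:constructor make-ht-entry (key value next-added)))
  key value next-added)       ; NEXT-ADDED links entries, most recent first

(defstruct ordered-ht
  (table (make-hash-table :test #'equal))   ; key -> current HT-ENTRY
  (newest nil))                             ; head of the "last added" chain

(defun ordered-put (oht key value)
  "Add or re-add KEY; a re-added key moves to the front of the chain."
  (let ((entry (make-ht-entry key value (ordered-ht-newest oht))))
    (setf (gethash key (ordered-ht-table oht)) entry
          (ordered-ht-newest oht) entry)
    value))

(defun ordered-remove (oht key)
  "Remove KEY; its chain entry becomes stale and is skipped later."
  (remhash key (ordered-ht-table oht)))

(defun ordered-map (function oht)
  "Visit live entries in most-recently-added order, skipping stale links."
  (do ((entry (ordered-ht-newest oht) (ht-entry-next-added entry)))
      ((null entry))
    (when (eq entry (gethash (ht-entry-key entry) (ordered-ht-table oht)))
      (funcall function (ht-entry-key entry) (ht-entry-value entry)))))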
From: Sam S. <sd...@gn...> - 2017-12-22 16:56:06
|
Hi, > * <Wbret-Plevy.Ubruyr@g-flfgrzf.pbz> [2017-12-22 10:52:46 +0000]: > > Sam Steingold wrote: >> (and the committee did not reach a consensus because implementing this >> behavior is relatively hard > > Uh? Implementations are mentioned (old Genera, MIT LOOP) and they did no > particular magic. Yes, but those that did not do that, had a hard time implementing it. But this is not relevant. > Here we have some fundamental disagreement. I claim that LOOP should > not be hard because no used expects it to expand into some too clever > magic that would require special operators not yet invented that, Users don't care about how LOOP is implemented. They care that it works as naively expected based on its text. > would prevent code walking projects out there from processing > clisp's LOOP -- not a good idea. No, the special operator TAGLET will have a macroexpansion (based on your code) which will keep code walkers working. > - from (loop for x on list finally (return x) ; some want nil, others > the last restlist last restlist?! okay, I can imagine people wanting this. > - from (loop for x below N finally (return x) ; some want N-1 (undefined > if N<=0), others N! hah! here is an argument for nil above and N here: undefined is _never_ a good value, so, for consistency, it should always be N and nil. ;-) > But that's left for a future 2018 "About values in the LOOP epilogue". > I'll be off email for the next 3 weeks. Enjoy your vacation! >>> (loop for a = (princ 1) then b for b below (princ 2) do (princ (list a b))) >>> would print 21 by that proposal. >>This behavior does not stand the WTF test. > [...] >>if VARS is bound to non-NIL, this does not stand the WTF test. > > Please explain WTF. "What the f**k?!" this is the reaction you have when told that `(loop for x in '(1 2 3) collect x)` should return `(3 2 1)` ;-) > Is this test case not relevant? Don't you want to > assign semantics to this form? No, it's just that the user sees (princ 1) before (princ 2) and he will never accept "21" as output. Cf. this with the pandas sum([]) issue. Mathematically, the associativity of addition requires sum([])==0. However, they changed it to NA for some absurd reasons (consistency with SQL) - and that started a firestorm. We do not want to go the same path. LOOP users should never have the WTF feeling long enough to file a bug report ;-) >>Let us start with code samples and what they should return, not with >> "how we should macroexpand LOOP". > > You are right, I have code samples in mind constantly, but nowhere in > the text. I am glad we agree. The right sequence is: * test cases which we agree upon * spec which describes the tests * implementation that conforms to the spec This is similar to the scientific process: * experiments where we observe the nature * theory that predicts the experimental outcome * machine that uses the theory to perform useful work >> (let ((x '(a b c d))) (loop repeat 2 for x in x collect x)) >>==> (a b) > Sure. Where is the problem? I see no problem at all. That's the semantic > I want to give it. Yes, that's what I meant - this is the correct behavior. I am glad we agree! I would even add --8<---------------cut here---------------start------------->8--- (let ((x '(a b c d)) y) (list (loop repeat 2 for x in x collect x finally (setq y x)) y)) ==> ((a b) c) --8<---------------cut here---------------end--------------->8--- (although SBCL returns ((A B) B)). Thanks! 
-- Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.1504 http://steingoldpsychology.com http://www.childpsy.net http://memri.org http://thereligionofpeace.com http://iris.org.il http://no2bds.org Your mouse has moved - WinNT has to be restarted for this to take effect. |
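[Illustrative sketch of the "test cases first" step suggested above, encoding only the two behaviours stated in this message. The expected values come from the message itself (they describe the desired clisp behaviour, not what every implementation currently returns), and the function name is invented.]

(defun loop-scoping-smoke-tests ()
  ;; Each element is T when the implementation matches the desired behaviour.
  (list
   (equal (let ((x '(a b c d)))
            (loop repeat 2 for x in x collect x))
          '(a b))
   (equal (let ((x '(a b c d)) (y nil))
            (list (loop repeat 2 for x in x collect x finally (setq y x)) y))
          '((a b) c))))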
From: <Joe...@t-...> - 2017-12-22 15:39:36
|
Hi, Don wrote: >I'm expecting the hash table implementation to do whatever is necessary >in order to make programs written by users without adding their own >locks work as I think they are justified in expecting in the presence >of other threads. You expect some isolation level better than my A) "don't crash". >I've been assuming that a package IS a hash table and that when the >implementation of hash tables is fixed then packages will work. Not sure, packages may need their own lock to update the package object. The hash table therein is just one slot. But I've not looked at actual code, maybe that's all that is needed. >I don't understand the distinction between A and B. A) User code calls MAKE-HASH-TABLE, then lets N threads stamp on it. B) clisp uses hash-table internally, e.g. in CLOS or packages. Package access ought to be thread-safe, hence clisp will use locks internally. >[...] Or the user writing a program to >iterate in one thread while altering the table in another has to add >locks? I want this to be unnecessary in order to obtain the guarantee >I described. I understood that wish of yours, and that's where A and B collapse: You don't want a distinction, and even more than that: you don't want the need for locks just to walk and update hash-tables. That's OK, and I believe exactly this feature has been on the clisp wish-list for as long as threads exist. My idea A was that, until the time when somebody will fulfill this wish, we can live with minor updates to the current hash table code just good enough to terminate iteration and to not crash. >Then an iteration would only have to keep track of the current key, >right? It could lock the table in order to find the next key and >then unlock the table. Might work. Even if that key has been deleted meanwhile, the CDR of the list can still point to the former remaining elements (some of those may well have been deleted since ...). Alternatively, the hash-table may weakly link to all iterators, so their current index can be checked. Maybe that's more in line with what Bruno wants: Check during update that nobody is iterating at another position - even when single-threaded. That implies that iterating through *PACKAGE*, calling READ/INTERN is likely to fail, ouch! >My specification says there should be some ordering of the keys that >were in the table at the start of iteration along with those added >during the iteration. Why not: Push new at start, remove anywhere? That too implies that once you are mid-list, you won't see new elements. >I think you're saying that you DO have a solution for that >particular algorithm but you're not satisfied with its performance. Not at all, I'm sorry. I barely have time to think about LOOP. >As I said above, I don't know how they work in any detail. Neither me. I was just expressing an idea (that Bruno would dislike:) I apologize for creating such confusion, Jörg |
From: <Joe...@t-...> - 2017-12-22 13:45:36
|
Hi, > (let ((x '(a b c d))) (loop repeat 2 for x in x collect x)) >==> (a b) Sure. Where is the problem? I see no problem at all. That's the semantic I want to give it. I still like showing the macroexpansion, because it allows to express the semantics I want to give to loop initialization. (let ((x '(a b c d))) (block nil (let ((collector ())) (let ((_from 1) (_to 2)) ; left to right (let ((x _from)) ; second pass (let (y) ; initialization delayed until body (tagbody ... Whether x and _from shall collapse is to be debated separately, below. Here's some vocabulary I made up: Driver Variable(s) The main variable(s) that permit iteration, e.g. the array index FOR # ACROSS. One may or not consider the vector that is constant through the iteration as a driver variable too. Derived Variables Those are derived from driver variables, e.g. all variables in a true destructuring pattern. A "true" destructuring pattern is a list, i.e. either nil or a cons, not another symbol. Named variables Those explicitly named by the user of LOOP. Note that this also includes collecting INTO # etc. but that's irrelevant for now. Clearly in FOR x IN ..., x is nevertheless a derived variable of some internal driver variable necessary to hold the list whose CAR is set to x during iteration. FOR x ON ... is peculiar, because users expect x to *be* the driver variable. So we shall meet that expectation as a special rule. Regarding for-as-arithmetic, it can be debated whether we should make driver and named variable collapse. That is related to the FINALLY discussion. A) If they don't collapse, then we can assert (for x of-type (or (integer 2 10) (eql 0)) from 2 to 10) (for x of-type (or (integer 2 (10)) (eql 0)) from 2 below 10) because the internal driver variable will be tested against the limits and x will be stepped only when the iteration does not terminate. Hence that is the value that will *likely* be available within FINALLY. I said "likely" because some other clauses may cause termination even earlier. (EQL 0) is needed because of the default binding of x. B) Many people (clearly *not* everybody) expect them to collapse. Consequently, that variable is *likely* to be stepped beyond limits (for x of-type (integer 0 (10)) from 0 below 10) ; x is likely to end at 10 I said "likely", because one could implement the test as follows (when (>= (+ x _step) _limit) (loop-finish)) (setq x (+ x _step)); i.e. perform the addition twice (or use anaphoric if to avoid that duplication, or C) cache it directly (setq _helpvar (+ x _step)) (when (>= _helpvar _limit) (loop-finish)) (setq x _helpvar) (That is almost, but not completely undistinguishable from A) Then one can assert the type (integer 2 (10)), without (eql 0) -- well mostly. Consider the degenerate case (for x from 2 below 2). X would be bound to 2, but it's type would not satisfy (integer 2 (2)), but merely (integer 2 2). For completeness, collapsing can only occur when the variable is named. (loop for nil below 2 count t) ; is valid, ==>2 In summary the choice is ours. That's three "proposals" just to discuss for-as-arithmetic! My preference is 1. Dependability, i.e. provide understandable semantics that people can depend upon. 2. Performance - don't make it overly complicated for perceived over-correctness. Hence I'd directly step x over limits, then perform the termination test. 
Incidentally, that's what clisp has been doing for decades :-) Benefits: - (A/B/C) Execution model and effects are simple to understand - (B/C) x gets a defined value right from the start (FROM/DOWNFROM), not an arbitrary 0. Heck, perhaps I should favour C. The implementation is minimally slower (hopefully 1 bytecode), however it's Benefit: - (C) Type declaration given by bounds (except in degenerate case) (for x of-type (integer 2 10) from 2 to 10) (for x of-type (integer 2 (10)) from 2 below 10) But clisp doesn't care about the type, so why make it more complicated? Hence B is a perfect choice for clisp, while C might be better for CMUCL. C *might* have another benefit: - (C) Even easier to explain FINALLY value: "last value", independent on the ordering of termination tests. But that's too early, we need to discuss sequencing/interleaving of stepping and termination tests first. I'll be back mid-January, hopefully with code. Take care, Jörg |
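[Illustrative sketch of option C for a concrete form, (loop for x from 2 below 10 sum x): the next value is computed into a helper, tested against the limit, and only then stored into X, so X is never stepped past the limit. The helper names, tags, and the placement of the initial test are invented; this is not clisp's actual expansion.]

;; Option C, hand-written: %LIMIT, %STEP, %NEXT and the tags are made-up names.
(block nil
  (let ((x 2) (%limit 10) (%step 1) (%sum 0))
    (tagbody
       (when (>= x %limit) (go :end))   ; initial termination test
     :body
       (incf %sum x)
       (let ((%next (+ x %step)))       ; test the helper, then store into X
         (when (>= %next %limit) (go :end))
         (setq x %next))
       (go :body)
     :end)
    (return %sum)))
;; => 44, and X is still 9 (not 10) when the loop finishes.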
From: <don...@is...> - 2017-12-22 13:22:09
|
Joe...@t-... writes: > A) The idea is to give a minimum set of guarantees when users are > programming in the wild, without locks. You mean without adding their own locks or without the hash table implementation adding internal locks? I'm expecting the hash table implementation to do whatever is necessary in order to make programs written by users without adding their own locks work as I think they are justified in expecting in the presence of other threads. > Basically: no crash, no endless iteration due to sudden cycles in internal data structures. > Perhaps attempt to detect that and raise an error as Bruno suggested? > > B) Other parts would use locks in order to give stronger guarantees. I think you mean that the clisp hash table implementation would add locks in order to make things work as expected. That's what I want. > For instance, clisp would internally use locks when updating or > iterating a package's hash table, because I've been assuming that a package IS a hash table and that when the implementation of hash tables is fixed then packages will work. > Therefore, it does not make sense to raise the minimum set of > guarantees for the A case. I don't understand the distinction between A and B. > Of course, you can try and make A and B collapse using algorithms > for concurrent hash-tables. I have some doubts that this can work > because e.g. the package example shows that often enough you need > locks (or rather, critical sections) above and surrounding the > level where you have atomic/concurrent updates to individual > parts. The hash tables may support concurrent updates, but they are > only part of the package object which (likely) has more invariants > and postconditions. I'm missing something. Show me an example of what you're talking about above. You mean the lisp implementation has to lock things internally? That's fine with me. Or the user writing a program to iterate in one thread while altering the table in another has to add locks? I want this to be unnecessary in order to obtain the guarantee I described. If you need something stronger then you might have to add locks. For instance, you might want to guarantee that an entire iteration is done in the same hash table state. And that would require user level locks. > I have not analyzed your algorithm in sufficient detail to > determine whether it could work as a concurrent hash table I have not described an algorithm, only a requirement. I expect the requirement can be satisfied by various implementations, and I hope some of them are even efficient and not too complicated. > implementation. But that's what it wants to be. My proposal was way > more limited: just use the present, non concurrent code, add a few > checkpoints / atomics / fences to ensure it will terminate. But I I don't know enough about the current code to know what can go wrong now or what has to be added in order to have what effects. > don't see how a simple non-concurrent bucket-based algorithm could > prevent keys from disappearing from an enumeration in the presence > of parallel deletion of other keys. That's precisely why CL forbids > such modifications. Again I'm failing to understand your problem. As a simple example, suppose we have a stable ordering of all keys, and that a hash table always has the same number of buckets, with a stable function mapping keys to buckets, and that all keys in the same bucket are stored in an ordered list (sorted by the ordering). Then an iteration would only have to keep track of the current key, right? 
It could lock the table in order to find the next key and then unlock the table. My specification says there should be some ordering of the keys that were in the table at the start of iteration along with those added during the iteration. If the iteration has an order on buckets, and within buckets an order on keys, then that gives us the required ordering: X < Y if either X belongs in an earlier bucket than Y according to the stable function mapping keys to buckets, or if that function maps both to the same bucket and X precedes Y in the stable ordering of all keys. I realize that GC is likely to have an important impact on all of this. I expect it already does in the current implementation. For the moment I leave that as a later issue. The above is only intended as an example to illustrate the concept. Maybe it can be extended to work acceptably in practice, maybe not. > One could RPLACA the deleted element with #<UNBOUND> to preserve > the integrity of the bucket list. But that would be a burden > imposed on all tables, not exactly nice. I can sort of understand that in the context of some particular algorithm, or type of algorithm, but I don't think it's worth while to try to respond without knowing exactly what algorithm you have in mind. I think you're saying that you DO have a solution for that particular algorithm but you're not satisfied with its performance. Maybe that particular performance problem can be fixed, or maybe a different algorithm would be better. If you're hoping to make a small change to the current hashing algorithms in clisp, I can understand that desire, but since they were probably not designed with this in mind, it might be easier to change them. As I said above, I don't know how they work in any detail. |
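[Illustrative sketch of the scheme described above: a fixed number of buckets, keys kept sorted within each bucket, and an iterator that only remembers the last key it returned. Everything here - the names, the restriction to string keys, the bucket count - is invented for the example, and locking and GC are ignored entirely.]

(defconstant +buckets+ 8)

(defun bucket-of (key) (mod (sxhash key) +buckets+))

(defun make-toy-table () (make-array +buckets+ :initial-element '()))

(defun toy-put (table key value)
  "Insert or update a string KEY, keeping each bucket sorted by STRING<."
  (let* ((b (bucket-of key))
         (cell (assoc key (aref table b) :test #'string=)))
    (if cell
        (setf (cdr cell) value)
        (setf (aref table b)
              (merge 'list (list (cons key value)) (aref table b)
                     #'string< :key #'car)))
    value))

(defun toy-next (table last-key)
  "Return the first (key . value) ordered after LAST-KEY, or NIL when done.
LAST-KEY = NIL asks for the very first entry.  The order is: bucket index
first, then STRING< within a bucket, so it is stable across updates."
  (let ((start (if last-key (bucket-of last-key) 0)))
    (loop for b from start below +buckets+ do
      (loop for entry in (aref table b) do
        (when (or (null last-key)
                  (> b start)
                  (string< last-key (car entry)))
          (return-from toy-next entry))))
    nil))

;; (let ((tbl (make-toy-table)))
;;   (toy-put tbl "b" 2)
;;   (toy-put tbl "a" 1)
;;   (loop for entry = (toy-next tbl nil) then (toy-next tbl (car entry))
;;         while entry collect entry))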
From: <Joe...@t-...> - 2017-12-22 10:52:54
|
Hi,

Sam Steingold wrote:
> (and the committee did not reach a consensus because implementing this behavior is relatively hard

Uh? Implementations are mentioned (old Genera, MIT LOOP) and they did no
particular magic.

Here we have some fundamental disagreement. I claim that LOOP should not
be hard because no user expects it to expand into some too clever magic
that would require special operators not yet invented that, besides, would
prevent code walking projects out there from processing clisp's LOOP --
not a good idea. If it mustn't be hard, we just need to find the good
compromises, i.e. (re)ordering vs. dependability.

Meeting user expectations will be especially hard when discussing FINALLY.
I've seen wishes on either side:
- from (loop for x on list finally (return x)) ; some want nil, others the last restlist
- from (loop for x below N finally (return x)) ; some want N-1 (undefined if N<=0), others N!
But that's left for a future 2018 "About values in the LOOP epilogue".
I'll be off email for the next 3 weeks.

>> (loop for a = (princ 1) then b for b below (princ 2) do (princ (list a b)))
>> would print 21 by that proposal.
> This behavior does not stand the WTF test.
[...]
> if VARS is bound to non-NIL, this does not stand the WTF test.

Please explain WTF. Is this test case not relevant? Don't you want to
assign semantics to this form? To me there's a difference between portable
(CL) semantics and implementation-defined semantics. E.g. within FINALLY,
a variable's value may not be portably defined, yet we can document "In
clisp's loop the named driver variable will have value ... in FINALLY".

> Let us start with code samples and what they should return, not with
> "how we should macroexpand LOOP".

You are right, I have code samples in mind constantly, but nowhere in the
text.

>> - Disallows LOOP-FINISH within INITIALLY, as the (GO epilogue) tag is not visible.
> No, people expect this.

You're right. LOOP-FINISH needs to work within INITIALLY (trivial if it's
within the one TAGBODY).

> I must admit that I did not study your proposals in detail yet,

Sure. It's long and not particularly elaborate. I had sketched an even
longer text, found it too academic and rewrote parts into that "proposal"
form.

> but it would be nice if you could augment it with actual code:

That's what I wanted to avoid. I don't want to implement LOOP a dozen
times, then vote on "the best". I wanted to discuss the semantics, then
implement them.

> (let ((x '(a b c d))) (loop repeat 2 for x in x collect x))
> ==> (a b)

Sure. Where is the problem? I see no problem at all. That's the semantic
I want to give it.

Regards,
Jörg |
From: <Joe...@t-...> - 2017-12-22 10:27:33
|
Hi, Don Cohen wrote: >I don't think it's ok to miss keys that were present at the >beginning of the iteration and never removed. >Nor do I think it's ok to present the same key twice in any case. >I do agree that the iteration should not crash. There's what I wrote, and there's another half I didn't write: A) The idea is to give a minimum set of guarantees when users are programming in the wild, without locks. Basically: no crash, no endless iteration due to sudden cycles in internal data structures. Perhaps attempt to detect that and raise an error as Bruno suggested? B) Other parts would use locks in order to give stronger guarantees. For instance, clisp would internally use locks when updating or iterating a package's hash table, because 1) without them you can hardly guarantee uniqueness; 2) you want unwarranted (not thread-aware) apps to obtain stable (or shall I say sensible?) results when iterating across e.g. the keyword package. Therefore, it does not make sense to raise the minimum set of guarantees for the A case. Of course, you can try and make A and B collapse using algorithms for concurrent hash-tables. I have some doubts that this can work because e.g. the package example shows that often enough you need locks (or rather, critical sections) above and surrounding the level where you have atomic/concurrent updates to individual parts. The hash tables may support concurrent updates, but they are only part of the package object which (likely) has more invariants and postconditions. I have not analyzed your algorithm in sufficient detail to determine whether it could work as a concurrent hash table implementation. But that's what it wants to be. My proposal was way more limited: just use the present, non concurrent code, add a few checkpoints / atomics / fences to ensure it will terminate. But I don't see how a simple non-concurrent bucket-based algorithm could prevent keys from disappearing from an enumeration in the presence of parallel deletion of other keys. That's precisely why CL forbids such modifications. One could RPLACA the deleted element with #<UNBOUND> to preserve the integrity of the bucket list. But that would be a burden imposed on all tables, not exactly nice. Regards, Jörg |
From: Sam S. <sd...@gn...> - 2017-12-21 17:51:16
|
Hi, > * <Wbret-Plevy.Ubruyr@g-flfgrzf.pbz> [2017-12-21 16:47:12 +0000]: > > [I'v been much more involved with specifications than actual code in > recent years. And it shows...] ;-) My problem with your proposal is that you do not offer the code what works / breaks with each proposal. IOW, you think in terms "how do we macroexpand various LOOP forms and what behavior can be expected from such approaches". I think in terms "how the code is expected to behave from the user POV", ignoring mental macroexpand. Let us start with code samples and what they should return, not with "how we should macroexpand LOOP". Standards and specs codify user expectations. Implementing behavior that surprises a reasonable user is not a good idea from the usability POV. > As the discussion around issue 222 shows, the committee did not reach > a consensus, to the texts were not changed. The ancient MIT LOOP > implements this behaviour. For a good reason... (and the committee did not reach a consensus because implementing this behavior is relatively hard and the maintainers of existing implementations do not want to bear the costs). Note that it is unclear how do implement it in a way that things like these behave as the user expects: (let ((x '(a b c d))) (loop repeat 2 for x in x collect x)) ==> (a b) (let ((x '(a b c d))) (loop for x from 1 to 2 for y = x collect y)) ==> (1 2) > (loop for a = (princ 1) then b for b below (princ 2) do (princ (list a b))) > would print 21 by that proposal. This behavior does not stand the WTF test. > (loop for vars on vars for x = (error) then 2 return vars) would not error out. if VARS is bound to non-NIL, this does not stand the WTF test. > - Disallows LOOP-FINISH within INITIALLY, as the (GO epilogue) tag is not visible. > But if you need to exit that early, why not use (RETURN-FROM named x)? No, people expect this. > (LOOP FOR a = (princ 1) FOR vars ON (princ vars) RETURN (list a b)) > would expand to something containing > (let ((A nil)) > (flet ((#:on-init () (princ VARS))) > (let ((VARS nil)) > (tagbody ... (setq A (princ 1)) (setq VARS (#:on-init)) ...)))) This looks suspiciously like lasy-let ;-) > Live with bug #667 and bug #375 and a few others I've not yet written about. No, these bugs violate user expectations. I must admit that I did not study your proposals in detail yet, but it would be nice if you could augment it with actual code: > | Case \ Proposal | 1:INIT-FORM | 2:FIRST | 3:FLET | 4:QUO | > |-----------------------+-------------+--------------+---------+-------------| > | vars on vars | yes | yes | yes | no | > | form1 in lex. env. | no | yes | yes | yes | > | left to right | strict | except first | strict | strict | > | ready in initially | yes | yes | yes | interleaved | > | initially interleaved | compatible | compatible | ? | yes | > | overhead | lowest | low | highest | middle | The descriptions are not very clear. Could you please give code snippets and their behaviors for each proposal? -- Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.1504 http://steingoldpsychology.com http://www.childpsy.net https://jihadwatch.org http://islamexposedonline.com http://www.memritv.org http://think-israel.org Professionalism is being dispassionate about your work. |
From: Sam S. <sd...@gn...> - 2017-12-21 17:15:00
|
Jörg, > * <Wbret-Plevy.Ubruyr@g-flfgrzf.pbz> [2017-12-21 16:45:16 +0000]: > > Your lazy-let reminds me of the trouble Pythonistas have with the > local/global variable dichotomy and the `global` statement. That made > it into the Python programming FAQ. IIRC, setting a variable first > defines it in the local environment; only reading one sees the global > environment. > https://docs.python.org/3/faq/programming.html#what-are-the-rules-for-local-and-global-variables-in-python No, this is very different. The Python global/local problem is because the way a variable (first) _used_ affects how it works. My LAZY-LET merely separates initialization from binding. I.e., I allow code between binding and initialization to ignore the binding. The idea is to make the macroexpansion of loop behave the way people perceive loop to operate. (macroexpand '(loop ....)) looks like this: --8<---------------cut here---------------start------------->8--- (MACROLET ((LOOP-FINISH NIL (SYSTEM::LOOP-FINISH-ERROR))) (BLOCK NIL (LET ((...)) (PROGN (LET ((...)) (LET NIL (LET ((...)) (LET NIL (MACROLET ((LOOP-FINISH NIL '(GO SYSTEM::END-LOOP))) (TAGBODY (SETQ ...) SYSTEM::BEGIN-LOOP ....)))))))))) --8<---------------cut here---------------end--------------->8--- most of those LET outside of TAGBODY are _not_ supposed to initialize the variables they bind. > I believe the Python lesson clearly is to *not* re-invent that trouble > in Lisp. TAGLET is _not_ supposed to be a public interface, just an internal tool to implement LOOP. It closely corresponds to how people read LOOP, so it has the best chance to result in code doing what the user expects. -- Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.1504 http://steingoldpsychology.com http://www.childpsy.net http://www.dhimmitude.org http://thereligionofpeace.com http://honestreporting.com http://jij.org Whom computers would destroy, they must first drive mad. |
From: <Joe...@t-...> - 2017-12-21 16:48:22
|
Hi, [I'v been much more involved with specifications than actual code in recent years. And it shows...] clisp generates almost perfect code for WITH and AND WITH clauses. Witness: (macroexpand-1'(loop with (a b) = (foo) and (nil c d) = (bar))) (BLOCK NIL (LET ((#:PATTERN-4434 NIL)) (LET ((A (CAR (SETQ #:PATTERN-4434 (FOO)))) (B (CAR (CDR #:PATTERN-4434))) (C (CAR (SETQ #:PATTERN-4434 (CDR (BAR))))) (D (CAR (CDR #:PATTERN-4434)))) #))) Note how thanks to LET, all named variables a b c d are evaluated in a context that may refer to outer bindings of these variables. Hence (loop WITH a = (foo a) ...) works as one would expect in Common Lisp (or any LISP with LET). What I propose for LOOP in clisp is to apply the exact same scoping rules that govern WITH (possibly joined by AND) to named variables in FOR clauses. That will make (LOOP FOR vars ON vars ...) work. So let's look at FOR clauses. I call init forms all those that are to be executed *once* before iteration begins, in order to supply initial values or limits valid across the whole iteration. FOR # IN init-form BY init-form FOR # ON init-form BY init-form FOR # ACROSS init-form FOR # BEING EACH/THE HASH-KEYs IN/OF init-form FOR # BEING EACH/THE [PRESENT/EXTERNAL-]SYMBOLs IN/OF init-form FOR # BY init-form UPTO/BELOW init-form FROM init-form REPEAT init-form Note that the repeatedly evaluated form in: FOR # = form is *not* an init-form. Is this obvious to everybody? There's one final case that needs close attention: FOR # = arguably-init-form THEN later Here one can argue about the context in which to evaluate the form that provides the value in the first iteration. Half of the following text deals with that exception. Proposal: FOR-AS-EQUALS-THEN-HAS-INIT-FORM Consider for-as-equals-then as FOR # = init-form THEN form i.e. apply aforementioned handling about variable scoping and init forms. Benefits: - Named variables are immediately bound, hence values are readily available in every subsequent clause as well as INITIALLY clauses. - left-to-right evaluation across all init-forms is globally preserved. - Easy to grasp and to explain. - Implements ANSI issue 222 (which didn't reach consensus). http://clhs.lisp.se/Issues/iss222_w.htm However, this proposal appears incompatible with CLHS 6.1.1.4: "form1 and form2 in for-as-equals-then form include the lexical environment of all loop variables." Sad, very sad. They should have distinguished for-as-equals-then and for-as-equals. As the discussion around issue 222 shows, the committee did not reach a consensus, to the texts were not changed. The ancient MIT LOOP implements this behaviour. We can consider implementing issue 222 for the sake of clarity and dependability. With it, the lexical environment of init-forms becomes as clear as in DO/DO*/DOLIST. Proposal: FOR-AS-EQUALS-FIRST-THEN-LATER Consider for-as-equals-then as FOR # = form1 THEN form2 Because per CLHS 6.1.1.4 (quoted above), first-iteration need be evaluated in an environment including all loop variables, the implementor of clisp's current loop IMHO wrongly concluded that all following FOR clauses (here including those joined by AND) must be evaluated there as well, to respect left to right evaluation order requirements, causing known bugs and scoping rules very hard to explain or even understand, as scoping now depends on former clauses. Instead, this proposal follow CLHS by the letter to "initialize the variables [...] by setting it to the result of evaluating form1 on the first iteration, then ..." -- Initialize, not bind! I.e. 
bind to NIL, later initialize to form1. That definition can work, considering that LOOP users need be aware of code motion anyway -- Think how (LOOP INITIALLY (princ 2) WITH a = (princ 1) RETURN a) prints 12 (can anybody reveal an implementation that produces another output?). In effect, it will be very similar to for-as-in-list, for-as-across or for-as-hash: - The named variables need default bindings when declared; - It is only upon entering iteration that actual values will be supplied. (loop for a = (princ 1) then b for b below (princ 2) do (princ (list a b))) would print 21 by that proposal. Benefits are: - The lexical environment requirement in CLHS 6.1.1.4 is obeyed to the letter. One can argue that this rule does not make much sense without *additional* rules about values of variables within that environment, i.e. to what values have other named loop variables been initialized? Have INITIALLY clauses been evaluated already (see below FOR-AS-EQUALS-THEN[-VERY]-LATE)? >From an implementation POV, one may try hard to avoid an extra control variable "is-first-iteration" by initializing before entering the loop (as in INITIALLY) and stepping at the end of each iteration. However, that interferes with observable values in FINALLY clauses, I'll not talk about that now. Proposal: FOR-AS-EQUALS-THEN-EARLY In for-as-equals-then, initialize the variables prior to any INITIALLY. (yet still in full lexical scope of all loop variables in the loop prologue). Benefits: - Named variables are ready for use within INITIALLY. - which distinguishes it from for-as-equals without THEN, so the user has the choice. Proposal: FOR-AS-EQUALS-THEN-LATE In for-as-equals-then, initialize the variables just upon entering the loop body. INITIALLY clauses will only see default values. Benefits: - Very close to for-as-equal without then, where I don't plan at all to duplicate the step form in the loop prologue and the loop body. Disadvantages: - Precisely that it's late, so INITIALLY only ever sees default values. Proposal: FOR-AS-EQUALS-THEN-VERY-LATE In for-as-equals-then, initialize the variables within the loop body, after evaluating the termination tests of preceding FOR-AS clauses. clisp did/does this. (loop for vars on vars for x = (error) then 2 return vars) would not error out. Benefits: - Obeys the letter of CLHS: "setting it ... on the first iteration". Disadvantages: - INITIALLY only ever sees default values. - Even FINALLY may only see the default, not form1. (And you haven't even seen my proposals about the loop body :-) Proposal: FOR-AS-INITIALLY-INTERLEAVE-LET The LET forms resulting from FOR-AS initforms and WITH clauses could alternate easily. That would make it trivial to globally respect left to right evaluation across FOR-AS, WITH and even INITIALLY clauses. This is not strictly conforming to CLtL2 or CLHS. They require INITIALLY to lie in the loop prologue, which is inside the TAGBODY, hence after all LET forms, instead of amid LET forms. Benefits: - Global left to right evaluation order between INITIALLY, WITH and FOR-AS initforms - Outer variables are still accessible to early WITH clauses - Easy to explain (yet different from CLtL2 & ANSI-CL) Disadvantage - Not the letter of CLtL2 or CLHS. - Disallows LOOP-FINISH within INITIALLY, as the (GO epilogue) tag is not visible. But if you need to exit that early, why not use (RETURN-FROM named x)? Proposal: FOR-AS-INITIALLY-GROUP-PROLOGUE Group all INITIALLY clauses in the prologue, following all variable initializations. 
Benefits: - Values of driver variables readily accessible in INITIALLY - Easy to explain (after grasping code movement) Disadvantage: - Code movement, hence no global left to right evaluation order between INITIALLY and FOR-AS/WITH initforms Do you interpret ANSI-CL to expect a global order of evaluation across INITIALLY and FOR/AS-WITH clauses? Proposal: FOR-AS-INITIALLY-INTERLEAVE-PROLOGUE Use dummy LET bindings, assign variables later in the prologue, which enables an interleaving with INITIALLY clauses, hence benefitting left to right evaluation. This is somehow what clisp's LOOP currently does. Benefits: - Global left to right evaluation order between INITIALLY and FOR-AS/WITH initforms Disadvantages: - LOOP for VARS on VARS won't work (without further tricks, e.g. see FLET below). - INITIALLY evaluates in an environment including all variables, yet values are not yet available, depending on clause order, so it's somehow unreliable. Proposal: FOR-AS-EQUALS-THEN-INTERLEAVE Interleave the initialization of for-as-equals-then variables with INITIALLY forms. Benefits: - left to right evaluation is obeyed. Disadvantages: - It may appear inconsistent to obey left to right with for-as-equals-then only, when all other for-as forms init-forms are evaluated and bindings were established long before any INITIALLY. Proposal: FOR-AS-EQUALS-THEN-FLET-FOR-SCOPE Maintain the complexity in clisp's current loop, and fix some scoping bugs by adding even more complexity. FLET can be used to evaluate init-forms in an outer scope not comprising all loop variables. (LOOP FOR a = (princ 1) FOR vars ON (princ vars) RETURN (list a b)) would expand to something containing (let ((A nil)) (flet ((#:on-init () (princ VARS))) (let ((VARS nil)) (tagbody ... (setq A (princ 1)) (setq VARS (#:on-init)) ...)))) Benefits: - At least the scoping rules become explainable. - Visible order of evaluation is obeyed. Disadvantage: - Poor performance, likely needs closure. Proposal: FOR-AS-PROLOGUE-FIX clisp's current loop contains complicated logic in an attempt to satisfy requirements on lexical environments, global left to right evaluation, and optimization ideas (binding with actual values is faster than producing dummy bindings and initializing later). It uses LET bindings first, then degrades to (P)SETQ inside TAGBODY. Fix that logic not to degrade in unexpected (and unwarranted) cases. Benefits: - Visible order of evaluation is globally obeyed. Disadvantages: - (loop FOR x = # THEN # FOR vars ON vars ...) becomes unable to satisfy user expectations, e.g. bug #375; (loop FOR vars ON vars FOR x = # THEN # ...) would work as expected(!). - initforms sometimes evaluated in an environment including all variables, sometimes not. How to document and explain that to users? Proposal: STATUS-QUO Live with bug #667 and bug #375 and a few others I've not yet written about. What proposal to choose? In a 2005 posting to comp.lang.lisp, Kent M. Pitman wrote: "And there are places like LOOP where people disagree in what I'd bet is a useless dispute because I think LOOP, while complicated, is not so complicated that we couldn't force implementations to agree _enough_ that the ability to define loop paths could be portably achieved without infringing the part of LOOP that should _not_ be exposed, which is what is the right and most efficient expansion for any given implementation in order to benchmark well. The more you try to pin down that latter part, the more you just require Common Lisp to be slow. 
Some parts are meant to be kept abstract." So the intension seems to be to make LOOP fast by not overspecifying it. That pretty much rules out the overhead of the FLET trick IMHO. Does it imply that the few requirements that ANSI-CL imposes upon LOOP become even more important to obey strictly? | Case \ Proposal | 1:INIT-FORM | 2:FIRST | 3:FLET | 4:QUO | |-----------------------+-------------+--------------+---------+-------------| | vars on vars | yes | yes | yes | no | | form1 in lex. env. | no | yes | yes | yes | | left to right | strict | except first | strict | strict | | ready in initially | yes | yes | yes | interleaved | | initially interleaved | compatible | compatible | ? | yes | | overhead | lowest | low | highest | middle | I'm sorry this table is very incomplete. This text is so long already, and there's no TL;DR. In case you didn't guess ;-) my vote is FOR-AS-EQUALS-FIRST-THEN-LATER together with + FOR-AS-EQUALS-THEN-EARLY ; i.e. for-as-equals-then is almost an init-form like the others + FOR-AS-INITIALLY-GROUP-PROLOGUE Benefits: - FOR vars ON vars never sees NIL. - FOR a = form1 [then form2] evaluates both forms in an env. including all variables. - Well defined values within INITIALLY. - Easy enough to document - Shall be performant by favouring LET over additional SETQ; i.e. I'm biased towards performance thanks to reordering, left to right evaluation order across all LOOP clauses. Regarding a future Lisp, my vote would be FOR-AS-EQUALS-THEN-HAS-INIT-FORM ; i.e. implement issue 222 + FOR-AS-INITIALLY-INTERLEAVE-LET even though that means that LOOP-FINISH is not available in INITIALLY. If you favour global left to right evaluation order, I believe your vote will be FOR-AS-PROLOGUE-FIX + FOR-AS-INITIALLY-INTERLEAVE-PROLOGUE + FOR-AS-EQUALS-THEN-INTERLEAVE and maybe include FOR-AS-EQUALS-THEN-VERY-LATE Regards, Jörg |
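[Illustrative sketch, to be read with all the caveats above: one hand-written guess at the shape of an expansion under FOR-AS-EQUALS-FIRST-THEN-LATER for (loop for a = (princ 1) then b for b below (princ 2) do (princ (list a b))). The variable names, tags, and exact placement of the termination tests are invented; the only point is that the limit form (princ 2) runs at binding time while form1 (princ 1) runs on loop entry, giving the "21" output mentioned above.]

;; Invented names; only the order of the PRINC calls matters here.
(block nil
  (let ((a nil))                     ; A bound early, initialized later
    (let ((b 0)
          (%limit (princ 2)))        ; B's limit form evaluated at binding time
      (tagbody
         (setq a (princ 1))          ; form1 evaluated when iteration starts
       :next
         (when (>= b %limit) (go :end))
         (princ (list a b))
         (setq a b)                  ; the THEN form
         (incf b)
         (go :next)
       :end))))
;; Output: 21(1 0)(0 1)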
From: <Joe...@t-...> - 2017-12-21 16:45:25
|
Sam,

Your lazy-let reminds me of the trouble Pythonistas have with the
local/global variable dichotomy and the `global` statement. That made it
into the Python programming FAQ. IIRC, setting a variable first defines it
in the local environment; only reading one sees the global environment.
https://docs.python.org/3/faq/programming.html#what-are-the-rules-for-local-and-global-variables-in-python

I believe the Python lesson clearly is to *not* re-invent that trouble in
Lisp.

Regards,
Jörg |
From: <don...@is...> - 2017-12-20 22:00:05
|
Sam Steingold writes: > > * Don Cohen <qba...@vf...3-vap.pbz> [2017-12-20 20:06:46 +0000]: > > > > Suppose an iteration over a table starts in a state with the table > > empty, and another thread performs the following operations in the > > following order: > > operation resulting state > > add A => 1 (A=>1) > > add B => 2 (A=>1,B=>2) > > remove A (B=>2) > > add A => 3 (B=>2,A=>3) > > remove B (A=>3) > > add B => 4 (A=>3,B=>4) > > and now the iteration returns. > > Suppose the iteration simply collects and returns a list of all the > > keys and values in the order visited. > > The following results from the iteration are allowable: > > NIL - the iteration could have finished before the first add > > all of the resulting states listed above are allowed when > > interpreted as iterations results, since the entire iteration > > could have been performed in that state > > (A=>1,B=>4), even though the table was never in such a state, > > because it could visit A right after A was added and B just > > before the end) > > (B=>4) is allowed because A might have been visited during the > > time it was not present, and then B visited at the end. > > > > However (B=>4,A=>1) is NOT allowed, because after B became > > related to 4, A was never related to 1. > > I cannot imagine how you can prevent this. > Suppose the iteration collected A=>1 (after 1st or 2nd step) but, before > it could terminate, the 1st thread did all its work. > Now iteration sees that the key vector has length 2, it saw the 1st > element and now it processes the second element which is B which points > to 4. > Thus iteration will collect B=>4,A=>1. You're missing the point that the result is being returned in visited order. If it visits A first and gets A=>1, then later visits B and gets B=>4 it returns the list (A=>1,B=>4), NOT the list (B=>4,A=>1). In order to return (B=>4,A=>1) it would have to first visit B and find 4 as the value, and then later visit A and find 1 as the value. And this is incompatible with the history of states. Notice that the "state" descriptions are unordered sets of key/value pairs, but the iteration results are ordered. > > all of the resulting states listed above are allowed when > > interpreted as iterations results, since the entire iteration The "interpreted as" above means that the unordered state (A=>3,B=>4) is interpreted as the result of first visiting A and then B. In fact, any of the orderings of any of the actual states would be allowed as iteration results since the entire iteration could be done in that state in any order. However (A=>1,B=>4) is outside of that class. |
From: Sam S. <sd...@gn...> - 2017-12-20 21:31:58
|
> * Don Cohen <qba...@vf...3-vap.pbz> [2017-12-20 20:06:46 +0000]:
>
> Suppose an iteration over a table starts in a state with the table
> empty, and another thread performs the following operations in the
> following order:
>   operation    resulting state
>   add A => 1   (A=>1)
>   add B => 2   (A=>1,B=>2)
>   remove A     (B=>2)
>   add A => 3   (B=>2,A=>3)
>   remove B     (A=>3)
>   add B => 4   (A=>3,B=>4)
> and now the iteration returns.
> Suppose the iteration simply collects and returns a list of all the
> keys and values in the order visited.
> The following results from the iteration are allowable:
>   NIL - the iteration could have finished before the first add
>   all of the resulting states listed above are allowed when
>     interpreted as iteration results, since the entire iteration
>     could have been performed in that state
>   (A=>1,B=>4), even though the table was never in such a state,
>     because it could visit A right after A was added and B just
>     before the end
>   (B=>4) is allowed because A might have been visited during the
>     time it was not present, and then B visited at the end.
>
> However (B=>4,A=>1) is NOT allowed, because after B became
> related to 4, A was never related to 1.

I cannot imagine how you can prevent this.

Suppose the iteration collected A=>1 (after the 1st or 2nd step) but,
before it could terminate, the 1st thread did all its work. Now the
iteration sees that the key vector has length 2, it saw the 1st element,
and now it processes the second element, which is B, which points to 4.
Thus the iteration will collect B=>4,A=>1.

Thanks.

--
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.1504
http://steingoldpsychology.com http://www.childpsy.net
http://www.dhimmitude.org http://americancensorship.org http://iris.org.il
https://ffii.org
People hear what they want to hear and discard the rest. |
From: <don...@is...> - 2017-12-20 20:06:53
|
Joe...@t-... writes: > Regarding concurrent hash-tables, I'd relax the requirements about iteration: > - Iteration must terminate. > - The set of keys need not match the serializable database isolation level. > - It's OK for some keys to be missing from the enumeration! That's the price > an application pays for going through a table without adequate outer locking. > - No key shall be presented twice(?) (except if it's removed + re-added meanwhile) > - Iteration must not crash. > > IOW, it's OK during iteration to present keys that have been removed > concurrently. There's no need to implement snapshots. But the data > structures should never be seen in such an inconsistent state that the > iteration loops infinitely or crashes. I think that if you have a thread that keeps adding new keys, it's ok for the iteration to never terminate. I don't think it's ok to miss keys that were present at the beginning of the iteration and never removed. Nor do I think it's ok to present the same key twice in any case. I do agree that the iteration should not crash. Here's what I wrote before: For instance, I hope hash table iteration could be described in terms of all of the keys in the table at the beginning plus all those added during the iteration being ordered somehow, and the iteration visiting them in that order (along with the associated values at the time the key is visited), but skipping those that were not in the table (either cause they were deleted earlier or not yet added) at the time the iteration reaches their position. I hope the informal specification above is clear. (BTW, I think that the statement: the iteration visits once every key that was in the table at the beginning of the iteration but never deleted during the iteration, never visits any key that was not in the table at the beginning and not added during the iteration, and visits either once or zero times those that were added or deleted during the iteration is implied by that but fails to guarantee some desired ordering properties guaranteed by the more operational description.) To be a bit more explicit, here's an example. I think I'm asking for something that every programmer will expect. Suppose an iteration over a table starts in a state with the table empty, and another thread performs the following operations in the following order: operation resulting state add A => 1 (A=>1) add B => 2 (A=>1,B=>2) remove A (B=>2) add A => 3 (B=>2,A=>3) remove B (A=>3) add B => 4 (A=>3,B=>4) and now the iteration returns. Suppose the iteration simply collects and returns a list of all the keys and values in the order visited. The following results from the iteration are allowable: NIL - the iteration could have finished before the first add all of the resulting states listed above are allowed when interpreted as iterations results, since the entire iteration could have been performed in that state (A=>1,B=>4), even though the table was never in such a state, because it could visit A right after A was added and B just before the end) (B=>4) is allowed because A might have been visited during the time it was not present, and then B visited at the end. However (B=>4,A=>1) is NOT allowed, because after B became related to 4, A was never related to 1. Note that my model does require that the table can be viewed as having gone through a sequence of states corresponding to the operations having been performed in some order. 
So if the first two adds had been done by two different threads, we might
have a choice of the order in which they were done. But this choice could
be limited by other evidence, such as reads that determine what's in the
table at different ordered points in time.

The specification above also rules out clearly ridiculous outcomes such as
(C=>4) or (A=>9).
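One way to make the model above mechanical is the following sketch, which is
my own formalization of Don's informal spec, not anything from clisp or from
the thread. It accepts an observed iteration result if its pairs can be
matched, in visit order, against a non-decreasing sequence of table states;
to stay short it only checks the pairs that were visited (it ignores the
"must not miss stable keys" requirement) and that no key appears twice.

--8<---------------cut here---------------start------------->8---
(defparameter *states*
  '(()                     ; initial: empty
    ((A . 1))              ; add A => 1
    ((A . 1) (B . 2))      ; add B => 2
    ((B . 2))              ; remove A
    ((B . 2) (A . 3))      ; add A => 3
    ((A . 3))              ; remove B
    ((A . 3) (B . 4))))    ; add B => 4

(defun result-allowed-p (result states)
  "RESULT is a list of (key . value) in visit order; STATES is the list of
table states (alists) in chronological order."
  (labels ((try (pairs state-index)
             (cond ((null pairs) t)
                   ((>= state-index (length states)) nil)
                   (t (let* ((pair (first pairs))
                             (state (nth state-index states))
                             (hit (assoc (car pair) state :test #'equal)))
                        (or (and hit
                                 (equal (cdr hit) (cdr pair))
                                 ;; visit this pair in the current state ...
                                 (try (rest pairs) state-index))
                            ;; ... or advance time and try again.
                            (try pairs (1+ state-index))))))))
    (and (= (length result)
            (length (remove-duplicates result :key #'car :test #'equal)))
         (try result 0))))

;; (result-allowed-p '((A . 1) (B . 4)) *states*) => T
;; (result-allowed-p '((B . 4) (A . 1)) *states*) => NIL
--8<---------------cut here---------------end--------------->8---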
From: Sam S. <sd...@gn...> - 2017-12-20 19:04:35
> * <Wbret-Plevy.Ubruyr@g-flfgrzf.pbz> [2017-12-20 18:16:08 +0000]:
>
> Having fun, here's the version with the need for most gensyms removed:
>
> (defmacro lazy-let1 ((var) &body body)
>   (let ((shadow (gensym (symbol-name var))))
>     `(let ((,shadow 'unbound))
>        (flet ((,shadow () (if (eq ,shadow 'unbound) ,var ,shadow))
>               ((setf ,shadow) (,var) (setf ,shadow ,var)))
>          (symbol-macrolet ((,var (,shadow))) .,body)))))
>
> (setf ,shadow) is indeed defined as (setf ,shadow) - scoping rules!
>
> 'unbound should be replaced by the usual unique CONS trick, and perhaps #1#...
>
> This is just for the fun, I don't like this macro because of the run-time check.

This is a proof of concept.
If we can fix all the LOOP bugs (search for #-CLISP in loop.tst) using
lazy-let, it would lend support to my TAGLET proposal:

  (taglet (vars) &body) == (lazy-let (vars) (tagbody ,@body))

TAGLET fits the way LOOP is generally _perceived_ very well.

> Actually, it has huge costs in clisp - try out DISASSEMBLE - as the
> closures are not inlined.

Inlining the flet closures has been in the clisp man page TODO section
for as long as I can remember. ;-)

Thanks.

-- 
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.1504
http://steingoldpsychology.com http://www.childpsy.net https://jihadwatch.org
http://mideasttruth.com http://americancensorship.org http://memri.org
When told to go to hell, ask for directions.
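Sam's TAGLET equivalence, written out as a macro for concreteness. This is
only a sketch of what the proposal says: it assumes a working LAZY-LET with
the semantics discussed in this thread, which does not exist in clisp as of
this exchange.

--8<---------------cut here---------------start------------->8---
;; TAGLET = lazy bindings wrapped around a TAGBODY, per Sam's proposal.
(defmacro taglet ((&rest vars) &body body)
  `(lazy-let (,@vars)
     (tagbody ,@body)))
--8<---------------cut here---------------end--------------->8---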
From: Bruno H. <br...@cl...> - 2017-12-20 18:45:39
Hi Sam,

> one is marked ABI and the other is not:
>
> --8<---------------cut here---------------start------------->8---
> note-c-fun ; not ABI, compile-time only
> note-c-var ; ABI
> --8<---------------cut here---------------end--------------->8---

"ABI" means that a symbol that is not exported may occur in .fas files
or .lib files.

In this case, when I compile a file such as

--------------------------------------------------------
(ffi:def-c-var foo (:type ffi:int))
(ffi:def-call-out bar (:arguments) (:return-type nil))
--------------------------------------------------------

the .fas and .lib files don't reference NOTE-C-VAR nor NOTE-C-FUN.
So, "not ABI, compile-time only" is correct for both.

Jörg fixed the description for NOTE-C-FUN on 2006-01-30. We can do the
same for NOTE-C-VAR.

Bruno
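One way to double-check Bruno's observation on a given clisp build with FFI
support; the file names are made up for the example, and the exact contents
of the .fas/.lib output of course depend on the clisp version.

--8<---------------cut here---------------start------------->8---
;; Write Bruno's two forms to a scratch file and compile it.
(with-open-file (s "abi-test.lisp" :direction :output :if-exists :supersede)
  (write-line "(ffi:def-c-var foo (:type ffi:int))" s)
  (write-line "(ffi:def-call-out bar (:arguments) (:return-type nil))" s))
(compile-file "abi-test.lisp")

;; Then grep the compiler output files for the two symbols.
(dolist (file '("abi-test.fas" "abi-test.lib"))
  (with-open-file (s file :if-does-not-exist nil)
    (when s
      (loop for line = (read-line s nil) while line
            when (search "NOTE-C-" line)
              do (format t "~a: ~a~%" file line)))))
--8<---------------cut here---------------end--------------->8---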
From: <Joe...@t-...> - 2017-12-20 18:27:41
Hi,

Sam wrote:
>I may be missing something, but it seems to me that the requirements are
>contradictory:
>If one thread keeps adding and removing a key, then the other thread may
>iterate over the table forever, presenting the same key over and over again.

Indeed. However, you could add to the front of the list and expect the
iterator to have already passed that point. IIRC one can seek inspiration
from the RCU lists in the Linux kernel, even though we have nothing to do
with RCU.

But I was just considering updates to the singly linked list within one
bucket. Updating the array of buckets concurrently is hairy (what if
someone removes the last element in one bucket, wants to resize, etc.).
Plain fun.

Regards,
Jörg
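A conceptual sketch of the "add to the front" idea in plain Lisp, just to make
the point concrete: prepending allocates a fresh cons and only then redirects
the head, so a traversal that already captured the old head never sees the new
entry and never walks over a cell that mutates under it. Real concurrent code
would need an atomic store or memory barrier for the head update; none of that
is shown here.

--8<---------------cut here---------------start------------->8---
(defstruct bucket (head nil))

(defun bucket-push (bucket key value)
  ;; Fresh cons first, head redirect second (unsynchronized here).
  (setf (bucket-head bucket)
        (acons key value (bucket-head bucket))))

(defun bucket-map (fn bucket)
  ;; Capture the head once; later pushes are invisible to this traversal.
  (loop for (key . value) in (bucket-head bucket)
        do (funcall fn key value)))
--8<---------------cut here---------------end--------------->8---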
From: <Joe...@t-...> - 2017-12-20 18:16:18
Hi again,

Having fun, here's the version with the need for most gensyms removed:

(defmacro lazy-let1 ((var) &body body)
  (let ((shadow (gensym (symbol-name var))))
    `(let ((,shadow 'unbound))
       (flet ((,shadow () (if (eq ,shadow 'unbound) ,var ,shadow))
              ((setf ,shadow) (,var) (setf ,shadow ,var)))
         (symbol-macrolet ((,var (,shadow))) .,body)))))

(setf ,shadow) is indeed defined as (setf ,shadow) - scoping rules!

'unbound should be replaced by the usual unique CONS trick, and perhaps #1#...

This is just for the fun, I don't like this macro because of the run-time
check. Actually, it has huge costs in clisp - try out DISASSEMBLE - as the
closures are not inlined.

Regards,
Jörg
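One possible reading of the "unique CONS trick" Jörg mentions: use a fresh
cons, created when the expansion's LET* is entered, as the unbound marker, so
no symbol the body might store can collide with it. This is my sketch under
that assumption, not Jörg's code.

--8<---------------cut here---------------start------------->8---
(defmacro lazy-let1 ((var) &body body)
  (let ((shadow (gensym (symbol-name var)))
        (marker (gensym "MARKER")))
    `(let* ((,marker (cons nil nil))        ; unique, EQ only to itself
            (,shadow ,marker))
       (flet ((,shadow () (if (eq ,shadow ,marker) ,var ,shadow))
              ((setf ,shadow) (new) (setf ,shadow new)))
         (symbol-macrolet ((,var (,shadow)))
           ,@body)))))
--8<---------------cut here---------------end--------------->8---

The run-time check Jörg dislikes is still there; only the marker changed.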
From: Sam S. <sd...@gn...> - 2017-12-20 18:12:30
Hi,

> * <Wbret-Plevy.Ubruyr@g-flfgrzf.pbz> [2017-12-20 17:35:44 +0000]:
>
>> This would require something like `lazy-let` with the following test case:
>
> (let ((a 1))
>   (lazy-let (a b)
>     (print a)    ; prints 1
>     (print b)    ; error - unbound variable
>     (setq a 10)  ; inner binding of a is activated
>     (print a)    ; prints 10
>     (setq b 22)) ; binding of b is activated
>   (print b)      ; error - unbound variable
>   (print a))     ; prints 1
>
>> Alternatively, `lazy-let` can be implemented as something like
>>   (let ((a (if (boundp 'a) a (sys::%unbound)))) ...)
>
> Certainly not, BOUNDP would only work with variables declared special.

of course - that's the point

> Hmm, there's perhaps nothing that another level of delayed evaluation
> in order to get the desired scoping couldn't solve, so here it is:
>
> (defmacro lazy-let1 ((var) &body body)
>   (let ((shadow (gensym (symbol-name var))))
>     `(let ((,shadow 'unbound)) ;(sys::%unbound)
>        (flet ((access-a () (if (eq ,shadow 'unbound) ,var ,shadow))
>               ((setf access-a) (x) (setf ,shadow x)))
>          (symbol-macrolet ((,var (access-a))) .,body)))))
>
> ; bonus point 0: Add more required gensyms
> ; bonus point 1: Support a list of variables as per Sam's specification
> ; bonus point 2: See if LET can work with clisp's %unbound trick
> ; bonus point 3.5: See if you can make it work with special variables...
>
> (let ((a 1))
>   (lazy-let1 (a)
>     (print a)   ; prints 1
>     (setq a 10) ; inner binding of a is activated
>     (print a))  ; prints 10
>   (print a))    ; prints 1

Wow, I am impressed.
However, I am not sure CLISP will compile that to good code.
I think the first use of (setf access-a) should modify both
(setf access-a) and access-a to be trivial operations.
This is why I think a new special form TAGLET is necessary.

-- 
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.1504
http://steingoldpsychology.com http://www.childpsy.net http://thereligionofpeace.com
https://jihadwatch.org http://memri.org
My plan for 'after work' is to retire.
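To see the closure cost Sam is worried about, one can compile a small user of
the macro and look at DISASSEMBLE. This only shows how one would look;
what the output contains will vary with the clisp version, and it assumes
LAZY-LET1 (either Jörg's version or the marker variant above) is loaded.

--8<---------------cut here---------------start------------->8---
(defun lazy-let1-demo ()
  (let ((a 1))
    (lazy-let1 (a)
      (setq a 10)
      a)))            ; => 10; the outer A stays 1

(compile 'lazy-let1-demo)
(disassemble #'lazy-let1-demo)   ; inspect whether the flets were inlined
--8<---------------cut here---------------end--------------->8---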
From: <Joe...@t-...> - 2017-12-20 17:35:54
Hi,

Sam wrote:
>"lazy-binding": bind the variable but do not initialize its value; until it
>is explicitly set, accessing it reaches up in the environment.

Interesting challenge :-)

> This would require something like `lazy-let` with the following test case:

--8<---------------cut here---------------start------------->8---
(let ((a 1))
  (lazy-let (a b)
    (print a)    ; prints 1
    (print b)    ; error - unbound variable
    (setq a 10)  ; inner binding of a is activated
    (print a)    ; prints 10
    (setq b 22)) ; binding of b is activated
  (print b)      ; error - unbound variable
  (print a))     ; prints 1
--8<---------------cut here---------------end--------------->8---

>Alternatively, `lazy-let` can be implemented as something like
>  (let ((a (if (boundp 'a) a (sys::%unbound)))) ...)

Certainly not, BOUNDP would only work with variables declared special.

Perhaps SYMBOL-MACROLET? But that won't work either, because the problem
remains that once `a becomes a symbol-macro, one cannot express anymore
"let me access `a from the outer environment". Didn't somebody once post
a bug report about a similar infinite macro expansion with regular macros?

This makes me remember that there have been several past attempts at
trying to make the environment-access functions of CLtL2 fly. But Google
turned up few references, I must have been dreaming. :-(
http://clast.sourceforge.net/

Hmm, there's perhaps nothing that another level of delayed evaluation
in order to get the desired scoping couldn't solve, so here it is:

(defmacro lazy-let1 ((var) &body body)
  (let ((shadow (gensym (symbol-name var))))
    `(let ((,shadow 'unbound)) ;(sys::%unbound)
       (flet ((access-a () (if (eq ,shadow 'unbound) ,var ,shadow))
              ((setf access-a) (x) (setf ,shadow x)))
         (symbol-macrolet ((,var (access-a))) .,body)))))

; bonus point 0: Add more required gensyms
; bonus point 1: Support a list of variables as per Sam's specification
; bonus point 2: See if LET can work with clisp's %unbound trick
; bonus point 3.5: See if you can make it work with special variables...

(let ((a 1))
  (lazy-let1 (a)
    (print a)   ; prints 1
    (setq a 10) ; inner binding of a is activated
    (print a))  ; prints 10
  (print a))    ; prints 1

Regards,
Jörg
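A two-line illustration of Jörg's point about BOUNDP, since it trips people
up: BOUNDP only reports on the dynamic (special) binding of a symbol, so it
says nothing about a lexical binding that is in scope.

--8<---------------cut here---------------start------------->8---
(let ((x 1))      ; lexical binding of X
  (boundp 'x))    ; => NIL (unless X also happens to be globally special)
--8<---------------cut here---------------end--------------->8---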
From: Sam S. <sd...@gn...> - 2017-12-20 16:48:31
Bruno,

note-c-var and note-c-fun in foreign1.lisp are used identically
(def-call-out resp. def-c-var expand to

--8<---------------cut here---------------start------------->8---
(eval-when (compile) (unless library (note-c-??? ...)))
--8<---------------cut here---------------end--------------->8---

), but one is marked ABI and the other is not:

--8<---------------cut here---------------start------------->8---
note-c-fun ; not ABI, compile-time only
note-c-var ; ABI
--8<---------------cut here---------------end--------------->8---

It seems that the note-c-fun description is more correct.
Am I missing something?

Thanks!

-- 
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.1504
http://steingoldpsychology.com http://www.childpsy.net https://ffii.org
http://islamexposedonline.com http://honestreporting.com
PI seconds is a nanocentury. 10! seconds is 6 weeks.
From: Sam S. <sd...@gn...> - 2017-12-20 16:27:48
Hi,

> * <Wbret-Plevy.Ubruyr@g-flfgrzf.pbz> [2017-12-20 14:40:46 +0000]:
>
> Regarding concurrent hash-tables, I'd relax the requirements about iteration:
>
> - Iteration must terminate.
> - The set of keys need not match the serializable database isolation level.
> - It's OK for some keys to be missing from the enumeration! That's the price
>   an application pays for going through a table without adequate outer locking.
> - No key shall be presented twice(?) (except if it's removed + re-added meanwhile)
> - Iteration must not crash.

I may be missing something, but it seems to me that the requirements are
contradictory:
If one thread keeps adding and removing a key, then the other thread may
iterate over the table forever, presenting the same key over and over again.

However, this seems like a corner case.

-- 
Sam Steingold (http://sds.podval.org/) on darwin Ns 10.3.1504
http://steingoldpsychology.com http://www.childpsy.net http://memri.org
http://camera.org http://think-israel.org http://islamexposedonline.com
If you do not move, you will not feel the shackles.
From: <Joe...@t-...> - 2017-12-20 14:41:03
Hi,

Bruno proposed:

  LISPFUNNR(car,1) {
    var object argument = popSTACK();
    RDLOCK(object);
    var object result = car(argument);
    RDUNLOCK(object);
    VALUES1(result);
  }

That is completely unsatisfying. As Don noticed, all you have achieved
here, at the cost of *lots* of changes to the source, is atomic access
on machines where it is not already guaranteed by the HW.

For instance, the WIDE_SOFT configuration uses 64 bit objects on 32 bit
architectures. There locking will prevent random crashes caused by
observing non-atomic half-object updates. I'd simply state that
WIDE_SOFT is incompatible with ENABLE_THREADS.

Your proposal is unsatisfying because merely enabling atomic updates
does not help at all towards application-level consistency. E.g. suppose
my app uses two arrays in sync (instead of putting a cons of two values
into each element). Or suppose my app builds some tree of things. Lack
of synchronization during updates may cause thread B to observe a
violation of invariants, e.g. both arrays not of the same size, or a
cyclic graph instead of a tree. So the above locks gain you too little.

>Here RDLOCK will never wait: it will either signal an error or
>proceed.

This sounds crazy. Consider the following sequence:
(My example is stupid because you certainly proposed to use a
shared-exclusive lock. But just replace CAR with (SETF CAR).)

  A                                     B
  var object argument = popSTACK();     var object argument = popSTACK();
  RDLOCK(object); [delayed]
  var object result = car(argument);
  RDUNLOCK(object);                     RDLOCK(object);
                                        var object result = car(argument);
                                        RDUNLOCK(object);
  VALUES1(result);                      VALUES1(result);

No error would occur, the inherent data race would go unnoticed. But an
error would be signaled when A resumes just a little bit earlier? That
randomness makes no sense.

What I'm missing is a vision of something that would make concurrent
processing in clisp interesting. So far, all I see is an attempt to play
catch-up where clisp is the last in class, as all other implementations
already feature threads AFAIK.

I would enjoy other means of (real or pseudo-)concurrency in clisp.
Clojure is known for having transactional memory (I never tried it out);
Python and several others have coroutines or some limited form of
delimited continuations. I could imagine a single-threaded clisp that
nevertheless features asynchronous I/O based on libuv. Unlike other
fashionable things like comprehensions, which can be conveniently
expressed in pure Common Lisp, async stuff would need support from the
C core of clisp. Ever since Amiga days I have dreamt of a clisp with
something similar to CMUCL's serve-event. But Bruno has always favoured
real threads over green threads.
https://common-lisp.net/project/cmucl/doc/cmu-user/serve-event.html

Regarding concurrent hash-tables, I'd relax the requirements about iteration:

- Iteration must terminate.
- The set of keys need not match the serializable database isolation level.
- It's OK for some keys to be missing from the enumeration! That's the price
  an application pays for going through a table without adequate outer locking.
- No key shall be presented twice(?) (except if it's removed + re-added meanwhile)
- Iteration must not crash.

IOW, it's OK during iteration to present keys that have been removed
concurrently. There's no need to implement snapshots. But the data
structures should never be seen in such an inconsistent state that the
iteration loops infinitely or crashes.

Likewise, when you intern a symbol-name into a package, concurrent
enumeration (and lookup) should remain stable.
That seems achievable using singly linked bucket lists and a few memory
barriers & atomic instructions.

As you see, I favour an easily understood model of serializable atomic
reads from and writes to memory. Alas, that's not how modern processors
evolve. They are moving in the direction of isolated memory, where two
threads must explicitly communicate (rendez-vous) with each other (atomic
writes are *not* enough) in order to "exchange" memory. IOW they behave
more & more like processes.
https://www.kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html

Regards,
Jörg
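A minimal sketch of what "application-level consistency" means for Jörg's
two-arrays example, using bordeaux-threads (a portable library assumed to be
loaded, e.g. via Quicklisp; any lock API would do): each individual store may
be perfectly atomic, yet only taking one lock around the pair of pushes keeps
the invariant "both arrays have the same length" visible to readers. This is
application code, not anything the clisp core could provide on its own.

--8<---------------cut here---------------start------------->8---
(defvar *pair-lock* (bt:make-lock "pair"))
(defvar *keys* (make-array 0 :adjustable t :fill-pointer t))
(defvar *vals* (make-array 0 :adjustable t :fill-pointer t))

(defun add-entry (key value)
  ;; Both pushes happen under one lock, so the arrays never differ in length.
  (bt:with-lock-held (*pair-lock*)
    (vector-push-extend key *keys*)
    (vector-push-extend value *vals*)))

(defun read-entries ()
  ;; Readers take the same lock, so they never observe a half-done update.
  (bt:with-lock-held (*pair-lock*)
    (map 'list #'cons *keys* *vals*)))
--8<---------------cut here---------------end--------------->8---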