From: <don...@is...> - 2017-12-22 13:22:09
Joe...@t-... writes:

> A) The idea is to give a minimum set of guarantees when users are
> programming in the wild, without locks.

You mean without the users adding their own locks, or without the hash
table implementation adding internal locks?  I expect the hash table
implementation to do whatever is necessary so that programs whose
authors add no locks of their own behave as those authors are
justified in expecting in the presence of other threads.

> Basically: no crash, no endless iteration due to sudden cycles in
> internal data structures.  Perhaps attempt to detect that and raise
> an error as Bruno suggested?
>
> B) Other parts would use locks in order to give stronger guarantees.

I think you mean that the clisp hash table implementation would add
locks in order to make things work as expected.  That's what I want.

> For instance, clisp would internally use locks when updating or
> iterating a package's hash table, because

I've been assuming that a package IS a hash table, and that when the
implementation of hash tables is fixed, packages will work.

> Therefore, it does not make sense to raise the minimum set of
> guarantees for the A case.

I don't understand the distinction between A and B.

> Of course, you can try and make A and B collapse using algorithms
> for concurrent hash-tables.  I have some doubts that this can work
> because e.g. the package example shows that often enough you need
> locks (or rather, critical sections) above and surrounding the
> level where you have atomic/concurrent updates to individual
> parts.  The hash tables may support concurrent updates, but they
> are only part of the package object, which (likely) has more
> invariants and postconditions.

I'm missing something; show me an example of what you're describing.
Do you mean the lisp implementation has to lock things internally?
That's fine with me.  Or that a user who iterates in one thread while
altering the table in another has to add locks?  I want that to be
unnecessary for obtaining the guarantee I described.  If you need
something stronger, you might have to add locks: for instance, to
guarantee that an entire iteration sees a single hash table state,
user-level locks would be required.

> I have not analyzed your algorithm in sufficient detail to
> determine whether it could work as a concurrent hash table

I have not described an algorithm, only a requirement.  I expect the
requirement can be satisfied by various implementations, and I hope
some of them are even efficient and not too complicated.

> implementation.  But that's what it wants to be.  My proposal was
> way more limited: just use the present, non-concurrent code, add a
> few checkpoints / atomics / fences to ensure it will terminate.
> But I

I don't know enough about the current code to know what can go wrong
now, or what would have to be added to have which effects.

> don't see how a simple non-concurrent bucket-based algorithm could
> prevent keys from disappearing from an enumeration in the presence
> of parallel deletion of other keys.  That's precisely why CL
> forbids such modifications.

Again I'm failing to see the problem.  As a simple example, suppose we
have a stable ordering of all keys, a hash table that always has the
same number of buckets, a stable function mapping keys to buckets, and
all keys within a bucket stored in a list sorted by that ordering.
Then an iteration would only have to keep track of the current key,
right?  It could lock the table to find the next key and then unlock
it.  My specification says there should be some ordering of the keys
that were in the table at the start of the iteration together with
those added during it.  An order on buckets, plus an order on keys
within each bucket, gives us exactly that: X < Y if X belongs in an
earlier bucket than Y according to the stable key-to-bucket function,
or if that function maps both to the same bucket and X precedes Y in
the stable ordering of all keys.
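Here is a sketch of what I mean, under the stated assumptions -- my
illustration, not CLISP's code.  STABLE-TABLE, BUCKET-INDEX, KEY< and
NEXT-KEY are names I am making up; locking uses bordeaux-threads for
portability; SXHASH stands in for the stable key-to-bucket function
(it is stable within one session, which is all the sketch needs); and
insertion/deletion (not shown) are assumed to keep each bucket's
(key . value) list sorted by KEY<:

  (defconstant +nbuckets+ 16)  ; fixed for the table's whole lifetime

  (defstruct stable-table
    (buckets (make-array +nbuckets+ :initial-element '()))
    (lock (bt:make-lock "stable-table")))

  (defun bucket-index (key)
    ;; The stable function mapping keys to buckets.
    (mod (sxhash key) +nbuckets+))

  (defun key< (a b)
    ;; Stand-in for the assumed stable total order on all keys.
    (string< (prin1-to-string a) (prin1-to-string b)))

  (defun next-key (table current)
    "Return the least key greater than CURRENT (or the least key of
  all when CURRENT is NIL), or NIL when the iteration is done.  The
  lock is held only while searching, never across user code."
    (bt:with-lock-held ((stable-table-lock table))
      (loop with start = (if current (bucket-index current) 0)
            for i from start below +nbuckets+
            do (dolist (entry (aref (stable-table-buckets table) i))
                 (let ((k (car entry)))
                   ;; In CURRENT's own bucket, skip CURRENT and the
                   ;; keys before it; later buckets are taken from
                   ;; the top.
                   (when (or (null current) (> i start) (key< current k))
                     (return-from next-key k)))))))

An iteration then remembers only the current key between steps:

  (loop for k = (next-key table nil) then (next-key table k)
        while k
        do (process k))  ; PROCESS is whatever the iteration body does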
I realize that GC is likely to have an important impact on all of
this; I expect it already does in the current implementation.  For the
moment I leave that as a later issue.  The above is only intended to
illustrate the concept.  Maybe it can be extended to work acceptably
in practice, maybe not.

> One could RPLACA the deleted element with #<UNBOUND> to preserve
> the integrity of the bucket list.  But that would be a burden
> imposed on all tables, not exactly nice.

I can sort of understand that in the context of some particular
algorithm, or type of algorithm, but I don't think it's worthwhile to
respond without knowing exactly which algorithm you have in mind (my
reading of the idea is sketched at the end of this message).  I take
you to be saying that you DO have a solution for that particular
algorithm but are not satisfied with its performance.  Maybe that
performance problem can be fixed, or maybe a different algorithm would
be better.

If you're hoping to make only a small change to the current hashing
algorithms in clisp, I can understand that desire, but since they were
probably not designed with this in mind, it might turn out easier to
replace them.  As I said above, I don't know how they work in any
detail.
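The sketch I promised above, showing how I read the RPLACA idea --
again my illustration, not CLISP's code.  *DELETED* is a made-up
unique marker standing in for #<UNBOUND>, and iterators are assumed to
skip entries whose key is EQ to it:

  (defvar *deleted* '#:deleted
    "Unique marker for a removed key.")

  (defun tombstone-remove (key bucket)
    "Mark KEY as deleted in BUCKET, a list of (key . value) conses,
  without restructuring the list, so that a concurrent iterator still
  holding one of the conses sees a dead slot rather than a spliced
  list."
    (let ((entry (assoc key bucket)))
      (when entry
        (rplaca entry *deleted*)  ; the key slot now reads as deleted
        (rplacd entry nil))       ; drop the value for the GC
      bucket))

I take the "burden" you mention to be that every table then carries
these dead cells, and the checks for them, until something compacts
the bucket, whether or not any iteration is in progress.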