From: Thomas Leonard <talex5@gm...>  20100326 21:00:10

The new Pythonbased SAT solver seems to work fine, so I've merged it to master. Please test! The following is partly for people who want to understand it, and partly to help me convince myself it should work ;) How it works In an earlier post, I described my previous attempt to improve version selection using a pseudoboolean solver (minisat+): http://thread.gmane.org/gmane.comp.filesystems.zeroinstall.devel/3054 The SAT solver is similar, except that we represent the problem as a large boolean equation instead of a set of simultaneous equations. For example, consider the restriction: "prog version 2 depends on liba. Two versions of liba are available: 1 and 2" This expands to: "If you select version 2 of the main program (prog2) then you must select either version 1 or 2 of liba." AND "You can't select both versions of liba." As a pseudoboolean constraint (each variable is 0 or 1; the old system): liba1 + liba2  prog2 >= 0 AND liba1 + liba2 <= 1 As a boolean expression in Conjunctive Normal Form[1] (each variable is False or True; the new system): (liba1 or liba2 or not(prog2) ) AND (not(liba1) or not(liba2)) When we want to launch a program, we begin by creating a huge expression with all the dependencies between all known versions of each interface that could be involved in the problem. Solving this gives a set of assignments for the variables, which tells us which versions to download and use. Consider this simple problem (from testsat.py:testBestImpossible): prog: 1 2 liba: 1 prog[2] => liba 3 4 This says we have two versions of prog (1 and 2) and one version of liba. prog2 requires a version of liba between 3 and 4 (which doesn't exist). This gives three clauses (AND'd together): (1) not(prog2) (2) prog2 or prog1 (3) not(prog2) or not(prog1) (prog2 can't be selected; some version of prog must be selected; we can't select both versions of prog) To solve these problems efficiently, we use a form of the DavisPutnamLogemannLoveland algorithm[2]. Basically: 1. Use the "Unit propagation rule"[3] to work out the values for any variables which are "obvious". 2. Pick an undecided variable and try it one way. Go back to step 1. This gives: (1) => prog2 = False (this is a unit clause) (2) therefore implies prog1 (unit propagation) This gives the solution (we should use the older prog1 because prog2 is unusable) without searching. This unit propagation rule can solve quite a few tricky problems by itself. For a more complex example, consider: prog1 has no dependencies prog2 depends on liba (of which there are 10 versions) each liba depends on some libb (of which there are 10 versions) each libb depends on libc (10 versions) each libc depends on libd (10 versions) each libd depends on Python 2.6 (unavailable) Simply searching would require testing all 10,000 combinations of liba, libb, libc and libd. However, using the unit propagation rule we deduce: not(python26) => not(libd1), not(libd2), ... => not(libc1), not(libc2), ... => not(libb1), not(libb2), ... => not(liba1), not(liba2), ... => not(prog2) => prog1 Again, we find the solution without searching. If we find a solution by unitpropagation then it must be the only possible solution. If we have to search, we always try our preferred version first (e.g. prog2 before prog1, then liba2 before liba1). This ensures that we always find the "best" solution. Things get more interesting if we can't detect a conflict until we've made lots of decisions. Consider a similar case to the one above, but where every libd is a 64bit binary and prog2 also depends on libz (version 1 or 2), both versions of libz2 being 32bit binaries. We can select either 64bit libraries or 32bit libraries, so we can't eliminate either libd* or libz* at the start. But we can't have 32bit and 64bit binaries in the same selection. In this case unitpropagation alone doesn't help us much, so we try choosing prog2, then liba10, then libb10, then libc10, then libd10. At that point, unit propagation lets us deduce: libd10 => 64bit => not(32bit) => not(libz1), not(libz2) => not(prog2) => conflict (with our previous choice of "prog2") If we simply undid our choice of libd10 and tried libd9, we'd be back to trying all 10,000 combinations. Instead, we analyse the conflict, following the chain of reasoning backwards: Why did we get a conflict? Because "prog2 and not(prog2)" Why was prog2 set? We chose it. Why was not(prog2) set? Because (not(libz1) and not(libz2)) forces it. Why not(libz1)? Because not(32bit) Why not(libz2)? Also because not(32bit) Why not(32bit)? Because 64bit Why 64bit? Because libd10 Why libd10? We chose it. Stopping at the two "we chose it" reasons, we "learn" the general rule that: libd10 and prog2 => conflict To make sure we don't make this mistake again, we add a new clause to our big boolean expression: not(libd10) or not(prog2) This clause always had to be true, we just didn't know it before. We backtrack to before setting libd10, since we know that didn't work out. Since we're still trying prog2=True, we immediately deduce from the new clause that libd10=False. and so we try libd9, which generates a similar new clause, and so on: not(libd10) or not(prog2) AND not(libd9) or not(prog2) AND not(libd8) or not(prog2) AND ... After exploring all 10 versions of libd, we notice the dependency rule for libc1 depending on libd*: not(libc1) or libd1 or libd2 or ... has become not(libc1) or False or False or ... which is not(libc1) We didn't have to try each version of libc1 with each version of libd. We also immediately get not(libc2), not(libc3), etc. This in turn eliminates all libb*, which removes all liba*, which leads us to learn the rule: not(prog2) Then we select prog1, having only looked at 10 combinations (not 10,000). [1] http://en.wikipedia.org/wiki/Conjunctive_normal_form [2] http://en.wikipedia.org/wiki/DPLL_algorithm [3] http://en.wikipedia.org/wiki/Unit_propagation [4] http://minisat.se/Papers.html  Dr Thomas Leonard ROX desktop / Zero Install GPG: 9242 9807 C985 3C07 44A6 8B9A AE07 8280 59A5 3CC1 GPG: DA98 25AE CAD0 8975 7CDA BD8E 0713 3F96 CA74 D8BA 
From: Thomas Leonard <talex5@gm...>  20100418 18:21:43

On 26 March 2010 22:00, Thomas Leonard <talex5@...> wrote: > The new Pythonbased SAT solver seems to work fine, so I've merged it > to master. Please test! I've written a blog post about this, now on osnews: http://roscidus.com/desktop/node/954 http://www.osnews.com/comments/23172  Dr Thomas Leonard ROX desktop / Zero Install GPG: 9242 9807 C985 3C07 44A6 8B9A AE07 8280 59A5 3CC1 GPG: DA98 25AE CAD0 8975 7CDA BD8E 0713 3F96 CA74 D8BA 
From: Anders F Björklund <afb@us...>  20100418 19:48:34

Thomas Leonard wrote: >> The new Pythonbased SAT solver seems to work fine, so I've merged it >> to master. Please test! > > I've written a blog post about this, now on osnews: > > http://roscidus.com/desktop/node/954 > http://www.osnews.com/comments/23172 Have you considered entering it into the depsolving competition, at http://mancoosi.org/misc2010/ ? anders 
From: Thomas Leonard <talex5@gm...>  20100418 20:32:46

On 18 April 2010 20:32, Anders F Björklund <afb@...> wrote: > Thomas Leonard wrote: > >>> The new Pythonbased SAT solver seems to work fine, so I've merged it >>> to master. Please test! >> >> I've written a blog post about this, now on osnews: >> >> http://roscidus.com/desktop/node/954 >> http://www.osnews.com/comments/23172 > > Have you considered entering it into the depsolving > competition, at http://mancoosi.org/misc2010/ ? Mine uses the same basic system as OPIUM (which was developed for aptget already)  the only changes I made make it more suitable for Zero Install (and less suitable for solving aptget problems). I'm not really sure why they need a competition, since presumably OPIUM would win on everything, except maybe speed. But they're smart guys, so I'm sure they've got their reasons. BTW, Stefano Zacchiroli (who's part of the Mancoosi project) just got elected as Debian project leader.  Dr Thomas Leonard ROX desktop / Zero Install GPG: 9242 9807 C985 3C07 44A6 8B9A AE07 8280 59A5 3CC1 GPG: DA98 25AE CAD0 8975 7CDA BD8E 0713 3F96 CA74 D8BA 