Overview of "Backward-Chaining"
To do backward-chaining, Pyke finds rules whose then part matches the goal (i.e., the question). Once it finds such a rule, it tries to (recursively) prove all of the subgoals in the if part of that rule. Some of these subgoals are matched against facts, and others are subgoals for other backward-chaining rules. If all of the subgoals can be proven, the rule succeeds and the original goal is proven. Otherwise, the rule fails, and Pyke tries to find another rule whose then part matches the goal, and so on.
So Pyke ends up linking (or chaining) the if part of the first rule to the then part of the next rule.
- Pyke starts by finding a rule whose then part matches the goal.
- Pyke then proceeds to process the if part of that rule.
- Which may link (or chain) to the then part of another rule.
Since Pyke processes these rules from then to if to then to if in the reverse manner that we normally think of using rules, it's called backward chaining.
To make this more clear, Pyke has you write your backward-chaining rules upside down by writing the then part first (since that's how it is processed).
"Use", "When" Rather than "Then", "If"
But then-if rules sound confusing, so Pyke uses the words use and when rather than then and if. You can then read the rule as "use" this statement "when" these other statements can be proven.
Unlike the assert clause in forward-chaining rules, Pyke only allows one statement in the use clause.
Consider this example:
1 direct_father_son 2 use father_son($father, $son, ()) 3 when 4 family.son_of($son, $father) 5 grand_father_son 6 use father_son($grand_father, $grand_son, (grand)) 7 when 8 father_son($father, $grand_son, ()) 9 father_son($grand_father, $father, ()) 10 great_grand_father_son 11 use father_son($gg_father, $gg_son, (great, $prefix1, *$rest_prefixes)) 12 when 13 father_son($father, $gg_son, ()) 14 father_son($gg_father, $father, ($prefix1, *$rest_prefixes)) 15 brothers 16 use brothers($brother1, $brother2) 17 when 18 father_son($father, $brother1, ()) 19 father_son($father, $brother2, ()) 20 check $brother1 != $brother2 21 uncle_nephew 22 use uncle_nephew($uncle, $nephew, $prefix) 23 when 24 brothers($uncle, $father) 25 father_son($father, $nephew, $prefix1) 26 $prefix = ('great',) * len($prefix1) 27 cousins 28 use cousins($cousin1, $cousin2, $distance, $removed) 29 when 30 uncle_nephew($uncle, $cousin1, $prefix1) 31 father_son($uncle, $cousin2, $prefix2) 32 $distance = min(len($prefixes1), len($prefixes2)) + 1 33 $removed = abs(len($prefixes1) - len($prefixes2))
We can draw something similar to a function call graph with these rules:
The easiest way to do this is with some_engine.prove_1 or some_engine.prove_n. Prove_1 only returns the first proof found and then stops (or raises pyke.knowledge_engine.CanNotProve). Prove_n returns a context manager for a generator that generates all possible proofs (which, in some cases, might be infinite). In both cases, you pass a tuple of data arguments and the number of variable arguments as the last two parameters. The total number of arguments for the goal is the sum of the length of the data arguments that you pass plus the number of variable arguments that you specify.
Both functions return the variable bindings for the number of variable arguments you specified as a tuple, along with the plan.
Backtracking with Backward-Chaining Rules
For this example, these are the starting set of family facts:
1 son_of(tim, thomas) 2 son_of(fred, thomas) 3 son_of(bruce, thomas) 4 son_of(michael, bruce)
And we want to know who fred's nephews are. So we'd ask uncle_nephew(fred, $nephew, $prefix).
Here are the steps (in parenthesis) in the inferencing up until the first failure is encountered (with the line number from the example preceding each line):
(1) 22 use uncle_nephew(fred, $nephew, $prefix) 25 brothers(fred, $father) (2) 25 use brothers(fred, $brother2) 18 father_son($father, fred, ()) (3) 2 use father_son($father, fred, ()) 4 family.son_of(fred, $father) 19 father_son(thomas, $brother2, ()) (4) 2 use father_son(thomas, $son, ()) 4 family.son_of($son, thomas) 20 check fred != tim 24 father_son(tim, $nephew, $prefix1) (5.1) 2 use father_son(tim, $son, ()) 4 family.son_of($son, tim) => FAILS (5.2) 6 use father_son(tim, $grand_son, (grand)) 8 father_son(tim, $grand_son, ()) 2 use father_son(tim, $son, ()) 4 family.son_of($son, tim) => FAILS (5.3) 11 use father_son(tim, $gg_son, (great, $prefix1, *$rest_prefixes)) 13 father_son(tim, $gg_son, ()) 2 use father_son(tim, $son, ()) 4 family.son_of($son, tim) => FAILS
Each rule invocation is numbered (in parenthesis) as a step number. Step 5 has tried 3 different rules and they have all failed (5.1, 5.2 and 5.3).
If you think of the rules as functions, the situation now looks like this (the step numbers that succeeded circled in black, and steps that failed circled in red):
At this point, Pyke has hit a dead end and must backtrack. The way that backtracking proceeds is to go back up the list of steps executed, combining the steps from all rules into one list. Thus, when step 5 fails, Pyke backs up to step 4 and tries to find another solution to that step.
If another solution is found, Pyke proceeds forward again from that point. If no other solutions are found, Pyke backs up another step.
When Pyke goes back to step 4, the next solution binds $son to fred. This fails the subsequent check in the brothers rule:
20 check $brother1 != $brother2
And so Pyke goes back to step 4 once again. The next solution binds $son to bruce. This succeeds for brother and is passed down to father_son which returns michael as fred's nephew.
Further backtracking reveals no other solutions.
Thus we see:
- The backtracking algorithm: "fail goes up (or back) while success goes down (or forward)" is not limited to the steps within a single rule's when clause; but includes the entire chain of inferencing from the very start of trying to prove the top level goal.
- This execution model is not available within traditional programming languages like Python.
- The ability to go back to any point in the computation to try an alternate solution is where backward-chaining systems get their power!
Running the Example
>>> from pyke import knowledge_engine >>> engine = knowledge_engine.engine('doc.examples') >>> engine.assert_('family', 'son_of', ('tim', 'thomas')) >>> engine.assert_('family', 'son_of', ('fred', 'thomas')) >>> engine.assert_('family', 'son_of', ('bruce', 'thomas')) >>> engine.assert_('family', 'son_of', ('michael', 'bruce')) >>> engine.activate('bc_example')
Nothing happens this time when we activate the rule base, because there are no forward-chaining rules here.
We want to ask the question: "Who are Fred's nephews?". This translates into the Pyke statement: bc_example.uncle_nephew(fred, $v1, $v2).
Note that we're using the name of the rule base, bc_example rather than the fact base, family here; because we expect this answer to come from the bc_example rule base.
This is 'bc_example', 'uncle_nephew', with ('fred',) followed by 2 pattern variables as arguments:
>>> from __future__ import with_statement >>> with engine.prove_n('bc_example', 'uncle_nephew', ('fred',), 2) as gen: ... for vars, no_plan in gen: ... print(vars) ('michael', ())