On many goals with large set of hypothesis, it is often difficult to see what is relevant.
Yet, a simple traversal on both goal and hypothesis can identify:
a) all function symbols + category (infix/prefix, fcn/op, generic, etc)
b) all free and bound variables
c) all quantified variables requiring witnessing (in goal)/ serving as extra hyp (in hyp)
d) kind of hyp/goal: "type checking", "weakening", "other AI4FM WhyM tags", HR ranks, etc.
e) identification of contrictory hypotheis as the "new goal"
All these could be highlighted to the user in varied forms/colous/orders or hidden from the predicate printed altogether (ala PVS style).
This can be achieved by visiting nodes to add specific annotations to terms.
Once terms are annotated, other "apps" can be impl. like:
a) highlight relevant hypothesis depending on various criteria (e.g., No of fcn symbols matched; kind of hyp (see above); user-given; learned; etc)
b) clear/show hypothesis of interest auto (E.g., from what has been highlighted) or by user choice
c) transitive hypothesis chase
ex: in cloud-workflow (simpler/toy) model on the proof of "WFBLNoReadUp\$domainCheck" as
WFBlNoReadUp ∧ (i, u) ∈ readBy ⇒ i ∈ dom locationD ∧ u ∈ dom clearance
we get this goal we use prove by reduce :
1 Workflow ∧
2 clearance ∈ Service ⇸ Level ∧
3 locationS ∈ Service ⇸ Level ∧
4 locationD ∈ Data ⇸ Level ∧
5 u ∈ Service ∧
6 i ∈ Data ∧
7 dom locationS ∈ ℙ (dom clearance) ∧
8 dom writes ∈ ℙ (dom locationS) ∧
9 ran readBy ∈ ℙ (dom locationS) ∧
10 dom readBy ∈ ℙ (dom locationD) ∧
11 ran writes ∈ ℙ (dom locationD) ∧
12 (∀ s: dom locationS ⦁ locationS s ≤ clearance s) ∧
13 (i, u) ∈ readBy
⇒
G1 i ∈ dom locationD ∧
G2 u ∈ dom clearance
In this, it's possible to infer that 2-6 are type-hyp; 1 is a schema; 7-13 are invariants expanded. Here a tool could decide in two ways:
a) directly identify transitive relations between fcn symbols in goal and hyp:
G1 has
2 names: i, locationD - assuming toolkit or parent names/symbols could be filtered out
1 expr : dom location D
1 pred : whole goal
chasing names in hyps we find "i" in 6 and 13 and "locationD" in 4, 10, and 11 (note 4 and 6 are "known" as a typr hyp, less relevant);
chasing expr "dom locationD" is in 10 and 11 (e.g., name/expr involving name lead to same set)
given goal is set membership, chasing names in and "set-membership-expr" for "i" in 13 (where i was found) and get name "readBy".
Continously do this until you find some hypothesis involving i that leads to one involving "dom location D", and that is 10. This gives
a transitive chain: G1 <- 13 <- 10 <- {1, 4, 6}. We add in the end all type-info known + schemas, as they may have extra relevant info.
Offer it as a lemma or highlight to the user?
∀ Workflow ⦁ ∀ u : Service; i : Data | (i, u) ∈ readBy ∧ dom readBy ∈ ℙ (dom locationD) ⦁ i ∈ dom locationD
Try prove by reduce again (in the background) with the lemma added. It finishes the proof (see next point b)).
Such transitive chains of reasoning are exactly what happens in backward chaining rewriting.
Other theorem provers might infer such chains automatically and lead the lemma unnecessary, but here the point is different:
to try to find/identify ways to improve the user's understanding of the model and its relationships.
----
b) choose the right enabledness / expansion-level for schemas
finding this is fiddly and requires some user interaction/guidance. One option is to have the expansion up to the "right/appropriate"
level of abstraction - and that is user based/ defined - but in this case [and many others] this can be partially inferred as well (see below).
Where does hyp 10 and 13 appear in the user model? They turn out to be in WFBLClearance, which is within WFBlNoReadUp. This
can be found by either looking at the AST, or doing step-wise expansion (e.g., "invoke NAME") on all identified schema names involved.
And a better lemma is:
∀ WFBLClearance ⦁ ∀ s : Service; d : Data | (d, s) ∈ readBy ⦁ d ∈ dom locationD
This can be chased / traced after such expansion to this predicate (e.g., before the prove by reduce) and it has the highest matching index (say)
dom readBy ⊆ dom locationD
Here the user could be queryied if what he wants is related to this or something like that.
----
c) alternative version of the same lemma using frules (for large models)
Forward chaining is present in various provers: it involves progressing by strengthening hypothesis to the goal, rather than the usual weakening
of the goal towards the hypothesis. With schemas this is particularly useful:
* define a frules as "S \implies invNofS" to reveal the invaraints of interest from S
with this we avoid the need to expand S whenever it appears in the goal, and its (forwardly) inferred property is kept in context for rewrite (but not
as an added hypothesis to the user). This makes goals/hyp lists much shorter.
Ex. in Mondex, such use of frules was crucial. without them, goals would explode to over 50 pages of A4 long lists of hyp!
In large models, this is a very important and useful technique.
-----
d) predicate abstraction over complex expressions in hypothesis
Again, various theorem provers (e.g., ACL2, PVS, Isabelle) already do that automtically. Yet here the point is to identify the abstraction to the user,
being that in ZEves, or other prover. So, when a complex expression like
"COMPLEX-EXPR-A" cup "COMPLEX-EXPR-B" = "COMPLEX-EXPR-C"
appears, one could introduce names for them as A, B, C (e.g.,
split (\exists A : Type; .... @ A = ..... \land B = ......) etc.
Then the user just sees it as
A cup B = C
Often this could help the user see what are the algebraic/set-theoretic properties of fcn symbols involved (e.g., see AI4FM example on "partitions.zed").