Yeb Havinga - 2005-08-23

This is a copy paste of some initial thoughts on a way to do opencyc-powered constraint checking on a relational database. Before something can be checked, the stuff that's in the database should be put in OpenCyc. This raises the question what can be logically said about the things in the DB. Very interesting, long history and actual. (see e.g. Jack Minkers 'Logic and Databases, a 20 Year Retrospective)

I drop the text here so that in the case that somebody is interested in this topic too, we could exchange some ideas.

regards,
Yeb Havinga

............................................
Problem:
Constraint checking in Oracle limited to attribute checks in the updated
row. No SQL allowed in check constraints. After statement trigger constructs
can be used for more powerful constraints. This is SQL (declarative)
mixed with PL/SQL (imperative) -> no easy way to proof whether
a constraint in a formal language like Z is implemented by the constraint.

Idea:
constraint checking with OpenCyc for Oracle db's.

Also:
Maybe play around a bit with Global as view (GAV) mapping from OpenCyc's
global terms to the new (local) terms. This term mapping should provide much
extra body / knowledge to the constraint checking.

Bonus:
Leverage common sense existing in OpenCyc

Todo:
Automate data entering in OpenCyc

Read Oracle's dictionary tables and
1) create vocabulary in OpenCyc
2) do an initial copy of data to OpenCyc
3) create triggers in Oracle that replicate updates and insers
to opencyc.

CREATE VOCABULARY

Reiter showed in 1984 that a database could be translated into a first-order
logic theory. The basic idea was to translate a tuple ('John','Mary') in relation
LOVES to a ground fact LOVES(John,Mary) where LOVES is a predicate and John
and Mary are constants. So, let's take this idea as starting point.

Example relation in Oracle's SCOTT/TIGER example db:

    Relation:
    EMP
    attributes:
    EMPNO ENAME      JOB         MGR HIREDATE          SAL      COMM DEPTNO
    ----- ---------- --------- ----- ----------- --------- --------- ------
     7369 SMITH      CLERK      7902 12/17/1980     800.00               20
     7499 ALLEN      SALESMAN   7698 2/20/1981     1600.00    300.00     30
    etc.
   
    EMPNO is known to be PK
    DEPTNO is FK to another relation.
    (Surprisingly MGR is not (in the dictionary) known to be a FK to EMP.)

A question is how to represent the VALUES of these attributes in OpenCyc:
as constants or as characterstrings / numbers or as mix

Suppose I create constants. What is a good name? Why not do something like
<indication for oracle interface><relation><pk name><pk value>
that will be constants like:

O_EMP_EMPO_7369, O_EMP_EMPO_7499, etc...

Now, the PK is already given a translation, head on to ENAME.
Following Reiter, each name should also be translated as a constant, but
I'm a bit uncertain about this. Actually the name doesn't really matter
for the kind of constraint checking I'd like to do with OpenCyc.
The only thing I can think of is enforcing a 'unique name' constraint,
which is easily done
in Oracle. So, let's skip it or just create a predicate which relates
the object denoted by the constants like O_EMP_EMPO_7369 to a characterstring:

(#$o_emp_ename #$O_EMP_EMPO_7369 "SMITH")

Now, should the same be done to Job?

Dates are.
(#$o_emp_hiredate #$O_EMP_EMPO_7369 (DayFn 17 (MonthFn December (YearFn 1980))))
Salary
(#$o_emp_sal #$O_EMP_EMPO_7369 800)

Comm
What to do with null values?
1) Is there a comm value but is it unknown?
Make skolem term:
HYPOTHESIZE (#$o_emp_comm #$O_EMP_EMPO_7369 ?X)
2) or is there no commission?
Ignore the value, i.e.: put nothing in the KB.
3) or perhaps even: we know there there is no commision?
\not \exists ?X (#$o_emp_comm #$O_EMP_EMPO_7369 ?X)

yadayada
PK: use to identify the first order objects, create constants for them.
FK: relate objects.
time attributes: restrict ground facts to a time dimension
other attributes: just name them, ignore them?