Thread: [Gramps-devel] SQL-like API for Gramps gen.lib

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 454-5900

Devs,

Just an update on a project that I have been toying with for a long
time, but now is functional and built on a solid foundation: a
SQL-like API for SELECT, DELETE, and UPDATE for regular Gramps BSDDB
data through gen.lib.

The following is mostly from:
http://www.gramps-project.org/bugs/view.php?id=6513 but I thought it
might be of general interest.

The QueryGramplet in gramps-addons/trunk for gramps/master can now
SELECT, UPDATE, and DELETE. Some examples (keywords are shown
capitalized, but the SQL parser is case-insensitive):

DELETE FROM person WHERE primary_name.first_name == "Travis";

SELECT * from person;

SELECT gramps_id, primary_name.first_name,
primary_name.surname_list.0.surname from person;

SELECT event_ref_list.0.ref FROM person;

UPDATE person SET primary_name.first_name="Gary" where
primary_name.first_name == "Travis"

This API is made possible through the generic struct/json interface.
It is very little code, because it relies on these generic structures.
It should be able to be made solid enough to expose to users (say as a
generic filter). The parser can be made more user friendly... it may
just throw an error currently.

I'd be interested in any limitations you find, or enhancement ideas.
Some notes on use:

1) Most SQL clauses (UPDATE table, FROM table, SELECT ..., SET
field=value, ...) can appear in any position, any order

2) ...except the WHERE clause: it must be last; this is because the
WHERE clause is not parsed, because:

3) The WHERE clause uses any valid Python expressions. (May need to
import some libraries such as random, to have ready any possible
expression needed)

4) The SELECT fields currently use a dotted notation for list
references. Use "event_ref_list.0" rather than "event_ref_list[0]".

5) JOINS are not necessary, because it automatically looks up all
relations through the handles.

6) UPDATE currently only works on the primary object, not on the
joined object. For example, you can't update the birth date of an
event through the person's referenced events. I think this can be
fixed.

7) Tables are lowercase, single (not plural) form (eg, person, tag, event).

8) Need to implement LIMIT number; LIMIT start, stop; and WHERE ROWNUM < number

9) Field names are the actual names of the fields of the
gramps.gen.lib objects, verbatim, no differences. You might need to
look up what you need... no help yet from this interface (although I
am working on defining a built-in Schema that could help)

10) .handle or .ref automatically look up their references.

11) Shortcut: you can use col[N] in the WHERE clause to reference a
column selected.

select gramps_id, private from person WHERE not col[1];

12) If an object doesn't match any selected field, it just doesn't
show. For example, to find all of the people with at least two
surnames on their primary name, use:

select primary_name.surname_list.1 from person;

or

select gramps_id, primary_name.surname_list.1 from person where col[1];

13) In a SELECT (for speed reasons), you need to reference a field
before you can use it. That is not necessary in the UPDATE or DELETE
statements. (The idea here is that SELECTS are done quite frequently,
but UPDATES are done rarely, and it doesn't matter if those take a
little longer). Maybe we can relax this constraint.

14) The semicolon is optional

15) Be careful selecting all fields from all records... that could
take up a lot of memory, and bring down gramps.

16) This should be fairly fast, but it does call eval(). This might
make things a little slower, but made the code much easier to write.
And it does use the full power of python.

17) You can use parens in an "UPDATE table SET field=value" value.
Something like:

UPDATE table SET field=(field + 1);

but that hasn't been well-tested. (Speaking of testing, there is a
Vassilii-inspired unittest with the QueryQuickview... will add more
there).

18) The primary_name... stuff is really long and verbose. Maybe we
need some "virtual columns".

19) Fields that contain other objects, or lists of objects, will show
as dictionaries and lists of dictionaries. You can refine those fields
by further specifying subparts. Maybe we should not show these, or
show in another form...

-Doug

Thread: [Gramps-devel] SQL-like API for Gramps gen.lib

Gramps, the open source genealogy program

gramps-devel