From: Doug B. <dou...@gm...> - 2013-12-24 16:22:09
|
Devs, Just an update on a project that I have been toying with for a long time, but now is functional and built on a solid foundation: a SQL-like API for SELECT, DELETE, and UPDATE for regular Gramps BSDDB data through gen.lib. The following is mostly from: http://www.gramps-project.org/bugs/view.php?id=6513 but I thought it might be of general interest. The QueryGramplet in gramps-addons/trunk for gramps/master can now SELECT, UPDATE, and DELETE. Some examples (keywords are shown capitalized, but the SQL parser is case-insensitive): DELETE FROM person WHERE primary_name.first_name == "Travis"; SELECT * from person; SELECT gramps_id, primary_name.first_name, primary_name.surname_list.0.surname from person; SELECT event_ref_list.0.ref FROM person; UPDATE person SET primary_name.first_name="Gary" where primary_name.first_name == "Travis" This API is made possible through the generic struct/json interface. It is very little code, because it relies on these generic structures. It should be able to be made solid enough to expose to users (say as a generic filter). The parser can be made more user friendly... it may just throw an error currently. I'd be interested in any limitations you find, or enhancement ideas. Some notes on use: 1) Most SQL clauses (UPDATE table, FROM table, SELECT ..., SET field=value, ...) can appear in any position, any order 2) ...except the WHERE clause: it must be last; this is because the WHERE clause is not parsed, because: 3) The WHERE clause uses any valid Python expressions. (May need to import some libraries such as random, to have ready any possible expression needed) 4) The SELECT fields currently use a dotted notation for list references. Use "event_ref_list.0" rather than "event_ref_list[0]". 5) JOINS are not necessary, because it automatically looks up all relations through the handles. 6) UPDATE currently only works on the primary object, not on the joined object. For example, you can't update the birth date of an event through the person's referenced events. I think this can be fixed. 7) Tables are lowercase, single (not plural) form (eg, person, tag, event). 8) Need to implement LIMIT number; LIMIT start, stop; and WHERE ROWNUM < number 9) Field names are the actual names of the fields of the gramps.gen.lib objects, verbatim, no differences. You might need to look up what you need... no help yet from this interface (although I am working on defining a built-in Schema that could help) 10) .handle or .ref automatically look up their references. 11) Shortcut: you can use col[N] in the WHERE clause to reference a column selected. select gramps_id, private from person WHERE not col[1]; 12) If an object doesn't match any selected field, it just doesn't show. For example, to find all of the people with at least two surnames on their primary name, use: select primary_name.surname_list.1 from person; or select gramps_id, primary_name.surname_list.1 from person where col[1]; 13) In a SELECT (for speed reasons), you need to reference a field before you can use it. That is not necessary in the UPDATE or DELETE statements. (The idea here is that SELECTS are done quite frequently, but UPDATES are done rarely, and it doesn't matter if those take a little longer). Maybe we can relax this constraint. 14) The semicolon is optional 15) Be careful selecting all fields from all records... that could take up a lot of memory, and bring down gramps. 16) This should be fairly fast, but it does call eval(). This might make things a little slower, but made the code much easier to write. And it does use the full power of python. 17) You can use parens in an "UPDATE table SET field=value" value. Something like: UPDATE table SET field=(field + 1); but that hasn't been well-tested. (Speaking of testing, there is a Vassilii-inspired unittest with the QueryQuickview... will add more there). 18) The primary_name... stuff is really long and verbose. Maybe we need some "virtual columns". 19) Fields that contain other objects, or lists of objects, will show as dictionaries and lists of dictionaries. You can refine those fields by further specifying subparts. Maybe we should not show these, or show in another form... -Doug |
From: Jerome <rom...@ya...> - 2013-12-24 17:33:17
|
Nice gift on this day! Thanks. Le mar. 24 déc. 2013 at 17:22,Doug Blank <dou...@gm...> a écrit : > Devs, > > Just an update on a project that I have been toying with for a long > time, but now is functional and built on a solid foundation: a > SQL-like API for SELECT, DELETE, and UPDATE for regular Gramps BSDDB > data through gen.lib. > [..] > > -Doug > > ------------------------------------------------------------------------------ > Rapidly troubleshoot problems before they affect your business. Most > IT > organizations don't have a clear picture of how application > performance > affects their revenue. With AppDynamics, you get 100% visibility into > your > Java,.NET, & PHP application. Start your 15-day FREE TRIAL of > AppDynamics Pro! > http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk > _______________________________________________ > Gramps-devel mailing list > Gra...@li... > https://lists.sourceforge.net/lists/listinfo/gramps-devel > |
From: Nick H. <nic...@ho...> - 2013-12-27 22:42:50
|
On 24/12/13 16:22, Doug Blank wrote: > I'd be interested in any limitations you find, or enhancement ideas. Doug, I get the following problems: 1. Segmentation faults in Gtk when returning a large amount of data. 2. The family.type field is evaluated as a python keyword rather than a field reference. 3. In the family table, child_ref_list[0].ref doesn't join to the person table. I can think of one enhancement. It would be better if lists were handled as a join to another table returning all rows. The selection could then be narrowed in the where clause. For example, to find the participants of an event: SELECT primary_name.first_name FROM person WHERE event_ref.ref.gramps_id == 'E12345'; or to get a specified attribute: SELECT attribute.value FROM person WHERE gramps_id == 'I12345' and attribute.type.string == 'Identification Number'; The functionality looks good though. I can see this used in a generic filter, and perhaps also for a report writer. Regards, Nick. |
From: Doug B. <dou...@gm...> - 2013-12-27 23:40:58
|
Thanks for the feedback... comments below: On Fri, Dec 27, 2013 at 5:42 PM, Nick Hall <nic...@ho...> wrote: > On 24/12/13 16:22, Doug Blank wrote: >> I'd be interested in any limitations you find, or enhancement ideas. > > Doug, > > I get the following problems: > > 1. Segmentation faults in Gtk when returning a large amount of data. Yes, that can happen quite easily with a "select * from person". Perhaps a paged view show the first N items would be useful. > 2. The family.type field is evaluated as a python keyword rather than a > field reference. Fixed. Changed the order of lookup to check the struct first. > 3. In the family table, child_ref_list[0].ref doesn't join to the person > table. Probably an error in the to_struct... I'll look into that. > I can think of one enhancement. It would be better if lists were > handled as a join to another table returning all rows. The selection > could then be narrowed in the where clause. > > For example, to find the participants of an event: > > SELECT primary_name.first_name > FROM person > WHERE event_ref.ref.gramps_id == 'E12345'; > > or to get a specified attribute: > > SELECT attribute.value > FROM person > WHERE gramps_id == 'I12345' and attribute.type.string == 'Identification > Number'; That's an interesting idea! I was thinking that it could easily explode (number of names x attributes x ...) I'll look into this. > The functionality looks good though. I can see this used in a generic > filter, and perhaps also for a report writer. Hmmm, a report writer sounds interesting.... Thanks! -Doug > Regards, > > > Nick. > > > ------------------------------------------------------------------------------ > Rapidly troubleshoot problems before they affect your business. Most IT > organizations don't have a clear picture of how application performance > affects their revenue. With AppDynamics, you get 100% visibility into your > Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics Pro! > http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk > _______________________________________________ > Gramps-devel mailing list > Gra...@li... > https://lists.sourceforge.net/lists/listinfo/gramps-devel |
From: Doug B. <dou...@gm...> - 2013-12-28 03:09:23
|
On Fri, Dec 27, 2013 at 6:40 PM, Doug Blank <dou...@gm...> wrote: > Thanks for the feedback... comments below: > > On Fri, Dec 27, 2013 at 5:42 PM, Nick Hall <nic...@ho...> wrote: >> On 24/12/13 16:22, Doug Blank wrote: >>> I'd be interested in any limitations you find, or enhancement ideas. >> >> Doug, >> >> I get the following problems: >> >> 1. Segmentation faults in Gtk when returning a large amount of data. > > Yes, that can happen quite easily with a "select * from person". > Perhaps a paged view show the first N items would be useful. > >> 2. The family.type field is evaluated as a python keyword rather than a >> field reference. > > Fixed. Changed the order of lookup to check the struct first. > >> 3. In the family table, child_ref_list[0].ref doesn't join to the person >> table. > > Probably an error in the to_struct... I'll look into that. > >> I can think of one enhancement. It would be better if lists were >> handled as a join to another table returning all rows. The selection >> could then be narrowed in the where clause. >> >> For example, to find the participants of an event: >> >> SELECT primary_name.first_name >> FROM person >> WHERE event_ref.ref.gramps_id == 'E12345'; >> So, you can do that selection with Python and the SQL-like SELECT with this mess: SELECT primary_name.first_name, ([e.ref.gramps_id for e in event_ref_list if e.ref.gramps_id == "E12345"][0]) FROM person WHERE col[1]; But your syntax is much cleaner; mine works currently without adding any more code. (I need to make the parser a bit more robust and smart). >> or to get a specified attribute: >> >> SELECT attribute.value >> FROM person >> WHERE gramps_id == 'I12345' and attribute.type.string == 'Identification >> Number'; That can be done in Python with: SELECT ([a.value for a in attribute_list if a.value == "Identification Number"][0]) FROM person WHERE gramps_id == "I12345"; I don't like that one has to write such complicated expressions, but on the other hand there is no speciality code dealing with any special cases. This is a general JSON query engine with ability to set values on arbitrary backend attributes. You are right that we could write some code that would turn all lists into rows, but then we would be changing the semantics, introducing new field names. And you would have a multiplying effect with number of rows. I guess if we only turned the *selected* lists into rows, we could control that to just what is needed... Perhaps we can write some special Python functions (something like filter(), select(), or any()) or syntax that would expand into Python that would make this easier to write, and wouldn't require any special coding. > That's an interesting idea! I was thinking that it could easily > explode (number of names x attributes x ...) I'll look into this. > >> The functionality looks good though. I can see this used in a generic >> filter, and perhaps also for a report writer. > > Hmmm, a report writer sounds interesting.... On the other hand, if this is an interface that would allow people to write filters, one-off updates, and build custom reports, then it might be worth write some special-case code... -Doug > Thanks! > > -Doug > >> Regards, >> >> >> Nick. >> >> >> ------------------------------------------------------------------------------ >> Rapidly troubleshoot problems before they affect your business. Most IT >> organizations don't have a clear picture of how application performance >> affects their revenue. With AppDynamics, you get 100% visibility into your >> Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics Pro! >> http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk >> _______________________________________________ >> Gramps-devel mailing list >> Gra...@li... >> https://lists.sourceforge.net/lists/listinfo/gramps-devel |
From: Doug B. <dou...@gm...> - 2013-12-28 15:29:12
|
On Fri, Dec 27, 2013 at 10:09 PM, Doug Blank <dou...@gm...> wrote: > On Fri, Dec 27, 2013 at 6:40 PM, Doug Blank <dou...@gm...> wrote: >> Thanks for the feedback... comments below: >> >> On Fri, Dec 27, 2013 at 5:42 PM, Nick Hall <nic...@ho...> wrote: >>> On 24/12/13 16:22, Doug Blank wrote: >>>> I'd be interested in any limitations you find, or enhancement ideas. >>> >>> Doug, >>> >>> I get the following problems: >>> >>> 1. Segmentation faults in Gtk when returning a large amount of data. >> >> Yes, that can happen quite easily with a "select * from person". >> Perhaps a paged view show the first N items would be useful. >> >>> 2. The family.type field is evaluated as a python keyword rather than a >>> field reference. >> >> Fixed. Changed the order of lookup to check the struct first. >> >>> 3. In the family table, child_ref_list[0].ref doesn't join to the person >>> table. >> >> Probably an error in the to_struct... I'll look into that. Fixed all ref.to_structs. >>> I can think of one enhancement. It would be better if lists were >>> handled as a join to another table returning all rows. I changed the SELECT display to show all lists on a row as a cross-product join so that a single matching object could expand to many rows. Here is an example: SELECT [1, 2, 3], ["a", "b"] FROM person limit 1; which will list: 1, a 1, b 2, a 2, b 3, a 3, b That will be handy for a report writing system, and produces a better display for this gramplet. I didn't yet figure out how to change the eval() to be able to refer to these rows easily other than the Python expressions shown. Still thinking... -Doug >>> The selection >>> could then be narrowed in the where clause. >>> >>> For example, to find the participants of an event: >>> >>> SELECT primary_name.first_name >>> FROM person >>> WHERE event_ref.ref.gramps_id == 'E12345'; >>> > > So, you can do that selection with Python and the SQL-like SELECT with > this mess: > > SELECT primary_name.first_name, ([e.ref.gramps_id for e in > event_ref_list if e.ref.gramps_id == "E12345"][0]) FROM person WHERE > col[1]; > > But your syntax is much cleaner; mine works currently without adding > any more code. (I need to make the parser a bit more robust and > smart). > >>> or to get a specified attribute: >>> >>> SELECT attribute.value >>> FROM person >>> WHERE gramps_id == 'I12345' and attribute.type.string == 'Identification >>> Number'; > > That can be done in Python with: > > SELECT ([a.value for a in attribute_list if a.value == "Identification > Number"][0]) FROM person WHERE gramps_id == "I12345"; > > I don't like that one has to write such complicated expressions, but > on the other hand there is no speciality code dealing with any special > cases. This is a general JSON query engine with ability to set values > on arbitrary backend attributes. > > You are right that we could write some code that would turn all lists > into rows, but then we would be changing the semantics, introducing > new field names. And you would have a multiplying effect with number > of rows. I guess if we only turned the *selected* lists into rows, we > could control that to just what is needed... > > Perhaps we can write some special Python functions (something like > filter(), select(), or any()) or syntax that would expand into Python > that would make this easier to write, and wouldn't require any special > coding. > >> That's an interesting idea! I was thinking that it could easily >> explode (number of names x attributes x ...) I'll look into this. >> >>> The functionality looks good though. I can see this used in a generic >>> filter, and perhaps also for a report writer. >> >> Hmmm, a report writer sounds interesting.... > > On the other hand, if this is an interface that would allow people to > write filters, one-off updates, and build custom reports, then it > might be worth write some special-case code... > > -Doug > >> Thanks! >> >> -Doug >> >>> Regards, >>> >>> >>> Nick. >>> >>> >>> ------------------------------------------------------------------------------ >>> Rapidly troubleshoot problems before they affect your business. Most IT >>> organizations don't have a clear picture of how application performance >>> affects their revenue. With AppDynamics, you get 100% visibility into your >>> Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics Pro! >>> http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk >>> _______________________________________________ >>> Gramps-devel mailing list >>> Gra...@li... >>> https://lists.sourceforge.net/lists/listinfo/gramps-devel |
From: Doug B. <dou...@gm...> - 2013-12-28 18:58:10
|
On Sat, Dec 28, 2013 at 10:29 AM, Doug Blank <dou...@gm...> wrote: > On Fri, Dec 27, 2013 at 10:09 PM, Doug Blank <dou...@gm...> wrote: >> On Fri, Dec 27, 2013 at 6:40 PM, Doug Blank <dou...@gm...> wrote: >>> Thanks for the feedback... comments below: >>> >>> On Fri, Dec 27, 2013 at 5:42 PM, Nick Hall <nic...@ho...> wrote: >>>> On 24/12/13 16:22, Doug Blank wrote: >>>>> I'd be interested in any limitations you find, or enhancement ideas. >>>> >>>> Doug, >>>> >>>> I get the following problems: >>>> >>>> 1. Segmentation faults in Gtk when returning a large amount of data. >>> >>> Yes, that can happen quite easily with a "select * from person". >>> Perhaps a paged view show the first N items would be useful. >>> >>>> 2. The family.type field is evaluated as a python keyword rather than a >>>> field reference. >>> >>> Fixed. Changed the order of lookup to check the struct first. >>> >>>> 3. In the family table, child_ref_list[0].ref doesn't join to the person >>>> table. >>> >>> Probably an error in the to_struct... I'll look into that. > > Fixed all ref.to_structs. > >>>> I can think of one enhancement. It would be better if lists were >>>> handled as a join to another table returning all rows. > > I changed the SELECT display to show all lists on a row as a > cross-product join so that a single matching object could expand to > many rows. Here is an example: > > SELECT [1, 2, 3], ["a", "b"] FROM person limit 1; > > which will list: > 1, a > 1, b > 2, a > 2, b > 3, a > 3, b > > That will be handy for a report writing system, and produces a better > display for this gramplet. I didn't yet figure out how to change the > eval() to be able to refer to these rows easily other than the Python > expressions shown. Still thinking... > > -Doug > >>>> The selection >>>> could then be narrowed in the where clause. >>>> >>>> For example, to find the participants of an event: >>>> >>>> SELECT primary_name.first_name >>>> FROM person >>>> WHERE event_ref.ref.gramps_id == 'E12345'; >>>> Ok, I was able to change the lookup just slight to allow this to work: SELECT primary_name.first_name FROM person WHERE event_ref_list.ref.gramps_id == 'E12345'; The only different between your example and what actually works now is that I didn't introduce a new name "event_ref" but instead used the "event_ref_list". That is, if you attempt to reference an attribute on a list, in will use that attribute to select matching values from the list. >> So, you can do that selection with Python and the SQL-like SELECT with >> this mess: >> >> SELECT primary_name.first_name, ([e.ref.gramps_id for e in >> event_ref_list if e.ref.gramps_id == "E12345"][0]) FROM person WHERE >> col[1]; >> >> But your syntax is much cleaner; mine works currently without adding >> any more code. (I need to make the parser a bit more robust and >> smart). >> >>>> or to get a specified attribute: >>>> >>>> SELECT attribute.value >>>> FROM person >>>> WHERE gramps_id == 'I12345' and attribute.type.string == 'Identification >>>> Number'; This one is a bit trickier, because the attribute.value should be the same attribute in attribute.type.string. If you did this: SELECT attribute.value, attribute.type.string FROM person; then that becomes a cross-product between the two multi-valued columns value and type.string. An approximation is this: SELECT attribute_list.value FROM person WHERE gramps_id == "I12345" and "Identification Number" in attribute_list.type.string; This (sorta) works because attribute_list.type is a list of all of the attributes that have a type, and you can check to see if one of those is an "Identification Number". But, the attribute value might not be the same attribute that has the type "Identification Number". I think in this instance, you would want to select two fields at once, but not cross-product join them. Something like this propsoed syntax: SELECT attribute_list[value,type] as A FROM person WHERE gramps_id == "I12345" and A[1].string == "Identification Number"; (where A is the tuple (value, type) and A[1] is the type). That wouldn't be too bad, would it? -Doug >> That can be done in Python with: >> >> SELECT ([a.value for a in attribute_list if a.value == "Identification >> Number"][0]) FROM person WHERE gramps_id == "I12345"; >> >> I don't like that one has to write such complicated expressions, but >> on the other hand there is no speciality code dealing with any special >> cases. This is a general JSON query engine with ability to set values >> on arbitrary backend attributes. >> >> You are right that we could write some code that would turn all lists >> into rows, but then we would be changing the semantics, introducing >> new field names. And you would have a multiplying effect with number >> of rows. I guess if we only turned the *selected* lists into rows, we >> could control that to just what is needed... >> >> Perhaps we can write some special Python functions (something like >> filter(), select(), or any()) or syntax that would expand into Python >> that would make this easier to write, and wouldn't require any special >> coding. >> >>> That's an interesting idea! I was thinking that it could easily >>> explode (number of names x attributes x ...) I'll look into this. >>> >>>> The functionality looks good though. I can see this used in a generic >>>> filter, and perhaps also for a report writer. >>> >>> Hmmm, a report writer sounds interesting.... >> >> On the other hand, if this is an interface that would allow people to >> write filters, one-off updates, and build custom reports, then it >> might be worth write some special-case code... >> >> -Doug >> >>> Thanks! >>> >>> -Doug >>> >>>> Regards, >>>> >>>> >>>> Nick. >>>> >>>> >>>> ------------------------------------------------------------------------------ >>>> Rapidly troubleshoot problems before they affect your business. Most IT >>>> organizations don't have a clear picture of how application performance >>>> affects their revenue. With AppDynamics, you get 100% visibility into your >>>> Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics Pro! >>>> http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk >>>> _______________________________________________ >>>> Gramps-devel mailing list >>>> Gra...@li... >>>> https://lists.sourceforge.net/lists/listinfo/gramps-devel |
From: Nick H. <nic...@ho...> - 2013-12-29 18:26:03
|
On 28/12/13 18:58, Doug Blank wrote: > I think in this instance, you would want to select two fields at once, > but not cross-product join them. Something like this propsoed syntax: > > SELECT attribute_list[value,type] as A > FROM person > WHERE gramps_id == "I12345" and A[1].string == "Identification Number"; > > (where A is the tuple (value, type) and A[1] is the type). > > That wouldn't be too bad, would it? It depends on what you are trying to achieve. If you are trying to make the Gramps database look like a SQL database, then you need to imagine an extra attribute table. The query would be something like: SELECT attribute.value FROM person JOIN attribute ON person.handle == attribute.person_handle WHERE person.gramps_id == "I12345" AND attribute.type.string == "Identification Number" Of course the attribute table doesn't actually exist, but you may wish to use the JOIN statement in some way to indicate that we wish to consider the attribute list as a table. On the other hand, if you are creating a new Gramps query language, you could make the syntax more pythonic. Regards, Nick. |
From: Doug B. <dou...@gm...> - 2013-12-29 21:24:31
|
On Sun, Dec 29, 2013 at 1:25 PM, Nick Hall <nic...@ho...> wrote: > On 28/12/13 18:58, Doug Blank wrote: >> >> I think in this instance, you would want to select two fields at once, >> but not cross-product join them. Something like this propsoed syntax: >> >> SELECT attribute_list[value,type] as A >> FROM person >> WHERE gramps_id == "I12345" and A[1].string == "Identification Number"; >> >> (where A is the tuple (value, type) and A[1] is the type). >> >> That wouldn't be too bad, would it? > > > It depends on what you are trying to achieve. > > If you are trying to make the Gramps database look like a SQL database, then > you need to imagine an extra attribute table. The query would be something > like: > > SELECT attribute.value > FROM person > JOIN attribute ON person.handle == attribute.person_handle > WHERE person.gramps_id == "I12345" > AND attribute.type.string == "Identification Number" > > Of course the attribute table doesn't actually exist, but you may wish to > use the JOIN statement in some way to indicate that we wish to consider the > attribute list as a table. > > On the other hand, if you are creating a new Gramps query language, you > could make the syntax more pythonic. I guess my goal can be summarized as: create a fairly easy-to-use interface that could be used in a general way for filters, updates, deletes, and reports without writing very much support code. By taking advantage of Python wherever possible and the generic JSON representations, the interface should be flexible, general, and fairly fast. It might not be able to do everything, but should cover as many use cases as it can. Advanced users should be able to easily create a selection that currently would take a lot of work building up a series of filters. Or perhaps be able to make a selection for which no filter currently exists, and a developer would have to write a Filter. If you can think of more ways of making it more Pythonic, I think that would be easier to implement, and make it more powerful. Thanks! -Doug > Regards, > > > Nick. > |
From: Vassilii K. <vas...@ta...> - 2013-12-30 07:47:15
|
On 29.12.2013 23:24, Doug Blank wrote: > If you can think of more ways of making it more Pythonic, I think that > would be easier to implement, and make it more powerful. You mean, use metaclasses instead of the primary/secondary objects' similar code? |
From: Doug B. <dou...@gm...> - 2014-01-08 14:59:48
|
On Mon, Dec 30, 2013 at 2:47 AM, Vassilii Khachaturov <vas...@ta...> wrote: > On 29.12.2013 23:24, Doug Blank wrote: >> If you can think of more ways of making it more Pythonic, I think that >> would be easier to implement, and make it more powerful. > You mean, use metaclasses instead of the primary/secondary objects' > similar code? Yes. For example, I think that selecting items is, after more refinement, now very useful. Nick gave an example desire: >> SELECT attribute.value >> FROM person >> WHERE gramps_id == 'I12345' and attribute.type.string == 'Identification Number'; You can now do this with the following syntax: SELECT gramps_id, attribute_list("value", type__string="Social Security Number") FROM person WHERE gramps_id == "I0044"; This is now documented [1] but briefly: You can select and filter items in lists (shown now as a JOIN) by using parens after the list name (eg, you literally "call" the list as a function). The arguments are strings of what to show, and the keyword arguments are what to match. In this example, it says to show all items in attribute_list that have a "value" component, but only show those that have a corresponding "type.string" component with a value of "Social Security Number", and select records where the gramps_id is "I0044". (The double-underscore of "type__string" should be read as a dot, and is an idea stolen from Django's filter syntax [2].) The WHERE clause can use the same syntax. I'd like to add four bits of functionality to this Query Gramplet: 1) Be able to easily alter data. Currently, you can set specific values, but not do other common tasks, like add a Tag to an object. 2) Would like to be able to easily use regular expressions. Something like sed or awk. 3) Copy/paste/print the results (something to add to the Quick View) 4) Given better error messages At that point I think that this will be a useful scripting tool for quick searches, simple reports, and batch edits/deletes. Perhaps only for advanced users, but some queries are straightforward. -Doug [1] - http://www.gramps-project.org/wiki/index.php?title=QueryGramplet [2] - https://docs.djangoproject.com/en/1.5/topics/db/queries/ > > ------------------------------------------------------------------------------ > Rapidly troubleshoot problems before they affect your business. Most IT > organizations don't have a clear picture of how application performance > affects their revenue. With AppDynamics, you get 100% visibility into your > Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics Pro! > http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk > _______________________________________________ > Gramps-devel mailing list > Gra...@li... > https://lists.sourceforge.net/lists/listinfo/gramps-devel |
From: Tony P. <to...@pr...> - 2014-01-08 15:30:08
|
I think discussions have been down this road before. Using a particular SQL dialect, and using particular tables/columns, and especially any use of text-processing such as awk, are doomed to failure eventually... unless Gramps is going to be frozen and never enhanced, of course. A run-time object model that can be accessed from a script language (custom or standard) can defend again such changes. It can also implement the finer detail of searches/matches/etc without having to rely on any form of text-processing. For instance, what you try to match in a name, whether you want searches to allow case-blind/accent-blind,/abbreviations, or how you want to match a place name, or how you want to access dates. Taking a short-cut now, like the SQL approach, is going to create problems further... I can guarantee it. Tony Proctor ----- Original Message ----- From: "Doug Blank" <dou...@gm...> To: "Vassilii Khachaturov" <vas...@ta...> Cc: "Gramps Development List" <gra...@li...> Sent: Wednesday, January 08, 2014 2:59 PM Subject: Re: [Gramps-devel] SQL-like API for Gramps gen.lib > On Mon, Dec 30, 2013 at 2:47 AM, Vassilii Khachaturov > <vas...@ta...> wrote: >> On 29.12.2013 23:24, Doug Blank wrote: >>> If you can think of more ways of making it more Pythonic, I think that >>> would be easier to implement, and make it more powerful. >> You mean, use metaclasses instead of the primary/secondary objects' >> similar code? > > Yes. For example, I think that selecting items is, after more > refinement, now very useful. > > Nick gave an example desire: > >>> SELECT attribute.value >>> FROM person >>> WHERE gramps_id == 'I12345' and attribute.type.string == 'Identification >>> Number'; > > You can now do this with the following syntax: > > SELECT gramps_id, attribute_list("value", type__string="Social Security > Number") > FROM person > WHERE gramps_id == "I0044"; > > This is now documented [1] but briefly: > > You can select and filter items in lists (shown now as a JOIN) by > using parens after the list name (eg, you literally "call" the list as > a function). The arguments are strings of what to show, and the > keyword arguments are what to match. In this example, it says to show > all items in attribute_list that have a "value" component, but only > show those that have a corresponding "type.string" component with a > value of "Social Security Number", and select records where the > gramps_id is "I0044". (The double-underscore of "type__string" should > be read as a dot, and is an idea stolen from Django's filter syntax > [2].) The WHERE clause can use the same syntax. > > I'd like to add four bits of functionality to this Query Gramplet: > > 1) Be able to easily alter data. Currently, you can set specific > values, but not do other common tasks, like add a Tag to an object. > 2) Would like to be able to easily use regular expressions. Something > like sed or awk. > 3) Copy/paste/print the results (something to add to the Quick View) > 4) Given better error messages > > At that point I think that this will be a useful scripting tool for > quick searches, simple reports, and batch edits/deletes. Perhaps only > for advanced users, but some queries are straightforward. > > -Doug > > [1] - http://www.gramps-project.org/wiki/index.php?title=QueryGramplet > [2] - https://docs.djangoproject.com/en/1.5/topics/db/queries/ > >> >> ------------------------------------------------------------------------------ >> Rapidly troubleshoot problems before they affect your business. Most IT >> organizations don't have a clear picture of how application performance >> affects their revenue. With AppDynamics, you get 100% visibility into >> your >> Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics >> Pro! >> http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk >> _______________________________________________ >> Gramps-devel mailing list >> Gra...@li... >> https://lists.sourceforge.net/lists/listinfo/gramps-devel > > ------------------------------------------------------------------------------ > Rapidly troubleshoot problems before they affect your business. Most IT > organizations don't have a clear picture of how application performance > affects their revenue. With AppDynamics, you get 100% visibility into your > Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics > Pro! > http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk > _______________________________________________ > Gramps-devel mailing list > Gra...@li... > https://lists.sourceforge.net/lists/listinfo/gramps-devel |
From: Michael T. <mic...@gm...> - 2014-01-08 15:40:30
|
On Wed, Jan 8, 2014 at 10:28 AM, Tony Proctor <to...@pr...> wrote: > Taking a short-cut now, like the SQL approach, is going to create problems > further... I can guarantee it. There seems to be so many database abstraction tools/layers for languages such as python, why wouldn't begining the process of incorporating this abstraction layer(s) be a first step anyway? By the time that's done, the actual decision on database to use (if one /is/ used) would be made and ready to implement? It's a very badly worded question. -- << MCT >> Michael C Tiernan. http://www.linkedin.com/in/mtiernan Non Impediti Ratione Cogatationis |
From: Doug B. <dou...@gm...> - 2014-01-08 15:59:59
|
Tony, Don't worry! This Query language is only an addon-on (third-party plugin) and sits on top of Gramps' full data model. I've been working on a variety of high-level data manipulations (like synchronizing data). This is a small (about 300 lines of code) utility that takes advantage of a self-documenting structure representation of the data to allow edits/deletes/selects. On Wed, Jan 8, 2014 at 10:28 AM, Tony Proctor <to...@pr...> wrote: > I think discussions have been down this road before. Using a particular SQL > dialect, and using particular tables/columns, and especially any use of > text-processing such as awk, are doomed to failure eventually... unless > Gramps is going to be frozen and never enhanced, of course. Use of awk-like matches are just a convenient, well-known system. We're not talking about actually using awk, just taken some useful metaphors and functions. > A run-time object model that can be accessed from a script language (custom > or standard) can defend again such changes. It can also implement the finer > detail of searches/matches/etc without having to rely on any form of > text-processing. For instance, what you try to match in a name, whether you > want searches to allow case-blind/accent-blind,/abbreviations, or how you > want to match a place name, or how you want to access dates. > > Taking a short-cut now, like the SQL approach, is going to create problems > further... I can guarantee it. Not sure what you mean... I'm just talking about a system to easily access the data in exactly the form it lives in in the database. Take a look at the docs. -Doug > Tony Proctor > > ----- Original Message ----- From: "Doug Blank" <dou...@gm...> > To: "Vassilii Khachaturov" <vas...@ta...> > Cc: "Gramps Development List" <gra...@li...> > Sent: Wednesday, January 08, 2014 2:59 PM > Subject: Re: [Gramps-devel] SQL-like API for Gramps gen.lib > > > >> On Mon, Dec 30, 2013 at 2:47 AM, Vassilii Khachaturov >> <vas...@ta...> wrote: >>> >>> On 29.12.2013 23:24, Doug Blank wrote: >>>> >>>> If you can think of more ways of making it more Pythonic, I think that >>>> would be easier to implement, and make it more powerful. >>> >>> You mean, use metaclasses instead of the primary/secondary objects' >>> similar code? >> >> >> Yes. For example, I think that selecting items is, after more >> refinement, now very useful. >> >> Nick gave an example desire: >> >>>> SELECT attribute.value >>>> FROM person >>>> WHERE gramps_id == 'I12345' and attribute.type.string == 'Identification >>>> Number'; >> >> >> You can now do this with the following syntax: >> >> SELECT gramps_id, attribute_list("value", type__string="Social Security >> Number") >> FROM person >> WHERE gramps_id == "I0044"; >> >> This is now documented [1] but briefly: >> >> You can select and filter items in lists (shown now as a JOIN) by >> using parens after the list name (eg, you literally "call" the list as >> a function). The arguments are strings of what to show, and the >> keyword arguments are what to match. In this example, it says to show >> all items in attribute_list that have a "value" component, but only >> show those that have a corresponding "type.string" component with a >> value of "Social Security Number", and select records where the >> gramps_id is "I0044". (The double-underscore of "type__string" should >> be read as a dot, and is an idea stolen from Django's filter syntax >> [2].) The WHERE clause can use the same syntax. >> >> I'd like to add four bits of functionality to this Query Gramplet: >> >> 1) Be able to easily alter data. Currently, you can set specific >> values, but not do other common tasks, like add a Tag to an object. >> 2) Would like to be able to easily use regular expressions. Something >> like sed or awk. >> 3) Copy/paste/print the results (something to add to the Quick View) >> 4) Given better error messages >> >> At that point I think that this will be a useful scripting tool for >> quick searches, simple reports, and batch edits/deletes. Perhaps only >> for advanced users, but some queries are straightforward. >> >> -Doug >> >> [1] - http://www.gramps-project.org/wiki/index.php?title=QueryGramplet >> [2] - https://docs.djangoproject.com/en/1.5/topics/db/queries/ >> >>> >>> >>> ------------------------------------------------------------------------------ >>> Rapidly troubleshoot problems before they affect your business. Most IT >>> organizations don't have a clear picture of how application performance >>> affects their revenue. With AppDynamics, you get 100% visibility into >>> your >>> Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics >>> Pro! >>> >>> http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk >>> _______________________________________________ >>> Gramps-devel mailing list >>> Gra...@li... >>> https://lists.sourceforge.net/lists/listinfo/gramps-devel >> >> >> >> ------------------------------------------------------------------------------ >> Rapidly troubleshoot problems before they affect your business. Most IT >> organizations don't have a clear picture of how application performance >> affects their revenue. With AppDynamics, you get 100% visibility into your >> Java,.NET, & PHP application. Start your 15-day FREE TRIAL of AppDynamics >> Pro! >> >> http://pubads.g.doubleclick.net/gampad/clk?id=84349831&iu=/4140/ostg.clktrk >> _______________________________________________ >> Gramps-devel mailing list >> Gra...@li... >> https://lists.sourceforge.net/lists/listinfo/gramps-devel > > |
From: Tony P. <to...@pr...> - 2014-01-08 15:53:52
|
Because it's not abstract enough Michael. It's tied to the current Gramps SQL schema. Suppose one of the Gramps tables is later factorised into two tables, either for normalisation or indexing purposes. In order to ensure older scripts continue to work, you'd probably have to define some VIEWs, right? Suppose there's a major re-organisation of the schema, though. This is one of the reasons for having an abstract data-model. Ok, you're probably about to say that the SQL could be accessing abstract tables rather than the physical ones, but it's still a SQL-like query language. If you're trying to match a person by name, or a place by name, then a simple text match won't cut-it. There may be multiple spellings of a surname, for instance. This will be one of the reasons people will be asking for awk-like processing. However, it would be much more stable, and efficient, to put the required processing on the other side of a method call. A simple illustration that I've used before involves a contrived requirement to look at all the events in a timeline, then look at all the people sharing those same events, and then to select just the ones whose name(s) have the element "Jesson" in them. This example script uses a java-like syntax but the underlying object model would not be tied to any language (incl. SQL). Person me = New Person("Tony Proctor", 1956); for (Event e: me.allEvents()) { for (Person other: e.allPersons()) { if (other.nameContains("Jesson")) { ...do something with this other person... } } } Tony Proctor ----- Original Message ----- From: "Michael Tiernan" <mic...@gm...> To: "Tony Proctor" <to...@pr...> Cc: "Doug Blank" <dou...@gm...>; "Vassilii Khachaturov" <vas...@ta...>; "Gramps Development List" <gra...@li...> Sent: Wednesday, January 08, 2014 3:40 PM Subject: Re: [Gramps-devel] SQL-like API for Gramps gen.lib > On Wed, Jan 8, 2014 at 10:28 AM, Tony Proctor <to...@pr...> wrote: >> Taking a short-cut now, like the SQL approach, is going to create >> problems >> further... I can guarantee it. > There seems to be so many database abstraction tools/layers for > languages such as python, why wouldn't begining the process of > incorporating this abstraction layer(s) be a first step anyway? > > By the time that's done, the actual decision on database to use (if > one /is/ used) would be made and ready to implement? > > It's a very badly worded question. > -- > << MCT >> Michael C Tiernan. > http://www.linkedin.com/in/mtiernan > Non Impediti Ratione Cogatationis |
From: Doug B. <dou...@gm...> - 2014-01-08 16:02:20
|
On Wed, Jan 8, 2014 at 10:52 AM, Tony Proctor <to...@pr...> wrote: > Because it's not abstract enough Michael. It's tied to the current Gramps > SQL schema. Suppose one of the Gramps tables is later factorised into two No its not tied to a SQL schema. It is tied to the hierarchical data. I'm pretty happy that I was able to make a SQL-like thing, and yet keep exactly the Gramps data model. > tables, either for normalisation or indexing purposes. In order to ensure > older scripts continue to work, you'd probably have to define some VIEWs, > right? Suppose there's a major re-organisation of the schema, though. > > This is one of the reasons for having an abstract data-model. Ok, you're > probably about to say that the SQL could be accessing abstract tables rather > than the physical ones, but it's still a SQL-like query language. If you're > trying to match a person by name, or a place by name, then a simple text > match won't cut-it. There may be multiple spellings of a surname, for > instance. This will be one of the reasons people will be asking for awk-like > processing. However, it would be much more stable, and efficient, to put the > required processing on the other side of a method call. > > A simple illustration that I've used before involves a contrived requirement > to look at all the events in a timeline, then look at all the people sharing > those same events, and then to select just the ones whose name(s) have the > element "Jesson" in them. This example script uses a java-like syntax but > the underlying object model would not be tied to any language (incl. SQL). All try to construct a similar query (in this new language) when I get a chance. -Doug > Person me = New Person("Tony Proctor", 1956); > for (Event e: me.allEvents()) { > for (Person other: e.allPersons()) { > if (other.nameContains("Jesson")) { > ...do something with this other person... > } > } > } > > > Tony Proctor > > ----- Original Message ----- From: "Michael Tiernan" > <mic...@gm...> > To: "Tony Proctor" <to...@pr...> > Cc: "Doug Blank" <dou...@gm...>; "Vassilii Khachaturov" > <vas...@ta...>; "Gramps Development List" > <gra...@li...> > Sent: Wednesday, January 08, 2014 3:40 PM > > Subject: Re: [Gramps-devel] SQL-like API for Gramps gen.lib > > >> On Wed, Jan 8, 2014 at 10:28 AM, Tony Proctor <to...@pr...> wrote: >>> >>> Taking a short-cut now, like the SQL approach, is going to create >>> problems >>> further... I can guarantee it. >> >> There seems to be so many database abstraction tools/layers for >> languages such as python, why wouldn't begining the process of >> incorporating this abstraction layer(s) be a first step anyway? >> >> By the time that's done, the actual decision on database to use (if >> one /is/ used) would be made and ready to implement? >> >> It's a very badly worded question. >> -- >> << MCT >> Michael C Tiernan. >> http://www.linkedin.com/in/mtiernan >> Non Impediti Ratione Cogatationis > > |
From: Doug B. <dou...@gm...> - 2014-01-08 18:36:58
|
>> Person me = New Person("Tony Proctor", 1956); >> for (Event e: me.allEvents()) { >> for (Person other: e.allPersons()) { >> if (other.nameContains("Jesson")) { >> ...do something with this other person... >> } >> } >> } So this SQL-like scripting language currently only has access to data, no methods. So, many convenient methods are not available (like finding back-references). I would image that if one wanted to do this (which would take a bit of Gramps code) in the SQL-like language, one would do something like: SELECT gramps_id, event_ref_list.ref.gramps_id FROM person WHERE "Jesson" in primary_name.surname_list.surname; (The awk-like regular expression utilities would just make it so that I could easily find a more general pattern, such as "J.*s+.*n".) This query selects all people named "Jesson" who have events. (Alternate names matches would require another clause). I'd probably Tag these matches. Then Tag events referenced by the Person whose name is "Tony Proctor". And finally look for people that reference Events that are tagged by both. This isn't meant to replace Gramps Python code, but be a quick method of doing things that regular users wouldn't normally be able to do (or that would require a lot of filters and time). So, this isn't an abstraction per se. In fact, you have to know exactly how the data are laid out to use it. That makes it so that you don't have any of the problems you mention, but requires that users know how the data are designed, which is a different problem---and that could well make this way too hard to use. -Doug |
From: Helge.Herz <Hel...@we...> - 2014-01-18 12:30:51
|
Doug, If I try "select * from person" I get a so called parent_familiy_list with one column only and with e.g. this first row content: {'gramps_id': u'F00052',...}. It's also family (not person) related. A simple select * from person takes a lot of time. It seems to me helpful too to have at least something like an activity bar or an counter to show the process is going on instead 'no response' in the top of the result window. Would it be possible to have a SORTED BY too? It would be very helpful to have a real multi column sorting. select * from family results in: 'NoneType' object has no attribute 'struct' Great: The win-like Ctrl-V runs well within the query editor! Currently it's possible select several lines within the result window. But I don't find an copy to clipboard or an export. I think, it would be helpful too. It seems to me the change value within the result list doesn't show a real date but a number coding the date. All done with last master from GIT on Win7 Prof 64bit German python : 2.7.6 gramps : 4.1.0 gtk++ : 3.10.4 pygobject : 3.10.2 pango : 1.36.1 Not using bsddb3 bsddb : 5.3.0 bsddb.db : 4.7.25 cairo : 1.12.16 pycairo : 1.10.0 osmgpsmap : 1.0 GExiv2 : 0.4 ICU : 50.1.2 PyICU : 1.5 o.s. : win32 Environment settings: --------------------- LANG : de_DE.UTF-8 - Helge |
From: Doug B. <dou...@gm...> - 2014-01-18 13:41:55
|
Helge, thanks for the feedback; comments below: On Sat, Jan 18, 2014 at 7:30 AM, Helge.Herz <Hel...@we...> wrote: > Doug, > If I try "select * from person" I get a so called parent_familiy_list > with one column only and with e.g. this first row content: > {'gramps_id': u'F00052',...}. It's also family (not person) related. The query gramplet uses the raw data, and raw data names. So, every person has a list of parents. In the gen.lib.Person, these families are stored in parent_family_list. In the raw data, the person.parent_family_list is a list of handles that indicate which family. In the gramplet, these are looked-up and shown in the {}. That shows you the values, and also the field names. So you could also: SELECT gramps_id, parent_family_list.gramps_id AS "Family ID" FROM person; and see all of the parent families that each person has. You'll see these one per row, unless you add the keyword FLAT, and then it will show as a list of family gramps_ids. > A simple select * from person takes a lot of time. It seems to me > helpful too to have at least something like an activity bar or an > counter to show the process is going on instead 'no response' in the top > of the result window. That would be some nice refinement. I think I might also explore making the gramplet "yield" the data, so that it will be fast, and then show the data in a paged manner in the QuickView (table window that pops up). > Would it be possible to have a SORTED BY too? It would be very helpful > to have a real multi column sorting. Yes, that should be easy to add (although will make it so that you have to process all data before you can sort). > select * from family results in: 'NoneType' object has no attribute 'struct' Oops, that's a regression. Thanks, I'll track that down. > Great: The win-like Ctrl-V runs well within the query editor! Yes, that was a general issue I fixed that had prevented that from working on certain views. > Currently it's possible select several lines within the result window. > But I don't find an copy to clipboard or an export. I think, it would be > helpful too. Yes, that is something that I have been meaning to add to the SimpleTable for some time. I want to be able to copy and export/save from there. > It seems to me the change value within the result list doesn't show a > real date but a number coding the date. I have some local changes I need to push out to allow you to at least be able to convert between those numbers easily. I think that this will be most useful for quick selection and tagging. Let me know if you have any suggestions! Thanks again, -Doug > All done with last master from GIT on Win7 Prof 64bit German > python : 2.7.6 > gramps : 4.1.0 > gtk++ : 3.10.4 > pygobject : 3.10.2 > pango : 1.36.1 > Not using bsddb3 > bsddb : 5.3.0 > bsddb.db : 4.7.25 > cairo : 1.12.16 > pycairo : 1.10.0 > osmgpsmap : 1.0 > GExiv2 : 0.4 > ICU : 50.1.2 > PyICU : 1.5 > o.s. : win32 > > Environment settings: > --------------------- > LANG : de_DE.UTF-8 > - Helge > > > > > > ------------------------------------------------------------------------------ > CenturyLink Cloud: The Leader in Enterprise Cloud Services. > Learn Why More Businesses Are Choosing CenturyLink Cloud For > Critical Workloads, Development Environments & Everything In Between. > Get a Quote or Start a Free Trial Today. > http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg.clktrk > _______________________________________________ > Gramps-devel mailing list > Gra...@li... > https://lists.sourceforge.net/lists/listinfo/gramps-devel |