From: jerome <rom...@ya...> - 2011-01-14 18:37:01
|
Hi,

I am trying to get an answer to a question about the code: why can't we keep the order of objects when a Gramps XML file is imported and then exported again?

Nick pointed out that objects are not ordered on export [1]. Why? I suppose backup scripts and revision control tools would work better with ordered objects! In any case, using 'sort_handles=True' works on export, except for family handles. Is there a reason for that? A typo somewhere? A mistake on my side?

[1] http://www.gramps-project.org/bugs/view.php?id=4365

Regards,
Jérôme |
From: Gerald B. <ger...@gm...> - 2011-01-14 18:53:59
|
The data is not ordered because it comes from bsddb in random order. If we ordered it, we would have to sort it by some key. So, if we did, what keys would you use for:

person
family
event
source
place
repository
note
media object

On Fri, Jan 14, 2011 at 1:36 PM, jerome <rom...@ya...> wrote:
> Hi,
>
> I am trying to get an answer to a question about the code: why can't we keep the order of objects when a Gramps XML file is imported and then exported again?
>
> Nick pointed out that objects are not ordered on export [1].
> Why? I suppose backup scripts and revision control tools would work better with ordered objects! In any case, using 'sort_handles=True' works on export, except for family handles. Is there a reason for that? A typo somewhere? A mistake on my side?
>
> [1] http://www.gramps-project.org/bugs/view.php?id=4365
>
> Regards,
> Jérôme
>
> _______________________________________________
> Gramps-devel mailing list
> Gra...@li...
> https://lists.sourceforge.net/lists/listinfo/gramps-devel

--
Gerald Britton |
From: jerome <rom...@ya...> - 2011-01-14 20:12:01
|
I am not sure I understand... The keys should be handles, no?

'self.db.get_{object}_handles(sort_handles=True)' is allowed, but 'self.db.iter_{object}_handles(sort_handles=True)' is not!

There are two questions:

1. Why does Gramps use self.db.iter_family_handles() for families only, and self.db.get_{object}_handles() otherwise, where {object} is person, event, source, place, repository, note or media object?

2. Why is the 'sort_handles=True' argument allowed for all primary objects except the family object?

> The data is not ordered since it
> comes from bsddb in random order.

This could explain why I will not be able to keep the order on XML import (to bsddb). :(

Thanks.
Jérôme

--- On Fri 14.1.11, Gerald Britton <ger...@gm...> wrote:

> From: Gerald Britton <ger...@gm...>
> Subject: Re: [Gramps-devel] self.db.iter_object_handles(sort_handles=True)
> To: "jerome" <rom...@ya...>
> Cc: gra...@li...
> Date: Friday 14 January 2011, 19:53
>
> The data is not ordered since it comes from bsddb in random order. If
> we ordered it, we would have to sort it by some key. So, if we did,
> what keys would you use for:
>
> person
> family
> event
> source
> place
> repository
> note
> media object |
From: Gerald B. <ger...@gm...> - 2011-01-14 20:21:39
|
On Fri, Jan 14, 2011 at 3:11 PM, jerome <rom...@ya...> wrote:
> I am not sure I understand...
> The keys should be handles, no?

Well, that's the question! I can see a case for gramps ids, or surnames, or event dates, etc.

> 'self.db.get_{object}_handles(sort_handles=True)' is allowed,
> but 'self.db.iter_{object}_handles(sort_handles=True)' is not!
>
> There are two questions:
>
> 1. Why does Gramps use self.db.iter_family_handles() for families only, and self.db.get_{object}_handles() otherwise, where {object} is person, event, source, place, repository, note or media object?

The get_...handles methods return a list, which can be expensive in memory and must read all objects in one pass. The iter... methods return just one handle at a time, so they are cheaper in memory. So, the iter... methods are preferable. OTOH, they cannot do sorting, since by definition you need to read all records before you can sort them.

> 2. Why is the 'sort_handles=True' argument allowed for all primary objects except the family object?

I suppose that there has been no requirement so far, so no one coded it up.

--
Gerald Britton |
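The list-versus-iterator trade-off described above can be sketched as follows. This is only an illustration under stated assumptions: ToyDb, get_handles and iter_handles are hypothetical stand-ins, not the real Gramps database class or its methods.

```python
class ToyDb:
    """Hypothetical stand-in for a handle store; not the real Gramps database class."""

    def __init__(self, handles):
        # A set has no meaningful order, much like keys coming back from a hash table.
        self._handles = set(handles)

    def get_handles(self, sort_handles=False):
        """Return every handle as a list; the whole list is in memory at once."""
        handles = list(self._handles)
        if sort_handles:
            handles.sort()
        return handles

    def iter_handles(self):
        """Yield handles one at a time; nothing is collected, so nothing can be sorted here."""
        for handle in self._handles:
            yield handle


db = ToyDb(["c3", "a1", "b2"])

# Sorting what the iterator yields still forces a full read: sorted() builds a list.
sorted_from_iter = sorted(db.iter_handles())
sorted_from_get = db.get_handles(sort_handles=True)
print(sorted_from_iter == sorted_from_get)
```

In other words, a sorted export has to pay the memory cost of the list somewhere, whether inside the database method or in the caller.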
From: jerome <rom...@ya...> - 2011-01-14 20:59:55
|
> > I am not sure I understand...
> > The keys should be handles, no?
>
> Well, that's the question! I can see a case for gramps ids, or
> surnames, or event dates, etc.

But the handle is the easiest and safest key for ordering our data.

Gramps ids can be exotic!
Surnames are not a good key :(
date => date_object => year, then month, then day, then rank, etc. = a horrible index

My problem is in plugins/export/ExportXML.py.

I saw an unused sortByID function, then lists used in some places (get_...) and iteration in others (family handles only).

I thought of using lists sorted by handle in order to have an ordering rule. I do not want to group handles: handles will already be grouped within the Gramps XML, so the plan was not to parse one flat XML file or anything like that!

But that is not my main problem...
I thought that sorting handles would make the object lists consistent (Persons, Families, Events, etc.).

Every time I import a Gramps XML file, Gramps rebuilds (write, DB commit) some objects! The change time is not the same after a simple import followed by an export.

I can understand the random order used by bsddb, but it should not apply to some objects (like families) and not to the others.

In my mind, an import without DB changes is like a "read-only" operation, but that is not the case. OK, you are saying that this is how bsddb works. It should be possible to use 'diff' or revision control tools on XML files. With the current Gramps XML import/export, these tools are of limited use. :(

Jérôme |
From: Gerald B. <ger...@gm...> - 2011-01-14 21:10:34
|
On Fri, Jan 14, 2011 at 3:59 PM, jerome <rom...@ya...> wrote:
> But the handle is the easiest and safest key for ordering our data.

Only if that's the order you want.

> Gramps ids can be exotic!

Do you mean unique? Anyway, it is a good sort-key candidate.

> Surnames are not a good key :(

I can see that some would like it... it makes the XML easier for a human to read.

> date => date_object => year, then month, then day, then rank, etc. = a horrible index

Probably, but it's just one possibility.

> Every time I import a Gramps XML file, Gramps rebuilds (write, DB commit) some objects! The change time is not the same after a simple import followed by an export.

Well, they all need new handles, right? There is a possibility of collisions. The same goes for gramps ids.

> In my mind, an import without DB changes is like a "read-only" operation, but that is not the case. OK, you are saying that this is how bsddb works. It should be possible to use 'diff' or revision control tools on XML files. With the current Gramps XML import/export, these tools are of limited use. :(

Yep. You're probably looking for something like a UUID for each record. Not a bad idea, but not implemented at the moment.

--
Gerald Britton |
From: jerome <rom...@ya...> - 2011-01-14 21:31:27
|
> > Gramps ids can be exotic!
> Do you mean unique? Anyway, it is a good sort-key candidate.

ids = [I000001, IAYUTRE235, zharb, /empty/, etc ...]

In 'handle' I trust! ;)

> > Every time I import a Gramps XML file, Gramps rebuilds (write, DB commit) some objects! The change time is not the same after a simple import followed by an export.
> Well, they all need new handles, right? There is a possibility of collisions.
> The same goes for gramps ids.

In fact, I want to keep the handles: they should be the controlling keys.

My problem can be illustrated by something like:

$ gramps -i import.gramps -e export.gramps
$ gunzip < import.gramps > import.xml
$ gunzip < export.gramps > export.xml
$ diff -u import.xml export.xml > diff.txt

where import.gramps is our "scientific control".

What should diff.txt contain?

In my view, it should be only a few lines...
Unfortunately there are some changes (order, change time on family objects): that's strange!

Jérôme |
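One way to make a plain diff usable despite the ordering problem is to put both files into a canonical order first. The sketch below is only an illustration: it assumes a simplified, namespace-free layout in which each child of the root is a container whose children carry a 'handle' attribute, and sort_by_handle is a hypothetical helper, not part of Gramps.

```python
import xml.etree.ElementTree as ET


def sort_by_handle(xml_text):
    """Return the XML with the children of each top-level container sorted by handle."""
    root = ET.fromstring(xml_text)
    for container in root:
        # Slice assignment replaces the container's children with the sorted list.
        container[:] = sorted(container, key=lambda child: child.get("handle", ""))
    return ET.tostring(root, encoding="unicode")


# Two toy documents holding the same families in a different order.
first = '<database><families><family handle="_b"/><family handle="_a"/></families></database>'
second = '<database><families><family handle="_a"/><family handle="_b"/></families></database>'

print(first == second)
print(sort_by_handle(first) == sort_by_handle(second))
```

Running both import.xml and export.xml through such a step before 'diff -u' would remove differences that are due to order alone; differences in the change time would still show up.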
From: Doug B. <dou...@gm...> - 2011-01-14 21:57:22
|
On Fri, Jan 14, 2011 at 4:31 PM, jerome <rom...@ya...> wrote:
> My problem can be illustrated by something like:
>
> $ gramps -i import.gramps -e export.gramps
> $ gunzip < import.gramps > import.xml
> $ gunzip < export.gramps > export.xml
> $ diff -u import.xml export.xml > diff.txt
>
> where import.gramps is our "scientific control".
>
> What should diff.txt contain?
>
> In my view, it should be only a few lines...
> Unfortunately there are some changes (order, change time on family objects): that's strange!

Yes, it would be handy to do this. This might be called "idempotent" by a mathematician: if the round-trip through gramps were idempotent, then the diff would be empty. What we need is one of:

1. something smarter than diff for this usage
2. sort on something that doesn't change (like the handle), just for this purpose
3. make it so that the order is preserved

I would lean towards #3. I've "fixed" some other places where the order was lost. If you let me know which orders are lost, I'll address them.

-Doug |
From: jerome <rom...@ya...> - 2011-01-14 22:13:01
|
> Yes, it would be handy to do this. This might be called "idempotent"
> by a mathematician: if the round-trip through gramps was idempotent,
> then the diff would be empty.

That's exactly what I tried to do.
I learned a new word! :)
Thanks!

> 3. make it so that the order is preserved
>
> I would lean towards #3. I've "fixed" some other places
> where the order was lost. If you let me know which orders are lost, I'll address.

At a glance, I would say events, notes, places.

But there is something else:
1. some families are re-written (change time)
2. small samples are not reordered! Is there a cache limit?

http://www.gramps-project.org/bugs/view.php?id=4365


Jérôme


--- On Fri, 14.1.11, Doug Blank <dou...@gm...> wrote:

> From: Doug Blank <dou...@gm...>
> Subject: Re: [Gramps-devel] self.db.iter_object_handles(sort_handles=True)
> To: "jerome" <rom...@ya...>
> Cc: "Gerald Britton" <ger...@gm...>, gra...@li...
> Date: Friday, 14 January 2011, 22:57
> On Fri, Jan 14, 2011 at 4:31 PM, jerome <rom...@ya...> wrote:
> >> > gramps ids could be exotic!
> >> Do you mean unique? Anyway it is a good sort-key candidate
> >
> > ids = [I000001, IAYUTRE235, zharb, /empty/ , etc ...]
> >
> > In 'handle' I trust! ;)
> >
> >> > Every time I import a Gramps XML, Gramps rebuilds
> >> > (write, DB commit) some objects! Change time is not the same
> >> > with a simple import then export.
> >> Well, they all need new handles, right? Possibility of collisions.
> >> Also with gramps ids.
> >
> > In fact, I want to keep handles: they should be the keys control.
> >
> > My problem could be illustrated by something like:
> >
> > $ gramps -i import.gramps -e export.gramps
> > $ gunzip < import.gramps > import.xml
> > $ gunzip < export.gramps > export.xml
> > $ diff -u import.xml export.xml > diff.txt
> >
> > where import.gramps is our "Scientific control".
> >
> > What should be the content of diff.txt ?
> >
> > For me, it should be few lines...
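Note on the round-trip check discussed in this message: one way to get "something smarter than diff" is to normalize both files before comparing them. The sketch below is only an illustration, not part of Gramps. It assumes that each object element in the XML carries a `handle` attribute and a `change` attribute, and that objects are grouped in sections directly under the root element; the function names are hypothetical. It needs Python 3.9 or later for `ET.indent`.

```python
import difflib
import gzip
import xml.etree.ElementTree as ET


def read_gramps_xml(path):
    """Read a file that is either gzip-compressed or plain XML."""
    with open(path, "rb") as fh:
        data = fh.read()
    if data[:2] == b"\x1f\x8b":  # gzip magic number
        data = gzip.decompress(data)
    return data


def normalized_lines(xml_bytes, ignore_attrs=("change",)):
    """Return the XML as text lines, independent of object order.

    Assumption: objects sit in sections directly under the root and
    are identified by a 'handle' attribute.
    """
    root = ET.fromstring(xml_bytes)
    for section in root:
        for obj in section:
            for attr in ignore_attrs:
                obj.attrib.pop(attr, None)  # e.g. drop the change time
        # Stable sort: elements without a handle keep their relative order.
        section[:] = sorted(section, key=lambda el: el.get("handle", ""))
    ET.indent(root)  # normalize whitespace so moved elements compare equal
    return ET.tostring(root, encoding="unicode").splitlines()


def roundtrip_diff(before_xml, after_xml):
    """Unified diff of two XML documents after normalization."""
    return list(difflib.unified_diff(
        normalized_lines(before_xml), normalized_lines(after_xml),
        "import.xml", "export.xml", lineterm=""))
```

With this, `roundtrip_diff(read_gramps_xml("import.gramps"), read_gramps_xml("export.gramps"))` should list only real content changes. Passing `ignore_attrs=()` keeps the change times in the comparison, which is useful for spotting the re-written families.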
From: Jérôme <rom...@ya...> - 2011-01-15 16:59:33
|
>> 3. make it so that the order is preserved
>>
>> I would lean towards #3. I've "fixed" some other places
>> where the order was lost. If you let me know which orders are lost, I'll address.
>
> At glance, I will say events, notes, places.

Sorry, only events, families and a few persons; see diff.txt on:
http://www.gramps-project.org/bugs/view.php?id=4365
Maybe places and notes too, on my primary database?

Is it possible that there is a cache on import and that, if there are a lot of handles, the cache is committed like a direct write (so the order is not kept on import)? This could explain why long lists of objects can end up upside down, and only on a large database or a large group of objects.


Jérôme


jerome wrote:
>> Yes, it would be handy to do this. This might be called "idempotent"
>> by a mathematician: if the round-trip through gramps was idempotent,
>> then the diff would be empty.
>
> That's exactly what I tried to do.
> I learned one word! :)
> Thanks!
>
>> 3. make it so that the order is preserved
>>
>> I would lean towards #3. I've "fixed" some other places
>> where the order was lost. If you let me know which orders are lost, I'll address.
>
> At glance, I will say events, notes, places.
>
> But there is something else:
> 1. some families are re-written (change time)
> 2. small samples do not reorder ! cache limit ?
>
> http://www.gramps-project.org/bugs/view.php?id=4365
>
>
> Jérôme
>
>
> --- On Fri, 14.1.11, Doug Blank <dou...@gm...> wrote:
>
>> From: Doug Blank <dou...@gm...>
>> Subject: Re: [Gramps-devel] self.db.iter_object_handles(sort_handles=True)
>> To: "jerome" <rom...@ya...>
>> Cc: "Gerald Britton" <ger...@gm...>, gra...@li...
>> Date: Friday, 14 January 2011, 22:57
>> On Fri, Jan 14, 2011 at 4:31 PM, jerome <rom...@ya...> wrote:
>>>>> gramps ids could be exotic!
>>>> Do you mean unique? Anyway it is a good sort-key candidate
>>> ids = [I000001, IAYUTRE235, zharb, /empty/ , etc ...]
>>>
>>> In 'handle' I trust!
From: Gerald B. <ger...@gm...> - 2011-01-15 17:35:35
|
Two ideas:

1. Add a sort_handles option to each of the eight read methods, plus an
option on export to use it.

2. Post-process the exported xml file through XML Query for sorting.

Which option to choose depends on the use cases. If enough folks need
option 1, we should do it. Otherwise, I recommend option 2.

On Sat, Jan 15, 2011 at 11:59 AM, Jérôme <rom...@ya...> wrote:
>>> 3. make it so that the order is preserved
>>>
>>> I would lean towards #3. I've "fixed" some other places
>>> where the order was lost. If you let me know which orders are lost, I'll
>>> address.
>>
>> At glance, I will say events, notes, places.
>
> Sorry, only on events, family and few persons, see diff.txt on:
> http://www.gramps-project.org/bugs/view.php?id=4365
> maybe places and notes on my primary database?
>
> Is it possible that there is a cache on import and if there is a lot of
> handles, then cache is commited like a direct write (do not keep order on
> import ?). This could explain that long lists of objects could be
> up-side-down! And only on large database or group of objects.
>
>
> Jérôme
>
>
> jerome wrote:
>>>
>>> Yes, it would be handy to do this. This might be called "idempotent"
>>> by a mathematician: if the round-trip through gramps was idempotent,
>>> then the diff would be empty.
>>
>> That's exactly what I tried to do. I learned one word! :)
>> Thanks!
>>
>>> 3. make it so that the order is preserved
>>>
>>> I would lean towards #3. I've "fixed" some other places
>>> where the order was lost. If you let me know which orders are lost, I'll
>>> address.
>>
>> At glance, I will say events, notes, places.
>>
>> But there is something else:
>> 1. some families are re-written (change time)
>> 2. small samples do not reorder ! cache limit ?
>>>>>>>>>> >>>>>>>>>> [1] http://www.gramps-project.org/bugs/view.php?id=4365 >>>>>>>>>> >>>>>>>>>> regards, >>>>>>>>>> Jérôme >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>> >>> ------------------------------------------------------------------------------ >>>>>>>>>> >>>>>>>>>> Protect Your Site and >>> >>> Customers from >>>>> >>>>> Malware >>>>>>> >>>>>>> Attacks >>>>>>>>>> >>>>>>>>>> Learn about various malware >>> >>> tactics >>>>> >>>>> and how >>>>>>> >>>>>>> to avoid >>>>>>>>> >>>>>>>>> them. Understand >>>>>>>>>> >>>>>>>>>> malware threats, the impact >>> >>> they can >>>>> >>>>> have on >>>>>>> >>>>>>> your >>>>>>>>> >>>>>>>>> business, and how you >>>>>>>>>> >>>>>>>>>> can protect your company >>> >>> and >>>>> >>>>> customers by >>>>>>> >>>>>>> using code >>>>>>>>> >>>>>>>>> signing. >>>>>>>>>> >>>>>>>>>> http://p.sf.net/sfu/oracle-sfdevnl >>>>>>>>>> >>> _______________________________________________ >>>>>>>>>> >>>>>>>>>> Gramps-devel mailing list >>>>>>>>>> Gra...@li... >>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/gramps-devel >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Gerald Britton >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Gerald Britton >>>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Gerald Britton >>>>> >>>> >>>> >>>> >>>> >>> >>> ------------------------------------------------------------------------------ >>>> >>>> Protect Your Site and Customers from Malware Attacks >>>> Learn about various malware tactics and how to avoid >>> >>> them. Understand >>>> >>>> malware threats, the impact they can have on your >>> >>> business, and how you >>>> >>>> can protect your company and customers by using code >>> >>> signing. >>>> >>>> http://p.sf.net/sfu/oracle-sfdevnl >>>> _______________________________________________ >>>> Gramps-devel mailing list >>>> Gra...@li... 
>>>> https://lists.sourceforge.net/lists/listinfo/gramps-devel >>>> >> >> >> >> >> ------------------------------------------------------------------------------ >> Protect Your Site and Customers from Malware Attacks >> Learn about various malware tactics and how to avoid them. Understand >> malware threats, the impact they can have on your business, and how you can >> protect your company and customers by using code signing. >> http://p.sf.net/sfu/oracle-sfdevnl >> _______________________________________________ >> Gramps-devel mailing list >> Gra...@li... >> https://lists.sourceforge.net/lists/listinfo/gramps-devel >> > > -- Gerald Britton |
From: Jérôme <rom...@ya...> - 2011-01-15 18:47:09
|
> Two ideas:
>
> 1. Add sort_handles options to all eight read methods and an option to
> export to use this option.
>
> 2. Post-process the exported xml file through XML Query for sorting.
>
> Which option depends on the use cases. If enough folks need option 1,
> we should do it. Otherwise, I recommend option 2.

About "1.": sorting the lists (except family) is already done, but the problem is located in the import (hash, re-write of the "home person" family, etc.).

I should also be able to sort the XML (post-process) with a custom index, but sorting was not the primary goal.

I wanted to see the 'raw' modifications, and to be able to make "human plain text" modifications to *.gramps files!

If it is related to the bsddb hash, then why should I still try to sort handles on export?

Jérôme |
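Option 2 quoted above (post-processing the exported XML so that objects appear in a stable order) could also be sketched in Python instead of XML Query. This is only a minimal sketch under assumptions: it supposes that each top-level container of the export holds objects carrying a 'handle' attribute, and the sample document is a simplified, hypothetical stand-in for a real Gramps XML file (no namespace, no real data). The function name sort_children_by_handle is made up for illustration and is not part of Gramps.

```python
import xml.etree.ElementTree as ET


def sort_children_by_handle(root):
    """Sort, in place, the children of each top-level container by 'handle'.

    Containers whose children do not all carry a 'handle' attribute
    (for example a header block) are left untouched.
    """
    for container in root:
        children = list(container)
        if children and all("handle" in child.attrib for child in children):
            children.sort(key=lambda child: child.attrib["handle"])
            container[:] = children  # replace the children in sorted order
    return root


# Hypothetical, simplified export: element and attribute names are assumptions.
SAMPLE = """<database>
  <header><created date="2011-01-15"/></header>
  <people>
    <person handle="_b2" id="I0002"/>
    <person handle="_a1" id="I0001"/>
  </people>
  <families>
    <family handle="_d4" id="F0002"/>
    <family handle="_c3" id="F0001"/>
  </families>
</database>"""

root = sort_children_by_handle(ET.fromstring(SAMPLE))
people_handles = [p.get("handle") for p in root.find("people")]
family_handles = [f.get("handle") for f in root.find("families")]
print(people_handles, family_handles)
```

Sorting on the handle only gives a stable order for diff or revision control tools; it says nothing about why some objects are re-written on import.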
From: Gerald B. <ger...@gm...> - 2011-01-15 19:55:45
|
On Sat, Jan 15, 2011 at 1:46 PM, Jérôme <rom...@ya...> wrote:
>> Two ideas:
>>
>> 1. Add sort_handles options to all eight read methods and an option to
>> export to use this option.
>>
>> 2. Post-process the exported xml file through XML Query for sorting.
>>
>> Which option depends on the use cases. If enough folks need option 1,
>> we should do it. Otherwise, I recommend option 2.
>
> About "1." to sort lists (except family) was done, but the problem is
> localized on import (hash, re-write "home person" family, etc ...)
>
> I should be able to also sort XML (post-process) with a custom index,
> but it was not the primary goal (to sort).
>
> I wanted to see 'raw' modifications, to be able to make "human plain
> text" modifications on *.gramps!
>
> If it is related to bsddb hash, then why should I still try to sort
> handles on export?

Maybe I misunderstood. I thought your goal was to be able to compare two exports of Gramps to XML. I thought that you had found that after export(1), import, export(2), exports 1 and 2 do not match, since the order of the objects is different. (In fact, I'd be surprised if they were the same!) If this is your concern, then one of the two options would give you what you want.

If I misunderstood, I apologize. Would you help me understand it better?

-- Gerald Britton |
From: jerome <rom...@ya...> - 2011-01-15 20:11:14
|
> I thought your goal was to be able to compare two exports of gramps
> to xml.

Yes, it was. But comparing only the imported XML against the exported XML should be enough, since the first export is our backup!

> I thought that you had found that after export(1), import, export(2),
> exports 1 and 2 do not match since the order of the objects is
> different. (In fact I'd be surprised if they were the same!).

'Same' was not the expected result; I only wanted a "pure" data comparison, independent of hash, bsddb and ordering! I got my answer: "the bsddb hash uses a random order, and the 'flag-like' handles are ignored on XML import".

> If this is your concern, then one of the two options would give you
> what you want.

I want nothing, except to understand... There is no plan for a specific use; I just want to understand why it is not possible!

> If I misunderstood, I apologize. Would you help me understand it
> better?

OK, I understand the message... Thank you for your answers.

regards,
Jérôme |
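For the comparison described in the message above (import a Gramps XML file, export it again, then compare the two), object order can be taken out of the picture by sorting each group of objects by handle before running diff. A minimal sketch, assuming the XML has top-level containers whose children each carry a `handle` attribute; `sort_by_handle` is a hypothetical helper and not part of Gramps:

```python
import xml.etree.ElementTree as ET


def sort_by_handle(xml_text):
    """Return xml_text with each top-level group of objects sorted by handle.

    Assumption: objects (person, family, event, ...) are direct children of
    containers under the root element and each has a 'handle' attribute.
    Containers whose children lack a handle (e.g. a header) are left alone.
    """
    root = ET.fromstring(xml_text)
    for container in root:
        children = list(container)
        if children and all("handle" in child.attrib for child in children):
            children.sort(key=lambda child: child.get("handle"))
            container[:] = children  # replace the children in sorted order
    return ET.tostring(root, encoding="unicode")
```

Running both the imported and the exported file through the same function before diffing should leave only real data changes (such as change times) in the diff, whatever order the database returned the objects in.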
From: Jérôme <rom...@ya...> - 2011-01-15 09:17:32
|
> if the round-trip through gramps was idempotent, then the diff would
> be empty.

The expected result was: a minor change in the generation date (if generated on another day) and maybe in the media objects (media paths).

I do not expect a fully idempotent round-trip, but currently we cannot easily get the differences. I just wanted to test a complete XML migration before a major release.

Jérôme |
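The round-trip check discussed in this thread compares the decompressed import file with the decompressed export file. A minimal sketch of the same check in Python, assuming both `.gramps` files are gzip-compressed XML (as the `gunzip` usage in the thread suggests); `roundtrip_diff` is a hypothetical helper, not part of Gramps:

```python
import difflib
import gzip


def roundtrip_diff(before_path, after_path):
    """Return the unified-diff lines between two gzip-compressed text files."""
    with gzip.open(before_path, "rt", encoding="utf-8") as handle:
        before = handle.readlines()
    with gzip.open(after_path, "rt", encoding="utf-8") as handle:
        after = handle.readlines()
    # An empty list means the two files have identical content.
    return list(
        difflib.unified_diff(before, after, fromfile=before_path, tofile=after_path)
    )
```

An empty result would correspond to the fully idempotent round-trip mentioned above; in practice a few lines (generation date, media paths) are expected to differ.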
From: Benny M. <ben...@gm...> - 2011-01-15 13:44:12
|
We should _never_ order on export. We should only access things via an index in the database.

Ordering would mean a huge time penalty on export for those with very large family trees. Even exporting along a bsddb index would be much slower, since at present we simply go from one database page to the next. Just looping over the data and exporting means the hard disk is read as little as possible (the loop goes from database page to database page).

In other words:

1/ The default should be just a cursor over the database table, so order cannot be maintained.
2/ Ordered output could be optional.

If we add an ordered output, it should follow an index of the database, so that no in-memory sorting has to occur before the export can be done. I think the ID has a sorted index over it. The handle normally does too, as it is the primary key and will hence be in some sort of B-tree. You must be sure to use the sort index when looping, however.

Benny
:( > >>>>>> > >>>>>> Thanks. > >>>>>> Jérôme > >>>>>> > >>>>>> --- En date de : Ven 14.1.11, Gerald Britton > >>> <ger...@gm...> > >>>>> a écrit : > >>>>>>> De: Gerald Britton <ger...@gm...> > >>>>>>> Objet: Re: [Gramps-devel] > >>>>> self.db.iter_object_handles(sort_handles=True) > >>>>>>> À: "jerome" <rom...@ya...> > >>>>>>> Cc: gra...@li... > >>>>>>> Date: Vendredi 14 janvier 2011, 19h53 > >>>>>>> The data is not ordered since it > >>>>>>> comes from bsddb in random order. If > >>>>>>> we ordered it, we would have to sort it > >>> by some > >>>>> key. > >>>>>>> So, if we did, > >>>>>>> what keys would you use for: > >>>>>>> > >>>>>>> person > >>>>>>> family > >>>>>>> event > >>>>>>> source > >>>>>>> place > >>>>>>> repository > >>>>>>> note > >>>>>>> media object > >>>>>>> > >>>>>>> On Fri, Jan 14, 2011 at 1:36 PM, jerome > >>> <rom...@ya...> > >>>>>>> wrote: > >>>>>>>> Hi, > >>>>>>>> > >>>>>>>> > >>>>>>>> I am trying to get an answer to a > >>> question > >>>>> about the > >>>>>>> code: why we cannot keep the order of > >>> objects > >>>>> after a Gramps > >>>>>>> XML file import against export ? > >>>>>>>> Nick pointed out that objects are > >>> not ordered > >>>>> on > >>>>>>> export[1]. > >>>>>>>> Why ? I suppose backup scripts or > >>> revision > >>>>> control > >>>>>>> tools will work better with ordered > >>> objects! > >>>>> Anyway, to use > >>>>>>> 'sort_handles=True' works on export, > >>> except for > >>>>> family > >>>>>>> handles. Any reason for that ? A typo > >>> somewhere ? > >>>>> On my side > >>>>>>> ? > >>>>>>>> > >>>>>>>> [1] http://www.gramps-project.org/bugs/view.php?id=4365 > >>>>>>>> > >>>>>>>> regards, > >>>>>>>> Jérôme > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>> > ------------------------------------------------------------------------------ > >>>>>>>> Protect Your Site and Customers from > >>> Malware > >>>>> Attacks > >>>>>>>> Learn about various malware tactics > >>> and how > >>>>> to avoid > >>>>>>> them. 
Understand > >>>>>>>> malware threats, the impact they can > >>> have on > >>>>> your > >>>>>>> business, and how you > >>>>>>>> can protect your company and > >>> customers by > >>>>> using code > >>>>>>> signing. > >>>>>>>> http://p.sf.net/sfu/oracle-sfdevnl > >>>>>>>> > >>>>> _______________________________________________ > >>>>>>>> Gramps-devel mailing list > >>>>>>>> Gra...@li... > >>>>>>>> https://lists.sourceforge.net/lists/listinfo/gramps-devel > >>>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> -- > >>>>>>> Gerald Britton > >>>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>> > >>>>> > >>>>> -- > >>>>> Gerald Britton > >>>>> > >>>> > >>>> > >>>> > >>> > >>> > >>> -- > >>> Gerald Britton > >>> > >> > >> > >> > >> > ------------------------------------------------------------------------------ > >> Protect Your Site and Customers from Malware Attacks > >> Learn about various malware tactics and how to avoid them. Understand > >> malware threats, the impact they can have on your business, and how you > >> can protect your company and customers by using code signing. > >> http://p.sf.net/sfu/oracle-sfdevnl > >> _______________________________________________ > >> Gramps-devel mailing list > >> Gra...@li... > >> https://lists.sourceforge.net/lists/listinfo/gramps-devel > >> > > > > > > ------------------------------------------------------------------------------ > Protect Your Site and Customers from Malware Attacks > Learn about various malware tactics and how to avoid them. Understand > malware threats, the impact they can have on your business, and how you > can protect your company and customers by using code signing. > http://p.sf.net/sfu/oracle-sfdevnl > _______________________________________________ > Gramps-devel mailing list > Gra...@li... > https://lists.sourceforge.net/lists/listinfo/gramps-devel > |
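The two options Benny lists (a plain cursor by default, an index-ordered walk as an option) can be sketched in plain Python. This is only an illustration under stated assumptions: `ToyTable`, `put`, `iter_unordered` and `iter_by_index` are hypothetical names for an in-memory stand-in, not the Gramps database API and not bsddb. The point it tries to show is that an index kept sorted at write time lets an export follow handle order without sorting everything in memory first.

```python
import bisect


class ToyTable:
    """Hypothetical in-memory stand-in for one database table (not the Gramps API)."""

    def __init__(self):
        self._rows = {}   # handle -> record, kept in "storage" (insertion) order
        self._index = []  # handles kept sorted as rows are written

    def put(self, handle, record):
        if handle not in self._rows:
            # Pay the ordering cost at write time, one key at a time.
            bisect.insort(self._index, handle)
        self._rows[handle] = record

    def iter_unordered(self):
        """Option 1: plain cursor over the table; cheapest, order not guaranteed."""
        for handle, record in self._rows.items():
            yield handle, record

    def iter_by_index(self):
        """Option 2: follow the sorted index; no sort is needed at export time."""
        for handle in self._index:
            yield handle, self._rows[handle]


table = ToyTable()
for handle in ("_c3", "_a1", "_b2"):
    table.put(handle, {"handle": handle})

unordered = [h for h, _ in table.iter_unordered()]
ordered = [h for h, _ in table.iter_by_index()]
print(unordered)  # storage order
print(ordered)    # handle order
```

Both methods are generators, so neither builds a full list of records; the toy cannot show the disk-page cost Benny mentions for the index-ordered walk.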
From: Gerald B. <ger...@gm...> - 2011-01-15 15:03:40
|
Agreed. If the export is in handle order we should be fine.
Re-importing, though, can generate new handles, can it not? If so, we lose idempotency, which I think is Jérôme's issue.

--
Gerald Britton
|
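Until exports come out in a stable order, one way to compare two files regardless of order is to sort the records by handle before comparing. The sketch below is an assumption-laden illustration, not Gramps code: the sample documents are a heavily simplified stand-in for Gramps XML, and `canonical` is a hypothetical helper name. It reorders any group of sibling elements that all carry a `handle` attribute and drops indentation-only whitespace.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up samples: same records, different order.
SAMPLE_IMPORT = """<database>
  <people>
    <person handle="_b2" id="I0002"/>
    <person handle="_a1" id="I0001"/>
  </people>
</database>"""

SAMPLE_EXPORT = """<database>
  <people>
    <person handle="_a1" id="I0001"/>
    <person handle="_b2" id="I0002"/>
  </people>
</database>"""


def canonical(xml_text):
    """Re-serialise XML with handle-bearing siblings sorted by handle."""
    root = ET.fromstring(xml_text)
    # Take a snapshot of the elements first, since children are reordered below.
    for elem in list(root.iter()):
        # Drop indentation-only text so that layout does not affect the result.
        if elem.text is not None and not elem.text.strip():
            elem.text = None
        if elem.tail is not None and not elem.tail.strip():
            elem.tail = None
        children = list(elem)
        if children and all("handle" in child.attrib for child in children):
            elem[:] = sorted(children, key=lambda child: child.attrib["handle"])
    return ET.tostring(root, encoding="unicode")


same_after_sorting = canonical(SAMPLE_IMPORT) == canonical(SAMPLE_EXPORT)
print(same_after_sorting)
```

Writing the canonical form of each file to disk and running `diff -u` on the results would then show only real differences, such as changed timestamps, provided the handles themselves are unchanged.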
From: Benny M. <ben...@gm...> - 2011-01-15 15:06:12
|
2011/1/15 Gerald Britton <ger...@gm...>

> Agreed. If the export is in handle order we should be fine.
> Re-importing though can generate new handles, can it not? If so, we
> lose idempotency which is jerome's issue I think.

Reimporting into an empty family tree keeps the handles.

Benny
|
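The claim that handles survive a re-import into an empty tree can be checked without caring about order at all, by comparing the sets of handles in the two files. A minimal sketch, assuming (as the `gunzip` commands in the thread suggest) that a `.gramps` file is gzip-compressed XML; `handles_in` is a hypothetical helper, and the two byte strings are made-up stand-ins for real files:

```python
import gzip
import re

# Matches handle="..." attributes in raw XML bytes.
HANDLE_RE = re.compile(rb'\bhandle="([^"]+)"')


def handles_in(gramps_bytes):
    """Collect every handle="..." value from gzip-compressed XML bytes."""
    xml_bytes = gzip.decompress(gramps_bytes)
    return set(HANDLE_RE.findall(xml_bytes))


# Made-up stand-ins for import.gramps and export.gramps.
before = gzip.compress(b'<people><person handle="_a1"/><person handle="_b2"/></people>')
after = gzip.compress(b'<people><person handle="_b2"/><person handle="_a1"/></people>')

lost = handles_in(before) - handles_in(after)
new = handles_in(after) - handles_in(before)
print(lost, new)
```

Two empty sets mean every handle was kept; with real files, the bytes would come from `open(path, "rb").read()`.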
From: Doug B. <dou...@gm...> - 2011-01-15 15:06:27
|
On Sat, Jan 15, 2011 at 10:03 AM, Gerald Britton <ger...@gm...> wrote:
> Agreed. If the export is in handle order we should be fine.
> Re-importing though can generate new handles, can it not? If so, we
> lose idempotency which is jerome's issue I think.

This wouldn't be an issue in Jérôme's case because none of the objects are new... they keep their same handles.

-Doug
|
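Editorial sketch, not part of the original thread: if handles really do survive the round trip, as Doug says, then two exports can be compared by handle no matter what order the objects were written in. The markup below is a simplified stand-in and not the full Gramps XML schema; the function names index_by_handle and changed_handles are hypothetical and chosen only for this illustration.

```python
import xml.etree.ElementTree as ET


def index_by_handle(xml_text):
    """Map each handle to (tag, attributes) for every element that has one."""
    root = ET.fromstring(xml_text)
    return {
        elem.get("handle"): (elem.tag, dict(elem.attrib))
        for elem in root.iter()
        if elem.get("handle") is not None
    }


def changed_handles(before_xml, after_xml):
    """Return the sorted handles whose records differ, ignoring file order."""
    before = index_by_handle(before_xml)
    after = index_by_handle(after_xml)
    return sorted(
        handle
        for handle in before.keys() | after.keys()
        if before.get(handle) != after.get(handle)
    )


# Hypothetical sample data: same objects, different order, one changed
# "change" attribute on the family record.
BEFORE = (
    '<database>'
    '<person handle="_b" id="I0002"/>'
    '<person handle="_a" id="I0001"/>'
    '<family handle="_f" id="F0001" change="1"/>'
    '</database>'
)
AFTER = (
    '<database>'
    '<family handle="_f" id="F0001" change="2"/>'
    '<person handle="_a" id="I0001"/>'
    '<person handle="_b" id="I0002"/>'
    '</database>'
)

print(changed_handles(BEFORE, AFTER))
```

A comparison like this only reports records that actually differ, so a pure reordering produces an empty list, while a rewritten change time on a family record shows up under that family's handle.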
From: Doug B. <dou...@gm...> - 2011-01-15 15:04:55
|
On Sat, Jan 15, 2011 at 8:44 AM, Benny Malengier <ben...@gm...> wrote:
> We should _never_ order on export.
> We should only access things via an index in the database.

Benny,

If I understand what you mean, you mean don't sort export by something
*other* than an index. As long as we have an index to sort by, then we
are fine, right? Or did you mean something else?

-Doug

> Ordering would mean a huge time penalty on exporting for those with very
> large family trees.
> Even exporting along a bsddb index would be much slower, as now we go from
> database page to database page.
>
> Just looping over the data and exporting means the harddisk is the least
> read (it goes from database page to database page).
>
> In other words:
> 1/ default should be just a cursor of the database table, so order cannot be
> maintained
> 2/ ordered output could be optional. If we add an ordered output, it should
> be along an index page of the database, so no in memory sorting must occur
> before export can be done. I think ID has a sorted index over it. Handle
> normally also, as it is the primary key, and will hence be in some sort of
> B-tree. You must be sure to use the sort index on looping however.
>
> Benny
|
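Editorial sketch, not part of the original thread: the three access patterns discussed here (a plain cursor in storage order, a walk along an index that is already sorted, and building a full list that is sorted in memory) can be contrasted with a toy table. ToyTable and its methods are hypothetical names for illustration only; this is neither the Gramps database API nor bsddb, and it says nothing about real page-access costs.

```python
import bisect


class ToyTable:
    """Stand-in for a database table; not the Gramps or bsddb API."""

    def __init__(self):
        self._rows = {}          # handle -> record, kept in storage order
        self._handle_index = []  # handles kept sorted as rows are added

    def add(self, handle, record):
        if handle not in self._rows:
            bisect.insort(self._handle_index, handle)
        self._rows[handle] = record

    def iter_handles(self):
        """Like a plain cursor: storage order, one handle at a time."""
        return iter(self._rows)

    def iter_handles_by_index(self):
        """Ordered walk along the maintained index; no sort at export time."""
        return iter(self._handle_index)

    def get_handles(self, sort_handles=False):
        """Build the whole list in memory, optionally sorting it."""
        handles = list(self._rows)
        if sort_handles:
            handles.sort()
        return handles


table = ToyTable()
for handle in ("c", "a", "b"):
    table.add(handle, {"handle": handle})

print(list(table.iter_handles()))           # storage order
print(list(table.iter_handles_by_index()))  # index order
print(table.get_handles(sort_handles=True))
```

The design point from the thread is that the ordered variants give the same result, but only the list-based one needs every handle in memory before the first record can be written.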
From: Benny M. <ben...@gm...> - 2011-01-15 15:13:50
|
2011/1/15 Doug Blank <dou...@gm...> > On Sat, Jan 15, 2011 at 8:44 AM, Benny Malengier > <ben...@gm...> wrote: > > We should _never_ order on export. > > We should only access things via an index in the database. > > Benny, > > If I understand what you mean, you mean don't sort export by something > *other* than an index. As long as we have an index to sort by, then we > are fine, right? Or did you mean something else? > > For 300000 people, not following the index would be best, as then you don't hit a database page twice. When you follow an index, you will jump over your database pages, and the data is too large to stay in memory. It depends on bsddb structure if following sorted record key has this effect or not. So, index is good, but not as good as just reading the database table out. It depends on how much performance you want. For Gramps as a desktop applicatoin I can accept following an index is good enough, even if not the best. Benny > -Doug > > > Ordering would mean a huge time penalty on exporting for those with very > > large family trees. > > Even exporting along a bsddb index would be much slower, as now we go > from > > database page to database page. > > > > Just looping over the data and exporting means the the harddisk is the > least > > read (it goes from database page to database page). > > > > In other words: > > 1/ default should be just a cursor of the database table, so order cannot > be > > maintained > > 2/ ordered output could be optional. If we add an ordered output, it > should > > be along an index page of the database, so no in memory sorting must > occur > > before export can be done. I think ID has a sorted index over it. Handle > > normally also, as it is the primary key, and will hence be in some sort > of > > B-tree. You must be sure to use the sort index on looping however. > > > > Benny > > > > 2011/1/15 Jérôme <rom...@ya...> > >> > >> > if the round-trip through gramps was idempotent, then the diff would > be > >> > empty. 
> >> > >> Expected result was: minor change on date generation (if generated on an > >> other day) and maybe media objects (media paths). > >> > >> I do not expect a full idem potent after round-trip, but currently we > >> cannot easily get the differences. I just wanted testing complete XML > >> migration before major release. > >> > >> > >> Jérôme > >> > >> > >> Doug Blank a écrit : > >> > On Fri, Jan 14, 2011 at 4:31 PM, jerome <rom...@ya...> wrote: > >> >>>> gramps ids could be exotic! > >> >>> Do you mean unique? Anyway it is a good sort-key > >> >>> candidate > >> >> ids = [I000001, IAYUTRE235, zharb, /empty/ , etc ...] > >> >> > >> >> In 'handle' I trust! ;) > >> >> > >> >>>> Every time I import a Gramps XML, Gramps rebuilds > >> >>> (write, DB commit) some objects! Change time is not the same > >> >>> with a simple import then export. > >> >>> Well, they all need new handles, right? Possibility > >> >>> of collisions. > >> >>> Also with gramps ids. > >> >> In fact, I want to keep handles: they should be the keys control. > >> >> > >> >> My problem could be illustrated by something like: > >> >> > >> >> $ gramps -i import.gramps -e export.gramps > >> >> $ gunzip < import.gramps > import.xml > >> >> $ gunzip < export.gramps > export.xml > >> >> $ diff -u import.xml export.xml > diff.txt > >> >> > >> >> where import.gramps is our "Scientific control". > >> >> > >> >> What should be the content of diff.txt ? > >> >> > >> >> For me, it should be few lines... > >> >> Unfortunatly there is some change (order, change time on family > >> >> objects): that's strange! > >> > > >> > Yes, it would be handy to do this. This might be called "idempotent" > >> > by a mathematician: if the round-trip through gramps was idempotent, > >> > then the diff would be empty. > >> > > >> > What we need is: > >> > > >> > 1. something smarter than diff for this usage > >> > 2. sort on something that doesn't change (like the handle), just for > >> > this purpose > >> > 3. 
make it so that the order is preserved > >> > > >> > I would lean towards #3. I've "fixed" some other places where the > >> > order was lost. If you let me know which orders are lost, I'll > >> > address. > >> > > >> > -Doug > >> > > >> >> Jérôme > >> >> > >> >> > >> >> --- En date de : Ven 14.1.11, Gerald Britton < > ger...@gm...> > >> >> a écrit : > >> >> > >> >>> De: Gerald Britton <ger...@gm...> > >> >>> Objet: Re: [Gramps-devel] > >> >>> self.db.iter_object_handles(sort_handles=True) > >> >>> À: "jerome" <rom...@ya...> > >> >>> Cc: gra...@li... > >> >>> Date: Vendredi 14 janvier 2011, 22h10 > >> >>> On Fri, Jan 14, 2011 at 3:59 PM, > >> >>> jerome <rom...@ya...> > >> >>> wrote: > >> >>>>>> I am not certain to understand ... > >> >>>>>> Keys should be handles, no ? > >> >>>>> Well, that's the question! I can see a case for > >> >>>>> gramps ids, or > >> >>>>> surnames, or event dates, etc. etc. > >> >>>> But handle is the easiest way and safe key for > >> >>> ordering our data. > >> >>> > >> >>> Only if that's the order you want > >> >>> > >> >>>> gramps ids could be exotic! > >> >>> Do you mean unique? Anyway it is a good sort-key > >> >>> candidate > >> >>> > >> >>>> surnames is not a good key :( > >> >>> I can see that some would like it...makes the XML easier to > >> >>> read by a human > >> >>> > >> >>>> date => date_object => year, then month, then > >> >>> day, then rank, etc ... = horrible index > >> >>> > >> >>> Probably, but its just one possibility > >> >>> > >> >>>> My problem is on plugins/export/ExportXML.py > >> >>>> > >> >>>> I saw a sortByID function not used, then sometimes the > >> >>> use of list (get_...), then iteration (only family > >> >>> handles). > >> >>>> I thought on use lists sorted by handle for having an > >> >>> order rule. I do not want to group handles, handles will be > >> >>> grouped into the Gramps XML, so it was not planned to parse > >> >>> one flat XML file or something like that! 
> >> >>>> But it is not my main problem ... > >> >>>> I thought that to sort handles means objects lists > >> >>> will be consistent (Persons, Families, Events, etc ...) > >> >>>> Every time I import a Gramps XML, Gramps rebuilds > >> >>> (write, DB commit) some objects! Change time is not the same > >> >>> with a simple import then export. > >> >>> > >> >>> Well, they all need new handles, right? Possibility > >> >>> of collisions. > >> >>> Also with gramps ids. > >> >>> > >> >>>> I can understand the random order used by bsddb, but > >> >>> this should not be done on some objects (like family) and > >> >>> not on the others. > >> >>>> In my mind, an import without DB change is like a > >> >>> "read-only": it is not the case. OK, you are saying that it > >> >>> is the way used by bsddb. XML files should be able to use > >> >>> 'diff' or revision control tools. With current Gramps XML > >> >>> import/export, these tools are limited. :( > >> >>> > >> >>> Yep. You're probably looking for something like a > >> >>> UUID for each > >> >>> record. Not a bad idea but not implemented at the > >> >>> moment. > >> >>> > >> >>>> > >> >>>> Jérôme > >> >>>> > >> >>>> > >> >>>> --- En date de : Ven 14.1.11, Gerald Britton > >> >>>> <ger...@gm...> > >> >>> a écrit : > >> >>>>> De: Gerald Britton <ger...@gm...> > >> >>>>> Objet: Re: [Gramps-devel] > >> >>> self.db.iter_object_handles(sort_handles=True) > >> >>>>> À: "jerome" <rom...@ya...> > >> >>>>> Cc: gra...@li... > >> >>>>> Date: Vendredi 14 janvier 2011, 21h21 > >> >>>>> On Fri, Jan 14, 2011 at 3:11 PM, > >> >>>>> jerome <rom...@ya...> > >> >>>>> wrote: > >> >>>>>> I am not certain to understand ... > >> >>>>>> Keys should be handles, no ? > >> >>>>> Well, that's the question! I can see a case for > >> >>>>> gramps ids, or > >> >>>>> surnames, or event dates, etc. etc. 
> >> >>>>>
> >> >>>>>> 'self.db.get_{object}_handles(sort_handles=True)' is allowed,
> >> >>>>>> not 'self.db.iter_{object}_handles(sort_handles=True)'!
> >> >>>>>> There are two questions:
> >> >>>>>>
> >> >>>>>> 1. Why does Gramps only use self.db.iter_family_handles(), else
> >> >>>>>> self.get_{object}_handles(), where {object} is person or event or
> >> >>>>>> source or place or repository or note or media object.
> >> >>>>>
> >> >>>>> the get_...handles methods return a list, which can be expensive in
> >> >>>>> memory and must read all objects in one pass. The iter... methods
> >> >>>>> just return one at a time, so are cheaper in memory. So, the iter...
> >> >>>>> methods are preferable. OTOH, they cannot do sorting, since by
> >> >>>>> definition you need to read all records before you can sort them.
> >> >>>>>
> >> >>>>>> 2. Why 'sort_handles=True' argument is allowed on all primary
> >> >>>>>> objects except family object ?
> >> >>>>>
> >> >>>>> I suppose that there has been no requirement so far so no one coded
> >> >>>>> it up.
> >> >>>>>
> >> >>>>>>> The data is not ordered since it
> >> >>>>>>> comes from bsddb in random order.
> >> >>>>>> This could explain why I will not be able to keep order on XML
> >> >>>>>> import (to bsddb). :(
> >> >>>>>>
> >> >>>>>> Thanks.
> >> >>>>>> Jérôme
> >> >>>>>>
> >> >>>>>> --- On Fri 14.1.11, Gerald Britton <ger...@gm...> wrote:
> >> >>>>>>
> >> >>>>>>> From: Gerald Britton <ger...@gm...>
> >> >>>>>>> Subject: Re: [Gramps-devel] self.db.iter_object_handles(sort_handles=True)
> >> >>>>>>> To: "jerome" <rom...@ya...>
> >> >>>>>>> Cc: gra...@li...
> >> >>>>>>> Date: Friday 14 January 2011, 19:53
> >> >>>>>>>
> >> >>>>>>> The data is not ordered since it comes from bsddb in random order.
> >> >>>>>>> If we ordered it, we would have to sort it by some key.
> >> >>>>>>> So, if we did, what keys would you use for:
> >> >>>>>>>
> >> >>>>>>> person
> >> >>>>>>> family
> >> >>>>>>> event
> >> >>>>>>> source
> >> >>>>>>> place
> >> >>>>>>> repository
> >> >>>>>>> note
> >> >>>>>>> media object
> >> >>>>>>>
> >> >>>>>>> On Fri, Jan 14, 2011 at 1:36 PM, jerome <rom...@ya...> wrote:
> >> >>>>>>>> Hi,
> >> >>>>>>>>
> >> >>>>>>>> I am trying to get an answer to a question about the code: why we
> >> >>>>>>>> cannot keep the order of objects after a Gramps XML file import
> >> >>>>>>>> against export ?
> >> >>>>>>>>
> >> >>>>>>>> Nick pointed out that objects are not ordered on export[1].
> >> >>>>>>>> Why ? I suppose backup scripts or revision control tools will work
> >> >>>>>>>> better with ordered objects! Anyway, to use 'sort_handles=True'
> >> >>>>>>>> works on export, except for family handles. Any reason for that ?
> >> >>>>>>>> A typo somewhere ? On my side ?
> >> >>>>>>>>
> >> >>>>>>>> [1] http://www.gramps-project.org/bugs/view.php?id=4365
> >> >>>>>>>>
> >> >>>>>>>> regards,
> >> >>>>>>>> Jérôme
|
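For readers following the quoted exchange, here is a minimal sketch of the trade-off Gerald describes: an iterator hands out handles one at a time in whatever order the store returns them, while a list can be sorted because everything has been read first. This is not the Gramps API. `FakeDb`, `iter_handles`, `get_handles`, `get` and `export_lines` are hypothetical names made up for illustration, and the shuffle only simulates an unordered backend such as bsddb.

```python
import random
from typing import Dict, Iterator, List


class FakeDb:
    """Hypothetical handle-keyed store (an assumption, not the Gramps API)."""

    def __init__(self, records: Dict[str, str]) -> None:
        self._records = dict(records)

    def get(self, handle: str) -> str:
        return self._records[handle]

    def iter_handles(self) -> Iterator[str]:
        # Yields one handle at a time. The shuffle simulates a backend
        # that returns records in no particular order.
        handles = list(self._records)
        random.shuffle(handles)
        yield from handles

    def get_handles(self, sort_handles: bool = False) -> List[str]:
        # Reads every handle into memory first, which is what makes
        # sorting possible (and what costs memory on large databases).
        handles = list(self.iter_handles())
        if sort_handles:
            handles.sort()
        return handles


def export_lines(db: FakeDb) -> List[str]:
    # Sorting by handle gives the same output for the same data on every
    # run, which is what 'diff' and revision control tools need.
    return [f"{h}: {db.get(h)}" for h in db.get_handles(sort_handles=True)]


db = FakeDb({"b2": "family", "a1": "person", "c3": "event"})
for line in export_lines(db):
    print(line)
```

Sorting by handle is only one possible key. As discussed in the thread, gramps ids, surnames or dates are other candidates, and the same pattern applies with a different `key=` passed to the sort.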