From: Doug B. <dou...@gm...> - 2009-09-26 22:55:14
On Sat, Sep 26, 2009 at 6:22 PM, Gerald Britton <ger...@gm...> wrote:
> Now that's just weird. The 'update' method just calls DBCursor.put,
> which is a standard bsddb method. Nuttin' fancy. Is the behaviour
> consistent (same counts every time)? If not, it will surely be harder
> to pin down. Still, I don't believe that the implementation is
> faulty. There's just not that many variables. You can read the doc
> here:
>
> http://www.oracle.com/technology/documentation/berkeley-db/db/api_reference/C/dbcput.html
>

Ok, thanks for the link. I see that it says "If DBcursor->put()
succeeds and an item is inserted into the database, the cursor is
always positioned to refer to the newly inserted item." I can't tell
what order the data is in, but the behavior looks like the position is
changing. I'll have to do some more tests to see if it
deterministically gives the same results...

-Doug

> On Sat, Sep 26, 2009 at 3:47 PM, Doug Blank <dou...@gm...> wrote:
>> On Sat, Sep 26, 2009 at 1:27 PM, Doug Blank <dou...@gm...> wrote:
>>> On Sat, Sep 26, 2009 at 12:40 PM, Gerald Britton
>>> <ger...@gm...> wrote:
>>>> Not sure I know what you mean. Accessing by cursor means accessing in
>>>> key (that is handle) order, not in surname order. That means that the
>>>> cursor shouldn't skip over records since the name is just a piece of
>>>> data in the record and not related to the order in which records are
>>>> processed. So the order (that is, handle order) won't change as
>>>> updates are made. There must be something else going on.
>>>
>>> I'm not sure what order they are in, but updating the cursor appears
>>> to be making it so that not all of the people are iterated through.
>>> I'll dig a little more...
>>>
>>
>> Looks like the update... just tried this variation:
>>
>> >>> count = 0
>> >>> with db.get_person_cursor(update=True, commit=True) as cursor:
>>         for handle, data in cursor:
>>             count += 1
>>             person = Person(data)
>>             name = person.get_primary_name()
>>             name.set_surname(name.get_surname().upper())
>>             cursor.update(handle, person.serialize())
>>
>> >>> count
>> 8731
>>
>> Should have been 9882. The only thing I'm doing is changing the name. I'm using:
>>
>> Python version: 2.6 (r26:66714, Jun 8 2009, 16:07:26) [GCC 4.4.0
>> 20090506 (Red Hat 4.4.0-4)]
>> BSDDB version: 4.7.3
>> Gramps version: 3.2.0-0.SVN13094M
>> LANG: C
>> OS: Linux
>> Distribution: 2.6.29.6-217.2.3.fc11.i586
>>
>> -Doug
>>
>>
>>> -Doug
>>>
>>>> On Sat, Sep 26, 2009 at 11:44 AM, Doug Blank <dou...@gm...> wrote:
>>>>> Gerald (et al),
>>>>>
>>>>> I'm looking at some code in trunk/src/plugins/tool/ChangeNames.py:
>>>>>
>>>>> with self.db.get_person_cursor(update=True, commit=True) as cursor:
>>>>>     for handle, data in cursor:
>>>>>         person = Person(data)
>>>>>         change = False
>>>>>         for name in [person.get_primary_name()] + person.get_alternate_names():
>>>>>             sname = name.get_surname()
>>>>>             if sname in changelist:
>>>>>                 change = True
>>>>>                 sname = self.name_cap(sname)
>>>>>                 name.set_surname(sname)
>>>>>         if change:
>>>>>             cursor.update(handle, person.serialize())
>>>>>
>>>>> and it looks like the cursor is skipping around the order as the names
>>>>> change, and missing some of the people, and thus names.
>>>>>
>>>>> To test this out, you could download the following zipped GEDCOM of
>>>>> the Tudor royal family of England, and try to fix the capital letters
>>>>> by running Tools -> Database Processing -> Fix Capitalization of
>>>>> Family Names. There are 9882 names in the file, FYI. When I go through
>>>>> the above loop, I only hit 6758 people the first time, 7367 the second
>>>>> time, 7846 the third time, 8278 the fourth time, ... it takes 9 passes
>>>>> to go through all 9882.
>>>>>
>>>>> Do you think what I describe is the problem? Is there a way to get a
>>>>> cursor that won't change order as you update?
>>>>>
>>>>> Thanks for any insight,
>>>>>
>>>>> -Doug
>>>>>
>>>>> http://www.genealogyforum.com/gedcom/gedcom2a/gedr2090.zip
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Gerald Britton
>>>>
>>>
>>
>
>
>
> --
> Gerald Britton
>
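
On the question of getting an iteration order that does not shift while records are being updated: one generic workaround, not something proposed or confirmed in this thread, is to separate reading from writing. The sketch below is only an illustration under stated assumptions. It uses a plain dict as a hypothetical stand-in for the person table so that it runs without Gramps or bsddb, and the names `uppercase_surnames`, `store` and `pending` are invented for the example. It does not reproduce the skipping behaviour and makes no claim about what causes it.

```python
# Hypothetical stand-in for the person table: handle -> record.
# A plain dict is used only so the sketch runs without Gramps or bsddb.
def uppercase_surnames(store):
    """Two-pass update: read everything first, write afterwards."""
    # Pass 1: iterate over a snapshot and remember what needs to change.
    pending = []
    visited = 0
    for handle, data in list(store.items()):
        visited += 1
        surname = data["surname"]
        if surname != surname.upper():
            # Build the replacement record without touching the store yet.
            pending.append((handle, dict(data, surname=surname.upper())))

    # Pass 2: apply the writes only after iteration has finished, so no
    # write can disturb the position the reading loop depended on.
    for handle, new_data in pending:
        store[handle] = new_data

    return visited, len(pending)


people = {
    "I0001": {"surname": "Tudor"},
    "I0002": {"surname": "YORK"},
    "I0003": {"surname": "Beaufort"},
}
visited, changed = uppercase_surnames(people)
```

With a real database the first pass would use a read-only cursor and the second pass would write by handle. The trade-off is that the pending changes are held in memory, which for about 9882 people should be modest.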