#468 gedfix.php v0.6 - fix common gedcom errors

open
Greg Roach
5
2008-11-06
2008-03-25
Greg Roach
No

gedfix.php detects a wide range of common gedcom errors, and allows you to fix most of them with a single click.

It identifies things like:

Broken links.

Invalid names.

Male/female inconsistencies based on given name.

Missing BIRT/DEAT records.

Multiple occurrences of facts/events that should be unique.

Individuals/families with no links to other records.
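
As an illustration of one of these checks, here is a minimal sketch of how repeated "unique" facts might be detected in a single raw GEDCOM record. The function name `find_duplicate_facts` and the list of tags treated as unique are assumptions made for this example; they are not taken from gedfix.php.

```php
<?php
// Hypothetical list of level-1 tags assumed to be unique per individual.
const UNIQUE_TAGS = ['SEX', 'BIRT', 'DEAT'];

/**
 * Return the tags from $uniqueTags that occur more than once
 * at level 1 in a raw GEDCOM record.
 */
function find_duplicate_facts(string $record, array $uniqueTags = UNIQUE_TAGS): array
{
    // Collect the tag of every level-1 line, e.g. "1 BIRT".
    preg_match_all('/^1 (\w+)/m', $record, $matches);
    $counts = array_count_values($matches[1]);

    $duplicates = [];
    foreach ($uniqueTags as $tag) {
        if (($counts[$tag] ?? 0) > 1) {
            $duplicates[] = $tag;
        }
    }
    return $duplicates;
}

// Made-up record with two BIRT events.
$record = implode("\n", [
    '0 @I1@ INDI',
    '1 NAME John /Smith/',
    '1 SEX M',
    '1 BIRT',
    '2 DATE 1 JAN 1900',
    '1 BIRT',
    '2 DATE 2 JAN 1900',
]);

print_r(find_duplicate_facts($record));
```

For the sample record this reports BIRT as the only repeated tag.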

Discussion

  • Greg Roach
    2008-03-25

    gedfix.php v0.1

     
    Attachments
  • Thomas52
    2008-03-25

    Logged In: YES
    user_id=1447380
    Originator: NO

    Greg: I am using 4.1.3 and get error messages. A number of errors are shown, but with no ID, like:

    ERROR 8: Use of undefined constant PGV_USER_CAN_EDIT - assumed 'PGV_USER_CAN_EDIT'
    0 Error occurred on in function unknown
    1 called from line 26 of file gedfix.php

    ERROR 2: Cannot modify header information - headers already sent by (output started at /home/adkins92/public_html/tree/includes/functions.php:360)

    Warning: Cannot modify header information - headers already sent by (output started at /home/adkins92/public_html/tree/includes/functions.php:360) in /home/adkins92/public_html/tree/includes/functions_print.php on line 523

    ERROR 8: Use of undefined constant PGV_GED_ID - assumed 'PGV_GED_ID'
    0 Error occurred on in function unknown
    1 called from line 46 of file gedfix.php

    ERROR 8: Use of undefined constant PGV_GED_ID - assumed 'PGV_GED_ID'
    0 Error occurred on in function unknown
    1 called from line 48 of file gedfix.php

    ERROR 2: preg_match() expects parameter 2 to be string, array given
    0 Error occurred on in function preg_match
    1 called from line 54 of file gedfix.php

    Warning: preg_match() expects parameter 2 to be string, array given in /home/adkins92/public_html/tree/gedfix.php on line 54

    unknown (userinfo) has no birth or christening record. «1 BIRT Y» (Do this for all)

    unknown (userinfo) has no name record. «1 NAME //»

    unknown (userinfo) has no sex record. «1 SEX M» «1 SEX F» «1 SEX U»

    unknown (userinfo) is not a member of any family

     
  • Greg Roach
    2008-03-25

    Logged In: YES
    user_id=1466942
    Originator: YES

    OK - this is because it needs 4.1.4 (or current SVN) to run. For 4.1.3, try changing:

    PGV_USER_CAN_EDIT to userCanEdit()
    PGV_GED_ID to $GEDCOMS[$GEDCOM]['id']

     
  • Thomas52
    2008-03-25

    Logged In: YES
    user_id=1447380
    Originator: NO

    I think you've identified the problem in the version, but "userCanEdit" doesn't seem to be the label we're looking for; still error messages. Possibly "canedit" or "U_canedit"? ((I'm out of my depth here.))

     
  • kiwi_pgv
    2008-03-25

    Logged In: YES
    user_id=1910459
    Originator: NO

    Yet another excellent tool. Thanks Greg. I can confirm, so far, that it works great on 4.1.4. Perhaps users should be aware that, subject to the quality of their data, running this may be a little slow. First time through for me, on about 6,000 INDIS, took 45 secs.

    One error had me puzzled at first:
    unknown + unknown = Eliza Kate James (F3407) has less than two family members. Delete

    But looking closer I see I had a family for just one individual where the parents were both unknown. I now understand and agree with the recommendations. It is deleted!
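
A check like the one described here could be sketched as follows. This is only an illustration that counts HUSB, WIFE and CHIL pointers in a raw family record; the function name `count_family_members` and the child ID `I100` are made up for the example, and gedfix.php may do this differently.

```php
<?php
/**
 * Count the HUSB, WIFE and CHIL pointers in a raw family record.
 */
function count_family_members(string $famRecord): int
{
    return (int) preg_match_all('/^1 (?:HUSB|WIFE|CHIL) @[^@]+@/m', $famRecord);
}

// A family holding a single child and no known parents (I100 is a made-up ID).
$family = implode("\n", [
    '0 @F3407@ FAM',
    '1 CHIL @I100@',
]);

if (count_family_members($family) < 2) {
    echo "F3407 has less than two family members\n";
}
```

A family record like this links nobody to anybody, which is why deleting it loses no information.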

    I'm sure you'll get many requests for more checks and fixes - so I'll get in first.

    How about a check for possible missing married names (2 _MARNM tag) for married women? I guess one question would be, do you include all with either a MARR date, and those with just a MARR Y tag, or only those with a definite date? I suspect I have quite a few, but really don't want to re-import my GEDCOM just to fix that.

    I think I would also prefer to pick each possible error individually and just run it for, say, invalid names rather than check for everything in one go.

     
  • Greg Roach
    2008-03-25

    Logged In: YES
    user_id=1466942
    Originator: YES

    <<How about a check for possible missing married names>>

    If you look through the code, you'll see this check is already there - but commented out as it wasn't quite working. It tried to compare the number of _MARNM records with the number of FAMS records, but it fails to take into account unmarried partnerships. This will be fixed in a later version.
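
The comparison described above might look roughly like the sketch below. It is a reconstruction from the description only, not the commented-out code itself; the function name `may_lack_married_name` and the sample record are hypothetical. It deliberately keeps the stated weakness: every FAMS link is treated as a marriage.

```php
<?php
/**
 * Return true when a record has fewer "2 _MARNM" names than "1 FAMS" links.
 * Known weakness: an unmarried partnership also has a FAMS link,
 * so it is wrongly counted as a marriage.
 */
function may_lack_married_name(string $indiRecord): bool
{
    $marriedNames = (int) preg_match_all('/^2 _MARNM /m', $indiRecord);
    $spouseFamilies = (int) preg_match_all('/^1 FAMS @[^@]+@/m', $indiRecord);
    return $marriedNames < $spouseFamilies;
}

// Made-up record: one spouse family, no married name.
$woman = implode("\n", [
    '0 @I1@ INDI',
    '1 NAME Mary /Jones/',
    '1 SEX F',
    '1 FAMS @F1@',
]);

var_dump(may_lack_married_name($woman));
```

The sample record is flagged whether or not family F1 contains a marriage, which is exactly the false positive mentioned above.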

    I'm most proud of the sex/name check. If you have 99 males called William and one female, it guesses you've got the sex wrong.
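
The idea behind the sex/name check can be sketched as a tally of sexes per given name. This is an illustrative guess at the approach, not the actual gedfix.php code: the function name `find_suspect_sexes`, the input array layout and the 95% threshold are all assumptions.

```php
<?php
/**
 * Flag individuals whose sex disagrees with the overwhelming majority
 * of people sharing the same given name.
 *
 * @param array $people    list of ['id' => string, 'given' => string, 'sex' => 'M'|'F']
 * @param float $threshold assumed cut-off: share of a name held by the opposite sex
 * @return string[] ids of individuals whose sex looks wrong
 */
function find_suspect_sexes(array $people, float $threshold = 0.95): array
{
    // Tally sexes per lower-cased given name.
    $tally = [];
    foreach ($people as $p) {
        $name = strtolower($p['given']);
        $tally[$name][$p['sex']] = ($tally[$name][$p['sex']] ?? 0) + 1;
    }

    $suspects = [];
    foreach ($people as $p) {
        $counts = $tally[strtolower($p['given'])];
        $total = array_sum($counts);
        $other = $p['sex'] === 'M' ? 'F' : 'M';
        // Suspect when the opposite sex dominates this name.
        if (($counts[$other] ?? 0) / $total >= $threshold) {
            $suspects[] = $p['id'];
        }
    }
    return $suspects;
}

// 99 males and one female with the same given name.
$people = [];
for ($i = 1; $i <= 99; $i++) {
    $people[] = ['id' => "I$i", 'given' => 'William', 'sex' => 'M'];
}
$people[] = ['id' => 'I100', 'given' => 'William', 'sex' => 'F'];

print_r(find_suspect_sexes($people));
```

With this data only I100 is flagged, because 99% of the people sharing her given name are recorded as male.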

    I've written all these separate checks over the past few years, and have simply gathered them together and added a GUI.

     
  • Thomas52
    2008-03-25

    Logged In: YES
    user_id=1447380
    Originator: NO

    4.1.3 users, line 25:
    require_once 'config.php';
    if (!userCanEdit(getUserName())) {
        header('Location: login.php?url=gedfix.php');
        exit;
    }

     
  • Greg Roach
    2008-03-25

    Logged In: YES
    user_id=1466942
    Originator: YES

    Thanks Tom. 4.1.3 compatible version attached.
    File Added: gedfix-413.php

     
  • Greg Roach
    2008-03-25

    gedfix.php v0.1 (for PGV 4.1.3)

     
    Attachments
  • kiwi_pgv
    2008-03-27

    Logged In: YES
    user_id=1910459
    Originator: NO

    Greg, I hope you don't mind, but I feel I should just express a word of caution to anyone planning on using this tool (and I think everyone should). If your GEDCOM is as good (or bad) as mine you may well come across two time issues:
    1 - I had hundreds, or it may even have been thousands, of '1 BIRT Y' and '1 DEAT Y' tags to add. Not totally necessary I know, but I think a good idea anyway. So I selected the 'Change ALL' option. It took a very long time to process, and even timed out on the server once - but it did eventually complete.

    2 - Today I couldn't understand why the 'My Portal' page was taking so long (over 25 secs) to load. Then I remembered, I have a 'Recent Changes' block there - so it was listing ALL those changes from gedfix. I've decided to remove the block for now, and will put it back in a week or so.

     
  • Greg Roach
    2008-03-27

    Logged In: YES
    user_id=1466942
    Originator: YES

    <<I hope you don't mind>> Of course not.

    I was aware that the bulk updates can take a while, so the module is designed to resume after a time-out by pressing F5.
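
One way a bulk update can be made safe to resume is to have each run look only for records that still need fixing, so a reload simply carries on with whatever is left. The sketch below illustrates that idea under stated assumptions; it is not how gedfix.php is known to work, and `fix_batch`, its time budget parameter and the sample records are hypothetical.

```php
<?php
/**
 * Apply $fix to every record that $needsFix reports as broken, stopping
 * once $budgetSeconds has elapsed. Returns the number of records fixed.
 * Fixed records no longer match $needsFix, so calling this again
 * (for example by reloading the page) continues with the remainder.
 */
function fix_batch(array &$records, callable $needsFix, callable $fix, float $budgetSeconds): int
{
    $start = microtime(true);
    $fixed = 0;
    foreach ($records as $id => $record) {
        if (microtime(true) - $start >= $budgetSeconds) {
            break; // out of time: leave the rest for the next run
        }
        if ($needsFix($record)) {
            $records[$id] = $fix($record);
            $fixed++;
        }
    }
    return $fixed;
}

// Made-up records; the fix adds an empty birth event where none exists.
$records = [
    'I1' => "0 @I1@ INDI\n1 NAME A /B/",
    'I2' => "0 @I2@ INDI\n1 BIRT Y",
    'I3' => "0 @I3@ INDI",
];
$needsFix = function (string $r): bool {
    return !preg_match('/^1 BIRT\b/m', $r);
};
$fix = function (string $r): string {
    return $r . "\n1 BIRT Y";
};

echo fix_batch($records, $needsFix, $fix, 5.0), " fixed\n";
echo fix_batch($records, $needsFix, $fix, 5.0), " fixed\n";
```

The second call finds nothing left to do, which is the property that makes a reload after a time-out harmless.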

    I guess I spend little time on my portal page, so hadn't noticed the effects of 1000s of recent changes.

    There are two reasons why I like to add empty BIRT/DEAT records. First is performance. If there is no BIRT, PGV then has to check for a CHR and a BAPM to find an alternative date. An empty BIRT short-circuits this logic. Similarly, the logic for calculating "is dead" is quite long. I know PGV stores this in the DB, but this is lost when you export/import. It would have been better (IMHO) to store it in the file (using "1 DEAT Y").
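
The short-circuit can be pictured with a small sketch. It assumes the lookup simply tries BIRT, then CHR, then BAPM in that order; PGV's real logic is not reproduced here, and both function names are made up for the example.

```php
<?php
/**
 * Return the level-1 tag that supplies an individual's birth information,
 * or null if none is present. Tags are tried in order, so a record with
 * any BIRT line (even an empty "1 BIRT Y") never reaches CHR or BAPM.
 */
function birth_source_tag(string $indiRecord): ?string
{
    foreach (['BIRT', 'CHR', 'BAPM'] as $tag) {
        // \b stops CHR from matching longer tags such as CHRA.
        if (preg_match('/^1 ' . $tag . '\b/m', $indiRecord)) {
            return $tag;
        }
    }
    return null;
}

/**
 * Append "1 BIRT Y" when the record has no birth or christening event.
 */
function add_empty_birth(string $indiRecord): string
{
    if (birth_source_tag($indiRecord) === null) {
        return rtrim($indiRecord, "\r\n") . "\n1 BIRT Y";
    }
    return $indiRecord;
}

// Made-up record with no birth information at all.
$fixed = add_empty_birth("0 @I1@ INDI\n1 NAME unknown /AFFORD/");
echo $fixed, "\n";
```

Once the empty BIRT is in place, the first iteration of the loop succeeds and the fallback tags are never examined.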

    The second is that I am working on some utilities to populate empty BIRT/DEAT with estimated dates/places. I know not everyone will want this, but I'm doing a one-name study and being faced with a list of hundreds of individuals called "unknown AFFORD" with no birth/death info isn't very helpful. If they have EST dates for BIRT/DEAT and "estimated places" (e.g. ??Lancashire??, ??England??), it is much easier to match people up.

     
  • KosherJava
    2008-03-27

    Logged In: YES
    user_id=634811
    Originator: NO

    Greg,
    It might interest you to have a look at the GEDCOM Estimator at http://home.no.net/gedcom/ . While it is not in PHP, the source code is available and it seems to do what you are looking to accomplish.

    <<The second is that I am working on some utilities to populate empty
    BIRT/DEAT with estimated dates/places. >>

     
  • Greg Roach
    2008-04-02

    gedfix.php v0.2

     
    Attachments
  • Greg Roach
    2008-04-02

    Logged In: YES
    user_id=1466942
    Originator: YES

    v0.2 has improved functions for repairing broken indi<->family links
    File Added: gedfix.php
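
For readers unsure what a broken indi<->family link looks like, here is a minimal detection sketch. It only covers spouse links (FAMS against HUSB/WIFE); child links (FAMC against CHIL) would follow the same pattern. The function name `find_broken_links` and the in-memory array of records are assumptions for this example and are not taken from gedfix.php.

```php
<?php
/**
 * Find one-way spouse links between individuals and families.
 *
 * @param array $records map of xref (e.g. "I1") to raw GEDCOM record text
 * @return string[] descriptions of each broken link
 */
function find_broken_links(array $records): array
{
    $problems = [];
    foreach ($records as $xref => $record) {
        // INDI "1 FAMS @F@" should be mirrored by FAM "1 HUSB/WIFE @I@".
        preg_match_all('/^1 FAMS @([^@]+)@/m', $record, $m);
        foreach ($m[1] as $fam) {
            if (!isset($records[$fam])) {
                $problems[] = "$xref links to missing family $fam";
            } elseif (!preg_match('/^1 (?:HUSB|WIFE) @' . preg_quote($xref, '/') . '@/m', $records[$fam])) {
                $problems[] = "$xref is a spouse in $fam, but $fam does not link back";
            }
        }
    }
    return $problems;
}

// Made-up data: F2 does not exist, and F1 does not list I2 as a spouse.
$records = [
    'I1' => "0 @I1@ INDI\n1 FAMS @F1@\n1 FAMS @F2@",
    'I2' => "0 @I2@ INDI\n1 FAMS @F1@",
    'F1' => "0 @F1@ FAM\n1 HUSB @I1@",
];

print_r(find_broken_links($records));
```

Each reported problem has an obvious repair: either add the missing pointer on the other side or remove the dangling one.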

     
  • Thomas52
    2008-04-02

    Logged In: YES
    user_id=1447380
    Originator: NO

    I don't know how you knew I had all those errors, Greg, but you were right!
    Most excellent!
    Now the married names.... I have many with none, and a few with multiples of the same name, but I'm not looking a gift horse in the mouth.

     
  • Greg Roach
    2008-04-03

    • summary: gedfix.php - fix common gedcom errors --> gedfix.php v0.3 - fix common gedcom errors
     
  • Greg Roach
    2008-04-03

    Logged In: YES
    user_id=1466942
    Originator: YES

    v0.3 detects/fixes missing _MARNM records
    File Added: gedfix.php

     
  • Greg Roach
    2008-04-03

    gedfix.php v0.3

     
    Attachments
  • Mark Hattam
    2008-04-03

    Logged In: YES
    user_id=623181
    Originator: NO

    How do you get gedfix.php to run? I've logged in as Admin, then call it from the URL
    http://mysite/pgv/gedfix.php
    and it returns me to the login page. I supply my login, it refreshes to the login page, and so on.

    I'm running 4.1.4 SVN 2799

    Mark

     
  • Mark Hattam
    2008-04-03

    Logged In: YES
    user_id=623181
    Originator: NO

    ah ... had to enable "Editing" in the Gedcom's config page ...

    Now I see it identifying all the widows etc who re-marry, claiming that I haven't entered _MARNM tags. So I guess I'll go and comment out all the MARNM checking/fixing as I don't use those at all.

    Good utility though!

    On my main gedcom/database it did
    Execution time: 69.867 sec. Total Database Queries: 19165. Total privacy checks: 23918. Total Memory Usage: 69497.73 KB.

    Mark

     
  • Greg Roach
    2008-04-03

    Logged In: YES
    user_id=1466942
    Originator: YES

    <<comment out all the MARNM checking/fixing as I don't use those at all.>>

    It only does the _MARNM check if you have set the "surname tradition" to "paternal" in the gedcom config. So, you can simply change this setting.

    Oh - and you've worked out that it is checking for edit-permissions before it runs ;-)

     
  • Mark Hattam
    2008-04-03

    Logged In: YES
    user_id=623181
    Originator: NO

    I'd only ever used that section of the gedcom config to turn editing OFF. Having done so, I didn't think any other options in that section applied.

    I'll now undo all my commenting around the _MARNM checking and set tradition to NONE

    Thanks

    Mark

     
  • kiwi_pgv
    2008-04-03

    Logged In: YES
    user_id=1910459
    Originator: NO

    Regarding the question about how to run gedfix. As there are a number of similar great tools now, I have added a "Tools" menu item in my own custom theme. It includes gedfix, gedcheck, gedmerge, placecheck, and a couple of minor ones of my own. Might be worth thinking of a Tools, or Utilities menu for the standard release perhaps? I find it easier even though some of these tools are on the admin page as well.

     
  • Greg Roach
    2008-04-03

    Logged In: YES
    user_id=1466942
    Originator: YES

    As soon as the tools become reasonably stable/complete, I'll add them to the main build. This is what happened with GedCheck.

     
  • kiwi_pgv
    2008-04-06

    Logged In: YES
    user_id=1910459
    Originator: NO

    Greg

    gedfix has highlighted many 'marriages' that have no 1 MARR Y tag, and offers me the option to add a 1 _NMR Y tag instead if I choose. A great idea. But it made me realise that (as far as I know) PGV has no built in/GUI way to enter this tag, other than manually editing the GEDCOM. Have I missed something? If not, it seems like a very useful RFE.
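
Until there is a GUI option, the manual edit amounts to appending one line to the family record. The sketch below shows the check and the fix on raw record text; both function names are hypothetical, and it assumes a family counts as undecided when it has neither a MARR event nor a _NMR tag.

```php
<?php
/**
 * Return true when a family record has neither a MARR event nor a _NMR tag.
 */
function lacks_marriage_status(string $famRecord): bool
{
    return !preg_match('/^1 (?:MARR|_NMR)\b/m', $famRecord);
}

/**
 * Append the chosen status line ("1 MARR Y" or "1 _NMR Y") to a family record.
 */
function add_marriage_status(string $famRecord, bool $married): string
{
    return rtrim($famRecord, "\r\n") . "\n" . ($married ? '1 MARR Y' : '1 _NMR Y');
}

// Made-up family with two partners and no marriage information.
$family = "0 @F1@ FAM\n1 HUSB @I1@\n1 WIFE @I2@";

if (lacks_marriage_status($family)) {
    $family = add_marriage_status($family, false);
}
echo $family, "\n";
```

After the tag is added the family no longer matches the check, so it would not be reported again.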

     