gedfix.php v0.6 - fix common gedcom errors
gedfix.php detects a wide range of common gedcom errors, and allows you to fix most of them with a single click.
It identifies things like:
Broken links.
Invalid names.
Male/female inconsistencies based on given name.
Missing BIRT/DEAT records.
Multiple occurrences of facts/events that should be unique.
Individuals/families with no links to other records.
gedfix.php v0.1
Logged In: YES
user_id=1447380
Originator: NO
Greg: I am using 4.1.3 and get error messages. A number of errors are shown, but with no ID, like:
ERROR 8: Use of undefined constant PGV_USER_CAN_EDIT - assumed 'PGV_USER_CAN_EDIT'
0 Error occurred on in function unknown
1 called from line 26 of file gedfix.php
ERROR 2: Cannot modify header information - headers already sent by (output started at /home/adkins92/public_html/tree/includes/functions.php:360)
Warning: Cannot modify header information - headers already sent by (output started at /home/adkins92/public_html/tree/includes/functions.php:360) in /home/adkins92/public_html/tree/includes/functions_print.php on line 523
ERROR 8: Use of undefined constant PGV_GED_ID - assumed 'PGV_GED_ID'
0 Error occurred on in function unknown
1 called from line 46 of file gedfix.php
ERROR 8: Use of undefined constant PGV_GED_ID - assumed 'PGV_GED_ID'
0 Error occurred on in function unknown
1 called from line 48 of file gedfix.php
ERROR 2: preg_match() expects parameter 2 to be string, array given
0 Error occurred on in function preg_match
1 called from line 54 of file gedfix.php
Warning: preg_match() expects parameter 2 to be string, array given in /home/adkins92/public_html/tree/gedfix.php on line 54
unknown (userinfo) has no birth or christening record. «1 BIRT Y» (Do this for all)
unknown (userinfo) has no name record. «1 NAME //»
unknown (userinfo) has no sex record. «1 SEX M» «1 SEX F» «1 SEX U»
unknown (userinfo) is not a member of any family
Logged In: YES
user_id=1466942
Originator: YES
OK - this is because it needs 4.1.4 (or current SVN) to run. For 4.1.3, try changing:
PGV_USER_CAN_EDIT to userCanEdit()
PGV_GED_ID to $GEDCOMS[$GEDCOM]['id']
Logged In: YES
user_id=1447380
Originator: NO
I think you've identified the problem in the version, but "userCanEdit" doesn't seem to be the label we're looking for; I still get error messages. Possibly "canedit" or "U_canedit"? (I'm out of my depth here.)
Logged In: YES
user_id=1910459
Originator: NO
Yet another excellent tool. Thanks Greg. I can confirm, so far, that it works great on 4.1.4. Perhaps users should be aware that, subject to the quality of their data, running this may be a little slow. First time through for me, on about 6,000 INDIS, took 45 secs.
One error had me puzzled at first:
unknown + unknown = Eliza Kate James (F3407) has less than two family members. Delete
But looking closer I see I had a family for just one individual where the parents were both unknown. I now understand and agree with the recommendations. It is deleted!
I'm sure you'll get many requests for more checks and fixes - so I'll get in first.
How about a check for possible missing married names (2 _MARNM tag) for married women? I guess one question would be, do you include all with either a MARR date, and those with just a MARR Y tag, or only those with a definite date? I suspect I have quite a few, but really don't want to re-import my GEDCOM just to fix that.
I think I would also prefer to pick each possible error individually and just run it for, say, invalid names rather than check for everything in one go.
Logged In: YES
user_id=1466942
Originator: YES
<<How about a check for possible missing married names>>
If you look through the code, you'll see this check is already there - but commented out as it wasn't quite working. It tried to compare the number of _MARNM records with the number of FAMS records, but it fails to take into account unmarried partnerships. This will be fixed in a later version.
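The counting approach described above can be sketched roughly as follows. This is not the commented-out code from gedfix.php; the function name missingMarriedNames() and the exact patterns are assumptions made for illustration. It also shows why unmarried partnerships trip the check: a FAMS link alone says nothing about whether the couple married.

```php
<?php
// Hypothetical sketch: compare the number of "1 FAMS" links in an
// individual's raw GEDCOM record with the number of "2 _MARNM" lines.
// Returns how many married names appear to be missing (never negative).
function missingMarriedNames(string $indirec): int
{
    $fams  = preg_match_all('/^1 FAMS @[^@]+@/m', $indirec);
    $marnm = preg_match_all('/^2 _MARNM /m', $indirec);
    return max(0, $fams - $marnm);
}

// Two spouse-family links but only one married name recorded.
$married = "0 @I5@ INDI\n1 NAME Mary /Smith/\n2 _MARNM Mary /Jones/\n"
         . "1 SEX F\n1 FAMS @F1@\n1 FAMS @F2@\n";

// One partnership that was never a marriage: the naive count still
// reports a missing _MARNM, which is the false positive described above.
$partner = "0 @I6@ INDI\n1 NAME Ann /Brown/\n1 SEX F\n1 FAMS @F3@\n";

echo missingMarriedNames($married), "\n";
echo missingMarriedNames($partner), "\n";
```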
I'm most proud of the sex/name check. If you have 99 males called william and one female, it guesses you've got the sex wrong.
I've written all these separate checks over the past few years, and have simply gathered them together and added a GUI.
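For readers curious how a sex/name check like the one mentioned above might work, here is a minimal sketch. It is not taken from gedfix.php: the function name findSexMismatches(), the input array layout, and the two thresholds ($minCount, $minShare) are all assumptions chosen for illustration.

```php
<?php
// Hypothetical sketch of a majority-vote sex check.
// $people: list of ['id' => string, 'given' => string, 'sex' => 'M'|'F'|'U'].
// Returns [id => suggested sex] for individuals whose recorded sex
// disagrees with a strong majority of others sharing the given name.
function findSexMismatches(array $people, int $minCount = 10, float $minShare = 0.95): array
{
    // Tally M and F counts per lower-cased given name.
    $tally = [];
    foreach ($people as $p) {
        if ($p['sex'] === 'M' || $p['sex'] === 'F') {
            $name = strtolower($p['given']);
            $tally[$name][$p['sex']] = ($tally[$name][$p['sex']] ?? 0) + 1;
        }
    }

    $suspects = [];
    foreach ($people as $p) {
        if ($p['sex'] !== 'M' && $p['sex'] !== 'F') {
            continue; // unknown sex is a different check
        }
        $name  = strtolower($p['given']);
        $m     = $tally[$name]['M'] ?? 0;
        $f     = $tally[$name]['F'] ?? 0;
        $total = $m + $f;
        if ($total < $minCount) {
            continue; // too few examples to trust the majority
        }
        $majority = $m >= $f ? 'M' : 'F';
        if (max($m, $f) / $total >= $minShare && $p['sex'] !== $majority) {
            $suspects[$p['id']] = $majority;
        }
    }
    return $suspects;
}

// 99 males called William and one female, as in the example above.
$people = [];
for ($i = 1; $i <= 99; $i++) {
    $people[] = ['id' => "I$i", 'given' => 'William', 'sex' => 'M'];
}
$people[] = ['id' => 'I100', 'given' => 'william', 'sex' => 'F'];
print_r(findSexMismatches($people)); // flags I100 with suggested sex M
```

The thresholds keep rare or genuinely unisex names from being flagged; any real tool would need to tune them against its own data.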
Logged In: YES
user_id=1447380
Originator: NO
4.1.3 users, line 25:
require_once 'config.php';
// Send anyone without edit rights to the login page.
if (!userCanEdit(getUserName())) {
    header('Location: login.php?url=gedfix.php');
    exit;
}
Logged In: YES
user_id=1466942
Originator: YES
Thanks Tom. 4.1.3 compatible version attached.
File Added: gedfix-413.php
gedfix.php v0.1 (for PGV 4.1.3)
Logged In: YES
user_id=1910459
Originator: NO
Greg, I hope you don't mind, but I feel I should just express a word of caution to anyone planning on using this tool (and I think everyone should). If your GEDCOM is as good (or bad) as mine you may well come across two time issues:
1 - I had hundreds, or it may even have been thousands, of '1 BIRT Y' and '1 DEAT Y' tags to add. Not totally necessary I know, but I think a good idea anyway. So I selected the 'Change ALL' option. It took a very long time to process, and even timed out on the server once - but it did eventually complete.
2 - Today I couldn't understand why the 'My Portal' page was taking so long (over 25 secs) to load. Then I remembered, I have a 'Recent Changes' block there - so it was listing ALL those changes from gedfix. I've decided to remove the block for now, and will put it back in a week or so.
Logged In: YES
user_id=1466942
Originator: YES
<<I hope you don't mind>> Of course not.
I was aware that the bulk updates can take a while, so the module is designed to resume after a time-out by pressing F5.
I guess I spend little time on my portal page, so hadn't noticed the effects of 1000s of recent changes.
There are two reasons why I like to add empty BIRT/DEAT records. First is performance. If there is no BIRT, PGV then has to check for a CHR and a BAPM to find an alternative date. An empty BIRT short-circuits this logic. Similarly, the logic for calculating "is dead" is quite long. I know PGV stores this in the DB, but this is lost when you export/import. It would have been better (IMHO) to store it in the file (using "1 DEAT Y").
The second is that I am working on some utilities to populate empty BIRT/DEAT with estimated dates/places. I know not everyone will want this, but I'm doing a one-name study and being faced with a list of hundreds of individuals called "unknown AFFORD" with no birth/death info isn't very helpful. If they have EST dates for BIRT/DEAT and "estimated places" (e.g. ??Lancashire??, ??England??), it is much easier to match people up.
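The BIRT/CHR/BAPM fallback and the short-circuit effect of an empty BIRT can be illustrated with a small sketch. This is not PGV's actual code; the function name estimatedBirthDate() and the regular expressions are assumptions, and it works on a raw GEDCOM record string.

```php
<?php
// Hypothetical sketch: return the DATE of the first level-1 event found
// among $tags (tried in order), or null if that event carries no date.
function estimatedBirthDate(string $gedrec, array $tags = ['BIRT', 'CHR', 'BAPM']): ?string
{
    foreach ($tags as $tag) {
        // Match the "1 TAG" line plus its subordinate (level 2+) lines.
        if (preg_match('/^1 ' . $tag . '\b.*(?:\n[2-9] .*)*/m', $gedrec, $event)) {
            // The event exists, so the search stops here even when it
            // has no date: this is the short-circuit an empty BIRT gives.
            if (preg_match('/^2 DATE (.+)$/m', $event[0], $date)) {
                return trim($date[1]);
            }
            return null;
        }
    }
    return null;
}

// No BIRT at all: the search falls through to the christening date.
$noBirth = "0 @I1@ INDI\n1 NAME John /Afford/\n1 CHR\n2 DATE 3 MAR 1850\n";

// Empty BIRT present: the search ends at once without looking at CHR.
$emptyBirth = "0 @I2@ INDI\n1 BIRT Y\n1 CHR\n2 DATE 3 MAR 1850\n";

var_dump(estimatedBirthDate($noBirth));
var_dump(estimatedBirthDate($emptyBirth));
```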
Logged In: YES
user_id=634811
Originator: NO
Greg,
It might interest you to have a look at the GEDCOM Estimator at http://home.no.net/gedcom/ . While it is not in PHP, the source code is available and it seems to do what you are looking to accomplish.
<<The second is that I am working on some utilities to populate empty BIRT/DEAT with estimated dates/places.>>
gedfix.php v0.2
Logged In: YES
user_id=1466942
Originator: YES
v0.2 has improved functions for repairing broken indi<->family links
File Added: gedfix.php
Logged In: YES
user_id=1447380
Originator: NO
I don't know how you knew I had all those errors, Greg, but you were right!
Most excellent!
Now the married names.... I have many with none, and a few with multiples of the same name, but I'm not looking a gift horse in the mouth.
Logged In: YES
user_id=1466942
Originator: YES
v0.3 detects/fixes missing _MARNM records
File Added: gedfix.php
gedfix.php v0.3
Logged In: YES
user_id=623181
Originator: NO
How do you get gedfix.php to run? I've logged in as Admin, I then call it from a
http://mysite/pgv/gedfix.php
URL and it returns me to the login page, I supply login, refreshes to the login page, etc etc
I'm running 4.1.4 SVN 2799
Mark
Logged In: YES
user_id=623181
Originator: NO
ah ... had to enable "Editing" in the Gedcom's config page ...
Now I see it identifying all the widows etc who re-marry, claiming that I haven't entered _MARNM tags. So I guess I'll go and comment out all the MARNM checking/fixing as I don't use those at all.
Good utility though!
On my main gedcom/database it did
Execution time: 69.867 sec. Total Database Queries: 19165. Total privacy checks: 23918. Total Memory Usage: 69497.73 KB.
Mark
Logged In: YES
user_id=1466942
Originator: YES
<<comment out all the MARNM checking/fixing as I don't use those at all.>>
It only does the _MARNM check if you have set the "surname tradition" to "paternal" in the gedcom config. So, you can simply change this setting.
Oh - and you've worked out that it is checking for edit-permissions before it runs ;-)
Logged In: YES
user_id=623181
Originator: NO
I'd only ever used that section of the gedcom config to turn editing OFF. Having done so, I didn't think any other options in that section applied.
I'll now undo all my commenting around the _MARNM checking and set tradition to NONE
Thanks
Mark
Logged In: YES
user_id=1910459
Originator: NO
Regarding the question about how to run gedfix. As there are a number of similar great tools now, I have added a "Tools" menu item in my own custom theme. It includes gedfix, gedcheck, gedmerge, placecheck, and a couple of minor ones of my own. Might be worth thinking of a Tools or Utilities menu for the standard release perhaps? I find it easier even though some of these tools are on the admin page as well.
Logged In: YES
user_id=1466942
Originator: YES
As soon as the tools become reasonably stable/complete, I'll add them to the main build. This is what happened with GedCheck.
Logged In: YES
user_id=1910459
Originator: NO
Greg
gedfix has highlighted many 'marriages' that have no 1 MARR Y tag, and offers me the option to add a 1 _NMR Y tag instead if I choose. A great idea. But it made me realise that (as far as I know) PGV has no built in/GUI way to enter this tag, other than manually editing the GEDCOM. Have I missed something? If not, it seems like a very useful RFE.