Thread: RE: [Phpgedview-talk] XML genealogy format
Brought to you by:
canajun2eh,
yalnifj
From: John F. <Joh...@ne...> - 2005-10-24 23:03:54
Hi Alex,

The first thing that popped into my head when I read this email was whether you had looked at any of the other XML standards for genealogy, such as XgenML or GEDCOM v6. Would it be better to look at one of those that others also support? On the other hand, if gramps actually exports data in an XML format (I don't know of other programs that do), then is there any reason to switch to another standard?

PhpGedView uses GEDCOM as its base data model, so in order for PGV to support importing XML it would have to translate the XML back into GEDCOM, at least for now. In the future, the goal is to have the data abstract enough that it could be stored in an XML format. Maybe a good place to start would be an XML export from PGV that could then be imported into gramps.

Anytime you do a translation, there is going to be some data loss. That is why PGV stores data in raw GEDCOM format. We don't lose anything trying to convert gedcom into an internal data model. But there would probably still be some data loss in the translation from gedcom to XML.

I think that this type of collaboration between projects would be a good thing. I also think that we can get something going sooner than the commercial software can.

Is there any documentation about your XML standard that we can read up on?

Thanks,
--John

-----Original Message-----
From: php...@li... [mailto:php...@li...] On Behalf Of Alex Roitman
Sent: Friday, October 21, 2005 3:46 PM
To: php...@li...
Cc: Don Allingham
Subject: [Phpgedview-talk] XML genealogy format

Hello,

I am a member of the development team of GRAMPS, a personal desktop genealogy program that is Free/Open source. We are constantly facing problems of supporting users who come with their data in GEDCOM format created by various software. The limitations of the GEDCOM format prevent lossless data transfer between programs. Different programs extend the GEDCOM standard in different ways, and these extensions are sometimes hard to learn about.

We would like to develop an open XML genealogy data format. We have an existing XML data format that used to be our primary format. Lately we switched to using BSDDB for storage, but we still maintain our XML for transferring data between versions. We feel that this format would be a good starting point for developing a cross-application XML format for genealogy.

We know that similar attempts were made before and did not get anywhere. However, our situation is better than most, because we already have a working implementation and not just a proposal.

We would like to invite the phpgedview developers to collaborate with us on developing such a format. It seems that we should be able to find common language with fellow Free software developers sooner than with those of commercial software.

While we realize that GEDCOM is a de-facto standard, however poor it is, we would be thrilled if the phpgedview developers were interested in adding support for our format and/or giving us feedback on the format.

If you might be interested in collaborating on this, please let us know. You may either reply to me personally, to this list, or post to our list at gra...@li... whatever works for you.

Thanks in advance,
Alex

--
Alexander Roitman
http://www.gramps-project.org
From: Roger W. <gen...@wi...> - 2005-10-25 19:42:04
Alex,

I've looked over the sample XML file ( http://www.gramps-project.org/files/data.gramps ) and have a few comments.

Your major tags have both an id and handle. Just stick with one.

I think given and surname are better labels than first and last as some cultures reverse the order of names. I would suggest a fullname tag or attribute that shows all the name parts in the correct order.

I suggest making change into its own tag that allows us to specify WHO made the change, not just the date of the last change.

Date fields should accept partial dates, and should support BC.

In this politically correct world, any new XML format should allow 2 moms or 2 dads as well as a single parent within a family. Also on the note of families, the XML format should allow us to indicate the relationship of a child to a family (ie biological, adopted).

I would clean up the tag and attribute names a bit, but overall it's a great start.

aloha,
Roger
From: Alex R. <sh...@gr...> - 2005-10-25 20:41:47
Roger,

Thanks for the feedback!

On Tue, 2005-10-25 at 13:41 -0600, Roger Winget wrote:
> I've looked over the sample XML file ( http://www.gramps-project.org/files/data.gramps )
> and have a few comments.
>
> Your major tags have both an id and handle. Just stick with one.

This is our existing format, and here's why we're using both at the moment. The "handle" is the unique ID assigned by GRAMPS. It will never be changed for a given person. The "id" is the user-visible ID that can be changed, either manually (discouraged) or by re-ordering the IDs from a program tool. We're using both for the sake of backward compatibility with the earlier GRAMPS formats.

This does not have to be the case for the general cross-software format. What I envision for the more general format is that there will be a single user-visible ID and possibly many software-specific IDs:

  <person id="I0001">
    <uid source="gramps" id="_AKSD23234KHK432JH"/>
    <uid source="phpgedview" id="0234-lkj"/>
    <uid source="ftm" id="FTM0123-P012"/>
    blah
  </person>

This way the unique IDs assigned by any program that processed the data can be kept, without affecting either the user-visible IDs or each other.

> I think given and surname are better labels than first and last as some cultures
> reverse the order of names. I would suggest a fullname tag or attribute that shows
> all the name parts in the correct order.
>
> I suggest making change into its own tag that allows us to specify WHO made the change,
> not just the date of the last change.

Agreed.

> Date fields should accept partial dates, and should support BC.

This is already supported:

  <dateval val="1897" type="about"/>

Supported are: about, before, after, between A and B, from A to B, estimated, calculated, as well as dates with missing day or month, as well as the textual dates ("Christmas before John was born") and BC. We should probably make a separate example featuring all supported dates, so I won't list the XML here.

> In this politically correct world, any new XML format should allow 2 moms or 2 dads as well
> as a single parent within a family. Also on the note of families, the XML format should
> allow us to indicate the relationship of a child to a family (ie biological, adopted).

A missing "father" or "mother" tag would be a single parent. The "mother" and "father" can be any gender, so 2 female parents or 2 male parents are supported.

The relationship of child to a family is already recorded. The default is birth (biological) and is omitted in XML. But if it's anything else then:

  <childof hlink="_S7MT6D1JSGX9PZO27F" mrel="Stepchild" frel="Adopted"/>

where relationship to father (frel) and mother (mrel) do not have to be the same.

> I would clean up the tag and attribute names a bit, but overall it's a great start.

Thanks!

The question I would like to ask is whether there is interest outside GRAMPS in supporting a similar XML format for data exchange. We are happy with using this format, and keep adjusting it as we add new features and fix bugs. It seems that the wider genealogical community could benefit from having a documented and extensible format that would be:

1. improving on GEDCOM's shortcomings
2. community-supported: not tied to any single organization, church, etc
3. naturally extensible: X in XML :-)
4. supported by numerous XML tools: parsing, validation, transformations, etc

So while we definitely welcome the feedback on our format, we'd like to collaborate with other genealogical software developers to develop something common that we all can natively support.

Sorry for being redundant :-)

Alex

--
Alexander Roitman
http://www.gramps-project.org
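[Editorial sketch] The two ideas in the message above, several software-specific IDs on one person and a per-parent child relationship that defaults to birth when omitted, could be read by a consumer roughly as follows. This is only an illustration in Python using the standard library. The element and attribute names come from the message; putting `uid` and `childof` together in one sample, the spelling `"Birth"` for the omitted default, and the helper name `read_person` are all assumptions, not part of any published GRAMPS schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample combining the proposed <uid> elements with the
# existing <childof> element; the values are taken from the thread.
SAMPLE = """
<person id="I0001">
  <uid source="gramps" id="_AKSD23234KHK432JH"/>
  <uid source="phpgedview" id="0234-lkj"/>
  <uid source="ftm" id="FTM0123-P012"/>
  <childof hlink="_S7MT6D1JSGX9PZO27F" mrel="Stepchild" frel="Adopted"/>
</person>
"""

# Assumed spelling of the default relationship, which the thread says
# is simply left out of the XML.
DEFAULT_REL = "Birth"


def read_person(xml_text):
    """Return the user-visible ID, per-program IDs, and family links."""
    person = ET.fromstring(xml_text)
    # One entry per program that has processed the record.
    uids = {u.get("source"): u.get("id") for u in person.findall("uid")}
    # mrel and frel are independent; a missing attribute means the default.
    families = [
        {
            "family": c.get("hlink"),
            "mrel": c.get("mrel", DEFAULT_REL),
            "frel": c.get("frel", DEFAULT_REL),
        }
        for c in person.findall("childof")
    ]
    return {"id": person.get("id"), "uids": uids, "childof": families}


if __name__ == "__main__":
    print(read_person(SAMPLE))
```

Keeping the per-program IDs in a mapping keyed by `source` means each program can look up its own identifier without touching the user-visible `id` or anyone else's.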
From: waldo k. <wki...@al...> - 2005-10-26 02:00:18
Roger Winget wrote:
> In this politically correct world, any new XML format should allow 2 moms or 2 dads as well

at least...

> as a single parent within a family. Also on the note of families, the XML format should
> allow us to indicate the relationship of a child to a family (ie biological, adopted).

there should also be some way to indicate who the parents are without requiring a marriage between them... however, i don't know that a dataformat is where that stickywicket should be taken care of...

--
       _\/
      (@@)                      Waldo Kitty, Waldo's Place USA
__ooO_( )_Ooo_____________________ telnet://bbs.wpusa.dynip.com
_|_____|_____|_____|_____|_____|_____ http://www.wpusa.dynip.com
____|_____|_____|_____|_____|_____|_____ ftp://ftp.wpusa.dynip.com
_|_Eat_SPAM_to_email_me!_YUM!__|_____|_____ wkitty42 -at- alltel.net
From: Roger W. <gen...@wi...> - 2005-10-27 01:49:02
On 25/Oct/2005 20:00 waldo kitty wrote ..
> there should also be some way to indicate who the parents are without requiring
> a marriage between them... however, i don't know that a dataformat is where
> that stickywicket should be taken care of...

The answer is simple. Create a family element with a "type" attribute. Type could be married, cohabitional (sp?), or single (meaning that the parents did not live together).

aloha,
Roger
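[Editorial sketch] Roger's suggestion above amounts to a family element whose "type" attribute carries the nature of the union, so a marriage never has to be faked to attach children to parents. A minimal illustration in Python follows. The attribute name `type` is from the message; the exact type labels, the generic `parent` and `child` child elements, their `ref` attribute, and the function name `make_family` are hypothetical and only stand in for whatever a real format would define.

```python
import xml.etree.ElementTree as ET

# Hypothetical labels based on the three cases named in the message.
FAMILY_TYPES = {"married", "cohabiting", "single"}


def make_family(family_id, family_type, parents=(), children=()):
    """Build a <family> element with a 'type' attribute.

    Zero, one, or two parents are all accepted, so a single parent or
    unmarried parents need no placeholder marriage record.
    """
    if family_type not in FAMILY_TYPES:
        raise ValueError(f"unknown family type: {family_type!r}")
    fam = ET.Element("family", {"id": family_id, "type": family_type})
    for ref in parents:
        ET.SubElement(fam, "parent", {"ref": ref})
    for ref in children:
        ET.SubElement(fam, "child", {"ref": ref})
    return fam


if __name__ == "__main__":
    fam = make_family("F0001", "single", parents=["I0001", "I0002"],
                      children=["I0003"])
    print(ET.tostring(fam, encoding="unicode"))
```

Rejecting unknown labels up front is one possible design choice; a real exchange format might instead allow free-text types so that programs can round-trip values they do not understand.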
From: waldo k. <wki...@al...> - 2005-10-28 00:30:22
Roger Winget wrote:
> On 25/Oct/2005 20:00 waldo kitty wrote ..
>
>> there should also be some way to indicate who the parents are without requiring
>> a marriage between them... however, i don't know that a dataformat is where
>> that stickywicket should be taken care of...
>
> The answer is simple. Create a family element with a "type" attribute. Type could
> be married, cohabitional (sp?), or single (meaning that the parents did not live
> together).

that does appear that it would work for XML stuffs but not for others ;) i've had a few entries of similar nature that i've had to make over the years in several genealogy programs... needless to say, 15+ years ago, we had to fake a marriage... over time, it has gotten better in those programs that i've used over the years... in most all cases, the problem has come down to the basic premise that children are born within a marriage and those writing the programs have forgotten about "wild sown seed" O:)
From: Alex R. <sh...@gr...> - 2005-10-25 01:34:33
John,

Thanks for your response!

On Mon, 2005-10-24 at 17:01 -0600, John Finlay wrote:
> The first thing that popped in my head when I read this email was if you
> had looked at any of the other XML standards for genealogy such as
> XgenML or GEDCOM v6? Would it be better to look at one of those that
> others also support?

Nobody supports those. GEDCOM 6 was abandoned unfinished, and the various proposed XML formats are nothing but words. Our format has been working for many users for some years now. In fact, it was our primary format up to last year.

> On the other hand if gramps actually exports data
> in an XML format (I don't know of other programs that do) then is there
> any reason to switch to another standard?

We are happy with our format, but we're looking for a broader format to enable lossless transfer between different software. GEDCOM has numerous limitations, leading to data loss on import/export with most software. XML is a very appealing choice for such an open format, and since we already have an XML genealogical format, we thought that maybe we could start from there, if anybody else is interested :-)

> PhpGedView uses GEDCOM as its base data model, so in order for PGV to
> support importing XML it would have to translate the XML back into
> GEDCOM at least for now.

I see. This can already be done. All programs including GRAMPS can write into GEDCOM. The data gets lost, however.

> In the future, the goal is to have the data
> abstract enough that it could be stored in an XML format. Maybe a good
> place to start would be an XML export from PGV that could then be
> imported into gramps.
>
> Anytime you do a translation, there is going to be some data loss. That
> is why PGV stores data in raw GEDCOM format. We don't lose anything
> trying to convert gedcom into an internal data model. But there
> would probably still be some data loss in the translation from gedcom to
> XML.

I guess my point is that we'd like to have a better format than GEDCOM, not convert to and from GEDCOM. Converting back and forth makes any new exchange format meaningless.

> I think that this type of collaboration between projects would be a good
> thing. I also think that we can get something going sooner than the
> commercial software can.
>
> Is there any documentation about your XML standard that we can read up
> on?

This page has formal documentation (DTD and RELAX NG schema):
http://www.gramps-project.org/xml/1.0.0/

The example data file can be found here:
http://www.gramps-project.org/files/data.gramps

It's actually very self-explanatory.

If you're still interested then we should definitely work on this. If something in the GRAMPS format does not seem reasonable, we can discuss it and change it in the new format. But ultimately, I think XML is a much better choice for a common ground between different software, for a number of reasons. If OpenDocument could do it for word processing, why can't we do it for genealogy?

Alex

--
Alexander Roitman
http://www.gramps-project.org