On May 24, 2013, at 7:41 AM, Benny Malengier <benny.malengier@gmail.com> wrote:

2013/5/24 John Ralls <jralls@ceridwen.us>

On May 24, 2013, at 12:22 AM, Benny Malengier <benny.malengier@gmail.com> wrote:
> You are right. But in my suggestion no persona is created; only the record is kept as part of our database. The Persona derived from it exists only in the software that updates existing Person objects.
> The software part can be repeated at any point, and is an addition. In the same way, you could do a subtraction, and then add back all the remaining records that are connected.
> This allows updating and changing Person objects from stored records like the census record, without storing actual Persona objects in the database. It's just business logic of the generation part. We could present it as a sort of persona object in the GUI which the user must approve, but which we don't store.

That would be very bad practice. Remember, good genealogy demands that every statement is backed by
careful analysis of all of the evidence discovered from a thorough search. A persona is special because it's a
reflection of the evidence about a person from a single record: It's expected that a Person object consolidates
the various personas with analysis of the credibility of the underlying evidence for each to arrive at a conclusion
about the events, attributes, and relationships that describe the historical person's life. Only in that context would
it be permissible for a program to automagically change conclusions in a persona (*not* a Person) instance.

I'm not convinced. A record holds data; what is extracted from it should be independent of the person reading it. So extracting that data, be it a Persona or not, can be automated.

No, it's nowhere near that easy. A record holds information which may or may not be correct and may or 
may not pertain to the person whom you're researching. The information must be carefully analyzed both 
in the context of the document or artifact which contains it and in the context of all of the other relevant
information found in other documents and artifacts discovered by a thorough search. If you haven't done
so already, I urge you to study the subject. I'm afraid that I'm familiar only with the American literature on
the subject, the most important books of which are:

Elizabeth Shown Mills, "Evidence Explained: Citing History Sources from Artifacts to Cyberspace", 
ISBN 978-0806318066, and "Evidence! Citation and Analysis for the Family Historian", ISBN 978-0806315430

The former is an expanded version of the latter, most of which consists of detailed explanations of how to write citations
for a wide variety of source documents and artifacts. It's also quite a bit more expensive, so I suggest "Evidence!".
Both are available from Amazon.com.

There's a new book, 
Thomas W. Jones, "Mastering Genealogical Proof", ISBN 978-1935815075, which was released two weeks ago
at the (American) National Genealogical Society's Family History Conference, but it seems to be available only 
through the society (www.ngsgenealogy.org). It's more of a tutorial than are Mills's books, and also a bit wider-
ranging, covering the whole process from search to writing the final proof argument.

A third, not yet published, is
Robert Charles Anderson, "Elements of Genealogical Analysis", which will present a somewhat different process similar
to what we've been calling n-tier in the GedcomX discussions. Anderson presented a summary of the book at the
conference two weeks ago. He expects that it will be released sometime this fall by the New England Historic 
Genealogical Society.

The missing piece in my opinion is then the Analysis Document (A.D.). That, as you explain it, is the human intervention. You see the date 1?/04/1713, you store an event and choose April or 10/04/1713 or whatever, and you indicate in the A.D. why. You see a name, and you make a new person or use an existing person, and indicate why.

Yes, exactly.

To repurpose notes for this is somewhat annoying though, as preferably it is something we can scan in code, and not something a user can easily delete or edit afterward. The existing Gramps model should not require it for its operation.

No, analysis documents are subject to constant revision as the researcher discovers new evidence. Ease of editing
is critical.

We already parse the links in Notes; otherwise they wouldn't work. If we have extracted-evidence objects like personas
we can scan the Notes objects for links: If a link to a persona appears in a Note attached to a person then the scanner
will know that that persona is accounted for and shouldn't be flagged for review. If the reference gets removed in a
later revision of the Note, a subsequent run of the scanner will flag it. 
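
A rough sketch of such a scanner, to make the idea concrete. Everything in it is an assumption for illustration: the Note and Person classes are stand-ins rather than the real Gramps classes, the "Persona" object type doesn't exist yet, and the link syntax (gramps://<Class>/handle/<handle>) is only a guess at what a persona link might look like.

```python
import re
from dataclasses import dataclass, field

# Hypothetical link syntax; a real scanner would reuse whatever
# link parsing the Notes code already does.
LINK_RE = re.compile(r"gramps://(?P<cls>\w+)/handle/(?P<handle>\w+)")


@dataclass
class Note:
    """Stand-in for a Note: just the text, links embedded inline."""
    text: str


@dataclass
class Person:
    """Stand-in for a Person with its attached Notes."""
    handle: str
    notes: list = field(default_factory=list)


def linked_personas(person):
    """Return the handles of all personas linked from the person's Notes."""
    found = set()
    for note in person.notes:
        for match in LINK_RE.finditer(note.text):
            if match.group("cls") == "Persona":
                found.add(match.group("handle"))
    return found


def personas_needing_review(person, persona_handles):
    """Return persona handles that no Note on the person links to."""
    return sorted(set(persona_handles) - linked_personas(person))
```

Running personas_needing_review() again after a Note has been revised picks up any persona whose link was removed, since nothing is cached between runs.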

I doubt that anyone likely to contribute to Gramps has the knowledge necessary to write a (multi-lingual!) natural 
language processor capable of understanding an analysis, and I don't think the state of the art in artificial intelligence 
is yet capable of constructing one. 

My only beef is that the more objects come into play, the more our data model resembles spaghetti.

Gramps is already a pretty big plate of that! :-( Since we already have a class which provides text with links to other
objects, what would be gained by adding another?

John Ralls