From: Jim S. <ji...@ji...> - 2004-09-11 15:58:51
My messages don't appear to be making it onto the list -- sorry if
you've already received the message below.

Jim

On Thu, 2004-09-09 at 20:30, Don Allingham wrote:
> Other modifiers provide additional information:
>
> Calculated January 1990
> Estimated May 2000
>
> Now, at least from my reading of the GEDCOM specs, that is the limit
> that is supported. However, Alex and I have been wondering if this is
> the correct level. For example, does it make sense to say:
>
> Calculated before January 1990 or Estimated after May 2000
>
> I can see two opposing arguments:
>
> 1) Calculated and before (or estimated and after) are both indicators
> of ambiguity, so are therefore redundant
> 2) There is a difference between calculated and non-calculated
> dates, so the indicator provides some information

> Comments or suggestions? What level of date support do you need?
>

I have some ideas to throw into the mix...

In GEDCOM 6 they have a separate Participant entity/object for the
mapping between Person->Event. This mapping has (amongst others) a
field for holding the age of the participant at the time of the event.
This is useful, for example, when holding records representing a
person's participation in a census, e.g.:

[Participant record]
[1904 Smalltown Census] - ["Mr. N. E. Body"] - [68yrs old]
[EventID] - [PersonID] - [age] - [...]

If you know the exact date of the census (and the recorded age is
correct), you can calculate that the person was born between two dates.
This date (date range) would be calculated, and I think it fits
perfectly with the GEDCOM definition "Calculated mathematically, for
example, from an event date and an age".

The above would also apply to marriage and death certificate
information, as these also capture the age of a person at an event.

Now, if, for example, we also have the following records/information:

1914 Smalltown Census - "Mr. N. E. Body" - 79yrs old
(same person, 10 years later, aged 11 years older)

If again we know the exact date of this census, we end up with another
CALculated date range for the same event (birth) for that same person.

Maybe on the Person Editor window, when an individual doesn't have an
exact birth date, a drop-down list could be shown, listing the different
calculated dates and the source of each calculation, e.g.:

CAL BET 12 MAR 1836 & 11 MAR 1837 - calc. from 1904 Sm'lt'n Census recs.
CAL BET 26 SEP 1836 & 25 SEP 1837 - calc. from 1914 Sm'lt'n Census recs.
CAL BET 2 OCT 1835 & 1 OCT 1836 - calc. from Marriage cert.
(& = AND, for line lengths)

I'm sure we've all kind of done something like the above when working in
our own family databases -- I have; it tends to annoy me because it's
fiddly to do on paper with a calculator, and then when I do enter that
kind of information into my db, there's no magic/easy way to keep track
of these separate and distinct calculated ranges for a birth date.

As an extension of the above system of calculated dates, we can look at
all the possible calculated date ranges that we have for this birth
event, and we can deduce that the person was born between two dates
with a much smaller range/window. E.g., assuming the above three
records are all for our "Mr. N. E. Body", we could now have a possible
birth date (in our drop-down list) of:

EST BET 26 SEP 1836 & 1 OCT 1836 - est. from 1914 Census and Marriage

..which is useful, because if I've calculated it correctly we've ended
up with a very small date range for the estimated birth date, and that
will be useful when doing other research for birth records.

I call this 'estimated' as it is calculated from calculations. The
GEDCOM definition for an EST date is 'Estimated based on an algorithm
using some other event date' -- does that cover the above kind of
stuff, or is this just another CALculated date range?
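To make the "event date plus recorded age" calculation concrete, here
is a minimal Python sketch. It assumes the recorded age is in completed
years and is correct; the function names and the census date used in
the example are made up for illustration, so the result is not meant to
reproduce the ranges listed above.

```python
from datetime import date, timedelta
from typing import Tuple


def _years_before(d: date, years: int) -> date:
    """Same calendar day `years` earlier; 29 Feb falls back to 28 Feb."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        # 29 Feb does not exist in the target year.
        return d.replace(year=d.year - years, day=28)


def birth_range(event_date: date, age_years: int) -> Tuple[date, date]:
    """Earliest and latest birth dates for someone recorded as
    `age_years` old (completed years) on `event_date`."""
    # Latest: the person turned `age_years` on the event date itself.
    latest = _years_before(event_date, age_years)
    # Earliest: the day after the date that would make them one year older.
    earliest = _years_before(event_date, age_years + 1) + timedelta(days=1)
    return earliest, latest


# Hypothetical census date, for illustration only.
earliest, latest = birth_range(date(1904, 3, 11), 68)
print("CAL BET", earliest, "&", latest)
```

Each source record (census, marriage or death certificate) would give
one such range, which could be stored alongside a note of where it came
from.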
Spec definitions aside, this calculated-from-calculations date range
would logically derive from an intersection of the date ranges of only
two specific events; that is, there will be two events that give us the
narrowest possible window / date range. These can be found by looking
for the range that begins last in time, and the range that ends first
in time -- the resulting date range is simply the overlap of these two
ranges.

Maybe, if we are less confident in some of our information for some
reason (e.g. a census may be handwritten and unclear), we could perhaps
offer other/bigger date ranges in the drop-down list too. The problem
of extra introduced ambiguity over the possible tightness of these
calculated date ranges is why I prefer to think of these date ranges as
ESTimated.

I don't know what anyone else thinks of the above ideas, and I have no
clue if this kind of thing is what the GEDCOM spec is trying to imply
-- but I have no doubt that most people would find that kind of
built-in functionality very useful. Perhaps I've just invented a
complete coding nightmare..? I can't really figure it all out myself
right now.

> I believe it was Albert Einstein who said "Keep it as simple as
> possible, but no simpler". We want to make date handling as simple as
> possible for the user, but not at the risk of losing information. Adding
> useless complexity makes everyone's life more miserable, but not
> providing needed functionality is just as bad.
>

I think that in the spirit of not losing information we should always
try to capture the 'exact text as shown on the source document' as well
as holding any interpreted or derived date information, i.e. we should
maybe add fields called 'source_text' to dates, etc. I think the
GenTech spec recommends something like that, and I feel it is likely
the right thing to do.
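For the overlap calculation described earlier in this message
(narrowing several CALculated ranges down to one ESTimated range), here
is a rough Python sketch. The overlap of a set of ranges runs from the
latest start date to the earliest end date. The function name and the
tuple layout are my own assumptions, not anything from GEDCOM or from
existing code; the three ranges are the ones from the drop-down example.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

DateRange = Tuple[date, date]  # (earliest, latest), both inclusive


def intersect_ranges(ranges: Iterable[DateRange]) -> Optional[DateRange]:
    """Return the overlap of all ranges, or None if they do not overlap."""
    ranges = list(ranges)
    if not ranges:
        return None
    start = max(r[0] for r in ranges)  # the range that begins last
    end = min(r[1] for r in ranges)    # the range that ends first
    return (start, end) if start <= end else None


calculated = [
    (date(1836, 3, 12), date(1837, 3, 11)),  # from 1904 census
    (date(1836, 9, 26), date(1837, 9, 25)),  # from 1914 census
    (date(1835, 10, 2), date(1836, 10, 1)),  # from marriage cert.
]

print(intersect_ranges(calculated))
# -> (datetime.date(1836, 9, 26), datetime.date(1836, 10, 1))
```

A result of None would mean that at least one source disagrees with the
others (a misread age, say), which is probably worth flagging to the
user rather than hiding.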