Is the persona tree view over-powering?
Should we really be showing the breakdown of personas on the navigation tree?
I see two distinct audiences for this product. The first have read the GDM and realized its potential power, and so want to use a product that is based on this superior model. They will be aware that every person is represented by a collection of events, personas, and characteristics, and they may want to have a visual representation of this.
The second group know that they want to use a good piece of genealogical software and have discovered the top-ranking GeneaPro and are giving it a go. Do they need to know from day one that their imported GEDCOM has been sliced and diced to produce the 'Same Person' group Uncle Ted?
If we only showed the 'Same Person' groups on the navigation tree, along with any personas that are not part of such a group (effectively the top level we show at the moment, without the ability to drill down into them), then I think it would appear much slicker to the first-time GDM user.
However, at some point the fact that this high-level persona is a group will become important. Therefore we should provide an 'Outline View', much like Eclipse does for Java files; this would show all the grouped personas and probably all assertions made about them.
Great comments and just what we need to get to a useful final product.
There are a few things to keep in mind about the current code:
1. It is strictly a technology and data model "proof of concept" - just to show we could tie the database, RCP, and other technologies together, come up with some standards, and be able to test out some of our assumptions.
2. This is the very first attempt to view and get a feel for what the GDM looks like at the very lowest levels - I think of it almost as the equivalent of a pixel view in a graphics program. But you must admit, putting the GDM into an application even at this crude level is very enlightening, and much more helpful in our design and planning stage than prototyping standard person family views.
3. Using the RCP model, I anticipate we will have perspectives for everything from this lowest level to a more "traditional" view, even to points of view on the data only possible because we have the low-level detail of the GDM.
So, with that in mind, here is my own tough assessment of the current "Persona Tree":
1. We will need to "collapse" at a minimum the "SamePerson" grouping. In fact, I'm thinking we may want, in general, to have a detailed view like this use groups to organize but not treat groups as their own entity - bottom line, you would not see groups displayed as such, but more as labels on the group members. I'm still thinking about this and how it impacts the other types of groups addressed by the GDM.
2. I agree with your comparison to Eclipse - and must say that for this view to be useful at the lowest level, we have to take the tree further. In other words, the tree should go all the way down to the assertion or sub-assertion (Subject) level. Note that this would probably eliminate the need for the associated table view, and, much like Eclipse, we could have editors associated with each type in the tree: select an assertion, you get an assertion editor; select a group of assertions, maybe some sort of "persona" editor (or other group editors); select the highest persona/individual, you get a more traditional editor/view of the person.
3. The table view, which, while not needed here, I think will have general use, needs to display only the relevant subset of assertions. I cheat now by showing all assertions for the associated individual, but this is mostly due to having too little test data in the database.
4. In general, we need to figure out how to handle multiple databases - do we tag each element with the database it belongs to, or have a higher grouping by database? And how should assertions/Subjects in one database interact (if at all) with those in another?
5. The individuals, personas, and groups need more data in the label - at least birth and death years, maybe something else? And of course loading up a tooltip with expanded data would be nice.
6. My final concern actually goes back to the GDM itself. Every persona (Subject) has a "name" associated with it - that is where the name for display is pulled from now. But this is not the "official" person's name; the GDM makes clear this is a researcher's record-keeping name only, as the real name must be pieced together from characteristic assertions. We need to be very careful about the use of the "name" field, and right now it is an example of how NOT to use it. Again, the defense here is that we have very limited test data and do not have full name handling implemented.
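To make point 6 concrete, here is a minimal sketch of piecing a display name together from characteristic assertions. The `Characteristic` record and the type names "GivenName"/"Surname" are hypothetical stand-ins for illustration, not part of the current code or the GDM schema:

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical sketch: a persona's display name assembled from
// characteristic assertions rather than the record-keeping "name" field.
public class NameAssembly {
    // A characteristic assertion reduced to the two parts needed here:
    // the characteristic type (e.g. "GivenName") and its value.
    // (These type names are placeholders, not real GDM identifiers.)
    record Characteristic(String type, String value) {}

    // Piece the display name together from name-related characteristics,
    // falling back to a placeholder when none exist.
    static String displayName(List<Characteristic> assertions) {
        StringJoiner name = new StringJoiner(" ");
        for (Characteristic c : assertions) {
            if (c.type().equals("GivenName") || c.type().equals("Surname")) {
                name.add(c.value());
            }
        }
        return name.length() == 0 ? "(unnamed persona)" : name.toString();
    }
}
```

Even this toy version hints at why assembling names on every tree refresh could get expensive, which bears on the display-name discussion below.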
Although I fully agree with your assertion that the code is a "proof of concept", I also think that much of the future work will rely heavily on the perspectives and views that are in the current code. In fact, I think that many of these will exist in a 'finished' product as Data Administration functions.
Now on to the real meat of this reply.
You cite in a couple of your points the lack of a large amount of data as a hindrance to the project at this point in time. Therefore I think one of the areas that probably needs addressing soon is Import (and, as it's linked, Export) of data. The main benefit is that it then becomes easy to quickly increase the amount of data in our test databases. The other benefit is that when designing the import we will, by necessity, have to think about the build-up of data and the various ways that we can populate the database with data for events.
You mention the issue of multiple databases, which is an interesting one. Apart from in testing, when do you see connecting to multiple databases at the same time being an advantage? What problems does it solve?
I feel that we should actively discourage having connections to more than one database active, and should prevent any interaction between them. The import/export facility should be used to provide access to 'external' data. This will have the added benefit of being able to highlight any changes in the state of the data in the original compared with the last import, and so quickly identify any assertions we've used that have since been disproved by the originator. (I think I may be working up to taking initial ownership of the import/export piece.)
Finally, you suggest that the name we are displaying in the tree view is incorrect because it's pulled from the name of the Subject rather than built up as the 'official' name from characteristic assertions. I think this is correct. My main reasoning is that performing the possibly complex knitting together of many name characteristics for every name as we add it to the tree would be a total waste of time. What we should do is try to guide the creation of this Subject name in the first place; after the user has given it this familiar name, that is what we should display in the tree (with the option to rename it if needed).
My rough assessment is that a tree will be showing up to 50 items at any one time, but we will possibly be adding the entire database of personas to it. In one of my GEDCOMs this would be 3,346 individuals! Any guesses as to how fast we can create objects from the database? At ten a second, that would be well over a five-minute refresh!
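If refresh time does become a problem, one standard mitigation is to fetch tree children lazily, in pages, rather than materializing every persona up front. A rough sketch, where `fetchPage` is a hypothetical stand-in for whatever database query the real code would use, and the page size of 50 matches the "up to 50 items visible" estimate above:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Sketch of paged, lazy loading for a large persona tree: labels are
// fetched from the backing store only as they are first needed.
public class LazyPersonaLoader {
    static final int PAGE_SIZE = 50;

    // Hypothetical database accessor: offset -> up to PAGE_SIZE labels.
    private final IntFunction<List<String>> fetchPage;
    private final List<String> loaded = new ArrayList<>();

    LazyPersonaLoader(IntFunction<List<String>> fetchPage) {
        this.fetchPage = fetchPage;
    }

    // Return the label at "index", fetching whole pages on demand.
    String get(int index) {
        while (loaded.size() <= index) {
            List<String> page = fetchPage.apply(loaded.size());
            if (page.isEmpty()) throw new IndexOutOfBoundsException(index);
            loaded.addAll(page);
        }
        return loaded.get(index);
    }

    int loadedCount() { return loaded.size(); }
}
```

With this scheme, showing the first screen of a 3,346-persona database costs one page of 50 objects, not the whole database.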
I suppose this brings me back to the point that we NEED a GEDCOM import ASAP so that we can stress-test various components.
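As a starting point for such an import: a GEDCOM file is just lines of the form `LEVEL [@XREF@] TAG [VALUE]`. A minimal line-reader sketch follows; the class name is hypothetical, it assumes single-space separation, and it is nowhere near a full importer:

```java
// Minimal GEDCOM line reader sketch for a future import wizard.
// Splits one line into (level, tag, value); e.g. "0 @I1@ INDI" or
// "1 NAME John /Smith/".
public class GedcomLine {
    final int level;
    final String tag;
    final String value; // empty when absent; holds the @XREF@ id for record headers

    GedcomLine(int level, String tag, String value) {
        this.level = level;
        this.tag = tag;
        this.value = value;
    }

    // Parse one "LEVEL [@XREF@] TAG [VALUE]" line (single-space separated).
    static GedcomLine parse(String line) {
        String[] parts = line.trim().split(" ", 3);
        int level = Integer.parseInt(parts[0]);
        // An @XREF@ token, if present, precedes the tag ("0 @I1@ INDI").
        if (parts.length > 1 && parts[1].startsWith("@")) {
            String tag = parts.length > 2 ? parts[2] : "";
            return new GedcomLine(level, tag, parts[1]);
        }
        String tag = parts.length > 1 ? parts[1] : "";
        String value = parts.length > 2 ? parts[2] : "";
        return new GedcomLine(level, tag, value);
    }
}
```

The hard part of the importer is not this tokenizing but the mapping of conclusional GEDCOM records onto personas and assertions, which is exactly the design discussion above.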
Sorry for the meandering; I've been typing this e-mail in between work and it's taken all day.
The GUI design is not done, and no one, especially myself, is stuck on the code we have - in fact, as a first pass on the unique GDM model, I would have been surprised if we had nailed it, or even come close. We did get good information from it, though, and as you said, some version of what's been done will end up in the final product.
In this same thread, importation of conclusional data (GEDCOM) into a GDM/assertion model would not give us a well-thought-out test database that reflected the unique capabilities of the GDM (assertions, groupings of personas, disproved assertions, proper breakdown of characteristics, etc.).
While that will happen (as you noted, Import/Export design is still needed), and it would be neat to see what my GEDCOM data looks like in GeneaPro, we also need to nail down the internal data model (again - close but not quite complete) before we try to import stuff into it.
What I think is needed is what we've started: very specific test data that looks like we entered it assertion-by-assertion and represents the unique corner cases we expect GeneaPro to address cleanly as an implementation of the GDM. The ability to enter raw assertions (and other raw input for the Admin, Evidence, and Conclusional modules) will support this and would be a priority, again, once we get the data model fairly complete.
You bring up an interesting point on multiple databases: basically, do we need to access more than one at a time?
I can think of several instances where a researcher will want to compare, either side by side or even line by line, with someone else's database, even if that database was imported into a separate database on their own machine.
Think about your standard GEDCOM import - do I want to dump it directly into my single open database? Or do I want to create a new database, compare, and maybe select/drag/drop/copy/paste items from one to another? Or maybe I can just point (in a URL-type fashion?) to other database entries. Any direct, interactive operations will require at least two databases opened at the same time.
I like your idea of taking the name fields and more or less automating the entry of data into them, rather than exposing the researcher to inadvertently putting genealogical data into the tracking fields.
Last point - nobody can argue with design and code that works, and each project member makes their own priorities and runs with what interests them. If you want to own and deliver an import/export wizard design and code, maybe even a GEDCOM import wizard implementation for GeneaPro, you will have everyone's support. That exercise would definitely expose issues and force decisions about the data model and our internal architecture that would help move us along.
I've just been following this discussion since I stumbled onto it a few weeks ago, and I'm following up on your identification of the need for a specific sort of test data that is more "GDM-native" and assertion-based, rather than a GEDCOM import.
When I first learned of the GDM a week or so ago, I was very intrigued by the assertion-based model, as it was a very natural fit for some of the research I have been doing in 18th-century Northeast Scotland. On the one hand, the LDS have done a great job of extracting data from all the old parish registers into an online-queryable format, so there is a fairly rich vein of data to mine. However, it is difficult to make certain connections in the data for two reasons: baptism records of that time often named a father but not a mother, and the same given names were fairly heavily recycled. For instance, there were several people running around the Aberdeen area circa 1770 who were named Robert SPRING. I have various bits of information connected with Robert SPRING. But the tricky part is figuring out exactly how many different Robert SPRINGs there were, and which bits of information go with which.
The reason I bring this up is that if some actual "raw data" of the sort you seem to need would be of any use to your project, I would be happy to provide some.
On a different note, I gather from the discussion that there is an early prototype of GeneaPro available that folks are looking at. I'd certainly be interested to see what it looks like. At this point, does one require Eclipse just to download and take a spin through the prototype?
The GeneaPro project is winding up the design phase, which included building some test code. There is an old sample posted (0002) which can run without Eclipse (but needs MySQL and a JDBC driver as well as Java), while the current iteration in CVS (0.0..5.PreAlpha) only runs under Eclipse until we can get the RCP packaging working.
As for test data, again, we are in the early phases, and the only way we have now of loading data is direct manual SQL into the database (either MySQL or HSQLDB) - and even the final working data model is still in flux. Bottom line: data and code contribution at this stage requires Eclipse and some Java and/or SQL database skills.
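For anyone preparing manual SQL test data, a small generator can at least keep the quoting consistent across hand-written rows. This is only a sketch: the `persona` table and its columns are placeholders, since the real schema is still in flux:

```java
// Sketch of generating manual test-data SQL. The table and column names
// ("persona", "id", "name") are hypothetical placeholders, not the real
// GeneaPro schema.
public class TestDataSql {
    // Build an INSERT for one persona row. The naive quote-doubling is
    // adequate for hand-checked test data, but never for untrusted input.
    static String insertPersona(int id, String name) {
        return String.format("INSERT INTO persona (id, name) VALUES (%d, '%s');",
                id, name.replace("'", "''"));
    }
}
```

Generating the statements from a small Java program rather than typing them by hand also makes it cheap to regenerate the test set every time the schema shifts.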