Anyone have a recommendation for a good (free) data modelling and database design tool? Anyone have the GDM model or any of the popular genealogy databases in such a tool?
I chose FabForce's DB Designer 4 tool (http://www.fabforce.net/dbdesigner4/) from this page to try http://www.databaseanswers.com/modelling_tools.htm.
It is free and appears to do the basics, but I don't know if there's something better out there.
I got it to read GeneaPro, Legacy, and Genbox databases, but it would save time if someone had already constructed models of them (as well as the ever popular "best genealogy program").
One of the reasons for interest in the other programs is to see some of the real world things that are needed in addition to the abstract stuff that's in GDM 1.1.
Jean-François El Fouly
I've used Druid (http://druid.sourceforge.net) for years and it works well.
Thanks for the pointer. That's one that I hadn't come across and it looks like a nice basic tool.
I had a brief look and it doesn't look like it has anything over fabForce DBDesigner 4 which makes it worth switching right now, but it's always good to have another tool...
Jean-François El Fouly
Since my post, I read carefully the "Gentech Standard SQL Schema", especially the Section 3, "Deliverables".
What is suggested there is a simple and practical approach, but by no way the only one (write a simple, single script for all databases). Another way, that has proved very powerful at least in my experience is to define a data model with a tool and/or in XML. I've used two such tools, one is Druid mentioned earlier and the other is Apache Torque ( http://db.apache.org/torque/ ). The idea is to have a data model, expressed in XML, from which you can generate appropriate scripts for various databases (taking into account their specifics, incompatibilities and weaknesses). The strong advantage here is that these tools will also produce from the model:
- an excellent, up-to-date schema documentation;
- they may somehow generate useful Java code for parameter checking (such as String lengths) or even a complete data layer (Torque).
Our experience with this approach and these tools has been excellent both in commercial and open-source projects.
I agree that I higher level representation of the design than a SQL script is a very important goal. I think the specification of a SQL script as a deliverable was just meant to be a stake in the group which was open for comment and revision.
The problem is to find a tool/representation which will preserve the design information and make it reusable in as many forms as possible.
Ideally you want to be able to:
- reverse engineer from Java, SQL or JDBC
- design in object or E/R modes and map between the two
- import/export UML and other popular modeling formats
- generate Java, mapping files for Hibernate & other persistance layers,
- etc, etc, etc.
I bet tools like Rational's Rose and other commercial tools do all this and more, but I don't think our budget can afford them.
I had another look at Druid and decided that it had more power than I realized hidden behind a poor user interface and incomplete documentation. It does quite a bit out of the box and looks like it is flexible enough to be extended to do whatever is required, so I decided to take the plunge and enter the GDM model into it.
I've gotten to the point where I've generated a database, the associated Hibernate mapping files, the Java classes, and a simple test app, so I'm pretty confident that it's a viable approach. It's not as easy as I'd like, but not terrible either...
I tried to upload the PDF documentation as a "patch", but apparently there's a 256KB limit on that and the PDF file is about 550KB (E/R diagrams in image format).
I'll have to figure out another solution...
I've put the Druid PDF output in CVS in the ./doc/design directory. Folks can fetch it from viewCVS here http://cvs.sourceforge.net/viewcvs.py/geneapro/GeneaPro/doc/design/gdm-druid.pdf?rev=1.1&view=auto
This incarnation represents only the pure GDM 1.1 specification with all that that implies - no dates, poorly specified places, difficult to implement assertion links.
Ed's prototype database design has solutions for some of these things, but I figured something which matched the spec exactly was a good starting point (and might be useful for others as well).
I took a look at the pdf file, then downloaded druid and imported the existing GeneaPro database into it.
It looks like a good way to document the database as well as create some DB independant definitions that would allow easier support for other databases in the future.
I would like to recap, mostly to ensure my own understanding and limited research.
First, there are several issues the project needs to address:
1. Database and Data Model documentation - currently just some manual project docs and some Java Class comments - pretty basic and difficult to maintain.
2. Database Interface object orientation - how to map/model Java's object classes to the underlying relational database - again currently primitive mapping of class variables pretty directly to the database fields - limits the actual "OO" used and changes require many layers of updates.
3. Database Interface Persistance - related to 2 above in that this is the method of read/write data to/from database. Again, the current code in GeneaPro uses a laborious method of defining SQL for each and every type of transaction, some of the layers that 2 referred to. Ideally, we could just tell the program what the database looked like once, and then use simple read/write methods would do all the hard work of generating correct JDBC SQL and saving our Java objects.
Tools I've looked at, not too closely yet:
1. Druid - GUI to document data model and generate a varity of output, including PDF docs, multiple platform SQL code to create/maintain databases, Hibernate input definitions, etc.
2. Hibernate - tool that takes data model/database input and provides "easy" Java Class interface to the underlying database.
3. Torque - looks to do similiar functions to Hibernate, but I have the feeling I do not understand the full functions of either of these tools to actually compare.
I'm thinking that Druid would probably be fine as a solution for data documentation and it would be interesting to see if we could really leverage having our model in Druid by using Hibernate and/or Torque - this is the part that will take a little play time, I think. Also, if we decide on any of these tools, we then need to revisit our project database and core design docs.
So, please correct and enlighten me on my defintion of the issues and the short versions of the tools functions. And if you have the time to research and test this, that would be very helpful too.
One more fundamental issue is what domain you want to be doing your modeling in -- the object domain or the relational domain. The GDM was done as an E/R model, so that's where I started to allow easy comparison with the pre-existing GDM 1.1 model, but I think the right direction to go is to model in the object domain, then map back to the relational domain using a ORM tool like Hibernate. All the cruft like M:N tables for linking are just due to the relational model and wouldn't even show up in an object model. Hibernate can figure out automatically when to generate them.
Using the terminology on this page http://www.hibernate.org/199.html the process would be "bottom up" to bootstrap from GDM 1.1 then take the resulting result Java object skeleton and use it as the basis for refinement and fleshing out before regenerating things top-down. You can see the roundtrip cycle here http://www.hibernate.org/102.html
I think the main thing is to be able to easily move between design, documentation, and coding tasks and hopefully generate all required artifacts from a single machine readable version of the design.
I like that Holy Grail - one file defining the data structure and all needed output comes from that.
I've been working a little with Druid and have downloaded Hibernate - still struggling a bit with some of the concepts and and workflow of the tools, but I do like the overall idea.
By the way, feel free to share any experimental files via the CVS /doc/design directory - that way we can look over each other's work and see how actually sharing these work products might work.