From: Jonathan C. <cra...@pc...> - 2003-01-15 19:18:43
Hi Joan,

Arnaud did supply us with documentation (attached) for the new Phenotype tables, but I just haven't loaded it into the database yet (I've also been quite busy :)). I started working on updating the documentation a couple of days ago, but in the process discovered that there are some invalid rows in core.DatabaseDocumentation that should be corrected first. A query shows that there are 73 rows in this table that reference nonexistent columns in GUS 3.0. For the most part I think that these are relatively minor problems stemming from the fact that the schema has been updated more recently than the documentation. However, there are also a few rows that suggest we need to improve the plugin and/or procedure used to populate this table. For example, the following rows have spaces in the column name (attribute_name), probably because the input files were invalid and the plugin has no restrictions on the format of the attribute_name:

DATABASE_DOCUMENTATION_ID  ATTRIBUTE_NAME
-------------------------  -------------------------------------------------------------
                     1419  bio_material_id fk to LabelledExtract view of BioMaterial
                     1103  bio_source_characteristic_id primary key
                     1120  treatment_id fk to Treatment
                     1374  review_status_id The identifer of the review status
                     1418  assay_id fk to Assay
                     1373  synonym_name The gene symbol

6 rows selected.

Also, as an aside (and not a comment to you in particular), it strikes me that column "documentation" of the form "fk to Table X" and "Primary key" could be generated automatically from the schema. However, comments on foreign keys are useful if they identify the specific subclass (i.e. view) to which the reference is expected to link, or if they explain what the referenced value is used for (if not obvious).
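
The two problems described above (attribute_name values that are not bare column names, and rows that reference columns missing from the schema) could probably be caught before loading with a check along the following lines. This is only a rough sketch in Python, not the actual GUS plugin: the function name, the identifier pattern, and the idea of passing in the table's column names as a set are all assumptions made for illustration.

```python
import re

# Assumption: a valid attribute_name is a single SQL-style identifier
# (a letter followed by letters, digits, or underscores, with no spaces).
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def check_attribute_name(attribute_name, known_columns):
    """Return None if the name looks valid, otherwise a short reason string.

    known_columns is a hypothetical set of column names for the documented
    table, e.g. read from the database's data dictionary by the caller.
    """
    # Reject anything that is not exactly one identifier, such as
    # "assay_id fk to Assay", where the description leaked into the name.
    if not IDENTIFIER.fullmatch(attribute_name):
        return "malformed"
    # Compare case-insensitively, since the dictionary and the input
    # files may not agree on case.
    if attribute_name.lower() not in {c.lower() for c in known_columns}:
        return "unknown column"
    return None
```

Under these assumptions, a loader would call check_attribute_name() on each input row and skip or report any row for which the result is not None, rather than inserting it into core.DatabaseDocumentation.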

Anyway, since there are still some minor schema changes taking place, I think that next week might be a good time to worry about updating all the documentation, since the database will be locked down for the migration at that point anyway. As for the controlled vocabularies, I think you're right, and we should try to populate these as soon as we can, even if it will be an iterative process in some cases.

Jonathan

--
Jonathan Crabtree
Center for Bioinformatics, University of Pennsylvania
1406 Blockley Hall, 423 Guardian Drive
Philadelphia, PA 19104-6021
215-573-3115