From: <vga...@us...> - 2010-02-08 17:08:24
|
Revision: 486 http://treebase.svn.sourceforge.net/treebase/?rev=486&view=rev Author: vgapeyev Date: 2010-02-08 17:07:55 +0000 (Mon, 08 Feb 2010) Log Message: ----------- Added a few details. Modified Paths: -------------- trunk/treebase-core/db/README.txt trunk/treebase-core/db/TODO.txt Modified: trunk/treebase-core/db/README.txt =================================================================== --- trunk/treebase-core/db/README.txt 2010-02-08 00:23:34 UTC (rev 485) +++ trunk/treebase-core/db/README.txt 2010-02-08 17:07:55 UTC (rev 486) @@ -9,18 +9,14 @@ in Ruby or Python (but with scripts written in SQL and no automation). The idea is that we store in SVN the initial snapshot of the DB schema -and sequential parches that we have been developing w.r.t +and sequential patches that we have been developing w.r.t treebasedev. These patches will be used to bring up-to-date treebasestage, and then treebaseprod. After that we can erase the patches, refresh the snapshot, and continue for another cycle. Official TB2 code releases are to be declared at the start of a cycle, when there are no patches yet. The snapshot is what should be used to -create a fresh TB2 instance. +create a fresh TB2 instance by someone who wants to run their own. -I suggest to create the directory treebase-code/db to store patches -and snapshots. (Another option would be a separate project, treebase- -db, a a sibling of treebase-core and treebase-web.) - Patches are to be sequentially numbered (by hand). To keep track which DB instance has already been brought up to which patch version, I suggest creating a special table with a number column. Each patch @@ -28,3 +24,20 @@ number. If this sounds ok, I hope to set this up by end of tomorrow. + + +Some more details: + +* Patch naming: 0001_short_name.sql +* Each patch is wrapped into a transaction and starts by incrementing the number in the version table. +* Patches and snapshots must not include any ownership or permissions commands. +* Version table schema: +CREATE TABLE versionhistory +( + id serial NOT NULL, + patchnumber integer, + patchlabel character varying(50) NOT NULL, + patchdescription character varying(1000), + applied timestamp with time zone, + CONSTRAINT versionhistory_pkey PRIMARY KEY (id) +) \ No newline at end of file Modified: trunk/treebase-core/db/TODO.txt =================================================================== --- trunk/treebase-core/db/TODO.txt 2010-02-08 00:23:34 UTC (rev 485) +++ trunk/treebase-core/db/TODO.txt 2010-02-08 17:07:55 UTC (rev 486) @@ -3,5 +3,44 @@ [To be put in as patches after the storage structure for them is in place in this directory -- VG 2010-02-04]: * Drop geospot_id_sequence -- there is no corresponding sequence + +* Drop hibernate_sequence + * Correct PK name in stepmatrixelement and create stepmatrixelement_id_sequence + * Create indexes, as suggested by Bill 2010-02-03 (and maybe later) + Bill's message of 2010-02-03: "Of the top of my head, I would assume that the following fields need to be indexed: + Table Field + -------- ------- + taxon name + taxonvariant fullname + taxonlabel taxonlabel + citation title + citation abstract + person lastname" + More details in bug https://sourceforge.net/tracker/?func=detail&atid=1126676&aid=2945391&group_id=248804, + esp phylotree, phylonode indexes. + Also, compare to the end-of-Jan schema snapshot of treebasedev + + +Pending data cleanups (mostly, on staging): + +* Remove junk, particularly that created by tests + +* Known junk: Submission 22 and its related records + Youjun's email of 2010-01-28: + "In table phylotree, many trees do not have a study_id value, but their phylotreenode related to study 22 via table taxonlabel. + Their phylotree_id are: + 1129,1130,1131,1132, 2333,2334,2335,2336,2337,2338, 2339,2340,2341,2343,2556,2557,2686,2726,2727,2787,2788,2789,2790, + 3446,3671,3766,3767,3901,3902,3903,3904,3905,3906,3907,3908,3909,3910,3911,3912, 4062,4063, 5705,5706,5707,5708,5720,5721,5921,5941,5981,160000022341 + I will delete those trees because their foreign key constrain prevent me from cleaning up about 10,000 dummy taxonlabel records related to submission 22 + (study 22 and submission 22 happen to be the same)." + VG: these IDs are from end-of-Jan treebasedev; the current treebasestage may have fewer. + +* Replace "owner" nulls by the special "migration" owner (see Bill's message of around 2010-02-02) + "Also because we have reverted to an older instance of the data, the migrated records contain have NULL in the user_id of the submission table. This causes null pointer errors for the unit tests. To fix the problem, I did the following: + 1. I created a user with username "migration" and email address "mig...@tr...". This is the dummy "user" who will now "own" all migrated submissions that lack an owner. + 2. I got the user_id for the "migration" user, and it happened to be 9955 + 3. I ran this statement: "UPDATE submission SET user_id = 9955 WHERE user_id IS NULL" + Now all migrated data belong to the user "migration". " + \ No newline at end of file This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site. |