From: Angel P. <an...@pc...> - 2003-02-28 18:57:30
On Fri, 28 Feb 2003, Pjm wrote:

> Hi all,
>
> While updating LoadPfam I have come across some interesting "things". A PFam
> release has links to databases such as 'INTERPRO; IPR000308;'. The old schema
> just had a link DBref ->ExternalDatabase. The new schema has a middle table to
> take into account a release of database.
>
> My first issue is populating these tables - I found some XML for GUSdev that I
> am tweaking to setup ExternalDatabase and ExternalDatabaseRelease for testing
> LoadPfam. I will assume all releases for now are version one, just for testing.
> When I do it proper, shall I just insert the latest release only?

This really depends on the representation that your situation needs. For
instance, when we represent either GenBank or dbEST, we use a "continuous"
release and update the ExternalDatabaseRelease entry to reflect either the
last day dbEST was imported or the last release of GenBank that was imported.
DoTS build team, correct me if I am wrong here. This allows us to have an
alternate key on the (external_database_release_id, source_id) tuple, and also
leaves only a small number of ext_db_rel_id's to query for a particular type
of source_id.

For other databases, you may want to keep the entries distinct between
releases. A good example of this would be UniGene, which does not really have
stable identifiers for a set of constituent sequences (e.g. when sets split or
merge, the old ID is completely lost). Another would be handling ontologies
such as the MGED Ontology, where terms and definitions may change, but you
want to keep track of the changes in order to successfully migrate data from
one release to the next and be able to track down deprecated terms.

> Do I need to create a small plugin to create new DB releases as needed?
> Shouldn't be hard.

Most times, if a plugin is this simple, it is better to rely on
UpdateGusFromXML to do the work for you.

> LoadPfam itself will assume any reference to a DB in the PFam file is to the
> latest DB. For example, picking the last release of INTERPRO will mean getting
> the youngest record by using ExternalDatabaseRelease.release_date for INTERPRO
> and link a SRes.DBref to it.

Yes, this works, but if you do not need multiple releases of Pfam, we may want
to consider the continuous release route. DoTS build team, comments?

> How does this sound?
>
> Have a good weekend,
> Paul.

--
Angel Pizarro
Programmer Analyst
Center for Bioinformatics
an...@pc...
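
For reference, here is a rough sketch of the "latest release" lookup discussed above: find the youngest ExternalDatabaseRelease row for a named database by release_date. It runs against a throwaway in-memory SQLite database, not a real GUS instance. The table names and release_date come from the thread; the name and version columns, the helper names, and the demo rows are assumptions made up purely for illustration.

```python
import sqlite3


def latest_release_id(conn, db_name):
    """Return the id of the most recent release of db_name, or None."""
    row = conn.execute(
        """
        SELECT r.external_database_release_id
        FROM ExternalDatabaseRelease r
        JOIN ExternalDatabase d
          ON d.external_database_id = r.external_database_id
        WHERE d.name = ?
        ORDER BY r.release_date DESC
        LIMIT 1
        """,
        (db_name,),
    ).fetchone()
    return row[0] if row else None


def demo_connection():
    """Build a tiny stand-in schema with made-up rows (not the GUS schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ExternalDatabase (
            external_database_id INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE          -- assumed column
        );
        CREATE TABLE ExternalDatabaseRelease (
            external_database_release_id INTEGER PRIMARY KEY,
            external_database_id INTEGER NOT NULL
                REFERENCES ExternalDatabase(external_database_id),
            version TEXT,                      -- assumed column
            release_date TEXT                  -- ISO dates sort correctly as text
        );
        -- Illustrative rows only; ids, versions and dates are invented.
        INSERT INTO ExternalDatabase VALUES (1, 'INTERPRO');
        INSERT INTO ExternalDatabaseRelease VALUES (10, 1, '1', '2002-06-01');
        INSERT INTO ExternalDatabaseRelease VALUES (11, 1, '2', '2003-01-15');
        """
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(latest_release_id(conn, "INTERPRO"))
```

With the continuous-release approach there would be a single release row per database, so the ORDER BY becomes unnecessary and the lookup reduces to a plain join on the database name.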