From: Deborah P. <pi...@pc...> - 2005-08-29 16:30:50
Steve Fischer wrote:

> i am not sure i like the idea of stuffing multiple aliases into one row.

I agree, but it is a way to keep the data if desired, it is searchable,
and it would not require a new table, GeneticMarkerSynonym, as I wanted
to use the view right away. We have several synonym tables, such as
GeneSynonym, MoietySynonym, and ProteinSynonym, and a couple of others.
I would propose an all-purpose table of synonyms, but I thought we were
very negative about tables with soft links. At any rate, that is a
discussion for a release.

>
> steve
>
> Deborah Pinney wrote:
>
>> Aaron J. Mackey wrote:
>>
>>> A few notes/thoughts on the proposal:
>>>
>>> Why would a genetic marker be considered a sequence? I guess you
>>> instead meant a view of dots.NaFeatureImp?
>>
>> Yes, I mistyped in the note; the view definition was "FROM
>> DoTS.NAFeatureImp WHERE subclass_view='GeneticMarker' ".
>>
>>> The relationships between a given set of markers are defined
>>> genetically (i.e. as linked loci with genetic distances measured in
>>> centiMorgans) with respect to a single genetic map (which
>>> corresponds to a specific set of experimental crosses or an
>>> observed pedigree). The same two markers may have different
>>> distances (or even be unlinked) in a different map.
>>
>> Not all genetic marker sets are defined strictly genetically. When
>> location is a physical location on a sequence, this information can
>> be captured in the dots.nalocation table and a marker can have
>> multiple locations. When location is defined genetically in the
>> stricter sense, capturing the information is not as straightforward
>> but can be done using the same view and nalocation (note that there
>> is a linkage_group attribute in the proposed view as well as an
>> external_database_release_id) and a marker can be represented by
>> more than one row in the nafeature view.
>>
>>> A marker's phenotype is often only its physical definition: SNP,
>>> SSLP, SSCP, RFLP, etc.
>>> Is this what the SO term is supposed to
>>> capture? If so, what is the "type" field meant to capture?
>>
>> No, type was intended to capture the kind of marker represented in
>> the row, such as the ones you mention, as well as markers not
>> necessarily in SO, such as blood groups or allozymes. The
>> sequence_ontology_id is there because all nafeature views have this
>> attribute and of course it can be used in the case where there is an
>> appropriate sequenceontology.term_name.
>>
>>> As mentioned elsewhere, organism/strain should probably be a single
>>> foreign key into the taxonomy table.
>>
>> Yes, I agreed with Chris and this was dropped.
>>
>>> Measures of heterogeneity and penetrance are also specific to a
>>> given population study, and are not universally true. I could
>>> imagine these and other attributes of a given study being captured
>>> independently. This will be an important area of growth in the
>>> next 10 years as widescale familial genotyping becomes more prevalent.
>>
>> These are nullable so not required from every study. Perhaps we don't
>> want these attributes and they can be eliminated. The data I intended
>> to load don't have these values but this is only one example and
>> these are values often associated with markers.
>>
>>> Markers may have multiple aliases.
>>
>> Yes, this is true. We can decide not to enter a value into this
>> attribute as it is nullable or we can have a list as the attribute is
>> a varchar2(1000).
>>
>>> Thanks,
>>>
>>> -Aaron
>>>
>>> On Aug 23, 2005, at 12:49 PM, Deborah Pinney wrote:
>>>
>>>> I suggest a new view of dots.NaSequenceImp that would be used to
>>>> store genetic marker data. Genetic markers are a staple genetic
>>>> tool but include a large variety of data types, some of which may
>>>> be covered by other feature views. I am proposing this view for
>>>> the variety of genetic marker data that are not specifically
>>>> stored elsewhere.
>>>> Below is a proposed view definition that
>>>> requires review and probably modification.
>>>>
>>>> SELECT NA_Feature_ID as na_feature_id,
>>>>        NA_SEQUENCE_ID as na_sequence_id,
>>>>        SUBCLASS_VIEW as subclass_view,
>>>>        NAME as name,
>>>>        SEQUENCE_ONTOLOGY_ID as sequence_ontology_id,
>>>>        PARENT_ID as parent_id,
>>>>        EXTERNAL_DATABASE_RELEASE_ID as external_database_release_id,
>>>>        SOURCE_ID as source_id,
>>>>        PREDICTION_ALGORITHM_ID as prediction_algorithm_id,
>>>>        IS_PREDICTED as is_predicted,
>>>>        REVIEW_STATUS_ID as review_status_id,
>>>>        STRING1 as alias,
>>>>        STRING2 as phenotype,
>>>>        STRING3 as type,
>>>>        STRING4 as linkage_group,
>>>>        STRING5 as centimorgan,
>>>>        STRING6 as measure_of_heterogeneity,
>>>>        STRING7 as penetrance,
>>>>        STRING8 as organism,
>>>>        STRING9 as strain,
>>>>        STRING12 as product,
>>>>        MODIFICATION_DATE as modification_date,
>>>>        USER_READ as user_read,
>>>>        USER_WRITE as user_write,
>>>>        GROUP_READ as group_read,
>>>>        GROUP_WRITE as group_write,
>>>>        OTHER_READ as other_read,
>>>>        OTHER_WRITE as other_write,
>>>>        ROW_USER_ID as row_user_id,
>>>>        ROW_GROUP_ID as row_group_id,
>>>>        ROW_PROJECT_ID as row_project_id,
>>>>        ROW_ALG_INVOCATION_ID as row_alg_invocation_id
>>>> FROM DoTS.NAFeatureImp WHERE subclass_view='GeneticMarker'
>>>>
>>>> -------------------------------------------------------
>>>> SF.Net email is Sponsored by the Better Software Conference & EXPO
>>>> September 19-22, 2005 * San Francisco, CA * Development Lifecycle
>>>> Practices
>>>> Agile & Plan-Driven Development * Managing Projects & Teams *
>>>> Testing & QA
>>>> Security * Process Improvement & Measurement *
>>>> http://www.sqe.com/bsce5sf
>>>> _______________________________________________
>>>> Gusdev-gusdev mailing list
>>>> Gus...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
>>>>
>>>
>>> --
>>> Aaron J. Mackey, Ph.D.
>>> Project Manager, ApiDB Bioinformatics Resource Center
>>> Penn Genomics Institute, University of Pennsylvania
>>> email: am...@pc...
>>> office: 215-898-1205 (Goddard) / 215-746-7018 (PCBI)
>>> fax: 215-746-6697
>>> postal: Penn Genomics Institute
>>>         Goddard Labs 212
>>>         415 S. University Avenue
>>>         Philadelphia, PA 19104-6017
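
The alias trade-off discussed at the top of this thread (several aliases
packed into one varchar2(1000) attribute versus a separate synonym table)
can be sketched as follows. This is only an illustration under stated
assumptions, not the GUS schema: it uses Python's built-in sqlite3 module
rather than Oracle, the tables genetic_marker and genetic_marker_synonym
are simplified stand-ins, the ';' delimiter is an assumption (the thread
does not specify one), and the marker names and aliases are made up.

```python
import sqlite3

# In-memory SQLite database; the thread's real target is Oracle (assumption: SQLite stand-in).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-in for the proposed GeneticMarker view (hypothetical table).
    CREATE TABLE genetic_marker (
        na_feature_id INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        alias         TEXT  -- nullable; several aliases packed as 'a;b;c' (assumed delimiter)
    );
    -- Hypothetical normalized alternative: one row per alias.
    CREATE TABLE genetic_marker_synonym (
        na_feature_id INTEGER NOT NULL REFERENCES genetic_marker(na_feature_id),
        synonym       TEXT NOT NULL
    );
    -- Made-up example data.
    INSERT INTO genetic_marker VALUES (1, 'markerA', 'mA;mk-A10');
    INSERT INTO genetic_marker VALUES (2, 'markerB', 'mA1;mk-B');
    INSERT INTO genetic_marker VALUES (3, 'markerC', NULL);
    INSERT INTO genetic_marker_synonym VALUES (1, 'mA');
    INSERT INTO genetic_marker_synonym VALUES (1, 'mk-A10');
    INSERT INTO genetic_marker_synonym VALUES (2, 'mA1');
    INSERT INTO genetic_marker_synonym VALUES (2, 'mk-B');
""")


def find_by_packed_alias_naive(alias):
    """Substring search on the packed column; also matches longer aliases."""
    rows = conn.execute(
        "SELECT name FROM genetic_marker "
        "WHERE alias LIKE '%' || ? || '%' ORDER BY name",
        (alias,),
    )
    return [row[0] for row in rows]


def find_by_packed_alias_delimited(alias):
    """Wrap both sides in the delimiter so only whole aliases match."""
    rows = conn.execute(
        "SELECT name FROM genetic_marker "
        "WHERE ';' || alias || ';' LIKE '%;' || ? || ';%' ORDER BY name",
        (alias,),
    )
    return [row[0] for row in rows]


def find_by_synonym_table(alias):
    """Exact match against the one-row-per-alias table."""
    rows = conn.execute(
        "SELECT m.name FROM genetic_marker m "
        "JOIN genetic_marker_synonym s ON s.na_feature_id = m.na_feature_id "
        "WHERE s.synonym = ? ORDER BY m.name",
        (alias,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(find_by_packed_alias_naive("mA"))      # ['markerA', 'markerB']
    print(find_by_packed_alias_delimited("mA"))  # ['markerA']
    print(find_by_synonym_table("mA"))           # ['markerA']
```

The packed column is searchable, as noted above, but a plain substring
match on 'mA' also returns the marker whose alias is 'mA1'. Matching on
the delimiter, or using a separate synonym table, avoids that. Rows with
a null alias simply never match in either packed-column query.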