Re: [GUSDEV] GeneticMarker view


Aaron J. Mackey wrote:

> A few notes/thoughts on the proposal:
>
> Why would a genetic marker be considered a sequence?   I guess you  
> instead meant a view of dots.NaFeatureImp?

Yes, I mistyped in the note; the view definition was "FROM 
DoTS.NAFeatureImp WHERE subclass_view='GeneticMarker'".
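
To illustrate how a subclass view over the Imp table behaves, here is a 
minimal sketch in Python, using SQLite in place of Oracle. The reduced 
column set, the unqualified table name, and the sample rows are 
assumptions for illustration only; only the STRING1/STRING3 aliases and 
the subclass_view filter come from the proposal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, heavily reduced stand-in for DoTS.NAFeatureImp
CREATE TABLE NAFeatureImp (
    na_feature_id INTEGER PRIMARY KEY,
    subclass_view TEXT NOT NULL,
    name          TEXT,
    string1       TEXT,
    string3       TEXT
);

-- The view renames generic columns and filters on subclass_view
CREATE VIEW GeneticMarker AS
SELECT na_feature_id,
       subclass_view,
       name,
       string1 AS alias,
       string3 AS type
FROM NAFeatureImp
WHERE subclass_view = 'GeneticMarker';
""")

# Two made-up rows: one genetic marker, one feature of another subclass
conn.executemany(
    "INSERT INTO NAFeatureImp VALUES (?, ?, ?, ?, ?)",
    [
        (1, "GeneticMarker", "marker_A", "mA", "SNP"),
        (2, "GeneFeature", "gene_X", None, None),
    ],
)

# Only the row tagged 'GeneticMarker' is visible through the view
marker_rows = conn.execute(
    "SELECT name, alias, type FROM GeneticMarker"
).fetchall()
print(marker_rows)
```

Rows of other subclasses stay in the same underlying table but are 
invisible through this view.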

>
> The relationships between a given set of markers are defined  
> genetically (i.e. as linked loci with genetic distances measured in  
> centiMorgans) with respect to a single genetic map (which corresponds  
> to a specific set of experimental crosses or an observed pedigree).   
> The same two markers may have different distances (or even be  
> unlinked) in a different map.

Not all genetic marker sets are defined strictly genetically. When the 
location is a physical location on a sequence, it can be captured in the 
dots.nalocation table, and a marker can have multiple locations. When the 
location is defined genetically in the stricter sense, capturing the 
information is not as straightforward, but it can be done using the same 
view and nalocation. Note that the proposed view has a linkage_group 
attribute as well as an external_database_release_id, and a marker can be 
represented by more than one row in the nafeature view.
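
One possible reading of this, sketched in Python with SQLite standing in 
for Oracle: the same marker appears as one row per map (distinguished by 
external_database_release_id), each with its own linkage_group and 
centimorgan value, while physical positions go into a location table. 
The table names, the start_min/end_max columns, the release ids, and all 
sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-in for the proposed GeneticMarker view
CREATE TABLE genetic_marker (
    na_feature_id                INTEGER PRIMARY KEY,
    source_id                    TEXT,
    external_database_release_id INTEGER,
    linkage_group                TEXT,
    centimorgan                  TEXT  -- STRING5 in the proposal, so text
);

-- Hypothetical stand-in for dots.nalocation
CREATE TABLE na_location (
    na_location_id INTEGER PRIMARY KEY,
    na_feature_id  INTEGER REFERENCES genetic_marker(na_feature_id),
    start_min      INTEGER,
    end_max        INTEGER
);
""")

# Marker M1 is placed on two different maps (releases 100 and 200)
conn.executemany(
    "INSERT INTO genetic_marker VALUES (?, ?, ?, ?, ?)",
    [
        (1, "M1", 100, "LG1", "12.5"),
        (2, "M1", 200, "LG3", "40.0"),
        (3, "M2", 100, "LG1", "18.0"),
    ],
)

# Feature 1 also has two physical locations on a sequence
conn.executemany(
    "INSERT INTO na_location VALUES (?, ?, ?, ?)",
    [(1, 1, 1500, 1500), (2, 1, 98000, 98000)],
)

# Genetic positions of M1, one row per map
m1_positions = conn.execute(
    """SELECT external_database_release_id, linkage_group, centimorgan
       FROM genetic_marker
       WHERE source_id = 'M1'
       ORDER BY external_database_release_id"""
).fetchall()

# Number of physical locations recorded for feature 1
m1_location_count = conn.execute(
    "SELECT COUNT(*) FROM na_location WHERE na_feature_id = 1"
).fetchone()[0]

print(m1_positions, m1_location_count)
```

Here the two rows for M1 carry different linkage groups and distances, 
which matches the point that the same marker can sit differently in 
different maps.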

>
> A marker's phenotype is often only its physical definition: SNP, SSLP,  
> SSCP, RFLP, etc.  Is this what the SO term is supposed to capture?   
> If so, what is the "type" field meant to capture?

No, type was intended to capture the kind of marker represented in the 
row, such as the ones you mention, as well as markers not necessarily in 
SO, such as blood groups or allozymes. The sequence_ontology_id is there 
because all nafeature views have this attribute, and of course it can be 
used where there is an appropriate sequenceontology.term_name.

>
> As mentioned elsewhere, organism/strain should probably be a single  
> foreign key into the taxonomy table.

Yes, I agreed with Chris and this was dropped.

>
> Measures of heterogeneity and penetrance are also specific to a given  
> population study, and are not universally true.  I could imagine  
> these and other attributes of a given study being captured  
> independently.  This will be an important area of growth in the next  
> 10 years as widescale familial genotyping becomes more prevalent.

These are nullable, so they are not required from every study. Perhaps we 
don't want these attributes, and they can be eliminated. The data I 
intended to load don't have these values, but this is only one example, 
and these values are often associated with markers.

>
>
> Markers may have multiple aliases.

Yes, this is true. We can decide not to enter a value into this 
attribute, as it is nullable, or we can store a list, as the attribute is 
a varchar2(1000).
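
If we go with a list, it could be packed into the single column along 
these lines. This is only a sketch in Python: the ";" separator and the 
helper names are assumptions, since no delimiter has been agreed on, and 
the length check counts characters, whereas Oracle may count bytes 
depending on the length semantics in use.

```python
MAX_ALIAS_LENGTH = 1000  # size of the varchar2(1000) alias attribute
SEPARATOR = ";"          # assumed delimiter, not specified anywhere yet


def pack_aliases(aliases):
    """Join aliases into one string; return None (SQL NULL) if there are none."""
    cleaned = [a.strip() for a in aliases if a and a.strip()]
    if not cleaned:
        return None
    for alias in cleaned:
        if SEPARATOR in alias:
            raise ValueError(f"alias contains the separator: {alias!r}")
    packed = SEPARATOR.join(cleaned)
    if len(packed) > MAX_ALIAS_LENGTH:
        raise ValueError("alias list does not fit in the column")
    return packed


def unpack_aliases(value):
    """Split a stored alias string back into a list; NULL gives an empty list."""
    if not value:
        return []
    return value.split(SEPARATOR)


# Made-up alias names for illustration
stored = pack_aliases(["mk1", " mk1_alt ", ""])
print(stored)
print(unpack_aliases(stored))
```

The drawback of a packed list is that looking up a marker by one alias 
needs a pattern match on the string rather than a plain equality test.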

>
>
> Thanks,
>
> -Aaron
>
> On Aug 23, 2005, at 12:49 PM, Deborah Pinney wrote:
>
>> I suggest a new view of dots.NaSequenceImp that would be used to  
>> store genetic marker data. Genetic markers are a staple genetic  tool 
>> but include a large variety of data types, some of which may  be 
>> covered by other feature views. I am proposing this view for the  
>> variety of genetic marker data that are not specifically stored  
>> elsewhere. Below is a proposed view definition that requires review  
>> and probably modification.
>>
>> SELECT NA_Feature_ID as na_feature_id,
>> NA_SEQUENCE_ID as na_sequence_id,
>> SUBCLASS_VIEW as subclass_view,
>> NAME as name,
>> SEQUENCE_ONTOLOGY_ID as sequence_ontology_id,
>> PARENT_ID as parent_id,
>> EXTERNAL_DATABASE_RELEASE_ID as external_database_release_id,
>> SOURCE_ID as source_id,
>> PREDICTION_ALGORITHM_ID as prediction_algorithm_id,
>> IS_PREDICTED as is_predicted,
>> REVIEW_STATUS_ID as review_status_id,
>> STRING1 as alias,
>> STRING2 as phenotype,
>> STRING3 as type,
>> STRING4 as linkage_group,
>> STRING5 as centimorgan,
>> STRING6 as measure_of_heterogeneity,
>> STRING7 as penetrance,
>> STRING8 as organism,
>> STRING9 as strain,
>> STRING12 as product,
>> MODIFICATION_DATE as modification_date,
>> USER_READ as user_read,
>> USER_WRITE as user_write,
>> GROUP_READ as group_read,
>> GROUP_WRITE as group_write,
>> OTHER_READ as other_read,
>> OTHER_WRITE as other_write,
>> ROW_USER_ID as row_user_id,
>> ROW_GROUP_ID as row_group_id,
>> ROW_PROJECT_ID as row_project_id,
>> ROW_ALG_INVOCATION_ID as row_alg_invocation_id
>> FROM DoTS.NAFeatureImp WHERE subclass_view='GeneticMarker'
>>
>
> -- 
> Aaron J. Mackey, Ph.D.
> Project Manager, ApiDB Bioinformatics Resource Center
> Penn Genomics Institute, University of Pennsylvania
> email:  am...@pc...
> office: 215-898-1205 (Goddard) / 215-746-7018 (PCBI)
> fax:    215-746-6697
> postal: Penn Genomics Institute
>         Goddard Labs 212
>         415 S. University Avenue
>         Philadelphia, PA  19104-6017
>