|
From: Deborah P. <pi...@pc...> - 2005-08-23 16:50:19
|
I suggest a new view of dots.NaSequenceImp that would be used to store genetic marker data. Genetic markers are a staple genetic tool but include a large variety of data types, some of which may be covered by other feature views. I am proposing this view for the variety of genetic marker data that are not specifically stored elsewhere. Below is a proposed view definition that requires review and probably modification.

SELECT NA_Feature_ID as na_feature_id,
       NA_SEQUENCE_ID as na_sequence_id,
       SUBCLASS_VIEW as subclass_view,
       NAME as name,
       SEQUENCE_ONTOLOGY_ID as sequence_ontology_id,
       PARENT_ID as parent_id,
       EXTERNAL_DATABASE_RELEASE_ID as external_database_release_id,
       SOURCE_ID as source_id,
       PREDICTION_ALGORITHM_ID as prediction_algorithm_id,
       IS_PREDICTED as is_predicted,
       REVIEW_STATUS_ID as review_status_id,
       STRING1 as alias,
       STRING2 as phenotype,
       STRING3 as type,
       STRING4 as linkage_group,
       STRING5 as centimorgan,
       STRING6 as measure_of_heterogeneity,
       STRING7 as penetrance,
       STRING8 as organism,
       STRING9 as strain,
       STRING12 as product,
       MODIFICATION_DATE as modification_date,
       USER_READ as user_read,
       USER_WRITE as user_write,
       GROUP_READ as group_read,
       GROUP_WRITE as group_write,
       OTHER_READ as other_read,
       OTHER_WRITE as other_write,
       ROW_USER_ID as row_user_id,
       ROW_GROUP_ID as row_group_id,
       ROW_PROJECT_ID as row_project_id,
       ROW_ALG_INVOCATION_ID as row_alg_invocation_id
FROM DoTS.NAFeatureImp
WHERE subclass_view='GeneticMarker'
|
From: Chris S. <sto...@pc...> - 2005-08-23 22:06:04
|
Hi Debbie,

Some of the strings look like they might be numbers (e.g., centimorgan) or foreign keys to a controlled vocabulary (e.g., phenotype, type, organism, strain). Since NAFeatures have an NASequence, which has a taxon_id, is "organism" really needed? Is product a protein, and should it therefore be linked to an AA table? In other words, which are the key attributes that should be integrated with other data in GUS, and what can just go into a free text description field?

Thanks.
Chris

-------------------------------------------------------
SF.Net email is Sponsored by the Better Software Conference & EXPO
September 19-22, 2005 * San Francisco, CA * Development Lifecycle Practices
Agile & Plan-Driven Development * Managing Projects & Teams * Testing & QA
Security * Process Improvement & Measurement * http://www.sqe.com/bsce5sf
_______________________________________________
Gusdev-gusdev mailing list
Gus...@li...
https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
|
From: Steve F. <sfi...@pc...> - 2005-08-24 01:13:54
|
just a reminder that foreign keys have to go into the Imp table, and therefore into the superclass...

steve
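Steve's reminder can be illustrated with a small sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for the database GUS actually runs on, and the table names imp and phenotype, the view name genetic_marker, and the use of int1 for the phenotype key are simplified, hypothetical stand-ins rather than the real DoTS or SRes definitions. The idea it shows: a subclass view only renames columns, so the foreign key constraint has to be declared on the underlying Imp table, where it then applies to every row regardless of subclass.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is enabled

# Hypothetical, heavily trimmed stand-ins for sres.phenotype and DoTS.NAFeatureImp.
conn.executescript("""
CREATE TABLE phenotype (phenotype_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE imp (
    na_feature_id INTEGER PRIMARY KEY,
    subclass_view TEXT NOT NULL,
    int1 INTEGER REFERENCES phenotype (phenotype_id)  -- constraint lives on the table
);
CREATE VIEW genetic_marker AS
    SELECT na_feature_id, subclass_view, int1 AS phenotype_id
    FROM imp
    WHERE subclass_view = 'GeneticMarker';
INSERT INTO phenotype VALUES (1, 'example phenotype');
INSERT INTO imp VALUES (10, 'GeneticMarker', 1);
""")


def insert_marker(feature_id, phenotype_id):
    """Try to insert a marker row; return False if the FK check rejects it."""
    try:
        conn.execute(
            "INSERT INTO imp VALUES (?, 'GeneticMarker', ?)",
            (feature_id, phenotype_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


ok = insert_marker(11, 1)          # phenotype 1 exists
rejected = insert_marker(12, 999)  # phenotype 999 does not exist
rows = conn.execute(
    "SELECT na_feature_id, phenotype_id FROM genetic_marker ORDER BY na_feature_id"
).fetchall()
```

The view exposes the key under a readable name, but the check that rejects the dangling reference comes from the table definition.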
|
From: Deborah P. <pi...@pc...> - 2005-08-24 14:53:15
|
Chris Stoeckert wrote:

> Hi Debbie,
> Some of the strings look like they might be numbers (e.g., centimorgan) or foreign keys to a controlled vocabulary (e.g., phenotype, type, organism, strain). Since NAFeatures have NASequence which have taxon_id is "organism" really needed? Is product a protein and therefore should be linked to an AA table? In other words, which are the key attributes that should be integrated with other data in GUS and what can just go into a free text description field?
> Thanks.
> Chris

You're right about centimorgans; I originally had float and somehow dropped it. Also, there is an sres.phenotype table of which I was not aware. Do you know if there is an accepted phenotype ontology that is not mammal-centric? This leads to the question raised by Steve's comment that a foreign key has to be included in the superclass view and Imp table. Do we want to add phenotype as a foreign key?

There probably should be a CV for genetic marker type, but I haven't seen it and SO terms don't cover them (blood groups, for example), so I'm not sure that type can be a reference to another table.

You're probably right about the organism and strain being covered by the taxon_id in the NASequence Imp view. Those are probably appropriate for the SeqVariation view but not for GeneticMarker, and I'll drop them. On second thought, I think product should probably be dropped, as phenotype is probably the correct association and already an attribute.

I was trying to anticipate a variety of data sets (I'm only dealing with a single example), so I included linkage_group, but I'm not sure whether this attribute is needed.

Here is an altered view definition:

SELECT NA_Feature_ID as na_feature_id,
       NA_SEQUENCE_ID as na_sequence_id,
       SUBCLASS_VIEW as subclass_view,
       NAME as name,
       SEQUENCE_ONTOLOGY_ID as sequence_ontology_id,
       PARENT_ID as parent_id,
       EXTERNAL_DATABASE_RELEASE_ID as external_database_release_id,
       SOURCE_ID as source_id,
       PREDICTION_ALGORITHM_ID as prediction_algorithm_id,
       IS_PREDICTED as is_predicted,
       REVIEW_STATUS_ID as review_status_id,
       STRING1 as alias,
       INT1 as phenotype,
       STRING3 as type,
       STRING4 as linkage_group,
       FLOAT1 as centimorgan,
       STRING5 as measure_of_heterogeneity,
       STRING6 as penetrance,
       MODIFICATION_DATE as modification_date,
       USER_READ as user_read,
       USER_WRITE as user_write,
       GROUP_READ as group_read,
       GROUP_WRITE as group_write,
       OTHER_READ as other_read,
       OTHER_WRITE as other_write,
       ROW_USER_ID as row_user_id,
       ROW_GROUP_ID as row_group_id,
       ROW_PROJECT_ID as row_project_id,
       ROW_ALG_INVOCATION_ID as row_alg_invocation_id
FROM DoTS.NAFeatureImp
WHERE subclass_view='GeneticMarker'
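The altered definition follows the subclass-view pattern, and the move of centimorgan from a string column to FLOAT1 can be made concrete with a short sketch. It assumes SQLite (through Python's sqlite3 module) in place of the real database, drops the DoTS schema prefix, and keeps only a handful of the columns; the sample rows and the 'SomeOtherFeature' subclass value are made up for illustration. It shows the view filtering on subclass_view and why a numeric column sorts map positions correctly where text would not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Trimmed, hypothetical version of the Imp table and the proposed view.
conn.executescript("""
CREATE TABLE NAFeatureImp (
    na_feature_id INTEGER PRIMARY KEY,
    subclass_view TEXT NOT NULL,
    name TEXT,
    string1 TEXT,
    string4 TEXT,
    float1 REAL
);
CREATE VIEW GeneticMarker AS
SELECT na_feature_id,
       subclass_view,
       name,
       string1 AS alias,
       string4 AS linkage_group,
       float1  AS centimorgan
FROM NAFeatureImp
WHERE subclass_view = 'GeneticMarker';
""")

conn.executemany(
    "INSERT INTO NAFeatureImp VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "GeneticMarker", "m1", None, "LG1", 9.5),
        (2, "GeneticMarker", "m2", None, "LG1", 10.2),
        (3, "SomeOtherFeature", "x", None, None, None),  # not visible in the view
    ],
)

# Numeric ordering on the float column.
numeric_order = [
    r[0] for r in conn.execute("SELECT name FROM GeneticMarker ORDER BY centimorgan")
]
# What ordering would look like if the value were stored as text.
text_order = [
    r[0]
    for r in conn.execute(
        "SELECT name FROM GeneticMarker ORDER BY CAST(centimorgan AS TEXT)"
    )
]
```

With the float column, 9.5 sorts before 10.2; as text, '10.2' sorts before '9.5', which is the kind of surprise Chris's comment about numbers stored as strings points at.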
|
From: Aaron J. M. <am...@pc...> - 2005-08-29 12:49:56
|
A few notes/thoughts on the proposal:

Why would a genetic marker be considered a sequence? I guess you instead meant a view of dots.NaFeatureImp?

The relationships between a given set of markers are defined genetically (i.e., as linked loci with genetic distances measured in centiMorgans) with respect to a single genetic map (which corresponds to a specific set of experimental crosses or an observed pedigree). The same two markers may have different distances (or even be unlinked) in a different map.

A marker's phenotype is often only its physical definition: SNP, SSLP, SSCP, RFLP, etc. Is this what the SO term is supposed to capture? If so, what is the "type" field meant to capture?

As mentioned elsewhere, organism/strain should probably be a single foreign key into the taxonomy table.

Measures of heterogeneity and penetrance are also specific to a given population study, and are not universally true. I could imagine these and other attributes of a given study being captured independently. This will be an important area of growth in the next 10 years as wide-scale familial genotyping becomes more prevalent.

Markers may have multiple aliases.

Thanks,

-Aaron

--
Aaron J. Mackey, Ph.D.
Project Manager, ApiDB Bioinformatics Resource Center
Penn Genomics Institute, University of Pennsylvania
email: am...@pc...
office: 215-898-1205 (Goddard) / 215-746-7018 (PCBI)
fax: 215-746-6697
postal: Penn Genomics Institute
        Goddard Labs 212
        415 S. University Avenue
        Philadelphia, PA 19104-6017
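Aaron's point that a genetic position only has meaning relative to one map can be sketched as a separate table keyed by marker and map. This is only an illustration under stated assumptions: the table marker_map_position, its columns, and the sample markers and crosses are all hypothetical and not part of GUS, SQLite (through Python's sqlite3 module) stands in for the real database, and treating the difference of two map positions as the distance is a simplification.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: one row per (marker, map), so a marker can sit at
# different positions, or be unlinked, in different maps.
conn.executescript("""
CREATE TABLE marker_map_position (
    marker_name   TEXT NOT NULL,
    map_name      TEXT NOT NULL,
    linkage_group TEXT,   -- NULL when the marker is unlinked in this map
    centimorgan   REAL,
    PRIMARY KEY (marker_name, map_name)
);
""")

conn.executemany(
    "INSERT INTO marker_map_position VALUES (?, ?, ?, ?)",
    [
        ("mA", "cross_1", "LG1", 12.0),
        ("mB", "cross_1", "LG1", 15.5),
        ("mA", "cross_2", "LG1", 11.0),
        ("mB", "cross_2", None, None),  # unlinked in the second map
    ],
)


def distance(map_name, m1, m2):
    """Return the cM difference if both markers share a linkage group in this map, else None."""
    row = conn.execute(
        """
        SELECT abs(a.centimorgan - b.centimorgan)
        FROM marker_map_position a
        JOIN marker_map_position b
          ON a.map_name = b.map_name
         AND a.linkage_group = b.linkage_group
        WHERE a.map_name = ? AND a.marker_name = ? AND b.marker_name = ?
        """,
        (map_name, m1, m2),
    ).fetchone()
    return row[0] if row else None
```

Here distance("cross_1", "mA", "mB") gives a value, while the same pair in "cross_2" gives None because the NULL linkage group never matches in the join. Whether this belongs in a new table or in multiple feature rows plus nalocation is exactly the design question the thread leaves open.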
|
From: Deborah P. <pi...@pc...> - 2005-08-29 15:22:13
|
Aaron J. Mackey wrote:

> A few notes/thoughts on the proposal:
>
> Why would a genetic marker be considered a sequence? I guess you instead meant a view of dots.NaFeatureImp?

Yes, I mistyped in the note; the view definition was "FROM DoTS.NAFeatureImp WHERE subclass_view='GeneticMarker'".

> The relationships between a given set of markers are defined genetically (i.e., as linked loci with genetic distances measured in centiMorgans) with respect to a single genetic map (which corresponds to a specific set of experimental crosses or an observed pedigree). The same two markers may have different distances (or even be unlinked) in a different map.

Not all genetic marker sets are defined strictly genetically. When location is a physical location on a sequence, this information can be captured in the dots.nalocation table, and a marker can have multiple locations. When location is defined genetically in the stricter sense, capturing the information is not as straightforward but can be done using the same view and nalocation (note that there is a linkage_group attribute in the proposed view as well as an external_database_release_id), and a marker can be represented by more than one row in the nafeature view.

> A marker's phenotype is often only its physical definition: SNP, SSLP, SSCP, RFLP, etc. Is this what the SO term is supposed to capture? If so, what is the "type" field meant to capture?

No, type was intended to capture the kind of marker represented in the row, such as the ones you mention as well as markers not necessarily in SO, such as blood groups or allozymes. The sequence_ontology_id is there because all nafeature views have this attribute, and of course it can be used in the case where there is an appropriate sequenceontology.term_name.

> As mentioned elsewhere, organism/strain should probably be a single foreign key into the taxonomy table.

Yes, I agreed with Chris and this was dropped.

> Measures of heterogeneity and penetrance are also specific to a given population study, and are not universally true. I could imagine these and other attributes of a given study being captured independently. This will be an important area of growth in the next 10 years as wide-scale familial genotyping becomes more prevalent.

These are nullable, so not required from every study. Perhaps we don't want these attributes and they can be eliminated. The data I intended to load don't have these values, but this is only one example and these are values often associated with markers.

> Markers may have multiple aliases.

Yes, this is true. We can decide not to enter a value into this attribute, as it is nullable, or we can have a list, as the attribute is a varchar2(1000).

> Thanks,
>
> -Aaron
|
From: Steve F. <sfi...@pc...> - 2005-08-29 15:54:41
|
i am not sure i like the idea of stuffing multiple aliases into one row.

steve
|
From: Deborah P. <pi...@pc...> - 2005-08-29 16:30:50
|
Steve Fischer wrote:

> i am not sure i like the idea of stuffing multiple aliases into one row.

I agree, but it is a way to keep the data if desired, it is searchable, and it would not require a new table, GeneticMarkerSynonym; I wanted to use the view right away. We have several synonym tables, such as GeneSynonym, MoietySynonym, and ProteinSynonym, and a couple of others. I would propose an all-purpose table of synonyms, but I thought we were very negative about tables with soft links. At any rate, that is a discussion for a release.
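The trade-off between a delimited alias list and a synonym table can be sketched as follows. This is an illustration only: SQLite (through Python's sqlite3 module) stands in for the real database, the marker table is a hypothetical stand-in for the GeneticMarker view, the GeneticMarkerSynonym layout is a guess at what such a table might look like (it does not exist in the thread beyond its name), and the semicolon delimiter and sample aliases are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
-- Option 1: aliases packed into one string column on the marker row.
CREATE TABLE marker (
    na_feature_id INTEGER PRIMARY KEY,
    name  TEXT,
    alias TEXT
);
-- Option 2: one row per alias in a separate (hypothetical) synonym table.
CREATE TABLE GeneticMarkerSynonym (
    na_feature_id INTEGER NOT NULL REFERENCES marker (na_feature_id),
    synonym TEXT NOT NULL,
    PRIMARY KEY (na_feature_id, synonym)
);
INSERT INTO marker VALUES (1, 'm1', 'ab1;xy7');
INSERT INTO marker VALUES (2, 'm2', 'ab12');
INSERT INTO GeneticMarkerSynonym VALUES (1, 'ab1'), (1, 'xy7'), (2, 'ab12');
""")

# Searching the packed list needs a substring match, which also hits 'ab12'.
list_hits = [
    r[0]
    for r in conn.execute(
        "SELECT name FROM marker WHERE alias LIKE '%ab1%' ORDER BY name"
    )
]

# Searching the synonym table is an exact match on one value per row.
table_hits = [
    r[0]
    for r in conn.execute(
        """
        SELECT m.name
        FROM marker m
        JOIN GeneticMarkerSynonym s ON s.na_feature_id = m.na_feature_id
        WHERE s.synonym = 'ab1'
        ORDER BY m.name
        """
    )
]
```

The packed column is searchable, as Deborah says, but the substring search returns both markers, while the per-row table returns only the marker that actually carries the alias. The packed form avoids a schema change, which is the reason given for using it in the short term.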