From: Junmin L. <ju...@pc...> - 2007-01-25 19:59:07
Hi, Helen and others

Sorry for the late comments. In terms of the Protocol REF, I agree with
Joe that the Protocol is listed in the IDF, and we can take its default
term source from the destination or source repository. But I have a
question on the term source for Array Design REF. For example, if
ArrayExpress gets a MAGE-TAB that refers to an array design in GEO, you
can't load it until AE loads this array design into AE locally first,
can you? So you have to convert it to AE's array design id, right?

---junmin

On Mon, 22 Jan 2007, Helen Parkinson wrote:

> Hi
>
> In the interests of not being ArrayExpress centric I'd be interested to
> see what those who plan to consume/provide these sheets from multiple
> sources think. Junmin, Upenn people do you have an opinion?
>
> cheers
>
> Helen
>
>
> Joe White wrote:
>> Hi Helen,
>>
>> Regarding item 4, I thought the Protocol REF elements actually DID refer
>> to the IDF. So using that option makes sense to me. But I also agree
>> with Tim's idea of allowing a Term Source column in the SDRF as an
>> alternative--that's what we did with other ontology terms. For the AD,
>> we need the Term Source column, since the AD isn't listed in the IDF.
>> So I prefer the same option that you, Michael, and Tim do; however, the
>> default should be that Protocol REF is listed in IDF and the default
>> Term Source is ArrayExpress --since that's where these sheets are going
>> anyway. Alternatively, the default Term Source could be listed in the
>> IDF, if AE is not the destination repository.
>>
>> Cheers,
>> Joe
>>
>>
>>
>> Helen Parkinson wrote:
>>
>>
>>> Dear all,
>>>
>>> here are the collated comments as promised. These are mostly minor
>>> excepting 4 and 5. No-one has objected to the suggestion in 4, though 3
>>> people have expressed a preference, please see our comments in response
>>> to point 5.
>>> I think the next step could be a phone call to discuss
>>> these, if we need this, I suggest Thursday 25th 4pm GMT, please could
>>> you indicate your availability,
>>>
>>> cheers
>>>
>>> Helen
>>>
>>>
>>>
>>> 1. Clarification of date format in response to Joe White. YYYY-MM-DD
>>> with time optional is correct.
>>>
>>>
>>> 3. Suggestion to modify the format of the mapping file and/or provide
>>> some notes
>>>
>>> " In the mapping file it might be helpful to have some description of
>>> the MAGEv1.1 items, ie class.association.attribute. In some cases we
>>> follow several associations. Unless you know MAGE fairly well, it might
>>> be difficult to understand what the mapped values refer to. In all
>>> cases, the value starts with a MAGE class, and ends with some MAGE
>>> attribute. There will be 0 or more associations in between. 3) In the
>>> mapping file, the [...] tend to look like separate columns. "
>>>
>>> This can be modified if needed. We think the target audience are MAGE
>>> literate anyway so it's a minor addition of some explanatory notes.
>>>
>>> 4. Suggestion from Tim to indicate a source database for protocol or ad
>>> accessions
>>>
>>> One possible alteration which has come up is a means of indicating a
>>> source database for protocol or array design accessions, where such
>>> information is reused between experiments. I'd like to propose that we
>>> allow the Protocol REF and Array Design REF columns to refer to the IDF
>>> Term Source Name using either square brackets or parentheses, e.g.:
>>>
>>> Protocol REF [ArrayExpress]
>>>
>>> Array Design REF [GEO]
>>>
>>> where ArrayExpress or GEO are explicitly listed in the IDF as Term
>>> Sources. I'd also suggest that in the absence of such tags it is assumed
>>> that the identifier is local to the context in which the SDRF is used,
>>> e.g. assuming ArrayExpress accessions for submissions to ArrayExpress.
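
The first option in item 4 can be illustrated with a minimal Python sketch.
It is only an illustration under stated assumptions, not part of the
MAGE-TAB specification or of any existing parser: the function name
resolve_ref_source and the parameters idf_term_sources and local_source
are hypothetical. It reads a tagged SDRF column header, checks the tag
against the Term Sources listed in the IDF, and falls back to the local
context when no tag is present.

```python
import re

# Hypothetical pattern: a REF column name, optionally followed by a
# Term Source Name in square brackets or parentheses.
_REF_HEADER = re.compile(
    r"^(?P<column>Protocol REF|Array Design REF)\s*"
    r"(?:\[(?P<sq>[^\]]+)\]|\((?P<par>[^)]+)\))?\s*$"
)


def resolve_ref_source(header, idf_term_sources, local_source):
    """Return (column name, term source) for an SDRF REF column header.

    idf_term_sources: Term Source Names listed in the IDF (assumed input).
    local_source: the context in which the SDRF is used, e.g. the
    destination repository (assumed input).
    """
    match = _REF_HEADER.match(header.strip())
    if match is None:
        raise ValueError(f"not a recognised REF column header: {header!r}")

    tag = match.group("sq") or match.group("par")
    if tag is None:
        # No tag: the identifier is assumed local to the current context.
        return match.group("column"), local_source

    tag = tag.strip()
    if tag not in idf_term_sources:
        # The proposal requires the tag to be listed in the IDF.
        raise ValueError(f"term source {tag!r} is not listed in the IDF")
    return match.group("column"), tag


if __name__ == "__main__":
    sources = {"ArrayExpress", "GEO"}
    print(resolve_ref_source("Protocol REF [ArrayExpress]", sources, "ArrayExpress"))
    print(resolve_ref_source("Array Design REF (GEO)", sources, "ArrayExpress"))
    print(resolve_ref_source("Protocol REF", sources, "ArrayExpress"))
```

Rejecting a tag that is not listed in the IDF is a design choice of this
sketch; an implementation could equally warn and continue.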
>>>
>>> Note that there is scope for using the Protocol REF:namespace syntax to
>>> add an external namespace to identifiers in the SDRF, but that doesn't
>>> really work for accessions which don't have namespaces (for good or ill).
>>>
>>>
>>> OR
>>>
>>> to allow Protocol REF and Array Design REF to be associated with Term
>>> Source REF columns. It's more flexible and only a minor addition to the
>>> specification.
>>>
>>> Michael prefers this option, so do Helen and Tim
>>>
>>> 5. Set of comments from Michael, my comments in line
>>>
>>> the additional set of fields for the IDF are to specify a set of files
>>> that carry additional annotation information on the Material fields of
>>> the SDRF. the use case is perhaps an additional MAGE-ML file whose
>>> BioMaterial identifier matches up to the identifier of one of the
>>> source, sample or extract names (including the specified or default
>>> <authority field) and simply contains <OntologyEntry elements with no
>>> reference elements (those are in the SDRF file). the other example type
>>> of file might be a CDISC SEND formatted file.
>>>
>>> i would propose that the IDF be able to include along with the SDRF
>>> file, an 'Annotation File' row and an 'Annotation File Type' ("MAGE-ML"
>>> or "CDISC-SEND Clinical Pathology") row which could have multiple
>>> entries.
>>>
>>> -------------------------------------------------------------------------
>>> **This is a major extension of the core proposal. Tim and Helen have
>>> reservations:
>>>
>>> 5.1. About modifying the core proposal at this point - we are on a tight
>>> deadline for our EBI services review and the discussion required might
>>> compromise our implementation being ready on time.
>>>
>>> 5.2. Mix and matching MAGE and/or other formats - MAGE is not human
>>> readable and should not be mixed and matched with MAGE-TAB in our view.
>>> Either it's MAGE-TAB or MAGE-ML not a mix.
>>> Anyone's local
>>> implementation is of course up to them, but this is a representation
>>> format not an implementation. One could use a Comment[CDISC file] for
>>> this in the IDF for example if support is needed right away.
>>>
>>> 5.3. CDISC is an interesting case, this should be investigated and maybe
>>> a MAGE-TAB 1.1 could reference such a format. There will probably be
>>> other such interesting cases. We (AE) don't want to commit to supporting
>>> such formats at this point without a group discussion and some examples
>>> should be carefully examined. We are not happy to add this to the spec,
>>> especially as it's already published with no mention of this. Is there
>>> an available parser API? It would be good to initiate a discussion with
>>> CDISC as well. So we're not ruling this out, but we would prefer not for
>>> this version. In fact it might be better discussed as MAGE2 and MAGE2's
>>> TAB representation, where we might consider such extensions.
>>>
>>>
>>> 6. Michael's general editing comments, all OK in principle.
>>> ===============
>>> Section 1.2 (ADF)
>>> If the investigation uses arrays for which a description has
>>> been previously provided, cross-references to entries in a public
>>> repository (e.g., an ArrayExpress
>>> accession number) can be included instead of explicit array
>>> descriptions.
>>>
>>> becomes:
>>>
>>> If the investigation uses arrays for which a description has
>>> been previously provided, cross-references to entries in a public
>>> repository (e.g., an ArrayExpress
>>> accession number), such as a standard commercial array, can be included
>>> instead of explicit array descriptions.
>>> ===
>>> paragraph beginning with "The main weight..." in the e.g.
>>> it looks like
>>> 'row' should be 'raw'
>>> ===
>>> Section 1.2 ('The degree of nodes')
>>> One example has the source nodes having 10 outgoing nodes, so it and
>>> reference nodes both might have a large number plus the usual max
>>> outside of source and reference nodes is probably more like 4 than 3.
>>> ===
>>> Many of the figures (1,4,7,20.b,22,etc) don't have all the rows and
>>> columns with clear separator lines.
>>> ====
>>> 2.3.6
>>> the example is confusing to me, it is the variation in ChIP-chip which
>>> probably is better as one diagram to show the gap, i think a better
>>> example is when there are a lot of annotation columns where breaking it
>>> up clearly on a sample or extract as the last column and beginning with
>>> that same column in the second file might be less confusing.
>>> ===
>>> 2.3.7
>>> last sentence says "Alternatively...", shouldn't that be "In
>>> addition..."?
>>> ===
>>> 2.4
>>> 1st para 2nd sentence says "abundance", wouldn't "presence" be better?
>>> ===
>>> 2.3.5 and Notes on Table 7
>>> "gaps (or the - symbol)"
>>> might be clearer
>>> "gaps (or the - symbol) separated by tabs"
>>> ===
>>> 2.4
>>> 3rd para 2nd sentence says 'Composite Elements and Reporters' and figure
>>> in 2.5 has column Composite Element Name before Map2Reporter.
>>>
>>> stylistically (and for clarity) it might be more consistent to always
>>> have a Reporter mention before a Composite Element mention (sorry, my
>>> english master degree speaking out)
>>> ===
>>> 3.1, 5th bullet
>>> if annotation files are added, mention annotation files here in addition
>>> ===
>>> new section 3.1.3 added to mention annotation files
>>> ===
>>> Figure 1 and 24,
>>> if annotation files added, adding to figures and example file
>>> ===
>>> 3.1.5
>>> add at end that "this allows specifying <authority in these cases".
>>> some of the earlier sections in 3.1 might do to mention how different
>>> <authority modifiers to the <name field come in.
>>> ===
>>> 3.2.3
>>> end of first sentence add "and one or more ArrayDesigns"
>>> ===
>>> 3.3.1
>>> 3rd para, 5th sentence(?) "umber" should be "number"
>>> ===
>>> 3.3.2
>>> para after figure 26, it is also possible in distinguishing type that
>>> when there are two different types at the same level, to resolve this
>>> just means moving the node representation to a higher level where there
>>> is already a matching type.
>>> ===
>>> table 7
>>> this is a bit confusing, might be better to have a table of the top,
>>> non-modifying columns, then the set of columns that modify the top level
>>> columns, then the set of columns that modify that set and so on.
>>>
>>>
>>>
>>>
>>
>>
>> _______________________________________________
>> Mged-MAGE2 mailing list
>> Mge...@li...
>> https://lists.sourceforge.net/lists/listinfo/mged-mage2
>>
>
> --
> Helen Parkinson, PhD
> Curation Coordinator
> Microarray Informatics Team,
> EBI
>
> EBI 01223 494672
> Skype: helen.parkinson.ebi