Hi,
Is this correct?
"
1. It was agreed at the NCI workshop to go for the 'vertical format',
for the reason that it is easier to view and scroll vertically than
horizontally;
Which format is easier to scroll is, perhaps, a matter of taste. What
is *not* open to debate is that the horizontal format will fail to
catch a major class of invalid entries because of its inability to do
type checking. The choice of a horizontal format guarantees that the
chief priority of MAGE-TAB -- as a way for researchers to enter and
supply information -- will be badly compromised.
"
I thought the horizontal format *will* do error and type checking. What
is meant by 'horizontal' format? Column headers along the top--right?
Joe
Don Maier wrote:
> Hi Alvis,
>
> I fear that some of your points glide past some critical
> considerations (see below).
> But I do appreciate your willingness to keep this issue open!
>
> Regards,
> Don M.
>
> On May 20, 2006, at 12:39 AM, Alvis Brazma wrote:
>
>> Folks,
>>
>> 1. It was agreed at the NCI workshop to go for the 'vertical
>> format', for the reason that it is easier to view and scroll
>> vertically than horizontally;
>
>
> Which format is easier to scroll is, perhaps, a matter of taste.
> What is *not* open to debate is that the horizontal format will fail
> to catch a major class of invalid entries because of its inability
> to do type checking. The choice of a horizontal format guarantees
> that the chief priority of MAGE-TAB -- as a way for researchers to
> enter and supply information -- will be badly compromised.
>
>>
>> 2. It is a trivial operation to transpose a matrix, so if anybody
>> wants it horizontally for his/her use, they can do it and then
>> transpose back;
>
>
> Of course any one of us can *program* a matrix transpose. But we
> ignore the prime intent of MAGE-TAB if we expect researchers trying
> to enter data to attempt this. And while transpose is an operation
> in Excel, it does not change the columns' role in defining data
> types. Nor does this change any spreadsheet user's expectations
> that the column headers will be at the top!
>
>>
>> 3. We did agree to revisit and finalize some of the details in the
>> next jamboree in Hinxton in July;
>>
>> 4. IDF is the very simplest of the MAGE-TAB formats, therefore I
>> think it is overkill to spend too much of the discussion time on
>> this particular issue.
>
>
> Unfortunately, small design errors can (and often do) undermine
> entire projects. The fact that IDF holds fewer data than ADF or EDF
> does not make it any less critical in a coherent design.
>
>>
>> Personally I do not mind which format we use, but for now it's
>> 'vertical' as agreed, and if there are compelling reasons to
>> transpose it, let's do it in Hinxton.
>>
>> Cheers,
>> -Alvis
>>
>>
>> On Fri, 19 May 2006, Catherine Ball wrote:
>>
>>> When I mocked up the initial stab at the IDF, it was done
>>> horizontally for no particular reason -- maybe because it fit
>>> better on my laptop monitor during the single afternoon I was
>>> working on it. Unless there are compelling reasons to stick with
>>> the horizontal arrangement, I see no reason not to change it to be
>>> easier to use and in the same orientation as the other files (ADF,
>>> EDF).
>>>
>>> Cathy
>>>
>>> On May 19, 2006, at 2:41 PM, Don Maier wrote:
>>>
>>>> Hi Alvis,
>>>> In light of my previous problems getting my thoughts out, please
>>>> let me resubmit my comments on the Investigation Design Format
>>>> (IDF) in response to your note:
>>>> I'd like to resurrect the discussion of the so-called "vertical"
>>>> versus "horizontal" format for the IDF. I'm doing this
>>>> reluctantly -- only because this issue really is too important to
>>>> leave without making sure that we understand the implications of
>>>> adopting a "vertical" format.
>>>> By "vertical" we mean that the column headers are arranged
>>>> vertically -- on the left -- as opposed to the "horizontal" format
>>>> in which the column headers are on the top. I think that some of
>>>> us may have thought that this is a matter of taste in data display
>>>> -- especially for IDF information which has a lot of headers and
>>>> few (one, two or three) rows.
>>>> Unfortunately, this choice is *not* just a matter of taste, because:
>>>> 1) The vertical format does not allow a spreadsheet program to
>>>> impose or check data types. Data type checking is one of the
>>>> chief attributes that make spreadsheets easy to use -- because it
>>>> facilitates easy correction of a significant class of user data
>>>> entry errors. Consider, for example, the "Public Release Date",
>>>> which we want to be a date, not an arbitrary string; or the PubMed
>>>> ID, which should be a number, or Author List, which should be a
>>>> semi-colon-separated list. At least half (maybe more) of the
>>>> entries have a format that Excel (or most other spreadsheet
>>>> programs) can validate at the point of entry -- provided that the
>>>> validation criteria are specified for a column.
>>>>
>>>> In short, with the "vertical" format where no validation is
>>>> possible, the chances of getting correctly entered data are likely
>>>> near zero.
>>>> 2) (Corollary of 1) The vertical format requires that every
>>>> piece of information be of the same data type -- or actually
>>>> (worse) untyped. Even if all the data are strings now, you can't
>>>> later add a number, or an enumeration (fixed set of values). It
>>>> would be really confusing for users to make one data type
>>>> masquerade as another.
>>>> 3) The vertical format differs from the horizontal format used
>>>> for all the other (ADF, etc.) spreadsheets. This non-uniformity
>>>> is guaranteed to confuse users.
>>>> 4) The vertical format precludes automatic generation of
>>>> mapping code. Automatic code generators rely on the semantics of a
>>>> column being uniform -- because that's how spreadsheets (should)
>>>> work. Sure, you or I can write a parser and translator
>>>> manually. But it is highly likely that it won't be as thoroughly
>>>> tested or bug-free as the automatically generated translator. Nor
>>>> will it be as maintainable -- because a change will require
>>>> another, bug-prone manual effort.
>>>> In short, the vertical IDF format is a *form* of some kind. But
>>>> it's *not* a spreadsheet in any meaningful sense. And it does
>>>> not have the ease of use that we are trying to achieve in the
>>>> MAGE-TAB format.
>>>> I'd urge that we not make this serious design error.
>>>> I would suggest that we break the currently proposed format (which
>>>> is neither a table nor a spreadsheet) into two or three
>>>> spreadsheets within a single .xls file. I think that this would
>>>> very nicely accommodate the desire to collect all this
>>>> information together as "header" information in one "workbook",
>>>> while still segregating the different types of information into
>>>> different "worksheets", which permits them to be effectively
>>>> represented as spreadsheets.
>>>> Regards,
>>>> Don Maier
>>>> Sr. Software Designer/Research
>>>> Dept. of Biochemistry
>>>> Stanford University School of Medicine
>>>> On Apr 27, 2006, at 4:55 AM, Alvis Brazma wrote:
>>>>
>>>>> Dear All,
>>>>> As some of you know, jointly with my colleagues including some of
>>>>> you, I've been working on a draft paper about MAGE-TAB. The
>>>>> latest draft is in the attachment for your comments. The current
>>>>> list of authors includes some of those who have
>>>>> contributed to the proposal substantially, either in the two MAGE
>>>>> workshops or otherwise. I may have missed somebody, and who
>>>>> knows, maybe some may not want to be authors. The authors list
>>>>> is open and will be finalised after the May workshop at NCI, if
>>>>> we decide to go ahead and submit this paper.
>>>>> The MAGE-TAB documentation has been recently updated and is
>>>>> available from http://www.mged.org/Workgroups/MAGE/mage.html#mage-tab
>>>>> All comments are most welcome either before or during the NCI
>>>>> workshop.
>>>>> Cheers,
>>>>> - Alvis
>>>>> <mage-tab-paper-draft1.doc>
>>>>
>>>
>
> Don Maier
> Sr. Software Designer/Research
> Dept. of Biochemistry
> Stanford University School of Medicine
>
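
For readers following the thread, the disagreement over per-column type checking and transposition can be illustrated with a small sketch. It is not part of MAGE-TAB or of any tool mentioned above. The three field names come from Don Maier's message; the validator functions, the ISO date format, and the sample values are all assumptions made up for illustration.

```python
from datetime import date

# Hypothetical validators, one per field named in the thread.
def is_iso_date(value):
    # Assumes dates are written as YYYY-MM-DD (an illustrative choice).
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False

def is_integer(value):
    return value.isdigit()

def is_author_list(value):
    # Semicolon-separated list with no empty entries.
    return all(part.strip() for part in value.split(";"))

VALIDATORS = {
    "Public Release Date": is_iso_date,
    "PubMed ID": is_integer,
    "Author List": is_author_list,
}

def transpose(rows):
    # Swap rows and columns; zip truncates to the shortest row.
    return [list(col) for col in zip(*rows)]

def check_horizontal(rows):
    # The first row holds the headers; each later row is one record.
    # Every value is checked with the validator of its column header.
    headers, *records = rows
    errors = []
    for number, record in enumerate(records, start=1):
        for header, value in zip(headers, record):
            check = VALIDATORS.get(header)
            if check is not None and not check(value):
                errors.append((number, header, value))
    return errors

# "Vertical" layout: headers down the left, one record per column.
vertical = [
    ["Public Release Date", "2006-07-01", "next July"],
    ["PubMed ID", "12345678", "n/a"],
    ["Author List", "Author A; Author B", "Author C"],
]

# "Horizontal" layout: headers along the top, one record per row.
horizontal = transpose(vertical)
errors = check_horizontal(horizontal)
print(errors)
```

A program can transpose either layout into the other and then validate by header, which is Alvis's point 2. Don's objection concerns what a spreadsheet program itself will check at the point of entry, which this sketch does not model.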