Allyson Lister - 2014-11-26

Both "plain text file format" and "ASCII format" seem redundant. EDAM's "plain text format (unformatted)" is enough and is much clearer: that class is intended for files which have no formatting requirements. The SWO "plain text file format" hierarchy is awkward and not in keeping with the rest of the "text file format" hierarchy.

plain text file format
    ASCII format
        .raw files
        .CDF ASCII
        .CEL ASCII
        Data File Standard for Flow Cytometry
            FCS3.0
    tab delimited file format
        cdt
        gct
        gpr format
        gtr
        MAGE-TAB

None of the other formats are classed under ASCII, and technically ASCII is an encoding scheme, not a format. If we want to keep ASCII, which might be useful (there are a number of character encodings out there!), I think we should move it into a new class called "character encoding scheme", a sibling of "data" and "data format", and link it to formats via a relationship of the type "formatA hasEncoding some encodingB". The new hierarchies would be:

data
data format specification
    [...]
    .raw
    text file format
        [...]
        .CDF ASCII has encoding only ASCII
        .CEL ASCII has encoding only ASCII
        Data File Standard for Flow Cytometry
            FCS3.0 has encoding some ASCII
        tab delimited file format
            cdt
            gct
            gpr format
            gtr
            MAGE-TAB
        [...]
character encoding scheme
    ASCII

The "raw" class has moved up because there are lots of manufacturer-specific versions of raw files, and these may include image data as well as textual data, so it shouldn't remain under the text formats.
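
To make the proposal concrete, here is a rough sketch of the split between formats and encodings as a small in-memory model. It is only an illustration, not SWO's actual OWL: the `OntClass` type, the `has_encoding` field and the `formats_with_encoding` helper are hypothetical names made up for this example, and the "some"/"only" quantifiers are stored as plain strings rather than real OWL restrictions.

```python
from dataclasses import dataclass, field


@dataclass
class OntClass:
    """Hypothetical stand-in for an ontology class (not an OWL API)."""
    name: str
    parent: "OntClass | None" = None
    # Maps a quantifier ("some" or "only") to an encoding class.
    has_encoding: dict = field(default_factory=dict)

    def ancestors(self):
        """Yield the names of all superclasses, nearest first."""
        node = self.parent
        while node is not None:
            yield node.name
            node = node.parent


# Encodings live in their own branch, separate from formats.
encoding_scheme = OntClass("character encoding scheme")
ascii_enc = OntClass("ASCII", encoding_scheme)

# Formats: .raw sits directly under "data format specification".
data_format = OntClass("data format specification")
raw = OntClass(".raw", data_format)
text_format = OntClass("text file format", data_format)
cdf_ascii = OntClass(".CDF ASCII", text_format, {"only": ascii_enc})
cel_ascii = OntClass(".CEL ASCII", text_format, {"only": ascii_enc})
fcs = OntClass("Data File Standard for Flow Cytometry", text_format)
fcs3 = OntClass("FCS3.0", fcs, {"some": ascii_enc})
tab = OntClass("tab delimited file format", text_format)
mage_tab = OntClass("MAGE-TAB", tab)

all_formats = [raw, text_format, cdf_ascii, cel_ascii, fcs, fcs3, tab, mage_tab]


def formats_with_encoding(classes, encoding):
    """Return names of classes that carry any restriction to `encoding`."""
    return [
        c.name
        for c in classes
        if any(e is encoding for e in c.has_encoding.values())
    ]


ascii_formats = formats_with_encoding(all_formats, ascii_enc)
```

With this layout, asking "which formats use ASCII?" becomes a query over the relationship rather than a walk down an "ASCII format" subtree, and `.raw` no longer inherits anything from "text file format".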