#2 Standard synonym set

open
nobody
None
5
2005-03-03
2005-03-03
girlwithglasses
No

The standard synonym set (SSS) would be a selection of words which
could be used in place of the GO wording for a concept. For example, you
might have a GO term for 'regulation' of a process; the standard
synonym set would contain words such as 'control', 'coordination',
'moderation', 'modification', 'modulation' and 'organization'. The
standard synonym set could be used to generate a set of the
possible synonyms for a given term, which could either be integrated
into an expanded GO flat file or into a search engine.

e.g. user searches for 'formation of ethylene'

The standard synonym set records that 'ethylene' is a standard synonym
(SS) of 'ethene' and 'formation' is an SS of 'biosynthesis'. The search
engine looks for 'ethene' and 'biosynthesis' in the term string, and
comes up with GO:000xxxx (whatever the term ID is).
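
A minimal sketch of that lookup, to make the idea concrete. The table
contents, the stop-word list and the function names are all made up for
illustration, and the term ID is left as the same placeholder used above.

```python
# Hypothetical standard synonym set: non-GO word -> GO wording.
STANDARD_SYNONYMS = {
    "ethylene": "ethene",
    "formation": "biosynthesis",
}

# Hypothetical term table; the ID is a placeholder, not a real GO ID.
GO_TERMS = {
    "GO:000xxxx": "ethene biosynthesis",
}

# Assumed list of filler words to drop from the user's query.
STOP_WORDS = {"of", "the", "a", "an"}


def to_go_words(query):
    """Replace each query word with its GO wording, dropping filler words."""
    words = [w for w in query.lower().split() if w not in STOP_WORDS]
    return [STANDARD_SYNONYMS.get(w, w) for w in words]


def search(query):
    """Return IDs of terms whose name contains every converted query word."""
    wanted = to_go_words(query)
    return [term_id for term_id, name in GO_TERMS.items()
            if all(w in name.split() for w in wanted)]
```

With these tables, search("formation of ethylene") converts the query to
'biosynthesis' and 'ethene' and then matches the placeholder term.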

The SSS could be used in both directions: you could use it as
illustrated above, to convert non-GO phrases into GO terminology, or
you could use it the other way around, to create a list of possible
ways of expressing a GO concept (e.g. ethylene biosynthesis,
ethylene anabolism, ethylene formation, etc.).
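
The reverse direction could look something like the sketch below, which
builds every combination of a term's words and their standard synonyms.
The ALTERNATIVES table and the expand function are hypothetical names,
and the entries are just the examples from this request.

```python
from itertools import product

# Hypothetical reverse table: GO wording -> possible standard synonyms.
ALTERNATIVES = {
    "ethene": ["ethylene"],
    "biosynthesis": ["anabolism", "formation"],
}


def expand(term_name):
    """List every phrasing of a term name, starting with the GO wording."""
    # For each word, the choices are the word itself plus its synonyms.
    choices = [[w] + ALTERNATIVES.get(w, []) for w in term_name.split()]
    return [" ".join(combo) for combo in product(*choices)]
```

For 'ethene biosynthesis' this gives six phrasings (two choices for the
first word, three for the second), which shows how quickly the list
grows as synonyms are added.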

Obviously, not all combinations of SS are going to make sense or will
represent colloquial usage - for example, people say 'cell cycle
control' but rarely say 'cell cycle moderation'. There are two
possibilities here: the first is to rate each SS according to how
'synonymous' it is with the main term and, from those ratings, derive a
confidence rating for each generated synonym. E.g. 'ethylene' is an
exact synonym of 'ethene', but
'regulation' and 'modification' have different meanings (modification is
also used to refer to physical alteration), so you could be very
confident that 'ethylene biosynthesis' meant the same as 'ethene
biosynthesis', but less confident that 'modification of ethylene
biosynthesis' meant the same as 'regulation of ethene biosynthesis'.
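
One way the rating could work, purely as a sketch: give each SS a score
between 0 and 1 and multiply the scores of the words in a phrase. The
scores below and the choice of multiplication (rather than, say, taking
the lowest score) are assumptions for illustration, not part of the
proposal.

```python
# Hypothetical rated synonym set: word -> (GO wording, score from 0 to 1).
# The scores are invented for this example.
RATED_SYNONYMS = {
    "ethylene": ("ethene", 1.0),            # exact synonym
    "modification": ("regulation", 0.5),    # overlapping but different
}


def rate(phrase):
    """Convert a phrase to GO wording and return it with a confidence."""
    confidence = 1.0
    go_words = []
    for word in phrase.lower().split():
        # Words not in the set are kept as they are, with full confidence.
        go_word, score = RATED_SYNONYMS.get(word, (word, 1.0))
        go_words.append(go_word)
        confidence *= score
    return " ".join(go_words), confidence
```

Here 'ethylene biosynthesis' maps to 'ethene biosynthesis' with full
confidence, while 'modification of ethylene biosynthesis' maps to
'regulation of ethene biosynthesis' with a lower one.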

The alternative is to ignore the possible divergence in meaning
between term and synonym because, after all, the SSS would not be
able to work out whether a certain phrase makes sense or whether it is
used colloquially. The SSS could be strengthened with some grammatical
rules (and I would like to start incorporating these into the
documentation), e.g. regulation terms should be of the form

[modifier] regulation of [process|function|trait]

but it's never going to be able to work out that people always talk
about 'induction of apoptosis' rather than 'initiation of apoptosis'.
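
A grammatical rule like the one above could be checked with a regular
expression. In this sketch the modifier is assumed to be optional and
limited to 'positive' or 'negative'; both of those are assumptions, and
the pattern and function names are hypothetical.

```python
import re

# Assumed reading of the rule: an optional modifier, then
# 'regulation of', then whatever is being regulated.
REGULATION_RULE = re.compile(
    r"^(?:(?P<modifier>positive|negative) )?"
    r"regulation of (?P<target>.+)$"
)


def parse_regulation(term_name):
    """Return (modifier, target) if the name follows the rule, else None."""
    match = REGULATION_RULE.match(term_name)
    if match is None:
        return None
    return match.group("modifier"), match.group("target")
```

A name such as 'cell cycle regulation' does not follow the rule and
returns None, so a check like this could flag terms or generated
synonyms that need rewording. It says nothing about colloquial usage.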

Discussion