A request from Prudence to add a new grouping term to remap some existing 'binding ; GO:0005488' annotations.
small molecule binding ; GO:NEW
We have 'small molecule metabolism ; GO:0044281' terms in process. 'small molecule transport ; GO:0006832' has been obsoleted.
'small molecule ; GOCHE:0090012' has been obsoleted from GOCHE.
Based on the current GO process terms, 'small molecule binding ; GO:NEW' would have the following children:
alcohol binding ; GO:0043178
organic acid binding ; GO:0043177
urea binding ; GO:0033219
vitamin binding ; GO:0019842
I suspect it would need a few more children. Is it a useful grouping term? If so, we'd need to tighten our definition of 'small molecule' in GO. It's currently:
any monomeric molecule of small relative molecular mass.
Becky
Question (not argument). "Small molecule transport" was obsoleted because it was an unnecessary grouping term. What is the difference between binding and transport that the one needs grouping and the other doesn't?
We added it back at a later point because it turned out it was useful for some people! At least in relation to metabolic processes. Val should comment.
Definitions (also picking up on Harold's comment in the discussion of SF 2106200).
Michael made a distinction that looks really useful here, between "encoded" molecules and "unencoded" ones. The former are DNA, RNA, proteins and perhaps covalently modified forms of these. The latter is everything else. Here, I think we're talking about the latter, and I think the encoded / unencoded distinction handles edge cases well. Oligopeptides like small mammalian peptide hormones, derived by proteolytic cleavage of an encoded larger protein, are encoded. Polypeptides of similar size synthesized by other mechanisms, like glutathione or some fungal peptide antibiotics, are not. (And that seems right - we'd want to describe a metabolic origin for the latter but not the former.) Likewise glycogen, starch and cellulose are not, and again the conventional textbook descriptions of the generation and turnover of these molecules group them with small-molecule metabolism.
Thinking more, a reason for the transport / binding discrepancy becomes clear: the word transport is applied to processes that move unencoded molecules. When an encoded one moves, some other word is used - protein import / export across the nuclear envelope, protein import into mitochondria, vesicle-mediated protein secretion from cells, pinocytosis- and endocytosis-mediated uptake of proteins and complexes built around proteins. That leaves cellulose, glycogen, and starch as orphans, but do these entities ever travel between cellular compartments or in and out of cells?
Notice that large carbohydrates are solidly on the small-molecule (unencoded) side here - I'm pretty sure that works fine for mammals, but is it OK for yeasts and other fungi?
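To make the encoded / unencoded split concrete, here is a minimal sketch of how the examples above would sort. The route labels ("template_directed", "processed_from_encoded", "other_mechanism") and the function name are made up for illustration and are not GO vocabulary; the only rule applied is the one described above, that processing products of encoded molecules count as encoded.

```python
from enum import Enum


class Origin(Enum):
    ENCODED = "encoded"
    UNENCODED = "unencoded"


# Hypothetical labels for how a molecule arises; these are not GO terms.
# "template_directed": DNA, RNA, or protein made from a genomic template.
# "processed_from_encoded": e.g. proteolytic cleavage of an encoded protein.
ENCODED_ROUTES = {"template_directed", "processed_from_encoded"}


def classify(synthesis_route: str) -> Origin:
    """Return ENCODED for template-directed molecules and their processing
    products; everything else is treated as UNENCODED."""
    if synthesis_route in ENCODED_ROUTES:
        return Origin.ENCODED
    return Origin.UNENCODED


# Examples taken from the comment above, tagged with the assumed route labels.
EXAMPLES = {
    "small mammalian peptide hormone": "processed_from_encoded",
    "glutathione": "other_mechanism",
    "fungal peptide antibiotic": "other_mechanism",
    "glycogen": "other_mechanism",
    "starch": "other_mechanism",
    "cellulose": "other_mechanism",
}

for name, route in EXAMPLES.items():
    print(f"{name}: {classify(route).value}")
```

Under this reading the peptide hormone is the only encoded entry in the list, and the large carbohydrates land on the unencoded side along with glutathione.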
Peter,
I like it!
Val
I like Peter's distinction between encoded and non-encoded. The building blocks of DNA (nucleosides etc.) are generally considered small molecules, though.
Wikipedia (I'm not saying it's a definitive reference, just looked it up out of interest) defines a small molecule as
A low molecular weight, organic, non-polymeric compound ....
(i.e. it includes monosaccharides but not polysaccharides as small molecules; this fits with what we currently have in GO). The lines seem blurred for disaccharides, but I'm tempted to exclude them.
Created: small molecule binding ; GO:0036094
Gave it the following is_a children, based on the children of the existing small molecule metabolism terms:
alcohol binding ; GO:0043178
nucleobase binding ; GO:0002054
nucleoside binding ; GO:0001882
nucleotide binding ; GO:0000166
organic acid binding ; GO:0043177
urea binding ; GO:0033219
vitamin binding ; GO:0019842
Defined small molecule as:
A low molecular weight, monomeric, non-encoded molecule.
Comment: Small molecules in GO include monosaccharides but exclude disaccharides and polysaccharides.
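For anyone comparing this against the original request, a quick sketch of the difference between the children first proposed and the is_a children actually given to GO:0036094. The IDs and names are copied from this thread; the variable names are only illustrative.

```python
# Children listed in the original request.
PROPOSED = {
    "GO:0043178": "alcohol binding",
    "GO:0043177": "organic acid binding",
    "GO:0033219": "urea binding",
    "GO:0019842": "vitamin binding",
}

# is_a children given to 'small molecule binding ; GO:0036094' on creation.
CREATED = {
    "GO:0043178": "alcohol binding",
    "GO:0002054": "nucleobase binding",
    "GO:0001882": "nucleoside binding",
    "GO:0000166": "nucleotide binding",
    "GO:0043177": "organic acid binding",
    "GO:0033219": "urea binding",
    "GO:0019842": "vitamin binding",
}

# Terms present in the created set but not in the request, and vice versa.
added = {go_id: name for go_id, name in CREATED.items() if go_id not in PROPOSED}
dropped = {go_id: name for go_id, name in PROPOSED.items() if go_id not in CREATED}

for go_id in sorted(added):
    print(f"added: {added[go_id]} ; {go_id}")
print(f"dropped: {len(dropped)}")
```

The three additions are the nucleobase, nucleoside and nucleotide binding terms; nothing from the original request was left out.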
"Small molecule" is definitely a slippery term here. Do you really want to include cholesterol (mw = 386.65) or a cholesterol ester but exclude sucrose (mw = 342.30) because it is a disaccharide? Or exclude a fungal peptide antibiotic (not synthesized on ribosomes from mRNA, so it's not a small protein either)? For that matter, we definitely want to be able to annotate a protein that binds bacterial lipopolysaccharide, but LPS also is an orphan - too big to be small, but not a genome-encoded DNA, RNA, or protein.
The distinction between encoded and not_encoded molecules might be more robust (where "encoded" includes processing products).
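A rough sketch of where the two candidate rules disagree on the examples above. The true/false flags are one reading of this thread and not curated data: the thread gives no molecular weight cutoff, so "low_mw" is a judgment call, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Molecule:
    name: str
    low_mw: bool     # judged "low molecular weight" (no cutoff given in the thread)
    monomeric: bool  # False for di-, oligo- and polymers
    encoded: bool    # DNA, RNA, protein, or a processing product of one


def small_by_definition(m: Molecule) -> bool:
    """Rule from the new term: low molecular weight, monomeric, non-encoded."""
    return m.low_mw and m.monomeric and not m.encoded


def small_by_encoding_only(m: Molecule) -> bool:
    """Alternative rule: anything that is not encoded."""
    return not m.encoded


# Flags below are assumptions based on how each case is described above.
CASES = [
    Molecule("cholesterol", low_mw=True, monomeric=True, encoded=False),
    Molecule("sucrose", low_mw=True, monomeric=False, encoded=False),
    Molecule("fungal peptide antibiotic", low_mw=True, monomeric=False, encoded=False),
    Molecule("bacterial lipopolysaccharide", low_mw=False, monomeric=False, encoded=False),
]

# Names of the cases on which the two rules give different answers.
disagreements = [
    m.name for m in CASES if small_by_definition(m) != small_by_encoding_only(m)
]
print(disagreements)
```

With these flags the rules agree only on cholesterol; sucrose, the peptide antibiotic and LPS are covered by the encoded / not_encoded rule but fall outside the current definition.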