Re: [GUSDEV] additional (AA/NA) sequence attributes

Quoting Chris Stoeckert <sto...@pc...>:

> if these are overlooked attributes that really should be part of  any 
> sequence then we would alter the base NASequence/ AASequence  tables.

Some are, some aren't.  The most notable ones to be included in base AASequence
are min_molecular_weight and max_molecular_weight (just as you already use
min_start and max_start to handle fuzzy locations).  isoelectric_point also
seems like a "should have".

But "hydropathicity_gravy_score" and "aromaticity_score" are just the tip of a
long list of esoteric attributes.  Creating a two-column view such as
DoTS.AASequenceAromaticity (aa_sequence_id, aromaticity_score) for each of them
seems like overkill, and would lead to further proliferation of the schema.
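
To make the weakly typed alternative concrete, here is a minimal sketch of a
single tag/value table holding such attributes, using Python's sqlite3 module
purely for illustration.  The table name AASequenceAttribute, its columns, and
the values are all hypothetical; none of them are part of the GUS schema.

```python
import sqlite3

# Hypothetical tag/value table: one row per (sequence, attribute) pair.
# None of these names exist in GUS; they are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE AASequenceAttribute (
        aa_sequence_id INTEGER NOT NULL,
        attribute_name TEXT    NOT NULL,
        numeric_value  REAL,
        PRIMARY KEY (aa_sequence_id, attribute_name)
    )
    """
)

# Made-up values, only to show the shape of the data.
rows = [
    (1, "aromaticity_score", 0.08),
    (1, "hydropathicity_gravy_score", -0.35),
    (2, "aromaticity_score", 0.11),
]
conn.executemany("INSERT INTO AASequenceAttribute VALUES (?, ?, ?)", rows)


def attributes_for(conn, aa_sequence_id):
    """Return all stored attributes of one sequence as a name -> value dict."""
    cur = conn.execute(
        "SELECT attribute_name, numeric_value FROM AASequenceAttribute "
        "WHERE aa_sequence_id = ? ORDER BY attribute_name",
        (aa_sequence_id,),
    )
    return dict(cur.fetchall())


seq1_attributes = attributes_for(conn, 1)
```

A new esoteric attribute then becomes a new attribute_name value rather than a
new view, at the cost of losing per-attribute column typing.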

> The major argument I would see for weak typed tag values is  that 
> there is a long and arbitrary list of such attributes but I'm  not 
> convinced that this is the case.

You're not convinced because you haven't seen a long and arbitrary list, or
because you don't believe one could actually exist?

On a related topic, how should we handle organellar/compartmental targeting
predictions?  I don't mean signalP predictions, which are easily handled as
locatable features, but rather output from such things as MitoPred and TargetP,
which again provide non-locatable "attributes" of an AASequence.  One route
would be to associate the GO component term via an IEA evidence code - but then
where do the algorithm_id and score(s) go?  Alternatively, these become yet
more AASequence attributes ...
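
As a rough sketch of what would need to be stored in either case, the example
below keeps the predicted GO component term, the algorithm_id and the score
together in one row per prediction.  The table AASequencePrediction, its
columns, the algorithm ids and the scores are hypothetical and chosen only for
illustration; GO:0005739 is the GO term for mitochondrion.

```python
import sqlite3

# Hypothetical table for non-locatable targeting predictions.
# Table and column names are assumptions, not existing GUS objects.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE AASequencePrediction (
        aa_sequence_id INTEGER NOT NULL,
        algorithm_id   INTEGER NOT NULL,
        go_term_id     TEXT    NOT NULL,
        score          REAL,
        PRIMARY KEY (aa_sequence_id, algorithm_id, go_term_id)
    )
    """
)

# Made-up algorithm ids and scores: 101 stands in for MitoPred,
# 102 for TargetP.
predictions = [
    (1, 101, "GO:0005739", 0.85),
    (1, 102, "GO:0005739", 0.91),
    (2, 102, "GO:0005739", 0.40),
]
conn.executemany(
    "INSERT INTO AASequencePrediction VALUES (?, ?, ?, ?)", predictions
)


def predictions_for(conn, aa_sequence_id):
    """Return (algorithm_id, go_term_id, score) rows, best score first."""
    cur = conn.execute(
        "SELECT algorithm_id, go_term_id, score FROM AASequencePrediction "
        "WHERE aa_sequence_id = ? ORDER BY score DESC",
        (aa_sequence_id,),
    )
    return cur.fetchall()


seq1_predictions = predictions_for(conn, 1)
```

Whether such rows belong beside the GO association or among the AASequence
attributes is exactly the open question; the sketch only shows which columns
have to travel together.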

Thanks,

-Aaron