From: Jonathan C. <cra...@pc...> - 2003-05-22 20:05:15
Jessie-

Jessica Kissinger wrote:
> If memory serves me correct, once sequences were loaded into GUS, we
> then retrieved them along with their GUS ID to submit for blast searches.

Yes, I think that's right.

> When we retrieved the sequences, we created a custom format for the
> header line, such that the blast results once generated for these
> sequences could be easily parsed and loaded with the existing plug-in.
>
> Can someone tell me what the format of the fasta header should be,
> i.e is it ">GUSID, External_NA _sequence Name" or the other way around
> and should there be any formatting, tabs, spaces etc. If I remember
> correctly, the blast results were loaded by GUSID not "name", but I
> don't remember.

My recollection is that the defline started as you said, with ">GUSID ".
I don't believe that the format is crucial, because (again, from what I
remember) when you run the plugin to load the BLAST similarities you
supply it with a regular expression that it uses to pick the GUSID (an
na_sequence_id for most of the PlasmoDB searches) out of the defline.
So as long as the regex matches the defline format, you should be OK,
and I don't think that the plugin uses anything on the defline except
for the GUSID.

Jonathan
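
For illustration only, here is a minimal Python sketch of the idea described above: a regular expression picks the GUSID out of a defline and ignores everything else on the line. It is not the plugin's code. The pattern, the function name `extract_gus_id`, and the sample deflines are all assumptions, based on a defline that starts with ">GUSID " where the GUSID is numeric.

```python
import re

# Hypothetical pattern (an assumption, not the plugin's actual regex):
# an optional leading ">", then a run of digits taken to be the GUSID.
DEFLINE_ID_RE = re.compile(r"^>?(\d+)")


def extract_gus_id(defline, pattern=DEFLINE_ID_RE):
    """Return the GUSID captured by the pattern's first group, or None."""
    match = pattern.search(defline)
    if match is None:
        return None
    return int(match.group(1))


if __name__ == "__main__":
    # Made-up deflines; the text after the ID is ignored by the pattern.
    for line in (">12345 some_sequence_name", "67890\tanother name", ">no_id_here"):
        print(repr(line), "->", extract_gus_id(line))
```

Because only the captured group is used, the order and spacing of the remaining defline fields do not matter as long as the pattern still matches the part that holds the ID.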