From: Arnaud K. <ax...@sa...> - 2004-10-08 10:47:56
Steve Fischer wrote:

> Arnaud-
>
> see below.
>
> steve
>
> Arnaud Kerhornou wrote:
>
>> Hi everyone
>>
>> To be able to reproduce the OrthoMCL method, I would like to raise
>> two issues we've got:
>>
>> * The first issue relates to the view where are stored the protein
>> sequences. I was thinking to use the TranslatedAASequence view as
>> this one contains the translated sequences of our gene models. The
>> problem I have is that it is missing a name attribute so I can not
>> match the blast output query and subject names with the data into GUS
>> (I didn't want to use the TranslatedAASequence primary keys as the
>> identifiers of my proteins of interest).
>> Could we add a name attribute to this view ?
>
>
> hmm. not quite following. what would this name be, where would it
> be derived from?

By default we assign the systematic id of the corresponding CDS to the
protein name.

> why not use source_id and/or secondary_identifier?

We could do that, but in any case it would involve modifying the code
of the plugin that loads BLAST output (LoadBlastSimFast.pm) so that it
can look up the sequence entries. At the moment the match is made on
the primary key (which I want to avoid) or on the name attribute. The
source_id attribute would do instead of the name attribute. It must
work for any BLAST (DNA vs DNA or protein vs protein) and for the
various GUS sequence objects we might want to attach similarity data
to. As far as I can see, the source_id attribute is present in all of
them (the AASequenceImp and NASequenceImp tables).

> or, presumably this translated sequence has a relationship back to
> its na sequence (although i don't immediately see that in the schema
> browser), so couldn't you get a name or source_id from there?
>

That would require a more sophisticated query to get the sequence
entry.

>>
>> * The second issue relates to the BLAST output parsing, done by a
>> module called BlastAnal.pm in the CBIL package. This module seems to
>> parse BLAST output file with only one query sequence. I have more
>> than one query sequence reported so I had to change the code of this
>> module to allow more than one query sequence. Can my code be
>> integrated to CBIL package ? Note that I didn't change the interface
>> of this module so it doesn't affect the scripts that are using it,
>> I'm thinking in particular of parseBlastFilesForSimilarity.pl
>
> this sounds ok. how about we just take a quick look at this together
> while you are visiting? then we can fold it into the code base. do
> you want to send it by mail?
>

That's fine, the module is attached.

>>
>> cheers
>> Arnaud
>>
>>
>
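
For anyone following the second issue, the general idea of handling a report with more than one query can be sketched roughly as below. This is only an illustrative Python sketch, not the attached BlastAnal.pm change (that module is Perl and is not reproduced here). The function name `hits_by_query` is hypothetical, and the sketch assumes plain-text BLAST output in which each query section begins with a `Query= <name>` line and each subject alignment begins with a `>` line.

```python
from typing import Dict, Iterable, List, Optional


def hits_by_query(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group subject names by query name in plain-text BLAST output.

    Assumption (not taken from the CBIL code): a new query section starts
    with a 'Query= <name>' line, and each subject alignment starts with a
    '><name>' line. Only the first whitespace-delimited token is kept as
    the name in both cases.
    """
    hits: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in lines:
        if line.startswith("Query="):
            # A new query section begins; remember its name.
            fields = line[len("Query="):].split()
            current = fields[0] if fields else ""
            hits.setdefault(current, [])
        elif line.startswith(">") and current is not None:
            # A subject alignment header inside the current query section.
            fields = line[1:].split()
            if fields:
                hits[current].append(fields[0])
    return hits
```

The names collected this way are whatever appeared in the FASTA headers of the query and subject files, so if those headers carry source_id values they could in principle be matched against the source_id attribute rather than the primary key.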