Does anybody know how to export/save all the readings of a contig in FASTA format?
Thank you in advance for your assistance,
Hello Francois! You are right. It is very difficult to export all the readings from a contig in FASTA format. I have exactly the same problem and I'm working on a program/script (in Lazarus, which is similar to Delphi/Pascal) to solve it. I'm just finishing it (its name is "Fasta Viewer") and, if you can be patient, I hope to put it on my department site as soon as possible (I just have to fix some bugs... but unfortunately I don't have time... you know... lab problems).
But for now I have a temporary solution for you (good if you don't manage a lot of sequences):
1. Open "Gap4"
2. On main menu click on "Options" then on submenu click on "Configure menus"
3. A new pop-up window opens. Choose the "Expert" radio button
4. Click on "OK Permanent"
5. From now on you will always be using the "Expert" configuration. This allows you to export readings:
6. Open your project (File-Open-"your database name.aux")
7. On the Gap4 main menu choose "File"; the submenu now has a new entry, "Extract readings". Click on it!
8. On the new window choose this configuration:
- Input reading names from: list and then "allreadings"
- Destination directory: you can leave "Extract" (Gap4 creates a new folder with that name and extracts all the readings into it)
- Output format: normal
This lets you open all the readings from the Extract folder. But there are two problems (my program solves them :-) but you have to wait...):
1. They are not in FASTA format. For this reason you have to open them one by one and edit them manually in a text editor like Notepad (put ">" before the name, delete all the lines you don't need and save the file with the .aa extension, which marks it as a FASTA file).
2. They are all saved one by one and not together, so you have to combine them by hand using copy and paste... a lot of work, I know, especially if you have a lot of sequences. When my program is finished it will do all of this by itself.
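For what it's worth, the two manual steps above (putting ">" before the name and pasting everything into one file) can probably be scripted. Below is a minimal bash/awk sketch, not a replacement for a proper converter. It assumes the usual experiment file layout: an "ID" line holding the reading name, then an "SQ" line followed by the sequence lines, closed by "//". The function name exp_to_fasta is made up for illustration, and no quality or vector clipping is done.

```shell
#!/usr/bin/env bash
# Sketch: print the sequence of each experiment file as one FASTA record.
# Assumed layout: "ID   <name>", then "SQ", sequence lines, and a closing "//".
exp_to_fasta() {
    awk '
        FNR == 1 { in_seq = 0; name = FILENAME }     # new file: reset state, fall back to the file name
        /^ID /   { name = $2 }                       # reading name from the ID record
        /^SQ/    { in_seq = 1; print ">" name; next } # sequence block starts: emit the FASTA header
        /^\/\//  { in_seq = 0; next }                # "//" closes the sequence block
        in_seq   { gsub(/[ \t]/, ""); if (length($0)) print }  # strip the layout spaces
    ' "$@"
}
```

Run from inside the Extract folder, something like `exp_to_fasta $(cat fofn) > my_seqs.fasta` should then give one multi-FASTA file, assuming the folder contains the file of filenames (fofn) written by Extract readings and the names contain no spaces.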
Sorry for the errors, but it is very late, I'm tired and I wrote this very fast....
I'd agree with Filip's detailed post; however, I would create the list of readings for a particular contig using the
Lists > Contigs to readings
option to create a contig-specific list instead of "allreadings".
The files produced by 'Extract Readings' are in exp (experiment file) format, and various software packages will accept these as a form of the EMBL format and convert them to FASTA format without problems.
Staden does have a utility, extract_seq, that should convert from exp to FASTA and can also usefully remove the quality-clipped and vector-clipped regions from the ends of the sequences.
Unfortunately 'extract_seq' currently seems to have some problems and doesn't work as described, so the way I used to use it (just running the Extract Sequence module from pregap4 by itself on the exp files) no longer works. (I do now have a rewritten version of the extract_seq.p4m file which works for me; let me know if that is of interest.)
Anyway, if you are running Staden on Linux, a (nasty) bash command like:
while read -r exp ; do extract_seq -fasta_out -good_only "$exp" >> my_seqs.fasta ; done < fofn
should work for you (if run in the Extract directory mentioned above) to convert everything to FASTA format.
Here fofn is the file of filenames generated by the extract readings process and my_seqs.fasta is the output file for the reformatted seqs.
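If the one-liner above feels too fragile, the same loop can be wrapped in a small function. This is only a sketch under a few assumptions: the function name convert_fofn and the EXTRACT_SEQ override variable are invented here for illustration, and the extract_seq flags are simply copied from the command above rather than checked against every Staden release. It skips blank lines and unreadable files, and starts from an empty output file so that reruns do not append duplicates.

```shell
#!/usr/bin/env bash
# Sketch: convert every experiment file listed in a file of filenames
# into one FASTA file. EXTRACT_SEQ (hypothetical) lets you point at a
# different converter; by default extract_seq is looked up on the PATH.
convert_fofn() {
    local fofn=$1 out=$2 name
    : > "$out"                                  # truncate the output file first
    while IFS= read -r name || [ -n "$name" ]; do
        [ -n "$name" ] || continue              # skip blank lines
        if [ ! -r "$name" ]; then
            echo "skipping unreadable file: $name" >&2
            continue
        fi
        # Flags taken from the command above; one FASTA record is appended per reading.
        "${EXTRACT_SEQ:-extract_seq}" -fasta_out -good_only "$name" >> "$out"
    done < "$fofn"
}
```

Run in the Extract directory, the call would look like `convert_fofn fofn my_seqs.fasta`.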
I've just noticed that James has actually just fixed the extract_seq problem in his latest io_lib update (9/7/08) which means the Pregap4 module Extract Sequences will work again for converting exp files to fasta.