I have recently become involved in a new sequencing project. I used to use Staden, oh, 4-5 years ago. Now I have tried to bring in more than 1500 sequences, running them through pregap4 and then gap4 with a new installation of 1.7.0 on Red Hat, and I am getting some weird results. For example, some sequences are clipped to 50 bases even though they look fine. Others are not clipped at all and go out to 1400+ bases, giving huge regions that obviously match poorly. Some "aligned" sequences are a base off in the automatically generated contigs. When I right-click on a match in the Find Internal Joins contig display, the sequences don't come up aligned, and I have to click the "align" button manually.
I don't know exactly where to start figuring out what is wrong. I have tried assembling repeatedly and get similar results each time. These are AB1 reads. I have tried with and without phred, although the quality clip was by confidence, since the AB files have a confidence value in them. I am using gap4/phrap. Maybe I should try it without phrap?
Is anyone else seeing these problems? Is it our installation, or some parameter I am setting wrong? I have never dived into a project with this much data to start with. It has already been analyzed and some finishing has been started, so there is a combination of vector primers (transposon-mediated methodology on a BAC) and finishing primers, as well as alternative chemistry. Of course, the file names were not created in any systematic way, so I am treating all the sequences the same.