Requested via the mailing list. Fairly easy to do.
Is there a way, with the EMBOSS programs, to list restriction enzymes that cut within one region
but do not cut within another region?
I thought that restrict should be able to do this, but it doesn't seem to or I'm missing
something. GCG's mapsort -excl does this.
Your point about "not directly" inspired me to generate an alias in a .src file. If one keeps a
list of favorite restriction enzymes in a file called lmbenz.dat (containing one restriction
enzyme name per line) in a directory like ~/sequences, then the command:
re t.gbk 300 500
will generate restrict output showing only the enzymes that do not cut within the region 300 to 500
of the sequence named as the first parameter (t.gbk in this case).
alias re 'restrict -auto \!:1 re.tmp -enz=@~bcitron/sequences/lmbenz.dat -sb \!:2 -sen \!:3 -rformat2 simple; grep name re.tmp| sed s/"Enzyme_name: "//g > rel.tmp; grep -v -f rel.tmp ~bcitron/sequences/lmbenz.dat > rene.dat; restrict -auto \!:1 stdout -enz=@rene.dat'
There are probably better ways to accomplish this. With the intermediate files, one should also be
able to show the excluded enzymes and other enzyme sets, and there may be output options for
restrict that need less grep and sed cleanup. It should certainly be possible to adjust this for
multiple exclusion ranges, etc.
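
As a rough sketch of the multiple-range idea, the same steps could be written as Bourne shell
functions instead of a csh alias. The function names noncutters and re_excl are made up for this
example, and the restrict options are only the ones already used in the alias above. It assumes
the simple report format prints one "Enzyme_name: NAME" line per hit, as the sed step in the alias
implies. Names are matched as whole lines (grep -x -F), so excluding EcoRI does not also drop
EcoRII. Treat it as a starting point, not a checked replacement.

```shell
#!/bin/sh
# Hypothetical sh port of the "re" alias; function names are illustrative only.

# noncutters REPORT FAVORITES
# Print the names in FAVORITES that never appear on an "Enzyme_name:" line
# of REPORT (a restrict report written with -rformat2 simple).
noncutters() {
    nc_report=$1
    nc_favorites=$2
    nc_cutters=$(mktemp) || return 1
    # Keep only the enzyme names, one per line, without duplicates.
    sed -n 's/^Enzyme_name: *//p' "$nc_report" | sort -u > "$nc_cutters"
    if [ -s "$nc_cutters" ]; then
        # -F: fixed strings, -x: whole-line match, -v: keep the non-matching names.
        grep -v -x -F -f "$nc_cutters" "$nc_favorites"
    else
        # Nothing cut in the excluded region(s), so every favorite is kept.
        cat "$nc_favorites"
    fi
    rm -f "$nc_cutters"
}

# re_excl SEQUENCE FAVORITES BEGIN END [BEGIN END ...]
# Run restrict once per excluded range, pool the reports, then rerun
# restrict on the whole sequence with only the enzymes that never cut there.
re_excl() {
    re_seq=$1
    re_favorites=$2
    shift 2
    re_part=$(mktemp) || return 1
    re_all=$(mktemp) || return 1
    re_keep=$(mktemp) || return 1
    while [ $# -ge 2 ]; do
        restrict -auto "$re_seq" "$re_part" -enz=@"$re_favorites" \
            -sb "$1" -sen "$2" -rformat2 simple
        cat "$re_part" >> "$re_all"
        shift 2
    done
    noncutters "$re_all" "$re_favorites" > "$re_keep"
    restrict -auto "$re_seq" stdout -enz=@"$re_keep"
    rm -f "$re_part" "$re_all" "$re_keep"
}
```

After sourcing the file, a call such as
re_excl t.gbk ~/sequences/lmbenz.dat 300 500 900 1000
would exclude enzymes cutting in either 300 to 500 or 900 to 1000. The sketch does not handle the
case where every favorite enzyme is excluded and the final list is empty.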