From: Mark G. <mg...@us...> - 2003-06-26 16:41:48
|
Update of /cvsroot/gmod/apollo/src/java/apollo/datamodel In directory sc8-pr-cvs1:/tmp/cvs-serv10599/src/java/apollo/datamodel Modified Files: AbstractSequence.java CacheSeqIdLoader.java EnsCGISequence.java GAMESequence.java Sequence.java Log Message: So I set out to fix the jalview bug where all the sequences were getting off by one on scrolling and it turned into a slippery slope. So Suzi had changed jalviews Sequence (that was independent of apollos Sequence) to AlignSequence which subclassed from apollo.datamodel.Sequence. This is nice because it gets rid of redundant sequence models, but it introduced this off by one bug. The problem was that jalview sequences are zero based and apollo sequence are either one based or they have their own range, but they dont deal with sequences that are zero based(with no range). Theres two things that have to happen with sequence. 1) The end has to be made inclusive by adding one to it. This is because apollos substring ends are inclusive where java Strings ends are exclusive. 2) Before doing java string operations(ie substring), the coordinates have to be set to zero based. If they are one based, one has to be subtracted from start and end. If they are range based, the range start has to be subtracted from start and end. If they are zero based(jalview) dont do anything. I did a little refactoring as I found things a bit confusing how they were and it took me a while to figure out what was going on. AbstractSequence.pegLimits checked if the coordinate was in bounds, but it also had the side effect of doing the shifting and end inclusion. Previously AbstractSequence.getResidues(start,end) called pegLimits(start,true). The true would cause pegLimits to subtract one from the start if the sequence had no range. pegLimits on end was called with false, so it would not get one subtracted. So thus it would both be zero shifted and end included. But end inclusion was only for non-range seqs, so seqs with range would still have to do there own end inclusion, and you just have to know this. I changed pegLimits to do none of the side effects of shifting, and it just checks limits now. I added a makeEndInclusive method (which just adds one, its there to make it really clear why its doing it). I added a needInclusiveEnd method which by default is true. EnsCGISequence overrides this to return false because the cgi takes care of inclusive end (no java string issues). GAMESequence no longer adds one to its end as its now done in AbstractSequence. CacheSeqIdLoader (which can only handle GAMESequence and perhaps should be renamed CacheGameSeqLoader) was adjusted to deal with the inclusive end. I also added shiftFromOneToZeroBased(int) which just subtracts one as well, again its to make explicit what is going on. And theres a needShiftFromOneToZeroBasedCoords() and a setNeedShiftFromOneToZeroBasedCoords(boolean). Seqs that have a range return false for needShiftFromOneToZeroBasedCoords as they do the shifting themselves. Rangeless seqs default to return true for needShifting. Zero based seqs(AlignSeq) setNeedShift to false. While this is clearer than before, now that im writing this im seeing it could be done even clearer, maybe ill fiddle with it a bit more. So now AbstractSequence can deal with zero-based seqs which fixes the bug in jalview. Also another bug fixed is that result seqs were coming up with an extra base pair. It turns out that inclusive end was being compensated for twice, both in AbstractSequence and Sequence. So I took that out of Sequence. Steve,Michele, or Vivek: could you please test these changes on ensembl results (and other data adapters). Either that or get result seqs visible from the outside world - i know the pfetch server is up but its still not installed in apollo to view the result seqs - steve do i have this right? Ive tested it with the genomic from ensembl and thats working ok. I also noticed the very last base of the genomic sequence is not getting drawn, but it will select and find and even draw if coloring by splice site potential, so im going to look into that. Index: AbstractSequence.java =================================================================== RCS file: /cvsroot/gmod/apollo/src/java/apollo/datamodel/AbstractSequence.java,v retrieving revision 1.10 retrieving revision 1.11 diff -C2 -d -r1.10 -r1.11 *** AbstractSequence.java 23 May 2003 22:40:35 -0000 1.10 --- AbstractSequence.java 26 Jun 2003 16:41:15 -0000 1.11 *************** *** 149,155 **** public final String getResidues (int start, int end) { if (start <= end) { // pegging does CHECK > or >= ! start = pegLimits (start, true); ! end = pegLimits (end, false); return getResiduesImpl(start, end); } else { --- 149,166 ---- public final String getResidues (int start, int end) { if (start <= end) { + // order here is important. pegLimits checking expects one or range based + // without the extended end for inclusiveness // pegging does CHECK > or >= ! //start = pegLimits (start, true); ! start = pegLimits (start);//, false); ! end = pegLimits (end);//, false); ! ! if (needInclusiveEnd()) end = makeEndInclusive(end); ! ! if (needShiftFromOneToZeroBasedCoords()) { ! start = shiftFromOneToZeroBased(start); ! end = shiftFromOneToZeroBased(end); ! } ! return getResiduesImpl(start, end); } else { *************** *** 158,178 **** } ! private int pegLimits (int value, boolean begin) { // There was a problem (that came out when I was trying to get BlixemRunner to work) // where some sequences inexplicably had ranges of -1 to -1/ when they should have had null ranges. // I hope this check won't break anything! if (getRange() == null || getRange().getStart() < 0) { ! if (begin) { ! --value; if (value < 0) value = 0; if (value >= getLength()) value = getLength() - 1; ! } else { ! if (value < 1) ! value = 1; ! if (value > getLength()) ! value = getLength(); ! } } else { // System.out.println("pegLimits: range = " + getRange().getStart() + "-" + getRange().getEnd() + ", starting value = " + value); // DEL --- 169,232 ---- } ! /** Whether the end needs adding to, to make it inclusive. EnsCGISequence does not ! need an inclusive end and overrides this to return false. */ ! protected boolean needInclusiveEnd() { return true; } ! ! /** This just adds one to the end. This is because java String.substring ! does not include the end, where in apollo we want the character at the ! end coordinate included in our string. */ ! private int makeEndInclusive(int nonInclusiveEnd) { ! return nonInclusiveEnd + 1; ! } ! ! private boolean needShiftFromOneToZeroBasedCoords = true; ! /** Perhaps this is funny, but if we have a Range then shift wont happen ! regardless of what you set here. This only applies if we have no range */ ! protected void setNeedShiftFromOneToZeroBasedCoords(boolean needShift) { ! needShiftFromOneToZeroBasedCoords = needShift; ! } ! ! /** Returns true if getRange()==null && needShiftFromOneToZeroBasedCoords is true. ! If AbsSeq has a range then the shift will be handled by the subclass dealing ! with the range(for now) (look into migrating range math to AbsSeq) */ ! protected boolean needShiftFromOneToZeroBasedCoords() { ! return getRange()==null && needShiftFromOneToZeroBasedCoords; ! } ! ! /** This just subtracts one from coord to shift from one based(apollo) ! to zero based(java string) coord system. Made this an explicit method for ! clarity sake. */ ! protected int shiftFromOneToZeroBased(int coord) { ! return coord - 1; ! } ! ! ! /** If getRange!=null, checks that value is in range, and if not then puts it in ! range. If no range then does check with 1 to length. If 0 based should override this ! (as jalview.datamodel.AlignSequence does) ! Get rid of begin/one->zero shift here - clever but opaque. ! This now expects coordinates to be one based or range based (not zero based) ! (it used to do the actual one->zero shifting) ! Zero based seqs need to override this mthod (see jalview.datamodel.AlignSeq) ! */ ! protected int pegLimits (int value/*, boolean begin*/) { // There was a problem (that came out when I was trying to get BlixemRunner to work) // where some sequences inexplicably had ranges of -1 to -1/ when they should have had null ranges. // I hope this check won't break anything! if (getRange() == null || getRange().getStart() < 0) { ! /*if (begin) { // DELETE ! //if (usingOneBasedCoordinates()) ! --value; if (value < 0) value = 0; if (value >= getLength()) value = getLength() - 1; ! } ! else { */ ! if (value < 1) ! value = 1; ! if (value > getLength()) ! value = getLength(); ! //} } else { // System.out.println("pegLimits: range = " + getRange().getStart() + "-" + getRange().getEnd() + ", starting value = " + value); // DEL Index: CacheSeqIdLoader.java =================================================================== RCS file: /cvsroot/gmod/apollo/src/java/apollo/datamodel/CacheSeqIdLoader.java,v retrieving revision 1.2 retrieving revision 1.3 diff -C2 -d -r1.2 -r1.3 *** CacheSeqIdLoader.java 10 Sep 2002 12:58:25 -0000 1.2 --- CacheSeqIdLoader.java 26 Jun 2003 16:41:15 -0000 1.3 *************** *** 7,10 **** --- 7,13 ---- import org.bdgp.util.*; + /** This presently will only work with parent as GAMESequence. + Perhaps rename CacheGameLoader? */ + public class CacheSeqIdLoader extends CacheSequenceLoader { *************** *** 13,17 **** } ! // Zero based numbering for sequence. public String getResidues(int low, int high) { RangeI range = ((GAMESequence) parent).getRange(); --- 16,22 ---- } ! // Zero based numbering for sequence. - thats wrong ! // low and high are in range coords (not zero based) and the end is ! // added on to make it inclusive public String getResidues(int low, int high) { RangeI range = ((GAMESequence) parent).getRange(); *************** *** 20,24 **** return ""; ! if (high > range.getHigh()) return ""; --- 25,29 ---- return ""; ! if (high > range.getHigh() + 1) // +1 -> inclusive end return ""; Index: EnsCGISequence.java =================================================================== RCS file: /cvsroot/gmod/apollo/src/java/apollo/datamodel/EnsCGISequence.java,v retrieving revision 1.4 retrieving revision 1.5 diff -C2 -d -r1.4 -r1.5 *** EnsCGISequence.java 4 Mar 2003 16:12:00 -0000 1.4 --- EnsCGISequence.java 26 Jun 2003 16:41:15 -0000 1.5 *************** *** 37,40 **** --- 37,45 ---- + /** AbstractSequence by default will add one to the end on getRes to counteract + java Strings non inclusiveness. EnsCGI already retrieves sequence with the end + included. Returning false disables AbstractSequence's inclusive end. */ + protected boolean needInclusiveEnd() { return false; } + public void setServer(String server) { this.server = server; Index: GAMESequence.java =================================================================== RCS file: /cvsroot/gmod/apollo/src/java/apollo/datamodel/GAMESequence.java,v retrieving revision 1.6 retrieving revision 1.7 diff -C2 -d -r1.6 -r1.7 *** GAMESequence.java 23 May 2003 22:40:35 -0000 1.6 --- GAMESequence.java 26 Jun 2003 16:41:15 -0000 1.7 *************** *** 11,15 **** /** Currently GAMESequence is not lazy. I think it extends AbstractLazySequence ! because it intends to be lazy at some point */ public class GAMESequence extends AbstractLazySequence { String range_residues; --- 11,16 ---- /** Currently GAMESequence is not lazy. I think it extends AbstractLazySequence ! because it intends to be lazy at some point. GAMESequence is the main genomic ! sequence for game data(but not the result sequences) */ public class GAMESequence extends AbstractLazySequence { String range_residues; *************** *** 38,48 **** } ! /* Note: low and high are relative coordinates */ protected String getResiduesFromSourceImpl(int low, int high) { if (range_residues == null) return null; int str_low = low - genomicRange.getStart(); ! int str_high = high - genomicRange.getStart() + 1; if (str_low == str_high) --- 39,55 ---- } ! /* Note: low and high are relative coordinates(coords in the range). ! genomicRange.getStart() is subtracted to transform them into java string zero ! based coords. */ protected String getResiduesFromSourceImpl(int low, int high) { if (range_residues == null) return null; + int str_low = low - genomicRange.getStart(); ! // +1 is for the inclusive end i believe ! // inclusive end is now centralized in AbstractSequence to alleviate confusion ! //int str_high = high - genomicRange.getStart() + 1; ! int str_high = high - genomicRange.getStart(); if (str_low == str_high) Index: Sequence.java =================================================================== RCS file: /cvsroot/gmod/apollo/src/java/apollo/datamodel/Sequence.java,v retrieving revision 1.21 retrieving revision 1.22 diff -C2 -d -r1.21 -r1.22 *** Sequence.java 6 Jun 2003 18:14:33 -0000 1.21 --- Sequence.java 26 Jun 2003 16:41:15 -0000 1.22 *************** *** 70,79 **** return residues; } ! protected String getResiduesImpl(int start, int end) { if (residues != null) { // why plus one? plus one can cause StringIndexOutOfBoundsException if no residueOffset // and end equals length of residues ! int substringEnd = end - residueOffset + 1; // prevents StringIndexOutOfBoundsException if (substringEnd > residues.length()) substringEnd = residues.length(); --- 70,89 ---- return residues; } ! ! /** this does a plus one to the end which i assume is to be inclucsive, the only ! problem i think is AbstractSequence.getResudes calls pegLimits which sutracts ! one from start to ! convert from one based to zero based, but one is not subtracted from the end ! which i think is also an act of inclusivity. By inclusivity i mean that java ! substring includes the start but not end, the apollo standard is to include both. ! This seems to accomodated for twice and is thus adding an extra base. Am I ! right about this? */ protected String getResiduesImpl(int start, int end) { if (residues != null) { // why plus one? plus one can cause StringIndexOutOfBoundsException if no residueOffset // and end equals length of residues ! // + 1 was for inclusive end but thats already in AbstractSequence ! //int substringEnd = end - residueOffset + 1; ! int substringEnd = end - residueOffset; // prevents StringIndexOutOfBoundsException if (substringEnd > residues.length()) substringEnd = residues.length(); |