Thread: Re: [Pydev-code] [Pydev-cvs] org.python.pydev.core/src/org/python/pydev/core/structure FastStringB
From: Radim K. <ra...@ku...> - 2008-06-15 05:51:05
Hello,

I noticed this interesting commit and want to ask:

Why do you think StringBuffer performance is a problem (for PyDev)?
Why do you think this FastStringBuffer is better?
Did you consider using StringBuilder?
What are the results of your measurements?

A few things to note: be careful to do enough runs to avoid results skewed by interpreted runs before the JIT compiles the code.

Look at the difference between the server and client VM (on the server VM I do not see any difference).

Check various JDKs - StringBuffer can be slower on JDK 1.5 but gets to a comparable level on JDK6 with its biased locking.

Generally it can happen that the JIT will optimize your code completely and remove it ;-) It is really hard to write a correct microbenchmark for Java code.

I do not see where you saved allocations/GC (the javadoc says that this new class is more effective). Even if you do - GC of an object dying in eden is not a problem. Only surviving objects are important.

-Radim

On Sat, Jun 14, 2008 at 3:14 PM, Fabio Zadrozny <fa...@us...> wrote:
> Update of /cvsroot/pydev/org.python.pydev.core/src/org/python/pydev/core/structure
> In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv27891/src/org/python/pydev/core/structure
>
> Added Files:
>     FastStringBuffer.java
> Log Message:
> Using faster version of StringBuffer: FastStringBuffer / Better icons for auto-import.
>
> --- NEW FILE: FastStringBuffer.java ---
> package org.python.pydev.core.structure;
>
> /**
>  * This is a custom string that works around char[] objects to provide minimum allocation/garbage collection overhead.
>  * To be used mostly when several small concatenations of strings are used and in local contexts while reusing the
>  * same object to create multiple strings.
>  *
>  * @author Fabio
>  */
> public final class FastStringBuffer {
>
>     /**
>      * Holds the actual chars
>      */
>     private char[] value;
>
>     /**
>      * Count for which chars are actually used
>      */
>     private int count;
>
>     /**
>      * Initializes with a default initial size (128 chars)
>      */
>     public FastStringBuffer() {
>         this(128);
>     }
>
>     /**
>      * An initial size can be specified (if available and given for no allocations it can be more efficient)
>      */
>     public FastStringBuffer(int initialSize) {
>         this.value = new char[initialSize];
>         this.count = 0;
>     }
>
>     /**
>      * initializes from a string and the additional size for the buffer
>      *
>      * @param s string with the initial contents
>      * @param additionalSize the additional size for the buffer
>      */
>     public FastStringBuffer(String s, int additionalSize) {
>         this.count = s.length();
>         value = new char[this.count + additionalSize];
>         s.getChars(0, this.count, value, 0);
>     }
>
>     /**
>      * Appends a string to the buffer
>      */
>     public FastStringBuffer append(String string) {
>         int strLen = string.length();
>
>         if (this.count + strLen > this.value.length) {
>             resizeForMinimum(this.count + strLen);
>         }
>         string.getChars(0, strLen, value, this.count);
>         this.count += strLen;
>
>         return this;
>     }
>
>     private void resizeForMinimum(int minimumCapacity) {
>         int newCapacity = (value.length + 1) * 2;
>         if (newCapacity < 0) {
>             newCapacity = Integer.MAX_VALUE;
>         } else if (minimumCapacity > newCapacity) {
>             newCapacity = minimumCapacity;
>         }
>         char newValue[] = new char[newCapacity];
>         System.arraycopy(value, 0, newValue, 0, count);
>         value = newValue;
>     }
>
>     public final FastStringBuffer append(int n) {
>         append(String.valueOf(n));
>         return this;
>     }
>
>     public final FastStringBuffer append(char n) {
>         if (count + 1 > value.length) {
>             resizeForMinimum(count + 1);
>         }
>         value[count] = n;
>         count += 1;
>         return this;
>     }
>
>     public final FastStringBuffer append(long n) {
>         append(String.valueOf(n));
>         return this;
>     }
>
>     public final FastStringBuffer append(boolean b) {
>         append(String.valueOf(b));
>         return this;
>     }
>
>     public FastStringBuffer append(char[] chars) {
>         if (count + chars.length > value.length) {
>             resizeForMinimum(count + chars.length);
>         }
>         System.arraycopy(chars, 0, value, count, chars.length);
>         count += chars.length;
>         return this;
>     }
>
>     public FastStringBuffer append(FastStringBuffer other) {
>         append(other.value, 0, other.count);
>         return this;
>     }
>
>     public FastStringBuffer append(char[] chars, int offset, int len) {
>         if (count + len > value.length) {
>             resizeForMinimum(count + len);
>         }
>         System.arraycopy(chars, offset, value, count, len);
>         count += len;
>         return this;
>     }
>
>     public FastStringBuffer reverse() {
>         final int limit = count / 2;
>         for (int i = 0; i < limit; ++i) {
>             char c = value[i];
>             value[i] = value[count - i - 1];
>             value[count - i - 1] = c;
>         }
>         return this;
>     }
>
>     public void clear() {
>         this.count = 0;
>     }
>
>     public int length() {
>         return this.count;
>     }
>
>     @Override
>     public String toString() {
>         return new String(value, 0, count);
>     }
>
>     public void deleteLast() {
>         if (this.count > 0) {
>             this.count -= 1;
>         }
>     }
>
>     public char charAt(int i) {
>         return this.value[i];
>     }
>
>     public FastStringBuffer insert(int offset, String str) {
>         int len = str.length();
>         int newCount = count + len;
>         if (newCount > value.length){
>             resizeForMinimum(newCount);
>         }
>         System.arraycopy(value, offset, value, offset + len, count - offset);
>         str.getChars(0, str.length(), value, offset);
>         count = newCount;
>         return this;
>     }
>
>     public FastStringBuffer appendObject(Object attribute) {
>         return append(attribute != null?attribute.toString():"null");
>     }
>
>     public void setCount(int newLen) {
>         this.count = newLen;
>     }
>
> }
>
>
> -------------------------------------------------------------------------
> Check out the new SourceForge.net Marketplace.
> It's the best place to buy or sell services for
> just about anything Open Source.
> http://sourceforge.net/services/buy/index.php
> _______________________________________________
> Pydev-cvs mailing list
> Pyd...@li...
> https://lists.sourceforge.net/lists/listinfo/pydev-cvs
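The measurement advice in this message (run enough iterations for the JIT to kick in, and make sure the work cannot be optimized away) might be sketched roughly as follows. This is only an illustration, not the benchmark from the PyDev test suite: the class name `AppendBench`, the iteration counts, and the choice to compare the JDK's own `StringBuffer` and `StringBuilder` are all assumptions, and `StringBuilder` and `System.nanoTime()` require Java 5 or later.

```java
// Hypothetical harness; none of these names come from PyDev.
final class AppendBench {

    /** One unit of work: build a short string with StringBuffer and return its length. */
    static int withStringBuffer(int i) {
        StringBuffer buf = new StringBuffer();
        buf.append("item").append(i).append(';');
        return buf.toString().length();
    }

    /** The same unit of work with the unsynchronized StringBuilder. */
    static int withStringBuilder(int i) {
        StringBuilder buf = new StringBuilder();
        buf.append("item").append(i).append(';');
        return buf.toString().length();
    }

    /** Runs the work and returns a checksum so the loop has an observable result. */
    static long run(boolean useBuffer, int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            checksum += useBuffer ? withStringBuffer(i) : withStringBuilder(i);
        }
        return checksum;
    }

    public static void main(String[] args) {
        final int iterations = 200000; // arbitrary; tune for the machine at hand
        long sink = 0;
        // Warm-up rounds: give the JIT a chance to compile both paths before timing.
        for (int round = 0; round < 10; round++) {
            sink += run(true, iterations) + run(false, iterations);
        }
        // Timed rounds: report several so that outliers are visible.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            sink += run(true, iterations);
            long t1 = System.nanoTime();
            sink += run(false, iterations);
            long t2 = System.nanoTime();
            System.out.println("StringBuffer ns=" + (t1 - t0) + "  StringBuilder ns=" + (t2 - t1));
        }
        System.out.println("checksum " + sink); // printing the sink keeps the results live
    }
}
```

Running the same class with the client and the server VM, and on more than one JDK, is what would show whether a difference holds up. The timings depend entirely on the machine and VM, so none are quoted here.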
From: Fabio Z. <fa...@gm...> - 2008-06-15 12:37:44
Hello Radim,

> I noticed this interesting commit and want to ask:
>
> Why do you think StringBuffer performance is a problem (for PyDev)?
> Why do you think this FastStringBuffer is better?
> Did you consider using StringBuilder?
> What are the results of your measurements?

StringBuilder is not an option because PyDev needs to support Java 1.4, and PyDev does lots of string concatenations in a good number of cases.

The results here are on Java 1.5... the microbenchmark is committed at
http://pydev.cvs.sourceforge.net/pydev/org.python.pydev.core/tests/org/python/pydev/core/structure/FastStringBufferTest.java?view=markup
(although the benchmark part is commented out).

It's optimized for: creating a buffer, filling it, calling clear() and then reusing it again (which is what the benchmark does).

In the microbenchmark it runs in 0.4 times the time the standard StringBuffer takes on Java 1.5.

Basically, in that case it saves a couple of calls to a super.append method, does not check for null strings and has no length check (so, basically it does less checking to gain speed -- it'll still throw exceptions, but not the exceptions a StringBuffer would give, only those from invalid array accesses). Also, clear() and deleteLast() methods were added, which are also much faster (basically, they just change the marked size of the buffer -- PyDev relies on test cases instead of having checks for abnormal cases, so I'm not sure I'd recommend it as a general-purpose StringBuffer).

> A few things to note: be careful to do enough runs to avoid results
> skewed by interpreted runs before the JIT compiles the code.
>
> Look at the difference between the server and client VM (on the server VM I do
> not see any difference).
>
> Check various JDKs - StringBuffer can be slower on JDK 1.5 but gets to
> a comparable level on JDK6 with its biased locking.
>
> Generally it can happen that the JIT will optimize your code completely
> and remove it ;-) It is really hard to write a correct microbenchmark
> for Java code.

Yep, I've taken that into account, but as you never know which JDK the client will be using, even if in the end it only makes a significant difference on some Java VMs, it's worth it -- as long as it does not get slower on another version... but I'm pretty confident that will not happen in this case -- it's amazing that some versions of Mac OS are bound to JDK 1.4 and you can't update it (at least that's the complaint I got a lot when PyDev started to support only Java 1.5 -- so, 1.4 support was restored).

> I do not see where you saved allocations/GC (the javadoc says that this
> new class is more effective). Even if you do - GC of an object dying
> in eden is not a problem. Only surviving objects are important.

You're right, it does not save allocations by itself (I'll update the docs). The idea is to save allocations in the case the docs say it should be used for: create the buffer, do lots of things, then call clear() and do things again... so, loops should allocate only one buffer and rely on clear() instead of creating a new buffer each time (in that commit a number of places were updated for that case). So, one of the advantages is that I also know which places I've re-reviewed for that case (basically, if a place is not using that version of the buffer, it has not been reviewed for that case -- although some reviewed places kept on using StringBuffer because of features that won't be added to that version).

Cheers,

Fabio
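The create-once, clear(), reuse pattern described in this reply could look roughly like the sketch below. `ReusableBuffer` is a trimmed, hypothetical stand-in for the committed FastStringBuffer (only `append(String)`, `clear()` and `toString()`, with a simplified resize that skips the overflow guard), and `ReuseDemo.qualify` is an invented caller, not code from PyDev.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, cut-down stand-in for the FastStringBuffer in the commit.
final class ReusableBuffer {
    private char[] value = new char[128];
    private int count;

    ReusableBuffer append(String s) {
        int len = s.length();
        if (count + len > value.length) {
            // Simplified growth policy: double, or grow to exactly what is needed.
            char[] bigger = new char[Math.max((value.length + 1) * 2, count + len)];
            System.arraycopy(value, 0, bigger, 0, count);
            value = bigger;
        }
        s.getChars(0, len, value, count);
        count += len;
        return this;
    }

    /** Only resets the used length; the backing char[] is kept for the next use. */
    void clear() {
        count = 0;
    }

    @Override
    public String toString() {
        return new String(value, 0, count);
    }
}

final class ReuseDemo {
    /** Builds "module.name" for every name, reusing one buffer across iterations. */
    static List<String> qualify(String module, String[] names) {
        List<String> out = new ArrayList<String>();
        ReusableBuffer buf = new ReusableBuffer(); // allocated once, outside the loop
        for (String name : names) {
            buf.clear();
            buf.append(module).append(".").append(name);
            out.add(buf.toString()); // toString() still allocates a new String each time
        }
        return out;
    }
}
```

With the standard classes, `StringBuffer.setLength(0)` gives a comparable reset without a custom class. In both cases the saving is limited to the buffer object and its array, since every `toString()` call still creates a new String.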
From: Radim K. <ra...@ku...> - 2008-06-15 18:20:57
That's a great explanation! Thanks. I'll add a few comments.

On Sun, Jun 15, 2008 at 5:37 AM, Fabio Zadrozny <fa...@gm...> wrote:
> Hello Radim,
>
>> I noticed this interesting commit and want to ask:
>>
>> Why do you think StringBuffer performance is a problem (for PyDev)?
>> Why do you think this FastStringBuffer is better?
>> Did you consider using StringBuilder?
>> What are the results of your measurements?
>
> StringBuilder is not an option because PyDev needs to support Java 1.4,
> and PyDev does lots of string concatenations in a good number of cases.
>

Almost every Java program does, so this is indeed a common hot spot.

> The results here are on Java 1.5... the microbenchmark is committed at
> http://pydev.cvs.sourceforge.net/pydev/org.python.pydev.core/tests/org/python/pydev/core/structure/FastStringBufferTest.java?view=markup
> (although the benchmark part is commented out).
>
> It's optimized for: creating a buffer, filling it, calling clear() and
> then reusing it again (which is what the benchmark does).
>
> In the microbenchmark it runs in 0.4 times the time the standard
> StringBuffer takes on Java 1.5.
>

I do not see any actual numbers in the log or in the test source itself, and my runs showed a smaller improvement. This is not a problem. Hopefully there is some task where this change maps to improved perceived performance - a user action that is noticeably faster.

Re reusing: object allocation is basically bumping a pointer and returning it. Only when a thread-local allocation buffer is exhausted does it need to get a new slice of memory (your microbenchmark does not cover any threading).

> Basically, in that case it saves a couple of calls to a super.append
> method, does not check for null strings and has no length check (so,
> basically it does less checking to gain speed -- it'll still throw
> exceptions, but not the exceptions a StringBuffer would give, only
> those from invalid array accesses). Also, clear() and deleteLast()
> methods were added, which are also much faster (basically, they just
> change the marked size of the buffer -- PyDev relies on test cases
> instead of having checks for abnormal cases, so I'm not sure I'd
> recommend it as a general-purpose StringBuffer).
>

Generally, most of these optimizations can be done in HotSpot and some of them really are. Aggressive inlining is common; after that the compiler can analyze the flow and avoid the null check. Array bounds checking can be simpler with loop unrolling. It takes a while, but current compilers are really smart.

>> A few things to note: be careful to do enough runs to avoid results
>> skewed by interpreted runs before the JIT compiles the code.
>>
>> Look at the difference between the server and client VM (on the server VM I do
>> not see any difference).
>>
>> Check various JDKs - StringBuffer can be slower on JDK 1.5 but gets to
>> a comparable level on JDK6 with its biased locking.
>>
>> Generally it can happen that the JIT will optimize your code completely
>> and remove it ;-) It is really hard to write a correct microbenchmark
>> for Java code.
>
> Yep, I've taken that into account, but as you never know which JDK
> the client will be using, even if in the end it only makes a
> significant difference on some Java VMs, it's worth it -- as long as
> it does not get slower on another version... but I'm pretty confident
> that will not happen in this case -- it's amazing that some versions
> of Mac OS are bound to JDK 1.4 and you can't update it (at least
> that's the complaint I got a lot when PyDev started to support only
> Java 1.5 -- so, 1.4 support was restored).
>

The opposite way is: just by upgrading to JDK6, users get performance at the same level as you currently have, with no need for code changes. If you really have a lot of users running JDK 1.4 this is a valid use case (especially if they use pydev ext). Though these folks are in a really troubled situation. Now Apple has released Java 6 and I wonder how long they will support 1.4 on old systems. Well, that's the strange world of Java on Mac.

Regards,
Radim

>> I do not see where you saved allocations/GC (the javadoc says that this
>> new class is more effective). Even if you do - GC of an object dying
>> in eden is not a problem. Only surviving objects are important.
>
> You're right, it does not save allocations by itself (I'll update the
> docs). The idea is to save allocations in the case the docs say it
> should be used for: create the buffer, do lots of things, then call
> clear() and do things again... so, loops should allocate only one
> buffer and rely on clear() instead of creating a new buffer each time
> (in that commit a number of places were updated for that case). So,
> one of the advantages is that I also know which places I've
> re-reviewed for that case (basically, if a place is not using that
> version of the buffer, it has not been reviewed for that case --
> although some reviewed places kept on using StringBuffer because of
> features that won't be added to that version).
>
> Cheers,
>
> Fabio
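To make the "less checking" trade-off discussed in this thread concrete, here is a small sketch of how a null argument behaves with the standard StringBuffer versus an unchecked char[] append written in the style of the commit. `NullAppendDemo`, `uncheckedAppend` and `describe` are hypothetical names used only for this illustration.

```java
// Hypothetical demo class; only StringBuffer and String are real JDK APIs here.
final class NullAppendDemo {

    /** Unchecked append in the style of the commit: no null test, no capacity test. */
    static int uncheckedAppend(char[] target, int count, String s) {
        int len = s.length();              // NullPointerException when s is null
        s.getChars(0, len, target, count); // fails with an index exception if target is too small
        return count + len;
    }

    /** Returns "<StringBuffer result> / <unchecked result or exception name>". */
    static String describe(String s) {
        // StringBuffer.append(String) treats null as the four characters "null".
        String checked = new StringBuffer().append(s).toString();
        String unchecked;
        try {
            char[] target = new char[16];
            int n = uncheckedAppend(target, 0, s);
            unchecked = new String(target, 0, n);
        } catch (NullPointerException e) {
            unchecked = "NullPointerException";
        }
        return checked + " / " + unchecked;
    }
}
```

Here `describe("ab")` yields `ab / ab`, while `describe(null)` yields `null / NullPointerException`. Whether skipping the check buys measurable time on a given VM is exactly the open question raised above, since HotSpot may remove a redundant null check on its own.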