From: Nathan J. <nr...@gm...> - 2005-06-10 01:37:10
|
Hi, It appears to me, though I haven't looked at the code, that every time the cobertura.ser file is touched by a test for instrumentation the entire file is read in, updated and then written back to disk. This seems to be a hug performance hit when compared to running the test without instrumentation. (in our case about 6-7 times slower, on a build that take 15 minutes without instrumentation this makes it a very long build cycle.) Would it perhaps make more sense to append instrumentation data to the cobertura.ser file (may need to be done a different way if cobertura.ser is a serialized object(s)) and then process all this data at reporting time to aggregate it into class statistics? In other words for each test, when it reads in cobertura.ser and writes it, it creates its own data either appended or in a separate file, and these are all then merged at reporting time. This would reduce disk access and speed up the testing phase and move some of the overhead to when the reporting is done. I suppose one possibility is to make use of different cobertura.ser files for each module and merge them at the end? Would appreciate any comments you might have. Nathan Johns |
From: Grzegorz L. <ha...@gm...> - 2005-06-15 20:39:31
|
The problem here may be implemetation of ProjectData.getOrCreateClassData method, and additionaly ClassData.touch implementation. Both these methods are invoked each time "line of code" is executed. Both methods get/put some information from maps. The result is that with each line of code some operations that are in most cases many times more expensive are executed. Grzegorz On 6/10/05, Nathan Johns <nr...@gm...> wrote: > Hi, >=20 > It appears to me, though I haven't looked at the code, that every time > the cobertura.ser file is touched by a test for instrumentation the > entire file is read in, updated and then written back to disk. This > seems to be a hug performance hit when compared to running the test > without instrumentation. (in our case about 6-7 times slower, on a > build that take 15 minutes without instrumentation this makes it a > very long build cycle.) >=20 > Would it perhaps make more sense to append instrumentation data to the > cobertura.ser file (may need to be done a different way if > cobertura.ser is a serialized object(s)) and then process all this > data at reporting time to aggregate it into class statistics? >=20 > In other words for each test, when it reads in cobertura.ser and > writes it, it creates its own data either appended or in a separate > file, and these are all then merged at reporting time. This would > reduce disk access and speed up the testing phase and move some of the > overhead to when the reporting is done. >=20 > I suppose one possibility is to make use of different cobertura.ser > files for each module and merge them at the end? Would appreciate any > comments you might have. >=20 > Nathan Johns >=20 >=20 > ------------------------------------------------------- > This SF.Net email is sponsored by: NEC IT Guy Games. How far can you sho= tput > a projector? How fast can you ride your desk chair down the office luge t= rack? > If you want to score the big prize, get to know the little guy. > Play to win an NEC 61" plasma display: http://www.necitguy.com/?r > _______________________________________________ > Cobertura-devel mailing list > Cob...@li... > https://lists.sourceforge.net/lists/listinfo/cobertura-devel > |
From: Nathan J. <nr...@gm...> - 2005-06-15 22:05:16
|
After some experimentation we were able to speed up our unit testing time by setting the fork options on the junit task in ant. Previously when the build was setup we required Junit to fork into a separate JVM: <junit fork=3D"true" haltonfailure=3D"true">.... But no one realised this was forking a new JVM for each test when we upgraded to Ant 1.6.2. (Not sure if this was the case for Ant 1.5) So have now changed the build scripts to: <junit fork=3D"true" forkmode=3D"once" haltonfailure=3D"true">.... And the speed improvement in the unit testing phase is amazing.=20 Previously Each test seemed to have an overhead of 4 to 5 seconds for when the instrumentation data was read and written, now they all run very quickly and this overhead occurs once for each fork. (For us that is once for each module.) Thanks for the pointers in resolving this... one of those little quirks of Ant and the junit task. I still think there may be something to consider in what I originally said about appending data but I suppose you can achieve the same affect, with the forkmode option, using a different cobertura.ser file for each module and then merging them at the end. Nathan Johns On 6/16/05, Grzegorz Lukasik <ha...@gm...> wrote: > The problem here may be implemetation of > ProjectData.getOrCreateClassData method, and additionaly > ClassData.touch implementation. Both these methods are invoked each > time "line of code" is executed. Both methods get/put some information > from maps. The result is that with each line of code some operations > that are in most cases many times more expensive are executed. >=20 > Grzegorz >=20 > On 6/10/05, Nathan Johns <nr...@gm...> wrote: > > Hi, > > > > It appears to me, though I haven't looked at the code, that every time > > the cobertura.ser file is touched by a test for instrumentation the > > entire file is read in, updated and then written back to disk. This > > seems to be a hug performance hit when compared to running the test > > without instrumentation. (in our case about 6-7 times slower, on a > > build that take 15 minutes without instrumentation this makes it a > > very long build cycle.) > > > > Would it perhaps make more sense to append instrumentation data to the > > cobertura.ser file (may need to be done a different way if > > cobertura.ser is a serialized object(s)) and then process all this > > data at reporting time to aggregate it into class statistics? > > > > In other words for each test, when it reads in cobertura.ser and > > writes it, it creates its own data either appended or in a separate > > file, and these are all then merged at reporting time. This would > > reduce disk access and speed up the testing phase and move some of the > > overhead to when the reporting is done. > > > > I suppose one possibility is to make use of different cobertura.ser > > files for each module and merge them at the end? Would appreciate any > > comments you might have. > > > > Nathan Johns > > > > > > ------------------------------------------------------- > > This SF.Net email is sponsored by: NEC IT Guy Games. How far can you s= hotput > > a projector? How fast can you ride your desk chair down the office luge= track? > > If you want to score the big prize, get to know the little guy. > > Play to win an NEC 61" plasma display: http://www.necitguy.com/?r > > _______________________________________________ > > Cobertura-devel mailing list > > Cob...@li... > > https://lists.sourceforge.net/lists/listinfo/cobertura-devel > > > |
From: Grzegorz L. <ha...@gm...> - 2005-06-15 22:19:27
Attachments:
cob-gcd.zip.txt
|
If you want to investigate a little, you can run tests with option -Xprof. Some information about execution time of methods will be displayed. If you will see that HashMap.get and String.equals operations are most expensive, then slow execution of tests is caused by the problem I described. (If you are using ant "junit" task, just add <jvmarg value=3D3D"-Xprof"/> inside junit tag). I have made some tests and they show that map operations can be very expensive. Tested program is simplificated algorithmics one, so the problem is much more visible (maybe to much) here than in normal cases. Here are my results: ---------------------------------------------------------------------------= =3D --------------------------- target "run" test execution time: 4 seconds [junit] Compiled + native Method [junit] 73.1% 245 + 0 simple.GCD.gcd [junit] 2.1% 7 + 0 simple.GCD.gcd [junit] 0.3% 0 + 1 Interpreter [junit] 0.3% 1 + 0 java.lang.String.startsWith [junit] 0.3% 1 + 0 java.io.BufferedWriter.write [junit] 0.3% 1 + 0 java.lang.String.indexOf [junit] 0.3% 1 + 0 java.lang.String.hashCode [junit] 76.7% 256 + 1 Total compiled ---------------------------------------------------------------------------= =3D --------------------------- target "run-instrumented" test execution time: 85 seconds [junit] Compiled + native Method [junit] 46.3% 1574 + 0 java.util.HashMap.get [junit] 28.2% 960 + 0 java.lang.String.equals [junit] 13.3% 452 + 0 java.util.HashMap.hash [junit] 9.0% 295 + 12 simple.GCD.gcd [junit] 0.8% 27 + 0 simple.GCD.gcd [junit] 0.0% 0 + 1 java.util.jar.JarFile.hasClassPathAttr= =3D ibute [junit] 0.0% 1 + 0 =3D20 sun.reflect.ClassFileAssembler.emitConstantPoolUTF8 [junit] 97.7% 3309 + 13 Total compiled ---------------------------------------------------------------------------= =3D --------------------------- As you can see, map connected operations (get, String.equals, hash) are most expensive one. But in more natural situations (where tests do not perfom to many operations, or many operations are performed using not instrumented code - from libraries), the difference will be lower. NOTICE: If you will use attached project, copy cobertura with dependend libraries into lib/ subdirectory. Grzegorz On 6/15/05, Grzegorz Lukasik <ha...@gm...> wrote: > The problem here may be implemetation of > ProjectData.getOrCreateClassData method, and additionaly > ClassData.touch implementation. Both these methods are invoked each > time "line of code" is executed. Both methods get/put some information > from maps. The result is that with each line of code some operations > that are in most cases many times more expensive are executed. >=3D20 > Grzegorz >=3D20 > On 6/10/05, Nathan Johns <nr...@gm...> wrote: > > Hi, > > > > It appears to me, though I haven't looked at the code, that every time > > the cobertura.ser file is touched by a test for instrumentation the > > entire file is read in, updated and then written back to disk. This > > seems to be a hug performance hit when compared to running the test > > without instrumentation. (in our case about 6-7 times slower, on a > > build that take 15 minutes without instrumentation this makes it a > > very long build cycle.) > > > > Would it perhaps make more sense to append instrumentation data to the > > cobertura.ser file (may need to be done a different way if > > cobertura.ser is a serialized object(s)) and then process all this > > data at reporting time to aggregate it into class statistics? > > > > In other words for each test, when it reads in cobertura.ser and > > writes it, it creates its own data either appended or in a separate > > file, and these are all then merged at reporting time. This would > > reduce disk access and speed up the testing phase and move some of the > > overhead to when the reporting is done. > > > > I suppose one possibility is to make use of different cobertura.ser > > files for each module and merge them at the end? Would appreciate any > > comments you might have. > > > > Nathan Johns > > > > > > ------------------------------------------------------- > > This SF.Net email is sponsored by: NEC IT Guy Games. How far can you s= =3D hotput > > a projector? How fast can you ride your desk chair down the office luge= =3D track? > > If you want to score the big prize, get to know the little guy. > > Play to win an NEC 61" plasma display: http://www.necitguy.com/?r > > _______________________________________________ > > Cobertura-devel mailing list > > Cob...@li... > > https://lists.sourceforge.net/lists/listinfo/cobertura-devel > > > |
From: Grzegorz L. <ha...@gm...> - 2005-07-03 14:05:34
|
I have submitted a patch that fixes the problems I have described. Grzegorz On 6/15/05, Grzegorz Lukasik <ha...@gm...> wrote: > The problem here may be implemetation of > ProjectData.getOrCreateClassData method, and additionaly > ClassData.touch implementation. Both these methods are invoked each > time "line of code" is executed. Both methods get/put some information > from maps. The result is that with each line of code some operations > that are in most cases many times more expensive are executed. >=20 > Grzegorz >=20 > On 6/10/05, Nathan Johns <nr...@gm...> wrote: > > Hi, > > > > It appears to me, though I haven't looked at the code, that every time > > the cobertura.ser file is touched by a test for instrumentation the > > entire file is read in, updated and then written back to disk. This > > seems to be a hug performance hit when compared to running the test > > without instrumentation. (in our case about 6-7 times slower, on a > > build that take 15 minutes without instrumentation this makes it a > > very long build cycle.) > > > > Would it perhaps make more sense to append instrumentation data to the > > cobertura.ser file (may need to be done a different way if > > cobertura.ser is a serialized object(s)) and then process all this > > data at reporting time to aggregate it into class statistics? > > > > In other words for each test, when it reads in cobertura.ser and > > writes it, it creates its own data either appended or in a separate > > file, and these are all then merged at reporting time. This would > > reduce disk access and speed up the testing phase and move some of the > > overhead to when the reporting is done. > > > > I suppose one possibility is to make use of different cobertura.ser > > files for each module and merge them at the end? Would appreciate any > > comments you might have. > > > > Nathan Johns > > > > > > ------------------------------------------------------- > > This SF.Net email is sponsored by: NEC IT Guy Games. How far can you s= hotput > > a projector? How fast can you ride your desk chair down the office luge= track? > > If you want to score the big prize, get to know the little guy. > > Play to win an NEC 61" plasma display: http://www.necitguy.com/?r > > _______________________________________________ > > Cobertura-devel mailing list > > Cob...@li... > > https://lists.sourceforge.net/lists/listinfo/cobertura-devel > > > |