Thread: [Cobertura-devel] Performance Issues - cobertura.ser

Brought to you by: christ66, dennislundberg, hauserx, jthomerson, lewijw

cobertura-devel

[Cobertura-devel] Performance Issues - cobertura.ser

From: Nathan J. <nr...@gm...> - 2005-06-10 01:37:10

Hi,

It appears to me, though I haven't looked at the code, that every time
the cobertura.ser file is touched by a test for instrumentation the
entire file is read in, updated and then written back to disk.  This
seems to be a hug performance hit when compared to running the test
without instrumentation.  (in our case about 6-7 times slower, on a
build that take 15 minutes without instrumentation this makes it a
very long build cycle.)

Would it perhaps make more sense to append instrumentation data to the
cobertura.ser file (may need to be done a different way if
cobertura.ser is a serialized object(s)) and then process all this
data at reporting time to aggregate it into class statistics?

In other words for each test, when it reads in cobertura.ser and
writes it, it creates its own data either appended or in a separate
file, and these are all then merged at reporting time.  This would
reduce disk access and speed up the testing phase and move some of the
overhead to when the reporting is done.

I suppose one possibility is to make use of different cobertura.ser
files for each module and merge them at the end?  Would appreciate any
comments you might have.

Nathan Johns

Re: [Cobertura-devel] Performance Issues - cobertura.ser

From: Grzegorz L. <ha...@gm...> - 2005-06-15 20:39:31

The problem here may be implemetation of
ProjectData.getOrCreateClassData method, and additionaly
ClassData.touch implementation. Both these methods are invoked each
time "line of code" is executed. Both methods get/put some information
from maps. The result is that with each line of code some operations
that are in most cases many times more expensive are executed.

Grzegorz

On 6/10/05, Nathan Johns <nr...@gm...> wrote:
> Hi,
>=20
> It appears to me, though I haven't looked at the code, that every time
> the cobertura.ser file is touched by a test for instrumentation the
> entire file is read in, updated and then written back to disk.  This
> seems to be a hug performance hit when compared to running the test
> without instrumentation.  (in our case about 6-7 times slower, on a
> build that take 15 minutes without instrumentation this makes it a
> very long build cycle.)
>=20
> Would it perhaps make more sense to append instrumentation data to the
> cobertura.ser file (may need to be done a different way if
> cobertura.ser is a serialized object(s)) and then process all this
> data at reporting time to aggregate it into class statistics?
>=20
> In other words for each test, when it reads in cobertura.ser and
> writes it, it creates its own data either appended or in a separate
> file, and these are all then merged at reporting time.  This would
> reduce disk access and speed up the testing phase and move some of the
> overhead to when the reporting is done.
>=20
> I suppose one possibility is to make use of different cobertura.ser
> files for each module and merge them at the end?  Would appreciate any
> comments you might have.
>=20
> Nathan Johns
>=20
>=20
> -------------------------------------------------------
> This SF.Net email is sponsored by: NEC IT Guy Games.  How far can you sho=
tput
> a projector? How fast can you ride your desk chair down the office luge t=
rack?
> If you want to score the big prize, get to know the little guy.
> Play to win an NEC 61" plasma display: http://www.necitguy.com/?r
> _______________________________________________
> Cobertura-devel mailing list
> Cob...@li...
> https://lists.sourceforge.net/lists/listinfo/cobertura-devel
>

Re: [Cobertura-devel] Performance Issues - cobertura.ser

From: Nathan J. <nr...@gm...> - 2005-06-15 22:05:16

After some experimentation we were able to speed up our unit testing
time by setting the fork options on the junit task in ant.

Previously when the build was setup we required Junit to fork into a
separate JVM:
<junit fork=3D"true" haltonfailure=3D"true">....
But no one realised this was forking a new JVM for each test when we
upgraded to Ant 1.6.2. (Not sure if this was the case for Ant 1.5)  So
have now changed the build scripts to:
<junit fork=3D"true" forkmode=3D"once" haltonfailure=3D"true">....
And the speed improvement in the unit testing phase is amazing.=20
Previously Each test seemed to have an overhead of 4 to 5 seconds for
when the instrumentation data was read and written, now they all run
very quickly and this overhead occurs once for each fork.  (For us
that is once for each module.)

Thanks for the pointers in resolving this...  one of those little
quirks of Ant and the junit task.  I still think there may be
something to consider in what I originally said about appending data
but I suppose you can achieve the same affect, with the forkmode
option, using a different cobertura.ser file for each module and then
merging them at the end.

Nathan Johns

On 6/16/05, Grzegorz Lukasik <ha...@gm...> wrote:
> The problem here may be implemetation of
> ProjectData.getOrCreateClassData method, and additionaly
> ClassData.touch implementation. Both these methods are invoked each
> time "line of code" is executed. Both methods get/put some information
> from maps. The result is that with each line of code some operations
> that are in most cases many times more expensive are executed.
>=20
> Grzegorz
>=20
> On 6/10/05, Nathan Johns <nr...@gm...> wrote:
> > Hi,
> >
> > It appears to me, though I haven't looked at the code, that every time
> > the cobertura.ser file is touched by a test for instrumentation the
> > entire file is read in, updated and then written back to disk.  This
> > seems to be a hug performance hit when compared to running the test
> > without instrumentation.  (in our case about 6-7 times slower, on a
> > build that take 15 minutes without instrumentation this makes it a
> > very long build cycle.)
> >
> > Would it perhaps make more sense to append instrumentation data to the
> > cobertura.ser file (may need to be done a different way if
> > cobertura.ser is a serialized object(s)) and then process all this
> > data at reporting time to aggregate it into class statistics?
> >
> > In other words for each test, when it reads in cobertura.ser and
> > writes it, it creates its own data either appended or in a separate
> > file, and these are all then merged at reporting time.  This would
> > reduce disk access and speed up the testing phase and move some of the
> > overhead to when the reporting is done.
> >
> > I suppose one possibility is to make use of different cobertura.ser
> > files for each module and merge them at the end?  Would appreciate any
> > comments you might have.
> >
> > Nathan Johns
> >
> >
> > -------------------------------------------------------
> > This SF.Net email is sponsored by: NEC IT Guy Games.  How far can you s=
hotput
> > a projector? How fast can you ride your desk chair down the office luge=
 track?
> > If you want to score the big prize, get to know the little guy.
> > Play to win an NEC 61" plasma display: http://www.necitguy.com/?r
> > _______________________________________________
> > Cobertura-devel mailing list
> > Cob...@li...
> > https://lists.sourceforge.net/lists/listinfo/cobertura-devel
> >
>

Re: [Cobertura-devel] Performance Issues - cobertura.ser

From: Grzegorz L. <ha...@gm...> - 2005-06-15 22:19:27

Attachments: cob-gcd.zip.txt

If you want to investigate a little, you can run tests with option
-Xprof. Some information about execution time of methods will be
displayed. If you will see that HashMap.get and String.equals
operations are most expensive, then slow execution of tests is caused
by the problem I described. (If you are using ant "junit" task, just
add <jvmarg value=3D3D"-Xprof"/>
inside junit tag).

I have made some tests and they show that map operations can be very
expensive. Tested program is simplificated algorithmics one, so the
problem is much more visible (maybe to much) here than in normal
cases. Here are my results:

---------------------------------------------------------------------------=
=3D
---------------------------
target "run"
test execution time: 4 seconds

   [junit]      Compiled + native   Method
   [junit]  73.1%   245  +     0    simple.GCD.gcd
   [junit]   2.1%     7  +     0    simple.GCD.gcd
   [junit]   0.3%     0  +     1    Interpreter
   [junit]   0.3%     1  +     0    java.lang.String.startsWith
   [junit]   0.3%     1  +     0    java.io.BufferedWriter.write
   [junit]   0.3%     1  +     0    java.lang.String.indexOf
   [junit]   0.3%     1  +     0    java.lang.String.hashCode
   [junit]  76.7%   256  +     1    Total compiled


---------------------------------------------------------------------------=
=3D
---------------------------
target "run-instrumented"
test execution time: 85 seconds

   [junit]      Compiled + native   Method
   [junit]  46.3%  1574  +     0    java.util.HashMap.get
   [junit]  28.2%   960  +     0    java.lang.String.equals
   [junit]  13.3%   452  +     0    java.util.HashMap.hash
   [junit]   9.0%   295  +    12    simple.GCD.gcd
   [junit]   0.8%    27  +     0    simple.GCD.gcd
   [junit]   0.0%     0  +     1    java.util.jar.JarFile.hasClassPathAttr=
=3D
ibute
   [junit]   0.0%     1  +     0  =3D20
sun.reflect.ClassFileAssembler.emitConstantPoolUTF8
   [junit]  97.7%  3309  +    13    Total compiled
---------------------------------------------------------------------------=
=3D
---------------------------

As you can see, map connected operations (get, String.equals, hash)
are most expensive one. But in more natural situations (where tests do
not perfom to many operations, or many operations are performed using
not instrumented code - from libraries), the difference will be lower.

NOTICE: If you will use attached project, copy cobertura with
dependend libraries into lib/ subdirectory.

Grzegorz


On 6/15/05, Grzegorz Lukasik <ha...@gm...> wrote:
> The problem here may be implemetation of
> ProjectData.getOrCreateClassData method, and additionaly
> ClassData.touch implementation. Both these methods are invoked each
> time "line of code" is executed. Both methods get/put some information
> from maps. The result is that with each line of code some operations
> that are in most cases many times more expensive are executed.
>=3D20
> Grzegorz
>=3D20
> On 6/10/05, Nathan Johns <nr...@gm...> wrote:
> > Hi,
> >
> > It appears to me, though I haven't looked at the code, that every time
> > the cobertura.ser file is touched by a test for instrumentation the
> > entire file is read in, updated and then written back to disk.  This
> > seems to be a hug performance hit when compared to running the test
> > without instrumentation.  (in our case about 6-7 times slower, on a
> > build that take 15 minutes without instrumentation this makes it a
> > very long build cycle.)
> >
> > Would it perhaps make more sense to append instrumentation data to the
> > cobertura.ser file (may need to be done a different way if
> > cobertura.ser is a serialized object(s)) and then process all this
> > data at reporting time to aggregate it into class statistics?
> >
> > In other words for each test, when it reads in cobertura.ser and
> > writes it, it creates its own data either appended or in a separate
> > file, and these are all then merged at reporting time.  This would
> > reduce disk access and speed up the testing phase and move some of the
> > overhead to when the reporting is done.
> >
> > I suppose one possibility is to make use of different cobertura.ser
> > files for each module and merge them at the end?  Would appreciate any
> > comments you might have.
> >
> > Nathan Johns
> >
> >
> > -------------------------------------------------------
> > This SF.Net email is sponsored by: NEC IT Guy Games.  How far can you s=
=3D
hotput
> > a projector? How fast can you ride your desk chair down the office luge=
=3D
 track?
> > If you want to score the big prize, get to know the little guy.
> > Play to win an NEC 61" plasma display: http://www.necitguy.com/?r
> > _______________________________________________
> > Cobertura-devel mailing list
> > Cob...@li...
> > https://lists.sourceforge.net/lists/listinfo/cobertura-devel
> >
>

Re: [Cobertura-devel] Performance Issues - cobertura.ser

From: Grzegorz L. <ha...@gm...> - 2005-07-03 14:05:34

I have submitted a patch that fixes the problems I have described.

Grzegorz

On 6/15/05, Grzegorz Lukasik <ha...@gm...> wrote:
> The problem here may be implemetation of
> ProjectData.getOrCreateClassData method, and additionaly
> ClassData.touch implementation. Both these methods are invoked each
> time "line of code" is executed. Both methods get/put some information
> from maps. The result is that with each line of code some operations
> that are in most cases many times more expensive are executed.
>=20
> Grzegorz
>=20
> On 6/10/05, Nathan Johns <nr...@gm...> wrote:
> > Hi,
> >
> > It appears to me, though I haven't looked at the code, that every time
> > the cobertura.ser file is touched by a test for instrumentation the
> > entire file is read in, updated and then written back to disk.  This
> > seems to be a hug performance hit when compared to running the test
> > without instrumentation.  (in our case about 6-7 times slower, on a
> > build that take 15 minutes without instrumentation this makes it a
> > very long build cycle.)
> >
> > Would it perhaps make more sense to append instrumentation data to the
> > cobertura.ser file (may need to be done a different way if
> > cobertura.ser is a serialized object(s)) and then process all this
> > data at reporting time to aggregate it into class statistics?
> >
> > In other words for each test, when it reads in cobertura.ser and
> > writes it, it creates its own data either appended or in a separate
> > file, and these are all then merged at reporting time.  This would
> > reduce disk access and speed up the testing phase and move some of the
> > overhead to when the reporting is done.
> >
> > I suppose one possibility is to make use of different cobertura.ser
> > files for each module and merge them at the end?  Would appreciate any
> > comments you might have.
> >
> > Nathan Johns
> >
> >
> > -------------------------------------------------------
> > This SF.Net email is sponsored by: NEC IT Guy Games.  How far can you s=
hotput
> > a projector? How fast can you ride your desk chair down the office luge=
 track?
> > If you want to score the big prize, get to know the little guy.
> > Play to win an NEC 61" plasma display: http://www.necitguy.com/?r
> > _______________________________________________
> > Cobertura-devel mailing list
> > Cob...@li...
> > https://lists.sourceforge.net/lists/listinfo/cobertura-devel
> >
>