GroboUtils / Feature Requests / #30 improve code coverage runtime performance

#30 improve code coverage runtime performance

Milestone: Next Release

Status: open

Owner: Matt Albrecht

Labels: improve interface (12)

Priority: 5

Updated: 2004-04-04

Created: 2004-04-04

Creator: Matt Albrecht

Private: No

Need to go over the runtime code coverage classes with
a microscope to cut out every bit of fat we can.

Make this thing fast!

Discussion

Scott Ellsworth - 2004-04-16

Logged In: YES
user_id=331345

This just came back to haunt me.

On my current jdbc based project, the test time went over ten minutes
recently, while when using Clover, it is under a minute. Most of the time,
performance is within a factor of two as that given by Clover, so
something has changed pathologically with this one test in my project.

Last time, switching to MinDirChannelLoggerFactory fixed things - is
there a cache tuning parameter or the like?

Do you have any suggestions on how to figure out what it is doing, and
why the time is so egregious? It sure sounds like dropping out of cache,
or perhaps something suddenly went n^2 on us, as does this while
running one of my unit tests, not later.

Scott

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Matt Albrecht - 2004-04-16

Logged In: YES
user_id=171822

For the CacheDir factory, if the number of classes you're
instrumenting is larger than the cache size. It's designed
to optimize situations where the code is referencing the
same classes over and over. If the number of classes that
it's interacting with is more than the cache size, then it
will start opening and closing files, slowing down performance.

You can set the "cachesize" parameter on the <logsettings>
tag to the number of cached files you want to keep open.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Matt Albrecht - 2004-04-16

Logged In: YES
user_id=171822

An idea for a new log factory: sockets!

Have the logger talk to a separate server via a single
socket (maybe named pipes?). The logger, then, would send
all of the probe information along the single socket.

The server would parse this output and push it to the
standard file output. This would need a threaded pipe to a
file writer (or writers) so as not to block the socket with
pending writes.

A shutdown hook in the logger factory would close and flush
the pipe.

Optionally, we could add a "flush on count" property, where
every X writes to the pipe causes a flush to the pipe.

The server could even use the java.nio library to enhance
its performance.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Matt Albrecht - 2004-04-16

Logged In: YES
user_id=171822

Another user-based optimization for the app as it stands
now: turn off the virus checker to enhance the disk I/O
performance.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Scott Ellsworth - 2004-04-16

Logged In: YES
user_id=331345

I figured out why I saw such a slowdown, but not a solution as yet. The
code I am testing calls one of my functions roughly 3-6M times per run.
This takes roughly 20 seconds under normal circumstances, roughly 40s
under clover, and more than ten minutes under grobo.

(This is actually a problem with the program, but I suspect that a major
performance boost could be had by looking carefully at what happens
when a nontrivial function is called a LOT of times during a run.)

This makes me think that switching loggers might not help, as it is just
pounding the heck out of a few functions, rather than loading a lot of
classes. The entire program is under 60 classes, including tests. It has
many more that it uses from various jars, but program code itself is
pretty isolated.

Hope this helps,

Scott

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Matt Albrecht - 2004-04-16

Logged In: YES
user_id=171822

To extend the ideas presented with the socket factory, we
can apply some of the same ideas to a file-based factory.

The single file-based factory would keep a single file open,
and write, on each probe line, the channel number + the
class ID + the standard probe line.

An additional report filter would need to run first, to
split this file into the current directory structure.

This format would have the added benefit of simplifying the
process of joining multiple log directory output into a
single repository.

Both the socket and single-file logger factories could be
abstracted out to use the same logic (different writers).

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Matt Albrecht - 2004-04-16

Logged In: YES
user_id=171822

If this is your situation, where the same functions are
called over and over, then the MinDir might be the right
answer. I'm curious as to the performance differences
between these in different scenarios.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Matt Albrecht - 2004-04-16

Logged In: YES
user_id=171822

I've started a new factory, "FileSingleSourceLoggerFactory",
which outputs all of the classes for every channel to a
single file. This should reduce the I/O of opening and
closing files. Also, it has multi-write buffering so it
doesn't have to write or flush on every call (but can). It
also includes a shutdown hook to close off the open file.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Matt Albrecht - 2004-06-09

Logged In: YES
user_id=171822

v5 beta 3 has been released on the SF downloads page, and
includes this factory.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Brian Gartner - 2004-06-15

Logged In: YES
user_id=1063517

First, thank you so much for creating and releasing this tool.

I'm currently using grobocoverage to instrument and report
on approximately 2400 classes. Using v5 beta 2 it would take
about 32 hours generate the source reports. Using v5 beta 3
has reduced this somewhat, but report generation still
(understandably) takes quite a while (the numbers aren't in
front of me as I type this from home). I'd like to try to
squeeze a bit more performance out of it by using the
FileSingleSourceLoggerFactory that you've mentioned (or
anything you can suggest), but am wondering if I use this in
addition to or instead of the "single" logger (which I'm
currently using) or if they're actually the same thing.

Do I just use the loggerFactory attribute in my ant
grobo-instrument task? Will this have any effect on the
speed of report generation? I don't mean to sound ungrateful
for the work that you've obviously put into this excellent
tool, but a little additional clarification on exactly how
to use the factory would be much appreciated.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Matt Albrecht - 2004-06-15

Logged In: YES
user_id=171822

This bug is for the actual runtime performance of generating
the coverage numbers. The logger is what affects the
runtime performance.

Brian, it looks like you're looking for bug 947857, which
handles the "offline" report generation (that is, it runs
after tests have completed).

As for the question about the FileSingleSourceLoggerFactory,
this is aliased with the "single" logger.

Also, thanks for the note about the docs. Some of them need
work, and that sounds like an area I could improve them.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link: