How to use EMMA

2004-08-04
2004-08-22
  • Jeremy Whitlock

    Jeremy Whitlock - 2004-08-04

    I'm new to Agile Programming and its techniques.  What exactly is EMMA used for?  It appears from the documents that to use EMMA you have to run the actual application.  Am I wrong?  I basically want to add EMMA to my current build.xml file and use it to see how much of my application is covered by JUnit tests.  Is this what EMMA is for?  I just need an explanation of what EMMA is for.  Thanks, Jeremy

     
    • Vlad Roubtsov

      Vlad Roubtsov - 2004-08-05

      Jeremy,

      EMMA is, well, a code coverage tool just like it says on the front page (http://emma.sourceforge.net). "Code coverage" refers to a software engineering technique whereby you track the quality and comprehensiveness of your test suite by determining simple metrics like what percentage of {classes, methods, lines, etc} have been *executed* when the test suite ran.

      What does “executed” mean? It means you exercise your application code using a “driver” of some kind. The driver could be a JUnit testsuite, a testsuite in some other test framework or it could even be a human sitting in front of your application and clicking on buttons according to some use case script. The driver thus “induces” coverage. The application code itself runs more or less according to how it is meant to run: as a standalone application, as a J2EE app, maybe even distributed over several machines.

      There is no way to predict how your application will execute and I assure you that not everybody uses JUnit as the driver. Because of that, EMMA does not assume a particular testing methodology. Instead, it offers generic tools to coverage-enable your application classes and assumes you will then deploy them as you have been doing it so far. EMMA does not know whether you use Websphere or Swing or whatever. EMMA assumes just a standard JVM and will collect coverage statistics very unintrusively, with a small runtime performance overhead and as a quiet side effect of your testing.

      This explains why EMMA does not tout special “JUnit integration”. It may happen in the future, but for now JUnit is supported just like any other test framework (you instrument your classes after you compile them and before you run them).

      Let’s talk more about coverage metrics. Practice shows that coverage percentages below, say, 60% correspond to poorly tested software. You can expect undiscovered bugs in such software (to be discovered by your users). Because of this, “good” software companies instill internal processes whereby a team cannot release a piece of software unless it passes release gates like “line coverage must be 80% or higher”. The default coverage thresholds in EMMA correspond to release gates at the company I work for.

      And as a corollary to the above, reaching for 100% coverage is also not profitable. You just get a lot less quality improvement for considerably more effort to reach such perfection. I’d say coverage close to 85-90% is “good enough” for all practical purposes.

      The topic of which coverage metric is “better” is somewhat religious. There have been academic studies showing that, for example, path coverage at a certain level detects slightly more bugs than line coverage at the same level. I personally think the actual metric definition is not that important. I’d rather empower all developers on my team with a free and fast tool so that they can track their own coverage (of some kind) early and frequently. An experienced developer will look at the coverage report that links to the source code, drill down a bit, look at the “red” areas, and figure out which, if any, areas of the product he left somewhat under-tested. This is why EMMA opts for a simple metric that is easy to obtain without a lot of runtime overhead.

      Where does agile come in here? Well, “agile” is very trendy. To me, “agile” means fast dev-test iterative cycles. To have that, you’ve got to have fast tools. We have fast test frameworks, incremental compilers (e.g., Eclipse), and EMMA joins that club of fast/incremental tools.

      Obtaining coverage with some commercial tools like TrueCoverage is so goddamn slow you will never convince developers they should do it early and often. EMMA was written to change this mentality.

      EMMA has also been architected to enable individual developers on a team to “zoom in” into just their particular piece/module/package/whatever when doing coverage. This is what I do when I check in code at work: if my package has 80%+ line coverage when I check it in, I know I am in line with what the entire app will need in order to release.

      There is plenty of content about this on the web if you search. Some links:

      - some informal coverage thoughts from a user’s perspective: http://www.testing.com/writings/coverage-terminology.html
      - general code coverage discussion (includes different coverage metrics definitions): http://www.bullseye.com/coverage.html

       
    • Jeremy Whitlock

      Jeremy Whitlock - 2004-08-06

      Vlad,
           Thanks for the rundown.  After your email, I understood a little better but this covers things nicely.  I gave it a dry run but no reports were generated because I never ran anything against any of the classes.  I'm getting there.  Thanks, Jeremy

       
    • Jeremy Whitlock

      Jeremy Whitlock - 2004-08-06

      Vlad, is there a way to build the coverage report even when none of the instrumented classes have been "run"?  I would like to see a report no matter what the coverage is and right now, I don't get any report since I don't run any app against the instrumented classes.  Thanks, Jeremy

       
    • Vlad Roubtsov

      Vlad Roubtsov - 2004-08-06

      No, if no classes have been executed then there will be no runtime coverage data (*.ec files) and the coverage is automatically 0%. Generating a report for such an edge case seems kind of silly, so EMMA refuses to do it.

      Jeremy, it sounds like you are having extraordinary difficulties just running your code, with or without coverage. I think you should get that part sorted out first, before you start integrating with EMMA.

       
    • Jeremy Whitlock

      Jeremy Whitlock - 2004-08-06

      Vlad,
           You misunderstand me.  I do not have problems running my code but my boss wants to begin running these reports in our continuous integration suite.  The reason I was hoping for this is that JCoverage will build a report with no coverage and you told me that EMMA was better.  I want to see a report for my whole app regardless of whether or not it's been run via tests or what not.  I admit that I'm new to the code coverage scene but I'm not new to development.  I am basing my questions off of what I've used and JCoverage is the only one I've gotten to work.  EMMA works but produces nothing if you don't run anything against your instrumented classes.  While this appears to be by design, you should consider what I say and not tell me that I don't know what I'm doing.  Thanks, Jeremy

       
    • Jeremy Whitlock

      Jeremy Whitlock - 2004-08-06

      Vlad,
           I forgot to add in the earlier post that even though my code coverage is 0% as of now, it would still be beneficial to see that information.  In doing so, you have a report telling you what you have to do to get to where you want to be.  JCoverage provides that for you.  I don't know which is better since JCoverage doesn't allow for multiple source locations and EMMA requires you to run your instrumented code even to tell you that you have no code coverage.  Am I in the wrong for wanting such functionality?  You have to start somewhere and unfortunately, my company didn't write test cases before I got here and we are starting at the bottom but it sure would be great to see what we have to do.  Thanks, Jeremy

       
    • Vlad Roubtsov

      Vlad Roubtsov - 2004-08-06

      Jeremy, if I understand you correctly, you would like EMMA to generate a coverage report even though you have not executed any code, with all coverage metrics thus being (predictably) 0%, is that correct?

      I admit I find this a little strange. I don't tell my users whether they are wrong or right in wanting certain features, but I expect them to be reasonable about the value/cost tradeoffs of what they are asking for. I know that EMMA right now will refuse to generate reports unless it has both *.em and *.ec files -- this is by design. It has the beneficial effect of warning users that they may have forgotten to swap the instrumented classes  for the original classes when they ran their tests (a mistake some users occasionally make).

      Your statement "EMMA requires you to run your instrumented code even to tell you that you have no code coverage" is not quite true: if you do run some instrumented code, you will execute some methods and you will thus have non-zero coverage.

      You are asking for EMMA to support a rare edge case that becomes unnecessary as soon as you have written and executed your first testcase. Do you see what I am getting at? If you have no testcases to run, maybe you should worry about *that* more than about 0% coverage.

       
    • Jeremy Whitlock

      Jeremy Whitlock - 2004-08-16

      Vlad,
           Point taken.  I am just comparing EMMA to JCoverage since that is the only tool I've used and I met you via the JCoverage mailing list.  I am sure that your practice is correct and shouldn't be modified for my reasoning.  Hopefully you see where I'm coming from as well though.  I tried JCoverage and its GPL version is over a year outdated, so says JCoverage, and its capabilities of multiple source locations in the report aren't featured.  JCoverage did build the report even though I didn't run any instrumented classes.  EMMA allows the features I need that JCoverage doesn't BUT it doesn't build the report unless you run instrumented classes.  I see why you do this and I understand it's by practice but the only reason I mentioned the need/want of this is because the only other tool I tested did this already.  How would this "rare edge case" become unnecessary once I write my first test case?  That will only show me code coverage for that class tested in the unit test and not report on the whole source tree as JCoverage does.  I don't intend on comparing the two other than to state where I'm coming from.  EMMA is fantastic.  I've tested it in a controlled environment where I do have test cases and it only reports on what you test and not your application as a whole like JCoverage.  Sure my coverage on many source locations would be 0% and I agree that is important to correct BUT EMMA only reports when you test.  If I have 45 portions of my app and I test 43 all the time but neglected to test the other 2, I might have 100% code coverage for the 43 but what about the other 2?  I'm left in the dark with EMMA whereas with JCoverage, I know that there are untested portions of my code.

      This post isn't to tell you how to write your app, although you seem to be telling me what is supposed to be important to me :P, it's to show you that I have a need and I have to take care of my need.  My need may be only a "rare edge case" to you but to me, and possibly others, it's something that is needed.  I agree in all that you've said, don't get me wrong, but do you even in the slightest bit see where I'm coming from?  Thanks, Jeremy

       
      • Dean Hiller

        Dean Hiller - 2004-08-16

        I take both sides, but I think Jeremy has a flaw in his thinking unless I am missing something.  (I can see the frustration in some of those mails!)  Emma requires you to specify a main bootstrap class, period.  If Emma supported JUnit directly, what Jeremy says would make more sense: Emma would look for all the JUnit tests matching a pattern and execute them all, and if it did not find any, it would obviously report 0% coverage.  But emma doesn't support JUnit like ant does, though I wish it did.

        This is where I think you are both disconnecting.

        That said, I personally wish emma did support JUnit explicitly as well as the other current method in case JUnit is not being used.  (ie. I can specify a pattern just like ant's JUnit task, and if no tests are found, there is obviously 0% coverage, because no code is executed, but that would be posted)

        My reasoning is because developers currently have to do one of two things....
        1. write a JUnit test suite adding all tests which means we have to remember to add to this class for every test case we add OR
        2. write a reflective JUnit task that finds all JUnit tests according to a pattern.

        1. stinks because developers forget and coverage all of a sudden doesn't change, and then on large projects I could easily see someone writing an already written test to cover something that we thought was missed, but wasn't.
        2. is ok, but if you ever change the JUnit pattern, you must now go to two locations: the ant build file and the JUnit bootstrap class for coverage.  Not to mention that every project out there will need this same exact class, causing much repetition in the code itself.
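
        Option 2 above can be sketched roughly as follows. This is only an illustration, assuming compiled classes live under a directory such as build/classes and that tests follow a *Test.class naming pattern; the class name TestClassFinder and both of those conventions are hypothetical. The sketch only discovers class names; feeding them into a JUnit suite (for example via junit.framework.TestSuite) is left out to keep it dependency-free.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TestClassFinder {

    // Walks classesDir and returns the fully qualified names of all class
    // files whose file name matches the given glob (e.g. "*Test.class").
    static List<String> findTestClasses(Path classesDir, String fileNameGlob) throws IOException {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + fileNameGlob);
        try (Stream<Path> files = Files.walk(classesDir)) {
            return files
                .filter(Files::isRegularFile)
                .filter(p -> matcher.matches(p.getFileName()))
                .map(p -> toClassName(classesDir.relativize(p)))
                .sorted()
                .collect(Collectors.toList());
        }
    }

    // Converts a relative path like com/acme/FooTest.class to com.acme.FooTest.
    static String toClassName(Path relativeClassFile) {
        StringBuilder name = new StringBuilder();
        for (Path part : relativeClassFile) {
            if (name.length() > 0) {
                name.append('.');
            }
            name.append(part.toString());
        }
        return name.substring(0, name.length() - ".class".length());
    }

    public static void main(String[] args) throws IOException {
        // The default directory is an assumption; pass your own as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "build/classes");
        if (!Files.isDirectory(dir)) {
            System.out.println("No such directory: " + dir);
            return;
        }
        findTestClasses(dir, "*Test.class").forEach(System.out::println);
    }
}
```

        Keeping the pattern in one place (an argument or a property shared with the build file) would avoid the two-locations problem described in point 2.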

        Now that said, I usually don't open my mouth unless I am willing to do something about it, and I probably will once we have emma running on our project and start hearing these complaints.

        ps. Vlad, great explanation on code coverage in the first reply.  I think I am going to send that to my team.
        thanks,
        dean

         
    • Jeremy Whitlock

      Jeremy Whitlock - 2004-08-16

      Dean,
           Thanks for your input.  I'm not frustrated with Vlad or EMMA.  I love it!!!  I just see how other tools work and it's hard to see EMMA not do that.  To see any report, you have to run some instrumented code.  I see why that is done in EMMA by practice and I can even see why it might have been developed this way.  I just was asking if there was a way to create a report that would show you your code coverage for all instrumented classes even if you didn't run any of the instrumented code.  This might not be that important to Vlad or even many others out there but if it worked this way (Like Clover and JCoverage as examples) developers would be able to have a bird's eye view of how much of their code is covered by unit tests even if it's 0%.  Dean is right though...I think Vlad and I are comparing apples to oranges and that's why we are "butting heads" but I am not fighting with him or telling him what he should do or think.  Just trying to ask how something works in respect to other comparable tools and why they would act differently.  I understand Vlad 100% in all he's said BUT he still hasn't done much to answer any of my questions, just told me that my thinking is wrong or that I need to focus on other things.  That isn't what you are supposed to do when users ask questions about why a tool does what it does.

      Vlad, I hold nothing against you.  My last few posts have been more defending myself than anything.  You don't seem to grasp what I'm trying to convey and that is causing us not to see eye to eye.  I hope you understand.  I hope that we can get to the point where we understand each other so we can figure out a solution.  Laters, Jeremy

      P.S. - It could possibly be my thinking that is causing this.  From other code coverage tools I've used/tried, you get a report showing your coverage with or without running instrumented classes.  The code coverage need I must meet is being able to look at my application and see how much of it is covered by unit tests.

       
    • Vlad Roubtsov

      Vlad Roubtsov - 2004-08-16

      Jeremy, I do want to understand you but I can't seem to be able to get your exact difficulty.

      Here are a few more thoughts towards my point of view. I am trying different tacks here to see if something triggers better understanding between us:

      EMMA is in fact two tools in one:

      (1) "on-the-fly" convenience mode: emmarun will *run* your app (via its main() class) for you and generate the coverage reports, all in one go. emmarun does not care whether the app is JUnit or something else. THIS IS THE ONLY EMMA TOOL THAT REQUIRES YOU TO RUN YOUR APP VIA ITS main() CLASS. Note that by the nature of classloading involved, this only works for simple apps, so sometimes you will need to use the next option instead. I will describe only this other option next.

      (2) instr/report pair of tools support an entirely different approach I call "offline instrumentation". It is so universal that it works for just about anything.

      The instr tool can take *any set of classes and jars* and produce versions of those that will spit out runtime coverage data when they are executed. The tool does not care whether these classes are product code or testcases, whether they have main() or not, etc: all that matters is that they are normal Java classes.

      For example, if you take this class:

      public class SomeClass
      {
          public void foo ()
          {
              // ... do something ...
          }

          public void bar ()
          {
              // ... do something ...
          }
      }

      the instr tool will do a kind of a second compile on SomeClass.class and give you a modified version that does something like this:

      public class SomeClass
      {
          public void foo ()
          {
              // ... do something ...
              mark SomeClass.foo() as covered; // NEW BYTECODE
          }

          public void bar ()
          {
              // ... do something ...
              mark SomeClass.bar() as covered; // NEW BYTECODE
          }
      }

      (this is not what really happens but conceptually it is correct)
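
      To make the conceptual picture a bit more concrete, here is a hand-written Java sketch of the same idea. It is not how EMMA actually works (the real tool rewrites bytecode); CoverageRecorder and its hit()/snapshot() methods are hypothetical names invented for this illustration.

```java
import java.util.Set;
import java.util.TreeSet;

class InstrumentationSketch {

    // Hypothetical stand-in for a coverage runtime; not EMMA's real API.
    static final class CoverageRecorder {
        private static final Set<String> HITS = new TreeSet<>();

        // Called by "instrumented" code to mark a method as executed.
        static synchronized void hit(String methodId) {
            HITS.add(methodId);
        }

        // Returns a copy of everything recorded so far.
        static synchronized Set<String> snapshot() {
            return new TreeSet<>(HITS);
        }
    }

    // What an instrumented SomeClass could look like if written by hand.
    static class SomeClass {
        void foo() {
            // ... original body would go here ...
            CoverageRecorder.hit("SomeClass.foo()");
        }

        void bar() {
            // ... original body would go here ...
            CoverageRecorder.hit("SomeClass.bar()");
        }
    }

    public static void main(String[] args) {
        new SomeClass().foo();                           // bar() is never called
        System.out.println(CoverageRecorder.snapshot()); // prints [SomeClass.foo()]
    }
}
```

      The recorded set plays the role of the runtime coverage data: it only ever contains what was actually executed.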

      To produce a report, you need two types of data:

      (a) what I call "metadata", which is a memento of which classes were processed by instr and what methods etc they found inside your classes. It will be something like this:

      "class SomeClass has two methods that are executable: foo() and bar()"

      Metadata is usually dumped in those *.em files.

      Note that metadata alone is not yet sufficient for building a report: all you know is that you have a class and some methods in it. But you don't know which of the two methods (none, foo(), bar(), both foo() and bar()) have been used at runtime -- THAT'S BECAUSE YOU HAVEN'T RUN ANYTHING YET.

      (b) what I call "runtime coverage data" (usually dumped into *.ec files). Think of it as an imprint, a record of which classes and which methods were executed by your application. Who or what runs the applications? EMMA does not care. It could be that you ran some class with a main() from command line, or maybe you have a JUnit test suite. It could even be a human user running, say, an interactive Swing application. In this offline instrumentation mode EMMA does not require you to use any special app runner tool.

      The important thing is you have some code that is meant to run because, well, it needs to run to actually do what it is supposed to do, right? So, you must have had a way to run your application before you even knew about EMMA.

      Now, take that same setup and instead of the original .classes swap them for the ones produced by instr.

      Now, run the app *just like before*. The application should work just like before and when it's done it will spew some runtime coverage data in an *.ec file.

      For example (!), let's say you have a JUnit testcase for testing SomeClass:

      import junit.framework.TestCase;

      public class SomeClassTest extends TestCase
      {
          public void testFooAndBar ()
          {
              SomeClass sc = new SomeClass ();
              sc.foo (); // note: bar() is never called
          }
      }

      Now, if you used instr on SomeClass before, you have its metadata somewhere in an *.em file. If you now *run* SomeClassTest via JUnit, it will execute and as part of that execution it will *run* SomeClass constructor and SomeClass.foo(). When JUnit exits, EMMA runtime will dump runtime coverage data in an *.ec file and that data will be something like

      "in class SomeClass these methods were hit:
      foo(): yes
      bar(): no"

      If some class is not even mentioned in the runtime coverage data, it is assumed to have not loaded and thus completely uncovered.

      Now, you have everything you need for a report:

      metadata + runtime coverage data => coverage report

      Think of the metadata as "denominator" for coverage and runtime coverage data as "numerator".

      You use EMMA's report tool and feed it your *.em and *.ec files and it will generate a report that for class SomeClass will say that foo() is Ok but bar() is red because it never *ran*. (Again, if the report generator finds a class X in the metadata that is not in the runtime coverage, it will still report on it, but there will be zeroes for X coverage contribution everywhere.)
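
      The "denominator/numerator" idea can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions: metadata and coverage data are modelled as plain maps from class name to method names, which is not EMMA's actual file format, and methodCoverage is a hypothetical helper.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class CoverageReportSketch {

    // metadata: class name -> all executable methods (the "denominator")
    // coverage: class name -> methods hit at runtime (the "numerator")
    // Returns the method coverage percentage per class found in the metadata.
    static Map<String, Double> methodCoverage(Map<String, Set<String>> metadata,
                                              Map<String, Set<String>> coverage) {
        Map<String, Double> report = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> entry : metadata.entrySet()) {
            Set<String> all = entry.getValue();
            // A class missing from the coverage data counts as never executed.
            Set<String> hit = coverage.getOrDefault(entry.getKey(), Set.of());
            long covered = all.stream().filter(hit::contains).count();
            double percent = all.isEmpty() ? 0.0 : 100.0 * covered / all.size();
            report.put(entry.getKey(), percent);
        }
        return report;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> metadata = new LinkedHashMap<>();
        metadata.put("SomeClass", Set.of("foo()", "bar()"));
        metadata.put("OtherClass", Set.of("baz()"));

        // Only SomeClass.foo() was executed by the testcase.
        Map<String, Set<String>> coverage = Map.of("SomeClass", Set.of("foo()"));

        System.out.println(methodCoverage(metadata, coverage));
        // prints {SomeClass=50.0, OtherClass=0.0}
    }
}
```

      Note how OtherClass still appears in the result with 0% even though it is absent from the coverage data, because the report is driven by the metadata.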

      You go into your testcase and realize that, yes, you forgot to test bar() in the first place. So you beef up your testcase to do that.

      You see how *both* metadata and runtime coverage data are needed to produce a report? One tells the report tool what to report on and the other tells it what the coverage stats are. You get the first piece when you instrument with instr and you get the second piece when you run with JUnit or whatever.

      Note that a class X being present in the metadata does not mean it must also have been run and therefore be present in the coverage data as well.

      Next, to address a comment Jeremy made:
      <quote>
      Sure my coverage on many source locations would be 0% and I agree that is important to correct BUT EMMA only reports when you test. If I have 45 portions of my app and I test 43 all the time but neglected to test the other 2, I might have 100% code coverage for the 43 but what about the other 2? I'm left in the dark with EMMA where as with JCoverage, I know that there are untested portions of my code.
      </quote>

      I just don't understand why you, Jeremy, think this way. EMMA can be used to do exactly what you seem to be asking for:

      You can use instr to process all 45 portions of your app and then run *only* 43 of them. You might have 45 *.em files and only 43 *.ec files, for example.

      Then the 43 will be reported according to their true runtime coverage and the remaining 2 WILL SHOW UP AS COVERED 0%. Is this not what you want? If not, please explain, because I am at my wit's end.

      To rehash this idea again, if you have a large setup with sources in multiple locations and many sub-testsuites, you can use EMMA as follows:

      module #1: if (coverage_desired) instrument and save the *.em file

      module #2: if (coverage_desired) instrument and save the *.em file
      ...
      module #N: if (coverage_desired) instrument and save the *.em file

      Now, you have the metadata for all N modules (you can use different *.em files or keep accumulating everything in the same global file). This is enough to cause all of them to show up in the future report.

      But you don't have to run *all* N modules' testsuites each time. You just need to run *at least one*. For example, module #3 is your integration tests and you can just run them and nothing else. ALL EMMA ASKS FOR IS AT LEAST 1 *.ec file. Its report tool will happily take N *.em files and a single *.ec file, for example.

      Does this work?

      This restriction of requiring at least some runtime coverage data is there by design, because not having any coverage data files (*.ec) is more often an indication of some mistake on the user's part than an indication of his desire to generate a coverage report of all zeroes *everywhere*. But if the latter is what is required, I can add a new property to allow this edge case; it will be easy enough to do. But I wanted to emphasize that that is what this is: an edge case of no testcases having been executed. I think it happens rarely.
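
      The multi-module scenario and the "at least some coverage data" check can be sketched like this. It is a simplified model, not EMMA's implementation: data sets are plain maps from class name to method names, and merge and zeroCoverageClasses are hypothetical helpers. The exception only mimics the by-design refusal described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class MultiModuleReportSketch {

    // Accumulates several per-module data sets into one, similar in spirit
    // to appending everything to a single global file.
    static Map<String, Set<String>> merge(List<Map<String, Set<String>>> parts) {
        Map<String, Set<String>> merged = new TreeMap<>();
        for (Map<String, Set<String>> part : parts) {
            part.forEach((cls, methods) ->
                merged.computeIfAbsent(cls, k -> new TreeSet<>()).addAll(methods));
        }
        return merged;
    }

    // Lists classes known from the metadata that have no runtime hits;
    // these are the ones that would show up as 0% in a report.
    static List<String> zeroCoverageClasses(Map<String, Set<String>> metadata,
                                            Map<String, Set<String>> coverage) {
        if (coverage.values().stream().allMatch(Set::isEmpty)) {
            // Mimics the refusal to report when nothing at all was executed.
            throw new IllegalStateException(
                "no runtime coverage data: were the instrumented classes actually run?");
        }
        List<String> zero = new ArrayList<>();
        for (String cls : metadata.keySet()) {
            if (coverage.getOrDefault(cls, Set.of()).isEmpty()) {
                zero.add(cls);
            }
        }
        return zero;
    }

    public static void main(String[] args) {
        // Metadata from three instrumented modules.
        Map<String, Set<String>> metadata = merge(List.of(
            Map.of("ModuleOne", Set.of("a()")),
            Map.of("ModuleTwo", Set.of("b()")),
            Map.of("ModuleThree", Set.of("c()", "d()"))));

        // Only the third module's tests were run.
        Map<String, Set<String>> coverage = merge(List.of(
            Map.of("ModuleThree", Set.of("c()"))));

        System.out.println(zeroCoverageClasses(metadata, coverage));
        // prints [ModuleOne, ModuleTwo]
    }
}
```

      Running only one module's tests is enough to get past the check, and the untested modules are still listed because they are present in the metadata.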

      If there is a good feature missing, Jeremy, perhaps we could go into a requirements gathering mode here for a bit. Can you describe a use case scenario that you think is not supported but you would like to have supported?

       
    • Jeremy Whitlock

      Jeremy Whitlock - 2004-08-16

      Vlad,
           WOOHOO!!!  We are seeing eye to eye.  I am happy.  I see what you are saying and it is now making sense.  I think that instead of going into the requirements gathering mode, I will follow your instructions verbatim and try to get as far as I can.  I will come back with my results and needs if there are any.  I now understand EMMA's framework and needs a little better and knowing that makes me feel that I can get it working unmodified to suit my needs.  Thanks for your patience.  I am not up to speed fully on code coverage tools other than how they work in a broad spectrum so I might not have fully understood enough to ask the proper questions or even understand your answers.  Take care, Jeremy

      P.S. - I see that all that is needed is the 1 *.ec file now and I can make that happen but I must admit that what I was asking originally was if it were possible to create a report regardless of the presence of any *.ec files.  Although it's not much use to you, it would be useful to my peers.  I fully agree with your statements and it's a shame they don't have unit tests as it is right now.  :P

       
    • Vlad Roubtsov

      Vlad Roubtsov - 2004-08-16

      Amen!

      Just to add one little comment. I referred to multiple *.em and *.ec files previously just to keep things simple. EMMA does not care about file extensions or whether metadata and coverage data are kept in separate files. You should choose whichever bookkeeping mode is most convenient for you. For example, you can keep appending all data to the same file if you want (the default output file mode is "merge").

      Of course, if you always merge new data in, you should have a build target to remove all EMMA output files. This is so that you could reset the project to a clean state whenever desired.

      Regarding your P.S. comment, yes, relaxing the requirement of 1 *.ec file (it's actually less than that: the overall coverage data must not be empty which means at least 1 *class* must have been executed) would be a new feature that I've alluded to a couple of times before. Now that I have confirmed that I understood what you were after I have added it as an official RFE: http://sourceforge.net/tracker/index.php?func=detail&aid=1010410&group_id=108932&atid=651900

       
    • Jeremy Whitlock

      Jeremy Whitlock - 2004-08-17

      Vlad,
           Great ideas.  Just so you know that I'm not crazy though, the functionality that I mentioned throughout this post, to report with or without coverage data, isn't such a crazy idea.  Clover has the ability to turn on or off this feature as well.  I would love to join the development team and help you out with this and the project as a whole if you'd be willing to let me.  Just let me know what you think.  As for the RFE description, looks good to me.  Take care, Jeremy

       
    • Vlad Roubtsov

      Vlad Roubtsov - 2004-08-17

      Jeremy,

      I never thought the idea was crazy. So far I am getting new feature requests at a faster rate than I can implement them (and Dilum is busy with EMMA Maven plugin in his spare time, of which he, like myself, doesn't have much). So, I have to prioritize based on my own understanding of difficulty/usefulness of feature requests. This particular feature was only requested by you so far (so it may not be super-needed perhaps) but at the same time it's easy to implement (which is good).

      Thanks for the offer to join the team. I am currently thinking of the best ways to accept donations and any advice would be helpful. I've noticed you are a member of a few SF projects so maybe you can help me figure out a good process here.

      The things I am struggling with right now are:

      - not all EMMA users want to be EMMA developers. Some of them just want a simple bugfix or a small feature here and there. How to best accept their contributions without giving them full CVS access? Patches seem to be the standard technique here.

      - people who want to have longer and more serious involvement need full CVS access. What should be the process for promoting a person into such status?

      - I don't think patches work well for substantial changes to the codebase. I think it is necessary to create CVS feature branches and then do transactional merges into the current work CVS branch. I am not sure how other SF projects handle this.

      At the moment I am writing up the "dev" part of a new EMMA web site, so any data on what works and what doesn't would be great. SF is great, but I admit some of its facilities are not as good as commercial tools. For example, CVS has a very "legacy" feel to it and SF Tracker is somewhat sub-par compared to what I am used to.

       
    • Jeremy Whitlock

      Jeremy Whitlock - 2004-08-18

      Vlad,
           I understand your skepticism of accepting just any user to be a developer.  I have a greater need in being a developer on your team as I think that I am going to implement EMMA in my company's continuous integration suite.  Here are my suggestions for your questions:

      Patches work great as long as they are well written.  Not much of a better suggestion.

      Trust comes to mind in CVS access and I know it's hard to trust those you don't know.  Try to see if they are already/still on existing projects and try to get them to tell you more about their abilities and why they'd want to join.  Remember though...CVS can be backed up and such so that no one person could screw it up.

      Branches are a big part of applications where multiple versions are ongoing at the same time.  Good idea.

      Shoot me an email if you have further questions or want me to elaborate.  Take care, Jeremy

       
    • Jeremy Whitlock

      Jeremy Whitlock - 2004-08-20

      Vlad,
           Just to give you an example of the feature I seek, here is a link:

      http://geronimo.apache.org/modules/axis/clover/

      Basically, you get the same report as if you had code coverage BUT obviously you have 0% covered and your drilled down code would still be highlighted red.  I have time to begin helping you with this as I have experience developing with ant and java so just give me a holler if you would like my help.  Also, this feature should be capable of being turned on and off.  Laters, Jeremy

      P.S. - I will post the same message in the RFE.

       
    • Nobody/Anonymous

      yes, I have been thinking about replacements for CVS for a long time.  I started a project which is becoming vaporware (with my next child on the way).  I think the community badly needs an SCM that works well with open source.

      Ideally, everyone should be able to submit changes on a branch as you say, and the owner can then approve with or without editing the change and push it into the mainline easily.

      Also, the SCM should be more compatible with what gump is trying to do, except it doesn't force upgrade you to the latest and greatest, it only asks you if you are ready to upgrade to the latest and greatest.  Long story short, I feel your pain!!!!

      It is too bad I don't have more time.  I am starting to talk to the company I work for to get money to do the project (we will see how that goes).
      dean

       
