#31 Setting start and end dates / make "-d" logs work

v0.2-dev
closed-out-of-date
5
2006-12-05
2003-03-22
No

Judging from the trackers, many people want to generate
StatCvs statistics for a limited date range, for example
starting at a certain date. Lukasz has suggested using
the "cvs log -d DATES" option to accomplish this. "-d"
logs will parse fine now, but won't give you the expected
results.

The reason is that StatCvs assumes that there is always
at least one revision for every file. With "-d" logs, that is
not necessarily true. Our whole processing model is based
on revisions, and files without revisions don't fit in there.
Right now, the parser simply discards them.

It is impossible (by design of CVS) to make it work for
date ranges with only an end date or both start and end
dates, because we can't calculate an accurate LOC for
them (same problem as with deleted files).

It *is* possible to make it work for date ranges with
*only a start date*.

This problem is noted as a limitation in userguide.txt.
If it is fixed, this note must be removed.

Discussion

  • Chad Woolley

    Chad Woolley - 2003-12-08

    Logged In: YES
    user_id=447346

    Is there any idea how difficult this would be? I haven't looked
    at the StatCVS codebase, but I'd be willing to try to tackle
    this if it wasn't too difficult and I could find the time...

    Has anyone started working on it?

     
  • Richard Cyganiak

    Logged In: YES
    user_id=584620

    This should work now in the CVS version. Generate a log like this:

    cvs log -d "2002-01-01<2002-12-31"

    Both the start and end date are optional. Contrary to what I said
    above, this *does* work with an end date, but you have to check
    out the module with the end date:

    cvs -d ... checkout -D 2002-12-31
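
A minimal sketch of how the "-d" argument above could be assembled when either date is optional. The helper names `cvs_date_range` and `cvs_log_args` are hypothetical and not part of StatCvs or CVS; the sketch only builds the argument list and does not run cvs. The repository root and module name for the matching checkout are left out, as they are in the comment above.

```python
from typing import List, Optional


def cvs_date_range(start: Optional[str] = None, end: Optional[str] = None) -> str:
    """Return the value for `cvs log -d`, e.g. "2002-01-01<2002-12-31".

    Either date may be omitted: "2003-06-01<" means "from this date on",
    "<2002-12-31" means "up to this date".
    """
    if start is None and end is None:
        raise ValueError("give at least a start date or an end date")
    return f"{start or ''}<{end or ''}"


def cvs_log_args(start: Optional[str] = None, end: Optional[str] = None) -> List[str]:
    """Argument list for the log command (hypothetical helper, not a StatCvs API)."""
    return ["cvs", "log", "-d", cvs_date_range(start, end)]


if __name__ == "__main__":
    print(" ".join(cvs_log_args("2002-01-01", "2002-12-31")))
    print(" ".join(cvs_log_args(start="2003-06-01")))
```

When an end date is used, the working copy still has to be checked out with the same date via "checkout -D", as described above.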

     
  • Richard Cyganiak

    • labels: --> Configuration options
    • milestone: --> v0.2-dev
    • assigned_to: nobody --> cyganiak
     
  • Chad Woolley

    Chad Woolley - 2003-12-12

    Logged In: YES
    user_id=447346

    Hi,

    I just tried this with the latest from the head. My log
    format was as follows:

    cvs log -N -d "2003-05-04<2010-12-31" > cvslog.log

    I got this exception:

    D:\apps\FoxServ\www\statcvs\stats>d:\j2sdk1.4\bin\java -jar
    ..\statcvs.jar cvslog.log .
    StatCvs - CVS statistics generation

    Exception in thread "main" java.lang.NullPointerException
    at
    net.sf.statcvs.input.CvsFileBlockParser.parseLocksAndAccessList(CvsFileBlockParser.java:130)
    at
    net.sf.statcvs.input.CvsFileBlockParser.parse(CvsFileBlockParser.java:68)
    at
    net.sf.statcvs.input.CvsLogfileParser.parse(CvsLogfileParser.java:75)
    at
    net.sf.statcvs.Main.generateDefaultHTMLSuite(Main.java:184)
    at net.sf.statcvs.Main.main(Main.java:76)

    I can provide more info if you want. Thank you for working
    on this.

    Thanks,
    Chad

     
  • Richard Cyganiak

    Logged In: YES
    user_id=584620

    Chad,

    > cvs log -N -d "2003-05-04<2010-12-31"

    please try again without the -N switch.

     
  • Chad Woolley

    Chad Woolley - 2003-12-12

    Logged In: YES
    user_id=447346

    OK, the -N was causing the exception. Now statcvs generated
    a report.

    Thanks!

     
  • Chad Woolley

    Chad Woolley - 2003-12-12

    Logged In: YES
    user_id=447346

    Hi,

    OK, I got a report generated with a date-ranged log, but it
    still doesn't seem right.

    I originally checked in all the files (several hundred) to
    our repository on 2003-05-05. I want to exclude these
    checkins from the statistics.

    I made a log with this command:

    cvs log -d "2003-05-06<2010-12-31" > cvslog.log

    The header of the page now shows: "Summary Period:
    2003-05-06 to 2003-12-12". However, I know that all of my
    checkins on 2003-05-05 are still being included, by the
    lines of code and other reports.

    Is this working as designed? Am I able to exclude these
    initial checkins from the statistics?

     
  • Richard Cyganiak

    Logged In: YES
    user_id=584620

    Chad,

    the LOC chart on the report front page is supposed to look like
    before, only cut off at the start and end date. That is, in your
    case it would be almost the same, since you only cut off a
    single day.

    The reasoning behind this is that at the start of the report,
    there were, say, 1000 lines in the repository, so we let the
    chart start at 1000 lines. Anything else would be confusing
    IMHO. If the chart started at 0 lines instead of 1000, and the
    first thing you do is delete 100 lines, then the chart would go
    negative, and a negative LOC count doesn't make much sense.

    The LOC per author chart on the authors page, however,
    should look different. It represents *cumulative contributed
    lines of code* per author, a number that can only rise over
    time, and therefore it's ok to let it start at 0. In your case, the
    big jump on day 0 of the project should have disappeared from
    this chart.

    Does that make sense? Do you think that a different
    representation would be more sensible?

    Richard
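
The two chart baselines described in the comment above can be sketched roughly as follows. This is an illustration only, not StatCvs code: the `Revision` tuple shape and both function names are assumptions, and counting only added lines for the per-author chart is one possible reading of "cumulative contributed lines of code". The total LOC series is anchored on the line count of the checked-out files and walked back through the revisions in the log, so it starts at the repository size on the start date; the per-author series starts at 0.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (date, author, lines_added, lines_removed): hypothetical record shape
Revision = Tuple[str, str, int, int]


def total_loc_series(final_loc: int, revisions: List[Revision]) -> List[Tuple[str, int]]:
    """Total LOC over time, starting at the LOC on the start date."""
    ordered = sorted(revisions)  # ISO dates sort chronologically
    net_change = sum(added - removed for _, _, added, removed in ordered)
    loc = final_loc - net_change  # LOC just before the first revision in range
    series = [("start", loc)]
    for date, _, added, removed in ordered:
        loc += added - removed
        series.append((date, loc))
    return series


def author_loc_series(revisions: List[Revision]) -> Dict[str, List[Tuple[str, int]]]:
    """Cumulative added lines per author, starting at 0 (assumption: adds only)."""
    totals: Dict[str, int] = defaultdict(int)
    series: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for date, author, added, _ in sorted(revisions):
        totals[author] += added
        series[author].append((date, totals[author]))
    return dict(series)


if __name__ == "__main__":
    revs = [("2003-06-02", "alice", 0, 100), ("2003-06-10", "bob", 50, 0)]
    print(total_loc_series(950, revs))
    print(author_loc_series(revs))
```

With the example in the comment (1000 lines at the start, 100 lines deleted first), the total series begins at 1000 and drops to 900 rather than going negative.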

     
  • Richard Cyganiak

    • summary: make "-d" logs work --> Setting start and end dates / make "-d" logs work
     
  • Chad Woolley

    Chad Woolley - 2003-12-12

    Logged In: YES
    user_id=447346

    I think some of the charts are showing up right. It's the
    "changes" and "lines of code" per author on the "Authors"
    page that don't seem right.

    I checked everything into the repository (hundreds of files)
    initially, but I don't want those initial checkins
    to show up in my "changes" and "lines of code" counts.
    However, they are still getting counted, because my stats
    are still the same as they were before (much higher than all
    the other developers).

    Is this the expected functionality?

     
  • Richard Cyganiak

    Logged In: YES
    user_id=584620

    Sounds like there is still some problem. I'm looking into it.

    Could you please try to regenerate with a start date a couple
    of weeks later? Like this:

    cvs log -d "2003-06-01<" > cvslog.log

    If the charts remain broken, could you please attach the
    zipped logfile, or mail it to me at richard@cyganiak.de ? Thanks
    a lot!

     
  • Chad Woolley

    Chad Woolley - 2003-12-12

    Logged In: YES
    user_id=447346

    I have a file (cvslog.zip) which contains the log from 2003-05-12
    onward. This is one week after the initial import
    (2003-05-05).

    I can't figure out how to attach a file to a bug, so I'll
    mail it to you.

     
  • Oren Gampel

    Oren Gampel - 2004-02-11

    Logged In: YES
    user_id=972372

    The summary states:
    It *is* possible to make it work for date ranges with
    *only a start date*.
    Can you please elaborate how?
    I still get:
    line 45: expected 'RCS file: ' but found '========
    ...
    When using a start date. (on version 0.2-dev)

     
  • Richard Cyganiak

    Logged In: YES
    user_id=584620

    v0.2 will support logfiles with start and end dates. Here is how
    it works (for me):

    http://statcvs.sourceforge.net/manual/#section_limiting-dates

    I'm not sure why it doesn't work for you. What CVS command
    did you use to create the log? Can you attach the first 100 or
    so lines of the log, or mail them to me at richard@cyganiak.de
    ? Thanks, Richard

     
  • Nobody/Anonymous

    Logged In: NO

    Command:
    cvs.exe log -d ">Feb 8" > \Projects\log\cvslog

    versions:
    [C:\Projects\log]java -jar statcvs.jar -version
    StatCvs - CVS statistics generation

    Version v0.2-dev

    [C:\Projects\log]"\Program Files\TortoiseCVS\cvs.exe" -version

    Concurrent Versions System (CVSNT) 2.0.4 (client/server)

    I've sent an email with the log attached.

     
  • Richard Cyganiak

    • status: open --> closed-out-of-date
     
  • Richard Cyganiak

    Logged In: YES
    user_id=584620
    Originator: YES

    No complaints for two years, seems to work for most people now.