I've written a short utility that processes the diff hunks of the ChangeLog file, searching backward for the Date/Name/Address to which each commit should be attributed. The build command is in the file "build". Once compiled, try something like
git log -p --unified=50 ChangeLog > ChangeLog.diff
git_changelog_author ChangeLog.diff > author.txt
The author.txt file will be a condensed list that someone can use to post-process the authorship in the git repository in an automated way.
The logic can be deciphered from the code, but if you want an explanation, just ask. I will emphasize again that this only gleans information from diff hunks (i.e., changes) in the ChangeLog file itself. Commits that did not change the ChangeLog will not be detected, so their authorship simply falls back to the maintainer who made the commit.
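For readers who would rather not dig through the source, here is a minimal sketch of the backward search described above. It is not the attached utility's code. It assumes GNU-style header lines of the form "YYYY-MM-DD  Name  <address>", and the names is_changelog_header and find_attribution are hypothetical.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Assumption: a ChangeLog header looks like "YYYY-MM-DD  Name  <address>".
// Older entries in a real ChangeLog may use other date formats.
bool is_changelog_header(const std::string& text) {
    static const std::regex header(R"(^\d{4}-\d{2}-\d{2}\s+\S.*<[^<>]+>\s*$)");
    return std::regex_search(text, header);
}

// Given the body lines of one diff hunk (each starting with ' ', '+' or '-'),
// return the header that the first added line falls under, or an empty
// string if the hunk has no added line or no header is visible.
std::string find_attribution(const std::vector<std::string>& hunk) {
    // Locate the first added line in the hunk.
    std::size_t first_added = hunk.size();
    for (std::size_t i = 0; i < hunk.size(); ++i) {
        if (!hunk[i].empty() && hunk[i][0] == '+') {
            first_added = i;
            break;
        }
    }
    if (first_added == hunk.size()) return "";

    // Walk backward from that line (inclusive). Removed lines are skipped
    // because they are no longer part of the file after the commit.
    for (std::size_t i = first_added + 1; i-- > 0;) {
        const std::string& line = hunk[i];
        if (line.empty() || line[0] == '-') continue;
        const std::string text = line.substr(1);  // strip the diff marker
        if (is_changelog_header(text)) return text;
    }
    return "";
}
```

The large context requested with --unified=50 is what makes this search workable: the header that governs an added entry is usually within the context lines of the same hunk.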
It's been requested that the gitID be modified from the SHA to the following format:
This can be done without a change in the code. Try the following sequence of commands instead:
This seems to work nicely, distinguishing the committer from the author.
I'm attaching a second version of the utility, version 0.2 beta. This version allows a second input file with a git-log containing the list of files that have changed. If provided, the utility will discard any entries in which ChangeLog was the only file that changed in the changeset.
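The filtering rule can be sketched roughly as follows. This is only an illustration under the assumption that the second input has already been parsed into a per-commit list of changed files. The Entry struct and the function names are hypothetical, not taken from the utility.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical record for one line of the author list.
struct Entry {
    std::string commit_id;
    std::string author;
};

// True when the changeset touched ChangeLog and nothing else.
bool touches_only_changelog(const std::vector<std::string>& files) {
    return files.size() == 1 && files[0] == "ChangeLog";
}

// Drop entries whose commit changed only the ChangeLog file.
// Entries for commits missing from the map are kept, since there is
// no evidence for discarding them.
std::vector<Entry> drop_changelog_only(
    const std::vector<Entry>& entries,
    const std::map<std::string, std::vector<std::string>>& files_by_commit) {
    std::vector<Entry> kept;
    for (const Entry& e : entries) {
        const auto it = files_by_commit.find(e.commit_id);
        if (it != files_by_commit.end() && touches_only_changelog(it->second)) {
            continue;  // ChangeLog-only changeset: discard
        }
        kept.push_back(e);
    }
    return kept;
}
```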
Example usage:
Another tiny modification, but I think it has no ramifications on the final list. Version 0.3 beta.
Here's a new version, 0.4 beta, that does not stop at the first added line in the ChangeLog, but instead looks a little further for the first item in the addition block. An item is any line containing a star or a colon. The idea is to handle situations where the first added line is blank or similar, while the author information actually appears a line or two later in the addition hunk.
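As a rough sketch of that item test (again an illustration, not the code in the attachment, with hypothetical function names), the scan over the first addition block might look like this:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// An "item" here is any line containing a star or a colon.
bool looks_like_item(const std::string& text) {
    return text.find_first_of("*:") != std::string::npos;
}

// Return the index of the first item line inside the first block of
// consecutive added ('+') lines in a hunk, or hunk.size() if that block
// contains no item.
std::size_t first_item_in_addition(const std::vector<std::string>& hunk) {
    std::size_t i = 0;
    // Skip ahead to the first added line.
    while (i < hunk.size() && (hunk[i].empty() || hunk[i][0] != '+')) ++i;
    // Scan the contiguous addition block for the first item.
    for (; i < hunk.size() && !hunk[i].empty() && hunk[i][0] == '+'; ++i) {
        if (looks_like_item(hunk[i].substr(1))) return i;
    }
    return hunk.size();
}
```

Anchoring the backward search at the returned index rather than at the first '+' line means a leading blank addition no longer hides a header that was added just below it.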
Here are four cases that are corrected by this new version:
I've added some supporting script files and C++ source files that will convert the gnuplot CVS repository using cvs2git, git, g++ and sed. I've placed all the files in their own git repository, but the actual files are also present in the archive, so just run
tar -xzf git_translation.tar.gz
and look through the README for instructions. The only enhancement to the cvs2git result is the use of sed to put proper committer information into the cvs2git dump file and to add author information to it. Otherwise, all tags in the new git repository remain as they are in the CVS repository. Tags can be altered after the repository is created if so desired. One can also create a branch at the base and add pre-CVS versions from tar archive files.