CVS to Git SF repository

Earnie Boyd

Converting from Sourceforge CVS to Sourceforge GIT code repositories

I thought I would document what I did to achieve the upgrade of the MinGW CVS repositories to Git repositories. I was using a Windows 7 laptop and had to set up the required tools. I used Cygwin mainly because I needed rsync and python. However, there is no reason you shouldn't be able to use a Linux client system to do the same thing.

To set up Cygwin, execute Cygwin's setup.exe and choose the cvs, svn, git, python, ssh and rsync packages for installation. This gave me an environment in which to work. You may need to run rebaseall from Cygwin's ash shell; all other Cygwin processes must be stopped first.

Next start your Cygwin shell, create a directory to work in (I'll call it ~/cvs2git) and change directory to it. Then rsync your project's CVS repository to a directory I'll call cvscopy.

mkdir cvscopy
rsync -av rsync://PROJECTNAME.cvs.sourceforge.net/cvsroot/PROJECTNAME/* cvscopy
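
Before going further it may be worth a quick check that the copy really contains repository files. The sketch below assumes the directory name cvscopy from above; the function names count_rcs_files and check_cvs_copy are mine and only for illustration. It counts the RCS history files (names ending in ,v) and reports whether a CVSROOT directory came along.

```shell
#!/bin/sh
# Hypothetical helper: count RCS history files (*,v) under a directory.
# A CVS repository copy should contain many; a working copy contains none.
count_rcs_files() {
  find "$1" -type f -name '*,v' | wc -l | tr -d ' '
}

# Hypothetical helper: report on a copied repository directory.
check_cvs_copy() {
  dir=$1
  n=$(count_rcs_files "$dir")
  if [ "$n" -eq 0 ]
  then
    echo "$dir: no ,v files found, this does not look like a CVS repository"
    return 1
  fi
  echo "$dir: found $n RCS history files"
  if [ ! -d "$dir/CVSROOT" ]
  then
    echo "$dir: note, no CVSROOT directory was copied"
  fi
  return 0
}
```

After the rsync has finished you would call it as check_cvs_copy cvscopy.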

I used an existing Python script to do the actual conversion, so you'll need to grab the following SVN repository. Yes, it is strange that it lives in SVN, but the repository contains several varieties of conversion tools.

svn export --username=guest http://cvs2svn.tigris.org/svn/cvs2svn/trunk cvs2svn-trunk

If your CVS repository is like MinGW's was, it holds several repository directories that you want to separate into different Git repositories. The cvs2git conversion script requires an options file for each one. So I created a script that uses sed to modify a cvs2git.options.in template and redirects the result to cvs2git.options. First get a copy of the sample cvs2git.options file.

cp cvs2svn-trunk/cvs2git.options.sample cvs2git.options.in

Next you need to edit cvs2git.options.in to create the template file for the script. Around line 373 you'll find a variable ctx.username which is set to the value cvs2git. Change the value to @PROJECT@ like so.

ctx.username = '@PROJECT@'

The options also specify where the CVS directory for the repository is. Find the call to the method run_options.set_project in the options file, around line 555. We'll replace the test-data/main-cvsrepo string with @CVSDIR@ like so.

# Now set the project to be converted to git.  cvs2git only supports
# single-project conversions, so this method must only be called
# once:
run_options.set_project(
    # The filesystem path to the part of the CVS repository (*not* a
    # CVS working copy) that should be converted.  This may be a
    # subdirectory (i.e., a module) within a larger CVS repository.
    r'@CVSDIR@',
...
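
To see what the placeholders are for, here is a minimal sketch of the kind of sed substitution the conversion script performs later. The helper name fill_template, the two template lines and the values sfproj and cvscopy/sfprojcvs are only examples. The # character is used as the sed delimiter so that the slashes in the path need no escaping.

```shell
#!/bin/sh
# Fill the @PROJECT@ and @CVSDIR@ placeholders of a template read from stdin.
# fill_template is a hypothetical helper name used only for this sketch.
#   $1 = project name, $2 = path to the CVS module directory
fill_template() {
  sed -e "s#@PROJECT@#$1#g" -e "s#@CVSDIR@#$2#g"
}

# Example: two lines in the style of cvs2git.options.in
printf '%s\n' "ctx.username = '@PROJECT@'" "    r'@CVSDIR@'," |
  fill_template sfproj cvscopy/sfprojcvs
```

A value that itself contains a # or an & would need escaping before it is handed to sed this way.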

Now we need to create a map of users to transform the authors from CVS style to Git style. Around line 510 of the options file you'll find a dictionary variable named author_transforms with examples set in it. The key is the CVS user name, and there are two styles you can use for the value: a single string of the form 'Name <email>', or a (name, email) pair. So assuming we have a user named example, we can use any one of the following. I suggest you pick one style for all your entries. With the second style you can specify Unicode strings by adding a u before the name string.

author_transforms = {
    # Three alternative ways to write the same entry; use only one per user.
    'example' : 'Example Person <noreply@users.sourceforge.net>',
    'example' : ('Example Person', 'noreply@users.sourceforge.net'),
    'example' : (u'Example Person', 'noreply@users.sourceforge.net'),

    ...
}
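
To know which user names need an entry, it can help to list the authors recorded in the repository copy. The sketch below relies on the RCS file format, where each revision has a line of the form "date ...; author NAME; state ...;". The function name list_cvs_authors is hypothetical, and a log message that happens to contain a similar line could produce a stray name, so treat the output as a starting point.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct author names found in the
# "date ...; author NAME; state ...;" lines of all *,v files under $1.
# LC_ALL=C keeps sed from tripping over non-UTF-8 bytes in old files.
list_cvs_authors() {
  LC_ALL=C find "$1" -type f -name '*,v' -exec \
    sed -n 's/^date[[:space:]].*author[[:space:]]\{1,\}\([^;]*\);.*/\1/p' {} + |
    LC_ALL=C sort -u
}
```

Running list_cvs_authors cvscopy would then print one user name per line, ready to be turned into author_transforms entries.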

There is also a 'cvs2git' entry in this dictionary, which maps the ctx.username value to an author. Since we changed ctx.username, you should change this entry as well, as follows.

author_transforms = {
    ...

    # This one will be used for commits for which CVS doesn't record
    # the original author, as explained above.
    '@PROJECT@' : '@PROJECT@ Team Maintenance <@USERNAME@@users.sourceforge.net>',
    }

Now you have your cvs2git.options.in template complete. We now need to create a repository description template named description.in. This needs to be a one line description and is used to describe the repository. I chose to simply state the repository name like so.

Repository: @GITREPO@

You could also set an external variable PROJDESC and create the description.in like so.

@PROJDESC@

Or maybe like so.

@GITREPO@ - @PROJDESC@
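
As a small illustration of how these description templates get filled in, the sketch below applies the same three substitutions that the conversion script applies to description.in. The helper name render_description and the values are examples only.

```shell
#!/bin/sh
# Render a description template read from stdin, using the same three
# substitutions the conversion script applies to description.in.
# render_description is a hypothetical helper name.
render_description() {
  sed -e "s#@PROJECT@#$PROJECT#g" \
      -e "s#@PROJDESC@#$PROJDESC#g" \
      -e "s#@GITREPO@#$GITREPO#g"
}

# Example values, not from a real project.
PROJECT=sfproj
GITREPO=sfprojgit
PROJDESC='Example project sources'

echo '@GITREPO@ - @PROJDESC@' | render_description
```

Because the conversion script runs as a separate process, PROJDESC has to be in its environment, for example by exporting it or by writing PROJDESC='Example project sources' in front of the ./cvs2git.sh command.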

Now you have your description.in template complete. We now need to create a config.in template which configures our repository options. Make sure this file is in the correct format or you will not be able to use your repository. Here is what I used for my config.in file.

[core]
        repositoryformatversion = 0
        filemode = true
        bare = true
        sharedrepository = 2
[receive]
        denyNonFastforwards = true
[hooks]
        mailinglist = @MAILLIST@
        showrev = "t=%s; printf 'http://@PROJECT@.git.sourceforge.net/git/gitweb.cgi?p=@PROJECT@/@GITREPO@;a=commitdiff;h=%%s' ; echo;echo; git show -C ; echo"
        emailprefix = [git push @GITREPO@]
        emailmaxlines = 2000
        envelopesender = noreply@sourceforge.net
[gitweb]
        owner = Project: @PROJECT@
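
One way to catch a formatting mistake early is to let git parse the file. This is a sketch assuming git is on the PATH; the function name check_git_config is hypothetical. It uses git config --file, which exits with a non-zero status when the file cannot be parsed. The @...@ placeholders are ordinary values as far as the parser is concerned, so the template can be checked before substitution.

```shell
#!/bin/sh
# Hypothetical helper: succeed if git can parse the given config file.
check_git_config() {
  git config --file "$1" --list >/dev/null 2>&1
}

if [ -f config.in ]
then
  if check_git_config config.in
  then
    echo "config.in parses cleanly"
  else
    echo "config.in has a syntax error"
  fi
fi
```

To look at a single setting you could also run git config --file config.in --get core.bare.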

You can optionally add a pre-receive and post-receive file. If they exist in the working directory they will be copied to the hooks directory of the repository. Assuming we have three administrators with user names of example1, example2 and example3 I used the following pre-receive file.

#!/bin/sh
case $USER in
example1 | example2 | example3)
    exit 0 ;;
esac

echo You are not authorized to push to the repository.
exit 1
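
If you want to try the decision logic of this hook without pushing anything, the same case statement can be wrapped in a function. This is only a local sketch; is_authorized is a hypothetical name and example1 to example3 are the placeholder user names from above.

```shell
#!/bin/sh
# Same decision as the pre-receive hook, wrapped in a function so it can
# be tried locally. Returns 0 for an allowed user name, 1 otherwise.
is_authorized() {
  case $1 in
  example1 | example2 | example3)
    return 0 ;;
  esac
  return 1
}

for u in example1 someoneelse
do
  if is_authorized "$u"
  then
    echo "$u: push allowed"
  else
    echo "$u: push rejected"
  fi
done
```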

I used the following post-receive file, which I copied from the default Sourceforge /home/scm_git/PROJECTNAME/PROJECTNAME/hooks/post-receive.sample file, uncommenting the line that runs the post-receive-email script.

#!/bin/sh
#
# An example hook script for the "post-receive" event.
#
# The "post-receive" script is run after receive-pack has accepted a pack
# and the repository has been updated.  It is passed arguments in through
# stdin in the form
#  <oldrev> <newrev> <refname>
# For example:
#  aa453216d1b3e49e7f6f98441fa56946ddcd6a20 68f7abf4e6f922807889f52bc043ecd31b79f814 refs/heads/master
#
# see contrib/hooks/ for a sample, or uncomment the next line and
# rename the file to "post-receive".

. /usr/share/git-core/contrib/hooks/post-receive-email

This ends the preparation for using the script. The rest of this document describes the use of the script I created. I hope you find it useful.


The next step is to create a script to use for the conversion process. I named it cvs2git.sh. I will add it as an attachment for you to download, but I also give it here in case you need to copy and paste it. As an example, let's assume the Sourceforge project name is sfproj, the administrator of this project is sfprojadmin, the Git repository name will be sfprojgit and the CVS repository path from our rsync command is cvscopy/sfprojcvs. We don't want a notification mailing list for this example. We simply do the following.

./cvs2git.sh sfproj sfprojadmin sfprojgit cvscopy/sfprojcvs
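
With several module directories to convert, the invocation can be generated in a loop. The sketch below is a dry run that only prints the commands. It assumes every top-level directory of the copy except CVSROOT becomes its own Git repository named after the module, which may not match how you want to name yours; plan_conversions is a hypothetical name.

```shell
#!/bin/sh
# Print one cvs2git.sh invocation per module directory found in the CVS copy.
# Assumption: every top-level directory except CVSROOT becomes its own Git
# repository, named after the module. Nothing is executed, only printed.
#   $1 = project name, $2 = SourceForge user, $3 = directory of the CVS copy
plan_conversions() {
  project=$1
  user=$2
  cvscopy=$3
  for dir in "$cvscopy"/*/
  do
    if [ ! -d "$dir" ]
    then
      continue
    fi
    module=$(basename "$dir")
    if [ "$module" = CVSROOT ]
    then
      continue
    fi
    echo "./cvs2git.sh $project $user $module $cvscopy/$module"
  done
}

plan_conversions sfproj sfprojadmin cvscopy
```

Once the printed list looks right, the lines can be run one at a time so that each conversion can be reviewed before the next one starts.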

This will create the new Git repository sfprojgit, configured for sfproj, and then secure copy the result to /home/scm_git/sfproj/. The repository is built locally first and then uploaded. The shell.sourceforge.net server will ask the sfprojadmin user for their password or passphrase to create the shell instance, and then again for the copy to the server. You can press ^C (CTRL-C) at the first password prompt instead if you want to review the work first. If there were errors, they would have occurred during the execution of the cvs2svn-trunk/cvs2git script and been displayed on the terminal. I had one repository where the Attic directory contained files with the same names as files in the active directory, and the process aborted. I simply removed those files from the Attic directory. Once the repository is copied to your project you can verify that it looks correct by browsing the web interface.
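
The Attic problem can be looked for before running the conversion. The sketch below lists every ,v file that exists both in an Attic directory and in the directory above it. The function name find_attic_conflicts is hypothetical, and file names containing newlines are not handled.

```shell
#!/bin/sh
# Hypothetical helper: list RCS files that exist both in an Attic directory
# and in its parent directory under the tree rooted at $1.
find_attic_conflicts() {
  find "$1" -type f -path '*/Attic/*,v' | while IFS= read -r attic
  do
    parent=$(dirname "$(dirname "$attic")")
    name=$(basename "$attic")
    if [ -f "$parent/$name" ]
    then
      echo "$attic"
    fi
  done
}
```

For the example above you would run find_attic_conflicts cvscopy/sfprojcvs and review the listed files before deciding whether to remove them from your copy.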

#!/bin/sh
# This file is the work of Earnie Boyd <earnie~at~users.sourceforge.net>
# You are free to use this script for your own benefit including modifying it
# as you see fit.  However the following disclaimer applies to this work.
#
# This script is distributed in the hope that it will be useful but WITHOUT ANY
# WARRANTY.  ALL WARRANTIES, EXPRESSED OR IMPLIED ARE HEREBY DISCLAIMED.  This
# includes but is not limited to warranties of MERCHANTABILITY or FITNESS FOR A
# PARTICULAR PURPOSE.

# Use:
# ./cvs2git.sh SF-PROJECTNAME SF-USERNAME SF-GITREPO-NAME CVSDIR-PATH [SF-NOTIFICATION-MAILLIST]

PROJECT=$1
USER=$2
GITREPO=$3
CVSDIR=$4
if [ ! -z "$5" ]
then
  MAILLIST=$5
fi
TMPDIR=${GITREPO}-tmp

# Quote the path so that a missing argument is caught by this test as well.
if [ ! -d "$CVSDIR" ]
then
  echo "$CVSDIR is not a directory"
  exit 1
fi

mkdir -p ${GITREPO}-work

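# Fill in all placeholders used in the options template, including @PROJECT@.
sed -e "s#@CVSDIR@#${CVSDIR}#g" -e "s#@USERNAME@#${USER}#g" -e "s#@PROJECT@#${PROJECT}#g" cvs2git.options.in > cvs2git.options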

./cvs2svn-trunk/cvs2git --options=cvs2git.options --fallback-encoding utf-8
mv cvs2svn-tmp $TMPDIR

git --git-dir=$GITREPO init --shared=all --bare
cd $GITREPO
if [ -f ../config.in ]
then
  sed -e "s#@GITREPO@#${GITREPO}#g" -e "s#@MAILLIST@#${MAILLIST}#g" -e "s#@PROJECT@#${PROJECT}#g" ../config.in > config
else
  if [ ! -z "$MAILLIST" ]
  then
    git config hooks.mailinglist ${MAILLIST}
  else
    git config hooks.mailinglist ''
  fi
  git config hooks.showrev "t=%s; printf 'http://${PROJECT}.git.sourceforge.net/git/gitweb.cgi?p=${PROJECT}/${GITREPO};a=commitdiff;h=%%s' ; echo;echo; git show -C ; echo"
  git config hooks.emailprefix "[git push $GITREPO]"
  git config hooks.emailmaxlines 2000
  git config hooks.envelopesender noreply@sourceforge.net
fi
if [ -f ../pre-receive ]
then
  cp ../pre-receive hooks/
  chmod 755 hooks/pre-receive
fi
if [ -f ../post-receive ]
then
  cp ../post-receive hooks/
  chmod 755 hooks/post-receive
fi
if [ -f ../description.in ]
then
  sed -e "s#@PROJECT@#$PROJECT#g" -e "s#@PROJDESC@#$PROJDESC#g" -e "s#@GITREPO@#$GITREPO#g" ../description.in > description
fi

cd ../$GITREPO-work
git init
git remote add origin ../$GITREPO
git config branch.master.remote origin
git config branch.master.merge refs/heads/master
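# Name both files explicitly; brace expansion is not available in every /bin/sh.
cat ../${TMPDIR}/git-blob.dat ../${TMPDIR}/git-dump.dat | git fast-import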
git reset --hard
git push origin master

git config user.name "${PROJECT} maintenance"
git config user.email "${USER}@users.sourceforge.net"
git tag -l | while read ver;
  do git checkout $ver;
  git tag -d $ver;
  GIT_COMMITTER_DATE="$(git show --format=%aD | head -1)" git tag -a $ver -m "prep for $ver release" ;
  done
git checkout master
git push --tags

cd ..
DESTDIR=/home/scm_git/$PROJECT

echo
echo You will need to enter your SourceForge password or passphrase a couple of times.
echo
ssh $USER,$PROJECT@shell.sourceforge.net create
scp -r $GITREPO $USER,$PROJECT@shell.sourceforge.net:$DESTDIR

Note: I tried to add an attachment but it did not work properly.


  • Earnie Boyd
    2012-07-29

    I've uploaded the script, the template files, the pre-receive hook and the post-receive hook to https://sourceforge.net/projects/earnie.u/files/cvs2git.sh/ for your convenience. Good luck converting your repositories.