Thread: [Jamwiki-devel] git migration
From: Thomas K. <th...@ko...> - 2013-03-20 18:11:03
Attachments:
jamwiki_big_files
svn_authors
|
Hi,

I ran an svn-git conversion yesterday and need your feedback on two decisions.

First: I've attached the authors file used to map svn ids to git names and emails. Please (Ryan) review this file and add or remove information as you like.

Second: When trying to push the git repo to a server I killed the process somewhere at 30MB. That is too large. I then listed the largest files in the history:

  git log --all --format=format:%T | xargs -n 1 -d "\\n" git ls-tree -r --long --full-name | cut -d " " -f4- | grep "^ *[[:digit:]]\{6,\}" | sort -u -n -r

The .jar files alone make up more than 40MB. I propose to remove all .jar files from the history. I don't think that anybody plans to build the versions from the pre-maven era. It's nice and important to have the history of the java files from this time, but it's a burden to carry old .jar cruft around.

Additionally I found more files that I propose to delete. They might not even be in the trunk branch:

     Size  Path
  6607408  jamwiki-crawler/tests/over5ktopics.txt
  4805907  doc/categories.txt
  2564118  jamwiki-crawler/tests/over10ktopics.txt
  1219322  webtests/lib/webtest/resources/build/WebTestReport.reference.xml
   656932  rainers/lib-src/acegi-security-1.0.3-src.zip
   495194  doc/AllExtensions.pdf
   403283  jamwiki-analyzer/test-content/Anarchism.orig.pdf
   379304  jamwiki-mysql/logs/single-topic-parse.log
   334704  webtests/results/002_SearchSomeJamwik/WebTestReport.xml
   313772  jamwiki-analyzer/test-content/Anarchism.orig.html
   309257  jamwiki-war/src/main/webapp/FCKeditor/_whatsnew_history.html
   306051  webtests/results/005_SearchSomeJamwik/WebTestReport.xml
   264148  jamwiki-war/src/main/webapp/FCKeditor/editor/js/fckeditorcode_ie.js
   261229  jamwiki-war/src/main/webapp/FCKeditor/editor/js/fckeditorcode_gecko.js
   161434  jamwiki-mysql/logs/single-topic-load.log
   132893  doc/templates.txt
   127503  work/org/apache/jsp/WEB_002dINF/jsp/recent_002dchanges_jsp.java
   100665  ehcache/Ehcache - Web Caching_files/Main.js

I've pushed a git clone with those files removed to:
https://github.com/thkoch2001/jamwiki-wiki-conversion-wip

The current tree has a size of 7.5 MB while the Git repo is 10 MB. Having the whole history locally thus adds an overhead of 2.5 MB.

I've lost the branches during the conversion. But I could do the conversion again with the branches, if you want. However, the only branch from the last year that is not yet merged is "wrh2", with some cleanup commits. Is this branch still needed?

Regards,
Thomas Koch, http://www.koch.ro
|
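The message above does not say which tool was used to drop the files from history. As a minimal sketch, assuming `git filter-branch` with an index filter (one common way to do this, not necessarily what was run here), the removal of all .jar files can be tried out on a throwaway repository. The file names and the `demo` identity are made up for the illustration.

```shell
#!/bin/sh
# Sketch only: strip every *.jar from the history of a scratch repository.
# Assumption: git filter-branch is used; the thread does not name the tool.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the start-up warning of newer git versions

demo=$(mktemp -d)
cd "$demo"
git init -q .
mkdir -p lib src
echo 'class A {}' > src/A.java
printf 'not a real jar\n' > lib/old.jar   # hypothetical bundled dependency
git add .
git commit -q -m "import with bundled jar"
echo 'class B {}' > src/B.java
git add .
git commit -q -m "add B"

# Rewrite every ref; the index filter removes *.jar in all directories of each commit.
git filter-branch -f --prune-empty --index-filter \
    'git rm -q -r --cached --ignore-unmatch -- "*.jar"' -- --all >/dev/null 2>&1

# Count .jar paths still reachable from the rewritten branches.
# (Backup refs under refs/original/ still point at the old commits until deleted.)
jar_count=$(git log --branches --format=%T | xargs -n 1 git ls-tree -r --name-only | grep -c '\.jar$' || true)
commit_count=$(git rev-list --branches | wc -l | tr -d ' ')
echo "jars left on branches: $jar_count, commits kept: $commit_count"
```

The java files keep their full commit history; only the binary blobs disappear from the rewritten commits.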
From: Peter P. <pit...@us...> - 2013-03-20 20:45:21
|
Hi Thomas,

On 20.03.2013 at 19:10, Thomas Koch <th...@ko...> wrote:

> I ran an svn-git conversion yesterday and need your feedback for two
> decisions.
[...]
>
> The .jar files alone make up more than 40MB. I propose to remove all .jar
> files from the history.

*hmmm* Will the SVN repository be kept available online at SourceForge after the migration has finished? If yes, I wouldn't really mind, as the history is kept intact.

> I don't think that anybody plans to build the versions
> from the pre-maven era. It's nice and important to have the history of the
> java files from this time, but it's a burden to carry old .jar cruft around.

True, but in the end an SCM is supposed to enable revision-safe state tracking. We should, somehow, provide the full and uncut history, no matter whether someone wants to use it for production-relevant things right now ... But that's just my 2 cents ...

> Additionally I found more files that I propose to delete. They might not even
> be in the trunk branch:

Maybe I'm pedantic, but I wouldn't do that without compensation. If we decide to "make a cut" and keep revisions "up to xxx" available for separate download, or for checkout from a still-available SVN, I'm fine with this. But if you want to search for "where does XYZ come from" you need a complete data basis. Unless I'm the only one with this wish; in that case I won't fight any deletion decision.

> I've pushed a git clone with those files removed to:
> https://github.com/thkoch2001/jamwiki-wiki-conversion-wip

Thanks.

> I've lost the branches during the conversion. But I could do the conversion
> again with the branches, if you want.

I'd prefer so, unless ... I know, I repeat myself ... we keep a separate history available (and therefore officially "(somehow) start from scratch").

There's http://danielpocock.com/migrating-a-sourceforge-project-from-subversion-to-git
It additionally describes how to keep tags ... which, IMHO, are even more important than branches (especially if they don't even differ significantly, in the sense of unmerged code, from trunk).

If the development roadmap is "Next release will be 2.x" and "We make a kind of 'restart' including SCM housekeeping", "For historical data see [SVN dump as compressed archive in SF files section or still read-only accessible SVN]" ... then from my perspective: let's go ahead!

-- 
Best regards,
Peter

PS: Never mind the last commit to branch "pitpalme" if the decision is to cut history. I'll easily be able to re-apply it to a new repository!
|
From: Ryan H. <rya...@gm...> - 2013-03-21 05:35:50
|
On 3/20/2013 1:45 PM, Peter Palmreuther wrote:

> True, but in the end a SCM is supposed to enable revision safe state
> tracking. We should, somehow, provide the full and uncut history; No
> matter if someone right now wants to use it for production relevant
> things ... But that's just my 2 cents ...

I'd agree with Peter that ideally we would not lose history when changing source control systems, including the release branches. It's valuable to be able to trace the development of a file and review commit history to determine why things were done the way that they were done.

I can delete most of the "personal" branches if that helps reduce the repository size, and I can provide file access if there is a fast way to download the entire SVN repository and convert it to Git. Alternately, if there are other options for simplifying this process let me know how I can help.

In the end, if we have to lose some history it's not the worst thing in the world, but that's a trade-off that would be good to avoid if possible.

Ryan
|
From: Thomas K. <th...@ko...> - 2013-03-21 15:39:06
|
Hi Ryan, Peter,

thank you for your feedback. I also want to keep all history of all important files. Since I work with Git I often look in the log to see the commit that introduced some line of code. See also this recent video on this topic:
http://devblog.avdi.org/2012/06/22/use-revision-control-annotation-in-your-editor/

So I care a lot about the full history of all java and other code files.

But I'd strongly propose to remove all .jar files from the history. They don't help to understand the code and would only be needed in the very unlikely case that somebody wanted to build a jamwiki version from 2007. And even for this case we could keep the SVN and take the jars from there.

Keeping the .jar files in the history would blow up the repository for everybody for the rest of the life of the jamwiki project without any benefit. Working with the repository only becomes slower.

I've pushed a new version with all tags and branches to
https://github.com/thkoch2001/jamwiki-wiki-conversion-wip2

I've only filtered the .jar files in this version. The history of the master branch is around 10MB. The additional history of the other branches is around 9MB. Most of the other files that I deleted in the last version are in the dfisla branch.

I propose to review which branches we want to keep and to put the rest in some dark hidden corner of the internet.

You can do selective clones to get only one branch with the --branch and --single-branch options of git clone.

Regards,
Thomas Koch, http://www.koch.ro
|
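The selective clone mentioned at the end of the message can be tried without network access. This is a small sketch against a local scratch repository; the file names and the `demo` identity are invented, and the branch name `dfisla` is borrowed from the message only as a label for a branch carrying extra data.

```shell
#!/bin/sh
# Sketch: clone only one branch with --branch and --single-branch.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q "$work/origin"
cd "$work/origin"
git symbolic-ref HEAD refs/heads/master   # fix the branch name regardless of local defaults
echo trunk > README
git add README
git commit -q -m "trunk commit"
git checkout -q -b dfisla                 # side branch with data we do not want to fetch
echo big > big-test-data.txt
git add big-test-data.txt
git commit -q -m "branch-only data"
git checkout -q master

cd "$work"
# file:// forces a real fetch, so only objects reachable from master are transferred
git clone -q --branch master --single-branch "file://$work/origin" only-master

branches=$(git -C only-master branch -r | grep -v -e '->' | sed 's/^ *//')
echo "remote branches in the clone: $branches"
```

The clone tracks only `origin/master`; the other branch can still be fetched later if it turns out to be needed.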
From: Ryan H. <rya...@gm...> - 2013-03-22 05:53:19
|
Hi Thomas,

> Keeping the .jar files in the history would blow up the repository for
> everybody for the rest of the life of the jamwiki project without any benefit.
> Working with the repository only becomes slower.

Losing the JAR files wouldn't be a huge problem, given the trade-off of repository size. Old JAMWiki WAR files are available for download, SVN will probably still be available, so losing JARs from old source code history seems like an acceptable trade-off.

> I propose to review which branches we want to keep and to put the rest in some
> dark hidden corner of the internet.

With regards to branches, as long as the release branches are kept (branches/0.9.x, etc) and the release tags can be maintained then I don't think anything else is needed. Peter - if you feel differently please say so.

Thanks for your work on this. Again, let me know if there is anything specific that I can do to help out.

Ryan
|
From: Peter P. <pit...@us...> - 2013-03-22 06:43:46
|
Hi Thomas, hi Ryan,

On 22.03.2013 at 06:52, Ryan Holliday <rya...@gm...> wrote:

> Hi Thomas,
>
>> Keeping the .jar files in the history would blow up the repository for
>> everybody for the rest of the life of the jamwiki project without any benefit.
>> Working with the repository only becomes slower.
>
> Losing the JAR files wouldn't be a huge problem, given the trade-off of
> repository size. Old JAMWiki WAR files are available for download, SVN
> will probably still be available, so losing JARs from old source code
> history seems like an acceptable trade-off.

I agree. If SVN continues to be available there's no gain in transferring superfluous JAR files (that should never have made it into the SOURCE code management!) to git, only to blow up its size. I was only concerned about them getting lost completely, which would make it impossible to re-build an older revision. That trade-off is OK with me, as it's not the job of a new SCM to provide the easiest way to re-build a historic version.

>> I propose to review which branches we want to keep and to put the rest in some
>> dark hidden corner of the internet.
>
> With regards to branches, as long as the release branches are kept
> (branches/0.9.x, etc) and the release tags can be maintained then I
> don't think anything else is needed. Peter - if you feel differently
> please say so.

I don't. SVN doesn't do branch tracking the way Git does, so there's no gain. All merges belong solely to the destination trunk, so the commit log entry saying where a merge originates from provides all the information. Any further technical tracking of the original source is "manual work" anyway, so it can be done in SVN as well.

I don't even mind about "my" personal branch; there's no problem in recreating it.

All I see as "necessary" is, besides "trunk" (or "master"), the release branches and tags. Everything else is an add-on :-)

> Thanks for your work on this.

Let me join this chorus! Thanks for doing the job; I wouldn't have been able to these days, due to workload. A little coding "by the way": yes. A focus-requiring and time-consuming SCM migration: sadly not right now :-)

-- 
Best regards,
Peter
|
From: Thomas K. <th...@ko...> - 2013-03-22 06:35:42
|
Hi Ryan, the release branches and tags are available. I propose that you enable a Git repo on sourceforge and push the branches that you want to keep to that repo. I've not yet administrated a sourceforge project so I don't know how this works. But I've seen projects that had both an SVN and a Git repo. Regards, Thomas Koch, http://www.koch.ro |
From: Ryan H. <rya...@gm...> - 2013-03-24 17:00:55
|
Hi Thomas, Peter,

I've enabled Git on Sourceforge and provided Git access to all JAMWiki developers, but haven't yet imported anything. Instructions for doing so can be found at https://sourceforge.net/apps/trac/sourceforge/wiki/Git

Would either of you be willing to do the initial import? It should be extremely straightforward, but as a relative Git newbie I figured it would be best to leave it to the experts. Once that's done I'll look into changing SVN to be read-only to avoid any issues with code accidentally getting committed to the wrong place.

Ryan

On 3/21/2013 11:35 PM, Thomas Koch wrote:
> Hi Ryan,
>
> the release branches and tags are available. I propose that you enable a Git
> repo on sourceforge and push the branches that you want to keep to that repo.
>
> I've not yet administrated a sourceforge project so I don't know how this
> works. But I've seen projects that had both an SVN and a Git repo.
>
> Regards,
>
> Thomas Koch, http://www.koch.ro
>
|
From: Thomas K. <th...@ko...> - 2013-03-24 17:31:59
|
Ryan Holliday:
> Hi Thomas, Peter,
>
> I've enabled Git on Sourceforge and provided Git access to all JAMWiki
> developers, but haven't yet imported anything.

Hi Ryan,

I don't have the right to push. It seems that you need to grant me developer status on the jamwiki sourceforge project.

I'll only push the master branch and all tags to sourceforge for now. We can at any time also pull the other branches from github and push them to sourceforge if we want to keep them and enlarge the repository size.

Regards,
Thomas Koch, http://www.koch.ro
|
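Pushing only master plus all tags, as planned in the message above, can be sketched against a local bare repository standing in for the Sourceforge one. The remote name `sourceforge`, the tag name `release-1.0` and the `demo` identity are assumptions made for the illustration; the thread does not show the commands that were actually run.

```shell
#!/bin/sh
# Sketch: push one branch and every tag to a new, empty remote.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q --bare "$work/sourceforge.git"   # stand-in for the empty Sourceforge repo
git init -q "$work/converted"
cd "$work/converted"
git symbolic-ref HEAD refs/heads/master
echo JAMWiki > README
git add README
git commit -q -m "converted history"
git tag release-1.0                          # hypothetical release tag
git branch wrh2                              # a personal branch that stays local

git remote add sourceforge "$work/sourceforge.git"
# master is named explicitly; --tags adds everything under refs/tags
git push -q sourceforge master --tags

pushed=$(git ls-remote --heads --tags sourceforge | cut -f2 | sort | xargs)
echo "refs on the new remote: $pushed"
```

Branches that were not named on the command line stay out of the new repository until someone pushes them explicitly.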
From: Thomas K. <th...@ko...> - 2013-03-24 19:57:27
|
Hi Ryan, Peter,

I have pushed master and all tags now.

@Peter: Could you update the SCM data in the pom.xml?

We can browse the wiki together and update the links there.

@Ryan: Sourceforge asks projects to upgrade to their new platform. No need to hurry, I just want to make sure that you're aware of it.

Regards,
Thomas Koch, http://www.koch.ro
|
From: Ryan H. <rya...@gm...> - 2013-04-02 02:46:23
|
I've pushed the release branches that were on Github to Sourceforge. Let me know if anything seems incorrect - I followed instructions from http://stackoverflow.com/questions/11266478/git-add-remote-branch

I've had limited time available for JAMWiki development lately, although I'll be flying the next few days and tend to get things done when I'm stuck in airports. I've got a number of things I'd like to look into for 2.0, including simplifying upgrades and making it easier to use parts of the JAMWiki codebase (such as the parser) in other applications.

Ryan

On 3/24/2013 12:57 PM, Thomas Koch wrote:
> Hi Ryan, Peter,
>
> I have pushed master and all tags now.
|
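The message does not list the commands used to move the release branches. One possible shape, shown here as a sketch with two local repositories standing in for Github and Sourceforge, is to fetch from one remote and push the remote-tracking refs to the other. The remote names and the branch names `0.9.x` and `1.0.x` are assumptions.

```shell
#!/bin/sh
# Sketch: copy selected branches from one remote to another via a local clone.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

work=$(mktemp -d)
git init -q "$work/github"                   # stand-in for the Github conversion repo
cd "$work/github"
git symbolic-ref HEAD refs/heads/master
echo trunk > README
git add README
git commit -q -m "trunk"
git branch 0.9.x                             # hypothetical release branches
git branch 1.0.x
git init -q --bare "$work/sourceforge.git"   # stand-in for the Sourceforge repo

git init -q "$work/local"
cd "$work/local"
git remote add github "$work/github"
git remote add sourceforge "$work/sourceforge.git"
git fetch -q github

# Push each remote-tracking ref as a branch of the same name on the other remote.
for b in 0.9.x 1.0.x; do
    git push -q sourceforge "refs/remotes/github/$b:refs/heads/$b"
done

release_heads=$(git ls-remote --heads sourceforge | cut -f2 | sort | xargs)
echo "branches now on sourceforge: $release_heads"
```

Using an explicit refspec avoids having to create a local branch for every release line first.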
From: Peter P. <pit...@us...> - 2013-03-25 07:44:44
|
Hey Thomas,

On 24.03.2013 at 20:57, Thomas Koch <th...@ko...> wrote:

> I have pushed master and all tags now.

Thanks!

> @Peter: Could you update the SCM data in the pom.xml?

Did so, but I saw no commit message on jamwiki-commit. I've never managed a SF project, but I could imagine there's a (post-)commit hook in the old SVN repository that needs to be translated into a post-receive hook like
https://github.com/git/git/blob/master/contrib/hooks/post-receive-email
to send the mail to the jamwiki-commit list.

> We can browse the wiki together and update the links there.

We'll do :-)

-- 
Regards,
Peter
|
From: Ryan H. <rya...@gm...> - 2013-03-26 06:03:30
|
Hi,

On 3/25/2013 12:44 AM, Peter Palmreuther wrote:
> Did so, but I saw no commit message on jamwiki-commit.

I spent some time tonight trying to set up the email hook as described at https://sourceforge.net/apps/trac/sourceforge/wiki/Git#Commitemailhooksetup and https://sourceforge.net/p/forge/documentation/Git/, but there isn't a /home/git directory on the shell. I'll need to revisit this at a later point, although if either of you have any suggestions I'd be grateful for the pointers.

Ryan
|
From: Peter P. <pit...@us...> - 2013-03-26 07:03:45
|
Hi Ryan,

On 26.03.2013 at 07:03, Ryan Holliday <rya...@gm...> wrote:

> Hi,
>
> On 3/25/2013 12:44 AM, Peter Palmreuther wrote:
>> Did so, but I saw no commit message on jamwiki-commit.
>
> I spent some time tonight trying to setup the email hook as described at https://sourceforge.net/apps/trac/sourceforge/wiki/Git#Commitemailhooksetup and https://sourceforge.net/p/forge/documentation/Git/, but there isn't a /home/git directory on the shell. I'll need to revisit this at a later point, although if either of you have any suggestions I'd be grateful for the pointers.

I don't have plain shell access on SF, but the first link you provide talks about "/home/scm_git" ... Maybe it's a typo in the SF Git docs?

If you can give me temporary shell access to the project I can have a look at what may be "wrong" ...

-- 
Regards,
Peter
|
From: Ryan H. <rya...@gm...> - 2013-03-26 15:09:50
|
Hi, The /home/git/p/... directory is present this morning, so perhaps it just takes some time after the project is updated for it to appear. I'll get the email hook set up later tonight. For future reference, if I'm reading the docs correctly then everyone with the developer role should also have shell access - see https://sourceforge.net/p/forge/documentation/SSH/. Ryan |
From: Ryan H. <rya...@gm...> - 2013-03-27 03:16:28
|
Emails should be sent to jamwiki-commit again. For reference, here are the steps taken to get this working:

  Copy https://sourceforge.net/apps/trac/sourceforge/wiki/Git%20hook%20script%20example to /home/git/p/jamwiki/code.git/hooks/post-receive-user

  chmod a+x post-receive-user

  git config --add hooks.mailinglist jam...@li...

  Make the content of the /home/git/p/jamwiki/code.git/description file "JAMWiki".
|
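The steps above can be rehearsed on a local bare repository before touching the real one. This is a sketch only: the hook body is a placeholder rather than the script from the Sourceforge wiki page, the repository path is a temporary directory standing in for /home/git/p/jamwiki/code.git, and `commits@example.invalid` is a made-up address used because the real list address is truncated in the archive.

```shell
#!/bin/sh
# Sketch: install an executable post-receive-user hook and the settings it reads.
set -e
work=$(mktemp -d)
repo="$work/code.git"                 # stand-in for the project's bare repository
git init -q --bare "$repo"
mkdir -p "$repo/hooks"

# Placeholder hook; the real script would mail a summary of each push.
cat > "$repo/hooks/post-receive-user" <<'EOF'
#!/bin/sh
exit 0
EOF
chmod a+x "$repo/hooks/post-receive-user"

# Recipient address (hypothetical here) and the project name stored in "description".
git --git-dir="$repo" config --add hooks.mailinglist commits@example.invalid
echo "JAMWiki" > "$repo/description"

configured=$(git --git-dir="$repo" config --get hooks.mailinglist)
echo "hook recipient: $configured"
```

Running the commands from outside the repository with `--git-dir` avoids depending on the current directory, which the original steps leave implicit.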