This article is based on an old CurseForge KB article written by Ackis after I helped him through a conversion. I have since expanded the guide somewhat.
It assumes that you want to do a one-way conversion, i.e., after the conversion you do not want to use the SVN repository any more.
I have tried to expand on some sections, but the "know what you are doing" factor is still a bit high. Feel free to drop by in #git on irc.freenode.net or write an email.
You will need a fairly good understanding of Git (or a large amount of blind trust in my instructions). I am not going into the differences between SVN and Git here.
You need a working Git install with git-svn (hence also the SVN libraries).
Note: You may need to use git-svn on Linux as there are reports that it is broken on Windows. YMMV. You can also try both the mingw port and a cygwin install and see if at least one works.
The brief outline is:
* Make an author map file.
* git svn clone with this author map and the option --no-metadata.
* Fix up history as needed, in particular:
** Fake merges
** Remove empty commits
** Change SVN tagging commits to tags
The author map serves to map SVN usernames (which are the only identity information held in SVN) to real names and email addresses for the git history.
It is a flat text file with lines of the form:
svnuser = R. E. Alname <real@email.example.com>
You can automate some of the task by running
svn log svn://svn.example.com/repository/path/ |
sed -ne 's/^r[^|]*| \([^ ]*\) |.*$/\1 = \1 <\1@dummy.example.com>/p' |
sort -u > author-map
which will scan the SVN history for all users, and make a dummy line for each. You can then look up their real names and emails as required.
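While you fill in the real names, it helps to see which users are still unmapped. A minimal sketch, assuming the placeholder domain dummy.example.com from the command above; the function name pending_authors is made up for illustration:

```shell
# List SVN usernames whose author-map entry still uses the placeholder
# address generated above (assumption: domain dummy.example.com).
pending_authors() {
    # $1: path to the author map file
    sed -n 's/^\([^ ]*\) = .*<[^>]*@dummy\.example\.com>$/\1/p' "$1"
}
```

Running `pending_authors author-map` prints one username per line; empty output means every entry has been replaced.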
The basic procedure is just
git svn clone -A author-map --no-metadata -s svn://svn.example.com/repository/path/ project
or if your project is branchless (consists only of a single line of history):
git svn clone -A author-map --no-metadata svn://svn.example.com/repository/path/ project
Note the difference in the -s flag; see below.
We use --no-metadata since this is a one-way conversion; omission of
this flag results in ugly git-svn-id lines.
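If you are unsure whether -s applies, listing the repository root usually settles it. A rough sketch, assuming that `svn ls` prints directories with a trailing slash; has_standard_layout is a hypothetical helper that reads such a listing on standard input:

```shell
# Succeed only if the listing contains trunk/, branches/ and tags/.
has_standard_layout() {
    awk '$0 == "trunk/"    { t = 1 }
         $0 == "branches/" { b = 1 }
         $0 == "tags/"     { g = 1 }
         END { exit !(t && b && g) }'
}

# Example use (not run here):
#   svn ls svn://svn.example.com/repository/path/ | has_standard_layout \
#       && echo "standard layout, clone with -s"
```

This only checks that the three directory names exist at the top level; it says nothing about how they were actually used.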
If the -s layout (trunk/branches/tags) does not fit your repository,
note that git-svn can handle rather strange layouts in two ways:
If they can be shoehorned into a trunk/branches/tags style layout,
you can use the options -T/-b/-t (resp.) to set them.
Otherwise, you can do a two-step clone: first run git svn init
with the arguments that you would normally give to git svn
clone. Then edit the configuration file (.git/config in the new repository) to point the
fetch = trunk:refs/remotes/svn/trunk
branches = branches/*:refs/remotes/svn/*
tags = tags/*:refs/remotes/tags/*
lines to suit your needs. You may also use several lines of each type.
Then run git svn fetch.
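Instead of editing the file by hand, the same lines can be set with git config between the init and fetch steps. A sketch only: the SVN paths code/main, code/dev and code/releases are invented for illustration, the remote is assumed to have the default name "svn", and set_svn_layout is a hypothetical helper:

```shell
# Point an already-initialised git-svn remote at a non-standard layout.
# All SVN-side paths below are hypothetical; adjust them to your repository.
set_svn_layout() {
    # $1: path to the repository created by `git svn init`
    git -C "$1" config --replace-all svn-remote.svn.fetch \
        'code/main:refs/remotes/svn/trunk'
    git -C "$1" config --replace-all svn-remote.svn.branches \
        'code/dev/*:refs/remotes/svn/*'
    git -C "$1" config --replace-all svn-remote.svn.tags \
        'code/releases/*:refs/remotes/tags/*'
}

# Example sequence (not run here):
#   git svn init --no-metadata svn://svn.example.com/repository/path/ project
#   set_svn_layout project
#   (cd project && git svn fetch -A ../author-map)
```

Note that --replace-all leaves exactly one line per key; if you need several fetch, branches or tags lines, add the extra ones with `git config --add`.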
The following are just the most common issues; there are many ways to improve history. Remember that you should not rewrite history after publishing it!
Modern SVN records "merges" (we call them cherry-picks), and modern git-svn can use this data to build git merges. However, old repositories require some manual intervention.
Warning: Do not push history that has grafts; chaos may ensue. Instead, filter-branch the history first.
Suppose you have determined that commit M is a merge commit, and that it has merged history up to commit C. Then you can "fake" this merge with
echo $(git rev-parse M) $(git rev-parse M^) $(git rev-parse C) >> .git/info/grafts
The filter-branch invocations below will "set the graft in stone", but if you only want to do this step alone, you can run an otherwise-noop filter-branch with
git filter-branch --tag-name-filter cat -- --all
to achieve this.
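Before setting anything in stone, it is worth checking that Git really sees the extra parent on each faked merge. A small sketch; parent_count is a hypothetical helper, and it must be run inside the converted repository:

```shell
# Print how many parents Git currently sees for a commit.
parent_count() {
    # `git rev-list --parents -n 1` prints the commit followed by its parents,
    # so the number of fields minus one is the parent count.
    git rev-list --parents -n 1 "$1" | awk '{ print NF - 1 }'
}
```

For a faked merge M with one merged line of history, `parent_count M` should print 2 once the graft is in effect.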
git-svn frequently leaves commits that do not change anything, from SVN copy commands and such. You can delete them with
git filter-branch --prune-empty --tag-name-filter cat -- --all
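If you would like to see what is about to disappear, you can list the candidates first. A rough approximation, assuming that "empty" means a non-merge commit whose tree is identical to its parent's; list_empty_commits is a hypothetical helper, to be run inside the repository:

```shell
# Print the hashes of non-merge commits that leave the tree unchanged.
list_empty_commits() {
    git rev-list --all --no-merges |
    while read -r c; do
        # Root commits have no parent to compare against; skip them.
        git rev-parse -q --verify "$c^" >/dev/null || continue
        if [ "$(git rev-parse "$c^{tree}")" = "$(git rev-parse "$c^^{tree}")" ]; then
            echo "$c"
        fi
    done
}
```

Piping the output through `git log --no-walk --stdin --oneline` gives a more readable summary.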
In some SVN repositories I have worked with, it was common practice to prefix every commit message with the project name. You can extend the filter-branch invocation from the last subsection to also edit this out for prettier history:
git filter-branch --prune-empty --tag-name-filter cat --msg-filter 'perl -pe "s/^project:\s*//"' -- --all
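Since filter-branch rewrites every commit, it pays to try the message filter on a sample message first. A minimal sketch with the same perl expression; strip_prefix is a hypothetical wrapper and "project" stands in for your real prefix:

```shell
# Apply the message filter to text on standard input.
strip_prefix() {
    perl -pe 's/^project:\s*//'
}

# Example use (not run here):
#   git log -1 --pretty=format:%B | strip_prefix
```

Note that perl -p applies the substitution to every line, so a body line that happens to start with the prefix is changed too.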
git-svn currently leaves a branch tags/foo for every tag. Its tip
commit is usually the svn copy commit that created the tag, though
obviously this does not have to be the case.
The above filter-branch commands have already deleted this copy commit since it makes no changes. It remains to turn the tagging commit into a proper tag. You can use the following chunk of shell code:
# Turn every git-svn tag branch into an annotated tag of the same name.
git for-each-ref --format="%(refname)" refs/remotes/tags/ |
while read -r tag; do
    # Reuse the date, identity and message of the commit the branch points at.
    GIT_COMMITTER_DATE="$(git log -1 --pretty=format:"%ad" "$tag")" \
    GIT_COMMITTER_EMAIL="$(git log -1 --pretty=format:"%ce" "$tag")" \
    GIT_COMMITTER_NAME="$(git log -1 --pretty=format:"%cn" "$tag")" \
    git tag -m "$(git for-each-ref --format="%(contents)" "$tag")" \
        "${tag#refs/remotes/tags/}" "$tag"
done
The ugliness is there to exactly preserve the committer identity, timestamp and message.
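The `${tag#refs/remotes/tags/}` expansion is what turns the full ref name into the tag name. A tiny sketch of that step in isolation; tag_name_for_ref is a hypothetical name:

```shell
# Strip the git-svn tag prefix from a full ref name.
tag_name_for_ref() {
    # ${var#prefix} removes the shortest matching prefix from $var
    printf '%s\n' "${1#refs/remotes/tags/}"
}
```

For example, `tag_name_for_ref refs/remotes/tags/v1.0` prints v1.0, which is the name passed to git tag.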