Git With Subversion: The Trouble With Rebase

It wasn’t all that long ago that I was unfortunate enough to be working on projects whose idea of “version control” was merely to put all the source code in a shared network directory – thus failing to provide any notions of “version” or “control” whatsoever. Things are far more stable now; Subversion is widely used for projects that appreciate the idea of a centralized repository, while git appears to be the common choice for those looking for something distributed, with some remarkably powerful features. With the git svn bridge, developers can work locally (and even detached from the network) using git, and push their work to Subversion when needed. As a git convert, I would never return to straight SVN now; my first task when working on a Subversion project is to clone it with git. It appears to be a very common and successful workflow.

This isn’t going to be a “which is better” post; there are plenty of those around. Nor am I going to explain how to create a local git repository from a remote SVN repository: there are lots of places to find that too. What I am going to look at are the difficulties that arise when you want to check your local git work back into Subversion. You have to do some scrubbing, since the git svn tools allow git to understand SVN, but not the reverse. Anything pushed back to SVN cannot contain git merges; it has to be flattened into a linear history, with all changes applied “after” the current SVN state.

Let’s set the scene. You have been working on a branch called “work” and made several commits; meanwhile, others have checked their changes into SVN. You’ve reached a point where you’re ready to check in, so you first

git checkout master
git svn rebase

to get your local clone of the master branch up to date. It looks something like this:

[Figure: rebase_start]

You now have to get your work applied on top of the changes in master, so you can check it in, and you’ll probably want to flatten your work down into a single commit as well. There’s a quick way to do it, but it could leave you with conflicts to resolve, and you have to make sure no git merge features get into that Subversion master. You’d like to do this in your work branch, so you

git checkout work
git rebase master

and after dealing with any conflicts, the result is something like:

[Figure: rebase_after]

This is the point where something just doesn’t feel right. Our work is indeed on top of master; but is it correct? In a way, it’s no longer our original work at all. The rebase process created a completely new set of commits, with different SHA identifiers – in fact, if it weren’t for the “saved” tag shown in the figure, the original commits would be gone. (Like most git features, “gone” is a relative term – you can always go spelunking in the reflog.) You’ve rewritten your entire development history on the work branch. How can you know it’s correct, and wouldn’t you be more comfortable with an easier way to get it back? What’s more, if you had anyone else collaborating with you by cloning the git repository, rewriting branch history has just destroyed the sync between the two of you.
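
To see the rewrite for yourself, here is a minimal sketch that can be run in a throwaway directory. No Subversion is involved; the repository, file names and commit messages are made up for the demonstration, and a local "master" simply plays the part of the SVN-tracking branch. It tags the original tip of work as "saved", rebases, and prints both commit IDs.

```shell
#!/bin/sh
# Throwaway demo: a rebase recreates commits under new IDs.
# Everything here (paths, file names, messages) is illustrative only.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from the article
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "base"

# Our own work, on its own branch, with a tag marking the original tip.
git checkout -q -b work
echo mine > mine.txt && git add mine.txt && git commit -q -m "my work"
git tag saved

# Meanwhile, someone else's change lands on master.
git checkout -q master
echo theirs > theirs.txt && git add theirs.txt && git commit -q -m "upstream change"

# Replay our work on top of the updated master.
git checkout -q work
git rebase -q master

before=$(git rev-parse saved)   # the commit we originally made
after=$(git rev-parse work)     # the commit the rebase created in its place
echo "original: $before"
echo "rebased:  $after"
```

The two IDs differ even though the change to mine.txt is the same; only the tag (or the reflog) still points at the commit that was actually written first.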

A “safer” way to pull those changes in is the simple

git merge master

giving us:

[Figure: rebase_merge]

which preserves our original work, unchanged, and any conflicts are resolved in the merge. If something went wrong

git reset --hard HEAD^

will put it all back (bear in mind that --hard also throws away any uncommitted changes in the working tree). We can verify our changes coexist successfully with those in master, and commit further modifications as needed – but the work still can’t be checked in to SVN; it is still not on top of master. There’s no avoiding the need for a rebase. However, we are now in a safer situation; we can easily revert to our original state, and we can perform the rebase more safely too. To do so, we just need a “scratch area” to do it in; another branch.

git checkout -B _publish
git rebase master

gives:

[Figure: rebase_publish]

This might not appear to have gained anything, but it has. We can now verify whether the rebase was successful with

git diff work

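and, as long as the files all match (the diff comes back empty), we know the rebased commits produce exactly the tree we already verified in work, so we have something we can safely check in. We can quickly finish up, group the commits, write a final commit message and push up to SVN with: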

git checkout master
git merge --ff --squash _publish
git commit
git svn dcommit
git branch -D _publish

and we get to where we wanted, with safety checks every step of the way.
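
For anyone who wants to rehearse the sequence before trying it on a real project, the following is a rough dry run in a throwaway repository. It is a sketch under assumptions: there is no Subversion here, so a plain local "master" stands in for the SVN-tracking branch and the git svn dcommit step is left out; the file names and commit messages are invented. The `git diff --quiet` form is used so the check can be scripted through its exit status.

```shell
#!/bin/sh
# Dry run of merge -> scratch rebase -> verify -> squash, without SVN.
# All names below are illustrative only.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "base"
git checkout -q -b work
echo one > a.txt && git add a.txt && git commit -q -m "work 1"
echo two > b.txt && git add b.txt && git commit -q -m "work 2"
git checkout -q master
echo theirs > theirs.txt && git add theirs.txt && git commit -q -m "upstream change"

# 1. Merge master into work: the original commits stay untouched.
git checkout -q work
git merge -q -m "merge master into work" master

# 2. Rebase a scratch copy, leaving work alone.
git checkout -q -B _publish
git rebase -q master

# 3. Verify: the rebased tree must match the merged tree exactly.
if git diff --quiet work _publish; then
    verified=yes
else
    echo "rebase result differs from work; stopping" >&2
    exit 1
fi

# 4. Squash onto master (git svn dcommit would follow in the real workflow).
git checkout -q master
git merge -q --ff --squash _publish
git commit -q -m "Squashed work"
git branch -D _publish >/dev/null

# The new master commit should have exactly one parent, i.e. no merge.
if git rev-parse -q --verify "master^2" >/dev/null; then linear=no; else linear=yes; fi
echo "verified=$verified linear=$linear"
```

If step 3 fails, nothing has been lost: work still holds the merged result, and the scratch branch can simply be recreated.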

[Figure: rebase_done]

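We can get rid of the work branch too, if we don’t need to keep it around for any further development. Because the changes reached master as a squashed commit, git has no record that work was merged, so a plain git branch -d work will usually refuse and git branch -D work is needed: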

[Figure: rebase_final]

It seems like a lot of extra effort to achieve something that can often be done in one or two commands, but the fast way, with no safety net, has caught me out in the past. A few choice git aliases reduce this to a simple workflow – sync, prepare, publish, push – and I’m reassured at every point that my changes are checked in, correctly, without unexpected modifications from a bad merge. I spend less time looking over my shoulder, fetching master updates and breaking away from my development cycle to stay up to date, and far less time cleaning up the mess left by a bad master commit – which keeps me focused on the current work branch. Git is powerful, but requires discipline and organization from the user. A version control workflow is only truly successful if you spend as little time as possible making it do what you need.
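
The aliases themselves are not listed in this post. As a rough idea of how the four steps could be wrapped, here is a sketch written as shell functions rather than git aliases. The function names are hypothetical, and it assumes a git-svn clone with a local "master" tracking Subversion, a "work" branch, and "_publish" as the scratch branch, as in the walkthrough above.

```shell
# Hypothetical wrappers for the four steps; names are illustrative only.
# Assumes: master tracks SVN via git svn, work holds local development.

svn_sync() {        # bring master up to date with Subversion
    git checkout master && git svn rebase
}

svn_prepare() {     # fold master into work with an ordinary, undoable merge
    git checkout work && git merge master
}

svn_publish() {     # rebase a scratch branch and confirm it matches work
    git checkout -B _publish work || return 1
    git rebase master || return 1
    if ! git diff --quiet work _publish; then
        echo "_publish differs from work; not safe to push" >&2
        return 1
    fi
}

svn_push() {        # squash onto master, commit, send to Subversion, clean up
    git checkout master &&
    git merge --ff --squash _publish &&
    git commit &&
    git svn dcommit &&
    git branch -D _publish
}
```

Each function stops at the first failing command, so a conflict during the rebase or a mismatch in the diff leaves master untouched and nothing is sent to Subversion.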