Much can be said about the similarities between two popular distributed version control systems: Git and Mercurial. For the most part, choosing between them can be a matter of taste. And just because Git seems to have considerably more proponents in the Cool Kids camp, it doesn’t necessarily follow that it’s the better option.
But I have found at least one specific and common scenario where Git clearly outshines Hg. Suppose you have coded yourself into a dead end: the feature you planned doesn’t pan out the way you wanted; or you have some compatibility issues you cannot easily resolve; or you just need to escape the refactoring bathtub.
In any case, you just want to step back a few commits and pretend nothing happened, for now. The mishap might be useful later, though, so it’d be nice if we left it marked for the future.
In Git, this is easily done. You would start by creating a new branch that points to your dead end:
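A minimal sketch of that first step. The throwaway repository, the placeholder identity and the branch names my_feature and my_feature__dead_end are assumptions made for illustration; in a real project only the last command matters:

```shell
# Throwaway repository so the sketch can run anywhere; identity values are placeholders.
cd "$(mktemp -d)" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b my_feature
for i in 1 2 3; do git commit -q --allow-empty -m "Feature step $i"; done

# Mark the current tip of my_feature with a second branch name.
# `git branch <name>` only creates the name; we stay on my_feature.
git branch my_feature__dead_end
```

Since git branch does not switch branches, any further work would still land on my_feature.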
Now both my_feature and my_feature__dead_end refer to the same head commit. We would then move the former back a little, sticking it to one of the earlier hashes. Let’s find a suitable target:
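One way to look for the target, sketched in a throwaway repository with made-up commits. The choice of my_feature~2 (two commits before the tip) is arbitrary; any hash from the log would do:

```shell
# Throwaway repository with a few empty commits; identity values are placeholders.
cd "$(mktemp -d)" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b my_feature
for i in 1 2 3; do git commit -q --allow-empty -m "Feature step $i"; done
git branch my_feature__dead_end

# Show the recent commits, newest first, one per line, with branch names attached.
git log --oneline --decorate -n 4

# Pick a candidate: here, two commits before the tip of my_feature.
target=$(git rev-parse --short my_feature~2)
git log -1 --format='%h %s' "$target"
```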
If it looks right, we can reset the my_feature branch so it points to this specific commit:
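A sketch of the reset, again in a throwaway repository with placeholder commits. I am assuming a hard reset here, which also rewinds the working directory and discards uncommitted changes:

```shell
# Throwaway repository with a marked dead end; identity values are placeholders.
cd "$(mktemp -d)" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b my_feature
for i in 1 2 3; do git commit -q --allow-empty -m "Feature step $i"; done
git branch my_feature__dead_end
target=$(git rev-parse my_feature~2)

# Make sure my_feature is checked out, then move it back to the target.
# --hard also resets the index and working tree, so commit or stash first.
git checkout -q my_feature
git reset -q --hard "$target"
```

The commits after the target are not lost, because my_feature__dead_end still refers to them.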
Our final situation would then look like this: my_feature points to the earlier commit, while the dead-end branch still marks the abandoned work, which is exactly what we wanted. Note how any further commits starting from the referent of my_feature would fork at that point, diverging from the development line which has led us into the dead end.
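To see the fork for yourself, here is a small sketch that replays the whole scenario in a throwaway repository (all names and commit messages are made up) and then adds one new commit on my_feature:

```shell
# Throwaway repository; identity values are placeholders.
cd "$(mktemp -d)" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git commit -q --allow-empty -m "Initial commit"
git checkout -q -b my_feature
for i in 1 2 3; do git commit -q --allow-empty -m "Feature step $i"; done

# Mark the dead end, then step back two commits.
git branch my_feature__dead_end
git reset -q --hard my_feature~2

# A fresh commit on my_feature now diverges from the dead end.
git commit -q --allow-empty -m "Second attempt"
git log --oneline --graph --decorate my_feature my_feature__dead_end
```

The graph should show two lines of development sharing the commit we reset to.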
Why is the same thing not so easily done in Mercurial? In general, this is mostly because of one fundamental design decision: every commit belongs to one branch, forever and always. Branch designation is actually part of the changeset’s metadata, just like the commit message or diff. Moving things around – like we did above – is therefore equivalent to changing history and requires tools that are capable of doing so, such as
Great services like GitHub encourage sharing projects and collaborating on them publicly. But not every piece of code feels like it deserves its own repository. Thus it’s quite reasonable to keep a “miscellaneous” repo which collects smaller, often unrelated hacks.
But how do you set up such a repository, and what structure should it have? Possible options include separate branches or separate folders within a single branch. Personally, I prefer the former approach, as it keeps both the commit history and working directory cleaner. It also makes it rather trivial to promote a project into its own repo.
I speak from experience here, since I did exactly this with my repository of presentation slides. So far, it serves me well.
It’s not hard to arrange a new Git repository in such a manner. The idea is to keep the master branch either completely empty, or only store common stuff there – such as a README file:
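A sketch of such a starting point. The temporary directory, the README text and the identity values are placeholders of mine; the rename at the end is only there because newer Git versions may pick a default branch name other than master:

```shell
# Throwaway location; in practice this would be your project directory.
cd "$(mktemp -d)" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Keep only the common stuff on the main branch.
echo "Miscellaneous small projects, one per branch." > README
git add README
git commit -q -m "Initial commit"

# Make sure the branch is called master, as the text assumes.
git branch -M master
```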
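The actual content will be kept in separate branches, with no relation to each other or to master. Such entities are sometimes referred to as root branches. We create them almost as usual via git checkout, except that the --orphan option is needed so that the new branch starts without a parent commit (a plain git checkout -b would inherit the whole history of master):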
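A sketch of the branch creation, assuming the --orphan option of git checkout, which starts a branch that has no commits yet. The branch name slides and the repository setup are made up for the example:

```shell
# Throwaway repository with a README on master; identity values are placeholders.
cd "$(mktemp -d)" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
echo "Common notes" > README
git add README
git commit -q -m "Initial commit"
git branch -M master

# Start a branch without any parent commit.
# The index and the working directory are left exactly as they were on master.
git checkout -q --orphan slides

# README shows up as staged for the (not yet existing) first commit.
git status --short
```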
However, this is not nearly enough. We don’t want to base the new branch upon the content from master, but we still have it in the working directory. And even if we were to clean it up manually (using a spell such as ls | xargs rm -r to make sure the .git subdirectory is preserved), the removal would have to be registered as a commit in the new branch. Certainly, it would go against our goal to make it independent from master.
But the working copy is just one thing. To have a truly independent root branch, we also need it to be disconnected from everything else in the repo. Its history has to start from scratch, with no parent commit; otherwise, any changesets added before the branch was created would carry over and appear in its log. The index must start from scratch too, because it still lists every file from the branch we came from.
Fortunately, wiping that leftover state is very easy – although somewhat scary. We need to reach into the internal .git directory and remove the index file:
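A sketch of that step in a throwaway repository. It assumes we are at the top level of an ordinary repository (so the index lives at .git/index) and that the new branch, here hypothetically named slides, was started with --orphan:

```shell
# Throwaway repository with a README on master; identity values are placeholders.
cd "$(mktemp -d)" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
echo "Common notes" > README
git add README
git commit -q -m "Initial commit"
git branch -M master
git checkout -q --orphan slides

# Drop the index: nothing is staged or tracked on this branch anymore.
rm .git/index

# README is still on disk, but Git now reports it as untracked.
git status --short
```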
Don’t worry, this doesn’t touch any actual data, which is mostly inside the .git/objects directory. What we removed is a “table of contents” for the current branch, making it pristine – just like master right after git init.
As a nice side effect, the whole content of the working directory is now unknown to Git. Once we have removed the index, every file and directory has become untracked. Now it’s possible to remove all of them in one go using git clean:
And that’s it. We now have a branch that has nothing in common with the rest of the repository. If we need more, we can simply repeat those three steps, starting from a clean working copy (not necessarily from the master branch).
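A sketch of the cleanup, replayed from scratch in a throwaway repository (the names are made up). I am assuming the -fdx flags here: force, include directories, and include ignored files as well, so be careful where you run it:

```shell
# Throwaway repository with a README on master; identity values are placeholders.
cd "$(mktemp -d)" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
echo "Common notes" > README
git add README
git commit -q -m "Initial commit"
git branch -M master
git checkout -q --orphan slides
rm .git/index

# Delete every untracked file and directory; the .git directory itself is never touched.
git clean -q -fdx

# Only .git should be left.
ls -A
```

The README is gone from the working directory only; it remains safely committed on master.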
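If you do this often, the three steps could be wrapped in a small shell function. The helper name new_root_branch is my own invention, and the sketch assumes it is run from the top level of the repository with a clean working copy:

```shell
# Hypothetical helper: create an empty root branch named by the first argument.
new_root_branch() {
    git checkout -q --orphan "$1" &&   # 1. branch with no parent commit
    rm -f .git/index &&                # 2. forget everything staged from the old branch
    git clean -q -fdx                  # 3. delete the now-untracked files
}

# Demo in a throwaway repository; identity values are placeholders.
cd "$(mktemp -d)" && git init -q
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
echo "Common notes" > README
git add README
git commit -q -m "Initial commit"
git branch -M master

# First root branch, with one commit of its own.
new_root_branch slides
echo "First slide" > intro.txt
git add intro.txt
git commit -q -m "Start slides"

# Second root branch, started from slides rather than master.
new_root_branch hacks
```

Note that a branch created this way only becomes visible in git branch after its first commit.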