Introducing git-test

March 19, 2015 Published by aesspotify

Git is an incredible tool, but it isn’t magical and it doesn’t think for you.

When actually doing distributed version control, collaborating with people working at the same time, it’s easy to get some wires crossed. Git-test helps you keep the commits on your branch working.

If you change every foo into a bar, and your teammate adds a foo somewhere else, the merge will go through cleanly, but the change you were trying to make has been quietly subverted. This is not a bug; it’s a consequence of working at the wrong level of abstraction. Note that this isn’t even an evil merge, just the result of working in parallel.

While we might dream of structural IDE-type features that let us work on the right level, a more practical and robust reaction is to simply detect the problem when it happens and try to adopt a work process that avoids it. One such strategy is to only allow fast-forward merges, or to rebase in the same way but still make an explicit merge commit. This at least serializes the history so that it’s either your fault, for not also changing the newest foo, or your teammate’s, for writing foo instead of the new bar.
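As a rough sketch of those two policies in plain git, assuming the main line is called master; the function names are made up for illustration:

```shell
# Policy 1: only allow fast-forward merges.
# Refuses (and changes nothing) if master has moved since the branch forked.
land_fast_forward() {
    feature=$1
    git checkout -q master &&
    git merge --ff-only "$feature"
}

# Policy 2: rebase onto master first, then record an explicit merge commit
# even though a fast-forward would have been possible.
land_with_merge_commit() {
    feature=$1
    git checkout -q "$feature" &&
    git rebase master &&
    git checkout -q master &&
    git merge --no-ff --no-edit "$feature"
}
```

Either way, whoever lands second has to rebase before their branch can go in.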

Why would that be an improvement? Because now it isn’t some random accident; it’s suddenly someone’s responsibility to ensure that it doesn’t happen. It’s good for code quality, but bad for team morale. You’re racing to merge first, because if you lose, you have to rebase and inspect your patches closely to make sure that they still work and still do what they’re supposed to before you go completely home-blind.

If you’re really unlucky, you’re plunged into “rebase-hell” and your patches don’t even apply cleanly. Hardened veterans will only grumble a little and reach for git rerere, so they at least don’t have to explain the same merges over and over. You might argue for merging more often, but that’s a different kind of decision. If you need to keep a feature separate in a feature branch, and you have a policy or preference for small, atomic commits, non-trivial features will consist of multiple commits.

OK, but what about merging often and using feature flags to hide unfinished features? Well, it has its advantages: all the code is right there, and there are no secret, unseen gobs of code in other repositories and branches. The disadvantage is that you’ve moved the version control branching complexity into the run-time configuration. It’s a question of which kind of risk/pain you’re willing to accept, but using flags for completely new spike or “random hacking” features seems a bit extreme.

But, whatever, we’re tough and ulcer meds are covered. In order to keep history simple and free of kinks, you rebase. Tests pass and you can’t see anything wrong.

Does this mean that things are OK?

Well, that’s a question of what you had in mind. It doesn’t mean your patches are OK. If you’re using the empty merge commit style, it means that the merge commit will be fine. For some, that’s a reasonable compromise. If you have higher standards for branch commits, too, or if you merge fast-forward, well, then you might be building yourself a very troubled past.

In the fast-forward case, you can’t even tell which commits are good and which ones were just branch-work commit points. Imagine you want to roll back to an older version. How do you find one that even passes unit tests?

Enough nightmare fuel, the answer is of course to not suddenly wake up in that position in the first place. How can we avoid it? Well, we can run tests on every commit, even, especially, when rebasing. What’s the most convenient way of doing that?

There’s a feature of git rebase, where, in interactive mode, you can specify a command to be run as a rebase step. The manual for git rebase even suggests using it to run basic tests, like recompiling. (Recompiling? Really?) I have no reason to believe it wouldn’t work, but I’d like to rebase first, and test afterwards. Having rebased, I’d like to work in a tighter loop than having to muck about with rebase -i again.
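For reference, the rebase-with-exec approach looks roughly like this. It is a sketch: the function name is hypothetical, and GIT_SEQUENCE_EDITOR=true is only there to accept the generated todo list without opening an editor.

```shell
# Replay the local commits onto an upstream branch and run a test command
# after each one. If the command fails, the rebase stops at that commit
# so it can be fixed (or abandoned with: git rebase --abort).
rebase_and_test() {
    upstream=$1
    test_cmd=$2
    GIT_SEQUENCE_EDITOR=true git rebase -i --exec "$test_cmd" "$upstream"
}
```

You would call it as something like rebase_and_test origin/master 'make test', where both the branch and the test command are placeholders for whatever your project uses.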

Shell for-loop over git-rev-list? This definitely works, but it’s incredibly clunky and easily degenerates into the kind of copy-pasta that breeds bugs rather than flushes them out.
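To make the clunkiness concrete, here is a minimal sketch of such a loop. The function name test_each_commit is made up, and this is not what git-test does internally, only the naive version of the idea.

```shell
# Run a test command on every commit in a range, oldest first.
# Usage: test_each_commit <range> <command>
test_each_commit() {
    range=$1
    test_cmd=$2
    # Remember where we started so we can return there afterwards.
    start=$(git symbolic-ref -q --short HEAD || git rev-parse HEAD)
    status=0
    for commit in $(git rev-list --reverse "$range"); do
        git checkout -q "$commit" || { status=1; break; }
        if sh -c "$test_cmd" >/dev/null 2>&1; then
            echo "$commit pass"
        else
            echo "$commit fail"
            status=1
        fi
    done
    git checkout -q "$start"
    return $status
}
```

Something like test_each_commit origin/master..HEAD 'make test' would then print one line per commit. Note everything it does not do: no caching, no captured output, and a dirty working tree will trip it up.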

This situation is precisely the reason git-test exists.

At its core is the same shell for-loop over git rev-list, but it does have a few conveniences and touches to recommend it over the simple loop. It’s taken a long time to get it right, so I’d like to make an effort to pitch it properly.

It tries quite hard to come up with a sensible list of commits to test. By default it tests everything in the current branch that is not in the upstream branch (if there is one), or the branch of the same name, or master, in any remote. That’s a bit verbose, but it seems to be the best match for the intuitive idea of “local commits”.

$ git test -v
aes/testing ^there/aes/testing will test 3 commits
iter commit tree result
0000 44cbf69 ccd8e5a ... pass
0001 481531c 636f674 ... pass
0002 ca287b5 8d6dbbb ... pass
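The idea of “local commits” can be approximated in plain git. The sketch below is a simplification, not git-test’s actual logic: it uses the configured upstream if there is one and otherwise falls back to a local master, without searching the remotes. The function name is hypothetical.

```shell
# Print the local commits, oldest first: everything on the current branch
# that is not on its upstream, or not on master when no upstream is set.
local_commits() {
    if base=$(git rev-parse -q --verify '@{upstream}' 2>/dev/null); then
        : # upstream is configured, use it as the exclusion point
    else
        base=$(git rev-parse -q --verify master) || return 1
    fi
    git rev-list --reverse HEAD "^$base"
}
```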

It caches results, and caches on trees and tests instead of commits. This means that if you rebase to reorder commits, it will leverage the cleverness of git to skip commits whose contents have already been tested. It will also skip empty merge commits. Keying on the test means that it will never confuse cached unit tests and integration tests, or anything else.

$ git test -v
aes/testing ^there/aes/testing will test 3 commits
iter commit tree result
0000 44cbf69 ccd8e5a ... pass (cached)
0001 481531c 636f674 ... pass (cached)
0002 ca287b5 8d6dbbb ... pass (cached)
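Keying the cache on tree and test can be sketched as below. This illustrates the idea rather than git-test’s real implementation; the function names and the test-cache directory inside the git dir are assumptions.

```shell
# Cache key: the tree hash of a commit plus a hash of the test command.
# Two commits with identical contents share a tree, and so share a key.
cache_key() {
    commit=$1
    test_cmd=$2
    tree=$(git rev-parse "${commit}^{tree}") || return 1
    test_id=$(printf '%s' "$test_cmd" | git hash-object --stdin)
    echo "${tree}_${test_id}"
}

# Test the currently checked out commit, reusing a cached result if any.
cached_test() {
    test_cmd=$1
    key=$(cache_key HEAD "$test_cmd") || return 1
    cache_dir=$(git rev-parse --git-dir)/test-cache   # hypothetical location
    mkdir -p "$cache_dir"
    if [ -f "$cache_dir/$key" ]; then
        echo "$(cat "$cache_dir/$key") (cached)"
        return
    fi
    if sh -c "$test_cmd" >/dev/null 2>&1; then result=pass; else result=fail; fi
    echo "$result" > "$cache_dir/$key"
    echo "$result"
}
```

Because the commit hash never enters the key, reordering commits with a rebase or adding an empty commit on top does not invalidate earlier results, while a different test command gets its own cache entry.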

There are of course facilities to clear the cache, and to re-test, either everything or based on the earlier result. If the cache has both pass and fail entries for a commit, that fact should always be pointed out. Having flappy tests is bad; there’s no need to make it worse by hiding it.

There is a facility for capturing the output of the test. The captures are saved in a symlink farm directory, so they can be found by timestamp and serial number, commit-ish or by tree-ish (and test hash). Whenever the cache is used, it tries to create the corresponding link anyway.

reports/
+- tree/
|  +- <tree-ish>_<test>_pass
|  \- <tree-ish>_<test>_pass
+- commit/
|  +- <commit-ish>_pass -> ../tree/<tree-ish>_<test>_pass
|  \- <commit-ish>_pass -> ../tree/<tree-ish>_<test>_pass
+- <time-stamp>/
|  +- 0000_pass -> ../tree/<tree-ish>_<test>_pass
|  \- 0001_pass -> ../tree/<tree-ish>_<test>_pass
\- latest -> <time-stamp>

What if the worst happens anyway? Well, you can use the awesome git bisect run to find the commit that introduced the bug. If it’s a commonly recurring issue, or you just want to be more thorough, you can run something like this:

git test -v --verify='check-for-thing.sh' \
         HEAD ^HEAD~20 path/to/troublefile

As a side-effect of how commits are found, git-test inherits the git feature of filtering for commits touching a particular file.
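The underlying git feature is path limiting in git rev-list, which can be tried on its own. A minimal sketch, with a made-up function name:

```shell
# List the commits in a range that touch a given path, oldest first.
commits_touching() {
    range=$1
    path=$2
    git rev-list --reverse "$range" -- "$path"
}
```

For example, commits_touching HEAD~20..HEAD path/to/troublefile lists only the commits among the last twenty that changed that file.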

Wow. OK, I’ve changed my mind, git is magical.

I want to thank all my colleagues, but in particular, our PO, Rouzbeh Delavari for pointing out that the hash of the test command would be good to key on, and Erik Hansson for a useful discussion about the structure of the reports symlink farm.
