People’s coding styles tend to evolve and change over time. One particular habit I seem to have picked up is to sprinkle the code liberally with numerous TODO
markers. I wish I could say it’s clear a sign of my ever-present dissatisfaction with imperfect solutions, but I suspect I simply adopted it while working at the current company :)
In any case, TODO
s (and FIXME
s &c.) are not actually something to scoff at, not too much at least. They are certainly better than the alternative, which is to commit shady code without explanation, rationale, or ideas for improvement. With TODO
s, you are at least making the technical debt apparent and explicit, thus increasing the likelihood you’ll eventually come around to pay it off.
When that glorious day comes, though, it would be nice to get a quick overview of the code’s shortcomings, so that you can decide what to work on first. Getting a list of TODO
s scattered over many files sounds like a great task for grep
, and a relatively simple one at that. But somehow, every time I wanted to do that I ended up spending some non-negligible time just working out the details of grep
‘s syntax and flags.
Thus, the logical course of action would be craft a simple script which would relieve me from doing that ever again. However, when I got around to writing it, I quickly realized the task it’s not actually that simple. In fact, it’s totally impossible to do it with just grep
, since it would require matching a regex against multiple subsequent lines of input . Standard GNU grep
doesn’t support that at all.
Well, at this point I should’ve probably taken the hint and realize it’s not exactly the best idea to use a shell script here. But hey, not everything has to be written in Python, right? :) So I rolled up my sleeves and after a fair amount of googling (and stack-overflowing), I unleashed a horror that I hereby present:
For best result, it is necessary to have pcregrep
installed, which is an extended version of grep
that supports the full spectrum of Perl-compatible regular expressions. On most popular Linux distros, pcregrep
is just one apt-get install
away.
When writing tests, ideally you should verify your code’s behavior not only in the usual, “happy” cases, but also in the erroneous ones. Although you may very well accept that a function blows when feed with incorrect data, it should blow up predictably and consistently. An error, exception or panic is still an output; and it should be possible to capture and examine it in tests.
The Python unittest
module has a couple of ways to deal with expected error. Probably the most useful among them is the TestCase.assertRaises
method. It does pretty much exactly what it names hints to: asserting that a piece of code raises a specific type of exception:
In Python 2.7 (or with the unittest2 shim library), it can be also used much more conveniently as a context manager:
To clarify, assertRaises
will execute given block of code (or a callable, like in the first example) and throw AssertionError
if an exception of given type was not raised by the code. By calling the tested function with incorrect data, we intend to provoke the exception, affirm the assertion, and ultimately have our test pass.
I mentioned, however, that it doesn’t just matter if your code blows up in response to invalid input or state, but also how it does so. Even though assertRaises
will verify that the exception is of correct type, it is often not nearly enough for a robust test. In Python, exception types tend to be awfully broad, insofar that a simple information about throwing TypeError
may tell you next to nothing about the error’s true nature.
Ironically, designers of the unittest
module seemed to be vaguely aware of the problem. One of their solutions, though, was to introduce a proverbial second problem, taking the form of assertRaisesRegexp
method. While it may kinda-sorta work in simple cases, I wouldn’t be very confident relying on regular expressions for anything more complex.
Especially when the other possible approach appears much more sound anyway. Using the feature of with
statement, we can capture the exception object and examine it ourselves, in a normal Python code. Not just the type or message (though these are typically the only reliable things), but also whatever other data it may carry:
Sometimes, those checks might grow quite sophisticated. For example, in Python 3 you have exception chaining; it allows you to look not only at the immediate exception object, but also its __cause__
, which is analogous to Throwable.getCause
in Java or Exception.InnerException
in C#. If you need to dig this deep, I’d suggest extracting a function with all that code – essentially a specialized version of assertRaises
, preferably with all the context manager goodness that would enable us to use it like the original.
As it turns out, this can be very simple.