When writing tests, ideally you should verify your code’s behavior not only in the usual, “happy” cases, but also in the erroneous ones. Although you may very well accept that a function blows when feed with incorrect data, it should blow up predictably and consistently. An error, exception or panic is still an output; and it should be possible to capture and examine it in tests.
The Python unittest
module has a couple of ways to deal with expected error. Probably the most useful among them is the TestCase.assertRaises
method. It does pretty much exactly what it names hints to: asserting that a piece of code raises a specific type of exception:
In Python 2.7 (or with the unittest2 shim library), it can be also used much more conveniently as a context manager:
To clarify, assertRaises
will execute given block of code (or a callable, like in the first example) and throw AssertionError
if an exception of given type was not raised by the code. By calling the tested function with incorrect data, we intend to provoke the exception, affirm the assertion, and ultimately have our test pass.
I mentioned, however, that it doesn’t just matter if your code blows up in response to invalid input or state, but also how it does so. Even though assertRaises
will verify that the exception is of correct type, it is often not nearly enough for a robust test. In Python, exception types tend to be awfully broad, insofar that a simple information about throwing TypeError
may tell you next to nothing about the error’s true nature.
Ironically, designers of the unittest
module seemed to be vaguely aware of the problem. One of their solutions, though, was to introduce a proverbial second problem, taking the form of assertRaisesRegexp
method. While it may kinda-sorta work in simple cases, I wouldn’t be very confident relying on regular expressions for anything more complex.
Especially when the other possible approach appears much more sound anyway. Using the feature of with
statement, we can capture the exception object and examine it ourselves, in a normal Python code. Not just the type or message (though these are typically the only reliable things), but also whatever other data it may carry:
Sometimes, those checks might grow quite sophisticated. For example, in Python 3 you have exception chaining; it allows you to look not only at the immediate exception object, but also its __cause__
, which is analogous to Throwable.getCause
in Java or Exception.InnerException
in C#. If you need to dig this deep, I’d suggest extracting a function with all that code – essentially a specialized version of assertRaises
, preferably with all the context manager goodness that would enable us to use it like the original.
As it turns out, this can be very simple.
There’s no better way to start a new year than a hearty, poignant rant. To set the bar up high right off the bat, I’m not gonna be picking on some usual, easy target like JavaScript or PHP. To the contrary, I will lash out on everyone-and-their-mother’s favorite language; the one sitting comfortably in the middle between established, mature, boring technologies of the enterprise; and the cutting edge, misbegotten phantasms of the GitHub generation.
That’s, of course, Python. A language so great that nearly all its flaws people talk about are shared evenly by others of its kin: the highly dynamic, interpreted languages. One would be hard-pressed to find anything that’s not simply a rehashed argument about maintainability, runtime safety or efficiency – concerns that apply equally well to Ruby, Perl and the like. What about anything specifically “pythonic”?…
Well, let’s talk about Python’s exceptions, shall we?
Every computer program expands until it can read e-mail – or so they say. But many applications need not to read, but to send e-mails; web apps or web services are probably the most prominent examples. If you happen to develop them, you may sometimes want a local, dummy SMTP server just for testing this functionality. It doesn’t even have to send anything (it must not, actually), but it should allow you to see what would be sent if the app worked in a production environment.
By far the easiest way to setup such a server involves, quite surprisingly, Python. There is a standard library module called smtpd
, which is built exactly for this purpose. Amusingly, you don’t even have to write any code that uses it; you can invoke it straight from the command line:
This will start a server that listens on port 8025 and dumps every message “sent” through it to the standard output. A custom port is chosen because on *nix systems, only the ports above 1024 are accessible to an ordinary user. For the standard SMTP port 25, you need to start the server as root:
While it’s more typing, it frees you from having to change the SMTP port number inside your application’s code.
If you plan to use smtpd
more extensively, though, you may want to look at the small runner script I’ve prepared. By default, it tries to listen on port 25, but you can supply a port number as its sole argument.
What is your first association evoked by a mention of the Java programming language? If you said “slow”, I hope all is well back there in the 90s! But if you said “verbose” instead, you would be very much in the right. Its wordiness is almost proverbial by now, with only some small improvements – like lambdas – emerging at the distant horizon.
Thankfully, more modern and higher level languages are much better in this regard. The likes of Python, Ruby or JavaScript, that is. They are more expressive and less bureaucratic, therefore requiring much less code to accomplish the same thing that takes pages upon pages in Java, C# or C++. The amount of “boilerplate” – tedious, repetitive code – is also said to be significantly lower, almost to the point of disappearing completely.
Sounds good? Sure it does. The only problem here is that most of those optimistic claims are patently false. Good real-world code written in high level, dynamic language does not necessarily end up being much shorter. Quite often, it is either on par or even longer than the equivalent source in Java et al.
It happens for several reasons, most of which are not immediately obvious looking only at a Hello World-like samples.
It is said Python is pretty much pseudocode that happens to be syntactically strict enough to parse and run. While usually meant as a compliment, this observation has an uglier counterbalance. Being a pseudocode, the raw Python source is woefully incomplete when it comes to providing necessary information for future maintainers. As a result, it gets out of hand rather quickly.
To rein it, we need to put that information back somehow. Here’s where the documentation and commenting part steps in. In case of Python, there is even a syntactical distinction on the language level between both. The former takes the form of docstrings attached to function and class definitions. They are meant to fill in the gaps those definitions always leave open, including such “trivialities” as what kind of arguments the function actually takes, what it returns, and what exceptions it may raise.
Even when you don’t write expansive prose with frequent usage examples and such, an adequate amount of docstrings will still take noticeable space. The reward you’ll get for your efforts won’t be impressive, too: you’ll just add the information you’d include anyway if you were writing the code in a static language.
The catch here is that you won’t even get the automatic assurance that you’ve added the information correctly… which brings us immediately to the next point.
There are two types of programmers: those who write tests, and those who will be writing tests. Undoubtedly, the best way to have someone join the first group is to make them write serious code in any interpreted, dynamic language.
In those languages, a comprehensive set of tests is probably the only automated and unambiguous way to ensure your code is satisfying even some basic invariants. Whole classes of errors that elsewhere would be eliminated by the compiler can slip completely undetected if the particular execution path is not exercised regularly. This includes trying to invoke a non-existent method; to pass an unexpected argument to a function; call a “function” which is not callable (or conversely, not call a function when it should have been); and many others. To my knowledge, none of these are normally detected by the various tools for static analysis (linting).
So, you write tests. You write so many tests, in fact, that they easily outmatch the very code they’re testing – if not necessarily in effort, then very likely in quantity. There’s no middle ground here: either you blanket your code with good tests coverage, or you can expect it to break in all kinds of silly ways.
Documentation, comments and tests take time and space. But at least the code in high level language can be short and sweet. Beautiful, elegant, crisp and concise, without any kind of unnecessary cruft!
Just why are you staring at me like that when I say I used a metaclass here?…
See, the problem with powerful and highly expressive languages is that they are dangerous. The old adage of power and responsibility does very much apply here. Without a proper harness, you will one day find out that:
Unless you can guarantee you’ll always have only real Perl/Python/Ruby rockstars on board, it’s necessary to tame that wild creativity at least a little bit. The inevitable side effect is that your code will sometimes have to be longer, at least compared to what that smart but mystifying technique would yield.
If you use a powerful HTML templating engine – like Jinja – inevitably you will notice a slow creep of more and more complicated logic entering your templates. Contrary to what many may tell you, it’s not inherently bad. Views can be complex, and keeping that complexity contained within templates is often better than letting it sip into controller code.
But logic, if not trivial, requires testing. Exempting it by saying “That’s just a template!” doesn’t really cut it. It’s pretty crappy excuse, at least in Flask/Jinja, where you can easily import your template macros into Python code:
When writing a fully featured test suite, though, you would probably want some more leverage Importing those macros by hand in every test can get stale rather quickly and leave behind a lot of boilerplate code.
Fortunately, this is Python. We have world class tools to combat repetition and verbosity, second only to Lisp macros. There is no reason we couldn’t write tests for our Jinja templates in clean and concise manner:
The JinjaTestCase
base, implemented in this gist, provides evidence that a little __metaclass__
can go a long way :)
Mocks are used to “fake” the behavior of objects to the extent required by tests. Most of the time, they can be pretty simple: when calling a method with given set of (constant) argument, then return some predefined value. Mocking frameworks actually strive to translate the previous sentence directly into code, which can be seen in Java’s Mockito library:
What is evident here is that this mock doesn’t maintain any state. As a result, all stubbed method are completely independent and referentially transparent: they always return the same value for the same set of arguments. In reality, that property is even stronger because they are simply constant.
Simple mocks like those work for a good number of cases. Some may even argue that it’s the only way to mock properly, because anything more complicated is a sign of code smell. We can counter that by saying that some complexity is often unavoidable, and it’s up to developers to decide whether they want to put it test or tested code.
And thus more complex stubbing may be needed if, for some case, you prefer the former approach. When that happens, it’s good to know how to fabricate a bit more sophisticated behavior. In Mockito, it’s very possible to achieve almost arbitrary complexity by capturing arguments of stubbed calls and routing them to callbacks. This is done with the help of ArgumentCaptor
class and the Answer
interface, respectively.
When combined, they can result in pretty clever mocks, as illustrated with this example of some imaginary API for SMTP (e-mail sending protocol):
Note that it doesn’t implement the real SMTP. That, presumably, would mimic the SmtpConnection
class exactly, which of course is not the point of mocking at all.
The point is to provide sufficiently complete behavior so that code under test can proceed, reaching all the execution paths we want to exercise. In this case, something can call getOutgoingClientData
and rely on certain string being present in the output.
Let’s just hope that this “something” actually has some good reasons to expect that…
Design patterns are often criticized, typically in the context of object-oriented programming. I buy into many such critiques, mostly because I value simplicity as one of the most important qualities of good code. Patterns – especially when overused – often stand in the way to achieve it,
Not all critique aimed towards design patterns is well founded and targeted, though. More specifically, the example I’ve seen brought up quite often is the Singleton pattern, and I don’t think it’s a good one in this context. Actually, for making a case for design patterns being (sometimes) harmful, the singleton is probably one of the worst picks.
Realizing this is important, because whatever point you’re trying to convey will be significantly watered down if you use an inadequate example. It’s just too easy to make up counterarguments or excuses, concentrating on specific flaws of your sloppy choice, rather than addressing more general issues you wanted to put some light on. A bad example can simply be a red herring, drawing attention from the topic you wanted it to stand for.
What’s so bad about singleton pattern, though?
Especially in their classic incarnation formulated in famous work of Gang of Four, design patterns are mostly about increasing robustness and flexibility of software design by introducing additional layers of indirection between existing concepts. For instance, you can consider the Factory pattern as proxy that separates the process of creating an object from specific type (class) of that object.
This goes along the same lines as separation between interface and implementation, a fundamental concept behind the whole object-oriented paradigm. The purpose is to decrease coupling, i.e. dependencies between different parts of the code, and it’s noble goal in its own regard.
Unfortunately, the Singleton pattern doesn’t really aid us in this pursuit. Quite the opposite: it talks about having at most one single instance of some class, which will easily make it a choke point for many otherwise independent parts of program logic. It happens especially often with top-level objects, representing whole subsystems; thanks to making them into singletons, they end up being used almost everywhere.
We also shouldn’t forget what singletons really are – that is, global variables. (You can have singletons with more limited scope, of course, but OO languages typically support them as language feature that doesn’t require dedicated design pattern). The pattern attempts to abstract them away but they tend to leak out rather eagerly, causing numerous problems.
Indeed, there are all sorts of nastiness related to global variables, with these two being – in my opinion – the most important ones:
It is worth noting that these problems are somewhat language-specific. In several programming languages, you can relatively easily create “global” variables which are only apparent; in reality, they proxy to thread-local and/or mockable objects, addressing both concerns outlined above.
However, in such languages the Singleton pattern is often obsolete as explicit technique, because they readily provide it as part of the language. For example, Python module objects are already singletons: their singularity is guaranteed by interpreter itself.
So, if you are to discuss the merits of software design patterns: pros and (specifically) cons, make sure you don’t base your whole argumentation on the example of Singleton. Accuracy, integrity and honesty would require choosing a target which is more representative and has no severe, unrelated issues.
Something like, say, Iterator. Or Factory. Or Composite.
Or pretty much anything else.