What is your first association evoked by a mention of the Java programming language? If you said “slow”, I hope all is well back there in the 90s! But if you said “verbose” instead, you would be very much in the right. Its wordiness is almost proverbial by now, with only some small improvements – like lambdas – emerging at the distant horizon.
Sounds good? Sure it does. The only problem here is that most of those optimistic claims are patently false. Good real-world code written in high level, dynamic language does not necessarily end up being much shorter. Quite often, it is either on par or even longer than the equivalent source in Java et al.
It happens for several reasons, most of which are not immediately obvious looking only at a Hello World-like samples.
It is said Python is pretty much pseudocode that happens to be syntactically strict enough to parse and run. While usually meant as a compliment, this observation has an uglier counterbalance. Being a pseudocode, the raw Python source is woefully incomplete when it comes to providing necessary information for future maintainers. As a result, it gets out of hand rather quickly.
To rein it, we need to put that information back somehow. Here’s where the documentation and commenting part steps in. In case of Python, there is even a syntactical distinction on the language level between both. The former takes the form of docstrings attached to function and class definitions. They are meant to fill in the gaps those definitions always leave open, including such “trivialities” as what kind of arguments the function actually takes, what it returns, and what exceptions it may raise.
Even when you don’t write expansive prose with frequent usage examples and such, an adequate amount of docstrings will still take noticeable space. The reward you’ll get for your efforts won’t be impressive, too: you’ll just add the information you’d include anyway if you were writing the code in a static language.
The catch here is that you won’t even get the automatic assurance that you’ve added the information correctly… which brings us immediately to the next point.
There are two types of programmers: those who write tests, and those who will be writing tests. Undoubtedly, the best way to have someone join the first group is to make them write serious code in any interpreted, dynamic language.
In those languages, a comprehensive set of tests is probably the only automated and unambiguous way to ensure your code is satisfying even some basic invariants. Whole classes of errors that elsewhere would be eliminated by the compiler can slip completely undetected if the particular execution path is not exercised regularly. This includes trying to invoke a non-existent method; to pass an unexpected argument to a function; call a “function” which is not callable (or conversely, not call a function when it should have been); and many others. To my knowledge, none of these are normally detected by the various tools for static analysis (linting).
So, you write tests. You write so many tests, in fact, that they easily outmatch the very code they’re testing – if not necessarily in effort, then very likely in quantity. There’s no middle ground here: either you blanket your code with good tests coverage, or you can expect it to break in all kinds of silly ways.
Documentation, comments and tests take time and space. But at least the code in high level language can be short and sweet. Beautiful, elegant, crisp and concise, without any kind of unnecessary cruft!
Just why are you staring at me like that when I say I used a metaclass here?…
See, the problem with powerful and highly expressive languages is that they are dangerous. The old adage of power and responsibility does very much apply here. Without a proper harness, you will one day find out that:
Unless you can guarantee you’ll always have only real Perl/Python/Ruby rockstars on board, it’s necessary to tame that wild creativity at least a little bit. The inevitable side effect is that your code will sometimes have to be longer, at least compared to what that smart but mystifying technique would yield.
Saying “API” nowadays is thought first and foremost to refer to a collection of HTTP request handlers that expose data from some Web service in a machine-readable format (usually JSON). This is not the meaning of API I have in mind right here, though. The classic one talks about any conglomerate of programming language constructs – functions, classes, packages – that are available for us to use.
You can interact with an API in numerous different ways, but one of them is somewhat less common. Occasionally, besides having functions to call and objects to create, you are also presented with base classes to inherit. This is symptomatic to more complex libraries, written mostly in statically typed languages. Among them, the one that takes the most advantage of this technique is probably Java.
Note that what I’m talking about here is substantially different from implementing an interface. That is a relatively common occurrence, required when working with listeners and callbacks:
as well as in many other situations and patterns. There is nothing terribly special about it, given that even non-object oriented languages have equivalent mechanisms of accomplishing the same objective: separating the “how” from “what” in code, possibly to exchange or expand the latter.
Inheriting from a base class is something else entirely. The often criticized, ideologically skewed interpretation of OOP would claim that inheritance is meant to introduce more specialized kinds of existing types of objects. In practice, that’s almost completely missing the real point.
What’s important in creating a derived class is that:
Combined, these two qualities allow to introduce much more sophisticated ways of communication between the API and its client code. It’s a powerful tool and should be used sparingly, but certain types of libraries and (especially) frameworks can benefit greatly from it.
As an example, look at the Guice framework for dependency injection. If you use it in your project, you need to configure it by specifying bindings: mapping between the interfaces used by your code and their implementations. This is necessary for the framework can wire them together, possibly in more then one way (different in production code and in test code, for example).
Bindings are quite sophisticated constructs. For non-trivial applications – and these are the ones you generally want to apply dependency injection to – they cannot really be pinned down to a single function call. Other, similar frameworks would therefore use completely external configuration files (usually in XML), which has a lot of downsides.
Guice, however, has a sufficiently smart API that allows to realize all configuration in actual, compilable code. To achieve this, it uses inheritance, exactly as described above. Here’s a short sample:
Overridden methods in the above class (well, one method) serve as “sections” in the “configuration file”. Inherited methods, on the other hand, provide an internal, limited namespace that can be used to compose configuration “entries”. Since everything is real code, we can have the compiler check everything for some basic sanity as well.
Look at the following piece of jQuery code:
Of the two patterns it demonstrates, one is almost decisively bad: you shouldn’t build up DOM nodes this way. To get more concise and maintainable code, it’s better to use one of the client-side templating engines.
The second pattern, however, is hugely interesting. Most often called method chaining, it also goes by a more glamorous name of fluent interface. As you can see by a careful look at the code sample above, the idea is pretty simple:
Whenever a method is mostly mutating object’s state, it should return the object itself.
Prime example of methods that do that are setters: simple function whose pretty much only purpose is to alter the value of some property stored as a field inside the object. When augmented with support for chaining, they start to work very pleasantly with few other common patterns, such as builders in Java.
Here’s, for example, a piece of code constructing a Protocol Buffer message that doesn’t use its
Builder‘s fluent interface:
And here’s the equivalent that takes advantage of method chaining:
It may not be shorter by pure line count, but it’s definitely easier on the eyes without all these repetitions of (completely unnecessary)
builder variable. We could even say that the whole Builder pattern is almost completely hidden thanks to method chaining. And undoubtedly, this a very good thing, as that pattern is just a compensation for the deficiencies of Java programming language.
By now you’ve probably figured out how to implement method chaining. In derivatives of C language, that amounts to having a
return this; statement at the end of method’s body:
and possibly changing the return type from
void to the class itself, a pointer to it, or reference:
It’s true that it may slightly obscure the implementation of fluent class for people unfamiliar with the pattern. But this cost comes with a great benefit of making the usage clearer – which is almost always much more important.
Plus, if you are lucky to program in Python instead, you may just roll out a decorator ;-)
As a fact of life, in bigger projects you often cannot just delete something – be it function, method, class or module. Replacing all its usages with whatever is the new recommendation – if any! – is typically outside of your influence, capabilities or priorities. By no means it should be treated as lost cause, though; any codebase would be quickly overwhelmed by kludges if there were no way to jettison them.
To reconcile those two opposing needs – compatibility and cleanliness – the typical approach involves a transition period. During that time, the particular piece of API shall be marked as deprecated, which is a slightly theatrical term for ‘obsolete’ and ‘not intended for new code’. How effective this is depends strongly on target audience – for publicly available APIs, someone will always wake up and start screaming when the transition period ends.
For in-project interfaces, however, the blow may be effectively cushioned by using certain features of the language, IDE, source control, continuous integration, and so on. As an example, Java has the
@Deprecated annotation that can be applied to functions or classes:
If the symbol is then used somewhere else, it produces a compiler warning (and visual cue in most IDEs). These can be suppressed, of course, but it’s something you need to do explicitly through a complementary language construct.
So I had this idea to try and add similar mechanism to Python. One part of it is already present in its standard library: we have the
warnings module and a built-in category of
DeprecationWarnings. These can be ignored, suppressed, caught or even made into errors.
They are also pretty powerful, as they allow to deprecate certain code paths and not just symbols, which can be useful when introducing new meanings for function parameters, among other things. At the same time, it means using them is irritatingly imperative and adds clutter:
And in this particular case, it also doesn’t work as intended, for reasons that will become apparent later on.
What we’d like instead is something similar to annotation approach that is available in Java:
Given that the
@-things in Python (decorators, that is) are significantly more powerful than the Java counterparts, it shouldn’t be a tough call to achieve this…
Surprisingly, though, it turns out to be very tricky and quite arcane. The problems lie mostly in the subtle issues of what exactly constitutes “usage” of a symbol in Python, and how to actually detect it. If you try to come up with a few solutions, you’ll soon realize how the one that may eventually require walking through the interpreter call stack turns out to be the least insane one.
But hey, we didn’t go to the Moon because it was easy, right? ;) So let’s see how at least we can get started.
Coding for Android typically means writing Java, for all the good and bad it entails. The language itself is known of its verbosity, but the Android itself does not really encourage conciseness either.
To illustrate, look at this trivial activity that simply displays its application name and version, complete with a button that allows to close the app:
Phew, that’s a lot of work! While the IDE will help you substantially in crafting this code, it won’t help that much when it comes to reading it. You can easily see how a lot of stuff here is repeated over and over, most notably fields holding
View objects. Should you need to change something about them or add a new one, you have to go through all these places.
Not to mention that it simply looks cluttered.
What to do about it, though?… As it turns out there is a way to structure your Android code in a more succinct and readable way. Like a few other approaches to modern Java, it employs a palette of annotations. There is namely a project called Android Annotations that offers few dozens of them and aims to speed up Android development, making the code easier and more maintainable.
And it’s pretty damn good at that, I must say. Rewriting the previous snippet to use those annotations results in a class which looks roughly like this:
Not only we have eliminated all the boilerplate, leaving only the actual logic, but also made the code more declarative and explicit. Some of the irrelevant entities has been completely removed, too, like the
btnExit field which was only used to bind an event listener. Overall, it’s much more elegant and understandable code.
How about a bigger example? There is one on the project’s official page which looks pretty impressive. I can also weigh in my own anecdotal evidence of going through the Android game I wrote long ago and molding its code (a few KLOC) to work with AA. The result has been rather impressive, partially thanks to very light dependency injection facilities that the project provides, allowing me to replace many occurrences of
Game.get().getGfx().draw(...); silliness with just
So in closing, I can recommend AA to all Android devs out there. It will certainly make your lives easier!
There is this quite well known book, titled Clean Code. It is a very insightful work which I wholeheartedly recommend reading for any serious (or semi-serious) programmer.
The premise revolves around the concept of “cleanness” of code, which the author defines in various ways. Mostly it boils down to high signal-to-noise ratio and clear structure, where everything is neatly subdivided into smaller parts.
The idea is very appealing. For some it may even sound like a grand revelation; I know it was almost like that for me. But there is a bigger kind of meta-lesson to be learned here: the one of scope where such great ideas apply – and where they don’t.
You see, the Clean Code ideal works very well for certain kind of languages. The original examples in the book are laid down in Java and this is no coincidence. Java – as well as C, Go and probably few others – is not a very expressive language: lots of detailed busy work is often needed, even for conceptually simple tasks. And if several such tasks are lined up one after another, the reader is likely to drown in small details instead of seeing the bigger picture.
Those details are therefore one of the prime reasons why you may call some code “unclean”. To tidy up, they need to be properly encapsulated, away from a higher level overview. Hence it’s pretty common in Java to see functions like this one:
and think of them as good code, even if such a function is only used once. The goodness comes from the fact that they are relieving their callers from irrelevant loop minutiae, so that it’s easier to see why we need
foos to be valid in the first place.
Yet, at the same time, if you saw a completely equivalent Python construct:
you would firstly curse at incompetence and lack of knowledge of whoever wrote it:
and then you would get rid of the function altogether:
Why? Because the language is expressive enough for implementation itself to be almost as readable as the function call. Sure, throw the result of
all(...) into a variable for even more self-documenting sweetness, but don’t put it in some far away place behind a standalone function. Such code may or may not be cleaner; it will definitely raise eyebrows, though.
And lack of astonishment is probably the most important metric.
Mocks are used to “fake” the behavior of objects to the extent required by tests. Most of the time, they can be pretty simple: when calling a method with given set of (constant) argument, then return some predefined value. Mocking frameworks actually strive to translate the previous sentence directly into code, which can be seen in Java’s Mockito library:
What is evident here is that this mock doesn’t maintain any state. As a result, all stubbed method are completely independent and referentially transparent: they always return the same value for the same set of arguments. In reality, that property is even stronger because they are simply constant.
Simple mocks like those work for a good number of cases. Some may even argue that it’s the only way to mock properly, because anything more complicated is a sign of code smell. We can counter that by saying that some complexity is often unavoidable, and it’s up to developers to decide whether they want to put it test or tested code.
And thus more complex stubbing may be needed if, for some case, you prefer the former approach. When that happens, it’s good to know how to fabricate a bit more sophisticated behavior. In Mockito, it’s very possible to achieve almost arbitrary complexity by capturing arguments of stubbed calls and routing them to callbacks. This is done with the help of
ArgumentCaptor class and the
Answer interface, respectively.
When combined, they can result in pretty clever mocks, as illustrated with this example of some imaginary API for SMTP (e-mail sending protocol):
Note that it doesn’t implement the real SMTP. That, presumably, would mimic the
SmtpConnection class exactly, which of course is not the point of mocking at all.
The point is to provide sufficiently complete behavior so that code under test can proceed, reaching all the execution paths we want to exercise. In this case, something can call
getOutgoingClientData and rely on certain string being present in the output.
Let’s just hope that this “something” actually has some good reasons to expect that…