When working with dictionaries in Python, or any equivalent data structure in some other language, it is quite important to remember the difference between a key which is not present and a key that maps to None
(null
) value. We often tend to blur the distinction by using dict.get
:
We do that because more often that not, None
and other falsy values (such as empty strings) are not interesting on their own, so we may as well lump them together with the “no value at all” case.
There are some situations, however, where these variants shall be treated separately. One of them is building a dictionary of keyword arguments that are subsequently ‘unpacked’ through the **kwargs
construct. Consider, for example, this code:
With a key mapping to None
, we’re calling the function with argument explicitly set to None
. Without the key present, we’re not passing the argument at all, allowing it to assume its default value.
But adding or not adding a key to dictionary is somewhat more cumbersome than mapping it to some value or None
. The latter can be done with conditional expression (x if cond else None
), together with many other keys and value at once. The former requires an if
statement, as shown above.
Would it be convenient if we had a special “missing
” value that could be used like None
, but caused the key to not be added to dictionary at all? If we had it, we could (for example) rewrite parts of the previous function that currently contain if
branches:
It shouldn’t be surprising that we could totally introduce such a value and extend dict
to support this functionality – after all, it’s Python we’re talking about :) Patching the dict
class itself is of course impossible, but we can inherit it and come up with something like the following piece:
The magical missing
object is only a marker here, used to filter out keys that we want to ignore.
With this class at hand, some dictionary manipulations become a bit shorter:
We could take this idea further and add support for missing
not only initialization, but also other dictionary operations – most notably the __setitem__
assignments. This gist shows how it could be done.
At work I’ve got a colleague who displays unusual aptitude in coming up with amusing terminology for everyday (coding) things. Among those, the Cake Pattern is always certain to provoke few laughs.
Recently, though, I heard him mention a great common sense rule for any development decision, from the grand architecture down to single line of code. It can be phrased like this:
One hack is fine. But if you need another one on top of the first, it’s probably high time to really consider what you are doing.
For good measure, I’ll call it a Principle of Two Hacks, and I’m pretty convinced that it’s a very beneficial rule to apply in programming – especially when creating any non-trivial, not throw-away programs.
At first it may sound rather vague, however. It’s concept of a “hack” is not easily explicable, or put into indisputable definitions. But that’s what makes it powerful: we don’t need to invoke elaborate (and often controversial) notions of design patterns or abstraction to be able to discuss them.
At best, the ideas of accidental complexity or technical debt might be somewhat close to what developers typically deem as a hack. In practice, this is mostly an opaque intuition that stems from experience or skill, and it’s usually very hard to express in words. Yet, it’s always apparent whenever we encounter it, even though the exact sensation may vary considerably: from a dim feeling that something is maybe a bit off, up to severe intellectual nausea caused by looking at really bad code.
I also like how extremely pragmatic this principle is. Quick-and-dirty fixes making it into production code are just a fact of life, and we are not prohibited from letting them slip. What we are strongly advised here is to maintain integrity of the software we’re writing, trying not to stack one hack upon another.
But even that is not absolute, nonnegotiable gospel; there might still be valid reasons to loosen up its structure. The important part, however, is to notice when we’re doing something fishy and consciously decide whether or not it is a good idea. It is much better than just plunging forward with total disregard of sanity of future maintainers.
Ultimately, this principle is just subtly telling us to think, and that is never a bad advice.
While at work a few days ago, I had an interesting albeit weird problem which started with the following cryptic error message:
ERROR 1005: can’t create table `qtn_formdisplay_product` (errno: 150)
It was produced by a local MySQL server running on my development machine when I tried to rebuild test database to accommodate for some model changes happening in the codebase. As you might have noticed, it’s not terribly informative, with the errno number as the only useful tidbit. A cursory glance at top search result for this message said that the most probable cause was a malformed FOREIGN KEY
constraint inside the offending CREATE TABLE
query.
Upon reading this, I blinked several times; something here was definitely off. The query wasn’t of course written by hand – if it was, we could at least consider an actual mistake to be a problem here. But no, it came from ORM – and not just an ORM, but the best ORM known to mankind. While obviously nothing is perfect, I would think it’s extremely unlikely that I found a serious bug in a widely used library just by doing something as innocent as creating a table with foreign key. I’m not that good, after all ;)
Well, except that it could totally be such a bug. The before mentioned search results also pointed to MySQL issue tracker where it was suggested that the error might happen after trying to create foreign key constraint with duplicate name. Supposedly, this could “corrupt” the parent table and no new FOREIGN KEY
s could reference it anymore, yielding the errno 150 if we attempted to create one. While it could not explain the behavior I observed (the parent table was freshly created), it raised some doubts whether MySQL itself may be to blame here.
These were exacerbated when one of my colleagues tried out the same procedure, and it worked for him just fine. He turned out to use newer version of MySQL, though: 5.5 versus 5.1. This appeared to support the hypothesis about a possible bug in MySQL but it didn’t seem to help one bit to get the thing running on the older version.
However, it was an important clue that something relevant changed in between, that had an influence on the whole issue. It was not really any particular bugfix or new feature: it was a change of defaults.
It would be quite far-fetched to call JavaScript a functional language, for it lacks many more sophisticated features from the FP paradigm – like tail recursion or automatic currying. This puts it on par with many similar languages which incorporate just enough of FP to make it useful but not as much as to blur their fundamental, imperative nature (and confuse programmers in the process). C++, Python or Ruby are a few examples, and on the surface JavaScript seems to place itself in the same region as well.
Except that it doesn’t. The numerous different purposes that JavaScript code uses functions makes it very distinct, even though the functions themselves are of very typical sort, found in almost all imperative languages. Learning to recognize those different roles and the real meaning of function
keyword is essential to becoming an effective JS coder.
So, let’s look into them one by one and see what the function
might really mean.
If you’ve seen few good JavaScript libraries, you have surely stumbled upon the following idiom:
Any and all code is enclosed within an anonymous function
. It’s not even stored in a var
iable; it’s just called immediately so its content is just executed, now.
This round-trip may easily be thought as if doing absolutely nothing but there is an important reason for keeping it that way. The point is that JavaScript has just one global object (window
in case of web browsers) which is a fragile namespace, easily polluted by defining things directly at the script level.
We can prevent that by using “bracketing” technique presented above, and putting everything inside this big, anonymous function. It works because JavaScript has function scope and it’s the only type of non-global scope available to the programmer.
So in the example above, the function
is used to confine script’s code and all the symbols it defines. But sometimes we obviously want to let some things through, while restricting access to some others – a concept known as encapsulation and exposing an interface.
Perhaps unsurprisingly, in JavaScript this is also done with the help of a function
:
What we get here is normal JS object but it should be thought of more like a module. It offers some public interface in the form of increment
and getValue
functions. But underneath, it also has some internal data stored within a closure: the value
variable. If you know few things about C or C++, you can easily see parallels with header files (.h, .hpp, …) which store declarations that are only implemented in the code files (.c, .cpp).
Or, alternatively, you may draw analogies to C# or Java with their public and private (or protected) members of a class. Incidentally, this leads us to another point…
Let’s assume that the counter
object from the example above is practical enough to be useful in more than one place (a tall order, I know). The DRY principle of course prohibits blatant duplication of code such as this, so we’d like to make the piece more reusable.
Here’s how we typically tackle this problem – still using only vanilla function
s:
Pretty straightforward, right? Instead of calling the function on a spot, we keep it around and use to create multiple objects. Hence the function becomes a constructor for them, while the whole mechanism is nothing else but a foundation for object-oriented programming.
We have now covered most (if not all) roles that functions play when it comes to structuring JavaScript code. What remains is to recognize how they interplay with each other to control the execution path of a program. Given the highly asynchronous nature of JavaScript (on both client and server side), it’s totally expected that we will see a lot of functions in any typical JS code.