As a fact of life, in bigger projects you often cannot just delete something – be it function, method, class or module. Replacing all its usages with whatever is the new recommendation – if any! – is typically outside of your influence, capabilities or priorities. By no means it should be treated as lost cause, though; any codebase would be quickly overwhelmed by kludges if there were no way to jettison them.
To reconcile those two opposing needs – compatibility and cleanliness – the typical approach involves a transition period. During that time, the particular piece of API shall be marked as deprecated, which is a slightly theatrical term for ‘obsolete’ and ‘not intended for new code’. How effective this is depends strongly on target audience – for publicly available APIs, someone will always wake up and start screaming when the transition period ends.
For in-project interfaces, however, the blow may be effectively cushioned by using certain features of the language, IDE, source control, continuous integration, and so on. As an example, Java has the @Deprecated
annotation that can be applied to functions or classes:
If the symbol is then used somewhere else, it produces a compiler warning (and visual cue in most IDEs). These can be suppressed, of course, but it’s something you need to do explicitly through a complementary language construct.
So I had this idea to try and add similar mechanism to Python. One part of it is already present in its standard library: we have the warnings
module and a built-in category of DeprecationWarning
s. These can be ignored, suppressed, caught or even made into errors.
They are also pretty powerful, as they allow to deprecate certain code paths and not just symbols, which can be useful when introducing new meanings for function parameters, among other things. At the same time, it means using them is irritatingly imperative and adds clutter:
And in this particular case, it also doesn’t work as intended, for reasons that will become apparent later on.
What we’d like instead is something similar to annotation approach that is available in Java:
Given that the @
-things in Python (decorators, that is) are significantly more powerful than the Java counterparts, it shouldn’t be a tough call to achieve this…
Surprisingly, though, it turns out to be very tricky and quite arcane. The problems lie mostly in the subtle issues of what exactly constitutes “usage” of a symbol in Python, and how to actually detect it. If you try to come up with a few solutions, you’ll soon realize how the one that may eventually require walking through the interpreter call stack turns out to be the least insane one.
But hey, we didn’t go to the Moon because it was easy, right? ;) So let’s see how at least we can get started.
Many are the quirks of shell scripting. Most are related to confusing syntax, but some come from certain surprising semantics of Bash as a language, as well as the way scripts are executed.
Consider, for example, that you’d like to list files that are within certain size range. This is something you cannot do with ls alone. And while there’s certainly some awk incantation that makes it trivial, let’s assume you’re a rare kind of scripter who actually likes their hacks readable:
So you use an explicit while
loop, obtain the file size using stat and compare it to given bounds using a straightforward if
statement. Pretty simple code that shouldn’t cause any troubles later on… right?
But as your needs grow, you find that you also want to count how many files fall within your range, and how many do not. Given that you have an explicit if
, it appears like a simple addition (in quite literal sense):
Why it doesn’t work, then? Because clearly this is not the output we’re looking for (ls_between is our script here):
It seems that neither matches nor misses are counted properly, even though it’s clear from the printed list that everything is fine with our if
statement and loop. Wherein lies the problem?
Last weekend I attended the PyGrunn conference, back in the good ol’ Netherlands. It was a very enjoyable and instructive event, featuring not only a few local speakers but also some prominent figures from the Python community – like Kenneth Reitz or Armin Ronacher. Overall, it was a weekend of some great pythonic fun.
…Except for a one small detail. As I’ve learned there, Python will apparently get to have enums in the 3.4 version of the language. To say that this was baffling to me would be a severe understatement. Not only I see very little need for such a feature, but I also have serious doubts whether it fits into the general spirit of Python language.
Why so? It’s mostly because of the main purpose of enumeration types, derived from their usage in many languages that already have them as a feature. That purpose is to turn arbitrary data – mostly integers and strings – into well-known, reliable entities that can be safely manipulated inside our programs. Enums act as border guardians, filtering out unexpected data and converting expected data into its safe representation.
What’s safety in this context? It’s type safety, of course. Thanks to enumeration types, we can be certain that a particular value belongs to specific and constrained set of elements. They clearly define all the variants that our code should handle, because everything else was already culled at the conversion stage.
Problem is, in Python there is nothing that would guarantee those safety promises are actually fulfilled. True, the basic property of enum types is retained: given an enum object, we know it must belong to a preordained set of entities. But there is nothing that ensures we are dealing with an enum object at all – short of us actually checking that ourselves:
How this is different from a straightforward membership check:
except for the latter looking cleaner and more explicit?…
This saying, I do not claim enums are completely out of place in Python. This is untrue, if simply because of the fact they are easily implemented through a rather simple metaclass. In fact, this is exactly how the proposed enums.Enum
base is supposed to work.
At the same time, it is also possible to provide some of the before-mentioned type safety. Just look into various libraries that enhance Python with support for contracts: a form of type safety which is even more powerful than what you’ll find in many statically typed languages. You are free to use them, and you should definitely do, if your project would benefit from such a functionality.
Incidentally, they fit right in with that new & upcoming enumeration types from Python 3.4. It remains to be seen what it means exactly for the overall direction of the language design. But with enums in place, the style of “checked” typing suddenly became a lot more pythonic than it was before.
And I can’t say I like it very much.
Much can be said about similarities between two popular, distributed version control systems: Git and Mercurial. For the most part, choosing between them can be a matter of taste. And because Git seems to have considerably more proponents in the Cool Kids camp, it doesn’t necessarily follow it’s a better option.
But I have found at least one specific and common scenario where Git clearly outshines Hg. Suppose you have coded yourself into a dead end: the feature you planned doesn’t pan out the way you wanted it; or you have some compatibility issues you cannot easily resolve; or you just need to escape the refactoring bathtub.
In any case, you just want to step back a few commits and pretend nothing happened, for now. The mishap might be useful later, though, so it’d be nice if we left it marked for the future.
In Git, this is easily done. You would start by creating a new branch that points to your dead end:
Afterwards, both my_feature
and __my_feature__dead_end__
refer to the same, head commit. We would then move the former a little back, sticking it to one of the earlier hashes. Let’s find a suitable target:
If it looks right, we can reset the my_feature
branch so it points to this specific commit:
Our final situation would then looks like this:
which is exactly what we wanted. Note how any further commits starting from the referent of my_feature
would fork at that point, diverging from development line which has lead us into dead end.
Why the same thing is not so easily done in Mercurial?… In general, this is mostly because of its one fundamental design decision: every commit belongs to one branch, forever and for always. Branch designation is actually part of the changeset’s metadata, just like the commit message or diff. Moving things around – like we did above – is therefore equivalent to changing history and requires tools that are capable of doing so, such as hg strip
.