The Year Gone By

2012-12-31 17:17

Times are tough. Everyone needs to spend most of their time scavenging food, warding off zombies and searching for survivors, so I’ll be brief with my obligatory, almost-traditional new year’s summary post.

The life

This will be in stark contrast with how freaking long this year has been for me. Long and very eventful, especially in the travel, relocation and job departments. Suffice it to say, I switched countries more than once: from Poland to the Netherlands, and then directly to Switzerland. Even with (supposedly) free movement of people in and around the EU, this is still quite a cumbersome and time-consuming process.

When it comes to work, I discovered that I’m not actually very fond of squeezing UI and logic into tiny pocket devices. Distributed, server-side systems turned out to be much more interesting and important to me, so I adjusted my job choices accordingly. And I like it a lot, especially since I started working in that hot new Internet startup ;)

The blog

Over the year, I’ve been trying to keep up with posting interesting stuff on a regular basis. Although sometimes reality got in the way, I think I managed to publish at least a few nice pieces. I’m especially satisfied with:

The end of 2012 also marks (a little over) a year of writing this blog in English. As I expected, making the language switch didn’t really have a negative impact on readership numbers.

However, I’m still getting occasional complaints about this very move, which is rather astonishing at this point. Nevertheless, I’ve spent some time thinking about this issue and about ways to reach out to the non-international audience. Not promising anything at this point, but maybe in 2013 there will be something for you too :)

In the meantime, enjoy the new year!

Author: Xion, posted under Computer Science & IT, Life » 1 comment

Increasing Utility of Utility Modules

2012-12-26 18:53

Admit it: all your projects have this one slightly odd part. It has many different names: “helper” classes, “common” functions, “auxiliary” code (if you’re that erudite), “shared” subroutines… Or simply a utility module, which is what I will call it from here on.

This is the logic outside of the application’s core. The non-essential building blocks which are nevertheless used throughout the whole project. A heap of scraps, squirreled away to plug the holes in your language, framework or libraries.

Those flea markets are notoriously difficult to keep in check. If left to their own devices, they gradually expand, swallowing more and more of incoming logic – code that would otherwise go into more specific, adequate places. This is where you can carefully observe the second law of programming dynamics: without directed influence, entropy can never decrease.

You cannot do away with utility modules, though. There will always be code that exhibits their two characteristic properties:

  • frequent usage throughout most of the project’s codebase
  • loose coupling with the rest of the project

But neither is binary: they are more like a continuous spectrum. And to make it even more difficult, they are also constantly in flux, affected by any change that happens in the code. To remain minimal (and thus manageable), utility modules require probably the most frequent, extensive and aggressive refactoring.

So how exactly do you deal with this necessary menace? I think it’s a matter of reacting quickly and decisively when one of these properties changes. For the most part, the usage and (inter)dependencies of your utility package will dictate which one of these four steps you should take:

  • When a certain entity (function, class) grows universal, up to the point where it’s used by more than one subpackage inside your app, it is usually time to put it into the utility module.
    Although somewhat obvious, this point is really important for one particular reason: preventing code duplication. Without a designated place for helper/common/etc. code, you will have different implementations of the very same things spread like vermin across the whole codebase.
    And you really don’t want to have four distinct functions for turning a sequence [X, Y, Z] into a string "X, Y and Z", or something similar to that. This is way worse than even the biggest utility module you might end up dealing with.
  • Symmetrically, whenever an entity stops being used by more than one part of the program, you may roll it back into that sole part which still requires it.
    For most, this will be the trickiest part. I can tell you upfront that you shouldn’t actually do it every time. There are things basic enough that you don’t need to track their usage at all: the “join with ‘and’” example might be one such instance.
    But there are times when it’s more reasonable to put some utility stuff back into a more dedicated package, because it simply belongs there. If you need to have that “and-join” localized, for example, you will likely find it more sensible to place its modified version into the internationalization package. On the other hand, if you learn that it’s used only to format log messages which are never seen by the user, you can put it directly into the logging module or just scrap it altogether. (After all, programmers are not easily offended by poorly punctuated sentences.)
  • When utility code evolves into a subpackage of its own, extract it as such.
    When your internationalization needs require not only a localized “and-join” but also a way to match numeric values with noun plurals ("1 apple" vs "3 apples"; my native language is especially complex here), this can be the reason for creating a full-blown i18n package.
    It is quite a natural process for simple but useful snippets – ones that typically land in the shared module – to grow more robust, functional and thus complex. At a certain size threshold, it’s reasonable to promote them to the next structural level: module or package.
  • Loosening up already loose ties may call for creating an external library.
    You probably don’t write code with the express intent of reusing it across different projects, regardless of whether you were convinced by The Mythical Man-Month that it costs three times as much to do so. But “accidents” happen, and you can sometimes find a non-trivial piece of utility code that appears to be floating, that doesn’t depend on anything else inside the parent project.
    Should you make such a discovery, by all means make it into a library. It doesn’t have to be a real, external library – much less an open source one. Releasing a project into the wild is a serious and time-consuming task, so I won’t ask you to do it every time (even though the world would benefit greatly if you did).
    Treat it as third-party code, though. Separate it into a different root package, add a new build target for it, include it as a .jar or .egg rather than .java or .py files – and so on. This is a small investment that makes it much more likely for the code to increase its value threefold.
    Reducing the size of that pesky utility module will be just an added bonus.
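To make the running “and-join” example from the list above a bit more tangible, here is a minimal sketch of what such a helper could look like once it lands in the utility module. The name join_with_and and its conjunction parameter are made up for illustration; they don’t come from any particular project.

```python
def join_with_and(items, conjunction="and"):
    """Join items into a human-readable list, e.g. "X, Y and Z".

    Hypothetical utility function; name and signature are illustrative.
    """
    items = [str(item) for item in items]
    if len(items) <= 1:
        # Empty or single-element input needs no separators at all.
        return "".join(items)
    # All items but the last are comma-separated;
    # the last one is attached with the conjunction.
    return "%s %s %s" % (", ".join(items[:-1]), conjunction, items[-1])
```

Having the conjunction as a parameter is roughly the point where localization concerns start creeping in, which is exactly the kind of pressure that may eventually push the function out of the utility module.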

So help your helper classes and utilize your utility modules for shared, common good :)

Author: Xion, posted under Computer Science & IT » Comments Off on Increasing Utility of Utility Modules

Alternative @property Syntax

2012-12-19 22:04

As you probably know very well, in Python you can add properties to your classes. They behave like instance fields syntactically, but under the hood they call accessor functions whenever you want to get or set the property value:

    import os

    class Directory(object):
        """Simple class representing a directory in the file system."""
        def __init__(self, path):
            self.path = path

        @property
        def parent(self):
            """Parent directory."""
            return Directory(os.path.join(self.path, os.pardir))

    # usage: no () after .parent
    home_dir = Directory('/home/xion')
    root_dir = home_dir.parent.parent

Often – like in the example above – properties are read-only, providing only the getter method. It’s very easy to define them, too: just stick a @property decorator above the method definition and you’re good to go.

iGet, iSet

Occasionally though, you will want to define a read-write property. (Or read-delete, but those are very rare.) One function won’t cut it, since you need a setter in addition to the getter. The canonical way Python docs recommend in such a case (at least since 2.6) is to use the setter decorator of the property itself, i.e. @x.setter for a property named x:

    class TracedObject(object):
        """Object that tracks changes to its properties."""
        def __init__(self):
            self.changed = False

        @property
        def x(self):
            return self._x

        @x.setter
        def x(self, value):
            self._x = value
            self.changed = True

Besides the fact that I find it ugly to split a single property between two methods, this approach will annoy many static code analyzers (including the PEP8 checker) due to the redefinition of x. Warnings like that are very useful in general, so we certainly don’t want to turn them off completely just to define a property or two.

So if our analyzer doesn’t support line-based warning suppression (like, again, pep8), we may want to look for a different solution.
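One such solution, and not necessarily the one this post was originally heading towards, is to fall back on the plain property() built-in and hand it the getter and setter explicitly. The sketch below rewrites TracedObject that way; the _get_x and _set_x names are arbitrary, and the initial value of _x is an assumption added so that reading x before setting it doesn’t fail.

```python
class TracedObject(object):
    """Object that tracks changes to its properties."""

    def __init__(self):
        self._x = None  # assumed initial value, not in the original class
        self.changed = False

    def _get_x(self):
        return self._x

    def _set_x(self, value):
        self._x = value
        self.changed = True

    # A single assignment defines the read-write property,
    # so the name x is bound only once in the class body.
    x = property(_get_x, _set_x, doc="Traced value.")
```

The price is two extra names in the class namespace, but nothing gets redefined, so there should be nothing for a redefinition warning to latch onto.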

Author: Xion, posted under Programming » 2 comments

Command Line Parsing in Python: Tips & Tricks

2012-12-13 21:49

Reading a program’s command line and doing something with the arguments is the main purpose of most small (or bigger) utilities. Those are often written in Python – because of how easy and fast this is – so there should be a way to parse the command line in Python, too.
And in fact there are quite a few of them, all from the standard library. But the argparse module is most likely the best of them all, equally for its flexibility and power, as well as the sole fact of not being deprecated yet ;-)

For that matter, I have already used it several times, not only in Python. Today I want to present a summary of a few useful techniques and solutions that I learned along the way, mostly by braving the not-so-friendly documentation of argparse. Given I’m not likely to do unusual stuff here, they should also address quite common, albeit less trivial use cases.

Boolean flags

Following the convention of every operating system imaginable, argparse has positional arguments and flags. Flags are denoted by one or two dashes preceding the name or its one-letter abbreviation:

    $ git commit -m "Fix stuff"
    $ hg bisect --bad 42
    $ ln -s ~/node_modules/foobar-0.0.1/bin/foobar ~/bin/foobar

Normally in argparse, flags take arguments that are later stored in the result object. This would be helpful for parsing something like the -m (message) flag in the git commit example above.
Not every flag needs to behave like that, though. In the last ln example, the -s does not take any arguments. Instead, it alters the program behavior by its mere presence: with it, ln creates a symbolic link instead of a “hard” link. So in a sense, the flag is boolean. We would like to handle it as such.

In argparse, this is possible by setting the appropriate action= in the add_argument method:

    parser.add_argument("--symbolic", "-s", action='store_true', default=False)

Depending on what’s more logical for your program, you can reverse the logic to 'store_false' and default=True, of course.
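For completeness, here is how that single line might fit into a tiny, self-contained parser. The prog name and the symlink_parser variable are just placeholders for this sketch.

```python
import argparse

# Hypothetical ln-like parser with one boolean flag.
symlink_parser = argparse.ArgumentParser(prog="ln")
symlink_parser.add_argument("--symbolic", "-s",
                            action="store_true", default=False)

# The flag takes no value; its mere presence flips the attribute.
with_flag = symlink_parser.parse_args(["-s"])
without_flag = symlink_parser.parse_args([])
print(with_flag.symbolic, without_flag.symbolic)
```

Both the short and the long spelling end up in the same symbolic attribute, since argparse derives the attribute name from the first long option string.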

Multiple positional arguments

If your program takes one entity as an argument and does something specific with it, users will often expect it to work with multiple entities too. You can observe it first hand with pip:

    $ pip install Flask
    $ pip install Flask WTForms SQLAlchemy celery pytz

or any version control application:

    $ git add README
    $ git add foo.h foo.c Makefile

There is no reason to ignore this expectation and it’s pretty easy to satisfy in argparse. Again, there is an action= for that:

    parser.add_argument("--foo", action='append')

and it’s sufficient for flags. Here the object returned by parse_args will get a foo attribute with the list of arguments from all occurrences of --foo.

For positionals, it’s a little bit trickier because by default, they are meant to appear exactly once. This can be changed using nargs=:

    parser.add_argument("files", nargs='+')

The value of '+' is probably the most useful here, as it requires the argument to be present at least once. Just like for flags, the result will be a list of all its occurrences, so you can iterate or map over it easily.
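Putting both variants together, a small sketch could look like this. The multi_parser name and the sample arguments are invented for the example.

```python
import argparse

multi_parser = argparse.ArgumentParser()
# Repeatable flag: every occurrence appends to the same list.
multi_parser.add_argument("--foo", action="append")
# Positional that must appear at least once.
multi_parser.add_argument("files", nargs="+")

args = multi_parser.parse_args(
    ["--foo", "a", "--foo", "b", "README", "Makefile"])
for filename in args.files:
    print(filename)
```

One thing worth keeping in mind: when --foo is not given at all, the attribute is None rather than an empty list, unless you pass default=[] yourself.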

Optional positional arguments

Less typically, you may want to have a positional argument which can be supplied or not (an optional one). Although it is possible with the API outlined above, I wouldn’t recommend it: you will have to deal with unnecessary 0-or-1-element list and you won’t get proper error checking at the argparse level.

The correct solution involves nargs=, too, but with a dedicated '?' value:

    parser.add_argument("cache_dir", nargs='?', default='/tmp')

As you may guess, default= allows you to specify the value in parse_args result should the argument be omitted.
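A quick sketch of how this behaves in practice, with cache_parser and the sample path being made up for illustration:

```python
import argparse

cache_parser = argparse.ArgumentParser()
# Zero or one occurrence; the result is a plain value, not a list.
cache_parser.add_argument("cache_dir", nargs="?", default="/tmp")

omitted = cache_parser.parse_args([])
supplied = cache_parser.parse_args(["/var/cache"])
print(omitted.cache_dir, supplied.cache_dir)
```

Note that in both cases cache_dir is a plain string, which is what makes '?' nicer here than juggling a 0-or-1-element list.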


Once you set up your ArgumentParser, you will (hopefully) want to test it. Lucky for you, this can be done easily without ever touching the actual command line. Simply pass your arguments (as a list) to parse_args and it will use them instead of sys.argv:

    >>> parser.parse_args(['--foo', 'bar'])
    Namespace(foo='bar')

With this you can easily write some nice unit tests for your parser – which you should do, obviously. What you should not do, however, is abuse this feature to call your program’s code from itself:

    # Anti-pattern: shown only as something to avoid.
    def main(argv=sys.argv[1:]):
        args = parser.parse_args(argv)
        # ...

    # later, somewhere deep inside...
    main(['other_function', 'one_argument', '--key', 'value'])

Just don’t.
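As for the legitimate use, i.e. the unit tests mentioned above, a minimal sketch with the standard unittest module might look as follows. The build_parser function is a hypothetical factory introduced for this example; in a real project it would live next to your main().

```python
import argparse
import unittest


def build_parser():
    """Hypothetical factory returning the parser under test."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--foo")
    parser.add_argument("files", nargs="*")
    return parser


class ParserTest(unittest.TestCase):
    def setUp(self):
        self.parser = build_parser()

    def test_flag_with_value(self):
        args = self.parser.parse_args(["--foo", "bar"])
        self.assertEqual("bar", args.foo)

    def test_unknown_flag_is_rejected(self):
        # argparse reports errors by exiting, hence SystemExit.
        with self.assertRaises(SystemExit):
            self.parser.parse_args(["--no-such-flag"])
```

Saved in a file, this can be run with python -m unittest. Building the parser in a separate function is what makes it testable without going anywhere near sys.argv.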

Read more

There are, of course, many other interesting features and applications of argparse that you will find useful. I can especially recommend that you get to know about:

  • subparsers, a way to divide your complex tool into several internal commands (like git or pip)
  • argument groups for organizing your arguments into functional groups for better --help output, or for mutual exclusion (e.g. --verbose and --quiet option)
  • help text formatting, handy for more elaborate descriptions that need their whitespace preserved
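To give a taste of the three features listed above, here is a compact sketch combining them in one hypothetical tool. The command names (install, remove), the description text and the variable names are all invented for the example.

```python
import argparse

tool_parser = argparse.ArgumentParser(
    prog="tool",
    description="Package tool.\n\n  Keeps this indentation as-is.",
    # Preserve whitespace in the description shown by --help.
    formatter_class=argparse.RawDescriptionHelpFormatter)

# --verbose and --quiet cannot be used together.
noise = tool_parser.add_mutually_exclusive_group()
noise.add_argument("--verbose", action="store_true")
noise.add_argument("--quiet", action="store_true")

# Each internal command gets its own parser and arguments.
commands = tool_parser.add_subparsers(dest="command")
install = commands.add_parser("install", help="install packages")
install.add_argument("packages", nargs="+")
remove = commands.add_parser("remove", help="remove packages")
remove.add_argument("packages", nargs="+")

tool_args = tool_parser.parse_args(["--verbose", "install", "Flask", "pytz"])
print(tool_args.command, tool_args.packages)
```

The dest="command" part is what lets you find out afterwards which subcommand was actually chosen.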

Equipped with this knowledge, you should be able to write beautiful and easy to use command line tools. Please do so :)

Author: Xion, posted under Programming » Comments Off on Command Line Parsing in Python: Tips & Tricks

Ultimate Tea Solution

2012-12-05 22:15

Programmers are known for using various, ahem, cognitive enhancers (all legal, of course), with coffee as probably the most popular. Well, I’m an avid tea drinker instead, and I’m always on the lookout for new flavors, brewing techniques and equipment.

Today I’d like to present a perfect example from the last category. I’ve found it purely by accident while on one of the many trips to IKEA that I’ve undertaken in the last few days. It’s an ingenious teapot that makes it super easy to brew tea, pour it and – finally – get rid of used-up leaves.
In the past I used several different types of pots with built-in strainers, as well as standalone infusers, and it was always the cleanup part that turned out to be the most cumbersome. Soaked tea leaves don’t come off easily from infusers’ metallic lattice, so you have to flush the remnants out with a direct water stream and risk clogging up the sink (eventually).

Overall, it’s just messy, not very clever and hardly user-friendly.

They say it helps with coffee too,
but I find this fact irrelevant.

Fortunately, the teapot I have found has solved it in a much smarter way. There is no separate insert where the leaves should go. Instead, you are supposed to put them directly inside the glass container and pour water straight into it.

This, obviously, seems like an extremely old-fashioned way of brewing tea, but it is also one of the best ones. Leaves are given plenty of space here to spread the flavor throughout the whole pot, rather than being crumpled and confined to the small volume of typical infusers. As a result you may often shorten the brewing time while still getting a richer taste in the end.

Problems arise when you’d like to pour some tea into your cup or glass and you don’t fancy getting some of those pesky leaves along with it. This is also where the teapot in question shows its ingenuity – or more precisely, it’s the cap of it that does.

Designers have equipped it with a piston made of fine-grained lattice that goes up and down the pot’s cylindrical body. The idea is just bizarrely simple: once your tea has extracted enough goodness from the leaves floating within, you can just press the piston all the way down. This collects all stray leaves and keeps them conveniently at the bottom of the pot, so that nothing gets through when you try to fill your cup.

Cleaning is also very easy: you simply run some tap water through the piston and into the glass, flushing the former while keeping all the leaves inside the pot. Afterwards, you just flush everything down the toilet and wash the teapot normally (e.g. in a dishwasher). It’s effective, clean and simple.

And with a steady supply of tea, your code will likely be so too! :)

Author: Xion, posted under Life, Programming » 6 comments

Don’t Throw Old Books Away

2012-11-29 22:53

When moving in to a new flat just a few days ago, I had to find a place for all the various IT books I’ve accumulated over the years. It’s not exactly a copious amount, but it was just enough to fill the shelves of quite a big bookcase. Shuffling through them, I was surprised to find some really old ones, straight from the long forgotten era of Windows 9x.

Like these.

Many of the technologies they describe are long outdated (the different techniques for rendering 3D effects age particularly fast, for example). None of them seem to be completely phased out – you need to wait a few decades for that, not just one – but I doubt they get used very much outside of maintenance of legacy systems. And definitely no one gets excited about, say, Delphi or Visual Basic now. That boat simply doesn’t float anymore, partially because its ocean – desktop platforms in general and Windows in particular – has been slowly leaking away for quite some time now.

Does it mean all the books dealing with those nigh-ancient subjects are little more than paper waste now? I wouldn’t be so sure. The whole purpose of IT books is far from being an up-to-date reference of anything. Bits over wire travel much faster than letters on paper, after all. It doesn’t happen very often that a developer needs to consult a book regularly, especially during a coding session. Although some timeless classics can be an exception, online documentation or sites like StackOverflow have replaced books for most intents and purposes.

Most – but not all. Some topics are better tackled practically if you first have a bit of theoretical foundation which you then iteratively refine after gaining more experience. You will finally put the book away, of course, but then you can still use it later to quickly revise your knowledge if need be. It doesn’t matter that it was lying dormant for many months or years, as long as it proves useful when there is an urgent need to flip through its pages again.

What if you cannot even conceive how and when that prehistoric literature can be of any use whatsoever?… Well, I still wouldn’t be so quick with getting rid of it all. See, IT as a whole has this surprising tendency of going in circles, and periodically regurgitating old ideas into new forms and shapes. Hence the ability to generalize is important for long-term success in this field – at least the kind of success that’s more palpable than 15 minutes of Hacker News fame.

But to see patterns and connections between seemingly unrelated pieces of technology, you need to have at least a vague mental trace of all of them. And to verify you are not just seeing things, it’s often necessary to go back and read up again.

More often than not, that new thing will turn out to be just an old new thing.

Author: Xion, posted under Computer Science & IT, Life, Thoughts » 1 comment

git outgoing

2012-11-20 12:05

Now that I don’t use Mercurial at work anymore, I’ve found that despite its shortcomings (hg status taking 10+ seconds?!) it has a few nice features. One of those is hg outgoing, which shows you which changesets you are going to send to the remote repo in your next push. A quick glance at this list will typically ensure that everything is in order, or allow you to amend some commits before making them public.

In Git you can do something similar by applying a filter to git log:

    $ git log origin/master..

But while origin is most often the remote you want to compare against, the master branch is typically not the one where most of development takes place. So if we want to create a git outgoing command, we would rather check what the current branch is and compare it with its remotely tracked equivalent:

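    #!/bin/sh
    # Name of the current branch, as reported by git name-rev.
    BRANCH=$(git name-rev HEAD 2>/dev/null | awk "{ print \$2 }")
    # Commits present locally but not yet on the same-named origin branch.
    git log "origin/$BRANCH.."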

Simply naming this script git-outgoing and making it executable somewhere within your $PATH (e.g. /usr/bin) will make the git outgoing command available:

    $ git outgoing
    commit 8c96c21c420dd10a34441cbd7d4c6904a6077716
    Author: Karol Kuczmarski <>
    Date:   Tue Nov 20 11:51:44 2012 +0100

        Add .gitignore

    commit 8a51a4f39b383c9dff64532403ab3922bc2ae13c
    Author: Karol Kuczmarski <>
    Date:   Tue Nov 20 11:50:01 2012 +0100

        Comments in install script

There are a few unstated assumptions here, like the fact that branch names must match on both the local and the remote repo. If you find yourself breaking those, then you’re probably better off just using git log directly.
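If you’d rather not rely on matching branch names, one possible variation (a sketch, not what the script above does) is to ask Git for the configured upstream of the current branch via the @{u} shorthand. Git custom commands can be written in any language, so the version below uses Python; the function names are made up for the example.

```python
import subprocess
import sys


def upstream_of_head():
    """Return the upstream of the current branch, e.g. 'origin/master'."""
    output = subprocess.check_output(
        ["git", "rev-parse", "--abbrev-ref", "--symbolic-full-name", "@{u}"])
    return output.decode("utf-8").strip()


def log_command(upstream):
    """Build the git log call listing commits not yet on the upstream."""
    return ["git", "log", upstream + ".."]


def main():
    try:
        upstream = upstream_of_head()
    except subprocess.CalledProcessError:
        # rev-parse fails when no upstream is configured for the branch.
        sys.exit("Current branch has no upstream configured.")
    sys.exit(subprocess.call(log_command(upstream)))
```

To actually use it as git outgoing, you would add a shebang line and a main() call at the bottom, then install it under the git-outgoing name just like the shell version.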

Author: Xion, posted under Programming » 3 comments

© 2023 Karol Kuczmarski "Xion".