Monthly archive for August, 2012

Don’t Write Classes

2012-08-29 11:17

On this year’s PyCon US, there was a talk with rather (thought-)provoking title Stop Writing Classes. The speaker might not be the most charismatic one you’ve listened to, but his point is important, even if very simple. Whenever you have class with a constructor and just one other method, you could probably do better by turning it into a single function instead.

Examples given in the presentation were in Python, of course, but the whole advice is pretty generic. It can be applied with equal success even to languages that are object-oriented to the extreme (like Java): just replace ‘function’ with ‘static method’. However, if we are talking about Python, there are many more situations where we can replace classes with functions. Often this will result in simpler code with less nesting levels.

Let’s see a few examples.

Inheriting for custom __init__

Sometimes we want to construct many similar objects that differ only slightly in a way their constructors are invoked. A rather simple example would be a urllib2.Request with some custom HTTP headers included:

  1. import urllib2
  2.  
  3. class CustomRequest(urllib2.Request):
  4.     def __init__(self, url, data=None, headers=None, *args, **kwargs):
  5.         headers = headers or {}
  6.         headers.setdefault('User-Agent', 'MyAwesomeApplication')
  7.         super(CustomRequest, self).__init__(url, data, headers, *args, **kwargs)
  8.  
  9. # usage
  10. request = urllib2.urlopen(CustomRequest("http://www.google.com")).read()

That works, but it’s unnecessarily complex without adding any notable benefits. It’s unlikely that we ever want to perform an isinstance check to distinguish between CustomRequest and the original Request, which is the main “perk” of using class-based approach.

Indeed, we could do just as well with a function:

  1. def CustomRequest(url, data=None, headers=None, *args, **kwargs):
  2.     headers = headers or {}
  3.     headers.setdefault('User-Agent', 'MyAwesomeApplication')
  4.     return urllib2.Request(url, data, headers, *args, **kwargs)

Note how usage doesn’t even change, thanks to Python handling classes like any other callables. Also, notice the reduced amount of underscores ;)

Patching-in methods

Even if the method we want to override is not __init__, it might still make sense to not do it through inheritance. Python allows to add or replace methods of specific objects simply by assigning them to some attribute. This is commonly referred to as monkey patching and it enables to more or less transparently change behavior of most objects once they have been created:

  1. import logging
  2. import functools
  3.  
  4. def log_method_calls(obj, method_name):
  5.     """Wrap the given method of ``obj`` object with log calls."""
  6.     method = getattr(obj, method_name)
  7.  
  8.     @functools.wraps(method)
  9.     def wrapped_method(*args, **kwargs):
  10.         logging.debug("Calling %r.%s with args=%s, kwargs=%s",
  11.                       obj, method_name, args, kwargs)
  12.         result = method(*args, **kwargs)
  13.         logging.debug("Call to %r.%s returned %r",
  14.                       obj, method_name, result)
  15.         return result
  16.  
  17.     setattr(obj, method_name, wrapped_method)
  18.     return obj

You will likely say that this look more hackish than using inheritance and/or decorators, and you’ll be correct. In some cases, though, this might be a right thing. If the solution for the moment is indeed a bit hacky, “disguising” it into seemingly more mature and idiomatic form is unwarranted pretension. Sometimes a hack is fine as long as you are honest about it.

Plain old data objects

Coming to Python from a more strict language, like C++ or Java, you may be tempted to construct types such as this:

  1. /*
  2.  * Represents a MIME content type, e.g. text/html or image/png.
  3.  */
  4. class ContentType {
  5. private:
  6.     std:string _major, _minor;
  7. public:
  8.     ContentType(const std::string& major, const std::string& minor)
  9.         : _major(major), _minor(minor) { }
  10.     ContentType(const std::string& spec) {
  11.         std::vector<string> parts;
  12.         boost::split(parts, spec, boost::is_any_of("/"));
  13.         _major = parts.at(0); _minor = parts.at(1);
  14.     }
  15.  
  16.     const std::string& Major() const { return _major; }
  17.     const std::string& Minor() const { return _minor; }
  18.  
  19.     operator std::string () const {
  20.         return Major() + "/" + Minor();
  21.     }
  22. };
  23.  
  24. // usage
  25. ContentType plainText("text/plain");
  26. sendMail("Hello", "Hello world!", plainText);

An idea is to encapsulate some common piece of data and pass it along in uniform way. In compiled, statically typed languages this is a good way to make the type checker work for us to eliminate certain kind of bugs and errors. If we declare a function to take ContentType, we can be sure we won’t get anything else. As a result, once we convert the initial string (like "application/json") into an object somewhere at the edge of the system, the rest of it can be simpler: it doesn’t have to bother with strings anymore.

But in dynamically typed, interpreted languages you can’t really extract such benefits because there is no compiler you can instruct to do your bookkeeping. Although you are perfectly allowed to write analogous classes:

  1. class ContentType(object):
  2.     """Represents a MIME content type, e.g. text/html or image/png."""
  3.     def __init__(self, major, minor=None):
  4.         if minor is None:
  5.             self._major, self._minor = major.split('/')
  6.         else:
  7.             self._major = major
  8.             self._minor = minor
  9.  
  10.     major = property(lambda self: self._major)
  11.     minor = property(lambda self: self._minor)
  12.  
  13.     def __str__(self):
  14.         return '%s/%s' % (self.major, self.minor)

there is no real benefit in doing so. Since you cannot be bulletproof-sure that a function will only receive objects of your type, a better solution (some would say “more pythonic”) is to keep the data in original form, or a simple form that is immediately usable. In this particular case a raw string will probably do best, although a tuple ("text", "html") – or better yet, namedtuple – may be more convenient in some applications.

So indeed…

…stop writing classes. Not literally all of them, of course, but always be on the lookout for alternatives. More often than not, they tend to make code (and life) simpler and easier.

Tags: , ,
Author: Xion, posted under Programming » 3 comments

Hello World Fallacy

2012-08-14 19:50

These days you cannot make more than few steps on the Web before tripping over yet another wonderful framework, technology, library, platform… or even language. More often that not they are promising heaven and stars: ease of use, flexibility, scalability, performance, and so on. Most importantly, they almost always emphasize how easy it is to get started and have working, tangible results – sometimes even whole apps – in very short time.

In many cases, they are absolutely right. With just the right tools, you can make some nice stuff pretty quickly. True, we’re still far from a scenario where you simply choose features you’d like to have, with them blending together automatically – even if some folks make serious leaps in that direction.
But if you think about it for a moment, it’s not something that we actually want, for reasons that are pretty obvious. The less effort is needed to create something, the less value it presents, all other things being equal. We definitely don’t expect to see software development reduced into rough equivalent of clicking through Windows wizards, because everything produced like that would be just hopelessly generic.

But think how easy it would be to get started with that

And thus we come to the titular issue which I took liberty in calling a “Hello World” Fallacy. It occurs when a well-meaning programmer tries out a new piece of technology and finds how easy it is to do simple stuff in it. Everything seems to fall into place: tutorials are clear, to the point and easy to follow; results appear quickly and are rather impressive; difficulties or setbacks are few and far between. Everything just goes extremely well.. What is the problem, then?

The problem lies in a sort of “halo effect” those early successes are likely to create. While surveying a new technology, it’s extremely tempting to look at the early victories as useful heuristic for evaluating the solution as a whole. We may think the way particular tech makes it easy to produce relatively simple apps is a good indicator of how it would work for bigger, more complicated projects. It’s about assuming a specific type of scalability: not necessarily tied to performance of handling heavy load of thousands of users, but to size and complexity of the system handling it.

Point is, your new technology may not really scale all that well. What makes it easy to pick up, among other things, is how good it fits to the simple use cases you will typically exercise when you are just starting out. But this early adequacy is not an evidence for ability to scale into bigger, more serious applications. If anything, it might constitute a feasible argument for the contrary. Newbie-friendliness often goes against long-term usability for more advanced users; compare, for example, the “intuitive” Ribbon UI introduced in relatively recent version Microsoft Office to its previous, much more powerful and convenient interface. While I don’t stipulate it’s a pure zero-sum game, I think catering to beginners and experts alike is surely more difficult than addressing the needs of only one target audience. The former is definitely a road less traveled.

When talking about software libraries or frameworks, the ‘expert’ would typically refer to developer using the tech for large and long-term project. They are likely to explore most of the crooks and crannies, often hitting brick walls that at first may even appear impassable. For them, the most important quality for a software library is its “workaroundability”: how well it performs at not getting in the way between programmer and job done, and how hackable it is – i.e. susceptible to stretching its limits beyond what authors originally intended.

This quality is hardly evident when you’ve only done few casual experiments with your shiny new package. General experience can help a great deal with arriving at unbiased conclusion, and so can the explicit knowledge about the whole issue. While it’s beyond my limited powers to help you significantly to the former, I can at least gently point to the latter.

Happy hacking!

 


© 2017 Karol Kuczmarski "Xion". Layout by Urszulka. Powered by WordPress with QuickLaTeX.com.