What is your first association evoked by a mention of the Java programming language? If you said “slow”, I hope all is well back there in the 90s! But if you said “verbose” instead, you would be very much in the right. Its wordiness is almost proverbial by now, with only some small improvements – like lambdas – appearing on the distant horizon.
Sounds good? Sure it does. The only problem here is that most of those optimistic claims are patently false. Good real-world code written in a high-level, dynamic language does not necessarily end up being much shorter. Quite often, it is either on par with or even longer than the equivalent source in Java et al.
It happens for several reasons, most of which are not immediately obvious when looking only at Hello World-like samples.
It is said Python is pretty much pseudocode that happens to be syntactically strict enough to parse and run. While usually meant as a compliment, this observation has an uglier counterbalance. Being pseudocode, the raw Python source is woefully incomplete when it comes to providing necessary information for future maintainers. As a result, it gets out of hand rather quickly.
To rein it in, we need to put that information back somehow. Here’s where the documentation and commenting part steps in. In the case of Python, there is even a syntactical distinction at the language level between the two. The former takes the form of docstrings attached to function and class definitions. They are meant to fill in the gaps those definitions always leave open, including such “trivialities” as what kind of arguments the function actually takes, what it returns, and what exceptions it may raise.
Even when you don’t write expansive prose with frequent usage examples and such, an adequate amount of docstrings will still take noticeable space. The reward you’ll get for your efforts won’t be impressive, either: you’ll just add the information you’d include anyway if you were writing the code in a static language.
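As a minimal sketch of what that looks like in practice (the `average_length` function is made up for illustration, and the `:param:` field style is just one common docstring convention, not something Python mandates):

```python
def average_length(words):
    """Return the mean length of the given strings.

    :param words: Non-empty iterable of strings
    :return: Average number of characters per string, as a float
    :raises ValueError: If ``words`` is empty
    """
    words = list(words)
    if not words:
        raise ValueError("words must not be empty")
    return sum(len(w) for w in words) / float(len(words))
```

Note that the docstring is longer than the body it describes, and nothing checks that what it says is actually true.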
The catch here is that you won’t even get the automatic assurance that you’ve added the information correctly… which brings us immediately to the next point.
There are two types of programmers: those who write tests, and those who will be writing tests. Undoubtedly, the best way to have someone join the first group is to make them write serious code in any interpreted, dynamic language.
In those languages, a comprehensive set of tests is probably the only automated and unambiguous way to ensure your code satisfies even some basic invariants. Whole classes of errors that elsewhere would be eliminated by the compiler can slip by completely undetected if the particular execution path is not exercised regularly. This includes trying to invoke a non-existent method; passing an unexpected argument to a function; calling a “function” which is not callable (or conversely, not calling a function when it should have been); and many others. To my knowledge, none of these are normally detected by the various tools for static analysis (linting).
So, you write tests. You write so many tests, in fact, that they easily outmatch the very code they’re testing – if not necessarily in effort, then very likely in quantity. There’s no middle ground here: either you blanket your code with good test coverage, or you can expect it to break in all kinds of silly ways.
Documentation, comments and tests take time and space. But at least the code in a high-level language can be short and sweet. Beautiful, elegant, crisp and concise, without any kind of unnecessary cruft!
Just why are you staring at me like that when I say I used a metaclass here?…
See, the problem with powerful and highly expressive languages is that they are dangerous. The old adage of power and responsibility does very much apply here. Without a proper harness, you will one day find out that:
Unless you can guarantee you’ll always have only real Perl/Python/Ruby rockstars on board, it’s necessary to tame that wild creativity at least a little bit. The inevitable side effect is that your code will sometimes have to be longer, at least compared to what that smart but mystifying technique would yield.
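To make the trade-off concrete, here is a sketch under my own assumptions (a made-up plugin registry, written with Python 3 metaclass syntax): the metaclass version registers subclasses behind the reader’s back, while the plain version costs an extra line per class but hides nothing.

```python
class PluginMeta(type):
    """Metaclass that silently registers every subclass it creates."""
    registry = {}

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        if bases:  # skip the base class itself, which has no bases listed
            PluginMeta.registry[name.lower()] = cls


class Plugin(metaclass=PluginMeta):
    pass


class Csv(Plugin):  # registered as "csv" with no visible call anywhere
    pass


# The longer but more obvious alternative: explicit registration.
EXPLICIT_REGISTRY = {}


def register(name, cls):
    EXPLICIT_REGISTRY[name] = cls
    return cls


class Json(object):
    pass


register("json", Json)
```

Both registries end up with one entry each; the difference is only in how easily the next maintainer can tell why.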
Last week – while still on the other side of the pond – I attended a meet-up organized by the local Google Developers Group. The meeting included a presentation about Go, aimed mostly at newcomers, which covered the language from the ground up but at a very fast pace. This spurred a lot of survey questions, as people evidently wanted to assess the language’s viability in general and its fitness for particular domains of application.
One of them was about the web frameworks that are available in Go. The answer mentioned a few simple, existing ones, but also how people coming from other languages are working to (re)build their favorite ones in Go. The point was, of course, that even though the language does not have its own Django or Rails just yet, it’s bound to happen quite soon.
And that’s when it dawned on me.
See, I have wondered for a while now why people are eager to subject themselves to a huge productivity drop (among other hardships) when they switch from one technology that they are proficient in to a different but curiously similar one.
Mind you, I’m not talking about exploratory ventures intended to evaluate language X or framework Y by doing one or two non-trivial projects; heck, I do it very often (and you should too). No, I’m talking about all-out switching to a new shiny toy, especially when decided consciously and not through a gradual slanting, in a kind of “best tool for the job” fashion.
Whenever I looked for justification, usually I’d just find a straightforward litany of perks and benefits of $targetTechnology, often having a not insignificant intersection with the analogous list for the old one. Add the other, necessary part of the risk-benefit calculation – drawbacks – and it just doesn’t balance out. Not by a long shot.
So, I noticed that I was confused. There must be some other factor in play, but I couldn’t come up with any candidates – until that day.
As I speculate now, there is actually a big incentive to jump ship whenever a new one appears. And it seems to be one of the dirty secrets of the hacker community, because it directly questions the esteemed notion of meritocracy that we are so eager to flaunt.
I often say I don’t believe programmers need to be great typists. No software project was ever late because its code couldn’t be typed fast enough. However, the fact that a developer’s job consists mostly of thinking, intertwined with short outbursts of typing, means that it is beneficial to type fast and therefore get back quickly to what’s really important.
Yet typing code is a significantly different game than writing prose in natural language (unless you are sprinkling your code with a copious amount of comments and docstrings). I don’t suppose the skill of typing regular text fast (i.e. with all ten fingers) translates well into building screens of code listings. You need a different sort of exercise to be effective at that; usually, it just comes with a lot of coding practice.
But you may want to rush things a bit, and maybe have some fun in the process. I recently discovered a website called typing.io which aims to help you improve your code-specific typing skills. When you sign up, you are presented with a choice of about a dozen common languages and popular open source projects written in them. Your task is simple: you have to type their code in short, 15-line sprints, and your speed and accuracy will be measured and reported afterwards.
The choice of projects, and their fragments to type in, is generally pretty good. It definitely provides a very nice way to get the “feel” of any language you might want to learn in the future. You’ll get to see a lot of good, working, practical code written in it – not to mention you get to type it yourself :) Personally, I’ve found the C listings (of the Redis data store) to be the most pleasant to both read and type, but it’s pretty likely you will have different preferences.
The application isn’t perfect, of course: it doesn’t really replicate the typical indentation dynamics of most code editors and IDEs. Instead, it opts for handling indentation implicitly, so the only whitespace you get to type is line and word breaks. You also don’t get to use your text navigation skills and clipboard-fu, which I’ve seen many coders leverage extensively when they are programming.
I think that’s fine, though, because the whole thing is specifically about typing. It’s a great and pretty clear idea, and as such I strongly encourage you to try it out!
These days you cannot make more than a few steps on the Web before tripping over yet another wonderful framework, technology, library, platform… or even language. More often than not they are promising heaven and stars: ease of use, flexibility, scalability, performance, and so on. Most importantly, they almost always emphasize how easy it is to get started and have working, tangible results – sometimes even whole apps – in a very short time.
In many cases, they are absolutely right. With just the right tools, you can make some nice stuff pretty quickly. True, we’re still far from a scenario where you simply choose features you’d like to have, with them blending together automatically – even if some folks make serious leaps in that direction.
But if you think about it for a moment, it’s not something that we actually want, for reasons that are pretty obvious. The less effort is needed to create something, the less value it presents, all other things being equal. We definitely don’t expect to see software development reduced to a rough equivalent of clicking through Windows wizards, because everything produced like that would be just hopelessly generic.
But think how easy it would be to get started with that…
And thus we come to the titular issue, which I took the liberty of calling the “Hello World” Fallacy. It occurs when a well-meaning programmer tries out a new piece of technology and finds how easy it is to do simple stuff in it. Everything seems to fall into place: tutorials are clear, to the point and easy to follow; results appear quickly and are rather impressive; difficulties or setbacks are few and far between. Everything just goes extremely well. What is the problem, then?
The problem lies in a sort of “halo effect” those early successes are likely to create. While surveying a new technology, it’s extremely tempting to look at the early victories as a useful heuristic for evaluating the solution as a whole. We may think the way a particular tech makes it easy to produce relatively simple apps is a good indicator of how it would work for bigger, more complicated projects. It’s about assuming a specific type of scalability: one tied not necessarily to performance under a heavy load of thousands of users, but to the size and complexity of the system handling it.
Point is, your new technology may not really scale all that well. What makes it easy to pick up, among other things, is how well it fits the simple use cases you will typically exercise when you are just starting out. But this early adequacy is not evidence of an ability to scale into bigger, more serious applications. If anything, it might constitute a feasible argument to the contrary. Newbie-friendliness often goes against long-term usability for more advanced users; compare, for example, the “intuitive” Ribbon UI introduced in a relatively recent version of Microsoft Office to its previous, much more powerful and convenient interface. While I don’t claim it’s a pure zero-sum game, I think catering to beginners and experts alike is surely more difficult than addressing the needs of only one target audience. The former is definitely a road less traveled.
When talking about software libraries or frameworks, the ‘expert’ would typically refer to a developer using the tech for a large and long-term project. They are likely to explore most of the nooks and crannies, often hitting brick walls that at first may even appear impassable. For them, the most important quality of a software library is its “workaroundability”: how well it performs at not getting in the way between the programmer and the job done, and how hackable it is – i.e. susceptible to stretching its limits beyond what the authors originally intended.
This quality is hardly evident when you’ve only done a few casual experiments with your shiny new package. General experience can help a great deal with arriving at an unbiased conclusion, and so can explicit knowledge about the whole issue. While it’s beyond my limited powers to help you significantly with the former, I can at least gently point to the latter.
I recently had a discussion with a co-worker about the merits of using anonymous functions in Python. We happen to overuse them quite a bit and this is not something I’m particularly fond of. To me, lambdas in Python look pretty weird and thus I prefer to use them sparingly. I wasn’t entirely sure why that is – given that I’m quite a fan of the functional programming paradigm – until I noticed a seemingly insignificant fact.
The lambda keyword is long. With six letters, it is among the longer keywords in Python 2.x, tied with global (and a few others, such as return and import), and beaten only by finally and continue. Quite likely, this is what causes lambdas in Python to stand out and require additional mental resources to process (assuming we’re comfortable enough with the very idea of anonymous functions). The long lambda keyword seems slightly out of place because, in general, Python keywords are short.
Or are they?… Thinking about this, I’ve got an idea of comparing the average length of keywords from different programming languages. I didn’t really anticipate what kind of information would be exposed by applying such a metric but it seemed like a fun exercise. And it surely was; also, the results might provoke a thought or two.
Here they are:
| Language | Keywords | Total chars | Chars / keyword |
The newest incarnation of C++ seems to be losing badly in this competition, followed by C#. On the other side of the spectrum, Go and Python seem to be deliberately designed to avoid keyword bloat as much as possible. Java is somewhere in between when it comes to sheer numbers of keywords but their average length is definitely on the long side. This could very well be one of the reasons for the perceived verbosity of the language.
For those interested, the exact data and code I used to obtain these statistics are in this gist.
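This is not the code from the gist, but the metric itself is easy to sketch. Here is a minimal version for Python alone, relying on the standard `keyword` module; the `keyword_stats` helper is my own name for it, and the numbers it prints depend on the Python version you run it with:

```python
import keyword


def keyword_stats(keywords):
    """Return (count, total_chars, chars_per_keyword) for a keyword list."""
    keywords = list(keywords)
    count = len(keywords)
    total = sum(len(kw) for kw in keywords)
    return count, total, total / float(count)


# keyword.kwlist holds the keywords of the running interpreter.
count, total, average = keyword_stats(keyword.kwlist)
print("Python: %d keywords, %d chars, %.2f chars/keyword"
      % (count, total, average))
```

For other languages, the same function would need a hand-maintained list of keywords, since nothing like `keyword.kwlist` is available for them here.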
An extremely common programming exercise – popping up usually as an interview question – is to write a function that turns all characters in a string into uppercase. As you may know or suspect, such a task is not really about the problem stated explicitly. Instead, it’s meant to probe if you can program at all, and whether you remember to handle special subsets of input data. That’s right: the actual problem is almost insignificant; it’s all about the necessary plumbing. Without a need for it, the task becomes awfully easy, especially in certain kinds of languages:
This simplicity may be a cause of the misconception that the whole problem of letter case is similarly trivial. Actually, I would not be surprised if the notion of having any sort of real ‘problem’ here is baffling to some. After all, every self-respecting language has those toUpperCase functions built in, right?…
Sure it has. But even assuming they work correctly, they are usually the only case-related transformations available out of the box. As it turns out, it’s hardly uncommon to need something way more sophisticated.
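To illustrate the gap, here is a minimal sketch in Python. The built-in `str.upper` covers the interview exercise in one call, while even a modest conversion between naming conventions has to be written by hand. The `snake_to_camel` helper is hypothetical, and picking this particular transformation as the “more sophisticated” one is my assumption:

```python
def to_upper(text):
    """The interview exercise, solved by the built-in method."""
    return text.upper()


def snake_to_camel(name):
    """Convert a snake_case identifier to camelCase.

    Hand-written, because no built-in does this; it assumes ``name``
    consists of lowercase words separated by single underscores.
    """
    first, *rest = name.split("_")
    return first + "".join(word.capitalize() for word in rest)
```

For example, `snake_to_camel("letter_case_problem")` gives `"letterCaseProblem"`, and inputs outside the stated assumption (leading underscores, acronyms) would need extra rules of their own.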
Also known as Obligatory New Year’s Post.
It was quite a year, this 2011. No single ground-breaking change, but a lot of somewhat significant events and small steps – mostly in the right direction. A short summary is of course in order, because taking time to stop and reflect is a good thing from time to time.
Technically, the biggest change would be the fact that I’m no longer a student. Having attained my MSc some time in the first quarter, I finished a five-year period of computer science studies at Warsaw University of Technology. While there are mixed views on the importance of formal education, I consider this a major and important achievement – and one with practical impact as well.
My master’s thesis was about implementing a reflection system for C++. Ironically, since then I haven’t really got to code anything in this language. That’s not actually something I’m at odds with. For me, sticking to just one language for an extended period of time seems somewhat detrimental to the development of one’s programming skills. On the other hand, there goes the saying that a language which doesn’t change your view on programming as a whole is not worth learning. As usual, it looks like a question of proper balance.
Finally, there was Python: for scripts, for cloud computing on Google App Engine, for general web programming, and for many everyday tasks and experiments. It seems to be my first-choice language as of now – the one that I’m most productive in. Still, it probably has many tricks and crispy details waiting to be uncovered, which makes it likely to hold my attention for quite a bit longer.
Its status always has contenders, though. Clojure, Ruby and Haskell are among the languages which I gave at least a brief glance in 2011. The last one is especially intriguing and may therefore be the subject of a few posts later on.
2011 was also a busy year for me when it comes to attending various software-related events. Many of these were organized or influenced by the local Google Technology User Group. Some of those I even got to speak at, lecturing on the Google App Engine platform or advanced topics in Android UI programming. In either case it was an exciting and refreshing experience.
There were also several other events and meet-ups I got to attend in the past year. Some of them even required traveling abroad, some resulted in grabbing juicy awards (such as autographed books), while some were slightly less formal albeit still very interesting.
And kinda unexpected, too. I learned that there is a bunch of thriving communities gathered around specific technologies, and they are all just around the corner – literally. Because contrary to the stereotype of the lone hacker, their members regularly meet in real life. Wow! ;-)