Posts from 2013

Deprecate This

2013-05-26 14:02

As a fact of life, in bigger projects you often cannot just delete something – be it function, method, class or module. Replacing all its usages with whatever is the new recommendation – if any! – is typically outside of your influence, capabilities or priorities. By no means it should be treated as lost cause, though; any codebase would be quickly overwhelmed by kludges if there were no way to jettison them.

To reconcile those two opposing needs – compatibility and cleanliness – the typical approach involves a transition period. During that time, the particular piece of API shall be marked as deprecated, which is a slightly theatrical term for ‘obsolete’ and ‘not intended for new code’. How effective this is depends strongly on target audience – for publicly available APIs, someone will always wake up and start screaming when the transition period ends.

For in-project interfaces, however, the blow may be effectively cushioned by using certain features of the language, IDE, source control, continuous integration, and so on. As an example, Java has the @Deprecated annotation that can be applied to functions or classes:

  1. public class Foo {
  2.     /**
  3.      * @deprecated Use FooFactory instead
  4.      */
  5.     @Deprecated
  6.     public static Foo create() {
  7.         return new Foo();
  8.     }
  9. }

If the symbol is then used somewhere else, it produces a compiler warning (and visual cue in most IDEs). These can be suppressed, of course, but it’s something you need to do explicitly through a complementary language construct.

So I had this idea to try and add similar mechanism to Python. One part of it is already present in its standard library: we have the warnings module and a built-in category of DeprecationWarnings. These can be ignored, suppressed, caught or even made into errors.
They are also pretty powerful, as they allow to deprecate certain code paths and not just symbols, which can be useful when introducing new meanings for function parameters, among other things. At the same time, it means using them is irritatingly imperative and adds clutter:

  1. class Foo(object):
  2.     def __init__(self):
  3.         warnings.warn("Foo is deprecated", DeprecationWarning)
  4.         # ... rest of Foo constructor ...

And in this particular case, it also doesn’t work as intended, for reasons that will become apparent later on.
What we’d like instead is something similar to annotation approach that is available in Java:

  1. @deprecated
  2. class Foo(object):
  3.     # ...

Given that the @-things in Python (decorators, that is) are significantly more powerful than the Java counterparts, it shouldn’t be a tough call to achieve this…

Surprisingly, though, it turns out to be very tricky and quite arcane. The problems lie mostly in the subtle issues of what exactly constitutes “usage” of a symbol in Python, and how to actually detect it. If you try to come up with a few solutions, you’ll soon realize how the one that may eventually require walking through the interpreter call stack turns out to be the least insane one.

But hey, we didn’t go to the Moon because it was easy, right? ;) So let’s see how at least we can get started.

Tags: , , , , ,
Author: Xion, posted under Programming » Comments Off on Deprecate This

The Subshell Gotcha

2013-05-20 13:13

Many are the quirks of shell scripting. Most are related to confusing syntax, but some come from certain surprising semantics of Bash as a language, as well as the way scripts are executed.
Consider, for example, that you’d like to list files that are within certain size range. This is something you cannot do with ls alone. And while there’s certainly some awk incantation that makes it trivial, let’s assume you’re a rare kind of scripter who actually likes their hacks readable:

  1. #!/bin/sh
  2.  
  3. min=$1
  4. max=$2
  5.  
  6. ls | while read filename; do
  7.     size=$(stat -f %z $filename)
  8.     if [ $size -gt $min ] && [ $size -lt $max ]; then
  9.         echo $filename
  10.     fi
  11. done

So you use an explicit while loop, obtain the file size using stat and compare it to given bounds using a straightforward if statement. Pretty simple code that shouldn’t cause any troubles later on… right?

But as your needs grow, you find that you also want to count how many files fall within your range, and how many do not. Given that you have an explicit if, it appears like a simple addition (in quite literal sense):

  1. matches=0
  2. misses=0
  3. ls | while read filename; do
  4.     size=$(stat -f %z $filename)
  5.     if [ $size -gt $min ] && [ $size -lt $max ]; then
  6.         echo $filename
  7.         ((matches++))
  8.     else
  9.         ((misses++))
  10.     fi
  11. done
  12.  
  13. echo >&2 "$matches matches"
  14. echo >&2 "$misses misses"

Why it doesn’t work, then? Because clearly this is not the output we’re looking for (ls_between is our script here):

  1. $ ls -al
  2. total 25296
  3. drwxrwxr-x  19 xion  staff       646 15 Apr 18:44 .
  4. drwxrwxr-x  15 xion  staff       510 20 May 11:15 ..
  5. -rw-rw-r--   1 xion  staff        16 10 May  2012 hello.py
  6. -rw-rw-r--   1 xion  staff      4005 28 May  2012 keyword_stats.py
  7. -rw-rw-r--   1 xion  staff       218  5 Aug  2012 magical.py
  8. -rw-rw-r--   1 xion  staff     19901 11 May  2012 space_invaders.py
  9. $ ls_between 1024 10241024
  10. keyword_stats.py
  11. space_invaders.py
  12. 0 matches
  13. 0 misses

It seems that neither matches nor misses are counted properly, even though it’s clear from the printed list that everything is fine with our if statement and loop. Wherein lies the problem?

Tags: , , , ,
Author: Xion, posted under Applications, Programming » 2 comments

We Don’t Need No Enums (in Python)

2013-05-13 21:48

Last weekend I attended the PyGrunn conference, back in the good ol’ Netherlands. It was a very enjoyable and instructive event, featuring not only a few local speakers but also some prominent figures from the Python community – like Kenneth Reitz or Armin Ronacher. Overall, it was a weekend of some great pythonic fun.

…Except for a one small detail. As I’ve learned there, Python will apparently get to have enums in the 3.4 version of the language. To say that this was baffling to me would be a severe understatement. Not only I see very little need for such a feature, but I also have serious doubts whether it fits into the general spirit of Python language.

Why so? It’s mostly because of the main purpose of enumeration types, derived from their usage in many languages that already have them as a feature. That purpose is to turn arbitrary data – mostly integers and strings – into well-known, reliable entities that can be safely manipulated inside our programs. Enums act as border guardians, filtering out unexpected data and converting expected data into its safe representation.

What’s safety in this context? It’s type safety, of course. Thanks to enumeration types, we can be certain that a particular value belongs to specific and constrained set of elements. They clearly define all the variants that our code should handle, because everything else was already culled at the conversion stage.

Problem is, in Python there is nothing that would guarantee those safety promises are actually fulfilled. True, the basic property of enum types is retained: given an enum object, we know it must belong to a preordained set of entities. But there is nothing that ensures we are dealing with an enum object at all – short of us actually checking that ourselves:

  1. if isinstance(state, State):

How this is different from a straightforward membership check:

  1. if state in STATES:

except for the latter looking cleaner and more explicit?…

This saying, I do not claim enums are completely out of place in Python. This is untrue, if simply because of the fact they are easily implemented through a rather simple metaclass. In fact, this is exactly how the proposed enums.Enum base is supposed to work.
At the same time, it is also possible to provide some of the before-mentioned type safety. Just look into various libraries that enhance Python with support for contracts: a form of type safety which is even more powerful than what you’ll find in many statically typed languages. You are free to use them, and you should definitely do, if your project would benefit from such a functionality.

Incidentally, they fit right in with that new & upcoming enumeration types from Python 3.4. It remains to be seen what it means exactly for the overall direction of the language design. But with enums in place, the style of “checked” typing suddenly became a lot more pythonic than it was before.

And I can’t say I like it very much.

Tags: , , ,
Author: Xion, posted under Events, Programming » 3 comments

One Thing that Git Does Better: Rebranching

2013-05-04 0:06

Much can be said about similarities between two popular, distributed version control systems: Git and Mercurial. For the most part, choosing between them can be a matter of taste. And because Git seems to have considerably more proponents in the Cool Kids camp, it doesn’t necessarily follow it’s a better option.

But I have found at least one specific and common scenario where Git clearly outshines Hg. Suppose you have coded yourself into a dead end: the feature you planned doesn’t pan out the way you wanted it; or you have some compatibility issues you cannot easily resolve; or you just need to escape the refactoring bathtub.

Rebranching in Git (start)

In any case, you just want to step back a few commits and pretend nothing happened, for now. The mishap might be useful later, though, so it’d be nice if we left it marked for the future.

In Git, this is easily done. You would start by creating a new branch that points to your dead end:

  1. $ git checkout my_feature
  2. $ git branch __my_feature__dead_end__

Afterwards, both my_feature and __my_feature__dead_end__ refer to the same, head commit. We would then move the former a little back, sticking it to one of the earlier hashes. Let’s find a suitable target:

  1. $ git log
  2. commit 130fe41a3a0587a48c1ef8797030ae2e682c6fb4
  3. Author: John Doe <john.doe@example.com>
  4. Date:   Fri May 3 21:39:32 2013 +0200
  5.  
  6.     I don't like this adventure. Not one bit.
  7.  
  8. commit f834a10d199f5c23fa14ed0ebfcf89226d9c21a1
  9. Author: John Doe <john.doe@example.com>
  10. Date:   Fri May 3 21:25:46 2013 +0200
  11.  
  12.     Let's try this, maybe...
  13.  
  14. commit 9d877d32f3a08f416615f9a051ad3985e6e4d2ad
  15. Author: John Doe <john.doe@example.com>
  16. Date:   Fri May 3 21:18:57 2013 +0200
  17.  
  18.     Shiny!
  19. $ git checkout 9d877
  20. Note: checking out '9d877'.
  21. # trimmed for brevity

If it looks right, we can reset the my_feature branch so it points to this specific commit:

  1. $ git reset --hard 9d877

Our final situation would then looks like this:

Rebranching in Git (end)

which is exactly what we wanted. Note how any further commits starting from the referent of my_feature would fork at that point, diverging from development line which has lead us into dead end.

Why the same thing is not so easily done in Mercurial?… In general, this is mostly because of its one fundamental design decision: every commit belongs to one branch, forever and for always. Branch designation is actually part of the changeset’s metadata, just like the commit message or diff. Moving things around – like we did above – is therefore equivalent to changing history and requires tools that are capable of doing so, such as hg strip.

Tags: , , ,
Author: Xion, posted under Programming » 5 comments

Decorated Functions in C++… Almost

2013-04-28 21:56

Many languages now include the concept of annotations that can be applied to definitions of functions, classes, or even variables and fields. The term ‘annotation’ comes from Java, while other languages use different names: attributes (C#), decorators (Python), tags (Go), etc.
Besides naming disparities, those features also tend to differ in terms of offered power and flexibility. The Python ones, for example, allow for almost arbitrary transformations, while Go tags are constrained to struct fields and can only consist of text labels. The archetypal annotations from Java lie somewhere in between: they are mostly for storing metadata, but they can also be (pre)processed during compilation or runtime.

Now, what about C++? We know the language has a long history of lacking several critical features (*cough* delegates *cough*), but the recent advent of C++11 fixed quite a few of them all at once.
And at first sight, the lack of annotation support seems to be among them. New standard introduces something called attributes, which appears to fall in into the same conceptual bucket:

  1. [[dllexport]] void SomeFunction(int x);

That’s misleading, though. Attributes are nothing else than unified syntax for compiler extensions. Until C++11, the job of attributes were done by custom keywords, such as __attribute__ (GCC) or __declspec (Visual C++). Now they should be replaced by the new [[squareBracket]] syntax above.
So there is nothing really new about those attributes. At best, you could compare them to W3C deciding on common syntax for border-radius that forces both -webkit-border-radius and -moz-border-radius to adapt. Most importantly, there is no standard way to define your custom attributes and introspect them later.

That’s a shame. So I thought I could try to fix that, because the case didn’t look completely lost. In fact, there is a precedent for a mainstream language where some people implemented something-kinda-almost-like annotations. That language is JavaScript, where annotations can be realized as functions arranged into a pipeline. Here’s an example from Express framework:

  1. app.get('/home', [loginRequired, cached({minutes:30})], function(req, res){
  2.     res.render('home');
  3. });

Both loginRequired and cached(...) are functions, tied together by app.get with the actual request handler at the end. They “decorate” that handler, wrapping it inside a code which offers additional functionality.

But that’s JavaScript, with its dynamic typing and callback bonanza. Can we even try to translate the above code into C++?…
Well yes, we actually can! Two new features from C++11 allow us to attempt this: lambda functions and initializer lists. With those two – and a healthy dose of functional programming – we may achieve at least something comparable.

Here Be Dragons: Game from IGK Compo

2013-04-14 0:01

Last weekend I had the pleasure to attend the game development conference IGK in Siedlce, Poland. Although gamedev is not something I normally do, it was a fun experience.

One thing I was especially looking forward to was the one day-long game programming contest called Compo. Like last year and the one before, I formed a team with Adam “Reg” Sawicki and Krzysztof “Krzysiek K.” Kluczek, guest starring Kamil “Netrix” Szatkowski. That’s four coders in total, so we kinda had some problem with, say, more artistic part of the endeavor :)

Nevertheless, the result was pretty alright: we scored 4th place out of about dozen teams. The theme this time was an “artillery” game with few mandatory features: two types of energy-like resources (HP & MP), achievements and multiplayer gameplay.
We nailed the last one by implementing support for two mice at once. This allowed us to come up with an interesting idea inspired by the Atari game Rampart: two castles that expand and attack each other in a fast-paced duel full of frantic clicking. And because the attackers, or guardians of each castle are dragons, hence the game’s title: Here Be Dragons.

Here Be Dragons - screenshot

File: [2013-04-07] Here Be Dragons  [2013-04-07] Here Be Dragons (3.2 MiB, 848 downloads)

In retrospect, we should probably have narrowed the scope of our project more. Doing it 3D was ambitious in itself, but packing it with all the features we planned the night before turned out to be slightly too much :) In the end we didn’t even have time to code any sound effects or music!… That’s certainly a lesson to learn here.

But all in all, it was great fun. It certainly inspired me to maybe try and brush up my gamedev skills a bit more :)

Tags: , ,
Author: Xion, posted under Events, Programming » Comments Off on Here Be Dragons: Game from IGK Compo

The Great Plural Experiment

2013-03-31 22:24

You probably know very well that internationalization is hard. The mere act of translating the UI texts is actually one of the easiest parts, even though it’s not a pushover either. As one example: if your messages include quantities, you need to have some logic in place to choose different forms of nouns to go with your numbers. Fortunately, most frameworks already have that, as it’s a standard i18n feature.

Not every string message in your code is something to localize, of course. Log messages that are not visible to the user can be left alone in English – they should be, in fact. Coincidentally, though, those messages are also very likely to contain many numbers, often used as numerical quantities: things to do, things done, error count, and so on:

  1. INFO: database.py:442/init_database - 236 rows created

What if that number is 1?…

  1. INFO: files.py:132/process_files - 1 files processed

Oh well. That’s hardly the end of the world, isn’t it? Anyway, let’s just make the message slightly more universal:

  1. INFO: files.py:132/process_files - 1 file(s) processed

There, problem solved!

No worries, I haven’t gone insane. I know that no real-world software would put such a gold plating on something as irrelevant as grammar of its log messages. But it’s spring break, and we can be silly, so let’s have some fun with the idea.
Here I pose the question:

How hard would it be to construct a plural form of English noun from the singular one?

Consulting the largest repository of human knowledge (well, second largest) reveals that the rules of building English plurals are not exactly trivial – but not very complex either. There are exceptions to almost every rule, though, and a large body of exceptions in general. Still, you could expect to achieve at least some success by just disregarding them completely, and following the simple rules to the letter.

How high that success ratio would be, though?

Tags: , , , ,
Author: Xion, posted under Computer Science & IT » 4 comments
 


© 2017 Karol Kuczmarski "Xion". Layout by Urszulka. Powered by WordPress with QuickLaTeX.com.