Archive for Programming

The “Wat” of Python

2012-01-31 21:19

It is quite likely you are familiar with the Wat talk by Gary Bernhardt. It is sweeping through the Internet, giving some good laugh to pretty much anyone who watches it. Surely it did to me!
The speaker is making fun of Ruby and JavaScript languages (although mostly the latter, really), showing totally unexpected and baffling results of some seemingly trivial operations – like adding two arrays. It turns out that in JavaScript, the result is an empty string. (And the reasons for that provoke even bigger “wat”).

After watching the talk for about five times (it hardly gets old), I started to wonder whether it is only those two languages that exhibit similarly confusing behavior… The answer is of course “No”, and that should be glaringly obvious to anyone who knows at least a bit of C++ ;) But beating on that horse would be way too easy, so I’d rather try something more ambitious.
Hence I ventured forth to search for “wat” in Python 2.x. The journey wasn’t short enough to stop at mere addition operator but nevertheless – and despite me being nowhere near Python expert – I managed to find some candidates rather quickly.

I strove to keep with the original spirit of Gary’s talk, so I only included those quirks that can be easily shown in interactive interpreter. The final result consists of three of them, arranged in the order of increasing puzzlement. They are given without explanation or rationale, hopefully to encourage some thought beyond amusement :)

Behold, then, the Wat of Python!

Tags: , , ,
Author: Xion, posted under Internet, Programming » 10 comments

Using Executors in Java

2012-01-12 21:17

When thinking about concurrent programs, we are sometimes blinded by the notion of bare threads. We create them, start them, join them, and sometimes even interrupt them, all by operating directly on those tiny little abstractions over several paths of simultaneous execution. At the same time we might be extremely reluctant to directly use synchronization primitives (semaphores, mutexes, etc.), preferring more convenient and tailored solutions - such as thread-safe containers. And this is great, because synchronization is probably the most difficult aspect of concurrent programming. Any place where we can avoid it is therefore one less place to be infested by nasty bugs it could ensue.

So why we still cling to somewhat low-level Threads for actual execution, while having no problems with specialized solutions for concurrent data exchange and synchronization?... Well, we can be simply unaware that "getting code to execute in parallel" is also something that can benefit from safety and clarity of more targeted approach. In Java, one such approach is oh-so-object-orientedly called executors.

As we might expect, an Executor is something that executes, i.e. runs, code. Pieces of those code are given it in a form of Runnables, just like it would happen for regular Threads:

executor.execute(new Runnable() {
    @Override public void run() {
        calculatePiToDecimalPlaces(10000000);
    }
});

Executor itself is an abstract class, so it could be used without any knowledge about queuing policy, scheduling algorithms and any other details of the way it conducts execution of tasks. While this seems feasible in some real cases - such as servicing incoming network requests - executors are useful mainly because they are quite diverse in kind. Their complex and powerful variants are also relatively easy to use.

Let's play in pool

Simple functions for creating different types of executors are contained within the auxiliary Executors class. Behind the scenes, most of them have a thread pool which they pull threads from when they are needed to process tasks. This pool may be of fixed or variable size, and can reuse a thread for more than one task,

Depending on how much load we expect and how many threads can we afford to create, the choice is usually between newCachedThreadPool and newFixedThreadPool. There is also peculiar (but useful) newSingleThreadExecutor, as well as time-based newScheduledThreadPool and newSingleThreadScheduledExecutor, allowing to specify delay for our Runnables by passing them to schedule method instead of execute.

Swapping them

There is one case where the abstract nature of base Executor class comes handy: testing and performance tuning. A certain types of executors can serve as good approximation of some common concurrency scenarios.

Suppose that we are normally handling our tasks using a pool with fixed number of threads, but we are not sure whether it's actually the most optimal number. If our tasks appear to be mostly I/O-bound, it could be good idea to increase the thread count, seeing that threads waiting for I/O operations simply lay dormant for most of the time.
To see if our assumptions have grounds, and how big the increase can be, we can temporarily switch to cached thread pool. By experimenting with different levels of throughput and observing the average execution time along with numbers of threads used by application, we can get a sense of optimal number of threads for our fixed pool.
Similarly, we can adjust and possibly decrease this number for tasks that appear to be mostly CPU-bound.

Finally, it might be also sensible to use the single-threaded executor as a sort of "sanity check" for our complicated, parallel program. What we are checking this way is both correctness and performance, in rather simple and straightforward way.
For starters, our program should still compute correct results. Failing to do so serves as indication that seemingly correct behavior in multi-threaded setting may actually be an accidental side effect of unspotted hazards. In other words, threads might "align just right" if there is more than one running, and this would hide some insidious race conditions which we failed to account for.

As for performance, we should expect the single-thread code to run for longer time than its multi-thread variant. This is somewhat obvious observation that we might carelessly take for granted and thus never verify explicitly - and that's a mistake. Indeed, it's not unheard of to have parallelized algorithms which are actually slower than their serial counterparts. Throwing some threads is not a magic bullet, unfortunately: concurrency is still hard.

Tags: , , ,
Author: Xion, posted under Programming » 1 comment

Text Ellipsis with Gradient Fade in Pure CSS

2011-12-26 18:56

The other day I encountered a small but very interesting effect, visible in Bitbucket issues' table. Some of the cells were slightly too narrow for the text they contained, and it had to be ellipsized. Usually this is done by cropping some of the text's trailing chars and replacing them with dots - mostly because that's what the text-ellipsis style is doing. Here, however, I saw something much more original: the text was fading out in gradient-like style, going from full black to full transparent/white over a distance of about 30 pixels. It made quite of an eye-catching effect.

So, I decided to bring up Firebug and find out how this nifty trick actually works. Taught by past experiences, I expected a tightly coupled mis-mash of DOM and CSS hacks, with lots of moving parts that need to be carefully adjusted in face of any changes. Alas, I was wrong: it turned out to only use CSS, in succinct and elegant manner. After simple reverse-engineering, I uncovered a clever solution involving gradients, opacity and :before/:after pseudo-elements. It definitely deserves some press, so let's look into it.

Tags: , ,
Author: Xion, posted under Internet, Programming » Add comment

About Java references

2011-12-17 19:22

There is somewhat common misconception about garbage collecting, that it totally frees the programmer from memory-related concerns. Granted, it makes the task easier in great many cases, but it does so at the expense of significant loss of control over objects' lifetime. Normally, they are kept around for at least until they are not needed anymore - and usually that's fine for the typical definitions of "need" and "at least". Usually - but not always.

For those less typical use cases, garbage-collected environments provide mechanisms allowing to regain some of that lost control, to the extent necessary for particular task. Java, for example, offers a variety of different types of references, enabling to change the notion of what it means for an object to be eligible for garbage collecting. Choosing the right one for a problem at hand can be crucial, especially if we are concerned with the memory footprint of our application. Since - as the proverb goes - JVM expands to fill all available memory, it's good to know about techniques which help maintain our heap size in check.

The default is strong

So today, I will discuss the SoftReference and WeakReference classes, which can be both found in the java.lang.ref package. They provide the so-called soft and weak references, which are both considerably less powerful when it comes to prolonging the lifetime of an object.

Tags: , , ,
Author: Xion, posted under Programming » 1 comment

Decorators with Optional Arguments in Python

2011-12-13 18:34

It is common that features dubbed 'syntactic sugar' are often fostering novel approaches to programming problems. Python's decorators are no different here, and this was a topic I touched upon before. Today I'd like to discuss few quirks which are, unfortunately, adding to their complexity in a way that often doesn't feel necessary.

Let's start with something easy. Pretend that we have a simple decorator named @trace, which logs every call of the function it is applied to:

@trace
def some_function(*args):
    pass

An implementation of such decorator is relatively trivial, as it wraps the decorated function directly. One of the possible variants can be seen below:

def trace(func):
    def wrapped(*args, **kwargs):
        logging.debug("Calling %s with args=%s, kwargs=%s",
                      func.__name__, args, kwargs)
        return func(*args, **kwargs)
    return wrapped

That's pretty cool for starters, but let's say we want some calls to stand out in the logging output. Perhaps there are functions that we are more interested in than the rest. In other words, we'd like to adjust the priority of log messages that are generated by @trace:

@trace(level=logging.INFO)
def important_func():
    pass

This seemingly small change is actually mandating massive conceptual leap in what our decorator really does. It becomes apparent when we de-sugar the @decorator syntax and look at the plumbing underneath:

important_func = trace(level=logging.INFO)(important_func)

Introduction of parameters requires adding a new level of indirection, because it's the return value of trace(level=logging.INFO) that does the actual decorating (i.e. transforming given function into another). This might not be obvious at first glance and admittedly, a notion of function that returns a function which takes some other function in order to output a final function might be - ahem - slightly confusing ;-)

But wait! There is just one more thing... When we added the level argument, we not necessarily wanted to lose the ability to invoke @trace without it. Yes, it is still possible - but the syntax is rather awkward:

@trace()
def some_function(*args):
    pass

That's expected - trace only returns the actual decorator now - but at least slightly annoying. Can we get our pretty syntax back while maintaining the added flexibility of specifying custom arguments? Better yet: can we make @trace, @trace() and @trace(level) all work at the same time?...

Looks like tough call, but fortunately the answer is positive. Before we delve into details, though, let's step back and try to somewhat improve the way we are writing our decorators.

Tags: , , , ,
Author: Xion, posted under Programming » 1 comment

Synchronization Through Memcache

2011-12-10 18:11

In web and server-side development, a memcache is a remote service that keeps transient data in temporary, in-memory store and allows for fast access. It is usually based on keys which identify values being stored and retrieved. Due to storing technique - RAM rather than actual disks - it offers great speed (often less than few milliseconds per operation) while introducing a possibility for values to be evicted (deleted) if memcache runs out of space. Should that happen, oldest values are usually disposed of first.

This functionality makes memcache a good secondary storage that complements the usual persistent database, increasing the efficiency of the system as a whole. The usual scenario is to poll memcache first, and resort to querying the database only if the result cannot be found in cache. Once the query is made, its results are memcached for some reasonable amount time (usually called TTL: time to live), depending on allowed trade-offs between speed and being up-to-date with our results.

While making applications more responsive is the primary use case for memcache, it turns out that we can also utilize it for something completely different. That's because memcache implementations offer something more besides the obvious get/set commands. They also have operations which make it possible to use memcache as synchronization facility.

Odkrycie archeologiczne

2011-11-20 21:18

Sprzed prawie czterech lat pochodzi pewien niewielki (~2KLOC) projekt uczelniany, na który natknąłem się kilka dni temu w swoich przepastnych archiwach i postanowiłem upublicznić. Jest to implementacja prostego (acz zupełnie funkcjonalnego) serwera FTP, napisana w czystym C pod systemy POSIX-owe. Nie spodziewam się bynajmniej, aby mogła znaleźć rzeczywiste zastosowanie jako kawałek oprogramowania. Jest ona jednak całkiem interesująca jako kawałek kodu.

Wielu znany jest zapewne "syndrom następnego pół roku". Polega on na tym, że gdy po pół roku (plus/minus kilka miesięcy) spoglądamy na stworzony przez siebie kod, widzimy go tak, jakby napisał go ktoś zupełnie inny. Zazwyczaj wręcz trudno nam się w nim połapać i szybko dochodzimy do wniosku, że teraz napisalibyśmy go zdecydowanie lepiej. Nasz twór traktujemy więc jako bezwarto... ekhm... legacy code (;]), i uważamy to za naturalną kolej rzeczy.

Jednak moje niedawne znalezisku okazało się pod tym względem sporym zaskoczeniem. Nietypowe jest bowiem to, jak przetrwało ono próbę czasu. Na jego podstawie muszę dojść do lekko szokującego wniosku, iż Xion2007 potrafił - o zgrozo - pisać dobry kod. Robił to wprawdzie ostrożnie i raczej niepewnie (czego dowodem była przesadna ilość komentarzy), ale koniec końców udawało mu się to całkiem nieźle. Wysyłając mu wiadomość z przyszłości, mógłbym wprawdzie wspomnieć o zaletach podziału kodu na pliki krótsze niż 800-linijkowe, lecz poza tym do niewielu rzeczy mógłbym się przyczepić. To zupełnie akceptowalny, czytelny i przejrzysty kod w C...

Madness!

Tags: , ,
Author: Xion, posted under Programming, Studies » Add comment
 


© 2012 Karol Kuczmarski "Xion". Layout by Urszulka. Powered by WordPress with QuickLaTeX.com.