Design patterns are often criticized, typically in the context of object-oriented programming. I buy into many such critiques, mostly because I value simplicity as one of the most important qualities of good code. Patterns – especially when overused – often stand in the way of achieving it.
Not all criticism aimed at design patterns is well founded and well targeted, though. More specifically, an example I've seen brought up quite often is the Singleton pattern, and I don't think it's a good one in this context. Actually, for making a case that design patterns are (sometimes) harmful, the Singleton is probably one of the worst possible picks.
Realizing this is important, because whatever point you're trying to convey will be significantly watered down if you use an inadequate example. It's just too easy to come up with counterarguments or excuses that concentrate on specific flaws of your sloppy choice, rather than address the more general issues you wanted to shed some light on. A bad example can simply be a red herring, drawing attention away from the topic it was supposed to stand for.
What's so bad about the Singleton pattern, though?
Especially in their classic incarnation, formulated in the famous work of the Gang of Four, design patterns are mostly about increasing the robustness and flexibility of software design by introducing additional layers of indirection between existing concepts. For instance, you can think of the Factory pattern as a proxy that separates the process of creating an object from the specific type (class) of that object.
This goes along the same lines as the separation between interface and implementation, a fundamental concept behind the whole object-oriented paradigm. The purpose is to decrease coupling, i.e. the number of dependencies between different parts of the code, and it's a noble goal in its own right.
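To make this a bit more concrete, a basic Factory in Java might look roughly like this (the Shape, Circle, Square and ShapeFactory names are made up just for illustration):

```java
// Hypothetical Shape hierarchy, used only to illustrate the idea.
interface Shape {
    double area();
}

class Circle implements Shape {
    private final double radius;
    Circle(double radius) { this.radius = radius; }
    public double area() { return Math.PI * radius * radius; }
}

class Square implements Shape {
    private final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

// The factory hides the concrete classes: callers only name the kind of shape they want.
class ShapeFactory {
    static Shape create(String kind, double size) {
        switch (kind) {
            case "circle": return new Circle(size);
            case "square": return new Square(size);
            default: throw new IllegalArgumentException("Unknown shape: " + kind);
        }
    }
}
```

The client code depends only on the Shape interface and the factory method, so concrete classes can change without touching it.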
Unfortunately, the Singleton pattern doesn't really aid us in this pursuit. Quite the opposite: it insists on having at most one instance of some class, which easily makes it a choke point for many otherwise independent parts of program logic. This happens especially often with top-level objects representing whole subsystems; once they are made into singletons, they end up being used almost everywhere.
We also shouldn't forget what singletons really are – that is, global variables. (You can have singletons with more limited scope, of course, but OO languages typically support those as a language feature that doesn't require a dedicated design pattern.) The pattern attempts to abstract them away, but they tend to leak out rather eagerly, causing numerous problems.
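For reference, the classic Singleton looks roughly like this in Java (Configuration is a made-up example class), and the global variable is hiding in plain sight:

```java
// A classic, eagerly-initialized Singleton: in effect, a global variable
// wrapped in a class.
class Configuration {
    private static final Configuration INSTANCE = new Configuration();

    private Configuration() {}

    static Configuration getInstance() {
        return INSTANCE;
    }

    // Any mutable state kept here is reachable from everywhere in the program.
}
```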
Indeed, there are all sorts of nastiness related to global variables, with these two being – in my opinion – the most important ones:
- they introduce hidden, shared state that every thread in the program can touch, which quickly becomes a problem in concurrent code;
- they create implicit dependencies that are hard to substitute or mock, making the code that relies on them much harder to test in isolation.
It is worth noting that these problems are somewhat language-specific. In several programming languages, you can relatively easily create "global" variables which are global in appearance only; in reality, they proxy to thread-local and/or mockable objects, addressing both concerns outlined above.
However, in such languages the Singleton pattern is often obsolete as an explicit technique, because they readily provide it as part of the language. For example, Python module objects are already singletons: their singularity is guaranteed by the interpreter itself.
So, if you are to discuss the merits of software design patterns – the pros and (especially) the cons – make sure you don't base your whole argument on the example of Singleton. Accuracy, integrity and honesty call for choosing a target which is more representative and free of severe, unrelated issues.
Something like, say, Iterator. Or Factory. Or Composite.
Or pretty much anything else.
Fairly recently, I started reading up on quantum mechanics (QM) to brush up my understanding of the topic and, quite surprisingly, I've found it rife with analogies to my usual area of interest: software development. The one that stands out particularly well relates to the very basics of QM and the way they were widely misunderstood for many decades. What's really amusing here is that while the majority of physicists seem to have been easily fooled by how the world operates at the quantum level, any contemporary half-decent software engineer, faced with problems of a very similar nature, typically doesn't exhibit folly of this magnitude.
We are not uncovering the Grand Scheme of Things every day, of course; what I'm saying is that we seem to be much less likely to come up with certain extremely bad answers to all the why? questions we encounter constantly in our work. Even the really hard ones ("Why-oh-why doesn't it work?!") are rarely different in this regard.
Thus I dare say we would not be so easily tricked by some of the "bizarre" phenomena that fooled many of the early QM researchers. In fact, they turn out to be perfectly reasonable (and rather simple) if we look at them with a programmer's mindset. The hard part, of course, is to discover that such a perspective applies here at all, instead of quickly jumping to "intuitive" but wrong conclusions.
To see how tempting that jump can be, we should now look at one simple experiment with light and mirrors, and try to decipher its puzzling results.
The setup is not very complicated. We have one light source, two detectors and two pairs of mirrors. One pair consists of standard, fully reflective mirrors. The second pair consists of half-silvered ones; they reflect only half of the light, letting the other half through without changing its direction.
We arrange this equipment as shown in the following picture. Here, the yellow lines depict the path the light takes after being emitted from the source, somewhere beyond the left edge.
Source of this and subsequent images
But in this experiment, we are not letting out a continuous ray of light. Instead, we send out individual photons. We know (from previous observations) that half-silvered mirrors still behave correctly in this scenario: they simply reflect a photon about 50% of the time. Normal mirrors, obviously, always reflect all the photons.
Knowing this, we would expect both detectors to go off with roughly equal frequency. What we find out in practice is that only detector 2 ever registers any photons, and no particle whatsoever reaches detector 1, at any time. (This is illustrated by the dashed line.)
At this point we might want to perform a sanity check, to see whether we are really dealing with individual particles (rather than waves that can interfere and thus cancel themselves out). So, we block out one of the paths:
and now both detectors are going off, though never simultaneously. This indicates that our photons are indeed localized particles, as they appear in only one place at a time. Yet, for some weird, inexplicable reason, they never show up at detector 1 once we remove the barrier.
There are all sorts of peculiar conclusions we could draw already, including the possibility that a photon somehow takes both paths and that this affects the results we observe. Let's try not to go crazy just yet, though. Surely we can establish which of the two paths is actually being taken; it's just a matter of putting in an additional sensor:
So we do just that, and we turn on the machinery again. What we get, however, is far from a definite answer. Actually, it's the total opposite: both detectors are going off now, just like in the previous setup – but we haven't blocked anything this time! We just wanted to take a sneak peek and learn which paths our photons are actually taking.
But as it turns out, we are now preventing the phenomenon from occurring at all… What the hell?!
When thinking about concurrent programs, we are sometimes blinded by the notion of bare threads. We create them, start them, join them, and sometimes even interrupt them, all by operating directly on those tiny little abstractions over several paths of simultaneous execution. At the same time, we might be extremely reluctant to use synchronization primitives (semaphores, mutexes, etc.) directly, preferring more convenient and tailored solutions – such as thread-safe containers. And this is great, because synchronization is probably the most difficult aspect of concurrent programming. Any place where we can avoid it is therefore one less place to be infested by the nasty bugs it can breed.
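As a quick illustration of what such a tailored solution buys us, here's a minimal sketch of a producer–consumer exchange built on a BlockingQueue, with no explicit locks in sight (the work item is made up for the example):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ProducerConsumerSketch {
    public static void main(String[] args) {
        // The queue handles all locking and signalling internally.
        final BlockingQueue<String> tasks = new ArrayBlockingQueue<>(16);

        Thread producer = new Thread(new Runnable() {
            public void run() {
                try {
                    tasks.put("some work item");   // blocks if the queue is full
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        Thread consumer = new Thread(new Runnable() {
            public void run() {
                try {
                    String item = tasks.take();    // blocks until an item arrives
                    System.out.println("Got: " + item);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        producer.start();
        consumer.start();
    }
}
```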
So why do we still cling to the somewhat low-level Threads for actual execution, while having no problems with specialized solutions for concurrent data exchange and synchronization?… Well, we may simply be unaware that "getting code to execute in parallel" is also something that can benefit from the safety and clarity of a more targeted approach. In Java, one such approach goes by the oh-so-object-oriented name of executors.
As we might expect, an Executor is something that executes, i.e. runs, code. Pieces of that code are handed to it in the form of Runnables, just as they would be to regular Threads.
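A minimal sketch of this interplay, using a trivial Executor that simply spawns a new thread for every task it receives:

```java
import java.util.concurrent.Executor;

public class ThreadPerTaskSketch {
    public static void main(String[] args) {
        // A trivial Executor: every submitted task gets its own new thread.
        Executor executor = new Executor() {
            public void execute(Runnable task) {
                new Thread(task).start();
            }
        };

        // Tasks are handed over as Runnables, just like with bare Threads.
        executor.execute(new Runnable() {
            public void run() {
                System.out.println("Hello from " + Thread.currentThread().getName());
            }
        });
    }
}
```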
Executor itself is just an interface, so it can be used without any knowledge of the queuing policy, scheduling algorithm or other details of how it conducts the execution of tasks. While programming against the bare interface is feasible in some real cases – such as servicing incoming network requests – executors are useful mainly because they come in many kinds. Their complex and powerful variants are also relatively easy to use.
Simple functions for creating different types of executors are provided by the auxiliary Executors class. Behind the scenes, most of them maintain a thread pool from which they pull threads whenever they are needed to process tasks. This pool may be of fixed or variable size, and it can reuse a thread for more than one task.
Depending on how much load we expect and how many threads we can afford to create, the choice is usually between newCachedThreadPool and newFixedThreadPool. There is also the peculiar (but useful) newSingleThreadExecutor, as well as the time-based newScheduledThreadPool and newSingleThreadScheduledExecutor, which allow specifying a delay for our Runnables by passing them to the schedule method instead of execute.
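For a rough idea of how these factory methods are used in practice, here's a small sketch (the pool size and printed messages are arbitrary):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ExecutorsSketch {
    public static void main(String[] args) {
        // A pool with a fixed number of four worker threads.
        ExecutorService fixedPool = Executors.newFixedThreadPool(4);
        fixedPool.execute(new Runnable() {
            public void run() {
                System.out.println("Handled by the fixed pool");
            }
        });

        // A scheduled executor runs its task after the given delay.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.schedule(new Runnable() {
            public void run() {
                System.out.println("Runs two seconds later");
            }
        }, 2, TimeUnit.SECONDS);

        // Executors should be shut down once they are no longer needed.
        fixedPool.shutdown();
        scheduler.shutdown();
    }
}
```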
There is one case where the abstract nature of the base Executor interface comes in handy: testing and performance tuning. Certain types of executors can serve as a good approximation of some common concurrency scenarios.
Suppose that we normally handle our tasks using a pool with a fixed number of threads, but we are not sure whether it's actually the optimal number. If our tasks appear to be mostly I/O-bound, it could be a good idea to increase the thread count, seeing that threads waiting for I/O operations simply lie dormant most of the time.
To see whether our assumptions have grounds, and how big the increase could be, we can temporarily switch to a cached thread pool. By experimenting with different levels of throughput and observing the average execution time along with the number of threads used by the application, we can get a sense of the optimal thread count for our fixed pool.
Similarly, we can adjust and possibly decrease this number for tasks that appear to be mostly CPU-bound.
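One way such an experiment could be wired up – the profileWithCachedPool flag and threadCount value below are hypothetical, introduced only for illustration – is to keep the swap in a single place so that the task-submitting code never changes:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolTuningSketch {
    // Flip this flag only for profiling runs.
    private static final boolean profileWithCachedPool = false;
    private static final int threadCount = 8;

    static ExecutorService createTaskPool() {
        // Same submission code everywhere; only the pool type changes.
        return profileWithCachedPool
                ? Executors.newCachedThreadPool()
                : Executors.newFixedThreadPool(threadCount);
    }
}
```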
Finally, it might also be sensible to use the single-threaded executor as a sort of "sanity check" for our complicated, parallel program. What we are checking this way is both correctness and performance, in a rather simple and straightforward way.
For starters, our program should still compute correct results. Failing to do so is an indication that the seemingly correct behavior in the multi-threaded setting may actually be an accidental side effect of unspotted hazards. In other words, threads might "align just right" when there is more than one of them running, hiding some insidious race conditions we failed to account for.
As for performance, we should expect the single-threaded code to run longer than its multi-threaded variant. This is a somewhat obvious observation that we might carelessly take for granted and thus never verify explicitly – and that's a mistake. Indeed, it's not unheard of for parallelized algorithms to actually be slower than their serial counterparts. Throwing threads at a problem is not a magic bullet, unfortunately: concurrency is still hard.
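Putting the two checks together, a sanity run could be as simple as the sketch below (runAllTasks is a placeholder for whatever the real program submits, and the timing output is purely illustrative):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SanityCheckSketch {
    static void runAllTasks(ExecutorService executor) {
        // Placeholder: submit the program's real workload here.
    }

    static long measure(ExecutorService executor) throws InterruptedException {
        long start = System.nanoTime();
        runAllTasks(executor);
        executor.shutdown();
        executor.awaitTermination(1, TimeUnit.MINUTES);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws InterruptedException {
        long serial = measure(Executors.newSingleThreadExecutor());
        long parallel = measure(Executors.newFixedThreadPool(4));
        // The parallel run should both produce the same results and take less time.
        System.out.println("serial: " + serial + " ns, parallel: " + parallel + " ns");
    }
}
```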
Unless you spent last week in Antarctica, the Amazon jungle or some other place similarly cut off from civilization, you have surely heard the most important news of recent years. Certainly of last week, at least – how could it even be compared with such trifles as, say, Google's purchase of Motorola. After all, we are talking about the successful conclusion of a process that began back when Google didn't even exist! That has to be impressive… And even if the impression mostly boils down to "Finally! What took so long?!", it doesn't diminish the significance of the event one bit.
Yes, we finally have a new C++ standard! And that alone could pretty much suffice for the whole post, because just about everything there is to say about the next version of one of the most important programming languages has probably been said long ago, across dozens of news sites, thousands of blogs and millions of tweets. A significant part of that coverage is devoted to overviews of the new language features, which have been available for some time (albeit in incomplete form) in a few leading compilers. There are quite a lot of them, which is why I don't intend to even enumerate them all. Instead, I've decided to take a closer look at just three – the ones I consider the most significant and worth attention.
I have learned to like the fact that, in computing, talking about "interesting times" is a truism, because the times are simply always interesting – mainly due to the pace of change in many areas. Even those that would seem frozen for years. Less than three years ago, for instance, I was grumbling about excessive faith in the perfection of object-oriented programming methods. Today I find myself doing quite the opposite.
Object-oriented programming is currently the flagship scapegoat and whipping boy, taking blows from many directions. It's no longer just game developers claiming they cannot afford it for performance reasons and pushing Data Oriented Design in its place. I showed recently that the contradiction between these two approaches is apparent rather than real. Now I've come across an interesting opinion that questions the very sense of OOP as a methodology, approaching it from a slightly different angle than performance of real-time graphics:
Object-oriented programming (…) is both anti-modular and anti-parallel by its very nature, and hence unsuitable for a modern CS curriculum. [emphasis mine]
Anti-modular and anti-parallel? Sure; it is possible to write object-oriented code that fulfills both of these conditions perfectly. But that doesn't mean every piece of object-oriented code fulfills them, and that is exactly what the statement above implies. There's no way to describe this other than as a stereotype – and a textbook one at that, i.e. a negative generalization drawn from isolated cases.
Functional programming is often put forward as the antidote to these alleged ailments of OOP. Without taking anything away from its elegance, I cannot help noticing that it sweeps many problems under the rug. Describing a program's execution as a series of data transformations does not by itself solve the question of where and how that data should be stored and protected from simultaneous access by multiple paths of execution. The situations in which functional or quasi-functional programming works well are precisely those where these problems turned out to be fairly easy to solve. That's the case, for instance, with vertex and pixel shaders, where partitioning the input and output data into disjoint blocks is downright natural. This, however, is not a merit of functional programming, but of the nature of the problem – in this case, rendering polygon-based graphics.
And that is exactly what we should keep in mind when we take potshots not only at OOP, but at any other programming paradigm. Abandoning it will not magically make us start writing perfectly modular code overnight. And it most certainly will not make the extremely difficult problems of concurrency suddenly become trivially simple. Unfortunately, it just doesn't work that way.
This doesn't mean, of course, that we shouldn't look for new, better methodologies for specific applications. That's exactly why many languages (e.g. C++, C#, Python) are evolving towards multi-paradigm designs: so that the right tools can be picked for a given situation. I don't think, however, that giving in to the fashionable trend of criticizing any solution by appealing to stereotypes and unfounded notions about it is particularly productive in this process. I do realize that "virtual functions are evil!" sounds better than "calling virtual functions incurs a performance overhead due to an extra level of memory indirection (which is not cache-friendly) and may cause undesirable side effects if their overrides in derived classes are not thread-safe". I hope, however, that no one has any doubts as to which of these two statements is the more rational one.
Thanks to Reg for sending me the links that inspired me to take up this topic.