Posts tagged ‘functions’

Simulating nonlocal Keyword in Python 2.x

2013-08-04 15:30

The overall direction where Python 3 is going might be a bit worrying, but it’s undeniable that the 3.0 line has some really nice features and quality-of-life improvements. What’s not to love about Unicode string literals enabled by default? Or the print function? What about range, map and filter all being generators? Neato!

There are also few lesser known points. One is the new nonlocal keyword. It shares the syntax with the global keyword, which would make it instantaneously fishy just by this connotation. However, nonlocal looks genuinely useful: it allows to modify variables captured inside function’s closure:

  1. def count_parity(numbers):
  2.     even_count = odd_count = 0
  3.  
  4.     def examine(number):
  5.         if number % 2 == 0:
  6.             nonlocal even_count
  7.             even_count += 1
  8.         else:
  9.             nonlocal odd_count
  10.             odd_count += 1
  11.  
  12.     list(map(examine, numbers))
  13.     return even_count, odd_count

It’s something you would do quite a lot in certain other languages (*cough* JavaScript), so it’s good that Python has got around to support the notion as well. Syntax here is a bit clunky, true, but that’s simply a trade off, stemming directly from the lack of variable declarations in Python.

What about Python 2.x, though – are we out of luck? Well, not completely. There are a few ways to emulate nonlocal through other pythonic capabilities, sometimes even to better effect than the nonlocal keyword would yield.

Use generator instead

Speaking of yielding… As you have probably noticed right away, the example above is quite overblown and just plain silly. You don’t need to play functional just to count some values – you would use a loop instead:

  1. def count_parity(numbers):
  2.     even_count = odd_count = 0
  3.     for number in numbers:
  4.         if number % 2 == 0:
  5.             even_count += 1
  6.         else:
  7.             odd_count += 1
  8.      return even_count, odd_count

Really, the previous version is just a mindless application of the classic Visitor pattern, which is another reason why you shouldn’t do that: pattern overuse is bad. This saying, Visitor obviously has its place: it’s irreplaceable when traversing more complicated structures in more bureaucratic languages. A simple list of numbers in Python is the direct opposite of both of these characteristics.

Complex data structures exist in any language, however. How would we run some Python code for every node in a tree, or maybe graph? Unrolling the DFS or BFS or whatever traversal algorithm we use certainly doesn’t sound like an elegant and reusable approach.
But even then, there is still no need for functions and closures. We can easily get away with the simple for loop, if we just find a suitable iterable to loop over:

  1. def bst_count_parity(tree):
  2.     """Count the number of even and odd numbers in binary search tree."""
  3.     even_count = odd_count = 0
  4.     for node in bst_nodes(tree):
  5.         if node.value % 2 == 0:
  6.             even_count += 1
  7.         else:
  8.             odd_count += 1
  9.     return even_count, odd_count

The bst_nodes function above is not black magic by any stretch. It’s just a simple example of generator function, taking advantage of the powerful yield statement:

  1. def bst_nodes(tree):
  2.     """Yields nodes of binary tree in breadth-first order."""
  3.     queue = [tree]
  4.     while queue:
  5.         node = queue.popleft()
  6.         yield node
  7.         if node.left:
  8.             queue.append(node.left)
  9.         if node.right:
  10.             queue.append(node.right)

This works because both bst_count_parity and bst_nodes functions are executed “simultaneously”. That has the same practical effect as calling the visitor function to process a node, only the “function” is concealed as the body of for loop.

Language geeks (and Lisp fans) would say that we’ve exchanged a closure for continuation. There is probably a monad here somewhere, too.

Create reference where there is none

Generators can of course solve a lot of problems that we may want to address with nonlocal, but it’s true you cannot write them all off just by clever use of yield statement. For the those rare occasions – when you really, positively, truly need a mutable closure – there are still some options on the board.

The crucial observation is that while the closure in Python 2.x is indeed immutable – you cannot add new variables to it – the objects inside need not be. If you are normally able to change their state, you can do so through captured variables as well. After all, you are still just “reading” those variables; they do not change, even if the objects they point to do.

Hence the solution (or workaround, more accurately) is simple. You need to wrap your value inside a mutable object, and access it – both outside and inside the inner function – through that object only. There are few choices of suitable objects to use here, with lists and dictionaries being the simplest, built-in options:

  1. def incr(redis, key):
  2.     """Increments value of Redis key, as if Redis didn't have INCR command.
  3.    :return: New value for the key
  4.    """
  5.     res = []
  6.  
  7.     def txn(pipe):
  8.         res[0] = int(pipe.get(key)) + 1
  9.         pipe.multi()
  10.         pipe.set(key, res[0])
  11.  
  12.     redis.transaction(txn, key)
  13.     return res[0]

If you become fond of this technique, you may want to be more explicit and roll out your own wrapper. It might be something like a Var class with get and set methods, or just a value attribute.

Classy solution

Finally, there is a variant of the above approach that involves a class rather than function. It is strangely similar to “functor” objects from the old C++, back when it didn’t support proper lambdas and closures. Here it is:

  1. def incr(redis, key):
  2.     """Increments value of Redis key, as if Redis didn't have INCR command.
  3.    :return: New value for the key
  4.    """
  5.     class IncrTransaction(object):
  6.         def __call__(self, pipe):
  7.             self.result = int(pipe.get(key)) + 1
  8.             pipe.multi()
  9.             pipe.set(key, self.result)
  10.  
  11.     txn = IncrTransaction()
  12.     redis.transaction(txn, key)
  13.     return txn.result

Its main advantage (besides making it a bit clearer what’s going on) is the potential for extracting the class outside of the function – and thus reusing it. In the above example, you would just need to add the __init__(self, key) method to make the class independent from the enclosing function.

Ironically, though, that would also defeat the whole point: you don’t need a mutable closure if you don’t need a closure at all. Problem solved? ;-)

Tags: , , , ,
Author: Xion, posted under Programming » Comments Off on Simulating nonlocal Keyword in Python 2.x

Unpacking Ad-hoc Dictionaries

2013-01-27 8:04

Today I’d like to present something what I consider rather obvious. Normally I don’t do that, but I’ve had one aspiring Pythonist whom I helped with the trick below, and he marveled at the apparent cleverness of this solution. So I thought it might be useful for someone else, too.

Here’s the deal. In Python, functions can be invoked with keyword arguments, so that argument name appears in the function call. Many good APIs use that feature extensively; database libraries known as ORMs are one typical example:

  1. user = session.query(User).filter_by(email="joe@example.com").first()

In this call to filter_by() we pass the email argument as a keyword. Its value is then used to construct an SQL query that contains a filter on email column in the WHERE clause. By adding more arguments, we can introduce more filters, linked together with the AND operator.

Suppose, though, that we don’t know the column name beforehand. We just have it stored in some variable, maybe because the query is part of authentication procedure and we support different means for it: e-mail, Facebook user ID, Twitter handle, etc.
However, the keyword arguments in function call must always be written as literal Python identifiers. Which means that we would need to “eval” them somehow, i.e. compute dynamically.

How? Probably best is to construct an ad-hoc dictionary and unpack it with ** operator:

  1. def get_user(column_name, column_value):
  2.     return session.query(User).filter(**{column_name: column_value}).first()

That’s it. It may not be obvious at first, because normally we only unpack dictionaries that were carefully crafted as local variables, or received as kwargs parameters:

  1. def some_function(one_arg, **kwargs):
  2.     kwargs['foo'] = 'bar'
  3.     some_other_function(**kwargs)

But ** works on any dictionary. We are thus perfectly allowed to create one and then unpack it immediately. It doesn’t make much sense in most cases, but this is one of the two instances when it does.

The other situation arises when we know the argument name while writing code, but we cannot use it directly. Python reserves many short, common words with plethora of meanings (computer-scientific or otherwise), so this is not exactly a rare occurrence. You may encounter it when building URLs in Flask:

  1. login_url = url_for('login', **{'as': test_user_id})

or parsing HTML with BeautifulSoup:

  1. comment_spans = comments_table.find_all('span', **{'class': 'comment'})

Strictly speaking, this technique allows you to have completely arbitrary argument names which are not even words. Special handling would be required on both ends of function call, though.

Tags: , ,
Author: Xion, posted under Computer Science & IT » 2 comments

The Javascript Functions Which Are Not

2012-06-03 15:25

It would be quite far-fetched to call JavaScript a functional language, for it lacks many more sophisticated features from the FP paradigm – like tail recursion or automatic currying. This puts it on par with many similar languages which incorporate just enough of FP to make it useful but not as much as to blur their fundamental, imperative nature (and confuse programmers in the process). C++, Python or Ruby are a few examples, and on the surface JavaScript seems to place itself in the same region as well.

Except that it doesn’t. The numerous different purposes that JavaScript code uses functions makes it very distinct, even though the functions themselves are of very typical sort, found in almost all imperative languages. Learning to recognize those different roles and the real meaning of function keyword is essential to becoming an effective JS coder.

So, let’s look into them one by one and see what the function might really mean.

A scope

If you’ve seen few good JavaScript libraries, you have surely stumbled upon the following idiom:

  1. /* WonderfulThing.js
  2.  * A real-time, HTML5-enabled, MVC-compatible boilerplate
  3.  * for appifying robust prototypes... or something
  4.  */
  5.  
  6. (function() {
  7.     // actual code goes here
  8. })();

Any and all code is enclosed within an anonymous function. It’s not even stored in a variable; it’s just called immediately so its content is just executed, now.

This round-trip may easily be thought as if doing absolutely nothing but there is an important reason for keeping it that way. The point is that JavaScript has just one global object (window in case of web browsers) which is a fragile namespace, easily polluted by defining things directly at the script level.

We can prevent that by using “bracketing” technique presented above, and putting everything inside this big, anonymous function. It works because JavaScript has function scope and it’s the only type of non-global scope available to the programmer.

A module

So in the example above, the function is used to confine script’s code and all the symbols it defines. But sometimes we obviously want to let some things through, while restricting access to some others – a concept known as encapsulation and exposing an interface.

Perhaps unsurprisingly, in JavaScript this is also done with the help of a function:

  1. var counter = (function() {
  2.     var value = 0;
  3.     return {
  4.         increment: function(by) {
  5.             value += by || 1;
  6.         },
  7.         getValue: function() {
  8.             return value;
  9.         },
  10.     };
  11. })();

What we get here is normal JS object but it should be thought of more like a module. It offers some public interface in the form of increment and getValue functions. But underneath, it also has some internal data stored within a closure: the value variable. If you know few things about C or C++, you can easily see parallels with header files (.h, .hpp, …) which store declarations that are only implemented in the code files (.c, .cpp).

Or, alternatively, you may draw analogies to C# or Java with their public and private (or protected) members of a class. Incidentally, this leads us to another point…

Object factories (constructors)

Let’s assume that the counter object from the example above is practical enough to be useful in more than one place (a tall order, I know). The DRY principle of course prohibits blatant duplication of code such as this, so we’d like to make the piece more reusable.

Here’s how we typically tackle this problem – still using only vanilla functions:

  1. var createCounter = function(initial) {
  2.     var value = initial || 0;
  3.     return {
  4.         increment: function(by) {
  5.             value += by || 1;
  6.         },
  7.         getValue: function() {
  8.             return value;
  9.         },
  10.     };
  11. };
  12. var counter = createCounter();
  13. var counterFrom1000 = createCounter(1000);

Pretty straightforward, right? Instead of calling the function on a spot, we keep it around and use to create multiple objects. Hence the function becomes a constructor for them, while the whole mechanism is nothing else but a foundation for object-oriented programming.

\displaystyle functions + closures = OOP

We have now covered most (if not all) roles that functions play when it comes to structuring JavaScript code. What remains is to recognize how they interplay with each other to control the execution path of a program. Given the highly asynchronous nature of JavaScript (on both client and server side), it’s totally expected that we will see a lot of functions in any typical JS code.

Tags: , , , , ,
Author: Xion, posted under Programming » Comments Off on The Javascript Functions Which Are Not

How Does this Work (in JavaScript)

2012-03-18 21:19

Many caveats clutter the JavaScript language. Some of them are quite hilarious and relatively harmless, but few can get really nasty and lead to insidious bugs. Today, I’m gonna talk about something from the second group: the semantics of this keyword in JavaScript.

Why’s this?

It is worth noting why JS has the this keyword at all. Normally, we would expect it only in those languages which also have the corresponding class keyword. That’s what C++, Java and C# have taught us: that this represents the current object of a class when used inside one of its methods. It only makes sense, then, to use this keyword in a class scope, denoted by the class keyword – both of which JavaScript doesn’t seem to have. So, why’s this even there?

The most likely reason is that JavaScript actually has something that resembles traditional classes – but it does so very poorly. And like pretty much everything in JS, it is written as a function:

  1. function Greeting(text) {
  2.     this.text = text
  3. }
  4. Greeting.prototype.greet = function(who) {
  5.     alert("Hello, " who + "! " + this.text);
  6. }
  7.  
  8. var greeting = new Greeting("Nice to meet you!");
  9. greeting.greet("Alice");

Here, the Greeting is technically a function and is defined as one, but semantically it works more like constructor for the Greeting “class”. As for this keyword, it refers to the object being created by such a constructor when invoked by new statement – another familiar construct, by the way. Additionally, this also appears inside greet method and does its expected job, allowing access to the text member of an object that the method was called upon.

So it would seem that everything with this keyword is actually fine and rather unsurprising. Have we maybe overlooked something here, looking only at half of the picture?…

Well yes, very much so. And not even a half but more like a quarter, with the remaining three parts being significantly less pretty – to say it mildly.

Tags: , , , ,
Author: Xion, posted under Programming » 2 comments

Adding Recursive Depth to Our Functions

2012-03-05 22:46

I suppose it this not uncommon to encounter a general situation such as the following. Say you have some well-defined function that performs a transformation of one value into another. It’s not particularly important how lengthy or complicated this function is, only that it takes one parameter and outputs a result. Here’s a somewhat trivial but astonishingly useful example:

  1. def is_true(value):
  2.     ''' Checks whether given value can be interpreted as "true",
  3.    using various typical representations of truth. '''
  4.     s = str(value).lower()
  5.     can_be_true = s in ['1', 'y', 'yes', 'true']
  6.     can_be_false = s in ['0', 'n' 'no', 'false']
  7.     if can_be_true != (not can_be_false):
  8.         return bool(value) # fall back in case of inconsistency
  9.     return can_be_true

Depending on what happens in other parts of your program, you may find yourself applying such function to many different inputs. Then at some point, it is possible that you’ll need to handle lists of those inputs in addition to supporting single values. Query string of URLs, for example, often require such treatment because they may contain more than one value for given key, and web frameworks tend to collate those values into lists of strings.

In those situations, you will typically want to deal just with the list case. This leads to writing a conditional in either the caller code:

  1. if not isinstance(values, list):
  2.     values = [values]
  3. bools = map(is_true, values)

or directly inside a particular function. I’m not a big fan of similar solutions because everyone do them differently, and writing the same piece several times is increasingly prone to errors. Quite not incidentally, a mistake is present in the very example above – it shouldn’t be at all hard to spot it.

In any case, repeated application calls for extracting the pattern into something tangible and reusable. What I devised is therefore a general “recursivator”, whose simplified version is given below:

  1. def recursive(func):
  2.     ''' Creates a recursive function out of supplied one.
  3.    Resulting function recurses on lists, applying itself
  4.    to its elements. '''
  5.     def recursive_func(obj, *args, **kwargs):
  6.         if hasattr(obj, '__iter__'):
  7.             return [recursive_func(i, *args, **kwargs)
  8.                     for i in obj]
  9.         return obj
  10.     return recursive_func

As for usage, I think it’s equally feasible for both on-a-spot calls:

  1. bools = recursive(is_true)(values)

as well as decorating functions to make them recursive permanently. For this, though, it would be wise to turn it into class-based decorator, applying the technique I’ve described previously. This way we could easily extend the solution and tie it to our needs.

But what are the specific ways of doing so? I could think of some, like:

  • Recursing not only on lists, bit also on mappings (dictionaries) and applying the function to dictionary values. A common use case could be a kind of sanitization function for preparing values to be serialized, e.g. by turning datetimes into ISO-formatted strings.
  • Excluding some data types from recursion, preventing, say, sets from being turned into lists, as obviously sets are also iterable. In more general version, one could supply a predicate function for deciding whether to recurse or not.
  • Turning recursive into generator for more memory-efficient solution. If we’re lucky to program in Python 3.x, it would be a good excuse to employ the new yield from construct from 3.3.

One way or another, capturing a particular concept of computation into actual API such as recursive looks like a good way for making the code more descriptive and robust. Certainly it adheres to one of the statements from Zen: that explicit is better than implicit.

Tags: , , ,
Author: Xion, posted under Programming » Comments Off on Adding Recursive Depth to Our Functions

Decorators with Optional Arguments in Python

2011-12-13 18:34

It is common that features dubbed ‘syntactic sugar’ are often fostering novel approaches to programming problems. Python’s decorators are no different here, and this was a topic I touched upon before. Today I’d like to discuss few quirks which are, unfortunately, adding to their complexity in a way that often doesn’t feel necessary.

Let’s start with something easy. Pretend that we have a simple decorator named @trace, which logs every call of the function it is applied to:

  1. @trace
  2. def some_function(*args):
  3.     pass

An implementation of such decorator is relatively trivial, as it wraps the decorated function directly. One of the possible variants can be seen below:

  1. def trace(func):
  2.     def wrapped(*args, **kwargs):
  3.         logging.debug("Calling %s with args=%s, kwargs=%s",
  4.                       func.__name__, args, kwargs)
  5.         return func(*args, **kwargs)
  6.     return wrapped

That’s pretty cool for starters, but let’s say we want some calls to stand out in the logging output. Perhaps there are functions that we are more interested in than the rest. In other words, we’d like to adjust the priority of log messages that are generated by @trace:

  1. @trace(level=logging.INFO)
  2. def important_func():
  3.     pass

This seemingly small change is actually mandating massive conceptual leap in what our decorator really does. It becomes apparent when we de-sugar the @decorator syntax and look at the plumbing underneath:

  1. important_func = trace(level=logging.INFO)(important_func)

Introduction of parameters requires adding a new level of indirection, because it’s the return value of trace(level=logging.INFO) that does the actual decorating (i.e. transforming given function into another). This might not be obvious at first glance and admittedly, a notion of function that returns a function which takes some other function in order to output a final function might be – ahem – slightly confusing ;-)

But wait! There is just one more thing… When we added the level argument, we not necessarily wanted to lose the ability to invoke @trace without it. Yes, it is still possible – but the syntax is rather awkward:

  1. @trace()
  2. def some_function(*args):
  3.     pass

That’s expected – trace only returns the actual decorator now – but at least slightly annoying. Can we get our pretty syntax back while maintaining the added flexibility of specifying custom arguments? Better yet: can we make @trace, @trace() and @trace(level) all work at the same time?…

Looks like tough call, but fortunately the answer is positive. Before we delve into details, though, let’s step back and try to somewhat improve the way we are writing our decorators.

Tags: , , , ,
Author: Xion, posted under Programming » 2 comments

Trzy rodzaje metod w Pythonie

2011-10-03 23:05

Obiekty mają metody. Tak, w tym stwierdzeniu nie należy doszukiwać głębokiego sensu – jest ono po prostu prawdziwe :) Gdy mówimy o metodach obiektów czy też klas, zwykle mamy jednak na myśli tylko jeden ich rodzaj: metody instancyjne. W wielu językach programowania nie jest to aczkolwiek ich jedyny rodzaj – z takim przypadkiem mamy do czynienia chociażby w Pythonie.

Jak zawsze metody instancyjne są domyślnym typem, który tutaj jest dodatkowo zaznaczony obecnością specjalnego parametru – self – występującego zawsze jako pierwszy argument. To odpowiednik this z języków C++, C# czy Java i reprezentuje instancję obiektu, na rzecz której wywoływana jest metoda:

  1. class Counter(object):
  2.     def __init__(self, value = 0):
  3.         self._value = value
  4.     def increment(self, by = 1):
  5.         self._value += by

Fakt, że musi być on jawnie przekazany, wynika z zasad tworzenia zmiennych w Pythonie. Nie muszą być one jawnie deklarowane. Dlatego też odwołanie do pola obiektu jest zawsze kwalifikowane, gdyż przypisanie do _value zamiast self._value stworzyłoby po prostu zmienną lokalną.

Istnieją jednak takie metody, które nie operują na konkretnej instancji klasy. Typowo nazywa się je statycznymi. W Pythonie nie posiadają one parametru self, lecz są opatrzone dekoratorem @staticmethod:

  1. @staticmethod
  2. def format_string():
  3.     return "%d"

Statyczną metodę można wywołać zarówno przy pomocy nazwy klasy (Counter.format_string()), jak i jej obiektu (Counter().format_string()), ale w obu przypadkach rezultat będzie ten sam. Technicznie jest to bowiem zwyczajna funkcja umieszczona po prostu w zasięgu klasy zamiast zasięgu globalnego.

Mamy wreszcie trzeci typ, mieszczący się w pewnym sensie pomiędzy dwoma opisanymi powyżej. Nie wydaje mi się jednak, żeby występował on w żadnym innym, popularnym języku. Chodzi o metody klasowe (class methods). Nazywają się tak, bo są wywoływane na rzecz całej klasy (a nie jakiejś jej instancji) i przyjmują ową klasę jako swój pierwszy parametr. (Argument ten jest często nazywany cls, ale jest to o wiele słabsza konwencja niż ta dotycząca self).
W celu odróżnienia od innych rodzajów, metody klasowe oznaczone są dekoratorem @classmethod:

  1. @classmethod
  2. def from_other(cls, counter):
  3.     return cls(counter._value)

Podobnie jak metody statyczne, można je wywoływać na dwa sposoby – przy pomocy klasy lub obiektu – ale w obu przypadkach do cls trafi wyłącznie klasa. Tutaj akurat będzie to Counter, lecz w ogólności może to być także klasa pochodna:

  1. class BoundedCounter(Counter):
  2.     MAX = 100
  3.  
  4.     def __init__(self, value = 0):
  5.         if value > self.MAX:
  6.             raise ValueError, "Initial value cannot exceed maximum"
  7.         super(BoundedCounter, self).__init__(value)
  8.  
  9.     def increment(self, by = 1):
  10.         super(BoundedCounter, self).increment(by)
  11.         self._value = min(self._value, self.MAX)
  12.  
  13.     @classmethod
  14.     def from_other(cls, counter):
  15.         value = min(counter._value, cls.MAX)
  16.         return cls(value)

Powyższy kod – będący przykładem dodatkowego sposobu inicjalizacji obiektu – to dość typowy przypadek użycia metod klasowych. Korzystanie z nich wymaga aczkolwiek nieco wprawy w operowaniu pojęciami instancji klasy i samej klasy oraz ich poprawnego rozróżniania.
W gruncie rzeczy nie jest to jednak nic trudnego.

Tags: , , ,
Author: Xion, posted under Programming » 4 comments
 


© 2023 Karol Kuczmarski "Xion". Layout by Urszulka. Powered by WordPress with QuickLaTeX.com.