Deprecate This

2013-05-26 14:02

As a fact of life, in bigger projects you often cannot just delete something – be it a function, method, class or module. Replacing all its usages with whatever the new recommendation is – if any! – is typically outside of your influence, capabilities or priorities. By no means should it be treated as a lost cause, though; any codebase would quickly be overwhelmed by kludges if there were no way to jettison them.

To reconcile those two opposing needs – compatibility and cleanliness – the typical approach involves a transition period. During that time, the particular piece of API is marked as deprecated, which is a slightly theatrical term for ‘obsolete’ and ‘not intended for new code’. How effective this is depends strongly on the target audience – for publicly available APIs, someone will always wake up and start screaming when the transition period ends.

For in-project interfaces, however, the blow may be effectively cushioned by using certain features of the language, IDE, source control, continuous integration, and so on. As an example, Java has the @Deprecated annotation that can be applied to functions or classes:

    public class Foo {
        /**
         * @deprecated Use FooFactory instead
         */
        @Deprecated
        public static Foo create() {
            return new Foo();
        }
    }

If the symbol is then used somewhere else, it produces a compiler warning (and a visual cue in most IDEs). These can be suppressed, of course, but it’s something you need to do explicitly through a complementary language construct.

So I had this idea to try and add a similar mechanism to Python. One part of it is already present in the standard library: we have the warnings module and a built-in DeprecationWarning category. These warnings can be ignored, suppressed, caught or even turned into errors.
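To make those options concrete, here is a minimal sketch of catching, ignoring and escalating a DeprecationWarning. The old_api function is a made-up placeholder, not something from the code discussed in this post:

```python
import warnings


def old_api():
    # Hypothetical function, used only for this illustration.
    warnings.warn("old_api is deprecated", DeprecationWarning)
    return 42


# Catch: record warnings in a list instead of printing them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure nothing is filtered out
    result = old_api()

# Ignore: the warning is dropped before it reaches the list.
with warnings.catch_warnings(record=True) as ignored:
    warnings.simplefilter("ignore", DeprecationWarning)
    old_api()

# Escalate: the warning is raised as an exception.
escalated = False
with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    try:
        old_api()
    except DeprecationWarning:
        escalated = True

print(len(caught), len(ignored), escalated)
```

The catch_warnings context manager restores the previous filter settings on exit, so each experiment stays isolated from the others.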
They are also pretty powerful, as they let you deprecate certain code paths and not just symbols, which can be useful when introducing new meanings for function parameters, among other things. At the same time, it means that using them is irritatingly imperative and adds clutter:

    import warnings

    class Foo(object):
        def __init__(self):
            warnings.warn("Foo is deprecated", DeprecationWarning)
            # ... rest of Foo constructor ...

And in this particular case, it also doesn’t work as intended, for reasons that will become apparent later on.
What we’d like instead is something similar to the annotation approach available in Java:

    @deprecated
    class Foo(object):
        # ...

Given that the @-things in Python (decorators, that is) are significantly more powerful than the Java counterparts, it shouldn’t be a tough call to achieve this…

Surprisingly, though, it turns out to be very tricky and quite arcane. The problems lie mostly in the subtle issues of what exactly constitutes “usage” of a symbol in Python, and how to actually detect it. If you try to come up with a few solutions, you’ll soon realize how the one that may eventually require walking through the interpreter call stack turns out to be the least insane one.

But hey, we didn’t go to the Moon because it was easy, right? ;) So let’s see how at least we can get started.

We will program our @deprecated decorator as a class; this will eventually make it easier to divide the logic into smaller pieces, since they can be put into methods. We also limit the scope of the task and will only decorate classes. (Handling functions is even simpler and thus left as, ahem, an exercise for the reader.)
Our idea is to replace the constructor (the __init__ method) of a given class with an almost identical function, except for one added statement that issues a DeprecationWarning. The “new” constructor will of course call the “old” one, so that nothing in the object’s behavior and state visibly changes.

Where are they?

Here’s the code for our first version:

    import inspect
    import functools
    import warnings

    class _DeprecatedDecorator(object):

        def __call__(self, symbol):
            if not inspect.isclass(symbol):
                raise TypeError("only classes can be @deprecated")
            return self._wrap_class(symbol)

        def _wrap_class(self, cls):
            previous_ctor = cls.__init__

            @functools.wraps(previous_ctor)
            def new_ctor(*args, **kwargs):
                self._warn(cls.__name__)
                return previous_ctor(*args, **kwargs)

            cls.__init__ = new_ctor
            return cls

        def _warn(self, name):
            warnings.warn("%s is deprecated" % name, DeprecationWarning)

    # actual decorator will be the (only) instance of above class
    deprecated = _DeprecatedDecorator()
    del _DeprecatedDecorator

But if we test it:

    @deprecated
    class Foo(object):
        pass

    Foo()

we’ll most likely notice that no warnings are actually issued. Turns out that if we don’t specify otherwise, Python 2.7 will silently ignore all DeprecationWarnings. We need to adjust the settings ourselves, then, and the best way is to un-ignore only our own instances of these warnings (as opposed to all DeprecationWarnings, which may suddenly e.g. reveal hidden issues in libraries we depend on):

    class _DeprecatedDecorator(object):
        MESSAGE = "%s is @deprecated"

        def __call__(self, symbol):
            if not inspect.isclass(symbol):
                raise TypeError("only classes can be @deprecated")

            warnings.filterwarnings('default',
                                    message=self.MESSAGE % r'\w+',
                                    category=DeprecationWarning)
            return self._wrap_class(symbol)

        # ...

        def _warn(self, name):
            warnings.warn(self.MESSAGE % name, DeprecationWarning)

With filterwarnings, we specify the format of messages in our DeprecationWarnings. This way, those warnings (and only those) will be filtered in the 'default' way which includes outputting them to stderr – exactly the thing that we want.
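To see the effect of such a filter in isolation, here is a small sketch. It assumes we start from a state where all DeprecationWarnings are ignored (set up explicitly below), and the second warning message is an invented stand-in for something a third-party library might emit:

```python
import warnings

MESSAGE = "%s is @deprecated"

with warnings.catch_warnings(record=True) as shown:
    # Start from the "everything ignored" state described above.
    warnings.simplefilter("ignore", DeprecationWarning)
    # Filters added later are checked first, so this one wins
    # for every message that matches the pattern.
    warnings.filterwarnings("default",
                            message=MESSAGE % r"\w+",
                            category=DeprecationWarning)

    warnings.warn(MESSAGE % "Foo", DeprecationWarning)
    # Hypothetical warning from a dependency; it should stay hidden.
    warnings.warn("some_library.thing is going away", DeprecationWarning)

messages = [str(w.message) for w in shown]
print(messages)  # → ['Foo is @deprecated']
```

The message argument is a regular expression matched against the beginning of the warning text, which is why substituting \w+ for the class name is enough to cover every decorated class.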

Not the frame you are looking for

Or almost the thing:

    /home/xion/experiments/deprecated/ DeprecationWarning: Foo is @deprecated

The text we get shows a place in code (module path and line number) where the warning was issued. But it’s clearly the wrong place: it’s inside the decorator’s code itself! It looks like what we are deprecating is really the call to warnings.warn, which sounds completely silly even if slightly meta.

This happens because we haven’t told the warnings module that the _warn method is just wrapper code. The appropriate place for the warning’s origin is somewhere among the callers of that code. In more technical terms, the warning should be raised from an earlier stack frame, and this requires using the special stacklevel parameter:

    def _warn(self, name):
        warnings.warn(self.MESSAGE % name, DeprecationWarning,
                      stacklevel=3)

If you look at how the warnings.warn function is called, you’ll notice that it’s under two levels of our decorator’s code: the new constructor (new_ctor) and the _warn method. This means that the actual usage of the decorated symbol happens at the third frame from the top of the call stack. Hence stacklevel=3. (Yes, frames are counted from 1. No, I have no idea why.)
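The frame counting can be checked with a stripped-down sketch that mimics the same two levels of wrapper code using plain functions. The user_code function is an invented stand-in for whatever instantiates the deprecated class:

```python
import warnings


def _warn(name):
    # Level 1 is this function, level 2 is new_ctor, level 3 is its caller.
    warnings.warn("%s is @deprecated" % name, DeprecationWarning,
                  stacklevel=3)


def new_ctor():
    _warn("Foo")


def user_code():
    new_ctor()  # the warning should be attributed to this line


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    user_code()

reported = caught[0]
print(reported.lineno == user_code.__code__.co_firstlineno + 1)
```

With stacklevel=1 the reported line would be the warnings.warn call itself, and with stacklevel=2 it would be the call inside new_ctor.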

Picking from the stack

So we have introduced some magic number (3) into our code. But this is not the worst thing, if we have correctly identified the real place of “usage” for the @deprecated symbol. It may very well not be the case, though: even after getting out of the decorator’s code, we may still be deep in the woods.

In many non-trivial applications, our own code is often called by outside logic and external libraries. Any kind of event handling – such as responding to user input in GUI programs – typically falls into that category. As a result, the call stack may have our code interleaved with “foreign” code of the GUI framework or similar.

It can get especially convoluted in one of Python’s prime domains: web applications. They are typically built on top of a WSGI-compliant server that calls into our web framework of choice; this means the bottom levels of the call stack are filled with foreign frames.
Then, the framework dispatches HTTP requests to our own handlers, which in turn may invoke a templating engine to render bits and pieces of HTML. And that could involve calling our code again, if simply for the __str__/__unicode__ routines of objects we’ve passed as template arguments, custom extensions and template filters notwithstanding.

Tell me, where in this spaghetti can you see an obvious place to say that this is where a particular symbol Foo is really used? Unless I’m missing something evident, an acceptable solution is therefore very likely to be at least somewhat project-specific.

But I can give you some assistance at least when it comes to walking and inspecting the call stack. In native, compiled languages this is very platform-dependent and generally quite a scary prospect. Fortunately, Python provides a reasonable interface to its interpreter’s stack frames in the form of frame objects and a few functions inside the inspect module.
One of them is inspect.currentframe which, as the name states, gives you the current frame, i.e. the frame of that function’s caller. From there, it is possible to follow the frame.f_back pointers to walk down the stack and retrieve its frames:

    def _get_callstack(self):
        frame = inspect.currentframe()
        frame = frame.f_back  # omit this function's frame

        stack = []
        try:
            while frame:
                stack.append(frame)
                frame = frame.f_back
        finally:
            del frame

        return stack

Those frames contain, among other things, a dictionary of local variables and function parameters: frame.f_locals. As a peculiar side effect, it is very easy to create cyclic references by storing frames in local variables. To avoid that, we need to explicitly del them when they are no longer needed, preferably using a try-finally construct like the one above.
Any and all similarities to memory management techniques from C or C++ are very much not a coincidence :)
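As a quick illustration, here is a standalone sketch of the same walk written as a plain function, used to list the names of the functions currently on the stack. It assumes CPython, where inspect.currentframe actually returns a frame; the inner and outer functions are made up for the demonstration:

```python
import inspect


def get_callstack():
    frame = inspect.currentframe()
    frame = frame.f_back  # omit this function's frame

    stack = []
    try:
        while frame:
            stack.append(frame)
            frame = frame.f_back
    finally:
        del frame

    return stack


def inner():
    stack = get_callstack()
    try:
        # Keep only plain strings, so no frame outlives this function.
        return [frame.f_code.co_name for frame in stack]
    finally:
        del stack


def outer():
    return inner()


names = outer()
print(names[:2])  # → ['inner', 'outer']
```

Anything further down the list depends on how the script was launched, which is exactly the kind of foreign code described earlier.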

Finding the source

Now that we’ve got hold of the stack frames, it’s not very hard to tell where they are coming from. The most interesting data point is probably the path to the original source file, which is accessible through the code object associated with the frame:

    filename = frame.f_code.co_filename

Armed with this information, we should be able to tell “our” frames from foreign ones and thus compute a reasonably good place of origin for our DeprecationWarnings:

    import os

    # ...

    def _warn(self, name):
        warnings.warn(self.MESSAGE % name, DeprecationWarning,
                      stacklevel=self._compute_stacklevel())

    def _compute_stacklevel(self):
        this_file, _ = os.path.splitext(__file__)
        app_code_dir = self._get_app_code_dir()

        def is_relevant(filename):
            return filename.startswith(app_code_dir) and not \
                filename.startswith(this_file)

        stack = self._get_callstack()
        stack.pop(0)  # omit this function's frame

        frame = None
        try:
            for i, frame in enumerate(stack, 1):
                filename = frame.f_code.co_filename
                if is_relevant(filename):
                    return i
        finally:
            del frame
            del stack

        return 0

    def _get_app_code_dir(self):
        import myapplication  # root package for the app
        app_dir = os.path.dirname(myapplication.__file__)
        return os.path.join(app_dir, '')  # ensure trailing slash

In practice, this might require further tweaking in order to eliminate all the “boilerplate” stackframes, especially if we hook into libraries and frameworks with our custom extensions and plugins. Like I’ve mentioned above, the exact solution is likely to require some project-specific fiddling.
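One possible shape for such fiddling is sketched below. It works on a plain list of file names (as read from frame.f_code.co_filename) so that it can be tried without a real call stack. The function name, the boilerplate_dirs parameter, the example paths and the fallback value of 1 are all my assumptions for illustration, not part of the decorator above:

```python
import os


def first_relevant_level(filenames, app_code_dir, boilerplate_dirs=()):
    """Return the 1-based index of the first file that lies inside
    the application but outside of any "boilerplate" directory."""
    app_code_dir = os.path.join(app_code_dir, '')  # ensure trailing slash
    skipped = tuple(os.path.join(d, '') for d in boilerplate_dirs)

    for level, filename in enumerate(filenames, 1):
        if not filename.startswith(app_code_dir):
            continue  # foreign code: framework, library, interpreter
        if filename.startswith(skipped):
            continue  # our own code, but only glue
        return level

    return 1  # assumed fallback: blame the immediate frame


# Hypothetical stack, listed from the innermost frame outwards.
app = os.path.join(os.sep, "srv", "myapplication")
stack_files = [
    os.path.join(app, "utils", "deprecation.py"),
    os.path.join(os.sep, "usr", "lib", "framework", "dispatch.py"),
    os.path.join(app, "plugins", "hooks.py"),
    os.path.join(app, "views", "home.py"),
]
level = first_relevant_level(
    stack_files, app,
    boilerplate_dirs=[os.path.join(app, "utils"),
                      os.path.join(app, "plugins")])
print(level)  # → 4
```

Keeping the decision in a pure function like this makes the project-specific rules easy to test separately from the stack walking itself.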

The complete example of @deprecated decorator can be seen in this gist.

Author: Xion, posted under Programming »

© 2017 Karol Kuczmarski "Xion".