Over the months and years I’ve been coding in Python, I’ve created quite a few Python packages: both open source and for private projects. Even though their most important part was always the code, there are numerous additional files that are necessary for a package to correctly serve its purpose. Rather than being part of the Python language, they relate more closely to the Python platform.
But if you look for any definitive, systematic information about them, at best you will find scattered pieces of knowledge in various unrelated places. At worst, the only guidance comes in the form of a multitude of existing Python package sources, available on GitHub and similar sites. Parroting them is certainly an option, although I believe it’s much more advantageous to acquire a firm understanding of how those different cogs fit together. Without it, following modern Python development best practices – which are all hugely beneficial – is largely impossible.
So, I want to fill this void by outlining the structure of a Python package as completely as possible. You can follow it as a step-by-step guide when creating your next project, or just skim through it to see what you’re missing and whether it’d be worthwhile to address such gaps. Any additional element or file will usually provide some tangible benefit, though of course not every project requires all the bells and whistles.
Without further ado, let’s see what’s necessary for a complete Python software bundle.
So, I got myself a new laptop.
The main reason was that I wanted a more powerful and – most importantly – slightly bigger portable computer. Up until now I had used a cute 11.6″ machine that claimed to be a gaming laptop but worked pretty well as an all-around development kit. The various trials and tribulations I had to overcome to make Ubuntu work reasonably well on this thing (it’s officially “not supported”) significantly increased my skills in tweaking Linux. And sometimes things worked so well that I actually managed to accomplish some work!
Nevertheless, the small size started to irk me quite a bit; the (small) additional mobility just wasn’t worth it. So this time I went for something just slightly bigger, but with a lot more pixels.
And so, I got a 13″ Retina display MacBook Pro. Admittedly, I was a bit reluctant to be a semi-early adopter here, because the way increased resolution works on these screens is a bit confusing. I mean, it’s apparently perfectly natural to almost anyone: things look nicer, end of story. However, for someone who remembers how, back in the day, the difference between, say, 800×600 and 1024×768 made such a huge impact on UI scaling, the Retina’s quadrupling of the pixel count may sound pretty scary.
Just recall that the standard width for many website layouts is still around 960px, which translates to a little more than 1/3 (!) of the Retina display’s width. Does that mean the web now comes as big slabs of wasted whitespace with a tiny column of content in between?…
Not really, as it turns out. By default, Retina cheats: the real (millimeter) size of UI elements is still roughly the same as on a normal 13″ display running something around 1280×800. For typical GUI applications involving standard components and some text rendering, it’s indeed just making the interface sharper and more vivid. For pixel-perfect apps (such as games with a fixed resolution), the default solution seems to be stretching them proportionally; things might not look as nice then, but they still work well.
Where the Retina display really shines is any serious text “processing”, be it reading websites, writing articles or – of course – programming. The additional level of detail might not be noticeable at first, but the difference becomes apparent when you look again at a screen with lower pixel density. There is still some way to go before even the smallest details a sharp eye can notice are rendered fluidly, but it’s a pretty short way.
I just shudder to think what resolution is needed to replicate the same sensation on a 27″ or 30″ monitor :)
What about the operating system, though, the glorified OS X?
Besides handling that precious little screen very well – which cannot be said of some other systems – I don’t actually have much to say about it. With the rampant scavenging of UX concepts that goes back and forth between today’s platforms, the differences in the look & feel of their graphical interfaces are mostly superficial. Whatever sits in the upper-right corner of your desktop – be it a half-bitten apple, a rotated square, or a circle with dots – is unlikely to dictate the shape of your UI experience.
…Once you move the Dock to its proper position on the side, that is.
Under the hood, OS X is just a *nix – some say even more POSIX-y than Linux currently is. This makes it a viable native choice for most developers, while the rest (i.e. those working with Microsoft products) can be accommodated via outstanding virtualization options. But all this goodness doesn’t come without a few caveats.
Probably the biggest one is a horrendous functionality gap: the lack of a built-in package manager and installer. Life without apt-get really sucks, and the bottom-up effort coalesced into Homebrew cannot quite make up for it. I was especially appalled when I had to revert to the old google-download-unpack method of installing new programs. Amazingly, the Mac App Store is still mostly useless some two years after its inception.
Although I’m readily pointing out various quirks of the OS X platform here, I must say I’m not particularly concerned with them in the long term. I do not intend the Mac to become my primary system of choice, especially for development purposes. Its goal is to serve as a handy portable computer, while simultaneously providing access to the third important platform for addressing any testing needs.
But all of that is beside the most important perk: finally being able to visit those trendy coffee shops, of course! ;-)
There is a specific technology I’ve wanted to play around with for some time now: node.js. It also happens that I think the best way to get to know new stuff is to create something small but complete and functional. Note that by ‘functional’ I don’t really mean ‘practical’; that distinction is pretty important, given what I’m about to present here.
Basically, I wrote a package manager for jQuery. The idea was to have a straightforward way to install jQuery plugins – a way that somewhat mirrors the experience of dozens of other package managers, from pip to cabal. The end result looks pretty decent, in my opinion:
The funny part? It doesn’t use any central, remote registry of plugins. What it does is search GitHub and pull code directly from there – provided it is able to find something relevant that looks like a jQuery plugin. That seems to work well for quite a few popular ones, which is rather surprising given how silly and simplistic the underlying algorithm is. Certainly, there’s plenty of room for improvement, including support for jquery.json manifests – the future standard for the upcoming official plugin site.
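Just to illustrate the general idea – and emphatically not jqpm’s actual algorithm – a rough Python sketch of “ask GitHub for repositories that look like the plugin and take the best match” could use GitHub’s public search API (the heuristic query and the example plugin name are made up):

    import json
    import urllib.parse
    import urllib.request

    def find_jquery_plugin(name):
        # Hypothetical heuristic: search repositories for "jquery plugin <name>"
        # and return the clone URL of the most relevant hit, if there is one.
        query = urllib.parse.quote("jquery plugin " + name)
        url = "https://api.github.com/search/repositories?q=" + query
        with urllib.request.urlopen(url) as response:
            results = json.load(response)
        items = results.get("items", [])
        return items[0]["clone_url"] if items else None

    print(find_jquery_plugin("flot"))  # e.g. a git URL the plugin could be cloned from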
As I said before, though, the main purpose of jqpm was an educational one. After toying with the underlying technologies for a couple of evenings, I definitely have a better perspective to evaluate their usefulness. While the topic might warrant a follow-up post in the future, I think I can briefly summarize my findings in a few bullet points:
- The language itself is still mostly plain JavaScript: code is written with functions and loops, and without denser syntactic sugar such as list comprehensions.
- The module system, built around require() calls, makes for an unusual setup that resembles classic C/C++ #includes – but improved, of course. What stands out the most is the lack of virtualenv/rvm-style utilities; an equivalent approach of local node_modules subfolders is used instead (npm faq and npm help folders provide a more elaborate explanation of how exactly it works).

The bottom line: node.js is definitely not a cancer and has many legitimate uses, mostly pertaining to rapid transfer of relatively small pieces of data over the Internet. API backends, single-page web applications or certain game servers all fall easily into this category.
From a developer’s point of view, it’s also quite a fun platform to code in, despite the asynchronous PITA mentioned above (which is partially alleviated by libraries like async.js or frameworks providing futures/promises). On the overall abstraction ladder, I think it can be placed noticeably lower than Java and not much higher than plain C. That place is an interesting one, and it’s not densely populated by similar technologies and languages (only Go and Objective-C come to mind). Occupying this mostly overlooked niche could very well be one of the reasons for Node’s recent popularity.
…for fun and profit!
I’m still kind of amazed at how malleable the Python language is. It’s no small feat to allow for messing with classes before they are created, yet it turns out to be pretty commonplace now. My latest frontier of pythonic hackery is import hooks, and today I’d like to write something about them. I believe this will come in handy for at least a few pythonistas, because the topic seems to be rather scarcely covered on the ‘net.
As you can easily deduce, the name ‘import hook’ indicates something related to Python’s import mechanism. More specifically, import hooks are about injecting custom logic directly into Python’s importing routines. Before delving into details, though, let’s review how imports are handled by default.
As far as we are concerned, the process seems pretty simple. When the Python interpreter encounters an import statement, it goes through the list of directories stored in sys.path. This list is populated at startup and usually contains entries inserted by external libraries or the operating system, as well as some standard directories (e.g. dist-packages). These directories are searched in order and in a greedy fashion: if one of them contains the desired package/module, it’s picked immediately and the whole process stops right there.
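For instance, a quick peek at that list from the interpreter (the exact entries will of course differ between installations, and the extra directory below is made up):

    import sys

    # Directories are searched in this exact order; the first one containing
    # the requested module wins.
    print(sys.path)

    # Prepending a directory makes it take precedence over everything after it.
    sys.path.insert(0, "/tmp/my_overrides")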
Should we run out of places to look, an ImportError is raised. Because this is an exception we can catch, it’s possible to try multiple imports before giving up:
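A minimal sketch of such a fallback chain, using the json example mentioned below:

    try:
        import json
    except ImportError:
        # Python 2.5 and older had no json in the standard library;
        # fall back to the externally installed simplejson package.
        import simplejson as json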
While this is extremely ugly boilerplate, it serves to greatly increase the portability of our application or package. Fortunately, there is only a handful of worthwhile libraries that we may need to handle this way; json is the most prominent example.
__path__
What I presented above as Python’s import flow is a sufficient description for most purposes, but it is far from complete. It omits a few crucial places where we can tweak things to our needs.
First is the __path__ attribute, which can be defined in a package’s __init__.py file. You can think of it as a local extension to the sys.path list that works only for submodules of this particular package. In other words, it contains the directories that should be searched when a submodule of the package is being imported. By default it contains only the directory of the __init__.py itself, but it can be extended to hold other paths as well.
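A quick way to see this in action from the interpreter (foo stands for any installed package, and the extra directory is made up):

    import foo

    # By default: a one-element list with the directory containing foo/__init__.py.
    print(foo.__path__)

    # After this, "import foo.bar" will also look for bar inside /opt/foo_extras.
    foo.__path__.append("/opt/foo_extras")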
A typical use case here is splitting a single “logical” package between several “physical” packages distributed separately – typically as different PyPI distributions. For example, let’s say we have a foo package with foo.server and foo.client as subpackages. They are registered in PyPI as separate distributions (foo-server and foo-client, for instance), and the user can have either or both of them installed at the same time. For this setup to work correctly, we need to modify foo.__path__ so that it points to foo.server’s directory and/or foo.client’s directory, depending on which of them are present. While this task may sound exceedingly complex, it is actually very easy thanks to the standard pkgutil module. All we need to do is put the following two lines into the foo/__init__.py file:
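In all likelihood those two lines are the standard extend_path idiom from pkgutil:

    from pkgutil import extend_path
    # Add matching foo/ subdirectories from every sys.path entry, so each
    # separately installed piece of the package gets picked up.
    __path__ = extend_path(__path__, __name__)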
There is much more to __path__ manipulation than this simple trick, of course. If you are interested, I recommend reading the issue of Python Module of the Week devoted solely to pkgutil.
sys.meta_path and sys.path_hooks
Moving on, let’s focus on the parts of the import process that let you do the truly amazing things. Here I’m talking about stuff like pulling modules directly from Zip files or remote repositories, or creating them dynamically based on, say, a WSDL description of a web service, symbols exported by DLLs, REST APIs, command-line tools and their arguments… pretty much anything you can think of (and your imagination is likely better than mine). I’m also referring to “aggressive” interoperability between independent modules: when one package can adjust or expand its functionality once it detects that another one has been imported. Finally, I’m also talking about security-enhanced Python sandboxes that intercept import requests and can deny access to certain modules or alter their functionality on the fly.
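To make this a bit less abstract, here is a minimal sketch of a sys.meta_path hook. It uses today’s importlib classes (find_spec/exec_module) rather than the older PEP 302 find_module/load_module protocol, and the virtual_module it conjures up – along with its single attribute – is made up purely for illustration:

    import sys
    from importlib.abc import Loader, MetaPathFinder
    from importlib.machinery import ModuleSpec

    class VirtualModuleLoader(Loader):
        def create_module(self, spec):
            return None  # use the default, empty module object

        def exec_module(self, module):
            # Populate the module dynamically instead of executing a .py file.
            module.answer = 42

    class VirtualModuleFinder(MetaPathFinder):
        def find_spec(self, fullname, path, target=None):
            if fullname == "virtual_module":
                return ModuleSpec(fullname, VirtualModuleLoader())
            return None  # defer to the regular import machinery for anything else

    sys.meta_path.insert(0, VirtualModuleFinder())

    import virtual_module
    print(virtual_module.answer)  # -> 42

A finder registered this way is consulted before the regular sys.path machinery, which is exactly what makes the zip-file, remote-repository and sandboxing scenarios above possible.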