jqpm: Package Manager for jQuery

2012-07-07 15:05

There is a specific technology I wanted to play around with for some time now; it’s called node.js. It also happens that I think the best way to get to know new stuff is to create something small, but complete and functional. Note that by ‘functional’ I don’t really mean ‘practical’; that distinction is pretty important, given what I’m about to present here.

Basically, I wrote a package manager for jQuery. The idea was to have a straightforward way to install jQuery plugins – a way that somewhat mirrors the experience of dozens of other package managers, from pip to cabal. End result looks pretty decent, in my opinion:

  1. $ jqpm install flot
  2. [jqpm] flot installed successfully
  3. $ ls *.js
  4. jquery.flot.js

The funny part? It doesn’t use any central, remote registry of plugins. What it does is searching GitHub and pulling code directly from there – provided it is able to find something relevant that looks like jQuery plugin. That seems to work well for quite a few popular ones, which is rather surprising given how silly and simplistic the underlying algorithm is. Certainly, there’s plenty of room for improvement, including support for jquery.json manifests – the future standard for the upcoming official plugin site.

As I said before, though, the main purpose of jqpm was educational one. After toying with underlying technologies for a couple of evenings, I definitely have better perspective to evaluate their usefulness. While the topic might warrant a follow-up posts in the future, I think I can briefly summarize my findings in few bullet points:

  • Node’s JavaScript is almost the same language you can find in your browser, with all of its wats, warts and shortcomings. That’s not a big problem if you already learned to deal with them, but I surely wouldn’t recommend it as starter language for novices. Additionally, it also turns out to be quite verbose language, with all the ubiquitous functions and loops, and without denser syntactic sugar such as list comprehensions.
  • By contrast, the standard library of Node is very nice mixture of usefulness and minimalism. It’s certainly not as rich as Python’s or Java’s, but it’s more than usable, despite sitting a bit on the low level side.
  • The canonical tool for managing dependencies, npm, is rather curious creature. Combined with the way Node resolves require() calls, it makes for an unusual system that resembles classic C/C++ #includes – but improved, of course. What stands out the most is the lack of virtualenv/rvm-style utilities; instead, an equivalent approach of local node_modules subfolders is used instead. (npm faq and npm help folders provide more elaborate explanation on how does it work exactly).
  • The callback-based, asynchronous computation is a big hindrance that doesn’t really seem worthwhile. Intriguingly, the hassles of async vs. sync feel strangely similar to issues with pure vs. impure code in functional languages such as Haskell; in both cases you need some serious refactoring of brainware to start coding effectively. In Haskell, however, you are gaining tremendous boons to correctness, modularization, parallelization and testability. In Node, it’s disputable whether you actually gain anything: the whole idea of I/O based on a single event loop sounds all too similar to what an operating system already does with threads sleeping on I/O calls and hardware interrupts that wake them. Granted, this incarnation of asynchronous I/O is much better than some older ones, but that’s mostly thanks to JavaScript being much better equipped to handle the callback bonanza than plain ol’ C.

The bottom line: node.js is definitely not a cancer and has many legitimate uses, mostly pertaining to rapid transfer of relatively small pieces of data over the Internet. API backends, single page web applications or certain game servers all fall easily into this category.

From developer’s point of view, it’s also quite fun platform to code in, despite the asynchronous PITA mentioned above (which is partially alleviated by libraries like async.js or frameworks providing futures/promises). On the overall abstraction ladder, I think it can be placed noticeably lower than Java and not very much higher than plain C. That place is an interesting one, and it’s also not densely populated by any similar technologies and languages (only Go and Objective-C come to mind). Occupying this mostly overlooked niche could very well be one of reasons for Node’s recent popularity.

Tags: , , , ,
Author: Xion, posted under Internet, Programming, Thoughts » 2 comments

Dictionary with Missing Values

2012-06-29 22:36

When working with dictionaries in Python, or any equivalent data structure in some other language, it is quite important to remember the difference between a key which is not present and a key that maps to None (null) value. We often tend to blur the distinction by using dict.get:

  1. thingie = some_dict.get('key')
  2. if thingie:
  3.     do_stuff_with(thingie)

We do that because more often that not, None and other falsy values (such as empty strings) are not interesting on their own, so we may as well lump them together with the “no value at all” case.

There are some situations, however, where these variants shall be treated separately. One of them is building a dictionary of keyword arguments that are subsequently ‘unpacked’ through the **kwargs construct. Consider, for example, this code:

  1. import re
  2. from bs4 import BeautifulSoup
  4. def find_images_by_text(html, text, in_alt=True, in_title=False):
  5.     """Search for <img> tags 'matching' given text in HTML document."""
  6.     regex = re.compile(".*%s.*" % re.escape(text), re.IGNORE_CASE)
  7.     attrs = {}
  8.     if in_alt:
  9.         attrs['alt'] = regex
  10.     if in_title:
  11.         attrs['title'] = regex
  12.     return BeautifulSoup(html).find_all('img', **attrs)

With a key mapping to None, we’re calling the function with argument explicitly set to None. Without the key present, we’re not passing the argument at all, allowing it to assume its default value.

But adding or not adding a key to dictionary is somewhat more cumbersome than mapping it to some value or None. The latter can be done with conditional expression (x if cond else None), together with many other keys and value at once. The former requires an if statement, as shown above.
Would it be convenient if we had a special “missing” value that could be used like None, but caused the key to not be added to dictionary at all? If we had it, we could (for example) rewrite parts of the previous function that currently contain if branches:

  1. attrs = {'alt': regex if in_alt else missing,
  2.          'title': regex if in_title else missing}

It shouldn’t be surprising that we could totally introduce such a value and extend dict to support this functionality – after all, it’s Python we’re talking about :) Patching the dict class itself is of course impossible, but we can inherit it and come up with something like the following piece:

  1. class MissingDict(dict):
  2.     def __init__(self, **kwargs):
  3.         kwargs = dict((k, v) for (k, v) in kwargs.iteritems()
  4.                       if v is not missing)
  5.         dict.__init__(self, **kwargs)
  7. missing = object()

The magical missing object is only a marker here, used to filter out keys that we want to ignore.

With this class at hand, some dictionary manipulations become a bit shorter:

  1. def find_images_by_text(html, text, in_alt=True, in_title=False):
  2.     regex = re.compile(".*%s.*" % re.escape(text), re.IGNORE_CASE)
  3.     return BeautifulSoup(html).find_all('img', **MissingDict(
  4.         alt=regex if in_alt else missing,
  5.         title=regex if in_title else missing))

We could take this idea further and add support for missing not only initialization, but also other dictionary operations – most notably the __setitem__ assignments. This gist shows how it could be done.

Tags: , ,
Author: Xion, posted under Computer Science & IT » Comments Off on Dictionary with Missing Values

The Principle of Two Hacks

2012-06-18 20:19

At work I’ve got a colleague who displays unusual aptitude in coming up with amusing terminology for everyday (coding) things. Among those, the Cake Pattern is always certain to provoke few laughs.

Recently, though, I heard him mention a great common sense rule for any development decision, from the grand architecture down to single line of code. It can be phrased like this:

One hack is fine. But if you need another one on top of the first, it’s probably high time to really consider what you are doing.

For good measure, I’ll call it a Principle of Two Hacks, and I’m pretty convinced that it’s a very beneficial rule to apply in programming – especially when creating any non-trivial, not throw-away programs.

At first it may sound rather vague, however. It’s concept of a “hack” is not easily explicable, or put into indisputable definitions. But that’s what makes it powerful: we don’t need to invoke elaborate (and often controversial) notions of design patterns or abstraction to be able to discuss them.

At best, the ideas of accidental complexity or technical debt might be somewhat close to what developers typically deem as a hack. In practice, this is mostly an opaque intuition that stems from experience or skill, and it’s usually very hard to express in words. Yet, it’s always apparent whenever we encounter it, even though the exact sensation may vary considerably: from a dim feeling that something is maybe a bit off, up to severe intellectual nausea caused by looking at really bad code.

I also like how extremely pragmatic this principle is. Quick-and-dirty fixes making it into production code are just a fact of life, and we are not prohibited from letting them slip. What we are strongly advised here is to maintain integrity of the software we’re writing, trying not to stack one hack upon another.

But even that is not absolute, nonnegotiable gospel; there might still be valid reasons to loosen up its structure. The important part, however, is to notice when we’re doing something fishy and consciously decide whether or not it is a good idea. It is much better than just plunging forward with total disregard of sanity of future maintainers.

Ultimately, this principle is just subtly telling us to think, and that is never a bad advice.

Tags: , ,
Author: Xion, posted under Programming » 1 comment

Interesting Problem with MySQL & SQLAlchemy

2012-06-12 14:31

While at work a few days ago, I had an interesting albeit weird problem which started with the following cryptic error message:

ERROR 1005: can’t create table `qtn_formdisplay_product` (errno: 150)

It was produced by a local MySQL server running on my development machine when I tried to rebuild test database to accommodate for some model changes happening in the codebase. As you might have noticed, it’s not terribly informative, with the errno number as the only useful tidbit. A cursory glance at top search result for this message said that the most probable cause was a malformed FOREIGN KEY constraint inside the offending CREATE TABLE query.

Upon reading this, I blinked several times; something here was definitely off. The query wasn’t of course written by hand – if it was, we could at least consider an actual mistake to be a problem here. But no, it came from ORM – and not just an ORM, but the best ORM known to mankind. While obviously nothing is perfect, I would think it’s extremely unlikely that I found a serious bug in a widely used library just by doing something as innocent as creating a table with foreign key. I’m not that good, after all ;)

Well, except that it could totally be such a bug. The before mentioned search results also pointed to MySQL issue tracker where it was suggested that the error might happen after trying to create foreign key constraint with duplicate name. Supposedly, this could “corrupt” the parent table and no new FOREIGN KEYs could reference it anymore, yielding the errno 150 if we attempted to create one. While it could not explain the behavior I observed (the parent table was freshly created), it raised some doubts whether MySQL itself may be to blame here.

These were exacerbated when one of my colleagues tried out the same procedure, and it worked for him just fine. He turned out to use newer version of MySQL, though: 5.5 versus 5.1. This appeared to support the hypothesis about a possible bug in MySQL but it didn’t seem to help one bit to get the thing running on the older version.

However, it was an important clue that something relevant changed in between, that had an influence on the whole issue. It was not really any particular bugfix or new feature: it was a change of defaults.

Tags: , , , ,
Author: Xion, posted under Computer Science & IT » 1 comment

The Javascript Functions Which Are Not

2012-06-03 15:25

It would be quite far-fetched to call JavaScript a functional language, for it lacks many more sophisticated features from the FP paradigm – like tail recursion or automatic currying. This puts it on par with many similar languages which incorporate just enough of FP to make it useful but not as much as to blur their fundamental, imperative nature (and confuse programmers in the process). C++, Python or Ruby are a few examples, and on the surface JavaScript seems to place itself in the same region as well.

Except that it doesn’t. The numerous different purposes that JavaScript code uses functions makes it very distinct, even though the functions themselves are of very typical sort, found in almost all imperative languages. Learning to recognize those different roles and the real meaning of function keyword is essential to becoming an effective JS coder.

So, let’s look into them one by one and see what the function might really mean.

A scope

If you’ve seen few good JavaScript libraries, you have surely stumbled upon the following idiom:

  1. /* WonderfulThing.js
  2.  * A real-time, HTML5-enabled, MVC-compatible boilerplate
  3.  * for appifying robust prototypes... or something
  4.  */
  6. (function() {
  7.     // actual code goes here
  8. })();

Any and all code is enclosed within an anonymous function. It’s not even stored in a variable; it’s just called immediately so its content is just executed, now.

This round-trip may easily be thought as if doing absolutely nothing but there is an important reason for keeping it that way. The point is that JavaScript has just one global object (window in case of web browsers) which is a fragile namespace, easily polluted by defining things directly at the script level.

We can prevent that by using “bracketing” technique presented above, and putting everything inside this big, anonymous function. It works because JavaScript has function scope and it’s the only type of non-global scope available to the programmer.

A module

So in the example above, the function is used to confine script’s code and all the symbols it defines. But sometimes we obviously want to let some things through, while restricting access to some others – a concept known as encapsulation and exposing an interface.

Perhaps unsurprisingly, in JavaScript this is also done with the help of a function:

  1. var counter = (function() {
  2.     var value = 0;
  3.     return {
  4.         increment: function(by) {
  5.             value += by || 1;
  6.         },
  7.         getValue: function() {
  8.             return value;
  9.         },
  10.     };
  11. })();

What we get here is normal JS object but it should be thought of more like a module. It offers some public interface in the form of increment and getValue functions. But underneath, it also has some internal data stored within a closure: the value variable. If you know few things about C or C++, you can easily see parallels with header files (.h, .hpp, …) which store declarations that are only implemented in the code files (.c, .cpp).

Or, alternatively, you may draw analogies to C# or Java with their public and private (or protected) members of a class. Incidentally, this leads us to another point…

Object factories (constructors)

Let’s assume that the counter object from the example above is practical enough to be useful in more than one place (a tall order, I know). The DRY principle of course prohibits blatant duplication of code such as this, so we’d like to make the piece more reusable.

Here’s how we typically tackle this problem – still using only vanilla functions:

  1. var createCounter = function(initial) {
  2.     var value = initial || 0;
  3.     return {
  4.         increment: function(by) {
  5.             value += by || 1;
  6.         },
  7.         getValue: function() {
  8.             return value;
  9.         },
  10.     };
  11. };
  12. var counter = createCounter();
  13. var counterFrom1000 = createCounter(1000);

Pretty straightforward, right? Instead of calling the function on a spot, we keep it around and use to create multiple objects. Hence the function becomes a constructor for them, while the whole mechanism is nothing else but a foundation for object-oriented programming.

\displaystyle functions + closures = OOP

We have now covered most (if not all) roles that functions play when it comes to structuring JavaScript code. What remains is to recognize how they interplay with each other to control the execution path of a program. Given the highly asynchronous nature of JavaScript (on both client and server side), it’s totally expected that we will see a lot of functions in any typical JS code.

Tags: , , , , ,
Author: Xion, posted under Programming » Comments Off on The Javascript Functions Which Are Not

Of Languages and Keywords

2012-05-28 18:40

I recently had a discussion with a co-worker about feasibility of using anonymous functions in Python. We happen to overuse them quite a bit and this is not something I’m particularly fond of. For me lambdas in Python are looking pretty weird and thus I prefer to use them sparingly. I wasn’t entirely sure why is it so – given that I’m quite a fan of functional programming paradigm – until I noticed a seemingly insignificant fact.

Namely: the lambda keyword is long. With six letters, it is among the longer keywords in Python 2.x, tied with return, import and global, and beaten only by continue and finally. Quite likely, this is what causes lambdas in Python to stand out and require additional mental resources to process (assuming we’re comfortable enough with the very idea of anonymous functions). The long lambda keyword seems slightly out of place because, in general, Python keywords are short.

Or are they?… Thinking about this, I’ve got an idea of comparing the average length of keywords from different programming languages. I didn’t really anticipate what kind of information would be exposed by applying such a metric but it seemed like a fun exercise. And it surely was; also, the results might provoke a thought or two.

Here they are:

Language Keyword Total chars Chars / keyword
Python 2.7 31 133 4.29
C++03 74 426 5.76
C++11 84 516 6.14
Java 1.7 50 289 5.78
C 32 166 5.19
C# 4.0 77 423 5.49
Go 25 129 5.16

The newest incarnation of C++ seems to be losing badly in this competition, followed by C#. On the other side of the spectrum, Go and Python seem to be deliberately designed to avoid keyword bloat as much as possible. Java is somewhere in between when it comes to sheer numbers of keywords but their average length is definitely on the long side. This could very well be one of the reasons for the perceived verbosity of the language.

For those interested, the exact data and code I used to obtain these statistics are in this gist.

Tags: , , , , ,
Author: Xion, posted under Computer Science & IT » 8 comments

DreamPie with virtualenv

2012-05-10 20:44

If you haven’t heard about it, DreamPie is an awesome GUI application layered on top of standard Python shell. I use it for elaborate prototyping where its multi-line input box is a significant advance over raw, terminal UX of IPython.

However, up until recently I didn’t know how to make DreamPie cooperate with virtualenv. Because it’s a GUI program, I scoured its menu and all the preference windows, searching for any trace of option that would allow me to set the Python executable. Having failed, I was convinced that authors didn’t think about including it – which was rather surprising, though.

But hey, DreamPie is open source! So I went to look around its code to see whether I can easily enhance it with an ability to specify Python binary. It wasn’t too long before I stumbled into this vital fragment:

  1. def main():
  2.     usage = "%prog [options] [python-executable]"
  3.     version = 'DreamPie %s' % __version__
  4.     parser = OptionParser(usage=usage, version=version)
  5.     # ...
  6.     opts, args = parser.parse_args()

The conclusions we could draw from this anecdote are thereby as follows:

  • It is indeed true that source code is often the best documentation…
  • …especially for open source programs where actual docs often suck.

With this newfound knowledge about dreampie arguments, it wasn’t very hard to make it use current virtualenv:

  1. $ dreampie $(which python)

And after doing some more research, I ended up adding the following line to my ~/.bash_aliases:

  1. alias dp='(dreampie $(which python) &>/dev/null &)'

Now I can simply type dp to get a DreamPie instance operating within current virtualenv but independent from terminal session. Very useful!

Tags: , , , ,
Author: Xion, posted under Applications, Programming » Comments Off on DreamPie with virtualenv

© 2017 Karol Kuczmarski "Xion". Layout by Urszulka. Powered by WordPress with QuickLaTeX.com.