Archive for Applications

Quick-and-dirty SMTP Server For Debugging

2013-12-27 11:51

Every computer program expands until it can read e-mail – or so they say. But many applications need not to read, but to send e-mails; web apps or web services are probably the most prominent examples. If you happen to develop them, you may sometimes want a local, dummy SMTP server just for testing this functionality. It doesn’t even have to send anything (it must not, actually), but it should allow you to see what would be sent if the app worked in a production environment.

By far the easiest way to setup such a server involves, quite surprisingly, Python. There is a standard library module called smtpd, which is built exactly for this purpose. Amusingly, you don’t even have to write any code that uses it; you can invoke it straight from the command line:

  1. $ python -m smtpd -n -c DebuggingServer

This will start a server that listens on port 8025 and dumps every message “sent” through it to the standard output. A custom port is chosen because on *nix systems, only the ports above 1024 are accessible to an ordinary user. For the standard SMTP port 25, you need to start the server as root:

  1. $ sudo python -m stmpd -c DebuggingServer localhost:25

While it’s more typing, it frees you from having to change the SMTP port number inside your application’s code.

If you plan to use smtpd more extensively, though, you may want to look at the small runner script I’ve prepared. By default, it tries to listen on port 25, but you can supply a port number as its sole argument.

Tags: , , ,
Author: Xion, posted under Applications, Internet, Programming » Comments Off on Quick-and-dirty SMTP Server For Debugging

The Subshell Gotcha

2013-05-20 13:13

Many are the quirks of shell scripting. Most are related to confusing syntax, but some come from certain surprising semantics of Bash as a language, as well as the way scripts are executed.
Consider, for example, that you’d like to list files that are within certain size range. This is something you cannot do with ls alone. And while there’s certainly some awk incantation that makes it trivial, let’s assume you’re a rare kind of scripter who actually likes their hacks readable:

  1. #!/bin/sh
  2.  
  3. min=$1
  4. max=$2
  5.  
  6. ls | while read filename; do
  7.     size=$(stat -f %z $filename)
  8.     if [ $size -gt $min ] && [ $size -lt $max ]; then
  9.         echo $filename
  10.     fi
  11. done

So you use an explicit while loop, obtain the file size using stat and compare it to given bounds using a straightforward if statement. Pretty simple code that shouldn’t cause any troubles later on… right?

But as your needs grow, you find that you also want to count how many files fall within your range, and how many do not. Given that you have an explicit if, it appears like a simple addition (in quite literal sense):

  1. matches=0
  2. misses=0
  3. ls | while read filename; do
  4.     size=$(stat -f %z $filename)
  5.     if [ $size -gt $min ] && [ $size -lt $max ]; then
  6.         echo $filename
  7.         ((matches++))
  8.     else
  9.         ((misses++))
  10.     fi
  11. done
  12.  
  13. echo >&2 "$matches matches"
  14. echo >&2 "$misses misses"

Why it doesn’t work, then? Because clearly this is not the output we’re looking for (ls_between is our script here):

  1. $ ls -al
  2. total 25296
  3. drwxrwxr-x  19 xion  staff       646 15 Apr 18:44 .
  4. drwxrwxr-x  15 xion  staff       510 20 May 11:15 ..
  5. -rw-rw-r--   1 xion  staff        16 10 May  2012 hello.py
  6. -rw-rw-r--   1 xion  staff      4005 28 May  2012 keyword_stats.py
  7. -rw-rw-r--   1 xion  staff       218  5 Aug  2012 magical.py
  8. -rw-rw-r--   1 xion  staff     19901 11 May  2012 space_invaders.py
  9. $ ls_between 1024 10241024
  10. keyword_stats.py
  11. space_invaders.py
  12. 0 matches
  13. 0 misses

It seems that neither matches nor misses are counted properly, even though it’s clear from the printed list that everything is fine with our if statement and loop. Wherein lies the problem?

Tags: , , , ,
Author: Xion, posted under Applications, Programming » 2 comments

When and What to Bash

2013-03-06 12:53

Often I advocate using Python for various automation tasks. It’s easy and powerful, especially when you consider how many great libraries – both standard and third party – are available at your fingertips. If asked, I could definitely share few anecdotes on how some .py script saved me a lot of hassle.

So I was a bit surprised to encounter a non-trivial problem where using Python seemed like an overkill. What I needed to do was to parse some text documents; extract specific bits of information from them; download several files through HTTP based on that; unzip them and place their content in designated directory.

Nothing too fancy. Rather simple stuff.

But then I realized that doing all this in Python would result in something like a screen and a half of terse code, full of tedious minutiae.
The parsing part alone would be a triply nested loop, with the first two layers taken by os.walk boilerplate. Next, there would be the joys of urllib2; heaven forbid it turns out I need some headers, cookies or authentication. Finally, I would have to wrap my head around the zipfile module. Oh cool, seems like some StringIO glue might be needed, too!

Granted, I would probably use glob2 for walking the file system, and definitely employ requests for HTTP work. And thus my little script would have external dependencies; isn’t that making it a full-blown program?…

Hey, I didn’t sign up for this! It was supposed to be simple. Why do I need to reimplement grep and curl, anyway? Can’t I just…

…oh wait.

Tags: , , ,
Author: Xion, posted under Applications, Programming » Comments Off on When and What to Bash

Infinitely Extensible Projects

2012-10-23 14:10

When diving into new language, or radically different framework, it may be a good idea to have a bigger project where you can apply your newfound skills. In my experience, this is typically better than having a lot of smaller ones, because it minimizes the hassle of project’s initial setup. Therefore, it encourages you to experiment more.

To reap the largest benefits of this approach, the project of choice should exhibit two important properties:

  • It must be easy to get started. Note that it doesn’t necessarily mean a complete programming newbie should be able to code the initial scaffolding in one session. But getting started must be easy for you specifically, so that you can dabble in interesting stuff almost right away.
    What it means exactly is dependent on your overall coding experience, and also on comparative difficulty of whatever you are trying to learn. For those taking their first steps in programming as a whole, extending their first Hello world program might be appropriate. However, if you are learning your fourth of fifth language you can aim for something a tad more ambitious.
  • It should offer practically infinite possibilities of extension. The idea is for the project to grow along with skills and knowledge you acquire, enabling you to try ever more things.
    In normal development, this typically results in feature creep and is best avoided. But experimental, exploratory, educational coding does not really have to be concerned with such notions. Of course, if you can pack your learning experiences into a usable program then double kudos to you.

What types of projects fit into this characteristic? I’d say quite a lot of them.

When I was honing my Python skills, I started programming an IRC bot so that I could cram a few ideas into it rather quickly. They were implemented mostly as commands that users could input and have the bot perform some actions, like searching Wikipedia for any given term.
A similar pattern (collection of mostly independent commands) can be realized in many different scenarios. Aspiring web programmers could come up with something like a YubNub clone (bonus points if it allows users to add their own commands). Complete coding novices would probably have to resort to simple, menu-based programs in terminal instead.

Another option is to attack a problem which is a very broad and/or vague. Text editors, for example, fuel countless discussions (even wars) over what functionality should they contain and how it should be accessible in the UI. Chances are slim that your take on the problem sprouts a new Emacs or Vim, but a home-brewed editor is easy enough to start and obviously extensible, almost without limits. Additionally, editors can fit into pretty much any environment, from terminals to desktop UIs or HTML5 applications.

Some endeavors are a bit more specific, though. In web development, a CMS or blogging engine became something of a timeless classic now. Everyone has written one at some point, and there’s a lot of additional (thought not always useful) functionality that can be added to it. Getting the basics right is also a challenge here, especially from security standpoint.

For mobile app creators, the infamous To-Do list app is an idea exercised ad nauseam. But it’s actually a good playground for toying with various device capabilities (e.g. location-based reminders) or web services (like Google or iCloud calendar).

More?…

I’m pretty sure I’m far from exhausting the list of possibilities here. I cannot really speak for domains I have little-to-no experience with, for example embedded or hardware-oriented programming with equipment such as Arduino.

It should be possible to come up with infinitely extensible projects for almost every environment and platform, though. After all, every program has always one more feature to add ;)

Tags: ,
Author: Xion, posted under Applications, Programming » Comments Off on Infinitely Extensible Projects

Yummy YAML

2012-10-15 19:48

It won’t be a stretch to bet that you’ve heard of XML. The infamous markup format was intended to be easily parseable by machines, in addition to being readable by humans. Needless to say, it failed to deliver on either of these promises. Markup elements tend to obscure the actual data, while parsing it – with all its namespaces, !DOCTYPEs and ![CDATA[s – is convoluted and not exactly efficient.

Other formats have thus risen to popularity, out of which JSON is probably widest known and used. It also has excellent support on many platforms.

For the purpose of transporting data between Internet endpoints, or for various APIs offered by websites and services, it works pretty great. There are other applications, though, where it might be the obvious first choice – but not necessarily the best one.

What JSON does well is ease to read and parse: the syntax can be outlined and defined in few paragraphs. However, at the same it’s not that convenient to write.

Unlike actual JavaScript, it needs quotes around keynames, even if they would pass as identifiers. (And by quotes I mean double-quotes, also where apostrophes would be better). Furthermore, it requires keeping track of separators between key-value pairs or array elements, not allowing to have a handy trailing comma at the end.

And lastly, there is no good support for longer texts due to lack of multi-line string literals.

Enter YAML

There is another, lesser known format which addresses these concerns very well. It’s called YAML, recursively from YAML Ain’t Markup Language. Here’s a short sample:

  1. name: John Smith
  2. birth date:
  3.   day: 10
  4.   month: 11
  5.   year: 1982
  6. phone:
  7. - type: mobile
  8.   number: 1123581321
  9. - type: work
  10.   number: 0918237465
  11. friends:
  12. - Robert Jones
  13. - Jane Taylor

At first sight it probably appears as pythonic (or haskellish) counterpart to the C-like JSON. Indentation does indeed matter, at least in the most popular and powerful variant of YAML syntax.

However, giving this bit of significance to whitespace allows to substantially reduce syntactic clutter. There are no curly or square brackets, and no commas. For the most part, there’s not much use for quotes either: strings are easily recognized as both keys and values, even if they contain spaces. Arrays are also supported in a straightforward way: by listing their elements with a leading dash (-), including cases where they are key-value objects themselves.

There’s even support for long strings that span multiple lines:

  1. almost_limerick: |  
  2.     There once was a man from the sticks
  3.     Who liked to compose limericks.
  4.         But he failed at the sport,
  5.         For he wrote 'em too short.

Here, pipe character (|) instructs parsers to preserve newlines and most of the whitespace, excluding the leading indent. Were it replaced with greater-than sign (>), an HTML-like fold would be performed, converting chains of whitespace into a single space.

Parsing the thing

Simple YAML documents represent a tree of keys and values which is much the same as the one produced from JSON files. For some programming languages, this similarity extends to the parsing libraries, as they offer more or less the same interface:

  1. def parse_file(path, parser):
  2.     with open(path, 'r') as f:
  3.         return parser.load(f)
  4.  
  5. import json, yaml
  6. print parse_file('data.json', parser=json)
  7. print parse_file('data.yaml', parser=yaml)

Even if it’s not the case for your language, it’s quite likely there is a YAML parser readily available.

More features

Funny thing about YAML is that its glaringly simple syntax hides very powerful and flexible format.

For example, the values are actually typed. They can be strings or nulls, but also integers, floats or booleans. Syntactic rules make a very good job here at automatically detecting the types; for instance, the word Yes will be recognized as boolean truth if standing on its own, but a text starting with it will be correctly recognized as string.

The other nifty feature is referencing. Parts of YAML document tree can be given labels, while other parts can later use those labels to point back to specific nodes. The whole structure can therefore morph into something more general than just a tree:

  1. triangle_graph: &first_vertex
  2. - label: First vertex
  3.   adjacent_to:
  4.   - label: Second vertex
  5.     adjacent_to:
  6.     - label: Third vertex
  7.       adjacent_to:
  8.      - *first_vertex

With types (incl. custom ones) and references, YAML can actually serve pretty well as serialization format for persisting objects.

Uses?…

But besides that, what YAML is actually good for?

I have hinted in the beginning that it seems like a decent choice for structured text data which is meant to be hand-edited. Various configuration files fall into this category, as well as some datasets around the size of contact list. I used YAML as config format for my IRC bot and it worked very well for this purpose. I’m also using it to store initialization data for database used by another side project of mine.

So, if you are not exercising YAML in any of your current endeavors, I’m encouraging you to give it a try. It might not be the best thing since sliced bread, but it’s very pleasant format to work with.

Tags: , ,
Author: Xion, posted under Applications, Programming » 2 comments

Thinning Your Fat Fingers

2012-09-07 10:32

I often say I don’t believe programmers need to be great typists. No software project was ever late because its code couldn’t be typed fast enough. However, the fact that developer’s job consists mostly of thinking, intertwined with short outbursts of typing, means that it is beneficial to type fast, therefore getting back quickly to what’s really important.
Yet, typing code is significantly different game than writing prose in natural language (unless you are sprinkling your code with copious amount of comments and docstrings). I don’t suppose the skill of typing regular text fast (i.e. with all ten fingers) translates well into building screens of code listings. You need a different sort of exercise to be effective at that; usually, it just comes with a lot of coding practice.

But you may want to rush things a bit, and maybe have some fun in the process. I recently discovered a website called typing.io which aims to help you with improving your code-specific typing skills. When you sign up, you get presented with a choice of about dozen common languages and popular open source projects written in them. Your task is simple: you have to type their code in short, 15-line sprints, and your speed and accuracy will be measured and reported afterwards.

The choice of projects, and their fragments to type in, is generally pretty good. It definitely provides a very nice way to get the “feel” of any language you might want to learn in the future. You’ll get to see a lot of good, working, practical code written in it – not to mention you get to type it yourself :) Personally, I’ve found the C listings (of Redis data store) to be the most pleasant to both read and type, but it’s pretty likely you will have different preferences.

The application isn’t perfect, of course: it doesn’t really replicate the typical indentation dynamics of most code editors and IDEs. Instead, it opts for handling it implicitly, so the only whitespace you get to type is line and word break. You also don’t get to use your text navigation skills and clipboard-fu, which I’ve seen many coders leverage extensively when they are programming.
I think that’s fine, though, because the whole thing is specifically about typing. It’s great and pretty clear idea, and as such I strongly encourage you to try it out!

Tags: , ,
Author: Xion, posted under Applications, Programming » 1 comment

DreamPie with virtualenv

2012-05-10 20:44

If you haven’t heard about it, DreamPie is an awesome GUI application layered on top of standard Python shell. I use it for elaborate prototyping where its multi-line input box is a significant advance over raw, terminal UX of IPython.

However, up until recently I didn’t know how to make DreamPie cooperate with virtualenv. Because it’s a GUI program, I scoured its menu and all the preference windows, searching for any trace of option that would allow me to set the Python executable. Having failed, I was convinced that authors didn’t think about including it – which was rather surprising, though.

But hey, DreamPie is open source! So I went to look around its code to see whether I can easily enhance it with an ability to specify Python binary. It wasn’t too long before I stumbled into this vital fragment:

  1. def main():
  2.     usage = "%prog [options] [python-executable]"
  3.     version = 'DreamPie %s' % __version__
  4.     parser = OptionParser(usage=usage, version=version)
  5.     # ...
  6.     opts, args = parser.parse_args()

The conclusions we could draw from this anecdote are thereby as follows:

  • It is indeed true that source code is often the best documentation…
  • …especially for open source programs where actual docs often suck.

With this newfound knowledge about dreampie arguments, it wasn’t very hard to make it use current virtualenv:

  1. $ dreampie $(which python)

And after doing some more research, I ended up adding the following line to my ~/.bash_aliases:

  1. alias dp='(dreampie $(which python) &>/dev/null &)'

Now I can simply type dp to get a DreamPie instance operating within current virtualenv but independent from terminal session. Very useful!

Tags: , , , ,
Author: Xion, posted under Applications, Programming » Comments Off on DreamPie with virtualenv
 


© 2023 Karol Kuczmarski "Xion". Layout by Urszulka. Powered by WordPress with QuickLaTeX.com.