When and What to Bash

2013-03-06 12:53

Often I advocate using Python for various automation tasks. It’s easy and powerful, especially when you consider how many great libraries – both standard and third party – are available at your fingertips. If asked, I could definitely share a few anecdotes on how some .py script saved me a lot of hassle.

So I was a bit surprised to encounter a non-trivial problem where using Python seemed like overkill. What I needed to do was to parse some text documents; extract specific bits of information from them; download several files through HTTP based on that; unzip them and place their content in a designated directory.

Nothing too fancy. Rather simple stuff.

But then I realized that doing all this in Python would result in something like a screen and a half of terse code, full of tedious minutiae.
The parsing part alone would be a triply nested loop, with the first two layers taken by os.walk boilerplate. Next, there would be the joys of urllib2; heaven forbid it turns out I need some headers, cookies or authentication. Finally, I would have to wrap my head around the zipfile module. Oh cool, seems like some StringIO glue might be needed, too!

Granted, I would probably use glob2 for walking the file system, and definitely employ requests for HTTP work. And thus my little script would have external dependencies; doesn’t that make it a full-blown program?…

Hey, I didn’t sign up for this! It was supposed to be simple. Why do I need to reimplement grep and curl, anyway? Can’t I just…

…oh wait.
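
For the record, here is roughly how that job breaks down once grep, curl and unzip do the heavy lifting. This is only a minimal sketch, not the actual script: the URL pattern (links ending in .zip), the function names extract_urls and fetch_and_unpack, and the directory arguments are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: the .zip URL pattern and both function names are hypothetical.

# Print every distinct .zip URL found in the files under directory $1.
extract_urls() {
    # -r: recurse, -h: no file names, -o: matched part only, -E: extended regex
    grep -rhoE 'https?://[^[:space:]"<>]+\.zip' "$1" | sort -u
}

# Read URLs from stdin, download each one and unpack it into directory $1.
fetch_and_unpack() {
    local dest=$1 url tmp
    mkdir -p "$dest"
    while IFS= read -r url; do
        tmp=$(mktemp) || return 1
        # -f: fail on HTTP errors, -sS: quiet but report errors, -L: follow redirects
        if curl -fsSL -o "$tmp" "$url"; then
            unzip -q -o "$tmp" -d "$dest"
        else
            echo "failed to download $url" >&2
        fi
        rm -f "$tmp"
    done
}
```

With both functions defined, the whole task would be a single pipeline along the lines of `extract_urls ./docs | fetch_and_unpack ./out`, where ./docs and ./out are placeholder paths. Headers, cookies or authentication would be a matter of extra curl options rather than extra code.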

I have suspected for some time now that a certain niche must exist. Between problems that require an actual programming language and those you can solve by typing a single terminal command, there is presumably something of a gray zone. I wasn’t sure what it looks like, though, until I was handed this obvious example on a silver platter.

If you haven’t guessed, I’m talking about Unix shell scripts here. While I did write some in the past, they were mostly just shortcuts for some long-winded commands (*cough* Git *cough*).
But conditions? Loops? Functions? No way! If I wanted to use a programming language, I would just use one. A proper one, not some syntactically challenged afterthought.

That sentiment is entirely justified, by the way. Viewed from the software engineering standpoint, Bash really is horrible. Both JavaScript and PHP stand as paragons of stellar language design when compared to the hodgepodge of Bourne shell’s weird semantics. Let me just list a few bizarre examples:

  • Significant whitespace. No, not like in Python, where it’s only indentation that matters. A misplaced space in a shell script may, for example, turn an assignment (foo=$(printf "Hello %s" "world")) into an unintended command invocation (foo = ...).
    Sure, I’ve seen things worse than that, like Robot Framework’s syntax based on two-space delimiters. Still, Bash scores pretty damn high on the exasperation scale.
  • Really, really weak typing. You may laugh at how JavaScript handles certain conversions, and rightfully so. So how about strings that, in certain contexts, behave like arrays?…
    $ ~ » foo=(ls -A -S) ; echo $foo
    ls -A -S
    $ ~ » for x in $foo; do printf "%s\n" $x; done
    ls
    -A
    -S

    Or maybe it was the other way around? Either way, that’s quite bizarre – to say the least.

  • eval by default. Every dynamic language has a way to execute code stored as a string; often it’s the eval function. And in every sensible language (as well as in JavaScript), such practice is highly discouraged.
    What say you, then, about eval as the default action that doesn’t even require any special syntax?

    $ ~ » $foo
    .zsh_history
    .dreampie
    .Trash
    Documents
    ...

    Certainly, that looks good to me!
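
The three quirks above can be reproduced in one short script. This is a sketch that assumes bash rather than the zsh session shown in the list, and every name in it (greeting, args, cmd) is made up for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch assuming bash; all names below are hypothetical.

# 1. Whitespace decides between assignment and command invocation.
greeting() { echo "called with: $*"; }   # a function that shares the variable's name
greeting=$(printf "Hello %s" "world")    # no spaces: assigns to the variable
echo "$greeting"                         # Hello world
greeting = world                         # spaces: runs greeting with "=" and "world"

# 2. A plain string acts like a list when expanded without quotes.
args="-A -S -r"
unquoted=0
for x in $args; do unquoted=$((unquoted + 1)); done
quoted=0
for x in "$args"; do quoted=$((quoted + 1)); done
echo "$unquoted $quoted"                 # 3 1

# 3. An unquoted variable in command position is split and then executed.
cmd="echo hi there"
result=$($cmd)
echo "$result"                           # hi there
```

Strictly speaking, the last step is word splitting followed by execution rather than a full eval: quotes, pipes and redirections stored inside the string are not parsed again. zsh does not split plain strings by default, which is why the session above needs an array to get the same effect.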

Alright, let’s step aside and ponder this for a while. How is it even possible that someone thought any of these “features” could ever be a good idea? What the hell were they thinking, back then at Bell Labs?…

The answer is pretty simple, I think. While every single feature may be ludicrous when considered in isolation, they could still consolidate into a very workable whole. And with Unix shell scripting, this seems to be precisely the case.

The whole point of writing a script is to tie several existing tools together in order to produce some greater, and useful, effect. Reaching out for external programs is the typical thing here, so no wonder it doesn’t involve any special tricks; it’s the default. As a consequence, anything else needs to make way for command invocation, including even some obvious and necessary elements, such as variables.

Those commands, by the way, work with plain text. One may argue whether this is a positive thing; there certainly are other options which appear at least equally viable.
But since everything is just text, it has to be handled in a rather unorthodox and somewhat “lenient” way. Going over lines or words must be easy and concise. Leading and trailing whitespace shouldn’t get in the way. Nothing should be overly strict in what it does and does not accept.
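
To make that leniency concrete, here is a small sketch of going over lines and words. The helper names trim_lines and count_words are hypothetical, and the code relies on the default IFS (space, tab, newline).

```shell
#!/usr/bin/env bash
# Sketch only: both helper names are made up for illustration.

# Print each non-empty line of stdin with surrounding whitespace removed.
trim_lines() {
    local line
    while read -r line; do              # with default IFS, read strips leading/trailing blanks
        if [ -n "$line" ]; then
            printf '%s\n' "$line"
        fi
    done
}

# Count the whitespace-separated words in a string, however they are spaced.
count_words() {
    # Unquoted on purpose so the shell splits the string into words.
    # Caveat: glob characters such as * would be expanded as well.
    set -- $1
    echo "$#"
}
```

For example, `printf '  alpha  \n' | trim_lines` prints alpha without the padding, and `count_words '  one   two three '` prints 3. Neither helper has to mention whitespace explicitly; the defaults already do what is needed.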

All this, and more, makes getting the actual job done much easier. A job that is most often not very glamorous but almost always necessary.

Author: Xion, posted under Applications, Programming



 


© 2017 Karol Kuczmarski "Xion".