It might be difficult to imagine for young people, but in the olden days it used to be quite difficult to write software. And I do not mean this in a “requirements are hard and so is code correctness” kind of way; I mean that if you got yourself a computer, you could not simply open up an editor, write some code, and then compile and/or run it, because the machine came without any software development tools.
At that time, selling editors, compilers, and other programming tools was a business and computers did not come with any tools that would allow you to start writing serious code. The best you could hope for was a crappy BASIC interpreter that lacked any features for serious software development. If you wanted to compile your code, use a decent programming language, use a “database”, or develop user interfaces, you often had to shell out amounts of money that were beyond the means of most individuals. This fact gave rise to a very active software piracy movement and I partook liberally in that, copying software packages from friends and spending hours at the Xerox machine to copy manuals. The dearth of programming tools existed up and down the platform hierarchy, from measly personal computers to large mainframes, where the companies I worked for had to shell out thousands, if not tens of thousands, of guilders per month for the privilege of using IBM’s COBOL compiler. Open source did not really exist yet and to the extent that it did, the absence of the Internet made it very hard to distribute anything.
Unix was notably different. It came bundled with a C compiler and a suite of tools that enabled software development, like make, lex, and yacc. This was amazing and novel! An operating system that came with a compiler and other development tools? Unheard of! Unfortunately, Unix was not widely available because it required powerful hardware that was beyond the means of individuals and small companies. To top it off, commercial Unix implementations, like HP-UX and SunOS, often only contained a minimal K&R C compiler that was just enough to configure the kernel, while they charged an arm and a leg for their more powerful ANSI C compiler.
When gcc came around, it used an intricate bootstrap: the platform’s K&R C compiler built a small ANSI C compiler, which was then used to compile itself again and, after that, the entire compiler collection.
Even when you did have access to a system with basic development tools, writing software was arduous because of the absence of libraries and components that solved higher level problems. If you wanted to parse an email address, use a hash map, or solve a linear programming problem, you were completely on your own. There was no GitHub, NPM, CPAN, or equivalent repository of code that you were free to download and use. Consequently, coding was a slog, and only the most committed individuals could get stuff done. Every piece of code you needed to write was a project and, because of the time needed, a very costly project at that.
This sad fact became double plus sad once you started considering the needs of system administrators. The process of running a complicated system requires issuing myriads of commands to deal with operational chores and to solve problems. Many procedures consist of issuing the same commands every time in more or less the same order or with only minimal variation. These system administration tasks lend themselves well to automation, but most systems did not come with enough programming tools to develop this automation.
Unix changed all that.
In one of their many strokes of brilliance, the original Unix designers created a system that was modular, with components that could easily be combined in elegant ways to create more powerful functionality. It is hard to see in hindsight how incredibly revolutionary and brilliant their design was. Proof of this is that it has held up remarkably well over time, whereas other models for building operating systems either disappeared or had to embrace Unix’s features as best they could.
A consequence of Unix’s design was that the majority of operating system commands, even the ones provided by third parties, were composable. Whenever I taught Unix Fundamentals, people often asked me: “Does Unix have a command for <this or that>?” “No”, my answer would typically be, “but you can build it yourself by stringing together <this set of Unix commands>.” This way of working proved to be very future proof, so I still regularly type commands like

docker container rm $(docker container ps -a | cut -f1 -d\ | grep -v CONTAINER)

in order to remove all stopped docker containers on my MacBook. Maybe a better command exists, maybe the Docker designers gave me a combination of commands and flags to do this with one command invocation, but courtesy of Unix, that is not how my mind works anymore. Pipes and command substitution mean that I look at every Unix command as a simple filter to be used in a pipeline.
Another stroke of brilliance of the Unix designers was to blur the line between the command line interface and the interpreter of a simple programming language. The Unix shell contains statements like if, while, and case; it allows the definition of functions, understands the concept of variables, and can read commands from a file as well as from the terminal, a feature greatly helped by the fact that in Unix every stream of bytes is a file, whether you type them at the keyboard or whether they come from a disk.
This is yet another thing that might seem obvious now but which was revolutionary when it first appeared.
This design allows writing simple programs that are executed by the command line interpreter. Eat that COMMAND.COM! Apart from the aforementioned control statements, the original shell’s programming language did not contain many special built-in statements like PRINT or CLS; the shell was “just” a more powerful way to compose Unix’s basic commands into more powerful units of functionality. And so, the shell script was bourne; the ideal programming tool for automating system administration.
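Just to give a flavor: a minimal housekeeping script of the kind that suddenly became possible might look like the sketch below (the directory, the retention period, and the naming convention are all invented for this example):

#!/bin/sh
# Hypothetical log cleanup; paths and retention period are made up for illustration.
LOGDIR=/var/log/myapp
if [ ! -d "$LOGDIR" ]; then
    echo "no such directory: $LOGDIR" >&2
    exit 1
fi
# Throw away week-old debug logs, compress everything else that old.
find "$LOGDIR" -name '*.log' -mtime +7 | while read -r f; do
    case "$f" in
        *debug*) rm -f "$f" ;;
        *)       gzip -f "$f" ;;
    esac
done

Nothing in there is special, and that is exactly the point: if, case, while, and a couple of variables glue a handful of standard commands into a useful unit of automation.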
When I first came across shell scripts and understood how everything hung together, I had a veritable religious experience. “If there is a God”, I thought, “this is surely Her operating system design”. I was not the only one who thought that and shell script programming took off like wildfire as an unleashed generation of system administrators automated everything in sight with ever more intricate shell scripts. The shell itself morphed too: The hippies over at Berkeley invented the C Shell, AT&T retaliated with the Korn Shell, the bureaucrats over at Posix standardized the Posix shell, the open source world built and released the Bourne Again Shell, and for some reason that I cannot quite figure out, I seem to have standardized on the Z-shell.
Probably because it allows themes, which solidly brings the shell into the 21st century with its focus on mindless fluff and window dressing.
Since their inception in 1979, shell scripts have held up remarkably well as a vehicle for automating system administration tasks. Pretty much every SRE and Linux software engineer I know writes or edits shell scripts on an almost daily basis in order to keep their systems going. Shell scripts configure the system, make backups, remove old log files, and run CI/CD pipelines. This fact is quite remarkable, given that the shell script language is actually an awful programming language. It has weird syntax, is untyped, cannot do math very well, does not deal kindly with strings containing spaces, and has terrible error recognition and recovery. Nobody in their right mind would invent a programming language like that today.
I am saying that, but Microsoft relatively recently gave us PowerShell which shows their unparalleled capability to take good ideas and implement them badly.
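The math complaint alone is easy to demonstrate: in a POSIX shell, arithmetic expansion is integer-only, and anything fancier means shelling out to yet another tool.

echo $(( 10 / 4 ))        # prints 2, not 2.5
echo '10 / 4' | bc -l     # 2.50000000000000000000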
On top of that, shell scripts are not very portable. They are (by design) completely dependent on their environment and therefore you cannot even reliably run a shell script designed for one Linux distribution on another one, unless you take extreme precautions in the code. Because shell scripts string existing commands together, they are very sensitive to the tools installed. If your shell script depends on awk, you might be well advised to wonder whether it is resilient to the subtle differences between the original AT&T awk, GNU awk, and SunOS’s “new” awk (nawk).
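One hedged illustration of how small those differences can be: gensub() is a GNU awk extension, so a one-liner that works fine on a typical Linux box falls over on the stock awk that ships with macOS or the BSDs.

echo 'foo bar' | awk '{ print gensub(/o/, "0", "g") }'
# GNU awk prints "f00 bar"; many other awks abort with an "undefined function" error.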
A lot of shell scripts parse the output of command invocations, which is great and powerful, until such time as the command outputs something unexpected because of an abnormal condition, or a newer version of the command changes its output: adds an extra space, say, or reorders the columns altogether.
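A hedged illustration of that failure mode (the process name is invented for the example):

# Fragile: fish a PID out of ps output.
ps aux | grep myserver | grep -v grep | awk '{ print $2 }'
# This quietly returns extra PIDs the moment any unrelated command line
# happens to contain the string "myserver", for example "less myserver.log".

pgrep happens to solve this particular case, but the general point stands: the script’s correctness hinges on the exact shape of somebody else’s output.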
I dare you to
export LANG=nl_NL
and see if your scripts still work 🙂.
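If that locale is installed on your system, date and friends suddenly speak Dutch, and every script that matches on English output silently comes up empty:

date                 # day and month names now come out in Dutch ("ma", "jan") instead of English
date | grep Mon      # a pattern that expects the English names quietly stops matching

Sort order and decimal separators shift with the locale too, which is why so many defensive scripts start with an emphatic LC_ALL=C.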
Shell scripts are also extremely sensitive to characters that are special to the shell, such as the asterisk, white space, question marks, quotes, parentheses, and backslashes. These characters have been breaking shell scripts since 1979, for instance when they appear in file names, variable values, or the output of commands that get substituted back into the command line.
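The classic trap, with an invented file name: a single file called “important notes.txt” is enough to derail the unquoted version of a loop.

for f in $(ls *.txt); do wc -l $f; done    # word splitting turns one file into two bogus arguments
for f in ./*.txt; do wc -l "$f"; done      # globbing plus quoting keeps the name intact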
In recent times we have gotten many new programming languages and many great libraries for all sorts of common problems. Some of these newer programming languages are terrible, most are decent, and some are great. For most tasks, there is really no good reason anymore to code shell scripts. If you know what you are doing, coding a Go program is about as fast; it will yield you a binary that might be bigger, but is also faster and, most importantly, way more robust.
During a recent project I ran a secure database in a container and I needed to take regular backups of that database and ship them to an off-site cloud storage location. None of this is particularly problematic, so I started with a simple shell script. However, over time the requirements grew: I needed to integrate with a secure way to obtain database credentials, I wanted to emit metrics upon termination, and I had to interface with different cloud object stores. So the shell script grew and grew. The fact that I implemented all this in a shell script also added other complexities. For instance: to run my shell script on an hourly schedule, I needed to run cron in my container, which meant I had to run the container start command as root and then drop privileges later. This, as well as all the other problems that were created by using a shell script, can of course be overcome with some additional shell script software engineering, but it did start to look like a wobbly pile of software. And as my friend Doug R. says: “You cannot make a wobbly pile of software more reliable by making it higher.”
So, one fine day I fired up the Go compiler and rewrote the shell script as a Go program with a built-in cron scheduler, with interfaces to different cloud providers, while emitting metrics to Prometheus. The resulting binary is faster and more robust. It is bigger, but it is better.
The defenders of shell scripts will say that it is possible to write robust cross-platform shell scripts, and they are right: you can! But the fact that you can doesn’t mean that you should. You can write a docker-like container system and a web server in shell script, but that doesn’t mean it is a good idea. With modern programming languages and libraries we can use a better language to create a better program in about the same amount of time.
So, it is time to say goodbye to shell scripts. They were amazing while we needed them, but we have moved on and we have better things now.
100x: every time I write a shell script, I regret it later. But you'll never take `make` away from me (I know folks have tried, but I haven't found a similarly elegant system; not for lack of trying, lots of terrible renditions on the old tool).
yeah there is, small nit: `docker container ps -a | cut -f1 -d\ | grep -v CONTAINER` is the same as `docker ps -a -q`