The one-line shell setting that's caught the most bugs
I am trying a shorter, more technical Friday post alongside the longer essays earlier in the week. First up: the one line I put at the top of every shell script because silent failures are almost always worse than loud ones.
I'm trying a slightly different format today. The longer posts earlier in the week are where I want to work through the bigger arguments properly, with the context and caveats they need. Fridays are going to be shorter and more technical, mostly small things I do in my own work that are useful enough to be worth sharing with you, but not so large that they need a full essay.
This one is about the first line I put at the top of almost every shell script I write:
```bash
set -euo pipefail
```
It's not a clever trick, and it's definitely not a complete safety system. It's just a better default.
By default, a shell script is quite happy to fail halfway through, print an error, carry on doing the next thing, and then exit in a way that makes the outside world think everything was fine. That is the kind of failure I dislike most, because the thing did not work and the system lied about it. My longer article this week, Silent Failures, was all about that.
The usual example is a directory change that fails:
```bash
#!/usr/bin/env bash
cd /some/path/that/does/not/exist
rm -rf ./build
```
If the cd fails, the script can still run the next command from wherever it happened to start. The dangerous part is not only the delete, since you can construct safer or scarier versions of the same example depending on the script. The dangerous part is that the script has already lost the state it needed to be correct, and it may still look successful from the outside. That is the bit that turns a small error into a later mystery.
With the stricter setting at the top, the failed cd is the end of the script:
```bash
#!/usr/bin/env bash
set -euo pipefail
cd /some/path/that/does/not/exist
rm -rf ./build
```
That changes the default from "carry on and hope someone notices" to "stop at the point where the script is no longer safe to continue." It does three things. -e exits when a command fails. -u treats an unset variable as an error instead of quietly turning it into an empty string. pipefail makes a pipeline fail if any command in the pipeline fails, not only the last one.
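If you want to see the difference without risking a real delete, here is a minimal sketch that runs the same two-step body in child shells, once with the defaults and once with the strict flags. The `echo` stands in for the `rm`, and the variable names (`body`, `loose_status`, and so on) are mine, purely for illustration.

```bash
#!/usr/bin/env bash
# The same body as the example above, with echo standing in for the rm
# so the sketch is safe to run. The path is assumed not to exist.
body='cd /some/path/that/does/not/exist; echo "would delete ./build"'

# Default settings: cd fails, the next command runs anyway, exit status is 0.
loose_output=$(bash -c "$body" 2>/dev/null) && loose_status=0 || loose_status=$?

# Strict settings: the failed cd ends the child script before the echo runs.
strict_output=$(bash -euo pipefail -c "$body" 2>/dev/null) && strict_status=0 || strict_status=$?

echo "default: status=$loose_status output='$loose_output'"
echo "strict:  status=$strict_status output='$strict_output'"
```

The default run reports status 0 and still prints the "would delete" line. The strict run reports a non-zero status and prints nothing after the failed cd.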
The -u part is the one people tend to underestimate, since an unset variable often looks harmless right up until it becomes part of a path, a filter, a service name, or a command argument. If a script needs $DIR and $DIR is missing, I want the script to stop immediately. I do not want it to continue with an empty value and leave me to discover later which command interpreted that empty value in the most exciting possible way.
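A small sketch of what that looks like, assuming DIR really is unset. The commands are only echoed, never run, and the `without_u` and `with_u_status` names are made up for the example.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Simulate a caller that forgot to set DIR (hypothetical variable name).
unset DIR

# Without -u, the unset variable quietly expands to an empty string,
# so the path silently becomes /build.
without_u=$(bash -c 'echo "rm -rf ${DIR}/build"')
echo "without -u: $without_u"

# With -u, the same expansion is an error and the child shell stops.
if bash -uc 'echo "rm -rf ${DIR}/build"' 2>/dev/null; then
  with_u_status=0
else
  with_u_status=$?
fi
echo "with -u: exit status $with_u_status"
```

Without -u the echoed command is `rm -rf /build`, which is the exciting interpretation. With -u the child shell exits non-zero before it can build the string at all.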
pipefail catches another boring class of failure. Without it, this can report success:
```bash
thing_that_failed | tee output.log
```
The command I cared about failed, but tee did its job, so the pipeline can still look green. That is exactly the wrong trade. Logging a failed command is useful, but not if the act of logging it hides the fact that it failed.
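Here is a minimal sketch you can run to see both outcomes side by side. It uses `false` as a stand-in for the command that failed and writes to /dev/null instead of a real log. It deliberately leaves out -e so both statuses can be recorded.

```bash
#!/usr/bin/env bash
# No set -e here: the point is to record and compare the two exit statuses.

# Default behaviour: the pipeline reports tee's status, so the failure is hidden.
set +o pipefail
false | tee /dev/null
status_default=$?

# With pipefail: the failing command's status becomes the pipeline's status.
set -o pipefail
false | tee /dev/null
status_pipefail=$?

echo "default: $status_default, pipefail: $status_pipefail"
```

This prints `default: 0, pipefail: 1`. It is the same pipeline both times, and only the second run tells the truth about it.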
There are edge cases, because this is coding. set -e has exceptions, especially around conditionals, && chains, and commands where a non-zero exit is a meaningful answer rather than a failure. That does not make the setting wrong. It just means expected failure should be written down explicitly, usually as an if block or a deliberately handled exception. What I want to avoid is accidental failure becoming normal control flow because nobody told the shell otherwise.
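As a sketch of what "written down explicitly" can look like, take grep, which exits 1 when it finds no match. The log file and its contents are invented for the example, and these are two possible patterns rather than the only ones.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical log file, created here so the sketch is self-contained.
log_file=$(mktemp)
printf 'step one ok\nstep two ok\n' > "$log_file"

# Testing grep in an if marks "no match" as an expected answer,
# so set -e does not end the script when grep exits 1.
if grep -q 'ERROR' "$log_file"; then
  result="errors found"
else
  result="no errors"
fi
echo "$result"

# Alternatively, capture the status on purpose and decide what it means.
status=0
grep -q 'ERROR' "$log_file" || status=$?
echo "grep exit status: $status"

rm -f "$log_file"
```

One thing to watch with the if form is that it treats every non-zero status as "no match", including the status 2 that grep uses for a real error such as an unreadable file. The second form keeps the number, so you can tell the two apart when it matters.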
The reason this has become muscle memory for me is the same reason I keep coming back to fail-loud systems generally: if a workflow depends on me noticing that something went wrong, it is already broken, because my attention is not a control. A script that prints the important error on line 43 of a noisy log and then exits cleanly has failed twice: once in the work it was meant to do, and once in the signal it was meant to send.
So this is the first Friday technical note for you coders: put set -euo pipefail near the top of the next shell script you write. It will not catch everything. It will sometimes force you to be more explicit than you expected. That is the point. I would rather have the script make me handle the awkward case on purpose than let it drift past a broken assumption and pretend nothing happened.