One thing new shell programmers struggle with is error handling. The shell does not use exceptions like the programming languages most developers are familiar with; the shell largely pre-dates the notion of exceptions in higher-level programming languages.
Developers often think of shell commands as something similar to procedures: you call one with some parameters, and it does something. This overlooks an important aspect: commands have an exit status which indicates whether the command was successful or not.
By convention, an exit status of zero indicates success, and a non-zero exit status indicates a failure. Commands may use different non-zero values to indicate different sorts of failures (the documentation for the individual command will usually describe this).
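To make the convention concrete, here is a minimal sketch that inspects the exit status through the special parameter $?, which holds the status of the most recently executed command. It uses only the standard true and false commands; the variable names are just for illustration.

```shell
#!/bin/sh
# $? holds the exit status of the most recently executed command,
# so it has to be saved right away if it is needed later.

true                       # always succeeds
status_ok=$?

false                      # always fails
status_fail=$?

echo "true exited with status $status_ok"
echo "false exited with status $status_fail"

# "if" branches on the exit status directly: zero takes the "then" branch.
if true; then
    echo "success branch taken"
fi
```

Note that $? is overwritten by every command, which is why the sketch copies it into a variable immediately.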
Imagine this simple script:
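A minimal sketch of what such a script might look like. Only healthcheck and the /tmp/dilithium-status file are taken from the text; reverse_course, engage and the file contents are hypothetical stand-ins, and all commands are stubbed out as shell functions so the sketch can run anywhere. The healthcheck stub is made to fail on purpose.

```shell
#!/bin/sh
# Hypothetical stand-ins for the ship's real commands.
healthcheck()    { echo "healthcheck: warp core offline" >&2; return 1; }
reverse_course() { echo "reversing course"; }
engage()         { echo "engaging at $1"; engaged=yes; }

# The script proper: every command runs, whatever happened before it.
healthcheck                                        # fails, but nothing checks that
echo "dilithium nominal" > /tmp/dilithium-status   # assumed contents
reverse_course
engage "maximum warp"
```

Running it prints the healthcheck complaint on standard error and then carries on with the remaining commands.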
(yes: This will direct the Star Ship Enterprise to reverse course at maximum speed. Something you'd want to do in an emergency - and it is slow to do manually. So obviously Gene Roddenberry should have scripted it.)
As it stands: if healthcheck fails, the script will continue anyway!
Most of the time, this is not really what we want.
There are several ways of detecting and dealing with a failing command in a script.
Stopping on Errors
95% of the time, it is perfectly acceptable for the script to merely
stop (with a non-zero exit status) if a command fails, and rely on the
failing command to explain (to stderr) what went wrong. This can be
done by modifying the shell behavior with set -e:
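The effect of set -e can be sketched as follows. The commands are hypothetical stubs (healthcheck is made to fail), and the "script" runs inside a subshell purely so that its early exit can be observed from the outside; a real script would just have set -e near the top.

```shell
#!/bin/sh
# Hypothetical stubs; healthcheck always fails.
healthcheck()    { echo "healthcheck: warp core offline" >&2; return 1; }
reverse_course() { echo "reversing course"; }

# The subshell plays the role of the script.
(
    set -e
    healthcheck          # fails, so the subshell exits right here
    reverse_course       # never reached
)
status=$?

echo "script exited with status $status"
```

One caveat worth knowing: set -e is ignored for commands that are tested by if, while, &&, || or !, since those constructs already consume the exit status.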
Normally, the shell would simply execute commands in sequence. This
behavior is subtly changed by set -e: it makes the shell exit
immediately if a command (or pipeline) returns a non-zero exit status.
But if you have a command that is allowed to fail, this gives you the opposite problem: the script will stop right there and exit with a non-zero exit status!
There's an easy way out of this by turning the command into a list
of commands - where the last part is guaranteed to succeed. This is
usually done by adding || true to the command:
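A minimal sketch of the || true idiom under set -e. Both commands are hypothetical stubs: scan_for_klingons stands for a command that is allowed to fail, reverse_course for one that must still run. As before, the subshell only exists so the final exit status can be inspected.

```shell
#!/bin/sh
# Hypothetical stubs: the scan may fail, the course change must still happen.
scan_for_klingons() { echo "scanner offline" >&2; return 1; }
reverse_course()    { echo "reversing course"; }

(
    set -e
    scan_for_klingons || true   # failure is tolerated: the list as a whole succeeds
    reverse_course              # still runs
)
status=$?

echo "script exited with status $status"
```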
Notes:
- The || operator basically means "only execute the right-hand side if the left-hand side fails". This is different from ; which just means "execute the left-hand side and then execute the right-hand side" (regardless of exit status).
- The built-in command true is a simple no-op which is guaranteed to give an exit status of zero (indicating success).
- The exit status of such a list is the exit status of the right-most command (of those which were actually executed).
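These points can be checked with a small experiment. The sketch below uses only true, false and echo; the variable names are illustrative.

```shell
#!/bin/sh
# ";" runs both sides unconditionally; $? reflects the last command run.
false ; true
semicolon_status=$?

# "||" runs the right-hand side only when the left-hand side fails.
false || echo "left side failed, so this runs"
true  || echo "left side succeeded, so this is skipped"

# The status comes from the last command that was actually executed.
false || true      # "true" runs and supplies the status
rescued_status=$?

true || false      # "false" never runs, so "true" supplies the status
skipped_status=$?

echo "statuses: $semicolon_status $rescued_status $skipped_status"
```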
Catching Errors
Sometimes merely using set -e is insufficient: you may want to emit
a warning to the user before continuing:
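A minimal sketch of this pattern. The healthcheck and reverse_course commands are hypothetical stubs, and the warn helper is an addition for illustration, not something the shell provides.

```shell
#!/bin/sh
set -e

# Hypothetical stubs; healthcheck always fails.
healthcheck()    { return 1; }
reverse_course() { echo "reversing course"; }

# Illustrative helper: prefix the message with the script name, write to stderr.
warn() { echo "$0: warning: $*" 1>&2; }

# A command tested by "if" (or negated with "!") does not trigger set -e,
# so the script gets the chance to react to the failure itself.
if ! healthcheck; then
    warn "healthcheck failed, continuing anyway"
fi

reverse_course
```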
A couple of points to note:
- The ! operator reverses the exit status of the command. It is the logical equivalent of a boolean "not".
- The warning is written to standard error courtesy of 1>&2 - novice users often forget that warnings and errors should go to standard error rather than standard output.
- The warning message identifies the command emitting the warning by including $0 (the name of the script itself) in the message. This is useful for adding the right context to the warning and allows for easier debugging.
Error Cleanup
Sometimes it is necessary to do some cleanup in case of failures,
which means that simply using set -e is insufficient.
Our example script will leave /tmp/dilithium-status behind as a
junk file in /tmp. We can avoid that by doing some cleanup before bailing out completely:
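A minimal sketch of cleanup with trap, assuming the junk file is /tmp/dilithium-status as in the text. The healthcheck stub and the file contents are hypothetical, and the subshell again stands in for the script so that the result can be inspected afterwards.

```shell
#!/bin/sh
healthcheck() { return 1; }          # hypothetical stub that always fails
status_file=/tmp/dilithium-status

# The subshell plays the role of the script.
(
    set -e
    # Remove the status file whenever this shell exits, failed or not.
    trap 'rm -f "$status_file"' EXIT
    echo "dilithium nominal" > "$status_file"   # assumed contents
    healthcheck                  # fails: the shell exits and the trap runs
    echo "never reached"
)
status=$?

if [ -e "$status_file" ]; then
    echo "status file left behind"
else
    echo "status file cleaned up (script status: $status)"
fi
```

The trap does not change the exit status here: the script still reports the failure of healthcheck to its caller.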
This makes use of the shell "trap" feature: it directs the shell to
execute commands when the script finishes - even if the script fails.
Conceptually, this is a simplistic equivalent to Python's try ... finally
construct (it does not support nesting).