Subshell conditional command exit status handling
Very recently, while deploying some changes via GitHub Actions, the pipeline was failing with this ever-cryptic error message:
Error: Process completed with exit code 1.
Nothing more! Not a single hint! Which essentially meant that I would need to go down the rabbit hole myself. And I did!
The error was coming from a pipeline job step that executes a command via ssh. The command is quite long, but for simplicity, a minimal reproducible version would be (ignoring all ssh client options as well):
ssh some-server 'cd some-dir/ && for dir in */; do ( cd "$dir" && [ -f some-file ] && echo "found" ); done'
Essentially, after establishing the ssh connection to some-server, we'd cd into some-dir. Then, for each subdirectory inside some-dir, in a subshell, we'd cd into the subdirectory, test ([) for the existence of some-file and, if found, echo the string found.
We're running the whole command in the subshell in a short-circuit fashion with the logical AND operator &&, i.e. if anything in the chain fails (exits with a non-zero exit status), the next command is not executed. That's also the source of our problem: the exit status of the whole subshell is that of the last command executed. For example, if no file named some-file is present in subdirectory X, the [ -f some-file ] test would be the last command executed in the subshell for subdirectory X (echo "found" would not run as the test failed), and as [ -f some-file ] would have an exit status of 1, the whole subshell would have an exit status of 1 as well.
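A minimal sketch of that behaviour, run against a throwaway directory. The status_of_check helper and the directory names are made up for illustration; only the chain inside the subshell comes from the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: runs the "cd && test && echo" chain in a subshell
# against the given directory and prints the subshell's exit status.
status_of_check() {
  local status=0
  # "|| status=$?" records a failure without letting it abort the script.
  ( cd "$1" && [ -f some-file ] && echo "found" ) >/dev/null 2>&1 || status=$?
  echo "$status"
}

demo=$(mktemp -d)                 # throwaway directory, just for this demo
mkdir "$demo/with" "$demo/without"
touch "$demo/with/some-file"

echo "with file:    $(status_of_check "$demo/with")"      # test passes, echo runs
echo "without file: $(status_of_check "$demo/without")"   # test fails, echo is skipped
echo "missing dir:  $(status_of_check "$demo/nope")"      # cd itself fails

rm -rf "$demo"
```

The first call reports 0 and the second reports 1, the status of the failed test. The third is also non-zero, this time because cd failed before the test was ever reached.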
Another very important thing to consider is the exit status of the whole ssh command. ssh exits with the exit status of the remote command, which in the above case is the exit status of the last subshell processed, i.e. the last command run. Which subshell is processed last in turn depends on how the shell (bash in this case) sorts the subdirectory names from */. As we're running a for loop over the glob expansion of */, we iterate over the subdirectories in exactly the order the shell gives us.
For example, let's take the following directory hierarchy:
$ tree -d some-dir
some-dir
├── X
├── Y
└── Z
The exit status of the whole ssh command would be 0 (successful) if the directory Z contains a file named some-file (Z comes last in the expansion of */ in some-dir).
Let's add echo "$dir" to see the sorting order in bash for the glob pattern */ (leaving out the ssh command here, as the exit status of the following would be reflected by ssh as-is):
$ cd some-dir/ && for dir in */; do ( cd "$dir" && echo "$dir" ); done
X/
Y/
Z/
Now, if the directory Z doesn't contain some-file, we'd get an exit status of 1 (unsuccessful):
$ tree some-dir
some-dir
├── X
│   └── some-file
├── Y
│   └── some-file
└── Z

$ cd some-dir/ && for dir in */; do ( cd "$dir" && [ -f some-file ] && echo "found in ${dir}" ); done
found in X/
found in Y/

$ echo $?
1
But if the file exists in Z, we'd get an exit status of 0 (successful) even when it is missing from X or Y:
$ tree some-dir
some-dir
├── X
├── Y
│   └── some-file
└── Z
    └── some-file

$ cd some-dir/ && for dir in */; do ( cd "$dir" && [ -f some-file ] && echo "found in ${dir}" ); done
found in Y/
found in Z/

$ echo $?
0
Going back to our pipeline ssh command: if the last directory from the */ expansion contains a file named some-file, we'd get an exit status of 0 (successful) for the whole ssh command, and 1 (unsuccessful) otherwise. The basic idea of pipeline jobs is that if a job step's command fails with a non-zero exit status, by default nothing following that step runs and the whole job is marked as failed.
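To reproduce both outcomes without a remote server, here is a small self-contained sketch. It uses bash -c as a local stand-in for ssh some-server '...', on the assumption that what matters here is only that the outer command reports the status of the command string it ran. The loop_status helper name is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds a throwaway some-dir with X, Y and Z, puts
# some-file into the subdirectories named as arguments, runs the loop through
# bash -c (a local stand-in for ssh) and prints that command's exit status.
loop_status() {
  local root d status=0
  root=$(mktemp -d)
  mkdir -p "$root/some-dir/X" "$root/some-dir/Y" "$root/some-dir/Z"
  for d in "$@"; do
    touch "$root/some-dir/$d/some-file"
  done
  # The command string is the same one-liner that is passed to ssh.
  ( cd "$root" && bash -c 'cd some-dir/ && for dir in */; do ( cd "$dir" && [ -f some-file ] && echo "found" ); done' ) >/dev/null || status=$?
  rm -rf "$root"
  echo "$status"
}

echo "some-file in X and Y: $(loop_status X Y)"   # Z is visited last and has no file
echo "some-file in Y and Z: $(loop_status Y Z)"   # Z is visited last and has the file
```

The first call reports 1 and the second reports 0: only the last directory visited decides the result.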
A missing some-file is not an issue that warrants failing the whole job in this case, so the solution is to ignore the exit status of [ -f some-file ], and of any other commands in the chain that can fail in the same harmless way.
As I didn't want to write a whole bunch of if-else in this case, and wanted to keep the short-circuit chain for readability, I used a nested subshell approach: I invoked a wrapper subshell that contains the above subshell followed by || true, in case any command in the inner subshell returns a non-zero exit status:
cd some-dir/ && for dir in */; do ( ( cd "$dir" && [ -f some-file ] && echo "found" ) || true ); done
Same thing through ssh:
ssh some-server 'cd some-dir/ && for dir in */; do ( ( cd "$dir" && [ -f some-file ] && echo "found" ) || true ); done'
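The || true wrapper turns every failure in the inner subshell into success, a failed cd included. If only the missing file should be tolerated, one possible narrower variant (a sketch, not what the pipeline above uses) is to leave the subshell early with exit 0 when the test fails. The check_one helper name and the demo directories are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: the same chain, but only a missing some-file is
# tolerated. "exit 0" runs inside the ( ... ) subshell, so it ends just that
# subshell; a failing cd still produces a non-zero status.
check_one() {
  ( cd "$1" && { [ -f some-file ] || exit 0; } && echo "found in $1" )
}

demo=$(mktemp -d)                 # throwaway tree for the demo
mkdir "$demo/X" "$demo/Y"
touch "$demo/X/some-file"

(
  cd "$demo" || exit 1
  for dir in */; do
    check_one "$dir"
  done
  echo "loop status: $?"          # Y/ has no some-file, yet the status is 0

  # A directory that does not exist is still reported as a failure.
  check_one "nope/" 2>/dev/null || echo "cd failure still reported: $?"
)

rm -rf "$demo"
```

The trade-off is a slightly busier one-liner in exchange for not masking errors other than the expected one.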
Notes:
- The biggest caveat of doing something like this is that it hides any error (i.e. any unsuccessful command with a non-zero exit status) in the inner subshell. In my case it doesn't matter, as I was expecting some directories to not contain some-file, so a failing [ -f some-file ] is not problematic. But this is something to keep in mind.
- The ordering of glob (pathname expansion) results for * depends on the collation setting of the locale, i.e. LC_COLLATE. The locale command can be used to get the current values of all the locale-specific settings, including LC_COLLATE. My system has en_US.UTF-8 as the value of LC_COLLATE, so the above sorting order is based on that locale; in a different locale, the sorting order could be different.
- There is also a bash keyword [[, which is a conditional construct and a superset of the [/test builtin. [[ would behave the same way as [/test in this case.
- The above assumes bash as the shell.
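The locale note can be checked with a short sketch. It uses mixed-case directory names (a, B, c) rather than X, Y, Z, because the difference between locales only shows up when case matters. The glob_order helper name is made up, and the en_US.UTF-8 line assumes that locale is installed; where it is not, bash prints a warning and falls back to another collation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the order in which */ expands inside the given
# directory when a fresh bash is started with the given locale.
glob_order() {
  ( cd "$1" && LC_ALL="$2" bash -c 'echo */' )
}

demo=$(mktemp -d)                 # throwaway tree for the demo
mkdir "$demo/a" "$demo/B" "$demo/c"

# In the C locale, names sort by byte value, so uppercase comes first.
echo "C:           $(glob_order "$demo" C)"
# Assumption: en_US.UTF-8 is available on this machine.
echo "en_US.UTF-8: $(glob_order "$demo" en_US.UTF-8)"

rm -rf "$demo"
```

With the C locale the order is B/ a/ c/, so B would be visited first instead of in the middle. In a setup like the one above, this decides which directory is processed last and therefore which exit status the loop ends with.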