Primer on Bash variables, logical builtins/commands, and operators

Monday 15 January 2018 01:20 AM (Dhaka)

bash is written in C, so one might expect some remnants of C in it, but bash is in fact very different -- bash (or any other shell, for that matter) was never designed to be a full-fledged programming language. bash was initially written to execute the various standard *nix commands directly; in essence, it is an interface for making fork()-exec() (and friends) calls.

But from time to time, users tend to (ab)use bash for jobs that need a proper language with all the bells and whistles. We all tend to do some ambitious things in scripts, like introducing an opaque data structure, trying to reference pointers(!), and so on. So the bash developers have added various features over the years, which has led to the current state of bash -- something between a typical shell and a programming language.

By the way, bash is the official shell of the GNU project, which is one of the reasons why it is so prevalent across GNU/Linux distros (it ships as its own GNU package, separate from GNU coreutils).

In this post, I will try to give some pointers on bash variables and the type-related operations on them.

Notes:

I'm not counting arrays here (indexed arrays, and the associative arrays introduced in bash-4).
I'm using bash version 4.3.48(1).

While working in bash, interactively or inside scripts, we need to deal with variables and the operators related to them. bash has some strict and somewhat unusual rules for dealing with variables. The trick to understanding them is to never compare the bash shell with a programming language.

Plain bash variables are untyped; they don't have any associated type to begin with. For understanding, you can think of them all as strings (that's not quite the lower-level view, but it is a useful model), and depending on the operator used they may be treated as integers. So there are no types like the ones you find in typical languages.

bash has a very restrictive variable assignment syntax. There must be no whitespace around = when assigning a variable. The correct way:

foo=bar

While all of the following are incorrect:

foo = bar
foo =bar
foo= bar
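
The reason is that the shell only treats a word as an assignment when the name is immediately followed by =. Here is a small sketch of how the other spellings get parsed. The names no_such_cmd_xyz and greeting are made up for illustration, and I'm assuming no command called no_such_cmd_xyz exists on your system:

```shell
#!/usr/bin/env bash
# A proper assignment: no whitespace around '='.
foo=bar
echo "foo is: $foo"

# With spaces, the first word is looked up as a command named
# 'no_such_cmd_xyz', and '=' and 'bar' become its arguments.
if no_such_cmd_xyz = bar 2>/dev/null; then
    status=0
else
    status=$?
fi
echo "exit status: $status"    # 127 means "command not found"

# 'name= command' runs the command with 'name' set to an empty string
# in that command's environment only; the shell's own variable is untouched.
greeting=hello
greeting= true
echo "greeting is still: $greeting"
```

So none of the three incorrect forms is a syntax error as such; they are simply parsed as commands rather than assignments.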

Okay, now the following does not actually give the variable an integer type, but depending on the operator it is used with, it will behave as an integer (we'll come to this later):

foo=3

Some might think that declare -i actually makes the variable hold an integer value; see below:

$ declare -i foo=bar

$ echo "$foo"
0

As you can see, the variable silently ends up as 0. With the integer attribute set, the assigned value is evaluated as an arithmetic expression, so bar is read as the name of a variable, and an unset variable evaluates to 0. This is pretty misleading, as one would expect an error/exception, and it can introduce some hard-to-catch bugs in your code. For this reason, I would suggest avoiding declare -i unless you're absolutely sure about what you're doing.
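
To see what declare -i is really doing, here is a minimal sketch. With the integer attribute, the assigned value goes through arithmetic evaluation, so a bare word is looked up as a variable. The helper is_integer is hypothetical (it is not a bash builtin), just one possible way to reject bad input explicitly:

```shell
#!/usr/bin/env bash
# The value is evaluated as an arithmetic expression,
# so 'bar' is read as the variable bar.
bar=7
declare -i count=bar
echo "$count"        # prints 7, the value of $bar

declare -i total=2+3
echo "$total"        # prints 5

# Hypothetical helper: succeed only for an optionally signed decimal integer.
is_integer() {
    [[ $1 =~ ^[+-]?[0-9]+$ ]]
}

for value in 42 -7 bar 4x; do
    if is_integer "$value"; then
        echo "$value: integer"
    else
        echo "$value: not an integer"
    fi
done
```

Checking the input yourself keeps the "not a number" case visible instead of turning it into a silent 0.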

So, how can we do arithmetic and string operations in bash? More importantly, how does bash differentiate between them when the variables are untyped?

The answer is: by using different syntaxes and operators for them. bash picks the appropriate context (string or arithmetic) from the operator when parsing the tokens.
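
As a quick sketch of that idea, the same two values can compare differently depending on the operator used. The variable names here are only for illustration:

```shell
#!/usr/bin/env bash
a=10
b=9

# Arithmetic operator inside [: operands are compared as integers.
if [ "$a" -gt "$b" ]; then numeric_test=greater; else numeric_test=not-greater; fi

# String operator inside [[: operands are compared lexicographically,
# and "10" sorts before "9".
if [[ $a > $b ]]; then string_test=greater; else string_test=not-greater; fi

# (( always evaluates in an arithmetic context.
if (( a > b )); then arith_test=greater; else arith_test=not-greater; fi

echo "[ -gt ]: $numeric_test"
echo "[[ > ]]: $string_test"
echo "(( > )): $arith_test"
```

The values never changed; only the operator decided whether they were treated as numbers or as strings.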

Let's go through the common operators and syntaxes one by one.

`[` or `test`:

It is a bash builtin, and it is defined by POSIX, so it is portable
[ is equivalent to test, except that [ requires a closing ] as its last argument
There is also an external [/test command (it comes with GNU coreutils and behaves similarly to the builtin)

For comparing integers, you can use the operators -eq, -ne, -gt, -lt, -ge (greater or equal), and -le (less or equal). The = operator also appears below, but note that it compares its operands as strings:

$ [ 6 -gt 5 ] && echo "True" || echo "False"
True

$ [ 6 -eq 5 ] && echo "True" || echo "False"
False

$ [ 6 = 5 ] && echo "True" || echo "False"
False

$ [ 5 -le 7 ] && echo "True" || echo "False"
True

So, bash checks the operator given to [; if it is an arithmetic operator, [ expects integers as operands and otherwise throws an error:

$ [ foo -eq 7 ]
bash: [: foo: integer expression expected

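Just to note, in bash and any other shell, the POSIX-defined operator for string equality inside [ is =, not ==. It compares its operands as strings, so it agrees with -eq only when the two integers are written identically.

The other comparison operators (such as != and the unary -z and -n) work on strings only.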
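
A small sketch of how = and -eq can disagree inside [, using a number written with a leading zero (x and y are just example names):

```shell
#!/usr/bin/env bash
x=05
y=5

# -eq converts both operands to integers before comparing.
if [ "$x" -eq "$y" ]; then int_equal=yes; else int_equal=no; fi

# = compares the two operands character by character.
if [ "$x" = "$y" ]; then str_equal=yes; else str_equal=no; fi

echo "-eq: $int_equal"
echo "=:   $str_equal"
```

So for numbers, -eq is the safer choice in [; = only tells you whether the two operands are spelled the same.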

`[[`:

It is a bash keyword
It was introduced to add functionality that [ was missing; [[ can be thought of as a superset of [, as it supports nearly everything [ does (with && and || taking the place of -a and -o) and adds much more

While you do not gain much for arithmetic operations, [[ is great for string operations, as it supports shell glob matching and regex (regular expression) matching out of the box:

$ bar="spamegg"

$ [ "$bar" = spam* ] && echo "Match" || echo "No match"
bash: [: too many arguments
No match

As you can see, we have set the variable bar to the value spamegg. [ does not do glob matching at all: the unquoted spam* is an ordinary argument, so the shell subjects it to pathname expansion before [ even runs. The error above appears when spam* expands to more than one file name in the current directory, so that [ receives too many arguments. If no file matches, [ simply compares "$bar" with the literal string spam*, and the result is still No match.

Note: The string No match still got printed because [ exited with a non-zero status (2, since this was an error); this happened due to the way we've chained the short-circuit (logical) operators. Here, echo "Match" would be executed if [ exits with status 0 -- note the && (logical AND); and echo "No match" is executed if either [ or echo "Match" fails -- note the || (logical OR) operator.
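
Because the || part also fires when the middle command fails, this chain is not a true if/else. A short sketch of the difference follows; check is a hypothetical function name used only for this example:

```shell
#!/usr/bin/env bash
# In 'A && B || C', C runs when A fails *or* when B fails.
# Here A (true) succeeds and B (false) fails, so the fallback still runs.
chain_result=""
true && false || chain_result="fallback ran"
echo "chain: $chain_result"

# if/else ties the else branch to the condition alone.
check() {
    if [ "$1" -gt 5 ]; then
        echo "True"
    else
        echo "False"
    fi
}

check 6
check 3
```

For one-liners at the prompt the chain is handy; in scripts, if/else is the more predictable form.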

Now, let's check if [[ can do Glob matching:

$ [[ "$bar" = spam* ]] && echo "Match" || echo "No match"
Match

Hmmm, it does. Note that we're using the same = operator here for glob matching too.

Note: = and == are interchangeable in [[
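
One detail worth a sketch: whether the right-hand side of = (or ==) in [[ acts as a pattern depends on quoting. The variable pattern is just an example name:

```shell
#!/usr/bin/env bash
bar="spamegg"

# Unquoted right-hand side: treated as a glob pattern.
if [[ $bar == spam* ]]; then unquoted=match; else unquoted=no-match; fi

# Quoted right-hand side: treated as a literal string, '*' included.
if [[ $bar == "spam*" ]]; then quoted=match; else quoted=no-match; fi

# A pattern kept in a variable still works as long as the expansion is unquoted.
pattern='*egg'
if [[ $bar == $pattern ]]; then from_var=match; else from_var=no-match; fi

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
echo "from var: $from_var"
```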

Let's see regex matching; the operator changes from = (or ==) to =~:

$ [[ "$bar" =~ ^spam ]] && echo "Match" || echo "No match"
Match

It does regex matching too.

As mentioned earlier, the arithmetic operators and operations are similar to those of plain [.
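
Beyond a yes/no answer, bash also stores what the regex matched in the BASH_REMATCH array (yes, an array, which I said I would not cover, but it is hard to skip here). A minimal sketch, with the regex kept in a variable named re purely by my own convention:

```shell
#!/usr/bin/env bash
bar="spamegg"

# Keeping the regex in a variable avoids quoting surprises;
# the expansion on the right of =~ must stay unquoted to act as a regex.
re='^(spam)(.*)$'

if [[ $bar =~ $re ]]; then
    whole=${BASH_REMATCH[0]}    # the entire matched text
    first=${BASH_REMATCH[1]}    # first parenthesized group
    rest=${BASH_REMATCH[2]}     # second parenthesized group
    echo "whole=$whole first=$first rest=$rest"
fi
```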

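Note: You don't strictly need to quote variables/parameters inside [[, unlike in [, because [[ performs no word splitting or pathname expansion on them (quoting the right-hand side of = does matter, though: it turns a pattern into a literal string):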

$ spam='egg foo'

$ [ $spam = 'egg foo' ] && echo "True" || echo "False"
bash: [: too many arguments
False

$ [ "$spam" = 'egg foo' ] && echo "True" || echo "False"
True

$ [[ $spam = 'egg foo' ]] && echo "True" || echo "False"
True
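
The same difference shows up with an empty variable, not just one containing spaces. A small sketch (empty is just an example name):

```shell
#!/usr/bin/env bash
empty=""

# Unquoted in [: the empty expansion vanishes, leaving '[ = "" ]',
# which bash reports as an error (exit status 2) instead of comparing.
if [ $empty = "" ] 2>/dev/null; then single=true; else single="status $?"; fi

# Quoted in [: the empty string survives as an argument.
if [ "$empty" = "" ]; then single_quoted=true; else single_quoted=false; fi

# Unquoted in [[: no word splitting, so the empty value is kept.
if [[ $empty = "" ]]; then double=true; else double=false; fi

echo "[ unquoted:  $single"
echo "[ quoted:    $single_quoted"
echo "[[ unquoted: $double"
```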

`((`:

Used for arithmetic operations only; it should not be used for string operations
Supports operator syntax like > for -gt, >= for -ge, and the like
The equality operator is ==, not =; inside ((, a single = is an assignment

Here is why you should not use (( on strings:

$ (( foo =  foo )) && echo "True" || echo "False"
False

$ (( foo == foo )) && echo "True" || echo "False"
True

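Note how just adding one = inside (( changes the whole thing. Inside ((, a bare word like foo is treated as a variable name, and here it evaluates to 0 (as an unset variable does). So foo = foo is an assignment whose value is 0, and a zero result makes (( exit with status 1 (False); foo == foo compares 0 with 0, which is true.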
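
Here is a small sketch that makes the assignment visible. The variable n is only an example, and the exit status of (( follows the value of the expression:

```shell
#!/usr/bin/env bash
unset n

# '=' assigns; a non-zero result gives status 0 (success),
# a zero result gives status 1 (failure).
if (( n = 5 )); then after_five="true, n=$n"; else after_five="false, n=$n"; fi
if (( n = 0 )); then after_zero="true, n=$n"; else after_zero="false, n=$n"; fi

# '==' only compares and leaves the variable alone.
n=5
if (( n == 5 )); then compared="true, n=$n"; else compared="false, n=$n"; fi

echo "n = 5  -> $after_five"
echo "n = 0  -> $after_zero"
echo "n == 5 -> $compared"
```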

Now, arithmetic operations:

$ spam=5

$ (( spam >= 2 )) && echo "True" || echo "False"
True

$ (( spam == 2 )) && echo "True" || echo "False"
False

$ (( spam < 2 )) && echo "True" || echo "False"
False

Notice that I have not used $ before the variable name here, because you don't need it inside ((. Both (( spam >= 2 )) and (( $spam >= 2 )) are syntactically correct, and they are equivalent as long as spam holds a number.

Note: You can't use -gt, -eq, and the like inside ((:
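
The two spellings do part ways when the variable is empty. A hedged sketch, assuming a non-interactive bash script (empty is an example name):

```shell
#!/usr/bin/env bash
empty=""

# Without '$', (( looks the name up itself and treats an empty value as 0.
if (( empty == 0 )); then bare=true; else bare=false; fi

# With '$', the shell expands first, leaving '((  == 0 ))',
# which is a syntax error inside the expression and counts as failure.
if (( $empty == 0 )) 2>/dev/null; then expanded=true; else expanded=false; fi

echo "without \$: $bare"
echo "with \$:    $expanded"
```

This is one more reason to prefer the bare name inside ((.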

$ (( 5 -eq 3 ))
bash: ((: 5 -eq 3 : syntax error in expression (error token is "3 ")

That's that!

One final note: use only [ when you want your script to be portable (e.g. it will be run as /bin/sh and you don't know where sh is symlinked to), as it is defined by POSIX and should be available in all Bourne-like shells. And if you're sure that the script will only be run with bash, you can (and you should) leverage the great features that [[ and (( provide.
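
If you do need to stay portable, here is a rough sketch of how the earlier examples could be written without [[ or ((. It uses case for glob matching, which is available in any POSIX shell; the variable names are the ones used above:

```shell
#!/bin/sh
bar="spamegg"

# Prefix match without [[: 'case' does glob matching in any POSIX shell.
case $bar in
    spam*) prefix="Match" ;;
    *)     prefix="No match" ;;
esac
echo "$prefix"

# Integer and string checks with plain [ and quoted expansions.
spam=5
if [ "$spam" -ge 2 ] && [ -n "$bar" ]; then
    checks="passed"
else
    checks="failed"
fi
echo "$checks"
```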

Thanks!

Readul Hasan Chayan [Heemayl]
