Shell scripting | Variables & Functions
Once we've learnt how to freely manipulate directories, files and data, it's time to give more power to the command line by storing our values in variables and organizing our commands with functions.
In this guide we'll take a look at variables, arrays and functions inside shell scripting.
Some languages need to explicitly allocate and free memory for variables, structures and so on. Shell scripting handles this under the hood and in most cases we don't have to clean anything before leaving a function.
Exiting a script as soon as its work is done, or returning early from a function, avoids executing unnecessary code, which can reduce system load and improve performance.
Variables
In every programming language, variables store data and configuration options, and allow us to manage and control actions inside a script. Variables are quite easy to use but they are also quite easy to get ourselves into trouble with.
- There are three basic rules in naming variables:
- Variables have to be composed of alphanumeric characters and underscores.
- The first character of a variable cannot be a number.
- Spaces and punctuation symbols are not allowed.
message="Hi there"
balance=48
- If a variable is unset or empty, for example after a failed assignment or missing user input, we can give it a default value with the := operator inside a parameter expansion:
${var:=default_value}
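As a minimal sketch of the default-value expansion, the snippet below runs it through the no-op command : (colon), so the shell performs the assignment without trying to execute the result. The variable name editor and its values are assumptions made up for the example.

```shell
#!/bin/sh
# 'editor' is a hypothetical variable used only for illustration.
unset editor

# ':' is a no-op command; it only gives the expansion a place to run.
: "${editor:=vi}"
printf '%s\n' "editor = $editor"    # editor was unset, so it is now "vi"

editor="nano"
: "${editor:=vi}"
printf '%s\n' "editor = $editor"    # already set, so "nano" is kept
```

The default is only assigned when the variable is unset or empty; an existing value is left alone.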
Two essential capabilities when working in a terminal-based environment are reading input from the user and printing information to the screen.
- With the read command we can get keyboard input into the script:
read [options] [variable/s]
where options are the following (only -r is specified by POSIX; the rest are extensions found in Bash):
-a  assigns the input to an array, starting at index zero.
-e  uses readline to handle input, behaving like the command line.
-n  reads num characters rather than the entire line.
-p  displays a prompt or message before the input field.
-r  doesn't interpret backslash characters as escapes.
-s  doesn't echo characters on the screen. Also called silent mode, it's useful when asking for passwords.
-t  terminates input after n seconds and returns a non-zero exit status if timed out.
and variable/s define where to store the input data. We can set more than one variable in a read command:
read input_a input_b input_c input_d
printf "%s\n" "input_a = $input_a"
printf "%s\n" "input_b = $input_b"
printf "%s\n" "input_c = $input_c"
printf "%s\n" "input_d = $input_d"
# typing 1 2 3 4 as input prints:
input_a = 1
input_b = 2
input_c = 3
input_d = 4
If we don't name any variable to store the input, the command puts the whole line into a default shell variable named REPLY:
read
printf "%s\n" "reply = $REPLY"
# typing 1 2 3 4 as input prints:
reply = 1 2 3 4
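When the input has more words than there are variables, the last variable receives everything that is left. The sketch below illustrates this; a here-document stands in for the keyboard so the example can run unattended, and the variable names and input words are assumptions.

```shell
#!/bin/sh
# The here-document replaces keyboard input for this illustration.
read -r first second rest <<EOF
cpu gpu ram disk
EOF

printf '%s\n' "first = $first"     # first = cpu
printf '%s\n' "second = $second"   # second = gpu
printf '%s\n' "rest = $rest"       # rest = ram disk
```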
- With printf(1) we can print to the screen in almost the same way as in C:
read -p "Enter your user name: " user_name
printf "%s\n" "Welcome aboard, ${user_name}"
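To illustrate the C-like format strings, here is a small sketch with a few common conversions. The values are made up; the width and padding modifiers work as they do in C.

```shell
#!/bin/sh
# Hypothetical values used only for illustration.
item="cpu"
count=7

# %-8s left-aligns a string in 8 columns, %4d right-aligns an integer in 4.
line=$(printf '%-8s|%4d|' "$item" "$count")
printf '%s\n' "$line"

# %03d pads the integer with zeros up to 3 digits.
padded=$(printf '%03d' "$count")
printf '%s\n' "$padded"

# With more arguments than conversions, the format is reused for each group.
printf '%s=%s\n' user mike shell sh
```

Unlike echo, printf never adds a newline on its own, which is why the formats that print a full line end in \n.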
Variables in Shell scripting have some peculiarities:
- Variables are case-sensitive.
- Variables don't need to be identified by type.
- There are no spaces around the = sign. The Shell won't treat the line as a variable assignment if we add spaces around it.
- Variables don't need to be declared. The first time the Shell sees an assignment to a variable, it creates it automatically.
- To use the value of a previously assigned variable, we add the $ sign before the variable's name.
- The Shell has some built-in internal variables, which we can create or modify too. By convention those variables are written in uppercase.
- Enclosing the variable name in braces, as in ${var}, avoids any ambiguity about where the name ends.
- Arithmetic calculation with integers is available through shell variables using the following format:
$(( expression ))
Where expression can take the following operators:
+ - * / % ++ -- **
- We've seen that we don't need to declare variable types. In Bash we can still mark a variable as an integer, so that any value assigned to it is evaluated as an arithmetic expression:
declare -i x=5
To work with floating-point values we need to delegate our arithmetic to external tools like bc(1) or awk(1), since POSIX shell arithmetic and expr(1) only handle integers.
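As a rough sketch of integer arithmetic, the snippet below sticks to the operators that POSIX guarantees. The variable names and values are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical values used only for illustration.
width=7
height=3

area=$(( width * height ))        # 21
quotient=$(( width / height ))    # 2, integer division drops the fraction
remainder=$(( width % height ))   # 1

printf '%s\n' "area = $area" "quotient = $quotient" "remainder = $remainder"
```

Inside the double parentheses, variable names can be written without the $ sign.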
- Variables can be marked as readonly using the following syntax:
readonly var_name=value
- Variables can be global or local. By default every variable is global within the script, even when it is assigned inside a function; other processes don't see it unless it is exported.
To make a variable local to a function (independent from the global scope and only accessible inside that function) we can declare it with the local keyword.
prompt="welcome"
function foo () {
    # this prompt only exists while foo is running
    local prompt="well, hello there!"
}
foo
echo "$prompt"
This example will output "welcome": the variable inside the function has the same name, but since it's declared as local it doesn't touch the global one.
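A few of the peculiarities listed above can be checked with a short sketch: case sensitivity, braces around the name, and readonly. All names and values are hypothetical. The reassignment is attempted in a subshell because, depending on the shell, a failed assignment to a readonly variable can abort the script.

```shell
#!/bin/sh
# All names below are hypothetical, chosen only for illustration.
name="lower"
NAME="upper"                 # a different variable: names are case-sensitive

file="report"
with_braces="${file}_v2"     # expands $file, then appends "_v2"
without_braces="$file_v2"    # expands the unset variable file_v2, so it is empty

readonly max_retries=3
# Try the reassignment in a subshell so a failure can't stop this script.
if ( max_retries=5 ) 2>/dev/null; then
    reassign="allowed"
else
    reassign="rejected"
fi

printf '%s\n' "$name / $NAME" "$with_braces" "[$without_braces]" "$reassign"
```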
Arrays
Arrays are variables that have the ability to hold more than one value at a time. They have elements that behave like cells, and each of them stores data that can be accessed via an index.
Arrays are not available in POSIX-compliant shells; however, we can use delimited strings as a hack to simulate them. The delimiter is up to us, as long as we keep track of it throughout the script.
In order to do so, we have a built-in special variable named IFS (internal field separator). Its default value is whitespace: space, tab, and newline.
Since changing this variable affects how the shell splits words for the rest of the script, we need to store its current value in another variable before modifying it, and then restore it once we're done.
#!/bin/sh
# save current IFS
OLD_IFS=$IFS
# set new IFS
IFS=":"
# do something with the IFS
# restore IFS
IFS=$OLD_IFS
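The save, set and restore steps above can be sketched in a POSIX shell as follows. This is only one possible approach: it assumes a colon-delimited string (the paths value is made up) and borrows the positional parameters as a stand-in for an array, which overwrites any arguments the script itself received.

```shell
#!/bin/sh
# Hypothetical colon-delimited list used only for illustration.
paths="/usr/bin:/bin:/usr/local/bin"

OLD_IFS=$IFS        # save current IFS
IFS=":"             # split on colons instead of whitespace
set -f              # disable filename globbing while splitting
# Unquoted on purpose: word splitting on IFS fills $1, $2, $3...
set -- $paths
set +f              # re-enable globbing
IFS=$OLD_IFS        # restore IFS

count=$#
second=$2
for dir in "$@"; do
    printf '%s\n' "$dir"
done
printf '%s\n' "$count items, the second one is $second"
```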
Let's say we have a list of items stored in a string: "cpu gpu ram". We can split the string into array items using the IFS variable and a little help from the read command.
The read command will read the string as if it were a file, using the -a flag to store the words in an array and the -r flag to prevent backslashes from being processed as escape characters. Both -a and the <<< here-string are Bash features, so this script needs Bash rather than a plain POSIX shell.
#!/bin/bash
str_array="cpu gpu ram"
IFS=" " read -ra items <<< "$str_array"
Now we can access the array items using the items variable, with a for loop for example:
for i in "${items[@]}"; do
    printf "%s\n" "$i"
done
Functions
Writing down each action line in a script file is fine, but once we start collecting several lines of code, or want to reuse some functionality in another process, we have to think of a way to reuse code and keep it readable and manageable. Functions allow us to organize logic into blocks that we can manage and reuse comfortably.
A basic function structure looks like this:
ask_user() {
    printf "%s" "greetings, please type your user name: "
    read user
    printf "%s\n" "$user is your current user name"
}
However, if our function is small enough to fit on one line we can write it that way, remembering that in one-line functions there must be a space after the opening brace and every command, including the last one, needs to end with a semicolon:
ask_user() { read -p "type your name: " user; echo "Hi, $user"; }
Now each time we want to execute the code inside a function we only need to call the function by its name, without any decorations:
ask_user
Functions need to be created before we can call them to be executed.
- We can pass arguments to a function by adding them after the function name when calling it.
Inside the function, the arguments are available as $1, $2 ... $n.
Those arguments can come from values inside our code or from the arguments the user types when running the script.
greet_user() {
    printf "%s\n" "greetings, $1"
}
# forward the script's first argument to the function
greet_user "$1"
In this first example we need to type an argument when running the script:
# output
$ ./greet_user.sh "$USER"
greetings, Mike
Now let's use a value inside the code to act as an argument for our function:
active_user=$USER
greet_user() {
    printf "%s\n" "greetings, $1"
}
greet_user "$active_user"
This way we only need to run our script without typing any extra argument:
# output
$ ./greet_user.sh
greetings, Mike
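Besides $1 ... $n, a function can ask how many arguments it received with $# and walk over all of them with "$@". The sketch below assumes a hypothetical greet_all function and made-up names; quoting "$@" keeps an argument that contains spaces in one piece.

```shell
#!/bin/sh
# greet_all is a hypothetical function used only for illustration.
greet_all() {
    printf '%s\n' "received $# argument(s)"
    # "$@" expands to every argument, each one kept as a separate word.
    for name in "$@"; do
        printf '%s\n' "greetings, $name"
    done
}

greet_all "Ada" "Grace Hopper"
```

Called this way, the function reports two arguments and greets "Grace Hopper" as a single name.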
- We can also return values from a function in a few ways:
- Change the state of one or more variables.
- Print output to stdout.
- Run the exit command to end the script.
- Run the return command to end the function and optionally return a numeric exit status.
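Three of these options can be sketched side by side: changing a global variable, printing to stdout and capturing the text with command substitution, and using return to hand back an exit status. The function names and values are hypothetical.

```shell
#!/bin/sh
# All function names below are hypothetical, chosen for illustration.

# 1. Change the state of a global variable.
counter=0
increment() {
    counter=$(( counter + 1 ))
}

# 2. Print to stdout; the caller captures it with $( ... ).
to_upper() {
    printf '%s' "$1" | tr '[:lower:]' '[:upper:]'
}

# 3. Return an exit status: 0 means success (true), non-zero means failure.
is_even() {
    if [ $(( $1 % 2 )) -eq 0 ]; then
        return 0
    fi
    return 1
}

increment
increment
shout=$(to_upper "hello")
if is_even 4; then parity="even"; else parity="odd"; fi

printf '%s\n' "counter = $counter" "shout = $shout" "parity = $parity"
```

Note that return can only carry a small integer status, so text or larger results are better passed back through stdout or a variable.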
Summing up
While Shell scripting has some limits compared with modern scripting languages, it's easy to use and covers almost every system management need. It also matters in places like servers, where the only tools available may be a command-line text editor and a shell.