Shell scripting | Variables & Functions

Index

Variables
Arrays
Functions
Summing up

Once we've learnt how to freely manipulate directories, files and data, it's time to give more power to the command line by storing our values in variables and organizing our commands with functions.

In this guide we'll take a look at variables, arrays and functions inside shell scripting.

Some languages need to explicitly allocate and free memory for variables, structures and so on. Shell scripting handles this under the hood and in most cases we don't have to clean anything before leaving a function.

Exiting a script or returning from a function as soon as its work is done avoids executing unnecessary code, which can reduce system load and improve performance.

Variables

In every programming language, variables store data and configuration options, and allow us to manage and control actions inside a script. Variables are quite easy to use but they are also quite easy to get ourselves into trouble with.

There are three basic rules in naming variables:
- Variables have to be composed of alphanumeric characters and underscore characters.
- The first character of a variable cannot be a number.
- Spaces and punctuation symbols are not allowed.

message="Hi there"
balance=48

If a variable is unset or empty, for example after a failed assignment or empty user input, we can assign it a default value with the := parameter expansion:

${var:=default_value}
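As a minimal sketch of the := expansion, the snippet below uses a hypothetical variable named editor (not part of the original text). The leading : is the shell's no-op command, used here so the expansion is evaluated without trying to run its result as a command.

```shell
#!/bin/sh
# 'editor' is a hypothetical variable name used only for illustration.
unset editor
: "${editor:=vi}"          # editor is unset, so it is assigned "vi"
first_value=$editor
printf "%s\n" "editor = $editor"

editor="nano"
: "${editor:=vi}"          # editor already holds a value, so it is left untouched
printf "%s\n" "editor = $editor"
```

The first printf shows vi and the second shows nano, since the default is only applied when the variable is unset or empty.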

Two essential functionalities when working in a terminal-based environment are reading input from the user and printing information to the screen.

With the read command we can get keyboard input into the script:

read [options] [variable/s]

where the options are the following (of these, POSIX only specifies -r; the rest are extensions found in Bash):

-a assigns the input words to an array, starting at index zero.
-e uses readline to handle input, so editing behaves like on the command line.
-n num reads num characters rather than the entire line.
-p prompt displays a prompt or message before the input field.
-r doesn't interpret backslash characters as escapes.
-s doesn't echo characters to the screen. Also called silent mode, it's useful when asking for passwords.
-t n terminates input after n seconds and returns a non-zero exit status if timed out.

and variable/s define where to store the input data. We can set more than one variable in a read command:

read input_a input_b input_c input_d
printf "%s\n" "input_a = $input_a"
printf "%s\n" "input_b = $input_b"
printf "%s\n" "input_c = $input_c"
printf "%s\n" "input_d = $input_d"

# input typed by the user
1 2 3 4
# output
input_a = 1
input_b = 2
input_c = 3
input_d = 4

If we don't give read any variable name, the command stores the whole line in a default shell variable named REPLY:

read
printf "%s\n" "reply = $REPLY"

# input typed by the user
1 2 3 4
# output
reply = 1 2 3 4
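When the line holds more words than there are variables, the last variable receives the rest of the line. The sketch below illustrates this with hypothetical variable names (first, second, rest) and a here-document standing in for keyboard input, so it can run unattended.

```shell
#!/bin/sh
# A here-document replaces the keyboard so the example needs no interaction.
# 'first', 'second' and 'rest' are hypothetical names chosen for illustration.
read -r first second rest <<EOF
1 2 3 4
EOF

printf "%s\n" "first = $first"
printf "%s\n" "second = $second"
printf "%s\n" "rest = $rest"      # holds the remaining words: 3 4
```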

With printf(1) we can print to the screen in almost the same way as in C programming.

read -p "Enter your user name: " user_name
printf "%s\n" "Welcome aboard, ${user_name}"
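To illustrate the C-like format string, here is a small sketch with a few common conversions: %s for strings, %d for integers, plus width and zero-padding modifiers. The values are hard-coded assumptions instead of user input so the snippet runs on its own.

```shell
#!/bin/sh
# Hypothetical values; in a real script they might come from read.
user_name="mike"
balance=48

# %-8s : string, left-aligned in 8 columns
# %5d  : integer, right-aligned in 5 columns
# %03d : integer, zero-padded to 3 digits
line=$(printf "%-8s|%5d|%03d" "$user_name" "$balance" 7)
printf "%s\n" "$line"
```

Keeping the data in arguments and the layout in the format string also avoids surprises when the data itself contains a % sign.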

Variables in Shell scripting have some peculiarities:

Variables are case-sensitive.
Variables don't need to be identified by type.
There are no spaces around the = sign. If we add spaces, the Shell will not read the line as a variable assignment; it will try to run the variable's name as a command.
Variables don't need to be declared beforehand. The Shell creates a variable automatically the first time we assign a value to it.
In order to read the value of a variable, we need to add the $ sign before the variable's name.
The Shell has some built-in internal variables, written in uppercase by convention. We can create or modify them too.
Enclosing the variable's name between braces, as in ${var}, avoids any ambiguity about where the name ends.
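The point about delimiting the variable's name is easier to see with a small sketch. The variable file is a hypothetical name; the example assumes no variable called file_backup exists.

```shell
#!/bin/sh
# 'file' is a hypothetical variable used only for illustration.
file="report"

# Without delimiters the shell looks up a variable named file_backup,
# which is unset here, so the result is an empty string.
without_delimiters="$file_backup"

# With delimiters the name ends at the closing brace and _backup is literal text.
with_delimiters="${file}_backup"

printf "%s\n" "without: '$without_delimiters'"
printf "%s\n" "with: '$with_delimiters'"
```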

Arithmetic calculation with integers is available through shell variables using the following format:

$(( expression ))

Where expression can take the following operators:

- + / * % ++ -- **
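A short sketch of integer arithmetic follows. It deliberately sticks to the operators that arithmetic expansion supports in any POSIX shell; ++, -- and ** work in Bash but are not guaranteed elsewhere. The variable names are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical values used only for illustration.
balance=48
fee=5

remaining=$(( balance - fee ))    # subtraction
doubled=$(( balance * 2 ))        # multiplication
share=$(( balance / fee ))        # integer division, the fraction is discarded
left_over=$(( balance % fee ))    # remainder of the division

printf "%s\n" "remaining = $remaining"
printf "%s\n" "doubled = $doubled"
printf "%s\n" "share = $share"
printf "%s\n" "left_over = $left_over"
```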

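We've seen before that we don't need to declare variable types. In Bash we can still give a variable the integer attribute with declare -i, so that any value assigned to it is evaluated as an arithmetic expression. This is optional, since arithmetic expansion works with undeclared variables too.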

declare -i x=5

To work with floating-point values we need to delegate our arithmetic operations to external tools like bc(1) or awk(1), since arithmetic in Bash and POSIX shells, like expr(1), only handles integers.
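As a hedged sketch of delegating a decimal calculation to an external tool, the snippet below uses awk(1), chosen here only because it is present on most systems; bc(1) would serve the same purpose where it is installed. The variable names are assumptions.

```shell
#!/bin/sh
# Hypothetical values used only for illustration.
balance=48
fee=5

# The shell's own arithmetic would truncate 48 / 5 to 9.
# awk receives the values through -v and does the floating-point division.
# LC_ALL=C keeps the decimal separator as a dot regardless of the locale.
ratio=$(LC_ALL=C awk -v a="$balance" -v b="$fee" 'BEGIN { printf "%.2f", a / b }')

printf "%s\n" "ratio = $ratio"
```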

Variables can be marked as readonly using the following syntax:

readonly var_name=value
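The effect of readonly can be sketched as follows. The name max_retries is hypothetical. The reassignment is attempted inside a subshell because some shells abort the whole script when a readonly variable is assigned, and the subshell keeps that failure contained.

```shell
#!/bin/sh
# 'max_retries' is a hypothetical name used only for illustration.
readonly max_retries=3

# Try to overwrite it in a subshell; the attempt fails with a non-zero status.
if ( max_retries=5 ) 2>/dev/null; then
  status="changed"
else
  status="rejected"
fi

printf "%s\n" "assignment was $status, max_retries = $max_retries"
```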

Variables can be global or local. By default every variable is global within the script, even if it's assigned inside a function. A variable is not visible to other programs launched from the script unless we export it.

To make a variable local to a function (independent from the global scope and only accessible by that function) we can declare it with the local keyword.

prompt="welcome"

foo() {
  local prompt="well, hello there!"
}

# call the function, then print the global variable
foo
echo "$prompt"

This example will output “welcome” since the variable inside the function, although it has the same name, is declared as local.
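For contrast, the following sketch shows a function that changes a global variable next to one that works on a local copy. The function and variable names are hypothetical, and local, while not part of POSIX, is assumed to be available as it is in Bash and most sh implementations.

```shell
#!/bin/sh
# Hypothetical names used only for illustration.
counter=0

bump_global() {
  counter=$(( counter + 1 ))      # no 'local': this changes the global counter
}

bump_local() {
  local counter=100               # separate variable, visible only in this function
  counter=$(( counter + 1 ))
}

bump_global
bump_local
printf "%s\n" "counter = $counter"
```

Only bump_global affects the value printed at the end, so the output is counter = 1.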

Arrays

Arrays are variables that have the ability to hold more than one value at a time. They have elements that behave like cells, and each of them stores data that can be accessed via an index.

Arrays are not available in POSIX-compliant shells; however, we can use delimited strings as a workaround to simulate them. The delimiter to use is up to us as long as we keep track of it throughout the script.

In order to do so, we have a built-in special variable named IFS (internal field separator). Its default value is usually set to whitespace characters like spaces, tabs, and new lines.

IFS affects how the shell splits words everywhere in the script, so we need to store its current value in a variable before modifying it, and then restore it as soon as we're done with the custom delimiter.

#!/bin/sh

# save current IFS
OLD_IFS=$IFS

# set new IFS
IFS=":"

# do something with the IFS

# restore IFS
IFS=$OLD_IFS
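One possible way to fill in the "do something" step while staying within POSIX sh is to split the delimited string into the positional parameters with set --. This is a sketch under that assumption; the string parts and the other variable names are hypothetical.

```shell
#!/bin/sh
# Hypothetical colon-delimited string acting as a simulated array.
parts="cpu:gpu:ram"

OLD_IFS=$IFS          # save current IFS
IFS=":"               # split on colons
set -f                # disable filename expansion while the string is unquoted
set -- $parts         # the items become $1, $2, $3
set +f
IFS=$OLD_IFS          # restore IFS

count=$#
second=$2
for item in "$@"; do
  printf "%s\n" "$item"
done
```

Note that set -- replaces the script's own positional parameters, so any command-line arguments still needed should be saved beforehand.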

Let's say we have a string of items as follows: "cpu gpu ram". In Bash we can split the string into array items using the IFS variable and a little help from the command read(1).

The read(1) command will read the string as if it were a file, thanks to the <<< here-string, using the -a flag to store the words in an array and the -r flag to prevent backslashes from being processed as escape characters. Both -a and <<< are Bash extensions, so this script runs under bash rather than a plain POSIX sh.

#!/bin/bash

str_array="cpu gpu ram"
IFS=" " read -ra items <<< "$str_array"

Now we can access the array items using the items variable, with a for loop for example:

for i in "${items[@]}"; do
  printf "%s\n" "$i"
done
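Besides looping, a Bash array can be indexed, counted and extended. The sketch below assumes Bash, since none of this syntax exists in POSIX sh, and reuses the items array from the text; the extra element "ssd" is a hypothetical addition.

```shell
#!/bin/bash
str_array="cpu gpu ram"
IFS=" " read -ra items <<< "$str_array"

first=${items[0]}        # indexes start at zero
count=${#items[@]}       # number of elements
items+=("ssd")           # append a hypothetical extra element
last=${items[3]}

printf "%s\n" "first = $first"
printf "%s\n" "count before append = $count"
printf "%s\n" "last = $last"
```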

Functions

Writing each action line by line in a script file is fine, but once the script grows to several lines of code, or we want to reuse some functionality in another process, we need a way to keep the code readable and manageable. Functions allow us to organize logic into blocks that we can manage and reuse comfortably.

A basic function structure looks like this:

ask_user() {
    printf "%s" "greetings, please type your user name: "
    read user
    printf "%s\n" "$user is your current user name"
}

However, if our function is small enough to fit in one line we can write it that way, remembering that in one-line functions every command needs to end with a semicolon and the opening brace must be followed by a space:

ask_user() { read -p "type your name: " user; echo "Hi, $user"; }

Now each time we want to execute the code inside a function we only need to call the function by its name, without any decorations:

ask_user

Functions need to be created before we can call them to be executed.
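The ordering requirement can be sketched as follows: the shell reads the script from top to bottom, so a call placed before the definition fails as an unknown command. The function name greet_later is hypothetical.

```shell
#!/bin/sh
# Calling the function before it is defined: the shell doesn't know the name yet.
if greet_later 2>/dev/null; then
  early="found"
else
  early="not found"
fi

# 'greet_later' is a hypothetical function used only for illustration.
greet_later() {
  printf "%s\n" "hello"
}

# After the definition, the same call works.
late=$(greet_later)

printf "%s\n" "before definition: $early"
printf "%s\n" "after definition: $late"
```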

We can pass arguments to functions inside shell scripting by adding them after the function's name when calling it.

To access the arguments inside the function, follow the scheme $1 $2 ... $n.

Those arguments can be supplied either from inside our code or from the arguments the user types when running the script.
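Alongside $1, $2 and so on, a function can also inspect $# (the number of arguments) and "$@" (all of them, each kept as a separate word). A minimal sketch, with a hypothetical function named describe:

```shell
#!/bin/sh
# 'describe' is a hypothetical function used only for illustration.
describe() {
  printf "%s\n" "received $# arguments"
  printf "%s\n" "first: $1"
  # "$@" keeps each argument intact, even when it contains spaces.
  for arg in "$@"; do
    printf "%s\n" "arg: $arg"
  done
}

describe "cpu" "gpu ram"
```

Quoting matters here: "gpu ram" reaches the function as a single argument, so the function reports two arguments rather than three.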

greet_user() {
   printf "%s\n" "greetings, $1"
}

greet_user "$1"

In this first example the function receives the script's first argument, so we need to type an argument when running the script:

# output
$ ./greet_user.sh "$USER"
greetings, Mike

Now let's use a value inside the code to act as an argument for our function:

active_user=$USER

greet_user() {
   printf "%s\n" "greetings, $1"
}

greet_user "$active_user"

This way we only need to run our script without typing any extra argument:

# output
$ ./greet_user.sh
greetings, Mike

We can return values from a function too, in a few ways:

Change the state of one or more variables.
Print output to stdout.
Run the exit command to end the whole script.
Run the return command to end the function and optionally return an exit status (an integer from 0 to 255).
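Three of these approaches can be sketched side by side: changing a variable, printing to stdout and capturing it with command substitution, and using return to report a status. All function and variable names below are hypothetical. The exit command is left out because it would end the example itself.

```shell
#!/bin/sh
# Hypothetical names used only for illustration.
last_result=""

# 1. Change the state of a global variable.
set_result() {
  last_result="done: $1"
}

# 2. Print the value to stdout so the caller can capture it.
double() {
  printf "%s\n" "$(( $1 * 2 ))"
}

# 3. Return an exit status: 0 means success (true), non-zero means failure.
is_even() {
  if [ "$(( $1 % 2 ))" -eq 0 ]; then
    return 0
  fi
  return 1
}

set_result "backup"
doubled=$(double 21)
if is_even 4; then parity="even"; else parity="odd"; fi

printf "%s\n" "last_result = $last_result"
printf "%s\n" "doubled = $doubled"
printf "%s\n" "parity = $parity"
```

Since return only carries a small integer status, printing to stdout is the usual choice when a function has to hand back text or larger numbers.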

Summing up

While Shell scripting has some limits compared with other modern scripting languages, it's easy to use and it covers almost every system management need. Besides, in places like servers we may find that the only tools available to work with are a command-line text editor and a shell.
