Shell scripting | Directories, files & data

Index

Create directories and files
Working with files and directories
Working with files' data
Working with commands' data

The command line offers a rich set of tools that allows us to perform almost any action on the computer without leaving the keyboard.

Although the best way to learn a subject is by practicing it, before we start with full shell scripting we need to get familiar with the command line.

We are going to use pure sh. Your computer may have a different default shell, such as bash. Please note that most scripts written for sh will work in bash, but not the other way around.

To know which available shells you have, type the following line on your terminal emulator:

$ cat /etc/shells

Create directories and files

We will focus on working with text data, since text is what the shell understands and, following the Unix philosophy, everything is a file. Previously we saw that mkdir(1) creates directories to store files, and touch(1) creates files to store data.

Files and directories starting with a dot will be hidden by default.

Create directories

$ mkdir .scripts

The example above will create a directory named .scripts in our current working directory. We can also target other directories by typing the full path. The parent directories in that path must already exist.

$ mkdir /other/full/path/my_new_dir
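
As a minimal sketch of the same step inside a script, the -p flag asks mkdir(1) to create any missing parent directories and to stay quiet if the directory is already there. The variable name base is only illustrative, and the example assumes mktemp(1) is available to provide a scratch location.

```shell
# Scratch location so the example does not touch real directories
# (assumption: mktemp is installed).
base=$(mktemp -d)

# Without -p, mkdir fails when a parent directory is missing.
mkdir -p "$base/other/full/path/my_new_dir"

# Running it again is harmless: -p ignores directories that already exist.
mkdir -p "$base/other/full/path/my_new_dir"

[ -d "$base/other/full/path/my_new_dir" ] && printf '%s\n' "directory created"
```

Quoting "$base/..." keeps the command working even if the path contains spaces.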

Create files

touch(1) works similarly to mkdir(1); if no full path is typed, it creates the file in our current working directory.

$ touch my_script

You can add the extension .sh to your script, but for shell scripts it isn't mandatory. You can create any other kind of file the same way: just type the desired extension after the name.

You can also create a file by calling vim(1) with the name of a non-existing file. The file is created once we save the buffer by typing :w inside vim.

$ vim my_script

Working with files and directories

If we already have existing directories or files that we want to use, we also have the ability to move them freely around the system.

locate(1) finds a file if we know its name but we don't remember where it's placed.

$ locate test_file.txt

pwd(1) prints the current working directory.
ls(1) lists the files inside a directory. Another option is to use printf "%s\n" *. For scripting purposes it is better not to use ls(1).

# both commands list the directory's contents (printf shows each name with the .scripts/ prefix)
$ ls .scripts/
$ printf "%s\n" .scripts/*

The ls(1) tool can take several flags to display more or less information about the files in the selected directory. -lh, for example, lists details about the files in a human-readable format.

mv(1) can move a file from one location to another, and it can rename files too.
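
One reason scripts tend to avoid ls(1) is that its output is hard to split reliably when names contain spaces. A minimal sketch of the alternative, looping over a glob pattern, could look like this; the file names are made up for the example and mktemp(1) is assumed to be available.

```shell
dir=$(mktemp -d)                      # scratch directory (assumes mktemp exists)
touch "$dir/a.sh" "$dir/b file.sh"    # the second name contains a space on purpose

count=0
for f in "$dir"/*; do
    # If the directory is empty the pattern stays unexpanded, so skip it.
    [ -e "$f" ] || continue
    count=$((count + 1))
    printf '%s\n' "${f##*/}"          # strip the directory part, leaving only the name
done
printf 'found %d entries\n' "$count"
rm -rf "$dir"
```

Each name arrives in $f as a single word, spaces included, which is what parsing the output of ls(1) cannot guarantee.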

# move the file to another location
$ mv /original/path/file /destination/path/

# change the file's name
$ mv /original/path/old_file_name /original/path/new_file_name

cp(1) copies a file into another location. Using the -r flag it can copy directories too.

# copy a file into another location
$ cp /original/path/file /destination/path/

# copy a directory into another location
$ cp -r /original/path/dir/ /destination/path/dir/

rm(1) removes a file. The -r flag makes the deletion recursive, which is useful to remove directories.
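
The following sketch runs the moves and copies above in a scratch directory so the result can be inspected safely. The directory and file names are illustrative, mktemp(1) is assumed to be available, and -R is used as the spelling of the recursive flag that POSIX documents for cp(1).

```shell
dir=$(mktemp -d)                         # scratch directory (assumes mktemp exists)
mkdir "$dir/src" "$dir/dest"
printf '%s\n' "data" > "$dir/src/file"

cp "$dir/src/file" "$dir/dest/"          # copy: the original stays in place
mv "$dir/src/file" "$dir/src/renamed"    # rename inside the same directory
cp -R "$dir/src" "$dir/backup"           # copy a whole directory (backup did not exist yet)

# List what ended up where.
printf '%s\n' "$dir/dest"/* "$dir/src"/* "$dir/backup"/*
```

If the destination directory of cp -R already exists, the source directory is copied inside it instead of being copied as it.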

# this removes the file if it exists
$ rm .scripts/test.sh

# this removes the file, and does not complain if the file doesn't exist
$ rm -f .scripts/test.sh

# this recursively removes a directory, ignoring nonexistent files
$ rm -rf .scripts/
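
Since rm -rf asks no questions, scripts usually guard the path they pass to it. A minimal sketch, assuming mktemp(1) is available and using an illustrative variable named dir:

```shell
dir=$(mktemp -d)                      # scratch directory (assumes mktemp exists)
mkdir "$dir/.scripts"
touch "$dir/.scripts/test.sh"

rm "$dir/.scripts/test.sh"            # removes the file
rm -f "$dir/.scripts/test.sh"         # already gone: -f keeps rm quiet and successful
status=$?

# ${dir:?} aborts the script if dir is empty or unset,
# so this line can never turn into "rm -rf /.scripts".
rm -rf "${dir:?}/.scripts"

[ -e "$dir/.scripts" ] || printf '%s\n' "removed (rm -f exit status: $status)"
rmdir "$dir"
```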

Working with files' data

Once we know how to create and manipulate directories and files it's time to work with data.

The easiest way to write data to a file is by typing it, or copy-pasting it, in a text editor like vim(1) or ee(1). But there are situations where we need to write a log of something that happens on the machine, or where we cannot open a text editor to type anything manually (besides, we are looking to automate tasks and remove manual interaction).

This is where redirection and the following commands come into play.

less(1) displays a file's content on the standard output, letting the user scroll through it page by page.
cat(1) concatenates files and prints them on the standard output. As an easy example, try typing cat followed by the path to a text file on your computer. You should see the file's content printed in the terminal. It can also do more things (we'll get back to it in a few lines).
tee(1) copies standard input to standard output and also to any files given as arguments.
grep(1) searches input files for a given pattern and displays the matching lines.

# find the lines containing 'hello' inside the file demo.sh
$ grep hello .scripts/demo.sh
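
A few grep(1) options come up constantly in scripts. The sketch below builds a small demo.sh in a scratch directory (its content is invented for the example, and mktemp(1) is assumed to be available) and shows line numbers, case-insensitive counting and the exit status.

```shell
dir=$(mktemp -d)                      # scratch directory (assumes mktemp exists)
printf '%s\n' '#!/bin/sh' 'echo "hello world"' 'echo "Hello again"' > "$dir/demo.sh"

grep -n hello "$dir/demo.sh"          # -n prefixes each match with its line number

# -i ignores case, -c prints the number of matching lines instead of the lines.
matches=$(grep -c -i hello "$dir/demo.sh")
printf 'case-insensitive matches: %s\n' "$matches"

# grep's exit status tells a script whether anything matched; -q prints nothing.
if grep -q goodbye "$dir/demo.sh"; then found=yes; else found=no; fi
printf 'goodbye found: %s\n' "$found"
rm -rf "$dir"
```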

Most programs have three common jobs: they get input, process that input with the given instructions, and produce an output (which may be an error). Internally, the shell refers to standard input, standard output and standard error by the file descriptors 0, 1 and 2 respectively.

I/O redirection allows us to change where input comes from and where output goes. By default, our input is the keyboard and our output is the screen.

Let's introduce append >> and truncate > as our main output redirection operators.

In output redirection, the shell takes the standard output of the command and writes it to a file instead of displaying it on the screen. It is the shell (sh, bash, etc.) that creates the file, not the redirected command (cat(1), printf(1), ls(1), etc.).

Append

Appending to a file is done with the >> operator.

Create the specified file if that file doesn't exist.
Write at the end of the file.

$ printf "%s\n" "Appending the next info:" > my_file.txt
$ printf "%s\n" * >> my_file.txt
$
$ cat my_file.txt
  Appending the next info:
  Documents
  Downloads
  my_file.txt

Note that your list may differ, since the * pattern expands to the directories and files in your current working directory.
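
Appending is what makes simple log files possible. The sketch below defines a hypothetical helper called log_msg (the name and the timestamp format are assumptions made for this example) that adds one line per call; mktemp(1) is assumed to be available.

```shell
logfile=$(mktemp)    # scratch log file (assumes mktemp exists)

# log_msg is a hypothetical helper: it appends one timestamped line per call.
log_msg() {
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" >> "$logfile"
}

log_msg "backup started"
log_msg "backup finished"

# Both lines are still there because >> never discards earlier content.
cat "$logfile"
```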

Truncate

Truncating a file is done with the > operator.

Create the specified file if that file doesn't exist.
Remove the file's content.
Write content to the file.

$ printf "%s\n" "Writing a line" > my_file.txt
$ printf "%s\n" "Truncating a second line" > my_file.txt
$
$ cat my_file.txt
  Truncating a second line

This way we can decide whether we store all the information or only the latest one and make it available for other functions or programs.
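
To see both operators side by side, this sketch writes to a scratch file and records its content after each step. The variable names are illustrative and mktemp(1) is assumed to be available.

```shell
file=$(mktemp)                        # scratch file (assumes mktemp exists)

printf '%s\n' "first"  > "$file"
printf '%s\n' "second" > "$file"      # > empties the file before writing
after_truncate=$(cat "$file")

printf '%s\n' "third" >> "$file"      # >> keeps what is there and adds to the end
after_append=$(cat "$file")

: > "$file"                           # a common idiom: empty a file without deleting it
if [ -s "$file" ]; then empty=no; else empty=yes; fi

printf 'after truncate:\n%s\n' "$after_truncate"
printf 'after append:\n%s\n' "$after_append"
printf 'empty at the end: %s\n' "$empty"
rm -f "$file"
```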

If we want to log a program's error output, we need to say so explicitly by prefixing the redirection operator with the file descriptor, which for stderr is 2.

# this will produce an error since cat cannot display directories
$ cat .scripts/ 2> cat_error.txt

We may need both stdout and stderr redirected to one file. To tell the shell to do that, we redirect stdout to the file and then redirect stderr to stdout. The order matters, because redirections are processed from left to right.

# stderr is pointed at the terminal (where stdout currently goes) before stdout
# is sent to the file, so the error is still printed in the terminal
$ cat .scripts/ 2>&1 > cat-out.txt

# stdout is sent to the file first, so stderr follows it into the file
$ cat .scripts/ > cat-out.txt 2>&1
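
A small experiment makes the behaviour of each stream visible. The function both below is a hypothetical stand-in for any command that writes to stdout and stderr; the file names are invented and mktemp(1) is assumed to be available.

```shell
dir=$(mktemp -d)                      # scratch directory (assumes mktemp exists)

# both is a hypothetical helper that writes one line to stdout and one to stderr.
both() {
    printf '%s\n' "to stdout"
    printf '%s\n' "to stderr" >&2
}

# stdout goes to the file first, then stderr is pointed at the same place.
both > "$dir/all.txt" 2>&1

# One file per stream.
both > "$dir/out.txt" 2> "$dir/err.txt"

# Discard the errors by sending them to /dev/null.
both 2> /dev/null > "$dir/quiet.txt"

cat "$dir/all.txt"
```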

Using the cat(1) command with redirection, we can copy a file into another file, or append a file's content to another:

$ cat my_file > log.txt

will copy my_file's content inside the log.txt file, erasing everything inside log.txt first, while

$ cat my_file >> log.txt

will add my_file's content at the end of the log.txt file.
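
Because cat(1) accepts several files, the same redirections can join them. A short sketch with invented file names, assuming mktemp(1) is available:

```shell
dir=$(mktemp -d)                      # scratch directory (assumes mktemp exists)
printf '%s\n' "part one" > "$dir/a.txt"
printf '%s\n' "part two" > "$dir/b.txt"

cat "$dir/a.txt" "$dir/b.txt" > "$dir/log.txt"   # joins both files, replacing log.txt
cat "$dir/a.txt" >> "$dir/log.txt"               # adds a.txt again at the end

cat "$dir/log.txt"
```

One thing to watch for: the shell truncates the target before cat(1) starts reading, so a command like cat my_file > my_file leaves my_file empty.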

The cat(1) command also lets us write everything we type in the terminal to a file, using >> to append or > to truncate.

$ cat >my_file.txt

will let you type text in the terminal, which will be saved in a file named my_file.txt.

$ cat >>my_file.txt

will do the same, except it will append the text to the end of the file.

Note that to stop writing into the file we need to press CTRL+D, which signals the end of the input. In order to automate things, we can instead choose a delimiter string that marks where the input ends.

# standard EOF workflow:
$ cat > my_file.txt << EOF
everything typed here will be written.
EOF

# custom delimiter (quoting it disables variable expansion in the text):
$ cat <<"END" >my_file.txt
all things here will be written.
END

Here we can see input redirection being managed by <<. The word after << tells the shell which line marks the end of the input, and that line must contain only the delimiter.
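
Whether the delimiter is quoted changes how the text is treated. In the sketch below (file and variable names are illustrative, and mktemp(1) is assumed to be available) the same line is written twice, once with expansion and once literally.

```shell
dir=$(mktemp -d)                      # scratch directory (assumes mktemp exists)
name="world"

# Unquoted delimiter: variables and command substitutions are expanded.
cat > "$dir/expanded.txt" << EOF
hello $name
EOF

# Quoted delimiter: the text is written exactly as typed.
cat > "$dir/literal.txt" << "EOF"
hello $name
EOF

cat "$dir/expanded.txt" "$dir/literal.txt"
```

The first file contains "hello world" and the second contains "hello $name".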

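When writing shell scripts we may want to indent the text and the closing delimiter to match the surrounding code. In this case we need to change << to <<- and use tabs (not spaces) to indent. The shell strips the leading tabs from every line, including the line that holds the delimiter.

# indenting while using command line tools
# (the indented line must start with a tab character, not spaces)
$ cat > my_file.ext <<- EOF
	this text can be indented.
EOF

# indenting inside a shell script
# (the indentation below must be made of tab characters)
if [ -d .scripts ]; then    # example condition
	cat > my_file.txt <<- EOF
		we're
		indenting.
		EOF
fi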

The < operator works in the opposite direction: it redirects a file's data into a command as its standard input.

# store the content of a directory into a file
$ printf "%s\n" .scripts/* > stored_list.txt

# pass the created file as grep's input (the dot is escaped so it matches literally)
$ grep '\.sh' < stored_list.txt
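
Input redirection also works on a whole loop, which is a common way to process a file line by line. This is a minimal sketch with an invented list of names, assuming mktemp(1) is available.

```shell
list=$(mktemp)                        # scratch file (assumes mktemp exists)
printf '%s\n' "demo.sh" "notes.txt" "backup.sh" > "$list"

scripts=0
# The loop's standard input is the file, so read gets one line per iteration.
# IFS= and -r keep leading spaces and backslashes in each line untouched.
while IFS= read -r name; do
    case $name in
        *.sh) scripts=$((scripts + 1)); printf 'script: %s\n' "$name" ;;
    esac
done < "$list"

printf '%s shell scripts listed\n' "$scripts"
rm -f "$list"
```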

Working with commands' data

We've been looking at ways to get our output written into a file and our input redirected to take the content from a file. What about letting a command take the output of another command as its input?

Pipes [represented with a vertical bar | ] are the answer to that question.

As an example, running top(1) displays all the ongoing processes and the hardware demand of our computer, but the list is huge and maybe we're only interested in the Cpu(s) usage. We know that grep(1) can take a pattern and print the matching lines of its input, so using a pipe will allow us to combine both commands and get what we're looking for. The pattern is quoted because unquoted parentheses have a special meaning for the shell.

$ top | grep 'Cpu(s)'

We can use more than one pipe in a command instruction.

Maybe we want to log all of the computer's usage data while printing only the Cpu(s) usage on the screen. We can use tee(1) to write the output of top(1) into a log file and pass it on to grep(1) too.

$ top | tee top_log.txt | grep 'Cpu(s)'
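
The same pipeline shape can be tried with predictable input. In this sketch printf(1) stands in for the program producing the data, and the sample lines and file name are invented; mktemp(1) is assumed to be available.

```shell
dir=$(mktemp -d)                      # scratch directory (assumes mktemp exists)

# tee writes the full input to the log file and passes it along the pipe;
# grep then keeps only the lines we want to see on screen.
shown=$(printf '%s\n' "cpu 12%" "mem 48%" "cpu 15%" | tee "$dir/usage_log.txt" | grep cpu)

printf '%s\n' "$shown"
logged=$(grep -c '' "$dir/usage_log.txt")   # the empty pattern matches every line
printf 'lines logged: %s\n' "$logged"
```

The screen shows the two cpu lines while the log keeps all three.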

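Similar to file redirection, we can also send stderr through a pipe. A pipe only carries stdout, so in sh we first redirect stderr to stdout by writing 2>&1 before the vertical bar. Some shells, such as bash, accept |& as a shorthand for 2>&1 |, but it is not part of plain sh.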
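
In plain sh, one way to get error output into a pipe is to merge it into stdout with 2>&1 before the vertical bar. The function noisy below is a hypothetical command invented for this sketch.

```shell
# noisy is a hypothetical command that reports on both streams.
noisy() {
    printf '%s\n' "result: ok"
    printf '%s\n' "warning: disk almost full" >&2
}

# Only stdout travels through a plain pipe, so the warning would bypass grep.
# Merging stderr into stdout first lets grep see both.
warnings=$(noisy 2>&1 | grep -c warning)

# To pipe stderr alone, merge it and then discard the original stdout.
only_err=$(noisy 2>&1 >/dev/null | grep warning)

printf 'warnings: %s\n' "$warnings"
printf '%s\n' "$only_err"
```

In the second command the pipe is set up first, 2>&1 points stderr at it, and >/dev/null then moves stdout away, which leaves only the error text in the pipe.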

Commands can be queued and executed only if some previous ones have met some special condition. Let's take a look at how: imagine two commands, X and Y.

# & starts X in the background and runs Y right away, without waiting for X.
$ X & Y

# ; runs X, waits for it to finish, and then runs Y.
$ X ; Y

# && runs Y only if X is successful.
$ X && Y

# || Runs Y only if X is not successful.
$ X || Y
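
These operators can be observed with ordinary commands. The sketch below records which steps ran in a variable named steps (an illustrative name) and assumes mktemp(1) is available.

```shell
dir=$(mktemp -d)                      # scratch directory (assumes mktemp exists)
steps=""

# && : the assignment runs because mkdir succeeded.
mkdir "$dir/new" && steps="$steps created"

# || : mkdir fails the second time, so the fallback runs.
mkdir "$dir/new" 2>/dev/null || steps="$steps already-there"

# ; : the second command runs once the first has finished.
steps="$steps first" ; steps="$steps second"

# & : sleep starts in the background and the script continues immediately.
sleep 1 &
printf '%s\n' "not waiting for sleep"
wait                                  # pause until background jobs finish
printf 'steps:%s\n' "$steps"
rm -rf "$dir"
```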

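A cool hack to pass the last argument of a command to another command is using $_, which expands to the last argument of the previous command. It works in shells such as bash, but it is not specified for plain sh, so it may not be available everywhere.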

$ touch my_file.txt && echo "hello world" > $_

Almost every command makes use of standard input and output, and passes its errors through standard error. Redirection is really handy when solving problems that require filtering a lot of data in order to get the byte we're looking for.

n0mad coder's blog