Week 09 — Expansions, conditionals

This week we will study while loops and if statements, several ways to test variable values and file properties, and some useful ways to manipulate the values stored in variables.

Evaluation

Up to 10 points can be gained towards your final score by completing the in-class assignment on Friday.

Preparation

1. Complete the self-preparation assignment at home before next class

This week's self-preparation assignment is mostly practice with some familiar shell commands and some new ones. The new commands are explained in the Notes section below, which also contains the practice exercises. (If you are already familiar with the command line, please at least skim the notes to make sure there are no small details that are new to you.) These commands and concepts will be further explored in the in-class assignment on Friday.

2. Check your understanding of the concepts using the self-assessment questionnaire
  1. Answer each question in the self-assessment questionnaire as honestly as you can.
  2. Revise the topics with the lowest scores, then update your scores.
  3. Repeat the previous step until you feel comfortable with most (or all) of the topics.

On Thursday evening, press 'submit'. In class on Friday we will check which topics were difficult for everyone.

To succeed at the in-class assignment for this class you should understand the topics outlined in the “Notes” section.

What you will learn from this class

Notes

The notes below include several exercises with answers that introduce new concepts. Many of these concepts will be used in this week's in-class assignment.

Read the notes and try to complete each of the exercises without looking at the sample answer. If you cannot complete an exercise using a few short commands then read the sample answer, practice it, and make sure you understand it before continuing.

Review

Make sure you understand the topics from last week.

Exercises

As well as the indicated exercises, try typing in all the examples for yourself. If you can think of ways to modify the example to change the behaviour, try them. Exploration is the best way to learn.

Variables

Variables are used to store data. Variable names must begin with a letter which can be followed by any number of letters or digits. (The underscore “_” is treated as a letter.) Names that conform to these rules are legal (allowed by the rules); names that break these rules are illegal (not allowed by the rules).

Some examples of legal variable names:

Name                          Why it is legal
a                             starts with a letter
abcdef                        letter followed by any number of letters
a1b2c3                        letter followed by any number of letters or digits
FooBar999Baz                  letter followed by any number of letters or digits
_                             underscore _ is a letter too
_1234_                        letter followed by digits and a letter
LONG_VARIABLE_NAME_NUMBER_1   letter followed by lots of letters and a final digit

Some examples of illegal variable names:

Name          Why it is not legal
0             does not start with a letter
2things       does not start with a letter
x@y           @ is neither a letter nor a digit
final value   space is neither a letter nor a digit

You create or set a variable using the = assignment operator. The syntax (general form) of assignment is:

variableName=value

where variableName follows the rules explained above and value is a single word (such as a filename), number, etc., with no spaces. There must not be any space either side of the = symbol.

You get the value of a variable by writing a $ before the variable's name. For example:

$ metars=/tmp/metars-2019
$ echo $metars
/tmp/metars-2019

Again, there must be no space between the $ and the variable name.

Quoting

What if you want to put a space inside a value stored in a variable? You can protect spaces using quotation marks.

Single quotes around a value like this 'value' will protect everything inside the value. Wildcards (*, ?, etc.), dollar signs ($), and other special characters will be completely ignored. Spaces inside the value will be considered part of the value.

Double quotes around a value like this "value" will protect everything inside the value except for expansions (see below) introduced by the $ character. One such expansion is getting the value of a variable using $name.

$ foo='$woohoo $$$ * .* how about this?'   # single quotes stop *, ?, and $ from being treated specially
$ echo '$foo'                              # single quotes stop $ from being treated specially
$foo
$ echo " * $foo ? "                        # double quotes allow $ to get the value of foo
 * $woohoo $$$ * .* how about this? ?      (but * and ? wildcards are still ignored)

If you want a value with spaces inside, use '…'. If you want a value with spaces inside and variables to be expanded, use "…".
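As a quick check of this rule, the following sketch stores a value containing spaces and a dollar sign, then builds a second value from it. The variable names title and message are made up for this illustration.

```shell
# Single quotes: the spaces and the $ are kept exactly as typed.
title='Report for $USER   (draft)'

# Double quotes: the spaces are kept, but $title is replaced by its value.
# The $USER inside that value is not expanded a second time.
message="Title is: $title"

# Quoted: the value is printed exactly as stored.
echo "$message"

# Unquoted: the shell splits the value into separate words, so the
# run of three spaces is printed as a single space.
echo $message
```

Putting double quotes around a variable whenever you use it is usually the safer habit, because the value then reaches the command unchanged.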

Expansions

The $ character is used to transform variables and other values in the command line by a process called expansion. There are several kinds of expansion:

Variable expansion

A $ followed by a variable name expands to the value stored in the variable. (If the variable is not set to any value then the result is blank.) Braces { and } around the variable name are optional but are necessary when a variable expansion is followed immediately by a letter or digit that is not part of the variable name, as in the last example below.

$ metars=/tmp/metars-2019
$ echo $metars
/tmp/metars-2019
$ me=myself
$ echo $me
myself
$ echo ${metars}
/tmp/metars-2019
$ echo ${me}tars
myselftars

The brace syntax ${variable} also provides several mechanisms that modify the value retrieved from variable. Within a variable expansion with braces, a suffix (such as a file extension) can be removed by following the variable name with %suffix.

$ filename=2019-01-01T00:53:57-japan.txt
$ echo ${filename}
2019-01-01T00:53:57-japan.txt
$ echo ${filename%.txt}
2019-01-01T00:53:57-japan
$ echo ${filename%-japan.txt}
2019-01-01T00:53:57

A prefix can be removed by following the variable name with #prefix.

$ echo ${filename}
2019-01-01T00:53:57-japan.txt
$ echo ${filename#2019}
-01-01T00:53:57-japan.txt
$ echo ${filename#2019-??}
-01T00:53:57-japan.txt
$ echo ${filename#2019-??-??}
T00:53:57-japan.txt

In both cases (${name%pattern} and ${name#pattern}) you can use wildcards such as ? in the prefix or suffix.

$ echo ${filename}
2019-01-01T00:53:57-japan.txt
$ echo ${filename#2019-??}
-01T00:53:57-japan.txt
$ echo ${filename#2019-??-??}
T00:53:57-japan.txt
$ echo ${filename%:??-*}
2019-01-01T00:53

You can also replace a pattern anywhere in a value with some other text using /pattern/replacement after the variable name:

$ echo ${filename}
2019-01-01T00:53:57-japan.txt
$ echo ${filename/T/ at time }
2019-01-01 at time 00:53:57-japan.txt
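The prefix and suffix operators can be applied one after another to pick a value apart. The following is only a sketch; the variable names name, rest, day, clock and place are made up for this illustration.

```shell
filename=2019-01-01T00:53:57-japan.txt

name=${filename%.txt}   # remove the extension:            2019-01-01T00:53:57-japan
day=${name%T*}          # remove the T and what follows:   2019-01-01
rest=${name#*T}         # remove up to and including T:    00:53:57-japan
clock=${rest%-*}        # remove the - and what follows:   00:53:57
place=${rest#*-}        # remove up to and including -:    japan

echo "$place on $day at $clock"
```

Each step removes the shortest piece that matches the pattern, which is why the wildcard * in ${rest%-*} removes only -japan and leaves the time intact.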

The variable expansions above should be all you need in most cases, but there are several more that you might need to use occasionally.


Imagine that you are running out of disk space on your computer. You have a lot of 'lossless' music stored in .wav (Microsoft 'wave') files. You could halve the amount of space they use by converting them to .flac (free lossless audio codec) files. The program ffmpeg can do this for you. The syntax is:

ffmpeg -i input-filename.wav output-filename.flac

First, make some 'fake' .wav files like this:

for i in {1..9}; do echo $i > track-$i.wav; done

1. Write a “for wav in …” loop that echoes the names of all the *.wav files in the current directory, one at a time.

track-1.wav
track-2.wav
  ...
track-9.wav

2. Change the echo command so that for every file it prints two things: the original name ($wav) as well as the name with the original .wav suffix removed.

track-1.wav track-1
track-2.wav track-2
  ...
track-9.wav track-9

3. Change the echo command so that for every file it prints two things: the original name ($wav) as well as the name with the original .wav suffix removed and a new .flac suffix added.

track-1.wav track-1.flac
track-2.wav track-2.flac
  ...
track-9.wav track-9.flac

4. Change the echo command so that for every .wav file in the current directory your loop prints: ffmpeg -i filename.wav filename.flac

The output of your loop should look like this:

ffmpeg -i track-1.wav track-1.flac
ffmpeg -i track-2.wav track-2.flac
  ...
ffmpeg -i track-8.wav track-8.flac
ffmpeg -i track-9.wav track-9.flac

(If you had some genuine .wav files, and a copy of the ffmpeg program, you could remove the echo from your loop body and it really would convert all the .wav files to .flac for you.)

Answer
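One possible answer to step 4, as a sketch rather than the only solution. It works in a scratch directory (made with mktemp, which is an assumption about your system) so that no real files are touched, and it writes out the list 1 to 9 instead of {1..9} only so that it also runs in shells without brace expansion.

```shell
# Work in a scratch directory so that no real files are touched.
cd "$(mktemp -d)"

# Make the nine 'fake' .wav files from the exercise.
for i in 1 2 3 4 5 6 7 8 9; do echo $i > track-$i.wav; done

# Print the ffmpeg command that would convert each file.
# ${wav%.wav} is the name without its .wav suffix; .flac is then added.
for wav in *.wav; do
  echo ffmpeg -i "$wav" "${wav%.wav}.flac"
done
```

Steps 1 to 3 use the same loop with a simpler echo: echo "$wav", then echo "$wav" "${wav%.wav}", then echo "$wav" "${wav%.wav}.flac".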

Parameter expansion

Parameters are the values passed to a shell script on the command line. Whereas variables are named, parameters are numbered starting at 1. (If you ever happen to need it, $0 is the name of the shell script exactly as it appeared on the command line.)

There are three other special variables that are useful inside shell scripts. $# expands to the number of command-line arguments, and both $@ and $* expand to a sequence containing all of the command-line arguments separated by spaces.

Parameter Meaning
$1 The first command-line argument
$2 The second command-line argument
(and so on…)
$# The number of command line arguments
$@ All of the command line arguments
$* All of the command line arguments

Write a shell script that prints a single number showing how many command-line arguments it is run with. (Don't forget you have to make it executable using chmod +x filename before you can run it.)

Answer
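One possible answer, as a sketch. It is wrapped in a shell function (a feature not covered in these notes) only so that it can be tried by pasting it into a terminal; inside a function, $# counts the function's arguments in the same way that it counts a script's arguments. The name count_args is made up.

```shell
# In a script file, the single line 'echo $#' after the #!/bin/sh line
# is all that is needed.
count_args() {
  echo $#
}

count_args one "two too" three   # prints 3
count_args                       # prints 0
```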

The variables $@ and $* behave differently when quoted. To illustrate the difference, consider the following script:

#!/bin/sh

echo 'using "$@":'
for argument in "$@"; do
  echo "$argument"
done

echo 'using "$*":'
for argument in "$*"; do
  echo "$argument"
done

Running this script with three command-line arguments one, "two too", and three produces this result:

$ ./script one "two too" three
using "$@":
one
two too
three
using "$*":
one two too three

Create the script shown above. Run it with arguments one "two too" three. Run it with other arguments, including no arguments.

You can see that "$*" expands to a list of command-line arguments all inside one pair of double quotes ("). In other words, "$*" is one single value containing all of the command line arguments.

On the other hand, "$@" expands to a list of command-line arguments where each separate argument is inside a pair of double quotes ("). In other words, "$@" is one value per argument, each value containing a quoted version of the corresponding argument.

Expansion   Equivalent
"$*"        Single value containing all arguments: "$1 $2 $3 $4 …"
"$@"        Multiple values, one per argument: "$1" "$2" "$3" "$4" …

In a for loop you should almost always use "$@" (to repeat the loop for each argument).

for argument in "$@"; do some_operation_on "$argument"; done

When assigning to a variable you should probably always use "$*"; however, most shells are clever enough to let you use either.

all_arguments="$*"
all_arguments="$@"

Arithmetic substitution

You can evaluate arithmetic expressions by enclosing them in double parentheses preceded by a $ character: $((expression)). Within the expression you can use the normal arithmetic operators and the names of variables (without a $ in front of them).

$ echo $((2+4*10))
42
$ two=2
$ ten=10
$ echo $((two+4*ten))
42
$ total=0
$ for n in {1..10}; do total=$((total+n)); done
$ echo $total
55
$ n=1
$ for word in one two three four; do echo $n $word; n=$((n+1)); done
1 one
2 two
3 three
4 four

Modify your shell script from the previous exercise so that it prints each command-line argument preceded with its number, starting at 1. For example:

$ ./script one "two too" three
1 one
2 two too
3 three

Answer
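One possible answer, as a sketch. It is written as a shell function (the name number_args is made up) so that it can be pasted into a terminal; in a script file, the lines between the braces would be the whole script.

```shell
number_args() {
  n=1                          # number of the argument being printed
  for argument in "$@"; do     # "$@" keeps "two too" as one argument
    echo $n "$argument"
    n=$((n+1))                 # arithmetic substitution: add 1 to n
  done
}

number_args one "two too" three
```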

Write a shell script called factorial that calculates the factorial of its command line argument. Recall that factorial(n) = n * (n-1) * (n-2) * … * 1.

$ ./factorial 5
120

Answer
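One possible answer, as a sketch. It uses a while loop and the test command, both of which are explained under Control structures later in these notes, and it assumes the argument is a non-negative whole number. It is written as a function (named factorial here) so that it can be pasted into a terminal; in a script file the lines between the braces would be the whole script.

```shell
factorial() {
  n=$1         # the number whose factorial is wanted
  result=1     # running product
  while test $n -gt 1; do
    result=$((result*n))   # multiply by n ...
    n=$((n-1))             # ... then move on to n-1
  done
  echo $result
}

factorial 5   # prints 120
```

Because the loop does nothing when n is 0 or 1, both of those cases print 1.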

Command substitution

Sometimes you will need to store the output of a command in a variable, or use the output of one command as an argument to another command. Command substitution provides a way to do this.

The pattern $(command) is replaced with the output from running command. Note that command can include command-line options and arguments, and can even be a pipeline made from several commands. The result can be used to set the value of a variable. In the following examples, note the use of double quotation marks around the command substitutions to protect any spaces in the output from the commands.

$ ls | wc -l
8752
$ pwd
/Users/piumarta/metar-2019
$ numFiles="$(ls | wc -l)"
$ dirName="$(pwd)"
$ echo there are $numFiles files in the directory $dirName
there are 8752 files in the directory /Users/piumarta/metar-2019

Another way of doing the same thing, without variables, is to use the command substitutions directly where their output is needed:

$ ls | wc -l
8752
$ pwd
/Users/piumarta/metar-2019
$ echo there are $(ls | wc -l) files in the directory $(pwd)
there are 8752 files in the directory /Users/piumarta/metar-2019

Write a shell script called nfiles.sh that prints the number of files in each of the directories written on the command line followed by the name of the directory.

$ ./nfiles.sh . /bin /usr/bin
42 .
124 /bin
1486 /usr/bin

(Of course, your results will differ.)

Answer
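One possible answer, as a sketch. It is written as a function named nfiles so that it can be pasted into a terminal; in nfiles.sh the lines between the braces would be the whole script.

```shell
nfiles() {
  for dir in "$@"; do
    # The command substitution is deliberately left unquoted so that any
    # spaces printed by wc in front of the number are dropped.
    echo $(ls "$dir" | wc -l) "$dir"
  done
}

nfiles . /
```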

Control structures

A for loop is executed once for each member of a list of items. Other control structures include the while loop that executes until a condition becomes false, the until loop that executes until a condition becomes true, and the if statement that conditionally executes (or not) a sequence of commands.

While loop

The syntax (general form) of a while loop is

while TEST
do
  COMMANDS
done

or on a single line like this:

while TEST ; do COMMANDS ; done

The COMMANDS part works exactly like it does in a for loop. The TEST part should be a command that can either succeed or fail. The while loop will continue to run its TEST and the COMMANDS until the TEST fails.

The test command

A useful command to use for the TEST part of a while loop is test, which can do many things. One thing test can do is compare two numbers.

Command            Succeeds if…   Example                Meaning of the example
test LHS -lt RHS   LHS <  RHS     test $num -lt $limit   $num is less than $limit
test LHS -le RHS   LHS <= RHS     test $num -le 0        $num is zero or negative
test LHS -eq RHS   LHS =  RHS     test $num -eq 0        $num is zero
test LHS -ne RHS   LHS ≠  RHS     test $num -ne -1       $num is not -1
test LHS -ge RHS   LHS >= RHS     test $num -ge 0        $num is non-negative
test LHS -gt RHS   LHS >  RHS     test $num -gt 0        $num is positive

Combining a while loop with test and arithmetic expansion to update a counter:

$ counter=0
$ while test $counter -lt 5; do
>   echo $counter
>   counter=$((counter+1))
> done
0
1
2
3
4
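The until loop mentioned at the start of this section has the same shape as a while loop, but it keeps running until its TEST succeeds. As a sketch, here is the same counter written with until; note that the comparison is reversed from -lt to -ge.

```shell
counter=0
until test $counter -ge 5; do   # stop as soon as counter >= 5
  echo $counter
  counter=$((counter+1))
done
```

This prints the numbers 0 to 4, exactly like the while version.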

If statement

The if statement conditionally executes a sequence of commands. The syntax of if statements is:

if TEST
then
  COMMANDS
fi

or on a single line like this:

if TEST ; then COMMANDS ; fi

The COMMANDS will be run only if the TEST succeeds. Using the test command again for the TEST:

$ n=3
$ if test $n -lt 5; then
>   echo $n is less than 5
> fi
3 is less than 5

Another form of the if statement provides a second set of commands to be run if the TEST fails.

if TEST
then
  COMMANDS1
else
  COMMANDS2
fi

or on a single line like this:

if TEST ; then COMMANDS1 ; else COMMANDS2; fi

First the TEST command is run. If TEST succeeds then COMMANDS1 are run. If TEST fails then COMMANDS2 are run.

$ n=7
$ if test $n -lt 5; then
>   echo $n is less than 5
> else
>   echo $n is not less than 5
> fi
7 is not less than 5

The test command can also check the properties of a file or directory, the size of a string, or the relationship between two strings.

Command                    Succeeds if…
test -d FILE               FILE exists and is a directory
test -e FILE               FILE exists
test -f FILE               FILE exists and is a regular file
test -r FILE               FILE is readable
test -s FILE               FILE exists and is non-empty
test -w FILE               FILE is writable
test -x FILE               FILE is executable
test FILE1 -nt FILE2       FILE1 is newer than FILE2
test FILE1 -ot FILE2       FILE1 is older than FILE2
test -z STRING             STRING is empty
test -n STRING             STRING is not empty
test STRING1 = STRING2     the strings are equal
test STRING1 != STRING2    the strings are not equal
test STRING1 \< STRING2    STRING1 comes before STRING2 in dictionary order
test STRING1 \> STRING2    STRING1 comes after STRING2 in dictionary order
test -v VAR                the shell variable named VAR is set

(The < and > comparisons must be written \< and \> so that the shell does not treat them as input and output redirections.)

Combining the test for a directory with the if statement:

$ if test -d subdir; then
>   echo subdir already exists
> else
>   echo creating subdir
>   mkdir subdir
> fi
creating subdir
$ if test -d subdir; then
>   echo subdir already exists
> else
>   echo creating subdir
>   mkdir subdir
> fi
subdir already exists
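The string tests can be combined with if in the same way. The following sketch uses a made-up variable called answer. The double quotes around "$answer" are there so that test still receives an argument in that position when the value is empty.

```shell
answer=""                      # pretend that nothing has been entered yet

if test -z "$answer"; then     # the string is empty, so use a default
  answer=no
fi

if test "$answer" = yes; then
  echo continuing
else
  echo stopping
fi
```

Because answer starts out empty, the first test succeeds and sets it to no, so the second if prints stopping.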

Modify your nfiles.sh script so that it checks each command-line argument. If the argument is a directory, the script prints the number of files in the directory followed by the argument (as before). If the argument is not a directory, the script prints '?' and then the argument.

$ ./nfiles.sh . /bin /usrbin /bin/ls
43 .
124 /bin
? /usrbin
? /bin/ls

Hint: instead of using two echo commands, set a variable (e.g., n) to either the number of files in the directory or the value '?'. At the end of your loop use a single echo command to print n and then the argument.

Answer

Modify your nfiles.sh script so that it checks each command-line argument. If the argument is a directory, the script prints the number of files in the directory followed by the argument (as before). If the argument is a regular file, the script prints 'F' and then the argument. If the argument is neither a directory nor a file (e.g., it does not exist) then the script prints '?' followed by the argument.

$ ./nfiles.sh . /bin /usrbin /bin/ls
43 .
124 /bin
? /usrbin
F /bin/ls

Hint: the commands in the else part of your if statement should include another if statement that tests whether the non-directory argument is a regular file (test -f). This second if selects between 'F' for a file or '?' for everything else.

Answer
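One possible answer, as a sketch that follows the hint. It is written as a function named nfiles so that it can be pasted into a terminal; in nfiles.sh the lines between the braces would be the whole script.

```shell
nfiles() {
  for arg in "$@"; do
    if test -d "$arg"; then
      # Wrapping the command substitution in $(( )) turns the output of
      # wc into a plain number with no spaces in front of it.
      n=$(( $(ls "$arg" | wc -l) ))
    else
      if test -f "$arg"; then
        n=F        # a regular file
      else
        n='?'      # anything else, including names that do not exist
      fi
    fi
    # "$n" is quoted so that a ? is printed as it is, not used as a wildcard.
    echo "$n" "$arg"
  done
}

nfiles . /usrbin /bin/ls
```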

The meanings of the above test forms can be inverted by placing a ! (“not”) in front of them.

Command Succeeds if…
test ! EXPR EXPR fails (is false)

Combining if with the test for a directory (-d) and inverting it (!) to mean “the directory does not exist”:

if test ! -d subdir; then   # subdir does not exist, so...
  mkdir subdir              # make it
fi

You can combine two or more test forms with logical “and” or logical “or”:

Command Succeeds if…
test EXPR1 -a EXPR2 both EXPR1 and EXPR2 succeed (are true)
test EXPR1 -o EXPR2 either EXPR1 or EXPR2 succeeds (is true)

To check if your log file exists as a regular file (-f) and (-a) is writable (-w):

if test -f logfile -a -w logfile; then
  echo logfile is a regular file and is writable
fi

Shorthand for test

Many shells have an alternative version of test called [ (open square bracket). Instead of test expression you can write [ expression ] which looks quite a lot nicer. Note that you must put spaces on both sides of the opening “[” and another before the final “]”.

$ numFiles=$(ls | wc -l)
$ echo $numFiles
43
$ while [ ${#numFiles} -lt 5 ]; do   # make numFiles be five characters wide, padded with '0's on the left
>   numFiles="0$numFiles"            # add a '0' to the left of numFiles
> done
$ echo $numFiles
00043

Modify your nfiles.sh script so that it prints the first item on each line (the number of files, or an 'F' or a '?') right-justified in a field 5 characters wide. Use spaces to pad the number (or 'F' or '?') on the left to the required width.

$ ./nfiles.sh . /bin /usrbin /bin/ls
   43 .
  124 /bin
    ? /usrbin
    F /bin/ls

Answer
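The padding step can be sketched on its own, using the same loop as the example above but adding a space instead of a '0'. Here ${#n} expands to the number of characters stored in n. The value '?' and the name /usrbin are just sample data.

```shell
n='?'
while [ ${#n} -lt 5 ]; do   # repeat until n is five characters wide
  n=" $n"                   # add one space to the left of n
done

# The quotes keep the leading spaces, and stop ? being used as a wildcard.
echo "$n" /usrbin
```

In nfiles.sh the while loop would go just before the echo command at the end of the for loop body.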

Other commands as loop/conditional tests

Many commands can be used as the test or condition in a loop or if statement. For example, grep succeeds if it finds a match and fails if it cannot find a match.

if grep -q -s pattern files... ; then
  echo I found the pattern in the files.
else
  echo The pattern does not occur in the files.
fi

(-q tells grep not to print any output, and -s tells grep not to complain about missing files.)

See “Finding information about commands and programs” below for ways to find out when a command succeeds or fails, and which of its options are helpful when using it as a test in a loop or if statement.

Infinite loops

Two built-in commands help with infinite loops.

Command Succeeds
true always
false never

The following while loop will never stop. (If you try it, type Control+C to make it stop.)

while true; do
  echo are you bored yet?
  sleep 1
done

The following while loop will stop immediately and never execute the echo command.

while false; do
  echo this cannot happen
done

One use of true and false is to set a flag in a shell script to affect an if statement later on.

USE_LOGFILE=true   # true ⇒ use log file; false ⇒ don't

if $USE_LOGFILE; then
  echo "Running analysis at $(date)" >> logfile.txt
fi

Stopping or restarting loops: break and continue

You can break out of a while or for loop using the break command. You can jump back to the test at the start of a while loop using the continue command. Inside a for loop, the continue command restarts the loop body with the loop variable set to the next item in the list of items.

$ for i in {1..10}; do
>   if [ $i -eq 5 ]; then break; fi      # break out of the loop if i = 5
>   if [ $i -eq 3 ]; then continue; fi   # restart the loop if i = 3
>   echo $i
> done
1
2
4

Modify your nfiles.sh script so that it uses a flag to remember whether any arguments were non-directories. If at least one argument was not a directory (it was a regular file, or did not exist) then print a message at the end of the script saying: Warning: non-directories were encountered.

$ ./nfiles.sh . /bin
   43 .
  124 /bin
$ ./nfiles.sh . /bin /usrbin /bin/ls
   43 .
  124 /bin
    ? /usrbin
    F /bin/ls
Warning: non-directories were encountered

Answer

Stopping a script or shell: exit

You can terminate a shell script (or your interactive shell session) using exit.

if test ! -d data; then
  echo "data directory does not exist: giving up"
  exit 1
fi

The argument to exit is optional and should be a number. 0 is success and non-zero is failure. This allows scripts to control loops and conditionals, as part of their TEST, by returning success or failure from the entire script.

Write a short script called exit0.sh that immediately uses exit 0 to terminate its own execution.

Answer

Write another short script called exit1.sh that immediately uses exit 1 to terminate its own execution.

Answer

Use an if statement to verify which script 'succeeds' and which script 'fails'.

$ if ./exit0.sh; then echo succeeded; else echo failed; fi
$ if ./exit1.sh; then echo succeeded; else echo failed; fi

Which exit value represents 'success'?

Which exit value represents 'failure'?

Answer

Modify nfiles.sh so that it succeeds if all arguments were directories and fails if any arguments were non-directories. Test whether it works using an if statement on the command line.

$ if ./nfiles.sh . /bin; then echo OK; else echo KO; fi
   43 .
  124 /bin
OK
$ if ./nfiles.sh . /bin /usrbin /bin/ls; then echo OK; else echo KO; fi
   43 .
  124 /bin
    ? /usrbin
    F /bin/ls
Warning: non-directories were encountered
KO

Answer
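One possible answer, as a sketch that puts the earlier pieces together. It is written as a function named nfiles so that it can be pasted into a terminal, and a function has to use return where a script would use exit (calling exit inside a function would close your terminal session). In nfiles.sh you would write exit 0 and exit 1 instead. The flag name all_dirs is made up.

```shell
nfiles() {
  all_dirs=true                          # flag: set to false by any non-directory
  for arg in "$@"; do
    if [ -d "$arg" ]; then
      n=$(( $(ls "$arg" | wc -l) ))      # number of files, without spaces
    else
      all_dirs=false
      if [ -f "$arg" ]; then n=F; else n='?'; fi
    fi
    while [ ${#n} -lt 5 ]; do n=" $n"; done   # right-justify in 5 characters
    echo "$n $arg"
  done
  if $all_dirs; then
    return 0      # success (in a script file: exit 0)
  else
    echo Warning: non-directories were encountered
    return 1      # failure (in a script file: exit 1)
  fi
}

if nfiles . /usrbin; then echo OK; else echo KO; fi
```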

Command and filename completion

You can save a lot of time by typing the first few characters of a filename and then pressing the Tab key. The shell will try to find a file matching what you typed, and then 'complete' the part of the filename that you did not type. If there is more than one matching file, the shell will complete up to the point where the file names diverge. If there is only one matching file, the shell will complete the entire filename and then add a space at the end.

$ touch a-file-with-a-very-long-name
$ ls a-                                  # press the Tab key
$ ls a-file-with-a-very-long-name        # the shell completes the name
a-file-with-a-very-long-name
$ touch a-file-with-an-equally-long-name
$ ls a-                                  # press the Tab key to complete the name
$ ls a-file-with-a                       # press Tab again to list the matching files
a-file-with-an-equally-long-name  a-file-with-a-very-long-name
$ ls a-file-with-a                       # the command line remains in the same state

Finding information about commands and programs

Programs such as test (and many others) have a large number of command line options. Don't bother trying to memorise more than two or three of the most useful options. Instead, know where to look up information when you need it. There are several ways to find information about a command, depending on the kind of command it is.

Use help to learn about built-in commands

(Note: MobaXterm has its own non-standard help command that does not work as shown below.)

$ help true
true: true
    Return a successful result.

    Exit Status:
    Always succeeds.

$ help help
help: help [-dms] [pattern …]
    Display information about builtin commands.

    Displays brief summaries of builtin commands.  If PATTERN is
    specified, gives detailed help on all commands matching PATTERN,
    otherwise the list of help topics is printed.

    Options:
      -d        output short description for each topic
      -m        display usage in pseudo-manpage format
      -s        output only a short usage synopsis for each topic matching
                PATTERN

    Arguments:
      PATTERN   Pattern specifiying a help topic

    Exit Status:
    Returns success unless PATTERN is not found or an invalid option is given.

Using help you can find information about the syntax of loops and conditionals, the options understood by echo and other commands, and even obtain a list of all the builtin commands by typing help with no arguments.

Notice the last section, “Exit Status”. This tells you when the command will 'succeed' and when it will 'fail'. You can use the command as a TEST in a loop or if statement to check its “exit status” and therefore to test for whatever situation affects that status, according to the description of the command.

Use man to read the manual page for most programs

Commands that are not built in to the shell usually have a manual page. Use man command to read the manual page describing command. Use man -k keyword to see a list of manual pages related to the given keyword. (Note that the version of man used by MobaXterm does not provide the -k keyword option.)

$ man ls
LS(1)                         User Commands                         LS(1)

NAME
       ls - list directory contents

SYNOPSIS
       ls [OPTION]… [FILE]…

DESCRIPTION
       List information about the FILEs (the current directory by default).
       Sort entries alphabetically if none of -cftuvSUX nor --sort is specified.

       -a, --all
              do not ignore entries starting with .

...etc...

Note that the manual page for a command that can 'succeed' or 'fail' (and which is therefore useful in loop and if statement tests) will almost always include an “Exit Status” section describing what situations you can test for using the command.

Asking programs for help

Many programs respond to the option -h or -help or --help by printing brief instructions about how to use that program.

$ cat --help
Usage: /bin/cat [OPTION]… [FILE]…
Concatenate FILE(s) to standard output.

With no FILE, or when FILE is -, read standard input.

  -A, --show-all           equivalent to -vET
  -b, --number-nonblank    number nonempty output lines, overrides -n
  -e                       equivalent to -vE
  -E, --show-ends          display $ at end of each line
  -n, --number             number all output lines
  -s, --squeeze-blank      suppress repeated empty output lines
  -t                       equivalent to -vT
  -T, --show-tabs          display TAB characters as ^I
  -u                       (ignored)
  -v, --show-nonprinting   use ^ and M- notation, except for LFD and TAB
      --help               display this help and exit
      --version            output version information and exit

Examples:
  /bin/cat f - g   Output f's contents, then standard input, then g's contents.
  /bin/cat         Copy standard input to standard output.

Commands that are useful as TESTs will generally tell you about their “exit status” too. For example, on my computer, the output from grep --help includes the following two lines:

Exit status is 0 if any line is selected, 1 otherwise;
if any error occurs and -q is not given, the exit status is 2.

Summary