KUAS Engineering

Week 06 — Command line

You do not rise to the level of your goals, you fall to the level of your systems. James Clear

Evaluation

Up to 10 points can be gained towards your final score by completing the in-class assignment on Friday.

Preparation

1. Complete the self-preparation assignment at home before next class

This week's self-preparation assignment is part practical preparation and part study.

First, read the following sections from the Command line interface guide:

all sections from the beginning up to, and including, Counting characters, words, and lines
Redirection
Pipes
Editing text files with nano

Second, complete the practical exercise in the Notes section below. (If you are already familiar with the command line, please at least skim the notes to make sure there are no small details that are new to you.)

2. Check your understanding of command line concepts using the self-assessment questionnaire

Answer each question in the self-assessment questionnaire as honestly as you can.
Revise the topics having the lowest scores, then update your scores.
Repeat the previous step until you feel comfortable with most (or all) of the topics.

On Thursday evening, press 'submit'. In class on Friday we will check which topics were difficult for everyone.

To succeed at the in-class assignment for this class you should understand the topics outlined in the “Notes” section.

What you will learn from this class

You will learn what the command line (shell) is, why you should know how to use it, and understand how to

enter commands (and understand how the shell interprets what you entered),
use absolute and relative paths,
navigate around the filesystem of your computer,
see what files and directories are present and how to display their contents,
specify the location of files and directories,
create, copy, and delete files and directories,
find files and directories by type, name, pattern, or content,
search files for specific content, and
combine existing commands to do new, powerful things.

Notes

Why learn to use the command line?

Using the command line puts you in control at the level of the operating system and other fundamental processes that make it work. Many operations and options that are not accessible using a graphical interface (Windows Explorer, Mac Finder, etc.) become accessible to you on the command line.

Developers, engineers, scientists, and researchers all use the command line to make themselves faster and more effective (and happier) than they would be using only graphical interfaces.

What is the command line anyway?

The OS provides an abstraction over the computer and its resources. You have (probably) one CPU, some memory, some disk, and a finite amount of time to perform computations. The OS shares these resources fairly between all the programs competing for their use.

To do this it gives each program the illusion that it controls the entire machine. In reality each process gets a share of the disk, a share of the memory, a share of the CPU, and a share of the time. The OS is therefore only responsible for:

managing the disk (filesystem)
managing the memory (memory management and virtual memory systems)
managing the use of the CPU (dividing time into little slices and spreading them around the running programs)
managing communication and synchronisation between programs (network, etc.)

To do this the OS has to do one more thing:

control the execution of programs: starting them, managing their resources, and terminating them.

Notice that none of the above includes “interact with the user”. The operating system could not care less about end users.

On the other hand, we have the illusion of working with the entire computer. We install applications, run programs, interact with them, terminate them. We create and delete files and directories, and move them around in a vain attempt to keep them organised. None of this is the responsibility of the OS.

Instead there is a program (just like any other program) that sits between the user and the OS. This can be a graphical program (like Windows Explorer or the Mac Finder) or it can be an interactive command-line program (a shell). This program is responsible for letting the user control the machine: creating and deleting files and directories, managing physical devices, controlling program execution, etc.

Mac and Linux 'terminals' are examples of interactive command-line shells. On Windows the options include MobaXterm, Cygwin, mingw-w64, Windows Subsystem for Linux (WSL), or even running a guest OS (e.g., Linux or BSD) under a virtual machine such as VirtualBox or VMware.

The rest of this document leads you through a practical exploration of basic command line features. Things you should type are shown in a grey box like this. Keys you should type include Enter or Return (don't type the word, just press the key), and Control-C which means “hold down the Control key while typing the character C”.

Exploring the command line

To follow all the steps you will need a text editor. If you do not already have an editor then one possibility is nano. It is installed by default on Mac and many Linux distributions. In MobaXterm you can install it by typing apt-get install nano on the command line (while connected to the Internet, since it has to be downloaded before it is installed).

Prompt, commands, pressing 'Enter', terminating programs

Start the shell, wait for it to print the prompt, then type ls and wait.
When you get bored, press Enter (or Return if you have it).

Remark: The shell will wait (literally) forever for you to press Enter. If the computer is not responding, did you simply forget to press Enter?

From now on I will assume you press Enter after every command you type.

Type cat (followed by Enter).
When you get bored, press Control-C.

Remark: If you give no arguments to some programs then they use your keyboard for input. If the computer is not responding, did you forget to tell a program which file to read from?

How the shell parses what you type

Type echo.
Type echo hello world.
Type echo hello      world with several spaces between the words.
Type echo hello world and press the cursor-left key until the cursor is in the middle of the line before pressing Enter.

Remark: White space is used to break the line into a command part followed by zero or more argument parts. Once the line is broken into parts the white space is discarded. It does not matter how much white space you use, or even where the cursor is positioned in the line when you press Enter.
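The splitting described in this remark can be seen by comparing unquoted and quoted arguments. This is only a small sketch: the variable names are made up for illustration, and the double quotes (which keep a run of spaces inside a single argument) go slightly beyond the steps above.

```shell
# Unquoted: the shell splits the line at white space, so echo receives
# two arguments (hello and world) and prints them separated by one space.
unquoted=$(echo hello      world)
echo "$unquoted"

# Quoted: everything between the quotes is one single argument,
# so the spaces inside it are preserved.
quoted=$(echo "hello      world")
echo "$quoted"
```

The first echo prints the two words with a single space between them; the second keeps all six spaces.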

Directories

Type echo ~ (this is the path name of your home directory)
Type cd (this changes your current directory to your home directory)
Type pwd (this shows you where you are; check you actually are in your home directory)
Type ls (this will show you the names of the files and directories in your home directory)
Type ls /home/<your-username> or ls /Users/<your-username> (e.g., ls /home/piumarta – this also shows you your home directory)
Type ls ~ (of course, this is another way to list your home directory)
Type ls . (a single dot is another name for “this directory”, which is either your home or the last directory you changed to using cd)
Type ls /home (or maybe ls /Users – this shows you the directory where all accounts are stored, the parent of your home directory)
Type ls .. (this also shows you the parent of your home directory, because your current directory is your home and '..' is the name of the parent directory of the current directory)

If there are more names in the /home (or /Users) directory, pick one of them. Let's call that name <name>. (If there are none, just use your own name again.)

Type ls ~<name> (this is another name for the home directory of the user called <name>)
Type cd .. (this changes your current directory to /home, where all the home directories are stored)
Type pwd (this prints the working directory, proving you are now “in” the directory where home directories are stored)
Type ls (you will see your account name listed, and the names of any other accounts on the computer)
Type ls <your-username> (this is a relative path, which begins in the working directory instead of the root directory)
Type cd - (most shells print the name of a directory when you do this, but… where are you now?)
Type pwd (cd understands the special argument '-' which means “the directory I was in before this one”)

Remark: There are several ways to specify locations in the computer, and one of them is implicit (the current working directory) and often used as a default when you do not specify any other directory.
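The different ways of naming a location can be tried out safely in a scratch directory. The sketch below assumes that mktemp is available; the names top, project, and notes.txt are hypothetical and exist only for this example.

```shell
# Make a scratch directory so that nothing in your home directory is touched.
top=$(mktemp -d) || exit 1
mkdir "$top/project"
touch "$top/project/notes.txt"

cd "$top/project"     # absolute path: it starts at the root directory /
pwd
cd ..                 # relative path: '..' is the parent of the working directory
ls project            # relative path: looked up inside the working directory
ls "$top/project"     # absolute path to the same directory, same listing
cd -                  # return to the directory you were in before this one
where=$(pwd)
echo "$where"
```

Both ls commands list notes.txt, because the relative and the absolute path name the same directory.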

Where are commands implemented?

Type type echo (this shows you that echo is a built-in command, implemented in the shell itself; when you echo things, the shell performs the “echo”ing for you directly)
Type type ls (this shows you that ls is a program that is stored on the disk; the shell runs the ls program for you whenever you type its name)

Remark: Commands are either built into the shell, or they are programs stored in the filesystem just like any other file. Having a user program manage the running of other user programs in this way was one of the reasons why shells were invented.

Remark: There is nothing special about commands, and you can add lots (and lots) of new commands by installing programs on your computer in places such as /usr/bin or /usr/local/bin.
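As a short illustration of both remarks, the sketch below asks the shell where two commands come from. It assumes ls has not been redefined as an alias (the normal situation in a script); the variable name ls_path is made up for this example.

```shell
type echo                    # reports that echo is built into the shell
type ls                      # reports where the ls program is stored

# command -v prints just the path, which is handy for saving in a variable.
ls_path=$(command -v ls)
echo "$ls_path"
ls -l "$ls_path"             # the ls program is an ordinary file on the disk
```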

Hidden files, and visual clues about file type

Type ls . (this shows you all the files in the current directory, not the directory itself)
Type ls -d . (this shows you the details of the directory '.' itself, not the files that it contains; -d means “list directories as themselves”)
Type ls -a (now you can see the hidden directory entries, which start with '.', including '.' itself and its parent '..')
Type ls -aF (this will put a '/' after directory names, and a '*' after executable files)
Type ls -F /usr/bin (there is a large collection of executable files in /usr/bin)
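The options above are easier to compare in a small directory whose contents you know. This sketch builds one with mktemp; the file and directory names are hypothetical.

```shell
scratch=$(mktemp -d) || exit 1
cd "$scratch"
mkdir docs
touch visible.txt .hidden.txt
printf '#!/bin/sh\necho hi\n' > run.sh
chmod +x run.sh              # mark run.sh as executable

ls                           # hidden entries are not shown
ls -a                        # now .  ..  and .hidden.txt appear too
ls -aF                       # docs gets a trailing /, run.sh a trailing *

plain=$(ls)
all=$(ls -a)
marked=$(ls -F)
```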

Creating directories, copying files

Type cd /tmp (this puts you in a directory meant for temporary files)
Type pwd (make sure the cd command really worked and this prints /tmp)
Type mkdir mydir (MaKes a DIRectory called mydir)
Type ls -ld mydir (check that you are the owner of the directory: -l = long format, -d = show information about directories themselves, not about the files they contain)
Type cp /etc/passwd mydir (this copies the file /etc/passwd into your mydir directory. We can do better, though. Try the following instead…)
Type cp -vip /etc/passwd mydir/ (this version employs several “safety features” that command line pros use often
- the option -v means “verbose”: it prints each file as it is copied
- the option -i means “interactive”: it asks you whether you want to overwrite any files that already exist
- the option -p means “preserve”: the copy keeps the attributes of the original, such as its permissions and, in particular, its timestamp
- following the destination directory's name with a '/' ensures the destination really is an existing directory: if for some reason the directory does not exist, cp will print an error message. Without the / at the end, if the directory did not exist then the file would be copied to a regular file called “mydir” which is definitely not what we want, so including the / ensures our intentions are enforced)

Remark: use the options that programs like ls and cp provide so that they give you the information and protection from mistakes that you want, and make use of the (very) few “safety” features (such as trailing / on directory names) that are available in the command line.
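The effect of the trailing / can be checked with a deliberate typo. This is a sketch in a scratch directory; the names notes.txt, backup, and the misspelt bakcup are hypothetical. The -i option is left out here only because nothing is being overwritten.

```shell
scratch=$(mktemp -d) || exit 1
cd "$scratch"
echo "important" > notes.txt
mkdir backup

cp -vp notes.txt backup/     # backup/ exists, so the file is copied into it

# Misspelt directory name WITH a trailing slash: cp reports an error.
cp -vp notes.txt bakcup/ || echo "cp failed, nothing was copied"

# Same typo WITHOUT the slash: cp quietly creates a regular file called bakcup.
cp -vp notes.txt bakcup
ls -lF
```

The exact wording of the error message differs between systems, but in each case no copy is made.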

Copies of files vs. multiple links to inodes

Type nano data.txt
Type the first ten counting numbers, as words, one per line:
- one
- two
- three
- four
- five
- six
- seven
- eight
- nine
- ten
Type Control-O and Enter (to write Out the file)
Type Control-X (to eXit nano)
Type ls -il (you can see your file, its owner, and how long it is; the first number, which is probably huge, is the inode number, which identifies the inode describing the file's contents)
Type cp data.txt copy1.txt
Type ls -il (you can see copy1.txt is the same size but has a different inode: the contents were copied)
Type nano copy1.txt and then add this is copy1 at the start of the file
Type Control-O Enter Control-X (write out the file and exit)
Type ls -il (you can see copy1.txt is now larger than data.txt, but its inode has not changed)
Type cat data.txt (this concatenates the files named in the command arguments and prints them on the terminal; you can see that the original file is unchanged)
Type cat copy1.txt (you can see that the copy has been changed)
Type ls -il (you can see that the inode of copy1.txt has not changed, but the contents of the storage blocks of the file were changed)

Remark: cp makes a brand new directory entry and a brand new inode and then copies the contents of the original file into brand new storage.

Remark: When you edit a file with nano the inode does not change, only the contents of the file change. Continue with this section to see why this is significant.

Type ln data.txt copy2.txt (this creates another link to data.txt's inode called copy2.txt)
Type ls -li (you can see that the inode numbers of data.txt and copy2.txt are the same. The ln program made a new directory entry but did not copy the inode. You can also see that the link count of data.txt and copy2.txt is 2, whereas the link count of copy1.txt remains 1, because there are now two directory entries pointing at the one inode shared by data.txt and copy2.txt)
Type nano copy2.txt and add "this is copy2" at the start of the file (then press Control-O, Enter, and Control-X to write out the file and exit)
Type ls -il (you can see copy2.txt and data.txt are both now larger)
Type cat copy2.txt (you can see that copy2.txt has been changed)
Type cat data.txt (because copy2.txt and data.txt share the same inode, they both change when you edit either one of them; they are the same file, but with multiple directory entries pointing to it with different names)

Remark: ln makes a new link to an existing inode and file contents. If you modify any one of the files sharing the same inode, they will all change in exactly the same way.

Remark: The link count of a file (or directory) tells you how many directory entries “point to” (share) the same inode.
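The difference between a copy and a link can also be reproduced without an editor. The sketch below runs in a scratch directory and, as an assumption made so that it can run unattended, appends a line with >> instead of editing in nano; like nano, >> changes the file's contents without changing its inode.

```shell
scratch=$(mktemp -d) || exit 1
cd "$scratch"
printf 'one\ntwo\nthree\n' > data.txt

cp data.txt copy1.txt        # new directory entry, new inode, new storage
ln data.txt copy2.txt        # new directory entry, same inode

# Change the file through the name copy2.txt.
echo "added via copy2" >> copy2.txt

ls -li                       # column 1 is the inode number, column 3 the link count

# Save the inode numbers so that they can be compared.
ino_data=$(ls -i data.txt | awk '{print $1}')
ino_copy=$(ls -i copy1.txt | awk '{print $1}')
ino_link=$(ls -i copy2.txt | awk '{print $1}')
cat data.txt                 # the added line is visible under this name too
```

data.txt and copy2.txt report the same inode number and the same contents, while copy1.txt has its own inode and still holds only the original three lines.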

Finding files by their type

Type find . -type d (this will print all the directories in or under the current directory; it will probably only print '.' unless you created more directories)
Type find .. -type d (this will print all the directories in or under the parent directory; it should probably find several more names, including 'mydir')
Type find . -type f (this will print all the files in or under the current directory; it should print at least data.txt, copy1.txt, and copy2.txt)
Type find . -name *.txt (this will print an error message… why?!? Let's find out…)

Remark: You can search for files based on their type: file (-type f), directory (-type d), etc.

Finding files by name

Type echo find . -name *.txt (this will print the command that the shell just ran; it says “find . -name copy1.txt copy2.txt data.txt” which is not what you wanted – the shell expanded *.txt into the names of all the .txt files)
Type find . -name '*.txt' (this will print all the files in, or under, '.' whose names end with '.txt')
Type find . -name 'c*' (this will print copy1.txt and copy2.txt; all the files whose names start with 'c')

Remark: You can use echo to see exactly what the shell is doing with complex arguments.

Remark: You can use quote characters '…' to stop the shell messing with your arguments; often you want *.txt to mean “all the text files in this directory”, but in this case you did not want that at all.
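The two remarks above can be checked side by side in a scratch directory. In this sketch the file names (a.txt, b.txt, sub/c.txt, notes.md) are hypothetical.

```shell
scratch=$(mktemp -d) || exit 1
cd "$scratch"
mkdir sub
touch a.txt b.txt sub/c.txt notes.md

# Unquoted: the shell replaces *.txt with the matching names in THIS
# directory before find ever runs. echo shows the resulting command.
expanded=$(echo find . -name *.txt)
echo "$expanded"

# Quoted: find receives the pattern itself and applies it in every
# directory that it visits, so sub/c.txt is found as well.
found=$(find . -name '*.txt' | sort)
echo "$found"
```

The echo line shows find . -name a.txt b.txt, which find would reject, while the quoted version prints all three .txt files.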

Finding data in files, finding files by their content

Type grep e data.txt (this will search the content of data.txt for all lines that have the letter 'e' in them; you should see one, three, five, etc., but not two, four, or six)
Type grep two * (this will search the content of all files in the current directory for lines that have the word 'two'. You should see three lines, one from each of the three files. Because there was more than one file argument on the command line, grep also prints the name of the file(s) where the target string 'two' was found)

Remark: You can search for content in one or more files.

Remark: You can search for files based on whether they contain particular content.
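Both kinds of search can be tried on a few small files. The sketch below uses hypothetical files in a scratch directory and adds two standard grep options that the steps above do not use: -c, which counts matching lines, and -l, which prints only the names of the files that contain a match.

```shell
scratch=$(mktemp -d) || exit 1
cd "$scratch"
printf '%s\n' one two three four five six seven eight nine ten > data.txt
printf '%s\n' "a copy" two > other.txt
printf '%s\n' "nothing relevant" > unrelated.txt

grep e data.txt              # search one file: lines containing the letter e
grep two *                   # search several files: matches are prefixed by file name
grep -l two *                # find files by content: print only the file names

with_e=$(grep -c e data.txt)
matching_files=$(grep -l two *)
```

Seven of the ten number words contain an 'e', and only data.txt and other.txt contain the word 'two'.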

Redirecting output to a file

Type ls -l /usr/bin (there is a lot of output)
Type ls -l /usr/bin > /tmp/files.txt (there is no output; what happened? All the output from ls was redirected (written) to /tmp/files.txt instead of to the screen)
Type cat /tmp/files.txt (there's the output that would have gone to the screen)

Remark: Command output can be saved in a file using the redirection operator > file.

Remark: There is too much output in files.txt to see it all at once.

Type less /tmp/files.txt (this will show you the output one page at a time. Press space to move forward a page; press up and down arrows to move forward or backward a line; press G to go to the end and g to go to the start of the file; press q to quit.)

Remark: To view a large amount of data one page at a time, use the program 'less'.

Type grep ed /tmp/files.txt (this finds all programs in /usr/bin that have 'ed' in their name; not especially useful, but it illustrates an important point…)

Remark: The output of a command can be redirected to (saved in) a file and analysed using other programs.
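The save-then-analyse pattern looks like this in a self-contained form. As assumptions for the sketch, a scratch directory called bin with a few empty stand-in files takes the place of /usr/bin, and plain ls is used instead of ls -l so that the saved file contains only names.

```shell
scratch=$(mktemp -d) || exit 1
cd "$scratch"
mkdir bin
touch bin/sed bin/red bin/cat bin/ls     # empty stand-ins for real programs

ls bin > files.txt           # nothing appears on screen: the output goes to files.txt
cat files.txt                # here is the output that would have gone to the screen
grep ed files.txt            # analyse the saved output with another program

matches=$(grep -c ed files.txt)
echo "$matches"
```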

Redirecting input from a file

Type grep ed and then enter:
- hello
- are
- we
- bored (grep will echo back only the line containing 'ed')
- yet?
Type Control-C (to terminate the program)

Remark: Many commands can read from the keyboard as well as reading from files.

Type grep ed < /tmp/files.txt (this will act as if you have typed the input, but the input is taken from the file /tmp/files.txt)

Remark: Just as output can be redirected to a file using >, input can be redirected from a file using <.
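To see that < really does stand in for the keyboard, the lines typed in the previous step can be stored in a file and fed to grep in both ways. The file name typed.txt is hypothetical.

```shell
scratch=$(mktemp -d) || exit 1
cd "$scratch"
printf '%s\n' hello are we bored 'yet?' > typed.txt

from_arg=$(grep ed typed.txt)        # grep opens the file itself
from_stdin=$(grep ed < typed.txt)    # the shell opens the file and connects it to grep's input
echo "$from_arg"
echo "$from_stdin"
```

Both commands print the same single line, bored. With <, grep never learns the name of the file; it simply reads its input as if you had typed it.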

Pipelines and filters

What if you wanted to avoid creating a temporary file in between ls and grep?

Type ls -l /usr/bin | grep ed (this prints all files in /usr/bin whose name includes ed. The output of ls was redirected to the input of grep, without using an explicit temporary file in the middle. [There actually is something like a temporary file in the middle, called a pipe, but it is invisible and exists only in the computer's memory.])

Remark: The output of one program can be sent to the input of another program.

Type wc and then enter these two lines:
- Why was the computer late for the meeting?
- Because it had a hard drive.
Type Control-D (wc will print the number of lines, words, and characters that you typed, in that order)

Remark: wc can analyse text files by counting characters, words, and lines.

Remark: When a program is reading from the keyboard, Control-D is a way to make the program believe that it reached the end of the input file. Try it with cat: run cat, type a few lines, then press Control-D.
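The same count can be reproduced without typing at the keyboard. In this sketch, which assumes nothing beyond echo and wc, a pipe delivers the two lines to wc and the end of the piped input plays the role of Control-D; the variable names are made up for the example.

```shell
joke='Why was the computer late for the meeting?
Because it had a hard drive.'

echo "$joke" | wc            # three numbers: lines, words, characters

# -l and -w select a single count; tr removes the padding some systems add.
lines=$(echo "$joke" | wc -l | tr -d ' ')
words=$(echo "$joke" | wc -w | tr -d ' ')
echo "$lines lines, $words words"
```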

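Type ls -l /usr/bin | grep e | wc -l (this prints the number of lines in the listing of /usr/bin that contain an 'e'; this is roughly the number of programs that have an 'e' in their name, although a line is also counted if the 'e' appears elsewhere in it, such as in the owner's name or the date)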

Remark: Programs can be chained together into long pipelines by joining the output of each one to the input of the next.

Remark: In this example, grep is acting as a filter. It reads input, filters it in some way, and then writes the result to its output.

Remark: Many command line utilities are built this way, so that they can be composed to perform useful functions. Individually they are all quite small and simple, but together their behaviour can be very complex. The flexibility to compose them in many ways is one reason that the command line is so powerful for managing and analysing data.

Remark: The '|' character is called “pipe”; it is used a lot by command line “pros”.
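A pipeline with a predictable result makes the filter idea concrete. As assumptions for this sketch, a scratch directory with five empty, hypothetically named files stands in for /usr/bin, and plain ls is used so that only the names pass through the filter.

```shell
scratch=$(mktemp -d) || exit 1
cd "$scratch"
touch sed red cat ls less    # empty stand-ins for real programs

ls | grep e                  # filter: keep only the names containing an e
count=$(ls | grep e | wc -l | tr -d ' ')
echo "$count"
```

Three of the five names (less, red, and sed) contain an 'e', so the pipeline counts 3. Replacing ls with ls /usr/bin gives the corresponding count for the real programs on your computer.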