Lesspress.net: Bash

General documentation / cheat sheets for various languages and services

Initialization

~/.bash_profile
- According to bash(1):
  
  The personal initialization file, executed for login shells.
- It’s run once when you log in, but not for each additional (sub)shell you open.
- Evidently, macOS (Terminal) runs a full login shell for each new session you start, which is why my ~/.bash_profile simply reads:
```
[ -f ~/.bashrc ] && source ~/.bashrc
```
~/.bashrc
- According to bash(1):
  
  The individual per-interactive-shell startup file.
- It is not automatically loaded when you log in, but only when you open an interactive non-login shell.
~/.profile
- Read by both sh and bash, but bash will ignore it if ~/.bash_profile (or ~/.bash_login) exists.
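
To see which of these files applies to a given session, it can help to ask the shell what kind it is. The sketch below is one way to do that; `shell_kind` is a made-up helper name, while `shopt -q login_shell` and the `i` flag in `$-` are standard bash.

```bash
# shell_kind: print "login" or "non-login", then "interactive" or "non-interactive".
# (shell_kind is a hypothetical helper, not part of bash.)
shell_kind() {
  local kind
  if shopt -q login_shell; then kind='login'; else kind='non-login'; fi
  case $- in
    *i*) kind="$kind interactive" ;;
    *)   kind="$kind non-interactive" ;;
  esac
  echo "$kind"
}

shell_kind
```

Running it from ~/.bash_profile and from ~/.bashrc should show which path a new terminal session actually takes.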

Positional parameters

Var	Description
`$0`	The path of the executed shell script (not the path to the bash interpreter).
`$1`	The first command line argument. (`$2` is the second, etc.)
`$#`	The number of positional arguments (`$1`, `$2`, …) available, not including the shell script itself.
`"$*"`	All positional arguments, as one token. This should always be quoted.
`"$@"`	All positional arguments, as multiple individually quoted tokens. It should also always be quoted, and it’s generally preferable to `"$*"`.
`$$`	PID of the current process.
`$!`	PID of the most recently executed background process.
`$_`	Final argument of most recently executed command
`$?`	Exit status of most recently executed command
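
A small sketch of how `$#`, `"$*"`, and `"$@"` differ inside a function; `show_args` is a hypothetical helper, and the output shown assumes the default `IFS`.

```bash
# show_args is a hypothetical helper that reports how it was called.
show_args() {
  printf 'count=%d\n' "$#"
  printf 'star=[%s]\n' "$*"   # one token: arguments joined with spaces
  printf 'at=[%s]\n' "$@"     # one token per argument, so the format repeats
}

show_args 'a b' c
# count=2
# star=[a b c]
# at=[a b]
# at=[c]
```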

Operators

	Description
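`=~`	Tests for a regular-expression match anywhere in the string. For example, `[[ "$@" =~ '--help' ]]` checks whether the text `--help` appears anywhere in the command line arguments; because the quoted pattern is matched literally as a substring, `--helpful` also matches.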
`-z ""` or `-z`	Evaluates to true
`-z "something"`	Evaluates to false
`${var+whatever}`	If `var` is set (even to the empty string), evaluates to the string “whatever”; otherwise evaluates to null (i.e., the empty string)
`-z ${var+whatever}`	So, this test evaluates to true when `var` is unset, and “whatever” can be any non-empty string.
`-z "$var"`	This evaluates to true if `var` is unset or set to the empty string. (But throws if `var` is unset and `set -u` is active.)
`-n ""`	Evaluates to false (`-n` is nearly the opposite of `-z`, but requires its argument to be quoted.)
`-n "$var"`	Evaluates to true if `var` is set to a non-empty string.
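
The tests above can be combined to tell the three states of a variable apart. This is only a sketch: `state` and `demo_var` are names invented for the example. `${demo_var+x}` expands to `x` only when the variable is set, so `-z` on it is true only when the variable is unset.

```bash
# state: report whether the global variable demo_var is unset, empty, or non-empty.
# (state and demo_var are hypothetical names for this example.)
state() {
  if [ -z "${demo_var+x}" ]; then
    echo unset
  elif [ -z "$demo_var" ]; then
    echo empty
  else
    echo non-empty
  fi
}

unset demo_var; state   # prints: unset
demo_var='';    state   # prints: empty
demo_var='hi';  state   # prints: non-empty
```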

Internal variables

$PROMPT_COMMAND

Execute the contents of this variable as a command before showing the prompt.
$PIPESTATUS

An array of exit statuses from the different processes in the last sequence of piped commands. ${PIPESTATUS[0]} holds the exit status of the first (leftmost) process, ${PIPESTATUS[1]} holds the exit status of the second process, etc.
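
A minimal illustration of reading `PIPESTATUS`; the array name `statuses` is arbitrary. The copy has to happen immediately, since any following command replaces the contents.

```bash
false | (exit 3) | true
statuses=("${PIPESTATUS[@]}")   # copy right away; the next command overwrites it
echo "${statuses[*]}"           # prints: 1 3 0
```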

String manipulation

The Advanced Bash-Scripting Guide is misleading here. In these patterns, string is a variable name, while substring is the pattern itself, written out literally (a glob, not a regular expression), that gets matched against the variable’s value.

Syntax	Explanation
Deletion
`${string#substring}`	Delete shortest match of `substring` from front (implicit `^`) of the `$string` variable.
`${string##substring}`	Delete longest match of `substring` from front (implicit `^`) of the `$string` variable.
`${string%substring}`	Delete shortest match of `substring` from end (implicit `$`) of the `$string` variable.
`${string%%substring}`	Delete longest match of `substring` from end (implicit `$`) of the `$string` variable.
Replacement
`${string/substring/replacement}`	Replace first match (in `$string` variable) of `substring` with `replacement`.
`${string//substring/replacement}`	Replace all matches (in `$string` variable) of `substring` with `replacement`.
`${string/#substring/replacement}`	Replace `substring` with `replacement` iff the `$string` variable starts with `substring`.
`${string/%substring/replacement}`	Replace `substring` with `replacement` iff the `$string` variable ends with `substring`.

For example, to replace all newlines in the variable IDS with commas: ${IDS//$'\n'/,}
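
The deletion and replacement forms are easiest to compare on a single value. The path below is an arbitrary example string, and the variable names are made up for illustration.

```bash
path='/tmp/archive.tar.gz'

base=${path##*/}            # archive.tar.gz        longest '*/' removed from the front
ext=${path#*.}              # tar.gz                shortest '*.' removed from the front
stem=${path%%.*}            # /tmp/archive          longest '.*' removed from the end
no_gz=${path%.*}            # /tmp/archive.tar      shortest '.*' removed from the end
swapped=${path/%.gz/.bz2}   # /tmp/archive.tar.bz2  replaced because path ends in .gz
shouted=${path//a/A}        # /tmp/Archive.tAr.gz   every 'a' replaced

printf '%s\n' "$base" "$ext" "$stem" "$no_gz" "$swapped" "$shouted"
```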

Table of behaviors when $param is in different states, drawn from https://stackoverflow.com/a/16753536:

	`param` set and not null	`param` set but null	`param` unset
`${param:-word}`	substitute `$param`	substitute `word`	substitute `word`
`${param-word}`	substitute `$param`	substitute null	substitute `word`
`${param:=word}`	substitute `$param`	assign `word`	assign `word`
`${param=word}`	substitute `$param`	substitute null	assign `word`
`${param:?word}`	substitute `$param`	error, exit	error, exit
`${param?word}`	substitute `$param`	substitute null	error, exit
`${param:+word}`	substitute `word`	substitute null	substitute null
`${param+word}`	substitute `word`	substitute `word`	substitute null
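
A few rows of the table, run against an unset variable and a null one. The names `name` and `empty` are placeholders chosen for this sketch.

```bash
unset name
empty=''

printf '[%s]\n' "${name:-guest}"      # [guest]     name is still unset afterwards
printf '[%s]\n' "${empty-fallback}"   # []          set but null, so no substitution
printf '[%s]\n' "${empty:-fallback}"  # [fallback]  the colon also covers null
printf '[%s]\n' "${empty:+alt}"       # []
printf '[%s]\n' "${empty+alt}"        # [alt]       set, even though null
printf '[%s]\n' "${name:=guest}"      # [guest]     and name is now assigned
```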

Array variables

All bash variables are arrays; it’s just that most of them have only one value, and the default variable-dereferencing syntax retrieves the first value. I.e., $var is equivalent to the more explicit ${var[0]} for any variable var.

The special parameters $@ and $*, along with $# and $1, $2, …, are well known since they refer to the command’s / function’s arguments, but other arrays require a little extra syntax:

`"${FILES[@]}"`	Reference all elements of the `FILES` variable.
`${#FILES[@]}`	Return the number of elements in the `FILES` variable.
`${!FILES[@]}`	Return the indices (keys for associative arrays) in the `FILES` variable.
`"${FILES[0]}"`	Retrieve the first entry of the `FILES` variable.
`"${FILES[1]}"`	Retrieve the second entry of the `FILES` variable.

Array literals use round parentheses; (a b), ( a b ), and ('a' 'b') are all equivalent; (a b), ('a b'), and (a, b) are all different.

INPUT_FILES=()

This initializes the variable INPUT_FILES as an empty array. This is different from INPUT_FILES=, which sets the value (of the first element, implicitly) of the variable INPUT_FILES to the empty string.
INPUT_FILES+=(/tmp/z)

This appends the string /tmp/z to the INPUT_FILES variable. The RHS of the += operator is simply an array literal of any length, which gets concatenated to the end of the current value of the INPUT_FILES (array) variable.
INPUT_FILES+='/tmp/a'

This appends the string /tmp/a to the first element in INPUT_FILES; probably not what you intended!

Bash does not support multidimensional arrays; e.g., GRID=((1 0) (0 1)) is a syntax error.

Arrays can be sliced much as in Python: ${FILES[@]:1} returns all but the first element; ${FILES[@]:0:2} returns just the first two elements.

And at least in Bash 4, negative indices work as counting from the end of the array: ${FILES[-1]} returns the last element in FILES.

Earlier versions of bash (and some other shells) have a special syntax to get the last command line argument: ${@: -1} (n.b.: the space before -1 is required; otherwise it’s parsed as parameter substitution’s :- default-when-null operator.)
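
Putting the array syntax together, as a sketch: the paths are arbitrary strings, not files that need to exist, and the last element is fetched by computing its index so that the example does not depend on negative subscripts.

```bash
FILES=()                          # empty array
FILES+=(/tmp/a '/tmp/b c')        # append two elements (the second contains a space)
FILES+=(/tmp/d)                   # append one more

count=${#FILES[@]}                # 3
rest=("${FILES[@]:1}")            # all but the first: '/tmp/b c' and /tmp/d
last=${FILES[${#FILES[@]}-1]}     # /tmp/d, without relying on negative indices

for i in "${!FILES[@]}"; do
  printf '%d: %s\n' "$i" "${FILES[$i]}"
done
```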

Examples

Suppose ARR=(a b c) and argc() { printf "%d\n" $#; }:

Executed command	Result
`argc ${ARR[@]}`	`3`
`argc ${ARR[*]}`	`3`
`argc "${ARR[*]}"`	`1`
`argc "${ARR[@]}"`	`3`

IO redirection

: can be used like Python’s pass, as a placeholder for a no-op statement where, otherwise, there would be a syntax error. For example, echo hello | | cat is a syntax error, but echo hello | : | cat runs (though cat receives no input, because : never produces output).

Syntax	Effect
`1>N` or `>N`	Write `STDOUT` to N
`1>>N` or `>>N`	Append `STDOUT` to N
`2>N`	Write `STDERR` to N
`2>&1`	Redirects `STDERR` to `STDOUT`
`>>file 2>&1`	Appends both `STDERR` and `STDOUT` to `file`
`2>&1 >>file`	Redirects `STDERR` to `STDOUT` but writes what would have otherwise gone to `STDOUT` to `file` instead
`2>>N`	Append `STDERR` to N
`&>`	Write both `STDOUT` and `STDERR` to file (e.g., `&>/dev/null` to discard all output streams). The appending variant, `&>>`, is a Bash 4 feature.
`N>&-`	Close output file descriptor `N` (which defaults to 1, if missing)
`N<&-`	Close input file descriptor `N` (which defaults to 0, if missing)
`N<>filename`	Open file or device `filename` for reading and writing under the alias `&N`. `N` should be a number greater than 2.

These generally only affect the command they are attached to; to apply them to the rest of the current shell session, use exec. For example, call exec 2>>/var/log/somescript.log at the beginning of your shell script to append all STDERR to that file instead of letting it appear in the user/caller’s TTY (which also preempts the user’s control over STDERR).
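
The effect of redirection order in the table above can be checked with a command that writes to both streams. In this sketch, `emit` is a hypothetical stand-in for such a command.

```bash
# emit is a hypothetical command that writes one line to each stream.
emit() { echo out; echo err >&2; }

both=$(emit 2>&1)                  # STDERR joins STDOUT: captures "out" and "err"
only_err=$(emit 2>&1 >/dev/null)   # STDERR takes over the old STDOUT; STDOUT is discarded
nothing=$(emit >/dev/null 2>&1)    # STDOUT goes to /dev/null first, then STDERR follows it

printf 'both=[%s]\nonly_err=[%s]\nnothing=[%s]\n' "$both" "$only_err" "$nothing"
```

Redirections are processed left to right, and `2>&1` copies whatever `STDOUT` points at when it is evaluated, which is why the two orderings differ.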

exec {NEW_FD}<>filename

Open file or device filename as the smallest available file descriptor (starting from 10), and save its number in the shell variable NEW_FD. So after this command runs, echo $NEW_FD will print something like 10. Use the {NEW_FD} syntax whenever creating or closing the named file descriptor, but use &$NEW_FD whenever using it as a redirection target.
exec 2> >(logger) or exec 2>> >(logger) or exec 2>>>(logger)

Send all STDERR over to syslog.

logger is a BSD (and thus, macOS) command that opens up a process that can be written to, which pipes input into syslog. E.g., echo "testing logger from shell" | logger will show up in Console.app with a timestamp and the “tag” chbrown, since that’s my username. logger -t shell-log applies the tag shell-log to everything it receives on STDIN. Each input line translates to a single syslog entry.

The three commands are effectively equivalent, since there’s no difference between overwriting and appending to the file descriptor that logger opens. The latter two commands are syntactically equivalent, but a space is required between the N> and >(...) syntax in the first command. Due to bash’s parser, exec 2>>(logger) is lexed as exec 2>> (logger), which is a syntax error.
exec >/tmp/script.log

For the remainder of the current script / scope, this will write STDOUT to the file /tmp/script.log (overwriting it if it exists).
exec 3<>/tmp/other.log; date >&3; exec 3>&-

Open the file /tmp/other.log as file descriptor 3, write the current date to it, and then close the file descriptor.

(The spaced form, exec 3<> /tmp/other.log, is equivalent to the first command.)

File descriptors 3 through 9 are all initially closed, but each can be arbitrarily opened and closed with this exec N<> syntax. Trying to read from or write to a closed file descriptor will raise the error message “N: Bad file descriptor”, where N is the closed file descriptor you tried to use.

The first step creates the file if needed, and sets the pointer to the beginning of it. Writing to it will simply overwrite whatever’s currently there, byte for byte; i.e., it doesn’t truncate the file before writing to it. Each write advances the cursor, so multiple calls to date >&3 will write each timestamp to its own line.

To append to a file descriptor opened this way, you can first read through it with, e.g., cat; once cat <&3 >/dev/null completes, all subsequent writes to file descriptor 3 will be appended to the file, because the cursor has been advanced to the end of the file.
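
The overwrite-without-truncate behavior of `exec N<>` can be observed with a scratch file. This is a sketch under the assumption that file descriptor 3 is free and `mktemp` is available; the variable names are invented for the example.

```bash
log=$(mktemp)          # scratch file standing in for a real log
exec 3<>"$log"         # open for reading and writing as fd 3
echo first  >&3
echo second >&3
exec 3>&-              # close fd 3

exec 3<>"$log"         # reopen: the cursor is back at the start, nothing is truncated
echo FIRST >&3         # overwrites "first" and its newline, byte for byte
exec 3>&-

contents=$(cat "$log")
printf '%s\n' "$contents"   # FIRST, then second
rm -f "$log"
```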

Exit codes

In a bash session, $? refers to the exit / status code of the last-run command. Inside a bash script, $? starts out 0, and when the script finishes naturally, its overall exit code (insofar as the caller is concerned) is the value of $? at that point. You can’t assign to $? directly (e.g., ?=101 is not valid); to end with a specific code, call exit 101.

Table of “Exit Codes With Special Meanings” from Appendix E of the “Advanced Bash-Scripting Guide” (with clarifications where applicable):

Value	Description
0	Success!
1	Any sort of error; might indicate that the error doesn’t fall nicely into one of the subsequent categories, or the developer was too lazy (or too skeptical about standards) to pick a more descriptive error code
2	Misuse of shell builtins (missing keyword or command); permission problem; `diff` return code on a failed binary file comparison.
126	Command invoked cannot execute (permission problem or command is not an executable)
127	Command not found
128	Invalid argument to exit (caused by `exit "tau"` or `exit 6.28`, etc.)
128+n	Fatal error signal `n`; `kill -9 $PID` (`SIGKILL`) causes the script running as `$PID` to have exit code `137`; terminating a script with `Ctrl-c` sets the exit code to `130` (`Ctrl-c` is fatal error signal `2` (`SIGINT`), and `128` + `2` = `130`)
255	Exit status out of range (caused by `exit -1` or lower; calling `exit 1000` is like calling `exit 232`, since `1000` % `256` = `232`)
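
Some of these codes are easy to reproduce. In the sketch below, `run` is a hypothetical helper, and `no-such-command-for-demo` is assumed not to exist on your PATH.

```bash
# run CMD [ARGS...]: run a command quietly and print its exit status.
# (run is a hypothetical helper for this example.)
run() {
  local status=0
  "$@" >/dev/null 2>&1 || status=$?
  echo "$status"
}

run true                         # 0
run false                        # 1
run no-such-command-for-demo     # 127, assuming no such command exists on PATH
run bash -c 'exit 1000'          # 232, i.e. 1000 % 256
run bash -c 'kill -KILL $$'      # 137, i.e. 128 + 9
```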

Exit code constants from /usr/include/sysexits.h:

Constant	Value	Description
`EX_OK`	`0`	successful termination
`EX_USAGE`	`64`	command line usage error
`EX_DATAERR`	`65`	data format error
`EX_NOINPUT`	`66`	cannot open input
`EX_NOUSER`	`67`	addressee unknown
`EX_NOHOST`	`68`	host name unknown
`EX_UNAVAILABLE`	`69`	service unavailable
`EX_SOFTWARE`	`70`	internal software error
`EX_OSERR`	`71`	system error (e.g., can’t fork)
`EX_OSFILE`	`72`	critical OS file missing
`EX_CANTCREAT`	`73`	can’t create (user) output file
`EX_IOERR`	`74`	input/output error
`EX_TEMPFAIL`	`75`	temp failure; user is invited to retry
`EX_PROTOCOL`	`76`	remote error in protocol
`EX_NOPERM`	`77`	permission denied
`EX_CONFIG`	`78`	configuration error

“Errno” codes (from errno --list on macOS; the numeric values differ on other platforms such as Linux):

Constant	Value	Description
`EPERM`	`1`	Operation not permitted
`ENOENT`	`2`	No such file or directory
`ESRCH`	`3`	No such process
`EINTR`	`4`	Interrupted system call
`EIO`	`5`	Input/output error
`ENXIO`	`6`	Device not configured
`E2BIG`	`7`	Argument list too long
`ENOEXEC`	`8`	Exec format error
`EBADF`	`9`	Bad file descriptor
`ECHILD`	`10`	No child processes
`EDEADLK`	`11`	Resource deadlock avoided
`ENOMEM`	`12`	Cannot allocate memory
`EACCES`	`13`	Permission denied
`EFAULT`	`14`	Bad address
`ENOTBLK`	`15`	Block device required
`EBUSY`	`16`	Resource busy
`EEXIST`	`17`	File exists
`EXDEV`	`18`	Cross-device link
`ENODEV`	`19`	Operation not supported by device
`ENOTDIR`	`20`	Not a directory
`EISDIR`	`21`	Is a directory
`EINVAL`	`22`	Invalid argument
`ENFILE`	`23`	Too many open files in system
`EMFILE`	`24`	Too many open files
`ENOTTY`	`25`	Inappropriate ioctl for device
`ETXTBSY`	`26`	Text file busy
`EFBIG`	`27`	File too large
`ENOSPC`	`28`	No space left on device
`ESPIPE`	`29`	Illegal seek
`EROFS`	`30`	Read-only file system
`EMLINK`	`31`	Too many links
`EPIPE`	`32`	Broken pipe
`EDOM`	`33`	Numerical argument out of domain
`ERANGE`	`34`	Result too large
`EAGAIN`	`35`	Resource temporarily unavailable
`EWOULDBLOCK`	`35`	Resource temporarily unavailable
`EINPROGRESS`	`36`	Operation now in progress
`EALREADY`	`37`	Operation already in progress
`ENOTSOCK`	`38`	Socket operation on non-socket
`EDESTADDRREQ`	`39`	Destination address required
`EMSGSIZE`	`40`	Message too long
`EPROTOTYPE`	`41`	Protocol wrong type for socket
`ENOPROTOOPT`	`42`	Protocol not available
`EPROTONOSUPPORT`	`43`	Protocol not supported
`ESOCKTNOSUPPORT`	`44`	Socket type not supported
`ENOTSUP`	`45`	Operation not supported
`EPFNOSUPPORT`	`46`	Protocol family not supported
`EAFNOSUPPORT`	`47`	Address family not supported by protocol family
`EADDRINUSE`	`48`	Address already in use
`EADDRNOTAVAIL`	`49`	Can’t assign requested address
`ENETDOWN`	`50`	Network is down
`ENETUNREACH`	`51`	Network is unreachable
`ENETRESET`	`52`	Network dropped connection on reset
`ECONNABORTED`	`53`	Software caused connection abort
`ECONNRESET`	`54`	Connection reset by peer
`ENOBUFS`	`55`	No buffer space available
`EISCONN`	`56`	Socket is already connected
`ENOTCONN`	`57`	Socket is not connected
`ESHUTDOWN`	`58`	Can’t send after socket shutdown
`ETOOMANYREFS`	`59`	Too many references: can’t splice
`ETIMEDOUT`	`60`	Operation timed out
`ECONNREFUSED`	`61`	Connection refused
`ELOOP`	`62`	Too many levels of symbolic links
`ENAMETOOLONG`	`63`	File name too long
`EHOSTDOWN`	`64`	Host is down
`EHOSTUNREACH`	`65`	No route to host
`ENOTEMPTY`	`66`	Directory not empty
`EPROCLIM`	`67`	Too many processes
`EUSERS`	`68`	Too many users
`EDQUOT`	`69`	Disc quota exceeded
`ESTALE`	`70`	Stale NFS file handle
`EREMOTE`	`71`	Too many levels of remote in path
`EBADRPC`	`72`	RPC struct is bad
`ERPCMISMATCH`	`73`	RPC version wrong
`EPROGUNAVAIL`	`74`	RPC prog. not avail
`EPROGMISMATCH`	`75`	Program version wrong
`EPROCUNAVAIL`	`76`	Bad procedure for program
`ENOLCK`	`77`	No locks available
`ENOSYS`	`78`	Function not implemented
`EFTYPE`	`79`	Inappropriate file type or format
`EAUTH`	`80`	Authentication error
`ENEEDAUTH`	`81`	Need authenticator
`EPWROFF`	`82`	Device power is off
`EDEVERR`	`83`	Device error
`EOVERFLOW`	`84`	Value too large to be stored in data type
`EBADEXEC`	`85`	Bad executable (or shared library)
`EBADARCH`	`86`	Bad CPU type in executable
`ESHLIBVERS`	`87`	Shared library version mismatch
`EBADMACHO`	`88`	Malformed Mach-o file
`ECANCELED`	`89`	Operation canceled
`EIDRM`	`90`	Identifier removed
`ENOMSG`	`91`	No message of desired type
`EILSEQ`	`92`	Illegal byte sequence
`ENOATTR`	`93`	Attribute not found
`EBADMSG`	`94`	Bad message
`EMULTIHOP`	`95`	EMULTIHOP (Reserved)
`ENODATA`	`96`	No message available on STREAM
`ENOLINK`	`97`	ENOLINK (Reserved)
`ENOSR`	`98`	No STREAM resources
`ENOSTR`	`99`	Not a STREAM
`EPROTO`	`100`	Protocol error
`ETIME`	`101`	STREAM ioctl timeout
`EOPNOTSUPP`	`102`	Operation not supported on socket
`ENOPOLICY`	`103`	Policy not found
`ENOTRECOVERABLE`	`104`	State not recoverable
`EOWNERDEAD`	`105`	Previous owner died
`EQFULL`	`106`	Interface output queue is full
`ELAST`	`106`	Interface output queue is full

Idioms, snippets, and examples

${1-default.txt}

Evaluates to default.txt if $1 is undefined, otherwise uses whatever the value of $1 is (even if it’s the empty string).
mkdir -p /tmp/test && cd $_

Make a directory (if needed) and navigate into it in the same breath.
bc <<< "2 + 2"

Send a simple string on STDIN. Equivalent to echo "2 + 2" | bc.
lsof -p $$ or ls -l /dev/fd

List all file descriptors open in current shell.
some_cmd 2>&1 >/dev/null

Discard STDOUT and redirect STDERR to where STDOUT originally pointed. Order matters: >/dev/null 2>&1 discards both streams instead. Depending on the behavior of some_cmd, you can replace >/dev/null with >&-, but if some_cmd writes to STDOUT without checking if it’s open, or does not catch write errors on that front, /dev/null is required.
declare -p PATH (equivalent to typeset -p):

Writes a bit of shell script to STDOUT that, when sourced, would replicate the current value of the PATH environment variable.
set -x

Use this to debug; it prints each command, after expansion, to /dev/stderr before running it.

Turn it off with set +x
set -e

Exit script immediately if anything throws / exits with an error
set -u

Throw an error if we use an unset variable.
set -o pipefail

Set the overall result of a series of piped commands to the exit code of the rightmost command that exits with a non-zero status (or zero if every command succeeds).
set -euo pipefail

Combination of the three preceding options: exit on error, exit on unset variables, and accumulate errors in pipeline.

⭐ best practice!
mktemp -d

This will safely create a new directory in $TMPDIR, and write the full path to /dev/stdout. This is the only argument structure that works equivalently on both Linux and macOS (at least, macOS 10.11). It will not take responsibility for deleting the directory at any point.
TMPDIR=${TMPDIR-/tmp/}

Ensure that the $TMPDIR variable exists, using a default of /tmp/ if $TMPDIR is unset. (A TMPDIR that is set but empty is left alone; use ${TMPDIR:-/tmp/} to cover that case too.) (On macOS, TMPDIR is set to something like /var/folders/m8/ga778wd7rv9g0p_3cqhpti400qdfa7/T/, so I’ve applied a trailing slash to the default, for consistency.)
DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

This sets DIR to the absolute pathname of the directory containing the script. You should call this early, before the script has had a chance to cd to any other working directory.

When the script is executed directly because it’s on the PATH, BASH_SOURCE[0] is the full path to the script. But if it’s executed as an argument to bash, BASH_SOURCE[0] will be set to the exact value of that argument. The cd does the real work of resolving and simplifying any relative path components (such as . and ..), but by running within a subshell, it does not affect the root-level current working directory.

In both cases, the initial working directory is the directory the script was called from. In the latter case, BASH_SOURCE[0] may be a relative path to begin with, so the working directory is relevant. Any intermediate cd calls will probably render BASH_SOURCE[0] an invalid path, which is why it’s important to call this snippet before anything else might alter the working directory.

This method will not resolve symlinks in the path to their linked source names.

Source: Getting the source directory of a Bash script from within
command -v some_cmd (POSIX compliant)

This has exit code 0 if some_cmd is something you can call, e.g., a binary on PATH, an alias, or a shell function. It also prints a description of some_cmd to /dev/stdout when it succeeds, so you’ll probably want to append >/dev/null.

type some_cmd and hash some_cmd are more flexible/fast/beneficial alternatives, but are only reliable in bash. hash some_cmd is successful if some_cmd is a command on the PATH or a function (but fails for aliases). type some_cmd covers all the bases, like command -v some_cmd, but has a type -P some_cmd variant that limits success to when some_cmd is on the PATH. (type also prints out the source code when given a function, which can be handy.)

Source: Check if a program exists from a Bash script
function doit { ...; } vs. function doit () { ...; } vs. doit() { ...; }

These are practically equivalent, but the version without the function keyword (in which case the parentheses are required) is preferred, due to increased compatibility / portability with the POSIX spec.

ctrl+z        # suspend (to resume in the same session, call: bg)
disown %1     # disown
tmux          # launch another shell in screen / tmux / etc.
pgrep sommat-in-my-process   # find PID
reptyr 99999  # reparent process (where 99999 is the PID from above)

This will transfer the current shell’s running process to another parent.

Depends on reptyr.
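
Several of the idioms above tend to appear together at the top of a script. The following is only a sketch of one way to combine them: `sort` stands in for whatever tool the script needs, and `workdir`, `input`, and the file names are invented for the example.

```bash
set -euo pipefail

# Fail early if a required tool is missing (sort is only an example dependency);
# 69 is EX_UNAVAILABLE from sysexits.h.
command -v sort >/dev/null || { echo 'sort is required' >&2; exit 69; }

# Scratch directory, removed automatically when the script exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# First argument if one was given, otherwise a default inside the scratch directory.
input=${1-$workdir/default.txt}
[ -e "$input" ] || printf 'b\na\n' >"$input"

sort "$input" >"$workdir/sorted.txt"

# With pipefail, a failure in any stage makes the whole pipeline fail.
if ! false | true; then
  echo 'pipefail reported the failure of the first command' >&2
fi
```

Because mktemp -d never cleans up after itself, the EXIT trap is what keeps the scratch directory from lingering.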

Templates / boilerplate

Simple bash script preamble:

#!/usr/bin/env bash
set -e # exit immediately on error

usage() {
  >&2 cat <<HELP
Usage: $(basename "$0") [-h|--help] [-v|--verbose]

<What the script does>
HELP
}

while [[ $# -gt 0 ]]; do
  case $1 in
    -h|--help)
      usage
      exit 0
      ;;
    -v|--verbose)
      >&2 printf 'Entering debug (verbose) mode.\n'
      set -x
      ;;
    *)
      usage
      >&2 printf '\nUnrecognized option: %s\n' "$1"
      exit 1
      ;;
  esac
  shift
done

<do something>