{{Header}}
{{title|title=
/bin/bash - Proper Whitespace Handling - Whitespace Safety - End-of-Options Parameter Security
}}
{{#seo:
|description=Supporting multiple command line parameters with spaces in wrapper scripts, safe output handling, and use of the end-of-options parameter (--).
}}
{{coding_style_mininav}}
{{intro|
Supporting multiple command line parameters with spaces in wrapper scripts, safe output handling, and use of the end-of-options parameter (--) for better safety.
}}
= Summary =

* Shell options: Use set -o errexit, set -o nounset, set -o errtrace, and set -o pipefail.
* Quoted expansions: Quote variable expansions and prefer ${variable} style.
* Array-based command building: Build command lines with arrays. On POSIX sh, use set --.
* End-of-options marker: Use the end-of-options parameter -- where supported, after options and before positional parameters.
* Long options: Prefer long option names over short flags when sensible.
* Safe output: Do not use echo. Use printf with an explicit format string, or stecho for terminal output.
* Safe line input: Use IFS= read -r for line-oriented input.
* Checking unset variables: Under nounset, check explicitly whether variables exist.
* Local variable declaration: Declare local variables first, then assign to them on a separate line.
* Dynamic scoping awareness: Remember that Bash local uses dynamic scoping, and do not localize BASH_REMATCH.
* Loop subshell avoidance: Avoid piping into while read loops, because that can create a subshell.
* Stdin protection: Prevent stdin stealing inside loops by using a separate file descriptor when needed.
* NUL-delimited input: Use NUL-delimited input where required, for example with find -files0-from.
* Pipefail caution: Be careful with pipefail when piping into early-exiting consumers such as grep --quiet.
* Readable project style: Prefer explicit, readable, whitespace-safe code and project helper functions where appropriate.

= Safe ways to print =

For this style guide, do not use echo. Use printf with an explicit format string instead.

* {{VideoLink |videoid=lq98MM2ogBk |text=bash's echo command is broken }}
* {{VideoLink |videoid=ft0_cw54qak |text=echo is broken: a follow-up video }}

shellcheck bug reports:

* [https://github.com/koalaman/shellcheck/issues/2674 Warn on echo "$var" when $var might be -e #2674]
* [https://github.com/koalaman/shellcheck/issues?q=is%3Aissue+is%3Aopen+echo+in%3Atitle Open shellcheck issues related to echo]

Please note that printf does not have a default format specifier. The first positional parameter is always treated as the format string. When the format is omitted, untrusted data can be interpreted as format directives or backslash escapes. It is always recommended to be explicit about the format being used.

Normally, there is no need to interpret escape sequences from a variable. Therefore, use the printf format specifier %s when the data is not printed to a terminal:

{{CodeSelect|code=
var="$(printf '%s' "${untrusted_text}")"
}}

printf '%s\n' "message here" is the usual replacement for echo "message here".

If you require escapes to be interpreted, interpret them on a per-need basis:

{{CodeSelect|code=
red="$(printf '%b' "\e[31m")"
# red=$'\e[31m'
# printf -v red '%b' "\e[31m"
nocolor="$(printf '%b' "\e[m")"
# nocolor=$'\e[m'
# printf -v nocolor '%b' "\e[m"
}}

Escapes that are already interpreted can then be printed with %s:

{{CodeSelect|code=
var="$(printf '%s' "${red} ${untrusted_text} ${nocolor}")"
}}

This is why stecho should be used when printing to the terminal: it sanitizes unsafe characters ([[unicode]]). Simply using printf '%s' is not sufficient, because %s only prevents printf itself from interpreting escapes. Escape sequences that are already present in the data still reach the terminal:

{{CodeSelect|code=
stecho "${red} ${untrusted_text} ${nocolor}"
printf '%s' "${red} ${untrusted_text} ${nocolor}" {{!}} stecho
printf '%s' "${red} ${untrusted_text} ${nocolor}" {{!}} stecho {{!}} less -R
}}

'''Rule of thumb''':

* echo: Never.
* printf: Whenever the printed data is not used by a terminal.
** Format %b: Only for trusted data or fixed literals.
** Format %s: With any data.
* stecho: Whenever the printed data is used by a terminal.
** When not using stecho: When stecho cannot reasonably be considered available, such as during early build steps when building Kicksecure from source code using derivative-maker.

Resources:

* [https://github.com/anordal/shellharden/blob/master/how_to_do_things_safely_in_bash.md#echo--printf shellharden: echo / printf]
* [https://unix.stackexchange.com/questions/65803/why-is-printf-better-than-echo Unix & Linux Stack Exchange: Why is printf better than echo?]
* [https://pubs.opengroup.org/onlinepubs/9799919799/utilities/echo.html POSIX echo specification]

= Bash Proper Whitespace Handling =

* Quote variables.
* Build parameters using arrays.
* Enforce nounset.
* Use end-of-options.
* Style: use long option names.

#!/bin/bash

## https://yakking.branchable.com/posts/whitespace-safety/

#set -x
set -o errexit
set -o nounset
set -o errtrace
set -o pipefail

lib_dir="/tmp/test/lib/program with space/something spacy"
main_app_dir="/tmp/test/home/user/folder with space/abc"

mkdir --parents -- "${lib_dir}"
mkdir --parents -- "${main_app_dir}"

declare -a cmd_list

cmd_list+=("cp")
cmd_list+=("--recursive")
cmd_list+=("--")
cmd_list+=("${lib_dir}")
cmd_list+=("${main_app_dir}/")

printf '%s\n' "cmd_list has ${#cmd_list[@]} items"

## Execution example.
"${cmd_list[@]}"

## 'for' loop example.
for cmd_item in "${cmd_list[@]}"; do
    printf '%s\n' "cmd_item: '$cmd_item'"
done

## Alternative.
cmd_alt_list=(
    cp               ## program
    --recursive      ## recursive
    --               ## stop option parsing (protects against paths that begin with '-')
    "$lib_dir"       ## source directory
    "$main_app_dir/" ## destination
)

## 'for' loop example.
for cmd_alt_item in "${cmd_alt_list[@]}"; do
    printf '%s\n' "cmd_alt_item: '$cmd_alt_item'"
done
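
To illustrate why the quoting and the array matter, the following sketch counts how many parameters a command actually receives. The helper count_args, the variable names and the sample paths are assumptions made up for this illustration. They are not part of the script above or of any project library.

```shell
#!/bin/bash

set -o errexit
set -o nounset
set -o errtrace
set -o pipefail

## Hypothetical helper: prints the number of parameters it received.
count_args() {
  printf '%s\n' "$#"
}

## Hypothetical sample path containing spaces.
sample_dir="/tmp/folder with space/abc"

## Quoted expansion: the path stays a single parameter.
quoted_count="$(count_args "${sample_dir}")"

## Unquoted expansion: word splitting breaks the path into three parameters.
# shellcheck disable=SC2086
unquoted_count="$(count_args ${sample_dir})"

## Array expansion: each array item stays a single parameter.
sample_list=(cp --recursive -- "${sample_dir}" "/tmp/destination with space/")
array_count="$(count_args "${sample_list[@]}")"

printf '%s\n' "quoted: ${quoted_count}, unquoted: ${unquoted_count}, array: ${array_count}"
```

With the default IFS this prints "quoted: 1, unquoted: 3, array: 5". A program such as cp would see three separate, non-existent paths in the unquoted case.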
= Why nounset =

Without nounset, an unset variable silently expands to an empty string. That can turn a harmless-looking path into a dangerous one.
rm -- "/$UNSET_VAR"
If UNSET_VAR is unset and nounset is disabled, this becomes:
rm -- "/"
On many systems that will fail with an error such as:
rm: cannot remove '/': Is a directory
That specific command happens to fail here, but the pattern is still unsafe. With set -o nounset, the shell aborts earlier, before running rm.

Setting UNSET_VAR="" would not solve the general problem either. Variables that may intentionally be empty should be handled explicitly.

= local =

== Error swallowing ==

Note:

{{CodeSelect|code=
local testvar=$(false)
}}

Expected: error

Actual: no error

When declaration and assignment are combined on the same line, local itself returns success and masks the failing command substitution.

Better:

{{CodeSelect|code=
local testvar
testvar=$(false)
}}

== Dynamic scoping ==

local variables in Bash use dynamic scoping. That means nested function calls can still read and modify them unless they declare their own local variable. Example:

{{CodeSelect|code=
fn_01 () {
  local myvar
  myvar='supposedly local'
  printf '%s\n' "in fn_01, myvar is $myvar"
  fn_02
  printf '%s\n' "in fn_01, myvar is now $myvar"
}

fn_02 () {
  printf '%s\n' "in fn_02, myvar is $myvar"
  myvar='not so local after all'
  printf '%s\n' "in fn_02, myvar is now $myvar"
}

fn_01
}}

Output:
in fn_01, myvar is supposedly local
in fn_02, myvar is supposedly local
in fn_02, myvar is now not so local after all
in fn_01, myvar is now not so local after all
To avoid problems from this, declare all function-local variables as local at the head of a function. For example:

{{CodeSelect|code=
fn_01 () {
  local myvar
  myvar='local to fn_01'
  printf '%s\n' "in fn_01, myvar is $myvar"
  fn_02
  printf '%s\n' "in fn_01, myvar is now $myvar"
}

fn_02 () {
  local myvar
  myvar='local to fn_02'
  printf '%s\n' "in fn_02, myvar is $myvar"
}

fn_01
}}

Output:
in fn_01, myvar is local to fn_01
in fn_02, myvar is local to fn_02
in fn_01, myvar is now local to fn_01
== BASH_REMATCH ==

Do not local -a BASH_REMATCH!

{{quotation
|quote=Note specifically: Bash sets BASH_REMATCH in the global scope; declaring it as a local variable will lead to unexpected results.
|context=[https://www.gnu.org/software/bash/manual/bash.html GNU Bash manual]
}}

= POSIX array =

On a POSIX shell, positional parameters provide the portable array-like container. Each function and the main script has its own $@. You can build it with set --.

Add items to the array:
set -- a b c
Add items to the beginning or end of the array:
set -- b
set -- a "$@" c
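
Put together, a command line can be built item by item in the positional parameters. The following is a minimal sketch under assumptions: the directory names are made up, and the command is only printed, not executed.

```shell
#!/bin/sh

set -o errexit
set -o nounset

## Hypothetical paths containing spaces.
src_dir="/tmp/source with space"
dest_dir="/tmp/destination with space/"

## Start with an empty list, then append one item per call.
set --
set -- "$@" "cp"
set -- "$@" "--recursive"
set -- "$@" "--"
set -- "$@" "${src_dir}"
set -- "$@" "${dest_dir}"

item_count="$#"
printf '%s\n' "list has ${item_count} items"

## Each item stays intact, including embedded spaces.
for item in "$@"; do
  printf '%s\n' "item: '${item}'"
done

## To actually run the command, the list would be expanded as: "$@"
```

Because set -- replaces the positional parameters of the current scope, the original script or function parameters should be saved or processed before building a list this way.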
= Use of End-of-Options Parameter (--) =

The end-of-options parameter "--" is important because otherwise inputs might be mistaken for command options. This can even become a security issue. Here are examples using the sponge command:

{{CodeSelect|code=
sponge -a testfilename
}}

This works because "testfilename" does not look like an option.

{{CodeSelect|code=
sponge -a --testfilename
}}

This fails because sponge tries to interpret "--testfilename" as options:
sponge: invalid option -- '-'
sponge: invalid option -- 't'
sponge: invalid option -- 'e'
...
{{CodeSelect|code=
sponge -a -- --testfilename
}}

This works because -- signals that "--testfilename" is a filename, not an option.

Conclusion:

* The -- parameter marks the end of command options.
* Place -- after all command options and before filenames or other positional parameters, where the command supports it.
* This technique is applicable to many Unix/Linux commands, not just sponge.
* It is especially useful when input may begin with -.

= nounset - Check if Variable Exists =
#!/bin/bash

set -o errexit
set -o nounset
set -o errtrace
set -o pipefail

## Enable for testing.
#unset HOME

if [ -z "${HOME+x}" ]; then
    printf '%s\n' "Error: HOME is not set." >&2
    exit 1
fi

printf '%s\n' "$HOME"
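
The ${HOME+x} expansion used above distinguishes an unset variable from one that is set but empty, and it never triggers a nounset error. The sketch below walks through the three possible cases. The variable example_var and the helper describe_var are hypothetical names chosen for illustration.

```shell
#!/bin/bash

set -o errexit
set -o nounset
set -o errtrace
set -o pipefail

## Hypothetical helper: reports the state of the variable example_var.
describe_var() {
  ## "${example_var+x}" expands to "x" if example_var is set (even if empty)
  ## and to nothing if it is unset.
  if [ -z "${example_var+x}" ]; then
    printf '%s\n' "unset"
  elif [ -z "${example_var}" ]; then
    printf '%s\n' "set but empty"
  else
    printf '%s\n' "set to '${example_var}'"
  fi
}

unset example_var
result_unset="$(describe_var)"

example_var=""
result_empty="$(describe_var)"

example_var="some value"
result_value="$(describe_var)"

printf '%s\n' "${result_unset}" "${result_empty}" "${result_value}"
```

This prints "unset", "set but empty" and "set to 'some value'", one per line.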
= Safely Using Find with NUL-Delimited Input =

Example:

Note: The variable could hold a different value. It could, for example, be --/usr.

{{CodeSelect|code=
folder_name="/usr"
}}

{{CodeSelect|code=
printf '%s\0' "${folder_name}" {{!}} find -files0-from - -perm /u=s,g=s -print0
}}

Do not use stecho or stprint here, because find -files0-from requires NUL-delimited input.

NUL ("\0") is required because:

{{quotation
|quote=The starting points in file have to be separated by ASCII NUL characters. Two consecutive NUL characters, i.e., a starting point with a Zero-length file name is not allowed and will lead to an error diagnostic followed by a non-Zero exit code later.
|context=[https://manpages.debian.org/unstable/findutils/find.1.en.html Debian find man page]
}}

A single trailing NUL is normal. Two consecutive NUL bytes would mean an empty file name entry, which is invalid.

= loops =

== subshells created by pipelines ==

Avoid piping data into a loop. The pipeline runs the loop in a subshell, even without $() syntax, so variable changes made inside the loop are lost when the loop ends. Bad code example:
str="abc
def
ghi"
line_count=0

printf '%s\n' "${str}" | while IFS= read -r line; do
  ((line_count += 1))
done

printf '%s\n' "${line_count}"

## Expected result: 3
## Actual result: 0
Instead, redirect command output into the loop. Good code example:
str="abc
def
ghi"
line_count=0

while IFS= read -r line; do
  ((line_count += 1))
done < <(printf '%s\n' "${str}")

printf '%s\n' "${line_count}"

## Result: 3
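
The same redirection pattern also works for NUL-delimited data, such as file names printed by find -print0. The following is a sketch under assumptions: the temporary directory and the file names are made up for illustration, and one file name deliberately contains a newline.

```shell
#!/bin/bash

set -o errexit
set -o nounset
set -o errtrace
set -o pipefail

## Hypothetical test data: one name with a space, one with a newline.
temp_dir="$(mktemp --directory)"
touch -- "${temp_dir}/file one" "${temp_dir}/"$'file\ntwo'

file_count=0

## 'read -d ""' splits the input on NUL bytes instead of newlines, so a
## file name containing a newline is still read as a single item.
while IFS= read -r -d '' file_name; do
  file_count=$((file_count + 1))
  printf '%s\n' "found: '${file_name}'"
done < <(find "${temp_dir}" -type f -print0)

printf '%s\n' "${file_count}"

## Clean up the test data.
rm --recursive --force -- "${temp_dir}"
```

The final line of output is 2. A newline-delimited loop would have counted three lines for the same two files.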
== stdin stealing ==

Commands that read from stdin can swallow data that was meant for the read command of a while read loop. qrexec-client-vm is one example, and vim is another. Bad code example:
str="abc
def
ghi"

while IFS= read -r line; do
  vim -- "$line"
done < <(printf '%s\n' "${str}")

## Output:
##
## Vim: Warning: Input is not from a terminal
## Vim: Error reading input, exiting...
## Vim: preserving files...
## Vim: Finished.
Work around this by using alternative file descriptors and redirection. Good code example:
str="abc
def
ghi"

while IFS= read -r line 0<&3; do
  vim -- "$line"
done 3< <(printf '%s\n' "${str}")

## Result: Opens "abc", then "def", then "ghi" in Vim.
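
The effect can also be reproduced without an interactive program. In this sketch, cat stands in for any command that reads stdin, and the counter names are illustrative assumptions. In the second loop stdin is set to /dev/null only so that the example never waits for terminal input.

```shell
#!/bin/bash

set -o errexit
set -o nounset
set -o errtrace
set -o pipefail

str="abc
def
ghi"

## Bad: cat inherits the loop's stdin and consumes the remaining lines,
## so the loop body runs only once.
bad_count=0
while IFS= read -r line; do
  bad_count=$((bad_count + 1))
  cat >/dev/null
done < <(printf '%s\n' "${str}")

## Good: read takes its input from file descriptor 3, so whatever cat
## reads from stdin does not affect the loop.
good_count=0
while IFS= read -r line 0<&3; do
  good_count=$((good_count + 1))
  cat >/dev/null
done 3< <(printf '%s\n' "${str}") </dev/null

printf '%s\n' "bad: ${bad_count}, good: ${good_count}"
```

This prints "bad: 1, good: 3".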
= misc =
base_name="${file_name##*/}"
file_extension="${base_name##*.}"
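
As an illustration of the two expansions above, the following sketch applies them to a made-up file_name. The ## operator removes the longest matching prefix, so only the last extension remains. A name without a dot is returned unchanged, which the sketch guards against with an assumed extra check.

```shell
#!/bin/bash

set -o errexit
set -o nounset
set -o errtrace
set -o pipefail

## Hypothetical input path.
file_name="/tmp/folder with space/archive.tar.gz"

## Remove everything up to and including the last '/'.
base_name="${file_name##*/}"
## Remove everything up to and including the last '.'.
file_extension="${base_name##*.}"

printf '%s\n' "${base_name}"
printf '%s\n' "${file_extension}"

## Edge case: without a dot nothing matches and the whole name is returned.
no_dot_name="README"
no_dot_extension="${no_dot_name##*.}"
if [ "${no_dot_extension}" = "${no_dot_name}" ]; then
  ## Treat "no dot" as "no extension".
  no_dot_extension=""
fi
```

This prints "archive.tar.gz" and then "gz".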
= coding style =

* no workarounds for older Bash versions. Assume the Bash version of Debian {{Stable project version based on Debian codename}}.
* prefer explicit, readable, whitespace-safe code over compact shell tricks
* use long options rather than short options when sensible, for example use cp --recursive instead of cp -r
* no trailing whitespace allowed in source code files
* all source code files must have a newline at the end
* no git style symlinks ([[Git#git_symlinks|git symlinks]]) (text file without newline at the end) because of past [https://security.snyk.io/vuln/SNYK-UNMANAGED-GITGIT-2372015 git symlink CVE]
* avoid [[unicode]] whenever possible. See also [[unicode-show]].
* use:
** shellcheck
** avoid rm when safe-rm is appropriate
*** https://github.com/MrMEEE/bumblebee-Old-and-abbandoned/issues/123
*** https://github.com/valvesoftware/steam-for-linux/issues/3671
** avoid wget and curl in project code, prefer scurl ([[Secure Downloads]])
** avoid grep for simple string matching in project code, use str_match
** str_replace
** append-once
** overwrite
* use ${variable} style
* use shell options
set -o errexit
set -o nounset
set -o errtrace
set -o pipefail
* do not use:
** which, use command -v instead. This is because which is an external binary, whereas command is a shell built-in.
* file name extensions:
** POSIX sh libraries: .sh
** Bash libraries: .bsh
** Executables: no file name extension
** (executables = scripts that can be run but cannot be sourced, libraries = scripts that can be sourced but may optionally be run as well)

= pipefail and early-exiting consumers =

Combining pipefail with a consumer that exits early can be an issue, because the producer may still be writing when the consumer exits and may then receive SIGPIPE (broken pipe).
#!/bin/bash

set -o errexit
set -o nounset
set -o errtrace
set -o pipefail

for i in {1..10000}; do
  printf '%s\n' "0"
done | grep --quiet -- "0"
This can fail even though grep --quiet finds a match. grep --quiet exits as soon as it has enough input, while the producer may still be writing. With pipefail enabled, the producer's non-zero exit status can then make the whole pipeline fail.

Guideline:

* Avoid producer {{!}} grep --quiet -- pattern when pipefail is enabled.
* Prefer matching directly against a variable or file when possible.
* In project code, prefer helper functions such as str_match where they fit the use case.
* If an early-exit consumer is intentional, handle exit statuses explicitly instead of assuming the pipeline is harmless.

= Improved Error Handler =

Inspired by [https://github.com/pottmi/stringent.sh stringent.sh]

{{CodeSelect|code=
if (( "$BASH_SUBSHELL" >= 1 )); then
  kill "$$"
fi
}}

Usually not needed. When a subshell detects an error due to errexit and errtrace, it returns a non-zero exit status and the parent shell also sees the failure. Preventing the error handler from running twice is only useful in rare cases.

= Resources =

* [https://github.com/anordal/shellharden/blob/master/how_to_do_things_safely_in_bash.md shellharden: How to do things safely in bash]
* [https://dwheeler.com/essays/fixing-unix-linux-filenames.html David A. Wheeler: Fixing Unix/Linux/POSIX filenames]
* use with care: {{VideoLink |videoid=DvDu8_A2uhs |text=Seat Belts and Airbags for bash }}
** use with care: [https://github.com/pottmi/stringent.sh stringent.sh]

= See Also =

* [[Dev/coding style]]

= Footnotes =

{{Footer}}

[[Category:Design]]