bash – Ryan Schulze

Passing arrays to bash functions

June 2, 2024 | Ryan | arrays, bash, functions

Short version: yes, you can pass an array to a Bash function, and the function can modify the array’s contents to pass information back. It is easy, and the feature it relies on (namerefs, via local -n) has been available since Bash version 4.3.

#!/usr/bin/env bash

do_stuff() {
  local -n internal_array=${1}
  # reads and writes to $internal_array[] are applied to the array passed to the function
}

declare -A main_array

# important, the name of the array is passed as an argument (i.e. without the $), not the array itself.
do_stuff main_array

Long version: I recently watched a talk by some very knowledgeable people about Bash, where they delved deep into its internals and quirks. At one point, they discussed passing information to and from functions without creating subshells. The solution became quite convoluted, and I was surprised because the whole time I was thinking, “just use nameref”.

Out of curiosity, I searched online, but unfortunately, the internet is full of responses like “doesn’t work,” “Bash can’t do that” and many variations of “just pass all the values of the array to the function as arguments and piece them together again inside the function” (which is a terrible solution since you lose the keys). There are a few posts here and there suggesting local -n as a solution, but they are rare, and especially on sites like Stack Overflow, they are not the top answers.
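
To see why passing only the values loses information, here is a small sketch (the array and function names are made up for illustration):

```bash
#!/usr/bin/env bash
# Hypothetical names, for illustration only.
declare -A colors=([sky]="blue" [grass]="green")

count_args() {
  # Only the values arrive here; the keys "sky" and "grass" are gone.
  echo "$#"
}

count_args "${colors[@]}"
```

The function receives two plain strings and has no way of knowing which key each one belonged to.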

In a nutshell, we are going to pass a reference to an array into a function (think “pointers” or “symlinks” if that helps).

Relevant parts of the bash man page for declare and local:

declare -n
Give each name the nameref attribute, making it a name reference to another variable. That other variable is defined by the value of name. All references, assignments, and attribute modifications to name, except for those using or changing the -n attribute itself, are performed on the variable referenced by name’s value. The nameref attribute cannot be applied to array variables.
local
For each argument, a local variable named name is created, and assigned value. The option can be any of the options accepted by declare. local can only be used within a function; it makes the variable name have a visible scope restricted to that function and its children.

So, in summary, we can use local to define variables with a scope limited to the function they are defined in, and local accepts all the options that declare supports.

It seems the “The nameref attribute cannot be applied to array variables.” part of the declare definition causes a lot of confusion or deters people from trying to use it for referencing arrays.
What it means is that the nameref variable itself can’t be an array, so you can’t do a local -n my_array=() (i.e. applying the nameref attribute to an array). A plain local -n my_array=some_name is fine (my_array is then an ordinary variable with the nameref attribute, and the variable it names may well be an array).
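
A minimal sketch of that distinction, using an indexed array (fruits, ref and show_second are made-up names for this example):

```bash
#!/usr/bin/env bash
# Hypothetical names, for illustration only.
fruits=(apple banana cherry)

show_second() {
  # Fine: ref is a plain variable holding the *name* of an array.
  local -n ref=${1}
  echo "${ref[1]}"

  # The form the man page rules out would be an array assignment to the
  # nameref itself, e.g.:  local -n ref=()
}

show_second fruits
```

The nameref only stores the string "fruits"; every subscript applied to ref is resolved against the array of that name.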

Enough theory, let’s get down to practical examples.

We will create a function called do_stuff that:

takes the name of an array as argument $1
reads the length key from the array
adds a random key to the array, holding a random number cut to as many digits as the length value read in the previous step

Then we will create an array outside of the function with some keys/values, pass it to the do_stuff function, and then output the contents.

#!/usr/bin/env bash

do_stuff() {
  local -n internal_array=${1}
  local length

  # read values from the array passed to the function
  length=${internal_array["length"]}

  # add an entry to the array
  internal_array["random"]="${RANDOM:0:$length}"
}
declare -A main_array
  
main_array[name]="one"
main_array[length]="5"

# important, the name of the array is passed as an argument (i.e. without the $), not the array itself.
do_stuff main_array

# easier to use declare to print the key/values of the associative array than iterating over all keys and printing the respective values
declare -p main_array

declare -A main_array=([length]="5" [random]="25778" [name]="one" )

This shows that the do_stuff function could read from the array (the length value) and write to it (adding the random key/value), and that the changes were applied to the array outside of the function, where we ran declare -p. Bonus points for not needing a subshell.

Using this “trick” allows us to pass more complex information to a function, and especially receive more complex information from a function.
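
As a sketch of what “receiving more complex information” can look like, the hypothetical helper below splits a user@host:port string and hands all three parts back through the caller’s array. The function name, the variable names and the input format are assumptions made for this example:

```bash
#!/usr/bin/env bash
# parse_target is a made-up helper: $1 is the string to split,
# $2 is the NAME of an associative array that receives the parts.
parse_target() {
  local -n _result=${2}
  local rest=${1}

  _result[user]=${rest%%@*}   # everything before the first @
  rest=${rest#*@}             # drop "user@"
  _result[host]=${rest%%:*}   # everything before the first :
  _result[port]=${rest##*:}   # everything after the last :
}

declare -A target
parse_target "deploy@example.org:2222" target
echo "${target[user]} ${target[host]} ${target[port]}"
```

Three values come back from one call, with no subshell and no output parsing.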

There is one caveat: you can’t use the same name for the array inside the function as well as outside. I wouldn’t advise doing this anyway for readability reasons, as the variable’s scope can become confusing. If you try it you get the following output:

local: warning: array: circular name reference
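
One way to sidestep the clash is to give the nameref a name that callers are unlikely to use, for example by prefixing it with the function name. This is only a naming convention assumed for the sketch, not something Bash enforces:

```bash
#!/usr/bin/env bash
# count_items and _count_items_arr are hypothetical names.
count_items() {
  # The prefixed name will not collide with a caller's variable called "array".
  local -n _count_items_arr=${1}
  echo "${#_count_items_arr[@]}"
}

array=(a b c)
count_items array
```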

Oddly, I often noticed the following statement about nameref on Stack Overflow:

“This only works if the array is defined as a global”
Nope, it works just as well when an array that is local to one function is passed to another function this way.

#!/usr/bin/env bash

do_stuff() {
  local -n internal_array=${1}
  local length

  # read values from the array passed to the function
  length=${internal_array["length"]}

  # add an entry to the array
  internal_array["random"]="${RANDOM:0:$length}"
}

main() {
  local -A main_array
  
  main_array[name]="one"
  main_array[length]="5"

  # important, the name of the array is passed as an argument (i.e. without the $), not the array itself.
  do_stuff main_array

  declare -p main_array
}

main

declare -A main_array=([length]="5" [random]="28903" [name]="one" )

(I prefer using a main() function like in this example to avoid global variables unless explicitly defined)

So, there you have it: an easy way to pass an array to a function in bash, with no weird looping over values. It is also a better way to receive information from a function than the one-byte return value or parsing the function’s output.

How to colorize manpages

December 15, 2017 | April 1, 2022 | Ryan | bash

I’m surprised I’ve never posted this here before. Turning manpages from monochrome to color is super easy.

There are a few LESS_TERMCAP_* environment variables you can adjust. Here is a list of useful ones to change:

ks      make the keypad send commands
ke      make the keypad send digits
vb      emit visual bell
mb      start blink
md      start bold
me      turn off bold, blink and underline
so      start standout (reverse video)
se      stop standout
us      start underline
ue      stop underline

I prefer to only set them for man, so I put this little function in my ~/.bashrc:

man () {
  LESS_TERMCAP_mb=$(tput setaf 4) \
  LESS_TERMCAP_md=$(tput setaf 4;tput bold) \
  LESS_TERMCAP_so=$(tput setaf 7;tput setab 4;tput bold) \
  LESS_TERMCAP_us=$(tput setaf 6) \
  LESS_TERMCAP_me=$(tput sgr0) \
  LESS_TERMCAP_se=$(tput sgr0) \
  LESS_TERMCAP_ue=$(tput sgr0) \
  command man "$@"
}
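
If tput is not available, roughly the same effect can be had with literal ANSI escape sequences. The following is a sketch under that assumption; the escape codes are an approximation of the tput calls above and may behave differently on terminals that are not xterm-like:

```bash
# Variant of the function above using ANSI-C quoted escape sequences.
# Assumption: the terminal understands standard SGR colour codes.
man () {
  LESS_TERMCAP_mb=$'\e[34m' \
  LESS_TERMCAP_md=$'\e[1;34m' \
  LESS_TERMCAP_so=$'\e[1;37;44m' \
  LESS_TERMCAP_us=$'\e[36m' \
  LESS_TERMCAP_me=$'\e[0m' \
  LESS_TERMCAP_se=$'\e[0m' \
  LESS_TERMCAP_ue=$'\e[0m' \
  command man "$@"
}
```

Because the assignments are placed in front of the command, they only exist in the environment of that one man invocation and do not leak into the rest of the shell session.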

Bash function for easily watching logs and colorizing the output

October 10, 2017 | April 1, 2022 | Ryan | bash

Another useful bash function I have on my servers. It’s a wrapper around tail -F and ccze. It looks for a log file (prepending /var/log/ to the path if it can’t find it otherwise) and pipes it into ccze to colorize the output. Handy if you find yourself watching logs. I mostly use it for dhcp/tftp/mail, where I don’t have a huge amount of traffic (i.e. I can watch it in real time) and am expecting an event/log entry.

logwatch() {
  local path  # keep the loop variable out of the global scope
  for path in '' /var/log/; do
    if [[ -r "${path}${1}" ]]; then
      tail -n50 -F "${path}${1}" | ccze ${2:+--plugin} ${2} --mode ansi --convert-date
      return
    fi
  done
}

Usage:

logwatch kern.log
logwatch /var/log/apache2/access.log
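
The optional second argument relies on the ${2:+--plugin} expansion, which produces --plugin only when a second argument was given. A small sketch of just that part, using a made-up helper that prints the arguments instead of running ccze ("syslog" is only an example plugin name):

```bash
#!/usr/bin/env bash
# ccze_args is a hypothetical helper that prints what ccze would receive.
ccze_args() {
  local plugin=${1:-}
  # ${plugin:+--plugin} expands to "--plugin" only if plugin is non-empty;
  # left unquoted on purpose so that empty expansions disappear entirely.
  echo ${plugin:+--plugin} ${plugin} --mode ansi --convert-date
}

ccze_args          # prints: --mode ansi --convert-date
ccze_args syslog   # prints: --plugin syslog --mode ansi --convert-date
```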

Using regex comparison in bash and BASH_REMATCH

September 8, 2017 | Ryan | bash

Bash supports regular expressions in comparisons via the =~ operator. What is rarely used or documented is that, after a successful match, the BASH_REMATCH array holds the results: ${BASH_REMATCH[0]} is the whole match and ${BASH_REMATCH[n]} is the n-th capture group. So if you use parentheses for grouping () in your regex, you can access the content of that group.
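
A minimal sketch of the mechanism before the real-world example (the input string and variable names are made up):

```bash
#!/usr/bin/env bash
# Illustrative input; not taken from a real script.
version="bash-5.2.15"

if [[ ${version} =~ ^([a-z]+)-([0-9]+)\.([0-9]+) ]]; then
  name=${BASH_REMATCH[1]}    # first capture group
  major=${BASH_REMATCH[2]}   # second capture group
  minor=${BASH_REMATCH[3]}   # third capture group
  echo "whole match: ${BASH_REMATCH[0]}"
  echo "name=${name} major=${major} minor=${minor}"
fi
```

Note that the regex is left unquoted on the right-hand side of =~; quoting it would make Bash treat it as a literal string.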

Here is an example where I am parsing date placeholders in a text with an optional offset (e.g. |YYYY.MM.DD|+2), storing the format and offset in separate groups:

while read -r line; do
	while [[ ${line} =~ \|([YMD\\/\ .-]+)\|(\+*[0-9]*) ]]; do
		dateformat=${BASH_REMATCH[1]}
		dateformat=${dateformat/YYYY/%Y}
		dateformat=${dateformat/MMMM/%B}
		dateformat=${dateformat/MM/%m}
		dateformat=${dateformat/DD/%d}
		offset='now'
		[[ -n ${BASH_REMATCH[2]} ]] && offset="${BASH_REMATCH[2]} days"
		line=${line/|${BASH_REMATCH[1]}|${BASH_REMATCH[2]}/$(date "+${dateformat}" --date="${offset}")}
	done
	echo "${line}"
done < input

Input:

|YYYY.MM.DD|
|YYYY.MM.DD|+7
|YYYY-MM-DD|
|YYYY-MM-DD|+14
|MMMM YYYY|
|YYYY/MM|
|MM/YYYY|
This is a sentence containing a timestamp (|YYYY.MM.DD|+7) with an offset.
This is another sentence containing multiple timstamps between |YYYY.MM.DD| and |YYYY.MM.DD|+7.

Output:

2017.09.08
2017.09.15
2017-09-08
2017-09-22
September 2017
2017/09
09/2017
This is a sentence containing a timestamp (2017.09.15) with an offset.
This is another sentence containing multiple timstamps between 2017.09.08 and 2017.09.15.

Multiply floats by 10, 100, … in bash

August 22, 2017 | September 8, 2017 | Ryan | bash

A short one today. Bash can only handle integer numbers and not floats, so when someone searches the internet on how to use math on floats in bash the solution they find is usually “use bc” and looks something like this:

$ f=12.3456
$ bc -l <<< "${f} * 10"
123.4560

Or if they want the result to be an integer:

$ f=12.3456
$ bc -l <<< "scale=0; ${f} * 10 /1"
123

It’s a fine solution, and readable (which can mean a lot for people maintaining scripts). But if all you want to do is multiply by 10, 100, 1000, … you can achieve this faster with a bit of string manipulation:

$ f=12.3456
$ _sub="${f#*.}"
$ echo "${f%.*}${_sub:0:1}.${_sub:1}"
123.456

It just splits the number into two strings and assembles them again with the decimal point shifted. Have a look at substring_removal and substring_expansion for more examples on how to modify strings in bash. I’d highly suggest either sticking this in a separate function or commenting the code, since it isn’t necessarily obvious what is going on.
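
Wrapped in a function, it could look something like this. shift_decimal is a hypothetical name, and the sketch does no input validation: it assumes the number contains a decimal point and has more fractional digits than the number of places to shift, and it does not trim leading zeros.

```bash
#!/usr/bin/env bash
# shift_decimal NUMBER PLACES
# Multiplies NUMBER by 10^PLACES by moving the decimal point to the right.
# Hypothetical helper; see the assumptions in the text above.
shift_decimal() {
  local number=${1} places=${2}
  local int=${number%.*}    # digits before the decimal point
  local frac=${number#*.}   # digits after the decimal point

  # Move the first $places fractional digits in front of the point.
  echo "${int}${frac:0:places}.${frac:places}"
}

shift_decimal 12.3456 1   # prints 123.456
shift_decimal 12.3456 2   # prints 1234.56
```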

Since it is all pure bash and doesn’t need to spawn external commands, it is quicker (not that bc is slow, but if you are doing a lot of calculations, it can add up). I know what you are thinking: “if your goal is speed, you shouldn’t be using bash”. That doesn’t mean we can’t write efficient code.