[Linuxcommand-announce] [LinuxCommand.org: Tips, News And Rants] Script: average
From: William S. <bs...@pa...> - 2010-04-01 21:22:34
A few weeks ago, I was cruising the Ubuntu forums and came across a
question from a poster who wanted to find the average of a series of
floating-point numbers. The numbers were extracted from some other
command and were output in a column. He wanted a command line
incantation that would take the column of numbers and return the
average. Several people answered this query with clever one-line
solutions; however, I thought that this problem would be a good task for
a script to perform. Using a script, one could have a solution that was
a little more robust and general purpose. I wrote the following script,
presented here with line numbers:
     1  #!/bin/bash
     2
     3  # average - calculate the average of a series of numbers
     4
     5  # handle cmd line option
     6  if [[ $1 ]]; then
     7      case $1 in
     8          -s|--scale) scale=$2 ;;
     9          *)          echo "usage: average [-s scale]" >&2
    10                      exit 1 ;;
    11      esac
    12  fi
    13
    14  # construct instruction stream for bc
    15  c=0
    16  {   echo "t = 0; scale = 2"
    17      [[ $scale ]] && echo "scale = $scale"
    18      while read value; do
    19
    20          # only process valid numbers
    21          if [[ $value =~ ^[-+]?[0-9]*\.?[0-9]+$ ]]; then
    22              echo "t += $value"
    23              ((++c))
    24          fi
    25      done
    26
    27      # make sure we don't divide by zero
    28      ((c)) && echo "t / $c"
    29  } | bc
This script takes a series of numbers from standard input and prints
their average. It is invoked as follows:
average -s scale < file_of_numbers
where scale is an integer containing the desired number of decimal
places in the result and file_of_numbers is a file containing the
series of numbers we wish to average. If scale is not specified, then
the default value of 2 is used.
To demonstrate the script, we will calculate the average size of the
programs in the /usr/bin directory:
me@linuxbox:~$ stat --format "%s" /usr/bin/* | average
81766.66
The basic idea behind this script is that it uses the bc arbitrary
precision calculator program to figure out the average. We need to use
something like bc, because arithmetic expansion in the shell can only
handle integer math.
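The difference is easy to see at the prompt. Here is a minimal sketch,
assuming bc is installed; the numbers are arbitrary and the variable
names are only for illustration:

```shell
#!/bin/bash

# Arithmetic expansion in the shell works on integers only, so the
# fractional part of 7 / 2 is discarded.
int_result=$((7 / 2))
echo "$int_result"

# bc keeps as many decimal places as its scale variable asks for.
bc_result=$(echo "scale = 2; 7 / 2" | bc)
echo "$bc_result"
```

The first echo prints 3, while the second prints 3.50.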
To perform our calculation, we need to construct a series of
instructions and pipe them into bc. This task comprises the bulk of our
script. In order to do something that complicated, we employ a shell
feature known as a group command. Starting with line 16 and ending with
line 29 we capture all of the standard output and consolidate it into a
single stream. That is, all of the standard output produced by the
commands on lines 16-29 is treated as though it came from a single
command and is piped into bc on line 29.
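A group command can be tried on its own, apart from bc. The following
is a small sketch (the echoed strings and the variable line_count are
made up for the example) in which the output of several commands,
including a loop, flows through one pipe into wc:

```shell
#!/bin/bash

# Everything written to standard output between the braces
# becomes a single stream feeding the pipe.
line_count=$(
    {   echo "first"
        for i in 1 2 3; do
            echo "item $i"
        done
        echo "last"
    } | wc -l
)

echo "$line_count"
```

Here wc sees five lines, even though they came from three separate
commands.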
We'll look at our group command piece by piece. As you know, an average
is calculated by adding up a series of numbers and dividing the sum by
the number of entries. In our case, the number of entries is stored in
the variable c and the sum is stored (within bc) in the variable t. We
start our group command (line 16) by passing some initial values to bc.
We set the initial value of the bc variable t to zero and the value of
scale to our default value of two (the default scale of bc is zero).
On line 17, we evaluate the scale variable to see if the command line
option was used and if so, pass that new value to bc.
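To see why sending a second scale assignment is enough, we can feed bc
a few lines by hand. This is only a sketch with arbitrary numbers,
assuming bc is installed:

```shell
#!/bin/bash

# With no scale set, bc uses zero decimal places.
default_result=$(echo "10 / 4" | bc)

# A later assignment to scale replaces an earlier one, so the
# second assignment wins, just as line 17 overrides line 16.
override_result=$(printf '%s\n' "scale = 2" "scale = 4" "10 / 4" | bc)

echo "$default_result"
echo "$override_result"
```

The first result is 2 and the second is 2.5000.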
Next, we start a while loop that reads entries from our standard input.
Each iteration of the loop causes the next entry in the series to be
assigned to the variable value.
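The loop can be tried by itself. In this sketch a here document stands
in for standard input, and the counter named count is only for
illustration:

```shell
#!/bin/bash

count=0

# read assigns one line of input to value on each pass and
# returns a non-zero exit status at end of input, ending the loop.
while read value; do
    echo "got: $value"
    ((++count))
done <<'EOF'
3.5
4.25
10
EOF

echo "lines read: $count"
```

Each of the three lines is echoed back in turn, and the final count is 3.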
Lines 20-24 are interesting. Here we test to see if the string
contained in value is actually a valid floating point number. To do
this, we employ a regular expression that will only match if the number
is properly formatted. The regular expression says that, to match, value
may start with an optional plus or minus sign, followed by zero or more
numerals, followed by an optional decimal point, and must end with one
or more numerals. If value passes this test, an instruction is inserted into
the stream telling bc to add value to t (line 22) and we increment c
(line 23), otherwise value is ignored.
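We can check the regular expression against a few sample strings. The
helper function is_number and the sample values below are assumptions
made for this sketch; only the pattern itself comes from line 21:

```shell
#!/bin/bash

# The pattern from line 21, stored in a variable so that the
# backslash is passed to the regex engine unchanged.
re='^[-+]?[0-9]*\.?[0-9]+$'

# is_number is a hypothetical helper, not part of the original script.
is_number() {
    [[ $1 =~ $re ]]
}

for candidate in 3.14 -2 +7 .5 5. 1e5 abc; do
    if is_number "$candidate"; then
        echo "$candidate: accepted"
    else
        echo "$candidate: ignored"
    fi
done
```

The first four candidates are accepted. The string "5." is ignored
because the pattern requires at least one numeral at the end, and
"1e5" and "abc" are ignored because they contain letters.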
After all of the numbers have been read from standard input, it's time
to perform the calculation. First, we test to see that we actually
processed some numbers. If we did not, then c would equal zero and the
resulting calculation would cause a "division by zero" error, so we
test the value of c and insert the final instruction for bc only if it
is not equal to zero.
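The test on line 28 relies on the exit status of the (( )) compound
command, which is non-zero (failure) when the expression evaluates to
zero. A short sketch, using a made-up function named
final_instruction to wrap the same idiom:

```shell
#!/bin/bash

# final_instruction is a hypothetical wrapper around the idiom on line 28.
final_instruction() {
    local c=$1
    # ((c)) fails when c is zero, so the echo is skipped.
    ((c)) && echo "t / $c"
    return 0
}

final_instruction 0
final_instruction 3
```

The first call prints nothing, and the second prints the instruction
"t / 3".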
This script would make a good starting point for a series of
statistical programs. The most significant design weakness of the
script as written is that it fails to check that the value supplied to
the scale option is really an integer. That's an improvement I will
leave to my faithful readers...
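For readers who would like a hint, here is one possible approach, not
the only one. The functions valid_scale and parse_scale and the error
message are assumptions made for this sketch; they are not part of the
script above:

```shell
#!/bin/bash

# valid_scale is a hypothetical helper. It succeeds only when its
# argument consists entirely of numerals.
valid_scale() {
    [[ $1 =~ ^[0-9]+$ ]]
}

# parse_scale is a hypothetical stand-in for the case statement on
# lines 7-11. It prints the scale if it is acceptable.
parse_scale() {
    case $1 in
        -s|--scale)
            if valid_scale "$2"; then
                echo "$2"
            else
                echo "average: scale must be a non-negative integer" >&2
                return 1
            fi ;;
        *)  echo "usage: average [-s scale]" >&2
            return 1 ;;
    esac
}

parse_scale -s 4
```

Inside the script, this could be used as
scale=$(parse_scale "$@") || exit 1
in place of the case statement.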
Further Reading
The following man pages:
- bc
- bash (the "Compound Commands" section covers group commands and the
[[]] and (()) compound commands)
The Linux Command Line:
- Chapter 20 (regular expressions)
- Chapter 28 (if command, [[]] and (()) compound commands and && and ||
control operators)
- Chapter 29 (the read command)
- Chapter 30 (while loops)
- Chapter 35 (arithmetic expressions and expansion, bc program)
- Chapter 33 (positional parameters)
- Chapter 37 (group commands)
--
Posted By William Shotts to LinuxCommand.org: Tips, News And Rants at
4/01/2010 05:22:00 PM