A simple task: take a CSV file and compute the average of the values in the fourth column.
In python, a general purpose language, the code is reasonably straightforward. There are a couple of lines of noise (the two imports at the top). And to highlight two idiosyncrasies: (1) one has to know that "sys.argv" stands for the list of arguments passed to python, and (2) that list indexing starts at zero, so that the fourth column is number 3. There's also a bit of low-level work: opening the file, iterating its lines, splitting each line by commas and casting the read strings into numbers (a floating point number in this case).
# Run like: python parse.py data.csv from __future__ import with_statement import sys with open(sys.argv[1]) as f: # Prepare variables total = 0 n_lines = 0 # Iterate over each line for line in f: # Execute for each line fields = line.split(',') total += float(fields[3]) n_lines += 1 # Conclude print "Average:", total / n_lines
But there are languages far more suited for such a task, for example AWK, which has the built-in ability to process line-by-line the content of a file given as argument or from piped into it.
AWK has a number of built-in variables such as FS (the column separator character), and NR (the number of lines so far). The middle part--executed for every line--consists of one or more statements (in this case just one) that execute when matching a specified condition (in this case there's no condition and therefore all lines are considered). Implicitly casts strings to numbers.
# Run like: cat data.csv | awk -f parse.awk # Prepare variables BEGIN { FS = ","; total = 0; } # Execute for each line { total += $4; } # Conclude END { print "Average:", total / NR; }
The code above is expanded and commented. But it's really a one-liner:
$ awk 'BEGIN{FS=""; total=0} {total+=$4} END{print "Average:", total/NR}' data.csv