File Date Author Commit
 bin 2011-05-05 kypote [r2] added source to svn
 src 2011-05-05 kypote [r2] added source to svn
 INSTALL 2011-05-05 kypote [r4] README was really INSTALL (so I moved it). README
 README 2011-05-05 kypote [r6] ...

Read Me

ocdf is a small (UNIX/Linux) command-line
tool that bins, merges, or generates a cumulative
running total of a supplied data set.

For example, if the following data set were supplied:
>$ cat test
1 1
1 2
2 1
3 4


a call to ocdf produces:
>$ ocdf -i test
1 3
2 4
3 8


and if we want the data binned (into 2 bins):
>$ ocdf -i test -b 2
2 4
3 4
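
The two examples above can be reproduced with a short Python
sketch. This is not ocdf's source, only an illustration of the
behaviour shown. The function names (parse, merge, cumulative,
bin_sums) are made up for this sketch, and the binning rule
(equal-width bins between the smallest and largest key, each
labelled by its upper edge) is an assumption that happens to
match the sample output; ocdf may bin differently.

```python
import math
from fractions import Fraction


def parse(text):
    """Read 'key value' rows into exact rational pairs."""
    rows = []
    for line in text.splitlines():
        if line.strip():
            key, value = line.split()
            rows.append((Fraction(key), Fraction(value)))
    return rows


def merge(pairs):
    """Sum the values of rows sharing a key; return rows sorted by key."""
    totals = {}
    for key, value in pairs:
        totals[key] = totals.get(key, 0) + value
    return sorted(totals.items())


def cumulative(merged):
    """Replace each value with the running total up to that key."""
    out, running = [], 0
    for key, value in merged:
        running += value
        out.append((key, running))
    return out


def bin_sums(merged, nbins):
    """ASSUMED rule: equal-width bins over [min key, max key],
    each bin labelled by its upper edge and holding the sum of
    the values that fall in it (not a running total)."""
    lo, hi = merged[0][0], merged[-1][0]
    if hi == lo:
        return [(hi, sum(value for _, value in merged))]
    width = (hi - lo) / nbins
    sums = [0] * nbins
    for key, value in merged:
        # The smallest key goes in the first bin; every other key
        # goes in the first bin whose upper edge is >= the key.
        idx = 0 if key == lo else math.ceil((key - lo) / width) - 1
        sums[idx] += value
    return [(lo + (i + 1) * width, sums[i]) for i in range(nbins)]


data = parse("1 1\n1 2\n2 1\n3 4\n")
running_totals = cumulative(merge(data))
binned = bin_sums(merge(data), 2)

for key, total in running_totals:
    print(key, total)
for edge, total in binned:
    print(edge, total)
```

For the sample data the first loop prints the same three rows as
'ocdf -i test' and the second the same two rows as
'ocdf -i test -b 2'.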

ocdf reads from stdin by default.  See 'ocdf -h' for
the other options.

ocdf uses arbitrary-precision arithmetic (it
requires GMP) and stores the data set in an
efficient data structure (an AVL tree), so it
can be used for massive data sets (millions or
billions of data points).
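
To see why arbitrary precision matters for a running total,
here is a small Python sketch. It uses Python's fractions
module as a stand-in for GMP (an assumption made for
illustration; ocdf itself uses GMP) and compares a
floating-point running total against an exact one.

```python
from fractions import Fraction

# Ten data points of 0.1 each, summed as binary floating point.
float_total = 0.0
for _ in range(10):
    float_total += 0.1

# The same ten data points summed as exact rationals.
exact_total = Fraction(0)
for _ in range(10):
    exact_total += Fraction("0.1")

print(float_total == 1.0)  # False: rounding error has accumulated
print(exact_total == 1)    # True: the exact sum is exactly 1
```

The error after ten additions is tiny, but it grows with the
number of data points, which is the case ocdf is aimed at.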

The idea is that if you have a massive data
set (say stock prices) you can run it through
ocdf without worrying about losing precision
or bogging down in run time, and simply
generate binned output that you can analyze
with something like gnuplot.  Another common
use (one I rely on) is when I have a mass of
unordered data: just pump it through ocdf to
sort it and merge duplicates (ocdf -r).