Examples

Tomasz

Locsmoc Usage Examples

A sample input file is provided in the Files folder. The sample file contains a header row, which must be declared with the -h option.


Piecewise linear smoothing

A minimal command has the form

java -jar locsmoc.jar -f in.txt.gz -f out.txt.gz

The program reads the file in.txt.gz and stores the result in the file out.txt.gz. By default, smoothing uses a Poisson error model with dispersion parameter p=1 and produces a piecewise-linear model. A more complete, but equivalent, command is

java -jar locsmoc.jar --method poisson -p 1 -o 1
    -f in.txt.gz -f out.txt.gz

The default settings are suitable for quick results. However, a better-fitting model is usually obtained by enabling knot moving, for example

java -jar locsmoc.jar --method poisson -p 2 -o 1 --move 10
    -f in.txt.gz -f out.txt.gz

In the above, the parameter p is changed to 2 for more aggressive smoothing.
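
As an illustration of what a piecewise-linear model with knots means, the following Python sketch evaluates such a model at a given position. It is not part of Locsmoc, and the layout of the Locsmoc output file is not described on this page; the representation as a list of (x, y) knots joined by straight lines, the function name evaluate_piecewise_linear, and the sample values are all assumptions made for the example.

```python
from bisect import bisect_right

def evaluate_piecewise_linear(knots, x):
    """Evaluate a piecewise-linear model at x.

    knots is a list of (x, y) pairs sorted by x, with at least two entries
    and distinct x values (a hypothetical representation, not Locsmoc output).
    """
    xs = [k[0] for k in knots]
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the range covered by the knots")
    # Index of the knot that closes the segment containing x.
    i = min(bisect_right(xs, x), len(xs) - 1)
    (x0, y0), (x1, y1) = knots[i - 1], knots[i]
    # Straight-line interpolation between the two neighboring knots.
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# Made-up model with three knots, that is, two linear segments.
knots = [(0, 10), (10, 20), (30, 0)]
print(evaluate_piecewise_linear(knots, 5))   # 15.0
print(evaluate_piecewise_linear(knots, 20))  # 10.0
```

In this picture, moving a knot means changing its x position, which changes the slopes of the two segments that meet there.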


Piecewise constant segmentation

Locsmoc can also produce piecewise constant segmentations. This requires overriding some default behavior. An example is

java -jar locsmoc.jar --method poisson -p 1 -o 0 -c
    -f in.txt.gz -f out.txt.gz

Here the option -o 0 reduces the order of the output polynomial, and the option -c allows joining piecewise constant segments together. Note: it is necessary to specify the -c option for piecewise constant models.

As before, better models are usually obtained by enabling knot moving, for example

java -jar locsmoc.jar --method ttest -p 0.01 -o 0 -c --move 5
    -f in.txt.gz -f out.txt.gz

In the above example, the method and parameter p are also changed so that an empirical t-test is used to determine placement of the breakpoints.


Further options

The tool accepts a number of other parameters, which make processing different types of files and/or multiple files possible.

File options

Multiple input/output files can be specified in a single command:

java -jar locsmoc.jar -f in1.txt.gz -f out1.txt.gz -f in2.txt.gz -f out2.txt.gz

The inputs will be processed sequentially.
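
When many file pairs are involved, the command line can be assembled by a script. The Python sketch below only builds the argument list and does not run anything. It uses the -f and -h options exactly as they appear on this page; the helper name build_locsmoc_command is hypothetical, and the sketch assumes that -h applies to every input file in the command.

```python
def build_locsmoc_command(pairs, header=False, jar="locsmoc.jar"):
    """Return the argument list for one Locsmoc run over several file pairs."""
    cmd = ["java", "-jar", jar]
    if header:
        cmd.append("-h")  # declare that the inputs have a header row
    for infile, outfile in pairs:
        # Each pair contributes an input -f followed by an output -f.
        cmd += ["-f", infile, "-f", outfile]
    return cmd

pairs = [("in1.txt.gz", "out1.txt.gz"), ("in2.txt.gz", "out2.txt.gz")]
print(" ".join(build_locsmoc_command(pairs)))
# java -jar locsmoc.jar -f in1.txt.gz -f out1.txt.gz -f in2.txt.gz -f out2.txt.gz
```

The returned list can be passed directly to subprocess.run from the Python standard library.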

All files in a directory can be processed with a single command:

java -jar locsmoc.jar -f pathin -f pathout

Files with a header row are acceptable, but the header has to be declared:

java -jar locsmoc.jar -h -f in.txt.gz -f out.txt.gz

Files can be gzipped (extension gz), plain text (any other extension), or a combination of the two:

java -jar locsmoc.jar -f in.txt -f out.tsv
java -jar locsmoc.jar -f in.tsv -f out.txt.gz
java -jar locsmoc.jar -f in.txt.gz -f out.txt
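
The same extension rule is convenient when inspecting input or output files from a script. The following Python sketch opens a file as gzip when its name ends in .gz and as plain text otherwise. The helper name open_text, the file names, and the column names in the demo are made up for the example and say nothing about the actual Locsmoc file layout.

```python
import gzip
import os
import tempfile

def open_text(path, mode="rt"):
    """Open path as gzip if it ends in .gz, otherwise as plain text."""
    if path.endswith(".gz"):
        return gzip.open(path, mode)
    return open(path, mode)

# Demo: write the same content in both formats and read it back.
with tempfile.TemporaryDirectory() as tmp:
    results = {}
    for name in ("demo.txt", "demo.txt.gz"):
        path = os.path.join(tmp, name)
        with open_text(path, "wt") as handle:
            handle.write("position\tvalue\n1\t5\n")
        with open_text(path) as handle:
            results[name] = handle.read()

print(results["demo.txt"] == results["demo.txt.gz"])  # True
```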

Smoothing criteria

The criterion used for joining/moving segments can be one of poisson, ttest, or percent. Examples of the first two are shown in the previous sections. The percent method is invoked using

java -jar locsmoc.jar --method percent -p 0.02 -f in.txt.gz -f out.txt.gz

The model output by this command will deviate from the original signal by no more than 2% per segment.
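
To make the 2% bound concrete, here is a small Python sketch of one possible reading of the percent criterion: within a segment, the fitted value at each position may differ from the original value by at most a fraction p of the original value. This interpretation is an assumption; this page does not spell out how Locsmoc measures the deviation, and the function names and sample values are made up.

```python
def max_relative_deviation(signal, fitted):
    """Largest |fitted - signal| / |signal| over one segment.

    Assumes the two sequences have equal length and the signal is nonzero.
    """
    return max(abs(f - s) / abs(s) for s, f in zip(signal, fitted))

def within_percent(signal, fitted, p=0.02):
    """True if the fit stays within a fraction p of the signal everywhere."""
    return max_relative_deviation(signal, fitted) <= p

# A constant fit of 100 stays within 2% of this segment...
print(within_percent([100, 101, 99], [100, 100, 100]))  # True
# ...but a constant fit of 105 does not stay within 2% of this one.
print(within_percent([100, 110], [105, 105]))           # False
```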

Output piecewise polynomial models can be of order 0 (piecewise-constant), 1 (piecewise-linear), or 2 (piecewise-quadratic):

java -jar locsmoc.jar -o 0 -c -f in.txt.gz -f output-order0.txt.gz
java -jar locsmoc.jar -o 1 -f in.txt.gz -f output-order1.txt.gz
java -jar locsmoc.jar -o 2 -f in.txt.gz -f output-order2.txt.gz

For the linear and quadratic cases, note that not every segment will be reported at the maximum order set with -o. In particular, if the input signal cannot be smoothed at all with the specified criteria, the output will be identical to the input (i.e. both will be run-length encoded piecewise-constant functions) regardless of the order specified using -o.
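
For readers unfamiliar with the term, the Python sketch below shows what run-length encoding of a piecewise-constant signal means: consecutive positions with the same value are collapsed into a single run. The (start, end, value) tuples with an exclusive end are an assumption chosen for the illustration and do not describe the column layout of Locsmoc files.

```python
from itertools import groupby

def run_length_encode(values):
    """Collapse consecutive equal values into (start, end, value) runs.

    Positions are 0-based and end is exclusive (a convention chosen here).
    """
    runs = []
    position = 0
    for value, group in groupby(values):
        length = sum(1 for _ in group)
        runs.append((position, position + length, value))
        position += length
    return runs

def run_length_decode(runs):
    """Expand (start, end, value) runs back into one value per position."""
    values = []
    for start, end, value in runs:
        values.extend([value] * (end - start))
    return values

signal = [3, 3, 3, 7, 7, 2, 2, 2, 2]
runs = run_length_encode(signal)
print(runs)  # [(0, 3, 3), (3, 5, 7), (5, 9, 2)]
```

Decoding the runs returns the original signal, so the encoding loses no information.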