String overlap matcher for large data

Name         Modified     Size
readme.txt   2011-04-14   1.6 kB
test1.txt    2011-04-14   1.1 kB
Somald.jar   2011-04-14   19.3 kB
Totals: 3 items, 22.0 kB, 0 downloads / week
Readme
-------------------------------------------------------------------

This example project solves the following problem:
Reading part:
Read lines of genetic code, represented as strings, from a file.
Each line in the file holds two tab-separated strings.
Computation part:
Find where the tail end of the first string overlaps the head end of the second string.
The overlap must be at least 5 letters long.
Return the non-overlapping head of the first string if such an overlap exists; otherwise return the first string unchanged.
eg1: input abc12345 12345678 output: abc
eg2: input abcdefghij 1234567 output: abcdefghij
eg3: input abc11111111111 11111111111 output: abc
Writing part:
Collect the results and write them out.
--------------------------------------------------------------------
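
The computation part can be sketched in Java as follows. This is a minimal illustration, not the code shipped in Somald.jar: the class name OverlapSketch and the method nonOverlappingHead are hypothetical, and preferring the longest qualifying overlap is an assumption inferred from eg3.

```java
// Sketch only: not taken from the Somald source. All names are hypothetical.
final class OverlapSketch {
    // Minimum overlap length stated in the readme.
    static final int MIN_OVERLAP = 5;

    /**
     * Returns the part of first that precedes the longest overlap between
     * the tail of first and the head of second. Returns first unchanged
     * when no overlap of at least MIN_OVERLAP letters exists.
     */
    static String nonOverlappingHead(String first, String second) {
        int longest = Math.min(first.length(), second.length());
        // Try the longest candidate first so the maximal overlap wins
        // (assumption based on eg3).
        for (int len = longest; len >= MIN_OVERLAP; len--) {
            // Compare the last len letters of first with the first len letters of second.
            if (first.regionMatches(first.length() - len, second, 0, len)) {
                return first.substring(0, first.length() - len);
            }
        }
        return first;
    }

    public static void main(String[] args) {
        System.out.println(nonOverlappingHead("abc12345", "12345678"));          // eg1
        System.out.println(nonOverlappingHead("abcdefghij", "1234567"));         // eg2
        System.out.println(nonOverlappingHead("abc11111111111", "11111111111")); // eg3
    }
}
```

The scan compares every candidate length, so it is quadratic in the worst case. For very long strings, a prefix-function (KMP-style) approach would be one alternative.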

Running the program

1. Download the jar file "Somald.jar" or compile it from the source.
2. Download the input file "test1.txt" or create your own.
3. Run the program by typing:
	java -jar Somald.jar test1.txt
where test1.txt is the input file.

The result is printed on the screen. Use standard output redirection to capture it in a file.

Bug
The built-in output file writer has an undiagnosed bug (apparently some kind of thread thrashing) that kicks in after about 90% of the input has been processed.
If you still want to use it, simply edit the main entry point, line 140 of Main.java,
from
	instance.readFileAndWork(new File(args[0]),null);
to
	instance.readFileAndWork(new File(args[0]),new File("yourOutputFileName.txt"));
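
If file output is needed without the built-in writer, one option is a plain single-threaded loop that reads, computes and writes one line at a time. The sketch below is not taken from the Somald source: SequentialPipelineSketch, process and the compute parameter are hypothetical names, and skipping lines that contain no tab is an assumption.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.BinaryOperator;

// Sketch only: a single-threaded read/compute/write loop. All names are hypothetical.
final class SequentialPipelineSketch {
    /**
     * Reads input line by line, splits each line at the first tab, applies
     * compute to the two fields and writes one result per line to output.
     * Lines without a tab are skipped (an assumption, not documented behaviour).
     * Returns the number of results written.
     */
    static int process(Path input, Path output, BinaryOperator<String> compute)
            throws IOException {
        int written = 0;
        // try-with-resources closes both files even if an exception is thrown.
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                int tab = line.indexOf('\t');
                if (tab < 0) {
                    continue; // not a two-field line
                }
                String first = line.substring(0, tab);
                String second = line.substring(tab + 1);
                out.write(compute.apply(first, second));
                out.newLine();
                written++;
            }
        }
        return written;
    }
}
```

Because nothing is shared between threads, the output order matches the input order, and only one line is held in memory at a time. The compute argument stands in for whatever overlap function is used.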
	
 
Source: readme.txt, updated 2011-04-14