The three main executable programs in this archive are MiningABs, IdentifyCommonGenes and ConvertSeqsetToDistanceMatrix. We strongly recommend running all of them on a 64-bit operating system and making sure that your computer has at least 4 GB of free memory.

The steps below show how to use these programs with an example. The example files are in the directory “Example”.
Step1: Make sure that the Java SE Development Kit (JDK) is installed. If it is not, it is publicly available at “http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html”. At the time of writing, the latest version is 7u21. Download the file that matches your operating system and install it.
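
As a quick check of Step1 and of the 64-bit and memory recommendation, the following is a minimal sketch that is not part of the MiningABs distribution. The class name CheckEnvironment and its helper methods are hypothetical; only standard JDK calls are used. Note that “os.arch” describes the running JVM, so testing it for “64” is only a heuristic.

```java
public class CheckEnvironment {

    /** Heuristic: architecture names such as "amd64" or "x86_64" contain "64". */
    public static boolean is64Bit(String arch) {
        return arch != null && arch.contains("64");
    }

    /** Converts a byte count to whole megabytes (1 MB = 1024 * 1024 bytes). */
    public static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        String arch = System.getProperty("os.arch");
        // maxMemory() is the heap limit of this JVM, not the free memory of the machine.
        long maxHeapMb = toMegabytes(Runtime.getRuntime().maxMemory());

        System.out.println("Java version: " + version);
        System.out.println("Architecture: " + arch
                + (is64Bit(arch) ? " (looks 64-bit)" : " (possibly 32-bit)"));
        System.out.println("Maximum heap for this JVM: " + maxHeapMb + " MB");
    }
}
```

Compile it with “javac CheckEnvironment.java” and run it with “java CheckEnvironment”. If the “javac” command is not found, the JDK is probably not installed or not on your PATH.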

Step2: Prepare your input data and run the programs. The Example directory includes 7 text files and 2 directories, as follows:
1. ListDSs.txt (text): Each line gives the name of one microarray dataset. In this example, we use 3 datasets: DS1, DS2 and DS3.
2. Datasets (directory): this directory contains 3 text files with dataset information. Each file must be named after an entry in “ListDSs.txt”, like DS1.txt, DS2.txt and DS3.txt in this example. Each file contains the probe IDs of one microarray dataset, their corresponding reading values and the disease target classes (1 = tumor; 0 = normal). Elements are separated by tabs.
3. PlatformInformation (directory): this directory contains 3 text files with microarray platform information. Each file name must contain the corresponding entry of “ListDSs.txt”, like PI_DS1.txt, PI_DS2.txt and PI_DS3.txt in this example. Each file contains the probe IDs, gene IDs and gene symbols. Elements are separated by tabs.
4. InfoProbes.txt (text): this file contains the probe sequence information for all datasets. In each row, the sequence identifier consists of the dataset name, probe ID and gene ID, separated by tabs.
5. CG.txt (text): this file contains the genes that are common to all microarray platforms. To generate it, run “Run_IdentifyCommonGenes.bat”. This batch file contains the Java command line “java -cp MiningABs.jar IdentifyCommonGenes 3 Example/InfoProbes.txt Example/CG.txt”, in which “3” is the number of input datasets, “Example/InfoProbes.txt” is the path to your probe sequence information, and “Example/CG.txt” is the path to your output file.
6. MapGeneIDGeneSym.txt (text): this file contains a mapping between gene IDs and gene symbols across all datasets.
7. DM.txt (text): this file contains a distance matrix over all pairs of probe sequences. To generate it, run “Run_ConvertSeqsetToDistanceMatrix.bat”. This batch file contains the Java command line “java -cp ./ ConvertSeqsetToDistanceMatrix Example/InfoProbes.txt Example/DM.txt”, in which “Example/InfoProbes.txt” is the path to your probe sequence information, and “Example/DM.txt” is the path to your output file.
8. Config.txt (text): this file contains all parameter settings for MiningABs. Each line consists of a parameter name and its value, separated by a tab. The parameters are:
POPSIZE: the population size
MAXGENS: the maximum number of generations
INI_NVARS: the number of associated biomarkers (ABs)
PXOVER: the threshold for crossover and mutation rates
FITNESS_THRESHOLD: the threshold of an improved c-LM
LMs: the number of improved c-LMs to output in a run

9. LMs.txt (text): Once the 8 items above are ready, run “Run_MiningABs.bat” to write the improved c-LMs to this file. If you receive the error “Java heap space out of memory”, allocate more memory to the program. For example, change “java -cp MiningABs.jar MiningABs ……arguments……” in the file “Run_MiningABs.bat” to “java -Xmx70000m -cp MiningABs.jar MiningABs ……arguments……”, where -Xmx70000m sets the maximum heap size to 70000 MB (about 70 GB). Choose a value that fits the memory of your computer.
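
The naming rules in items 1 to 3 can be checked before running the programs. The following is a minimal sketch, not part of the MiningABs distribution: the class name CheckExampleLayout and the method expectedFiles are hypothetical, and reading “ListDSs.txt” as UTF-8 is an assumption, since the file encoding is not stated here.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

public class CheckExampleLayout {

    /** For each dataset name, lists the two files the naming rules call for. */
    public static List<String> expectedFiles(List<String> datasetNames) {
        List<String> paths = new ArrayList<String>();
        for (String name : datasetNames) {
            String n = name.trim();
            if (n.isEmpty()) {
                continue; // skip blank lines
            }
            paths.add("Datasets/" + n + ".txt");
            paths.add("PlatformInformation/PI_" + n + ".txt");
        }
        return paths;
    }

    public static void main(String[] args) throws IOException {
        // The first argument is the example directory; "Example" is the default.
        File exampleDir = new File(args.length > 0 ? args[0] : "Example");
        // Assumption: ListDSs.txt is readable as UTF-8.
        List<String> names = Files.readAllLines(
                new File(exampleDir, "ListDSs.txt").toPath(), StandardCharsets.UTF_8);
        for (String relative : expectedFiles(names)) {
            File f = new File(exampleDir, relative);
            System.out.println((f.isFile() ? "found   " : "MISSING ") + f.getPath());
        }
    }
}
```

Run it from the directory that contains “Example” with “java CheckExampleLayout”, or pass the path of your own data directory as the first argument.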

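The Config.txt format described in item 8 (one parameter name and one value per line, separated by a tab) can also be checked up front. The following is a minimal sketch under stated assumptions, not the parser used by MiningABs: the class name ConfigCheck and its methods are hypothetical, the file is assumed to be readable as UTF-8, and no claim is made about which values are valid.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ConfigCheck {

    /** The six parameter names listed in item 8. */
    public static final List<String> EXPECTED = Arrays.asList(
            "POPSIZE", "MAXGENS", "INI_NVARS", "PXOVER", "FITNESS_THRESHOLD", "LMs");

    /** Splits each non-blank line into NAME<TAB>VALUE and keeps the file order. */
    public static Map<String, String> parse(List<String> lines) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = line.split("\t");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected NAME<TAB>VALUE but got: " + line);
            }
            params.put(parts[0].trim(), parts[1].trim());
        }
        return params;
    }

    /** Returns the expected parameter names that are absent from the map. */
    public static List<String> missing(Map<String, String> params) {
        List<String> result = new ArrayList<String>();
        for (String name : EXPECTED) {
            if (!params.containsKey(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        String path = args.length > 0 ? args[0] : "Example/Config.txt";
        // Assumption: Config.txt is readable as UTF-8.
        Map<String, String> params =
                parse(Files.readAllLines(Paths.get(path), StandardCharsets.UTF_8));
        System.out.println("Parameters read: " + params);
        System.out.println("Missing parameters: " + missing(params));
    }
}
```

The check only reports structural problems, such as a line without a tab or a missing parameter name. Whether a given value is sensible still has to be judged against the parameter descriptions above.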

If you have any questions about the above steps, please do not hesitate to contact us.
June 18, 2013
Source: README.txt, updated 2013-06-18