Requirements
------------
- R (https://www.r-project.org/) version 3.4.0 or higher
- Additional R packages: parallel, infotheo

Data file format
----------------
SynergisticCore.R accepts data in CSV format, where columns contain the expression profiles of individual cells and rows contain the expression values of individual genes across cells. The first column contains the gene/transcription factor names, and the first row is a header containing the column (i.e. cell) identifiers. A cell identifier should have the format

subpopulation_identifier.cell_number

For example: Neuron.1 (cell one in the Neuron subpopulation), Neuron.2 (cell two in the Neuron subpopulation), NSC.1 (cell one in the NSC subpopulation), NSC.2 (cell two in the NSC subpopulation), Astro.1 (cell one in the astrocytes subpopulation).

The subpopulation identifiers in the column identifiers are used to determine the target subpopulation and, optionally, the subpopulations to exclude from the background subpopulations (see the Usage section below).

Usage:
------
> Rscript SynergisticCore.R <datafile> <expression threshold> <target subpopulation id> [-E <number of excluded subpopulations> <excluded subpopulation id 1> <excluded subpopulation id 2> ... ] [-known <number of known identity TFs> <known identity TF 1> <known identity TF 2> ... ]

where
- datafile: a file in the format specified above containing single-cell RNA-seq data for a number of subpopulations,
- expression threshold: a number specifying the threshold value below which a gene is considered not to be expressed,
- target subpopulation id: identifier of the subpopulation for which the synergistic identity core is searched.

Optionally, certain subpopulations can be excluded from the set of background subpopulations. To do so, use the '-E' option followed by the number of subpopulations to exclude and then their identifiers.

The '-known' option lets the user supply known identity transcription factors. It is followed by the number of known identity transcription factors and then their names. If this option is given, the output additionally reports the number of known transcription factors found in the core and a p-value.

Usage examples:
---------------
> Rscript SynergisticCore.R "Data/example_data.csv" 10 Blood_progenitors -E 1 Unspecified
> Rscript SynergisticCore.R "Data/example_data.csv" 10 Blood_progenitors -E 1 Unspecified -known 6 GATA2 RUNX1 TAL1 HHEX NFE2L2 SOX7
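
The identifier format and the argument order described above can be checked before a long run with a small helper script. The following is only a sketch in Python and is not part of SynergisticCore.R. The function names (split_cell_id, count_subpopulations, build_command) are hypothetical, and the parser assumes that the subpopulation identifier is everything before the last dot and that the cell number consists of digits only; the README does not state how the R script itself splits identifiers.

```python
import re
from collections import Counter

# Assumption: the subpopulation id is everything before the LAST dot,
# and the cell number after it consists of digits only.
CELL_ID = re.compile(r"^(?P<subpop>.+)\.(?P<cell>\d+)$")


def split_cell_id(cell_id):
    """Split 'Neuron.1' into ('Neuron', 1); raise ValueError otherwise."""
    match = CELL_ID.match(cell_id)
    if match is None:
        raise ValueError(f"not of the form subpopulation.cell_number: {cell_id!r}")
    return match.group("subpop"), int(match.group("cell"))


def count_subpopulations(cell_ids):
    """Count how many cells each subpopulation contributes to the header."""
    return Counter(split_cell_id(cell_id)[0] for cell_id in cell_ids)


def build_command(datafile, threshold, target, excluded=(), known=()):
    """Assemble the argument list in the order given in the Usage section."""
    cmd = ["Rscript", "SynergisticCore.R", datafile, str(threshold), target]
    if excluded:
        # -E <number of excluded subpopulations> <id 1> <id 2> ...
        cmd += ["-E", str(len(excluded)), *excluded]
    if known:
        # -known <number of known identity TFs> <TF 1> <TF 2> ...
        cmd += ["-known", str(len(known)), *known]
    return cmd


if __name__ == "__main__":
    # Cell identifiers taken from the examples in the data format section.
    header = ["Neuron.1", "Neuron.2", "NSC.1", "NSC.2", "Astro.1"]
    print(dict(count_subpopulations(header)))

    cmd = build_command(
        "Data/example_data.csv", 10, "Blood_progenitors",
        excluded=["Unspecified"],
        known=["GATA2", "RUNX1", "TAL1", "HHEX", "NFE2L2", "SOX7"],
    )
    print(" ".join(cmd))
```

With the arguments shown, build_command reproduces the second usage example (without the quotes around the data file path). The returned list can be passed to subprocess.run if the script should be launched from Python. Deriving the counts after '-E' and '-known' from the lists themselves avoids a mismatch between the stated number and the identifiers that follow.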