Name | Modified | Size | Downloads / Week |
---|---|---|---|
classify_resources.pl | 2014-01-15 | 4.5 kB | |
Totals: 1 Item | 4.5 kB | 0 |
Overview
These files are provided for download as a result of our bioinformatics resource networks analysis, as linked to from here. They provide an example of the application of our bioNerDS software, but are not required in order to use bioNerDS directly.
That said, you might find the files in the scripts/variant_aggregation subdirectory useful for processing and grouping the raw output of bioNerDS before further analysis, in particular the .pm Perl module file. Note that future versions of bioNerDS will include this step automatically (no separate script will be required).
Some of these scripts use a MySQL database connection (the aggregated bioNerDS data is stored in MySQL). Look for instances of "FILL_ME" within the scripts and replace them with the appropriate database connection information (table name, database name, user name, password). In most cases, the table name and database name are taken as command-line parameters with the -t and -d flags respectively. You may also need to change the host and port variables if you're not using localhost and/or port 3306.
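A quick way to find the placeholders that still need replacing is sketched below. The function name list_placeholders is hypothetical and not part of the bioNerDS scripts; it only wraps a standard grep call.

```shell
# List every line that still contains the FILL_ME placeholder.
# Usage: list_placeholders <directory>   (defaults to the current directory)
list_placeholders() {
    dir="${1:-.}"
    # -r: recurse into subdirectories
    # -n: print line numbers
    # -F: treat the pattern as a fixed string, not a regular expression
    grep -rnF 'FILL_ME' "$dir"
}
```

For example, running list_placeholders scripts from the download root would print each matching file name, line number, and line, assuming the scripts live in a scripts directory.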
Network Generation Workflow
Note: You can skip steps 1 to 5 if you are using the already filtered SQL dump tables (data/results_tables.sql.gz) rather than generating your own from the raw bioNerDS data.
Note 2: Alternatively, download the appropriate Cytoscape files (networks/*.cys) instead.
1. First, process the raw bioNerDS output data, aggregating name variants, using the following script: variant_aggregation/populateMentionsDB.pl.
2. Next, if you want to use resource classification (databases versus software), run resource_classifier/classify_resources.pl, which creates a file of resource-to-class associations.
3. Download and import data/mesh_bioinf_journals.sql.gz into a local database (this enables you to filter the main data by journal MeSH term).
4. Download and import data/pmc_articles_sections.sql.gz into a local database (this enables you to filter the main data by article section).
5. Using the scripts/filter_mentions.sql MySQL script, create the required tables through the various filtering steps employed. You will need to import the classification file you previously generated into the database. The final table you create here is the one used in the "FILL_ME" blanks within the remaining scripts.
6. Generate the resource co-occurrence pairs in one of two ways, depending on whether you require temporal separation (see below). In each case, also pass the -c <classification> option (either sw or db) to the get_ordered_cooccur.pl script, if desired.
7. Import the resulting .csv files into Cytoscape or an equivalent tool for analysis.
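The import of the gzip-compressed SQL dumps mentioned in the workflow could be done along these lines. This is only a sketch: the import_dump function and the DB_NAME and DB_USER values are hypothetical placeholders, and it assumes the standard gunzip and mysql command-line clients are installed.

```shell
# Hypothetical connection details: replace with your own.
DB_NAME="bionerds"
DB_USER="bionerds_user"
DB_HOST="localhost"
DB_PORT=3306

# Import one gzip-compressed SQL dump into the database above.
# $1: path to the dump, e.g. data/mesh_bioinf_journals.sql.gz
import_dump() {
    # Decompress to stdout and pipe the SQL into the mysql client.
    # -p with no value makes mysql prompt for the password.
    gunzip -c "$1" | mysql -h "$DB_HOST" -P "$DB_PORT" -u "$DB_USER" -p "$DB_NAME"
}
```

With this in place, import_dump data/mesh_bioinf_journals.sql.gz and import_dump data/pmc_articles_sections.sql.gz would load the two filter tables into the same database.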
Example Pair-Extraction Script Execution
For general networks:
#!/usr/bin/bash
# Extract ordered co-occurrence pairs, then compute probabilities and filter them.
./get_ordered_cooccur.pl -d "FILL_ME" -t "FILL_ME" -s 2 |
    ./calculate_probabilities_and_filter_pairs.pl > results.csv
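If you want separate networks for software and databases, the general-network command can be repeated once per class using the -c option described in the workflow. The sketch below assumes both scripts are in the current directory; the run_per_class function and the results_sw.csv / results_db.csv output names are hypothetical.

```shell
# Run the general-network pipeline once per resource class (sw, then db).
# $1: database name, $2: table name (the values used for the "FILL_ME" blanks)
run_per_class() {
    for class in sw db; do
        # -c restricts the co-occurrence pairs to the given classification
        ./get_ordered_cooccur.pl -d "$1" -t "$2" -s 2 -c "$class" |
            ./calculate_probabilities_and_filter_pairs.pl > "results_$class.csv"
    done
}
```

Each resulting file can then be imported into Cytoscape as its own network.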
For temporal networks:
#!/usr/bin/bash
./get_ordered_cooccur.pl -d "FILL_ME" -t "FILL_ME" -s 2 -y
./cooccur_temporal_shim.pl
# Process each temporal pairs file separately, writing one .csv per input file.
for f in cooccur_pairs_20*.tsv; do
    ./calculate_probabilities_and_filter_pairs.pl < "$f" > "$f.csv"
done