Overview

These files accompany our bioinformatics resource networks analysis (linked to from here). They provide an example of the application of our bioNerDS software, but are not required in order to use bioNerDS directly.

That said, you might find the files contained within the scripts/variant_aggregation subdirectory useful for processing and grouping the raw output of bioNerDS before further analysis - in particular, the .pm Perl module file. Note that future versions of bioNerDS will include this step automatically (no separate script will be required).

Some of these scripts connect to a MySQL database, where the aggregated bioNerDS data is stored. Search the scripts for instances of "FILL_ME" and replace each with the appropriate database connection information (table name, database name, user name, password). In most cases, the table name and database name are taken as command line parameters via the -t and -d flags respectively. You may also need to change the host and port variables if you are not using localhost and/or port 3306.
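
To locate every placeholder before editing, a recursive grep is usually enough. The following is a minimal sketch: the helper name find_placeholders and the choice of file extensions (.pl, .pm, .sql) are assumptions for illustration, not part of the distributed scripts.

```shell
#!/usr/bin/env bash
# Sketch: list every "FILL_ME" placeholder with its file name and line number.
# find_placeholders is a hypothetical helper; the extensions are assumed.
find_placeholders() {
    local dir="${1:-.}"
    grep -rn --include='*.pl' --include='*.pm' --include='*.sql' 'FILL_ME' "$dir"
}

# Search the current directory; grep exits non-zero when nothing matches.
find_placeholders . || echo "no placeholders found"
```

Each output line has the form file:line:text, so the remaining placeholders can be worked through one by one.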

Network Generation Workflow

Note: You can skip steps 1 to 5 if you use the already filtered SQL dump tables (data/results_tables.sql.gz) rather than generating your own from the raw bioNerDS data.

Note 2: Alternatively, download the appropriate Cytoscape (networks/*.cys) files instead.

  1. First you'll need to process the raw bioNerDS output data to aggregate name variants, using the following script: variant_aggregation/populateMentionsDB.pl.
  2. Next, if you want to use resource classification (databases versus software), you'll need to run resource_classifier/classify_resources.pl, which creates a file associating each resource with its class.
  3. Download and import data/mesh_bioinf_journals.sql.gz into a local database (this enables you to filter the main data by journal MeSH term).
  4. Download and import data/pmc_articles_sections.sql.gz into a local database (this enables you to filter the main data by article section).
  5. Using the scripts/filter_mentions.sql MySQL script, create the required tables through the various filtering steps employed. You will need to import the classification file you previously generated into the database. The final table you create here is the one used in the "FILL_ME" blanks within the remaining scripts.
  6. Generate the resource co-occurrence pairs in one of two ways, depending on whether you require temporal separation or not (see below). In each case, you can also pass the -c <classification> option (either sw or db) to the get_ordered_cooccur.pl script, if desired.
  7. Import resulting .csv files into Cytoscape or equivalent for analysis.
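
The two imports in steps 3 and 4 can be sketched as below, assuming the standard gunzip and mysql command line clients are installed. The database name, user name, DRY_RUN switch, and import_dump helper are hypothetical placeholders; by default the sketch only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch only: DB_NAME, DB_USER and DRY_RUN are hypothetical placeholders.
DB_NAME="${DB_NAME:-bionerds}"
DB_USER="${DB_USER:-bionerds_user}"
DRY_RUN="${DRY_RUN:-1}"

# Decompress a gzipped SQL dump and feed it to the mysql client.
import_dump() {
    local dump="$1"
    if [ "$DRY_RUN" = "1" ]; then
        # Print the pipeline instead of executing it.
        echo "gunzip -c $dump | mysql -u $DB_USER -p $DB_NAME"
    else
        gunzip -c "$dump" | mysql -u "$DB_USER" -p "$DB_NAME"
    fi
}

import_dump data/mesh_bioinf_journals.sql.gz
import_dump data/pmc_articles_sections.sql.gz
```

Setting DRY_RUN=0 performs the imports; mysql then prompts for the password. The same pattern should apply to data/results_tables.sql.gz if you start from the pre-filtered tables.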

Example Pair-Extraction Script Execution

For general networks:

#!/usr/bin/env bash
# Extract ordered co-occurrence pairs, then compute probabilities and filter.
./get_ordered_cooccur.pl -d "FILL_ME" -t "FILL_ME" -s 2 |
./calculate_probabilities_and_filter_pairs.pl > results.csv
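
If you want one network per resource class, the -c option described in step 6 can be looped over both values. This is a sketch under assumptions: the build_pipeline helper and the results_<class>.csv output names are hypothetical, and the commands are printed for review rather than executed.

```shell
#!/usr/bin/env bash
# Sketch: DB and TABLE stand in for the "FILL_ME" values.
# build_pipeline and the results_<class>.csv names are hypothetical.
DB="${DB:-FILL_ME}"
TABLE="${TABLE:-FILL_ME}"

# Print the general-network pipeline for one classification (sw or db).
build_pipeline() {
    local class="$1"
    printf './get_ordered_cooccur.pl -d "%s" -t "%s" -s 2 -c %s | ./calculate_probabilities_and_filter_pairs.pl > results_%s.csv\n' \
        "$DB" "$TABLE" "$class" "$class"
}

for class in sw db; do
    build_pipeline "$class"
done
```

Once the printed commands look right, they can be copied into a terminal or piped to bash from the scripts directory.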

For temporal networks:

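#!/usr/bin/env bash
# Extract ordered co-occurrence pairs with the -y (temporal) option.
./get_ordered_cooccur.pl -d "FILL_ME" -t "FILL_ME" -s 2 -y
./cooccur_temporal_shim.pl
# Compute probabilities and filter pairs for each per-period file.
for f in cooccur_pairs_20*.tsv; do
    ./calculate_probabilities_and_filter_pairs.pl < "$f" > "$f.csv"
done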
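
Before importing the temporal results into Cytoscape, it can help to confirm that every per-period file was actually populated. The check below is a sketch: check_outputs is a hypothetical helper, and it assumes the output names produced by the loop above (cooccur_pairs_20*.tsv.csv).

```shell
#!/usr/bin/env bash
# Sketch: report the line count of each generated CSV and flag empty ones.
# check_outputs is a hypothetical helper, not part of the distributed scripts.
check_outputs() {
    local dir="${1:-.}" f status=0
    for f in "$dir"/cooccur_pairs_20*.tsv.csv; do
        # An unmatched glob stays literal, so test that the file exists.
        [ -e "$f" ] || { echo "no output files found in $dir"; return 1; }
        if [ -s "$f" ]; then
            echo "$(basename "$f"): $(wc -l < "$f" | tr -d ' ') lines"
        else
            echo "$(basename "$f"): EMPTY"
            status=1
        fi
    done
    return "$status"
}

check_outputs . || echo "some outputs are missing or empty"
```

An EMPTY entry usually means the upstream step produced no pairs for that period, or that the database parameters were not filled in.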
Source: README.md, updated 2014-07-30