Note! This is an unmaintained archive site.
Wiki has been moved to Github and current documentation is available at https://github.com/chipster/chipster/wiki
This tutorial shows how to modify an existing analysis tool or add a new one to a Chipster server. Integrating analysis tools is a straightforward process, which lets you use Chipster to serve and support a large number of users who are not experienced in data analysis and programming. Here we focus on tools written in the R programming language, but the process is much the same for other kinds of analysis tools.
To be able to work with tool scripts you need to be able to log into the Chipster server. This is an important restriction to guarantee security of the server: analysis tools are essentially unrestricted pieces of code that will be run on the server, meaning that the ability to add or modify tools is equal to the ability to log into the system.
In practice, developing R scripts for Chipster means editing files in the comp/modules subdirectory of the Chipster installation on the server. This can be done either by logging into the Linux command line and running a file editor there, or by connecting to the server filesystem and editing the files with the editor you normally use. Let's first have a look at the command line option, as you always need the command line for some server administration tasks anyway.
Once you have started a fresh Chipster server using the provided virtual machine images, you need to log in for the first time. To do this, you typically use the virtual console of your virtualisation software (e.g. VirtualBox or VMware). This virtual console corresponds to the physical console (display and keyboard) of a real server machine. It opens up when you start the virtual machine.
Log in using the username "chipster". If you haven't already changed the password, it is "chipster" by default.
Now you are at the console command line. Virtual consoles are not very user friendly, so it is recommended to use SSH to connect to the server. When you logged in, you were provided with a "message of the day" that contains some useful bits of information, including the IP address of the virtual server. Record the IP address and log out.
Next, open your favourite SSH client and point it to the IP address of the server. Log in using the "chipster" account again. Switch to the directory that contains the analysis tool modules and list them.
cd /opt/chipster/comp/modules/
ls
There you can see the analysis modules, typically common, microarray and ngs. The common module contains utility functions that can be used by the actual tool scripts in the other modules.
Each module has a configuration file that describes the tools inside the module.
cat microarray/microarray-module.xml
Typically you would edit the file to add or remove tools and tool categories.
Next we switch to the directory that contains the actual analysis scripts and have a look at it.
cd microarray/R-2.12
ls
You can see a large number of scripts, each of them corresponding to a single tool in the client GUI. To keep things as simple as possible, Chipster has a one-to-one mapping between tools and files: to create a tool, you only need to create a single file, and each file defines a single tool. If you need to share functionality between tools, the common module is available for that.
To modify a tool, you can just edit it with a text editor:
nano na-omitted.R
The tool should look like the following:
# TOOL na-omitted.R: "Remove missing values" (Removal of missing values. All observations, i.e., genes that have at least one missing value are excluded from the data set.)
# INPUT normalized.tsv: normalized.tsv TYPE GENE_EXPRS
# OUTPUT na-omitted.tsv: na-omitted.tsv

# Removal of missing values
# JTT 22.6.2006

# Loads the file file
As you can see, there are two parts in the script. First there are three header lines, which describe the tool for Chipster. After the header comes the script itself, just a regular script of R commands. The header lines are written as comments so that the description can conveniently be part of the executable R script file. The header is written in a simple and compact description language called SADL. The most practical way to get started with SADL descriptions is to have a look at similar tools and their headers. For complete reference, look at:
Next we work with the script a bit. Changes to tool scripts are visible immediately. You can try this out by adding this to the end of the script (and saving it):
# I was here!
When you open the Chipster client and click Show sourcecode for Remove missing values tool, you should see your additional line there. If you remove the line, save and hit Show sourcecode again, you will see that the line has disappeared. It is important to remember that if you change the header part of the script, then you need to restart your client so that it can recreate its internal data structures and GUI components to match the updated tool.
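Because only header changes require a client restart, it can help to check whether an edit touched the header at all. The following is a minimal sketch, not part of Chipster: the function names `sadl_header` and `header_changed` are hypothetical, and it assumes that header lines start with `# TOOL`, `# INPUT` or `# OUTPUT` as in the example above, plus `# PARAMETER`, which is an assumption about SADL not shown in this tutorial.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Chipster.

# Print the SADL header lines of a tool script.
# PARAMETER is assumed to be a SADL keyword; it does not appear in the
# example above.
sadl_header() {  # usage: sadl_header SCRIPT
    grep -E '^# (TOOL|INPUT|OUTPUT|PARAMETER) ' "$1"
}

# Succeed (exit status 0) if the header differs between two versions
# of the same script, fail otherwise.
header_changed() {  # usage: header_changed OLD_SCRIPT NEW_SCRIPT
    [ "$(sadl_header "$1")" != "$(sadl_header "$2")" ]
}
```

A possible workflow is to copy the script before editing (for example `cp na-omitted.R na-omitted.R.bak`), edit it, and then run `header_changed na-omitted.R.bak na-omitted.R && echo "restart the client"`.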
To add or remove a tool, edit the corresponding module configuration file, e.g.:
cd /opt/chipster/comp/modules/microarray/
nano microarray-module.xml
A tool can be removed by simply deleting or commenting out its XML entry in the file. To add a tool, create a new entry. As an example, consider this snippet from microarray-module.xml:
<tool runtime="R-2.12"><resource>filter-expression.R</resource></tool>
<tool runtime="R-2.12"><resource>filter-flags.R</resource></tool>
<tool runtime="R-2.12"><resource>filter-sd.R</resource></tool>
To remove flag filtering and add p-value filtering, you would change the snippet to the following:
<tool runtime="R-2.12"><resource>filter-expression.R</resource></tool>
<tool runtime="R-2.12"><resource>filter-sd.R</resource></tool>
<tool runtime="R-2.12"><resource>filter-pvalue.R</resource></tool>
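If you make this kind of change often, it can be scripted instead of done by hand in an editor. The sketch below is one possible approach, not a Chipster feature: the function names `remove_tool` and `add_tool_after` are hypothetical, and the code assumes that every tool entry sits on its own line, as in the snippet above.

```shell
#!/bin/sh
# Sketch only: assumes one <tool> entry per line in the module file.
# Function names are hypothetical, not part of Chipster.

# Remove the entry for SCRIPT from MODULE_XML.
# The previous version of the file is kept as MODULE_XML.bak.
remove_tool() {  # usage: remove_tool MODULE_XML SCRIPT
    cp "$1" "$1.bak" &&
    awk -v pat="<resource>$2</resource>" 'index($0, pat) == 0' "$1.bak" > "$1"
}

# Add an entry for NEW_SCRIPT on the line after the entry for AFTER_SCRIPT.
# The previous version of the file is kept as MODULE_XML.bak.
add_tool_after() {  # usage: add_tool_after MODULE_XML RUNTIME NEW_SCRIPT AFTER_SCRIPT
    cp "$1" "$1.bak" &&
    awk -v anchor="<resource>$4</resource>" \
        -v entry="<tool runtime=\"$2\"><resource>$3</resource></tool>" \
        '{ print } index($0, anchor) > 0 { print entry }' "$1.bak" > "$1"
}
```

With these helpers, the change shown above would correspond to `remove_tool microarray-module.xml filter-flags.R` followed by `add_tool_after microarray-module.xml R-2.12 filter-pvalue.R filter-sd.R`. Each call overwrites the `.bak` file, so keep a separate copy if you want to preserve the original.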
The runtime attribute defines the runtime environment used to run the tool. Runtimes are specified in comp/config/runtimes.xml. For R scripts, you need to pick the correct version of the R environment for your script (in these examples, R 2.12). As we have seen, scripts are stored in runtime-specific subdirectories of the module. This means that a tool can have different versions for different runtimes, which makes it possible to cater for, e.g., syntax differences between R versions. This functionality is mostly used when a set of scripts is gradually updated to a later R version.
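Since each entry names a runtime and a script, and scripts live in runtime-specific subdirectories, a quick consistency check is to confirm that every listed script exists where its entry says it should. The following sketch illustrates the idea under two assumptions: entries use exactly the one-per-line layout shown earlier, and the script for an entry is located at `<module directory>/<runtime>/<resource>`. The function names `list_tool_paths` and `check_module` are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes one <tool> entry per line, in the exact layout
# <tool runtime="..."><resource>...</resource></tool>.
# Function names are hypothetical, not part of Chipster.

# Print "<runtime>/<script>" for every <tool> entry in a module file.
list_tool_paths() {  # usage: list_tool_paths MODULE_XML
    sed -n 's|.*<tool runtime="\([^"]*\)"><resource>\([^<]*\)</resource></tool>.*|\1/\2|p' "$1"
}

# Report entries whose script file is not found under MODULE_DIR.
check_module() {  # usage: check_module MODULE_DIR MODULE_XML
    list_tool_paths "$2" | while IFS= read -r rel; do
        [ -f "$1/$rel" ] || printf 'missing: %s\n' "$rel"
    done
}
```

For the microarray module this could be run as `check_module /opt/chipster/comp/modules/microarray /opt/chipster/comp/modules/microarray/microarray-module.xml`. No output means every listed script was found.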
The client has a More help button associated with each tool. It takes the user to a tool-specific manual page, assuming the page exists. The manual is hosted on the server, more specifically on the webstart component that also serves the startup page. On the server, switch to the manual directory and list its contents:
cd /opt/chipster/webstart/web-root/manual
ls
You can see a long list of HTML pages. Their names match tool names, but manual pages are not organised into module and runtime hierarchies.
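Because page names follow tool names, it is easy to list R scripts that have no manual page yet. The sketch below shows one way to do that. The function name `missing_manual_pages` is hypothetical, and the check is deliberately simple: it looks at every `.R` file under the modules directory, so utility scripts in the common module, which are not tools, will be reported as well.

```shell
#!/bin/sh
# Sketch only: the function name is hypothetical, not part of Chipster.

# Print the name of the expected manual page for every .R script under
# MODULES_DIR that has no matching HTML file in MANUAL_DIR.
missing_manual_pages() {  # usage: missing_manual_pages MODULES_DIR MANUAL_DIR
    find "$1" -type f -name '*.R' | while IFS= read -r script; do
        page="$(basename "$script" .R).html"
        [ -f "$2/$page" ] || printf '%s\n' "$page"
    done | sort -u
}
```

On a default installation this could be called as `missing_manual_pages /opt/chipster/comp/modules /opt/chipster/webstart/web-root/manual`.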
To create a manual page for a tool saved in foobar.R, create a file called foobar.html and fill it in by e.g. using this template: