TechnicalManual22

Aleksi Kallio

Technical manual for Chipster 2.2

Note! This is an unmaintained archive site.
Wiki has been moved to Github and current documentation is available at https://github.com/chipster/chipster/wiki

The manual covers Chipster platform version 2.2. It instructs in setting up your own Chipster server, adding your own tools into Chipster, and more. For the user manual, please see http://chipster.csc.fi/manual/.

Introduction

In the basic setup, Chipster is a client-server system. The Chipster server can run on a single server computer or even a laptop. The server itself consists of multiple independent services, so it can also be scaled across a cluster of servers to distribute computational and data transfer load.

The system consists of compute, authentication, management and logging services, and message and file brokers, which act as communication channels between the components.

System installation

Virtual machine installation

The recommended way to get Chipster server running is virtual machine installation. Chipster is packaged as complete virtual machine images that can be deployed to a variety of virtualisation platforms. The images are based on Ubuntu Linux 11.10 (Oneiric).

To use the Chipster virtual machine, you need to:

  1. Install virtualisation software such as VirtualBox or VMware Player
  2. Download Chipster virtual machine
  3. Start the Chipster virtual machine
  4. Start Chipster client

Installing virtualisation software

To run a virtual machine, you need virtualisation software installed on the computer that is going to run the virtual machine. VirtualBox and VMware Player are two common virtualisation products that work with Chipster.

  • VirtualBox for Linux, Mac and Windows, free
  • VMware Player for Linux and Windows, free for personal non-commercial use

Instructions for VirtualBox

Download Chipster virtual machine

Download all the files from VirtualBox directory under the desired version from:

Note that the total size is around 80 GB.

Add Chipster virtual machine to VirtualBox

  • Open VirtualBox
  • Select "Machine"->"Add"
  • Go to the folder where you downloaded Chipster virtual machine files and select chipster.vbox and "Open"

Configure Chipster virtual machine

  • Select "Settings" and "Network" and change "Attached to:" to "Bridged Adapter" (it is NAT by default; VirtualBox NAT does not yet work with Chipster)

Start Chipster virtual machine

  • Select "Start"
    • Note: On Mac, the presence of VMware tools in the image sometimes causes trouble during boot (a freeze with a kernel panic). A reboot solves it. In future versions of the virtual machine, VMware tools will be disabled by default.

Instructions for VMware Player

Download Chipster virtual machine

Download all the files from VMware directory under the desired version from:

Note that the total size is around 80 GB.

Add Chipster virtual machine to VMware Player

  • Run VMware Player
  • Select "Open a virtual machine"
  • Select chipster.vmx and 'Open'

Start Chipster virtual machine

  • Click "Play virtual machine"
  • If you get a notification about missing VMware tools, just ignore it for now.

Start Chipster client

The Chipster server is configured to start when the virtual machine is started. After you have the Chipster virtual machine running, start the Chipster client by pointing your web browser to

http://<hostname or ip address of the virtual machine>:8081

and clicking on the Launch Chipster link. Log in with chipster/chipster. To get started, you can open the example session (link in the Datasets panel).

If you don't know the hostname or IP address of the virtual machine you have started, see the instructions in the next section.

Configuring Chipster

  • Log in to the VM using username: chipster, password: chipster
  • Check the IP address of the VM
    • The IP address is printed in the "message of the day" when you log in
    • Or you can use:

      hostname -I

      or

      ifconfig
  • For convenience, it is recommended to set the keyboard layout and time zone
    • Instructions are printed to "message of the day" when you login
  • Configure Chipster to use the given ip address:

    cd /opt/chipster2;./configure.sh
    
  • You can also try running

    cd /opt/chipster2;./configure.sh auto

    which auto-detects the IP address and uses default values for the other settings
  • Restart Chipster:

    service chipster2 restart
  • Using a web browser go to the Chipster start page:

    http://<vm ip address>:8081
    
  • For administering the OS installation of the virtual machine, chipster account has sudo rights

    • su or sudo rights are not required for running Chipster
    • You can create a separate administration account and remove sudo rights from chipster
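
The address lookup and the start page URL from the steps above can be combined into a small helper. This is a minimal sketch, assuming bash; the function name chipster_start_url is made up for this example, and it simply takes the first address from a space-separated list such as the output of hostname -I.

```shell
#!/usr/bin/env bash
# Sketch: build the Chipster start page URL from a list of addresses.
# The function name is illustrative; port 8081 is the webstart port used above.

# Usage: chipster_start_url "<space-separated addresses>"
chipster_start_url() {
    local first
    # Word-split the argument and keep only the first address.
    read -r first _ <<< "$1"
    if [ -z "$first" ]; then
        echo "no address given" >&2
        return 1
    fi
    echo "http://$first:8081"
}
```

On the virtual machine it could be called as chipster_start_url "$(hostname -I)". If the machine has several network interfaces, the first address is not necessarily the one reachable from your client, and an IPv6 address would additionally need square brackets in the URL.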

System installation in Linux

These are instructions for installation using the automatic tools provided in the installation package.

0) Requirements

The following software needs to be installed:

  • Java 1.6 or later

The majority of the tools also require the correct version of the R statistical environment (see list of corresponding versions).

The following tcp ports need to be open in the firewall:

  • 61616 for message broker service
  • 8080 for file broker service
  • 8081 for webstart service (optional)
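
Once the services are running, one way to confirm that the firewall lets these ports through is to try opening a TCP connection to each of them. The following is a minimal sketch, assuming bash (it relies on bash's /dev/tcp pseudo-device) and the coreutils timeout command; the function names port_state and check_chipster_ports are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: report whether the Chipster TCP ports accept connections.
# Function names are illustrative, not part of Chipster.

# Print "open" if a TCP connection to host:port succeeds within 2 s, else "closed".
port_state() {
    local host=$1 port=$2
    if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
        echo open
    else
        echo closed
    fi
}

# Check the three ports listed above on the given host (default: localhost).
check_chipster_ports() {
    local host=${1:-localhost}
    local entry port name
    for entry in "61616:message broker" "8080:file broker" "8081:webstart"; do
        port=${entry%%:*}
        name=${entry#*:}
        printf '%-5s %-15s %s\n' "$port" "$name" "$(port_state "$host" "$port")"
    done
}
```

Run check_chipster_ports <server hostname> from a client machine rather than on the server itself, because a local check does not pass through the firewall. A "closed" result also appears when the service is simply not running.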

1) Downloading and extracting

Installation packages can be obtained from http://chipster.sourceforge.net/downloads.shtml.

After downloading extract the tar archive. It contains directory "chipster", where all components are in their own subdirectories. It can be placed anywhere, but usually /opt/chipster is used.

Downloading and extraction can be done easily on command line (adjust version number 2.2.0 as needed):

cd /opt
wget http://www.nic.funet.fi/pub/sci/molbio/chipster/dist/versions/2.2.0/chipster-2.2.0.tar.gz
tar -xzf chipster-2.2.0.tar.gz

2) Installing external tools

No external tools are needed to start the server environment, but for the microarray analysis tools to work, R and a collection of libraries are needed. You can skip this step if you just want to get the system running first.

If you have installed R to default location /opt/chipster/tools/R-2.9.0, you can install the R libraries needed by Chipster with the setup tool directly. Otherwise you have to update comp/conf/environment.xml first with the correct location of the R binary. Next run (as root if needed):

./setup.sh

For more information on setup tool see Setup tool section.

The setup tool will print out instructions for carrying out the remaining installation steps for additional tools and databases.
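
Before running setup.sh, it can be useful to check whether R is actually present at the default location mentioned above. This is a minimal sketch; the function name check_r_home is made up for this example, and it assumes the usual R layout in which the executable is at bin/R under the R directory.

```shell
#!/usr/bin/env bash
# Sketch: check whether R is installed where the setup tool expects it.
# Assumes the R executable is at <r_home>/bin/R; the function name is illustrative.

# Usage: check_r_home [r_home]
check_r_home() {
    local r_home=${1:-/opt/chipster/tools/R-2.9.0}
    if [ -x "$r_home/bin/R" ]; then
        echo "R found at $r_home; setup.sh can be run directly"
    else
        echo "R not found at $r_home; update comp/conf/environment.xml first"
        return 1
    fi
}
```

The check only looks for an executable file. It does not verify that the R version matches the one the tools require.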

3) Configuring Chipster services

To configure the Chipster services, run the following two scripts. Both scripts will ask for confirmation before writing changes to files. Defaults should be fine for a local installation.

./configure.sh
./genpasswd.sh

configure.sh configures all the components, and genpasswd.sh generates secure passwords that server components use to authenticate each other.

4) Starting and stopping services

To start all the Chipster services, run:

./chipster start

In addition to start, you can also use stop, restart, and status.

5) Testing installation

To start the client using Java Web Start, go to the Web Start address specified when running the configure.sh. Default address is:

http://hostname-of-this-machine:8081

Note! The Java Web Start server (Jetty) is not bundled with backported versions 1.1.x. You have to set up your own web server for serving Web Start files.

To start the client locally (on the same machine as the services), run:

./client/bin/chipster-client

The default username/password is chipster/chipster. Users can be added by editing the userlist at auth/security/users. Chipster also supports several more advanced authentication providers.

6) Starting services at boot time

The steps needed for making services start at boot time are somewhat system dependent. In most Linux systems two steps are needed:

  • Make link from /etc/init.d/ to the executable of the service, for example /etc/init.d/chipster-auth -> /opt/chipster/auth/bin/chipster-auth.
  • Make links from /etc/rcX.d to the link at /etc/init.d to define the runlevels at which the service is started (typically 3).

You can also control Chipster as a single service:

  • Make link from /etc/init.d/ to the Chipster service script chipster/chipster

In Red Hat Linux chkconfig can take care of creating the runlevel links, and you can use service <service_name> start | stop | status | console to control services.

Please note that brokers must be started before other components can be started. This is taken care of for you if you use the single service option.
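
For the single service option, the two linking steps can be sketched as follows. This is a minimal sketch, assuming SysV-style init directories and runlevel 3; the link name S99chipster and the function name are illustrative choices (the S prefix marks a start script and 99 is an arbitrary ordering number). The function takes an optional root prefix so that it can be tried in a scratch directory first.

```shell
#!/usr/bin/env bash
# Sketch: register the single Chipster service script for boot-time start.
# Assumes SysV-style init directories; "S99chipster" is an illustrative name.

# Usage: install_chipster_boot_links [chipster_home] [root_prefix]
# root_prefix is empty for the real system, or a scratch directory for a dry run.
install_chipster_boot_links() {
    local chipster_home=${1:-/opt/chipster}
    local root=${2:-}
    local runlevel=3   # "typically 3" according to the text above

    mkdir -p "$root/etc/init.d" "$root/etc/rc$runlevel.d" || return 1

    # Step 1: link from /etc/init.d to the Chipster service script.
    ln -sfn "$chipster_home/chipster" "$root/etc/init.d/chipster" || return 1

    # Step 2: link from /etc/rc3.d to the init.d link.
    ln -sfn ../init.d/chipster "$root/etc/rc$runlevel.d/S99chipster" || return 1
}
```

Running it against the real system requires root privileges. On distributions that provide chkconfig or a similar tool, prefer that tool for the runlevel links.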

Tool installation in Linux

One of the key ideas behind Chipster is to take all the high-quality tools in the relevant field of data analysis and integrate them. For the end user, this is great. For the person installing the system, the situation is less ideal. We really wish that a substantial number of quality data analysis algorithms were available in some clean, platform-independent format, so that we could distribute them just like we distribute Chipster itself. That is not the reality, at least not yet, so each analysis application has to be installed the way its original author had in mind.

For these reasons, we strongly recommend you to choose the virtual machine based distribution and that way avoid installing external tools yourself.

Manual tool installation

It is also possible to install external applications and datasets manually. By external applications we mean the computational environment needed to run the Chipster compute service. Chipster itself is plain Java and has no dependencies on external applications other than the Java Runtime Environment. We do package Chipster with Tanuki Software's free Java Service Wrapper for convenience, but using the wrapper is not required. Without the external applications in place, your compute service will boot up but will not be able to run any analysis jobs successfully. If the external applications are partially available, you can use some of the tools.

External dependencies can be divided into 3 layers.

  1. OS level packages
  2. external applications and databases (R and others)
  3. R packages

Layer 1 contains a collection of operating system packages that are required for the applications at layers 2 and 3 to work. Naturally, layer 1 is OS specific, so the packages are installed into OS-specific locations using OS-specific tools (typically apt-get or yum). Layers 2 and 3 are contained in the Chipster tools directory. The most important application at layer 2 is R, as it hosts most of the analysis functionality and is also the basis for layer 3. There are also some simple databases, i.e. plain files, that reside on layer 2. The R-specific layer 3 consists mostly of CRAN and Bioconductor packages, with some additional third-party packages. They are installed using the standard R installation methods and are located in chipster/tools/R-<version>/library. There is a setup tool for installing layer 3 automatically.

The Chipster tool directory, or tool home, is the place to store all external dependencies (except for OS packages). By default it is /opt/chipster/tools. Analysis scripts receive the tool directory path via a variable so that they can access external applications and databases. If you change the tool home, you need to configure the new location in chipster/comp/conf/runtimes.xml.

Up-to-date steps for installing all external applications and datasets can be found from the VM distribution installation script:

<http://code.google.com/p/chipster/source/browse/src/main/admin/vm/install-chipster.sh?name=default>

Follow steps onwards from Install external applications and datasets.

Client installation in Linux

Client installs automatically with Java Web Start.

Installation in Mac OS X

Chipster client is fully Mac OS X compatible and supported on Mac platforms. It installs automatically with Java Web Start.

Chipster server supports Mac OS X. The installation is identical to Linux installation, so please refer there for instructions.

Installation in Windows

Chipster client is fully Windows compatible and supported on Windows platforms. It installs automatically with Java Web Start.

Chipster server has experimental support for Windows. As the bioinformatics tool environment is Unix oriented, doing a complete installation in Windows will require significant efforts.

System administration

Chipster architecture

The shortest description of the Chipster architecture is that it is very flexible. The Chipster environment is based on a message-oriented architecture (also called message passing architecture or message-oriented middleware architecture). Components are connected using a message broker (ActiveMQ). This results in a loosely coupled distributed system. Chipster is designed around the idea of broadcast, which allows components to be unaware of each other. The system also does not depend on the protocol used for communication.

The Chipster environment consists of the following components:

  • message broker (1 to many)
  • file broker (1 to many)
  • authenticator (1)
  • compute server (1 to many)
  • client (many)

All components can be added and removed on the fly. When there are multiple instances of the same component running, no extra configuration is needed because, for example, multiple compute servers (analysers) can function without being aware of each other. This allows the system administrator to add compute servers on the fly when extra processing power is needed, for example during large courses. Currently there can be only one authenticator.

One of the key ideas in designing Chipster architecture was to carefully consider where each bit of the system's state is managed. Chipster client follows fat client paradigm where client is functionally rich. This decision was made to keep server environment simple and lightweight, to reduce number of messages, to distribute processing load (especially data visualisation) to clients and to allow improved user experience as client application is mostly independent of server components. As most of the relevant state has the same lifecycle as one client session, managing state at the client side is also logically a good solution.

Server components explained

Message Broker (ActiveMQ) acts as a central point of the system, passing messages in-between components. ActiveMQ supports broker redundancy for improving scalability and reliability, so multiple brokers can be used simultaneously.

File broker distributes files to other components, acting as a supplement to message broker. File distribution is based on pull mechanism, where components needing data go and retrieve needed files from the file broker. This way compute servers and clients can be behind firewalls. Using separate file broker also allows compute servers to use minimal disk space as files are cached at file server.

Authenticator processes requests from clients. Each request is examined, and if a valid session exists for that client, it is allowed to continue. Otherwise a request is made for the user to authenticate, and after a successful authentication a new session is created. Authenticator supports many types of authentication sources (Unix passwd, JAAS, LDAP...) and can use them simultaneously. Server components authenticate to the broker using server-specific keys and are allowed to communicate directly without going through the authenticator. Authenticator is a separate component so that it can be deployed inside an intranet, as it might need access to sensitive information such as user databases.

Compute service listens for computation requests. When a client initiates a new task, all compute services with free resources reply and the client decides which service gets to process the task. This way there is no single point of failure in the distribution of tasks to the server environment, and compute services can be modified easily on the fly.

Simple server installation

The simple way to install Chipster environment is to deploy all components to a single server and distribute clients by using Java Web Start.

All server components run inside their own directories, so having them on a single server does not require any special arrangements. Message broker and file broker are running in their respective ports, and other components connect to them using local network loopback.

Advanced server installation

A good guideline for setting up an advanced installation is to dedicate an untrusted server to the message broker and file broker components, as they are the only components that have open server ports. That server should not be inside the organisation's firewall, i.e., it should be in a DMZ network. To secure user credentials, the authenticator should be installed separately on a strongly protected machine.

It is possible to deploy multiple compute servers. All of them should have the same tool descriptions, but it is possible to select the enabled tools per server. It is also possible to configure maximum job counts. If you have many nodes available but they also have other uses besides Chipster, it is recommended to deploy compute servers on as many nodes as possible but limit the per-server job count to keep Chipster from hogging all the resources. If there are memory-intensive tools, it might be a good idea to deploy a dedicated node for them with large memory and a low maximum job count. Independent compute services can also be deployed to a batch processing system (LSF etc.), following a worker paradigm.

Chipster and firewalls

One of the design guidelines in Chipster was to make it easily adaptable to various firewall configurations. Even though there are many server components, only the message and file brokers listen on open ports. In other words, they act as a hub to which other components connect. Both components are designed so that they can be installed on an "untrusted" machine located in the DMZ. Compute and authentication services often have to be located inside the intranet, which is not a problem as they do not act as servers from a networking point of view.

Client uses TCP or SSL to connect to message and file brokers. This communication can be configured to ports 80 and 443 to bypass strict firewalls. In some high security environments practically all network access is disabled, except for HTTP using local proxy. Currently Chipster does not use HTTP, so in this extreme case deployment is not possible without changes to firewall configuration. However routing messages through HTTP is supported by ActiveMQ message broker, so in future these scenarios might also be supported directly.

Upgrading server installation

Upgrading VM bundled installation

Chipster VM bundle comes with automatic update tool that allows you to update the installation without downloading everything again. Updates do not happen automatically, but must be initiated manually. Before the update, you should stop Chipster server.

./chipster stop
./update.sh
./chipster start

The system works so that the update.sh script is just a bootstrap script that downloads the actual update script and executes it. This way the update system itself also gets updated when needed.

The actual update script is called update-exec.sh and is located at <http://www.nic.funet.fi/pub/sci/molbio/chipster/dist/virtual_machines/updates/>. When run, update-exec.sh downloads files, unpacks them, moves things around when needed and does other required setup steps.

Chipster update system only concerns the Chipster installation and tool dependencies. You should also take care of keeping the operating system of the VM installation up to date, using normal Debian tools, such as aptitude.

sudo aptitude upgrade

Operating system packages get updated and a reboot might be necessary.

Upgrading other installations

If you installed Chipster yourself, then the automatic update mechanism is not available. The recommended approach is to make a fresh install of Chipster and move relevant functionality over from the previous installation. You should check at least these locations for things to move over:

  • chipster/*/conf/chipster-config.xml - custom configuration
  • chipster/comp/conf/runtimes.xml - custom analysis tool runtimes
  • chipster/comp/modules - custom tool scripts
  • chipster/webstart/web-root/manual - custom manual pages

When Chipster is upgraded, also tool dependencies need updating. For exact details on changes between versions, look at the update-exec.sh script at <http://www.nic.funet.fi/pub/sci/molbio/chipster/dist/virtual_machines/updates/>.

Directory layout

The Chipster directory layout is different on the client and server sides. On the client side, the goal has been to make the placement of files and directories compatible with operating system specific conventions. On the server side, the goal has been to make the layout as coherent as possible (and especially to integrate well with the Java Service Wrapper that wraps all the server components).

Client

Application data (logs, SSL keys, user preferences) is stored in one place and user data (sessions, workflows) in another.

  • Windows
    • Application data stored in Local Settings\Application Data\Chipster inside user's home directory
    • User data stored in My Documents inside user's home directory
  • Mac OS X
    • Application data stored in Library/Application Support/Chipster inside user's home directory
    • User data stored in My Documents inside user's home directory
  • Linux/Unix
    • Application data stored in .chipster inside user's home directory
    • User data stored in home directory, or Document or My Documents inside the home directory if they exist

If operating system is not recognised, we fall back to Linux/Unix. This is because most often esoteric OS's are Unix variants.

Server on Linux

Typically Chipster is installed to /opt/chipster. Inside the installation directory there is a shared directory and several independent component directories (that depend on the shared directory). The contents of the shared directory are given below.

* chipster/shared
  * bin - generic executable files
  * lib - Java JAR and platform specific libraries
  * lib-src - source codes for libraries that require source code to be distributed together (LGPL)

All of the component directories follow the same basic layout. The contents of the components directories are given below. "Wrapper" means here Java Service Wrapper that is bundled with Chipster server installation.

* chipster/<component name>
  * bin - executable files and utility scripts
    * chipster-<component name> - main executable script (use this)
    * linux-x86-<32 | 64> - platform specific executables
      * chipster-<component name> - platform specific executable script
      * wrapper - wrapper binary
  * logs - log files for wrapper (console output) and Chipster itself
    * wrapper.log
    * chipster.log
    * messages.log
    * jobs.log
    * security.log
    * status.log
  * security - files related to encryption (and authentication on authentication service)
    * keystore.ks - automatically generated dummy key for SSL
    * users - flat file user database
  * conf - component's configuration
    * chipster-config.xml - main Chipster configuration
    * wrapper.conf - wrapper configuration
    * jaas.config - JAAS authenticator configuration
    * runtimes.xml - compute service runtime environments' configuration (compute service)
    * environment.xml - description of tool runtime environment (compute service)
  * file-root - www-root of file cache (file broker)
  * web-root - www-root of Web Start files (webstart service)
  * jobs-data - working directory for jobs (compute service)
  * modules - directory containing analysis tools (compute service)
    * microarray - microarray tools, in tool type specific subdirectories
       * R-<version>
       * bsh
       * java
       * microarray-module.xml - tool configuration for this module
    * ngs - NGS tools, in tool type specific subdirectories
       * R-<version>
       * java
       * ngs-module.xml - tool configuration for this module
    * sequence - sequence analysis tools, in tool type specific subdirectories
       * shell
       * sequence-module.xml - tool configuration for this module
    * <third party modules>
  * database - monitoring database (manager)
  * database-backups - backups for monitoring database (manager)

ActiveMQ uses its own directory layout. See the ActiveMQ documentation for more information.

Configuration system

Configuring Chipster

If you just want to get your Chipster up and running, execute the configure.sh script and you're done! If you want to know more about the Chipster configuration system, then read on.

Chipster stores application configuration in a file called chipster-config.xml. It is either located in a conf subdirectory (see directory layout) or loaded dynamically via URL. The former approach is meant for server components and the latter for clients starting over Java Web Start. The configuration file is no longer created automatically, but it must always exist (locally or behind a URL).

The configuration is loaded in two steps. First an internal default configuration is loaded (chipster-config-specification.xml, located inside the Chipster JAR) and then the normal configuration file chipster-config.xml. The latter contains only information that needs to be set per instance basis, so it is quite minimalistic. However it is possible to overwrite configuration entries of the internal default configuration using the normal configuration file. Just include the entry in the file and it will replace the default one.

The recommended way to configure a new Chipster instance is to use the configure.sh script located at the installation root directory. It will configure all the components and the Web Start client descriptor. You can also modify the configuration files manually. For information on meaning of the different configuration entries, please refer to http://code.google.com/p/chipster/sou.../chipster-config-specification.xml in the code repository.

Loading configuration over URL

Each Chipster component (client, analysis server, file broker etc.) has its own configuration file. If the configuration file is not explicitly specified, chipster-config.xml is used. Configuration can be loaded over URL by passing the argument -config <url> at component startup. You can also specify a local file (e.g. -config file:/path/to/config.xml). For Web Start clients, the configuration file can be set in the chipster.jnlp descriptor file. This mechanism allows the configuration (such as the address of the broker server) to be managed centrally.

The configuration file

The configuration file chipster-config.xml contains all the configuration that different components require. See below for an example configuration file of a file broker component.

<configuration content-version="3">

    <configuration-module moduleId="messaging">

        <entry entryKey="broker-host">
            <value></value>
        </entry>

        <entry entryKey="broker-protocol">
            <value></value>
        </entry>

        <entry entryKey="broker-port">
            <value></value>
        </entry>

    </configuration-module>

    <configuration-module moduleId="security">

        <entry entryKey="username">
            <value>filebroker</value>
        </entry>

        <entry entryKey="password">
            <value>filebroker</value>
        </entry>

    </configuration-module>

    <configuration-module moduleId="filebroker">

        <entry entryKey="url">
            <value>http://chipster.example.com:8080</value>
        </entry>

        <entry entryKey="port">
            <value>8080</value>
        </entry>

    </configuration-module>

</configuration>

The file contains several modules (XML element configuration-module), and the selection of modules varies between different components. Modules security and messaging are related to how Chipster node connects to messaging fabric and are always required. Additionally, there are node specific modules, such as filebroker in the example.

Inside the module, there are configuration entries (XML element entry). Every entry has a key (XML attribute entryKey) and it contains one or more values (XML element value).
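
The module / entry / value structure described above can be queried from the command line. The following is a minimal sketch, assuming awk is available and that the file is laid out like the example, with each opening tag on one line; it is a line-oriented scan, not a real XML parser, and the function name chipster_config_value is made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: print the value(s) of one entry in one module of chipster-config.xml.
# Line-oriented scan that assumes one opening tag per line; the name is illustrative.

# Usage: chipster_config_value <config_file> <moduleId> <entryKey>
chipster_config_value() {
    awk -v module="$2" -v key="$3" '
        # Track whether we are inside the requested configuration-module.
        /<configuration-module / { in_module = (index($0, "moduleId=\"" module "\"") > 0) }
        # Track whether we are inside the requested entry of that module.
        /<entry /                { in_entry = in_module && (index($0, "entryKey=\"" key "\"") > 0) }
        # Print the text between <value> and </value>, one line per value.
        in_entry && /<value>/ {
            sub(/^.*<value>/, "")
            sub(/<\/value>.*$/, "")
            print
        }
        /<\/entry>/                { in_entry = 0 }
        /<\/configuration-module>/ { in_module = 0 }
    ' "$1"
}
```

With the example file above, chipster_config_value chipster-config.xml filebroker port would print the file broker port. Because an entry can hold more than one value, the function prints one line per value. An entry whose value is empty prints an empty line.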

Programming API

Configuration can be accessed programmatically as shown below.

DirectoryLayout.initialiseServerLayout(Arrays.asList(new String[] {}));
Configuration configuration = DirectoryLayout.getInstance().getConfiguration();

First, the directory layout must be initialised. Here we initialise the server layout and do not specify any node-specific configuration modules that need to exist. Next we fetch a fi.csc.microarray.config.Configuration object that can be used to read configuration modules and entries.

Secure communications

Setting up SSL

By default Chipster server installation uses plain TCP for communication. Setting up SSL is not trivial when using Java's default implementation, so it is not done by default. However here you'll find instructions on how to do it.

Step 1. Locate keystore

You can either use the keystore that is bundled with Chipster clients or generate your own (see Generating SSL keys). Save it to the file keystore.ks.

Step 2. Configure message broker

You need to:

  • copy keystore.ks to chipster/activemq/conf
  • open chipster/activemq/bin/<platform>/wrapper.conf, uncomment and edit the following settings
    • javax.net.ssl.keyStorePassword=microarray (or whatever you have used)
    • javax.net.ssl.keyStore=%ACTIVEMQ_BASE%/conf/keystore.ks
  • open chipster/activemq/conf/activemq.xml and change protocol to "ssl" (you can change port also)

Step 3. Configure Chipster components

For each of the server components, you need to:

  • copy keystore.ks to chipster/<component>/security
  • open chipster/<component>/conf/chipster-config.xml and in module "messaging" change protocol to "ssl" (you can change port also)

That's it. You also need to change the settings in the module "security" if you have used values other than the defaults; see Generating SSL keys for more details.

Generating SSL keys

Chipster comes with a dummy keystore that gets you going with SSL. If you want to use SSL not only for encrypting communication but also for establishing trust between server components and clients, you have to replace these publicly available keys with your own. Chipster uses Java's normal SSL implementation. The keystore can be manipulated as explained in Security documentation, so you can also use your existing keys.

Here we describe how you can generate your own SSL keys. Please note that these keys will not be approved by any Certificate Authority, and cause warnings if used outside of Chipster environment.

Step 1. Generate a new keystore

Keys can be generated using Java's keytool application.

Generate key using keytool:

keytool -genkey -alias your_key_alias -dname "cn=Your name or organisation, ou=Your name or organisation, o=Your name or organisation, c=your_country_code" -validity 1800 -keyalg RSA -keystore keystore.ks

keytool will ask for your keystore password (twice). You can choose any name (alias) for the key and you can use any password you want. The dummy keystore uses "client" as key alias and "microarray" as keystore password.

Next we need to set up trust for the newly generated key. It is done by exporting and importing the certificate.

keytool -exportcert -alias your_key_alias -file cert -keystore keystore.ks
keytool -importcert -alias your_trusted_key_alias -file cert -keystore keystore.ks

You can choose any name (alias) for the trusted key. The dummy keystore uses "microarray" and that is also the default in Chipster SSL configuration.
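Before distributing the keystore, it can be worth checking that both entries are actually present. The sketch below relies on keytool -list returning a non-zero exit status when the requested alias does not exist; the function names are hypothetical. Passing the password with -storepass makes it visible in the process list, so on a shared machine it is safer to leave that option out and let keytool prompt for it.

```shell
#!/bin/sh
# Succeeds only if the given alias exists in the keystore.
# keytool -list exits with a non-zero status when the alias is missing
# or the keystore cannot be opened.
keystore_has_alias() {
    keytool -list -keystore "$1" -storepass "$2" -alias "$3" >/dev/null 2>&1
}

# Check both the private key entry and the trusted certificate entry.
check_keystore() {
    keystore=$1; password=$2; key_alias=$3; trusted_alias=$4
    status=0
    for entry in "$key_alias" "$trusted_alias"; do
        if keystore_has_alias "$keystore" "$password" "$entry"; then
            echo "found alias: $entry"
        else
            echo "missing alias: $entry" >&2
            status=1
        fi
    done
    return $status
}
```

For a keystore created with the commands above, the call would be check_keystore keystore.ks your_password your_key_alias your_trusted_key_alias.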

Now we have set up another dummy keystore. To actually set up trust between communication endpoints, read the next step.

Step 2. Distribute keystore

Each Chipster component has a subdirectory "security" where the keystore is stored in the file keystore.ks; the message broker stores its keystore in the "conf" subdirectory. You can replace these files with your newly generated keystore. If you wish to establish trust between different Chipster components, you should generate at least two dedicated keys: one for clients and one for server components. You might also generate a dedicated key for each server component.

Step 3. Update configuration

After deploying a new keystore, you have to configure the components to use it. If you used the default trusted key alias and keystore password, no changes are required. Keystore-related settings are located in the configuration module "security" in each component's chipster-config.xml.

<configuration-module moduleId="security" description="encryption and authentication">
  <entry entryKey="keystore" type="string" description="keystore file for SSL">
    <value>${chipster_security_dir}/keystore.ks</value>
  </entry>

  <entry entryKey="keypass" type="string" description="keystore password for SSL">
    <value>microarray</value>
  </entry>

  <entry entryKey="keyalias" type="string" description="alias of key to be used for SSL">
    <value>microarray</value>
  </entry>

        ...

The default configuration does not have SSL-specific settings, so you need to add those entries. You should update the values of "keypass" and "keyalias" to reflect the appropriate settings for each component. The key alias refers to the trusted key, not the private key. The alias of the private key does not need to be configured, but the key must still be in the keystore. You can also change the keystore path if you don't wish to store the keystore inside the "security" directory.

Authentication

Users file

The simplest supported authentication mechanism is the users file in auth/security/users. The format is:

<username>:<password>:<exp. date as YYYY-MM-DD>:<comment>

Only username and password are required. Blank lines and comment lines starting with # are allowed.
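A users file can be sanity-checked before it is deployed. The following sketch is not part of Chipster; check_users_file is a hypothetical helper that applies the format rules above. It assumes that passwords do not contain a colon and that an account counts as expired when its date is strictly before today; how the authentication service treats these cases is not specified here.

```shell
#!/bin/sh
# Print one line for every problem found in a users file.
# Fields: username:password:expiration date (YYYY-MM-DD):comment
check_users_file() {
    users_file=$1
    today=${2:-$(date +%Y-%m-%d)}      # optional override, handy for testing
    awk -F: -v today="$today" '
        /^[ \t]*$/ || /^#/ { next }    # blank and comment lines are allowed
        $1 == "" || $2 == "" {
            printf "line %d: username and password are required\n", NR
            next
        }
        $3 != "" && $3 !~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ {
            printf "line %d: bad expiration date \"%s\"\n", NR, $3
            next
        }
        # Dates in YYYY-MM-DD form compare correctly as plain strings.
        $3 != "" && $3 < today {
            printf "line %d: account \"%s\" expired on %s\n", NR, $1, $3
        }
    ' "$users_file"
}
```

Running check_users_file auth/security/users prints nothing when every line is well-formed and no account has expired.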

LDAP

See Authentication via LDAP.

Tool development

Writing Chipster tools

Basically, you have to do three things:

  • provide the tool itself (command line executable, R script, Java class etc.)
  • write a tool description in SADL format (see SADLFormat), so that the tool can be run and shown in the client application
  • make the compute service aware of the tool

You should also follow the conventions for Chipster analysis tools.

Adding and modifying tools

Chipster tools are divided into modules. Modules are high-level packages that cover some area of data analysis, such as next generation sequencing. On the compute server, modules are stored in the chipster/comp/modules directory. Each module has its own subdirectory, where the tools are located in tool type specific subdirectories. Tools can be R scripts, BeanShell scripts, header stubs that define how command line tools are invoked, and so on. Besides the tools themselves, each module has a configuration file <module name>-module.xml that lists all the tools, maps them to runtimes (configured at compute service level) and gives tool specific parameters, if needed.

To get started, have a look at the modules directory. Changes to tool files are detected dynamically, so you can make a change and see what happens when you run the tool through the client. Changes to tool code do not require a restart, which allows you to write and test tools simultaneously. However, please note that changes to tool headers and module configuration files require restarting both the client and the compute service.

Writing SADL header

SADL (Simple Analysis Description Language) is a simple notation for describing analysis tools so that they can be used in the Chipster 2 framework. SADL describes what input files the tool takes, what output files it produces, and what parameters are needed for running it. For the syntax of SADL, please see SADLFormat.

Making R scripts Chipster compatible

Chipster uses regular R scripts. The only thing to remember is that interactive functions cannot be used.

Before running the script, the system runs the following initialisation snippet:

setwd(".")

The script should write its results in table format to a file specified in the description header, for example like this:

write.table(mytable, file="results.txt", quote=FALSE, col.names=FALSE, row.names=FALSE)

Tool conventions

The goal in Chipster is always to produce a coherent user experience. Here are some conventions that are useful when integrating tools into Chipster and that should be followed when writing tools intended for the main Chipster repository.

NGS analysis module

  • Tools should accept and produce read data in FASTQ and BAM format when possible

Microarray analysis module

  • The default data format is TSV (tab separated values), with one row for each gene or probeset
  • The first column should be unnamed or "identifier" and contain the gene/probeset name
  • Tools should not remove any existing columns unless the row structure is changed. In other words, inputs can carry annotation and other data that simply passes through the analysis steps
  • See AnalysisToolInputsAndOutputs for more information

Sequence analysis module (Embster)

  • Follow EMBOSS conventions

FAQ

Q: Chipster seems to ignore Java proxy settings and our firewall allows connections only through proxy.
A: By default Chipster ignores proxy settings and always uses a direct connection. It is possible to disable the override and make Chipster use the Java proxy settings. In chipster-config.xml, add the following under the module "messaging":

<entry entryKey="disable-proxy" type="boolean" description="should we ignore Java proxy settings and connect directly">
<value>false</value>
</entry>

The change needs to be made to the chipster-config.xml of the clients. In normal setups it is served by the webstart server and takes effect when clients are restarted.

Q: Client application fails to start with UnknownHostException.
A: This typically happens when you are running a Linux workstation (say "foobar") and startup fails with "fi.csc.microarray.MicroarrayException: could not connect to message broker at ssl://chipster.csc.fi:61617 (Could not connect to broker URL: ssl://chipster.csc.fi:61617. Reason: java.net.UnknownHostException: foobar: foobar)". The problem is that the hostname of your workstation cannot be resolved (Java SSL requires that hostnames can be resolved for both endpoints). Try "host foobar" in a shell. If it says "host not found", the hostname is not resolvable in your network. You can add "foobar" to your /etc/hosts after localhost, like "127.0.0.1 localhost foobar", and it should work. You can also contact your system administrator to find out why your hostname cannot be resolved.
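The check can also be scripted. The sketch below uses getent instead of host, because getent goes through the system resolver and therefore also sees entries in /etc/hosts, whereas host queries DNS only. The function name hostname_resolves is hypothetical, and getent is assumed to be available, as it is on typical Linux systems.

```shell
#!/bin/sh
# Succeeds if the given host name resolves through the system resolver,
# which includes /etc/hosts. Defaults to this machine's own host name.
hostname_resolves() {
    name=${1:-$(hostname)}
    getent hosts "$name" >/dev/null 2>&1
}
```

Calling hostname_resolves without arguments after editing /etc/hosts shows from its exit status whether the fix took effect.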

Q: Starting Chipster server environment results in: "Could not detect hardware architecture, please set platform manually."
A: If the hardware architecture is not detected automatically, it can be set manually by editing all instances of chipster-generic.sh. The architecture is configured by changing the PLATFORM line to match your hardware (see the comment above the line for options).

Q: I get "RSA premaster secret error" when trying to run Chipster server.
A: Some JREs are not bundled with the complete security files that Chipster needs for SSL. Installing the "Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files" should fix it. They can be installed using your system's package manager (if available there) or from http://java.sun.com/javase/downloads/index_jdk5.jsp.

Q: Attempts to start client always end with: "fi.csc.microarray.MicroarrayException: could not connect to message broker at ssl://chipster.csc.fi:61617 (Could not connect to broker URL: ssl://chipster.csc.fi:61617. Reason: java.net.ConnectException: Connection timed out: connect)".
A: If the broker is running properly, the most likely reason is a firewall blocking communication between the servers and the client. In its default configuration, Chipster needs port 61616 (TCP) or 61617 (SSL) for messaging and port 8080 (HTTP) for file transfers, so the firewall must allow these. Also make sure that Java is not configured to use a non-compliant proxy server for HTTP.
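To find out which of these ports are reachable from a client machine, a quick probe can help. This is a minimal sketch with hypothetical function names; it assumes bash and the coreutils timeout command are installed. It only tests whether a TCP connection can be opened, not whether the SSL handshake or the Chipster service behind the port works.

```shell
#!/bin/sh
# Succeeds if a TCP connection to host:port can be opened within the
# timeout (default 5 seconds). Relies on bash's /dev/tcp redirection.
port_is_open() {
    local host=$1 port=$2 seconds=${3:-5}
    timeout "$seconds" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Probe the default messaging and file transfer ports on a server.
check_chipster_ports() {
    local host=$1 port
    for port in 61616 61617 8080; do
        if port_is_open "$host" "$port"; then
            echo "$port open"
        else
            echo "$port unreachable"
        fi
    done
}
```

For the server in the error message above, this would be run as check_chipster_ports chipster.csc.fi.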


Related

Wiki: ChipsterToolConventions
Wiki: ChipsterVsRVersions
Wiki: DirectoryLayout
Wiki: LDAP
Wiki: Main_Page
Wiki: SADLFormat