Vidjil – Server Manual – Installation and Maintenance

Table of Contents

This is the preliminary help of the Vidjil server. This help is intended for server administrators. Users should consult the web application manual.

1 Roadmap: plain installation / Docker containers

There are two ways to install and run a Vidjil server:

  • The plain installation of the server should run on any Linux/Unix server with Nginx (recommanded) or Apache. We provide below detailed instructions for Ubuntu 14.04 LTS. This is now the recommended installation. We use this installation on the public server (https://app.vidjil.org) since October 2014.
  • We are developping Debian packages as well as Docker containers to ease the installation and the maintenance. The Docker containers are currently tested in some partner hospitals, and we intend to release ready-to-use Docker containers in Q3 2017. This will then be the recommanded way to install and use Vidjil.

We recommand to people interested in installing a Vidjil server to wait until Q3 2017 and to use meanwhile the public test server.

2 Requirements

2.1 CPU, RAM

2.1.1 Minimal

The Vidjil algorithm typically uses approx. 1.2GB of RAM to run on a 1GB .fastq and will take approx. 5+ minutes. Therefore in order to process requests from a single user with a few samples, any standard multi-core processor with 2GB RAM will be enough.

2.1.2 Recommended

When choosing hardware for your server it is important to know the scale of usage you require. If you have many users that use the app on a daily basis, you will need to have multiple cores to ensure the worker queues don't build up. One worker will occupy one core completely when running the Vidjil algorithm (which is currently single-threaded).

For reference, here is the current Vidjil setup we used on our public testing server https://app.vidjil.org during two years (40+ users, including 15 regular users):

  • Processor: Quad core Intel 2.4MHz
  • RAM: 16GB

Given that the CPU is quad-core, we have 3 workers for executing Vidjil, keeping always one CPU core dedicated to the web server, even when the workers run at full capacity.

As of the end of 2016, we use for the public server a virtual machine with similar capabilities.

Running other RepSeq programs through the Vidjil server may require additional CPU and RAM.

2.2 Storage

As for many high-throughput sequencing pipeline, disk storage to store input data (.fastq, .fasta, .fastq.gz or .fasta.gz) is now the main constraint in our environment.

Depending on the sequencer, files can weigh several GB. Depending of the number of users, a full installation's total storage should thus be serveral hundred GB, or even several TB. We recommend a RAID setup of at least 2x2TB to allow for user files and at least one backup.

User files (results, annotations) as well as the metadata database are quite smaller (as of the end of 2016, on the public server, 3 GB for all user files of 40+ users). Note that even when the input sequences are deleted, the server is still able to display the results of previous analyses. Moreover, a future release at some point of 2017 will allow to access .fastq files on a mounted filesystem.

2.3 Authentication

The accounts are now local to the Vidjil server. We intend to implement some LDAP access at some point of 2017.

2.4 Network

Once installed, the server can run on a private network. However, the following network access are recommended:

  • outbound access
    • for users: several features using external platforms (IgBlast, IMGT/V-QUEST…)
    • for server mainteners: upgrades and reports to a monitor server
  • inbound access
    • The team in Lille may help local server mainteners in some monitoring, maintenance and upgrade tasks, provided a SSH access can be arranged, possibly over VPN.

3 Installing and running the Vidjil server

These installation instruction are for Ubuntu server 14.04 These instructions are preliminary, other documentation can also be found in dev.org.

3.1 Requirements

apt-get install git
apt-get install g++
apt-get install make
apt-get install unzip
apt-get install enum34

3.2 Vidjil server installation and initialization

Enter in the server/ directory.

If you just want to do some tests without installing a real web server, then launch make install_web2py_standalone. In the other case, launch make install_web2py.

The process for installing Vidjil server together with a real web server will be detailed in the future.

3.3 Detailed manual server installation and browser linking

Requirements: ssh, zip unzip, tar, openssh-server, build-essential, python, python-dev, mysql, python2.5-psycopg2, postfix, wget, python-matplotlib, python-reportlab, python-enum34, mercurial, git

If you want to run Vidjil with an Apache webserver you will also need: apache2, libapache2-mod-wsgi

Or if you want to use Nginx: nginx-full, fcgiwrap

For simplicity this guide will assume you are installing to /home/www-data

Clone https://github.com/vidjil/vidjil.git

Download and unzip web2py. Copy the contents of web2py to the server/web2py folder of you Vidjil installation (in this case /home/www-data/vidjil/server/web2py) and give ownership to www-data:

chown -R www-data:www-data /home/www-data/vidjil

If you are using apache, you can run the following commands to make sure all the apache modules you need are activated:

a2enmod ssl
a2enmod proxy
a2enmod proxy_http
a2enmod headers
a2enmod expires
a2enmod wsgi
a2enmod rewrite  # for 14.04

In order to setup the SSL encryption a key to give to apache. The safest option is to get a certicate from a trusted Certificate Authority, but for testing purposes you can generate your own:

    mkdir /etc/<webserver>/ssl
    openssl genrsa 1024 > /etc/<webserver>/ssl/self_signed.key
    chmod 400 /etc/<webserver>/ssl/self_signed.key
    openssl req -new -x509 -nodes -sha1 -days 365 -key
/etc/<webserver>/ssl/self_signed.key > /etc/apache2/ssl/self_signed.cert
    openssl x509 -noout -fingerprint -text <
/etc/<webserver>/ssl/self_signed.cert > /etc/<webserver>/ssl/self_signed.info

<webserver> should be replaced with the appropriate webserver name (ie. apache2 or nginx)

Given that Vidjil is a two-part application, one that serves routes from a server and one that is served statically, we need to configure the apache to do so. Therefore we tell the apache to:

  • Start web2py as a wsgi daemon (allows apache to serve the application).
  • Reserve two virtual hosts (one to be served with ssl encryption, and one not).
  • We configure the first host to serve static content and prevent overriding by the sever (otherwise all routes are redirected through web2py) and to follow symlinks this allows us to symlink to our browser app in the /var/www directory and keep both parts of Vidjil together.
  • The second is set to use SSL encryption, and only serve very specific folders statically (such as javascript files and images because we don't want to create a controller to serve that kind of data)

you can replace your apache default config with the following (/etc/apache2/sites-available/default.conf - remember to make a backup just in case):

WSGIDaemonProcess web2py user=www-data group=www-data processes=1 threads=1

<VirtualHost *:80>

  DocumentRoot /var/www
  <Directory />
    Options FollowSymLinks
    AllowOverride None
  </Directory>

  <Directory /var/www/>
    Options Indexes FollowSymLinks MultiViews
    AllowOverride all
    Order allow,deny
    allow from all
  </Directory>

  ScriptAlias /cgi/ /usr/lib/cgi-bin/

  <Directory /usr/lib/cgi-bin/>
    Options Indexes FollowSymLinks
    Options +ExecCGI
    #AllowOverride None
    Require all granted
    AddHandler cgi-script cgi pl
  </Directory>

  <Directory /home/www-data/vidjil/browser>
    AllowOverride None
  </Directory>

  CustomLog /var/log/apache2/access.log common
  ErrorLog /var/log/apache2/error.log
</VirtualHost>


<VirtualHost *:443>
  SSLEngine on
  SSLCertificateFile /etc/apache2/ssl/self_signed.cert
  SSLCertificateKeyFile /etc/apache2/ssl/self_signed.key

  WSGIProcessGroup web2py
  WSGIScriptAlias / /home/www-data/vidjil/server/web2py/wsgihandler.py
  WSGIPassAuthorization On

  <Directory /home/www-data/vidjil/server/web2py>
    AllowOverride None
    Require all denied
    <Files wsgihandler.py>
      Require all granted
    </Files>
  </Directory>

  AliasMatch ^/([^/]+)/static/(?:_[\d]+.[\d]+.[\d]+/)?(.*) \
	/home/www-data/vidjil/server/web2py/applications/$1/static/$2

  <Directory /home/www-data/vidjil/server/web2py/applications/*/static/>
    Options -Indexes
    ExpiresActive On
    ExpiresDefault "access plus 1 hour"
    Require all granted
  </Directory>

  CustomLog /var/log/apache2/ssl-access.log common
  ErrorLog /var/log/apache2/error.log
</VirtualHost>

Now we want to activate some more apache mods:

a2ensite default                   # FOR 14.04
a2enmod cgi

Restart the server in order to make sure the config is taken into account.

And create some symlinks to avoid splitting our app:

ln -s /home/www-data/vidjil/browser /var/www/browser
ln -s /home/www-data/vidjil/browser/cgi/align.cgi /usr/lib/cgi-bin
ln -s /home/www-data/vidjil/germline /var/www/germline
ln -s /home/www-data/vidjil/data /var/www/data

If you are using Nginx, the configuration is the following:

server {
    listen 80;
    server_name \$hostname;
    return 301 https://\$hostname$request_uri;

}
server {
        listen 443 default_server ssl;
        server_name     \$hostname;
        ssl_certificate         /etc/nginx/ssl/web2py.crt;
        ssl_certificate_key     /etc/nginx/ssl/web2py.key;
        ssl_prefer_server_ciphers on;
        ssl_session_cache shared:SSL:10m;
        ssl_session_timeout 10m;
        ssl_ciphers ECDHE-RSA-AES256-SHA:DHE-RSA-AES256-SHA:DHE-DSS-AES256-SHA:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA;
        ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
        keepalive_timeout    70;
        location / {
            #uwsgi_pass      127.0.0.1:9001;
            uwsgi_pass      unix:///tmp/web2py.socket;
            include         uwsgi_params;
            uwsgi_param     UWSGI_SCHEME \$scheme;
            uwsgi_param     SERVER_SOFTWARE    nginx/\$nginx_version;
            ###remove the comments to turn on if you want gzip compression of your pages
            # include /etc/nginx/conf.d/web2py/gzip.conf;
            ### end gzip section

            proxy_read_timeout 600;
            client_max_body_size 20G;
            ###
        }
        ## if you serve static files through https, copy here the section
        ## from the previous server instance to manage static files

        location /browser {
            root /home/www-data/vidjil/;
            expires 1h;

            error_page 405 = $uri;
        }

        location /germline {
            root $CWD/../;
            expires 1h;

            error_page 405 = $uri;
        }

        ###to enable correct use of response.static_version
        #location ~* ^/(\w+)/static(?:/_[\d]+\.[\d]+\.[\d]+)?/(.*)$ {
        #    alias /home/www-data/vidjil/server/web2py/applications/\$1/static/\$2;
        #    expires max;
        #}
        ###

        location ~* ^/(\w+)/static/ {
            root /home/www-data/vidjil/server/web2py/applications/;
            expires max;
            ### if you want to use pre-gzipped static files (recommended)
            ### check scripts/zip_static_files.py and remove the comments
            # include /etc/nginx/conf.d/web2py/gzip_static.conf;
            ###
        }

        client_max_body_size 20G;

        location /cgi/ {
            gzip off;
            root  /home/www-data/vidjil/browser/;
            # Fastcgi socket
            fastcgi_pass  unix:/var/run/fcgiwrap.socket;
            # Fastcgi parameters, include the standard ones
            include /etc/nginx/fastcgi_params;
            # Adjust non standard parameters (SCRIPT_FILENAME)
            fastcgi_param SCRIPT_FILENAME  \$document_root\$fastcgi_script_name;
        }

}

We also do not create symlinks since all references are managed correctly.

Now we need to configure the database connection parameters:

  • create a file called conf.js in /home/www-data/vidjil/browser/js containing:
    var config = {
        /*cgi*/
        "cgi_address" : "default",
    
        /*database */
        "use_database" : true,
        "db_address" : "default",
    
        "debug_mode" : false
    }
    

This tells the browser to access the server on the current domain.

  • copy vidjil/server/web2py/applications/vidjil/modules/defs.py.sample to vidjil/server/web2py/applications/vidjil/modules/defs.py and change the value of DBADDRESS to reference your database.

You can now access your app. All that is left to do is click on the init database link above the login page. This creates a default admin user: plop@plop.com and password: 1234 (make sure to remove this user in your production environment) and creates the configurations you can have for files and results.

4 Testing the server

If you develop on the server, or just want to check if everything is ok, you should launch the server tests.

First, you should have a working fuse server by launching make launch_fuse_server (just launch it once, then it is running in the background and can be killed with make kill_fuse_server).

Then you can launch the tests with make unit.

5 Troubleshootings

5.1 Workers seem to be stuck

For some reasons, that are not clear yet, it may happen that workers are not assigned any additional jobs even if they don't have any ongoin jobs.

In such a (rare) case, it may be useful to restart web2py schedulers

initctl restart web2py-scheduler

5.2 Restarting web2py

Just touch the file /etc/uwsgi/web2py.ini.

Another of restarting it is by touching the file server/web2py/applications/vidjil/modules/defs.py. This will tell uwsgi to restart web2py (including the workers).

5.3 Restarting uwsgi

When one modifies an uwsgi config file (usually in /etc/uwsgi directory, it may be necessary to restart uwsgi so that the modifications are taken into account. This can be done using

initctl restart uwsgi-emperor

6 Running the server in a production environment

6.1 Introduction

When manipulating a production environment it is important to take certain precautionnary mesures, in order to ensure production can either be rolled back to a previous version or simply that any encurred loss of data can be retrieved.

Web2py and Vidjil are no exception to this rule.

6.2 Making backups

Performing an Analysis in Vidjil is time-consuming, therefore should the data be lost, valuable man-hours are also lost. In order to prevent this we make regular incremental (?) backups of the data stored on the vidjil servers. This not only applies to the fiels uploaded and created by vidjil, but also to the database.

6.3 Autodelete and Permissions

Web2py has a handy feature called AutoDelete which allows the administrator to state that file reference deletions should be cascaded if no other references to the file exist. When deploying to production one needs to make sure AutoDelete is deactivated. As a second precaution it is also wise to temporarily restrict web2py's access to referenced files.

Taking two mesures to prevent file loss might seem like overkill, but securing data is more important than the small amount of extra time spent putting these mesures into place.

6.4 Deploying the server

Currently deploying changes to production is analogous to merging into the rbx branch and pulling from the server.

Once this has been done, it is important that any database migrations have been applied. This can be verified by refreshing the server (calling a controller) and then looking at the database.

6.5 Step by Step

  • Set AutoDelete to False
  • Check permissions on the uploads folder (set to 100)
    • you can also check the amount of files present at this point for future reference
  • Backup database: Archive old backup.csv and then from admin page: backup db
  • pull rbx (if already merged dev)
  • Check the database (for missing data or to ensure mmigrations have been applied)
  • Check files to ensure no files are missing
  • Reset the folder permissions on uploads (755 seems to be the minimum requirement for web2py)
  • Run unit tests (Simply a precaution: Continuous Integration renders this step redundant but it's better to be sure)
  • Check site functionnality

7 Resetting user passwords

Currently there is not easy way of resetting a user's password. The current method is the following: `cd server/web2py` `python web2py -S vidjil -M` `db.authuser[<user-id].updaterecord(password=CRYPT(key=auth.settings.hmackey)('<password>')1,resetpasswordkey='')`

Footnotes:

1

DEFINITION NOT FOUND.

Created: 2017-03-03 Fri 17:17

Emacs 24.3.1 (Org mode 8.2.4)

Validate