Welcome to the GenomeQuest Documentation Wiki

Minimum Server Configuration

This page addresses the system requirements for installing the GenomeQuest system in-house.

If you use the GenomeQuestLive! service, this document is not for you. All you need is a web browser to access the full power of GenomeQuest and its compute cluster.

This document is for companies that wish to install the entire GenomeQuest system in-house, i.e. behind their firewall.

Introduction

GenomeQuest is offered on in-house servers as a web-based client-server application. The computational engine runs on UNIX or Linux servers. The GenomeCast receiver is designed to provide reliable, efficient sequence database transfers over the Internet. Below are the recommended IT requirements for an acceptable installed GenomeQuest experience.

Two hardware configurations are supported. The first is a cluster installation, which should provide at least 2 years of growth based on the current set of databases served through the GenomeCast service and an expected 100-200 NGS runs per year. The second is a single SMP machine, which provides an easy way to get started for evaluation or in smaller environments.

If you are using "My GenomeQuest" for sequence analysis, you do not need this document. This document is meant for people who would like to host GenomeQuest behind their own firewalls.

There are three (3) basic requirements to host GenomeQuest on your servers:

  1. GenomeQuest server - hosts the GenomeQuest engine and web front end. We recommend running this on a cluster, but an SMP configuration is also available.
  2. GenomeCast server - performs nightly reference data updates and refreshes. Typically this is run on the same server as the GenomeQuest software suite.
  3. End user computer - used to access the server(s)

Below are the minimum configuration requirements for each of the above.

GenomeQuest Server

Cluster Configuration

Notes:
  • Compute Clusters must have a dedicated Head node plus a minimum of 8 compute nodes.
Head node
  • At least 8 Cores
  • >2GHz processors (x86_64)
  • 64GB RAM (Minimum). Note: An SMP server should have more RAM
  • 500GB Local disk storage (Minimum) for OS + GenomeQuest software installation
  • Swap space should be 3X system memory size
  • 4 x Gigabit Network (two internal, two external)
Compute Nodes
  • 8 Cores
  • >2GHz processors (x86_64)
  • 16GB RAM (24GB recommended)
  • 1TB local scratch space per node
  • 64GB swap space per node (minimum required)
  • Gigabit Network
Shared Storage: At least 10TB
Operating System: RedHat Enterprise Linux 5.x, CentOS 5.x
Job Manager/Scheduler: Grid Engine 6.x
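
The cluster minimums above can be expressed as a small inventory check. The following is only a sketch: the Node record, its field names, and the function names are hypothetical and not part of GenomeQuest, and 1TB is assumed to mean 1000GB.

```python
from dataclasses import dataclass
from typing import List

MIN_COMPUTE_NODES = 8  # "a minimum of 8 compute nodes"


@dataclass
class Node:
    """Inventory record for one machine (hypothetical structure)."""
    name: str
    cores: int
    ram_gb: float
    local_disk_gb: float
    swap_gb: float


def head_node_problems(node: Node) -> List[str]:
    """Compare a head node against the minimums listed above."""
    problems = []
    if node.cores < 8:
        problems.append(f"{node.name}: fewer than 8 cores")
    if node.ram_gb < 64:
        problems.append(f"{node.name}: less than 64GB RAM")
    if node.local_disk_gb < 500:
        problems.append(f"{node.name}: less than 500GB local disk")
    if node.swap_gb < 3 * node.ram_gb:
        problems.append(f"{node.name}: swap below 3X system memory")
    return problems


def compute_node_problems(node: Node) -> List[str]:
    """Compare a compute node against the minimums listed above."""
    problems = []
    if node.cores < 8:
        problems.append(f"{node.name}: fewer than 8 cores")
    if node.ram_gb < 16:
        problems.append(f"{node.name}: less than 16GB RAM")
    if node.local_disk_gb < 1000:  # assumption: 1TB scratch = 1000GB
        problems.append(f"{node.name}: less than 1TB local scratch")
    if node.swap_gb < 64:
        problems.append(f"{node.name}: less than 64GB swap")
    return problems


def cluster_problems(head: Node, compute_nodes: List[Node]) -> List[str]:
    """Return every shortfall found; an empty list means the minimums are met."""
    problems = head_node_problems(head)
    if len(compute_nodes) < MIN_COMPUTE_NODES:
        problems.append(
            f"only {len(compute_nodes)} compute nodes, need {MIN_COMPUTE_NODES}"
        )
    for node in compute_nodes:
        problems.extend(compute_node_problems(node))
    return problems
```

Processor speed, network interfaces, shared storage, and the operating system are not covered by this sketch and would need separate checks.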

Single SMP Configuration

Single SMP machine
  • At least 8 Cores
  • >2GHz processors (x86_64)
  • 128GB RAM (Minimum)
  • 1TB Local disk storage (Minimum) for OS + GenomeQuest software installation
  • Swap space should be 3X system memory size
  • Gigabit Network
Shared Storage: At least 10TB
Operating System: RedHat Enterprise Linux 5.x, CentOS 5.x
Job Manager/Scheduler: Grid Engine 6.x
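
The swap guideline above (3X system memory) can be checked on a running Linux host. This is a minimal sketch that assumes the usual /proc/meminfo layout with MemTotal and SwapTotal reported in kB; the function names are illustrative only.

```python
import re
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Return {field: value_in_kB} for lines such as 'MemTotal:  1000 kB'."""
    values = {}
    for line in text.splitlines():
        match = re.match(r"(\w+):\s+(\d+)\s*kB", line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def swap_shortfall_kb(meminfo_text: str, factor: int = 3) -> int:
    """Return how many kB of swap are missing to reach factor x MemTotal."""
    info = parse_meminfo(meminfo_text)
    required = factor * info["MemTotal"]
    return max(0, required - info["SwapTotal"])


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as handle:
            print("swap shortfall (kB):", swap_shortfall_kb(handle.read()))
    except (OSError, KeyError):
        print("/proc/meminfo is not available or not in the expected format")
```

A result of 0 means the host already has at least three times its physical memory configured as swap.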

GenomeCast Server

The GenomeCast receiver may be hosted on a dedicated server; however, the standard configuration is to run both GenomeCast and GenomeQuest on the same server. The GenomeCast receiver uses the standard HTTP protocol to update the databases.

Reference Database Disk Requirements

GenomeQuest caches databases on the compute nodes to increase overall system performance, which raises the local disk space requirements on those nodes. The estimated disk space needed to support the GenomeCast reference sequence databases and local database growth is 10 TB of effective storage for the head node and 500GB for the compute nodes. Additional architectural design and performance consulting is available to clients who want to run the GenomeQuest suite on a compute cluster. Please contact your GenomeQuest sales representative for more information.
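
The storage estimates above can be compared against a mounted file system. The sketch below makes several assumptions: decimal units (1 TB = 10**12 bytes), the 500GB figure applied per compute node, and the total size of the file system standing in for "effective storage". The role names and function names are hypothetical.

```python
import shutil

TB = 10**12  # assumption: decimal terabytes
GB = 10**9   # assumption: decimal gigabytes

# Estimates from the paragraph above, keyed by a hypothetical role name.
REQUIRED_BYTES = {"head": 10 * TB, "compute": 500 * GB}


def space_shortfall(total_bytes: int, role: str) -> int:
    """Return how many bytes are missing for the given role (0 if enough)."""
    return max(0, REQUIRED_BYTES[role] - total_bytes)


def check_path(path: str, role: str) -> int:
    """Measure the file system that holds `path` and return its shortfall."""
    return space_shortfall(shutil.disk_usage(path).total, role)


if __name__ == "__main__":
    missing = check_path(".", "compute")
    print(f"compute-node shortfall: {missing / GB:.1f} GB")
```

In practice the path would be the mount point of the shared storage or of the local scratch space rather than the current directory.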

Architecture and Operating System Requirements

These requirements are valid for both GenomeQuest and GenomeCast servers.

Operating System    Architecture
Sun Solaris 10      AMD64, SPARC
LINUX x86_64        AMD64 (Opteron), Intel Xeon or Pentium EM64T. We recommend RedHat Enterprise 5 or higher.
LINUX64             IA64 (Itanium)

Operating System Configuration

  • A dedicated Unix user and group created on the server, e.g. genomequest and genomequest.
  • ulimit. Due to the large number of open files that the GenomeQuest package handles during normal operation, the default ulimit setting for open files on most systems is too low. To increase this setting on a system running RedHat Linux, follow these steps as root (superuser):
    • Edit the /etc/security/limits.conf file
    • Add the following lines to the end of the file:
# Increase the default number of open files
 *		soft	nofile		16384
 *		hard	nofile		16384
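
Settings in limits.conf are applied when a new login session starts, so the effect is best verified from a fresh session of the dedicated user. A minimal sketch using Python's standard resource module (Unix only) is shown here; the function name is illustrative.

```python
import resource

REQUIRED_NOFILE = 16384  # value from the limits.conf lines above


def nofile_ok(soft: int, hard: int, required: int = REQUIRED_NOFILE) -> bool:
    """Return True when both limits are unlimited or at least `required`."""
    def enough(limit: int) -> bool:
        return limit == resource.RLIM_INFINITY or limit >= required
    return enough(soft) and enough(hard)


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    status = "OK" if nofile_ok(soft, hard) else "too low"
    print(f"open files: soft={soft} hard={hard} ({status})")
```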

Additional Server Software Requirements

  • Apache 2.x, running as the dedicated Unix user created for GenomeQuest.
  • Presence of common system libraries, including:
    • libcrypto.so.6
    • libssl.so.6
  • Perl 5.6 with the following modules:
    • Net::SMTP
    • Digest::MD5
    • Cache::File
    • XML::Parser
    • File::NFSLock
    • LWP::UserAgent
    • Time::HiRes
    • Date::Parse
    • Config::General
    • String::ShellQuote
  • ClustalW - a general purpose multiple sequence alignment program for DNA or proteins. Version 2.x can be downloaded from ftp://ftp.ebi.ac.uk/pub/software/clustalw2/. The executable file should be installed to /usr/local/bin/.
  • Certain GenomeQuest workflows require additional third-party software packages to be compiled and installed for the given hardware architecture:
    • newbler
    • ssaha
    • velvet
  • Bash interpreter (shell)
  • MySQL 5.0 – for user data only (sequence/annotation data is stored in the GQ Engine)
    • Requires a dedicated database user created for GenomeQuest, e.g. 'genomequest'. This database user should be able to create, alter, and delete tables, and perform CRUD operations on records.
  • GNU awk 3.1 or higher
  • Lftp 3.0 or higher
  • FTP server - for advanced upload support
  • Email server support (Sendmail 8.11 / Postfix 1.1 or higher)
  • SGE 6 queuing system (http://gridscheduler.sourceforge.net/)
  • SSH or RSH access from head node to compute nodes without password for cluster installations.
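
Part of the list above can be checked with a short script before installation. This is a sketch under stated assumptions: the executable names (clustalw2, gawk, lftp, mysql, qsub) are guesses at how the packages are installed on a given system, the function names are hypothetical, and versions are not verified. The Perl module names are taken from the list.

```python
import shutil
import subprocess
from typing import Callable, Iterable, List

PERL_MODULES = [
    "Net::SMTP", "Digest::MD5", "Cache::File", "XML::Parser",
    "File::NFSLock", "LWP::UserAgent", "Time::HiRes", "Date::Parse",
    "Config::General", "String::ShellQuote",
]

# Assumption: typical executable names; adjust to the local installation.
EXECUTABLES = ["bash", "perl", "gawk", "lftp", "mysql", "qsub", "clustalw2"]


def missing_executables(
    names: Iterable[str], which: Callable = shutil.which
) -> List[str]:
    """Return the names that cannot be found on the PATH."""
    return [name for name in names if which(name) is None]


def perl_has_module(module: str) -> bool:
    """Ask Perl to load the module; exit status 0 means it is installed."""
    try:
        result = subprocess.run(
            ["perl", f"-M{module}", "-e", "1"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:  # perl itself is not installed
        return False
    return result.returncode == 0


def missing_perl_modules(
    modules: Iterable[str], has_module: Callable = perl_has_module
) -> List[str]:
    """Return the modules that Perl fails to load."""
    return [module for module in modules if not has_module(module)]


if __name__ == "__main__":
    print("missing executables:", missing_executables(EXECUTABLES))
    print("missing Perl modules:", missing_perl_modules(PERL_MODULES))
```

The lookup functions are passed in as parameters so the checks can be exercised without the real tools being present.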


The operating system and Additional Software Requirements are minimum requirements. GenomeQuest is upward compatible.

Client Requirements

Windows PC

The client can be a PC running Windows XP or later, with a JavaScript-enabled web browser to access the GenomeQuest software. Currently supported browsers are:

  • Internet Explorer v7 or higher (in IE8 and IE9, the option "Enable XSS filter" should be disabled)
  • Firefox v3.0.6 or higher
  • The client's firewall should allow outbound traffic to the GenomeQuest server on the port the FTP server listens on, e.g. 21. This is required in order to use the Advanced upload option in GenomeQuest.
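
Whether a client can reach the FTP port can be tested with a plain TCP connection. The sketch below uses only Python's standard socket module; the function name and the default of port 21 are illustrative, and the host to test is whatever name your GenomeQuest server uses.

```python
import socket


def port_reachable(host: str, port: int = 21, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or name not resolved
        return False
```

For example, `port_reachable("gq.example.com")` (a hypothetical host name) returns False when the firewall blocks the connection. A successful connection only shows that the FTP control port is open; FTP data transfers may use additional ports.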

JRE 1.6 or higher must be installed on each client machine and integrated into the client web browser.

Mac

Alternatively, the client can be a Mac running Mac OS X 10.6 or higher. Safari (version 4 or higher) is the supported browser.

JRE 1.6 or higher must be installed on each client machine and integrated into the client web browser.
