GQ Gene

    GQgene databases (Sep 18, 2009)



The suite of GQgene databases aims to provide, for each supported species, a full description of the genome in terms of genes, transcripts and associated proteins. To achieve this goal, a GQgene database is ultimately composed of four types of resources:

  • a set of gene models (one per gene), each aggregating all the biological objects related to that gene (splice variants with their exons and introns, CDS, proteins) together with their respective 'mapping' onto a natural reference (the genomic sequence for the gene itself and its transcripts, the transcript for the CDS, ...). These models are encoded using an XML framework and are used by the GQgene viewer to display a graphical representation of the gene and all its 'by-products';
  • a reference genomic sequence database containing the genomic sequences (from fully assembled chromosomes to contigs or BACs);
  • a reference transcript sequence database;
  • a reference peptide sequence database.
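
The gene model described in the first item can be pictured as a small hierarchy of objects, each carrying coordinates on its own reference. The sketch below is only an illustration in Python: the class names (Span, Transcript, GeneModel), the field names and the sample identifiers are hypothetical, and it does not reproduce the actual GQgene XML schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """Inclusive, 1-based interval on a named reference sequence."""
    reference: str
    start: int
    end: int

    def length(self) -> int:
        return self.end - self.start + 1


@dataclass
class Transcript:
    transcript_id: str
    exons: List[Span]                  # mapped on the genomic reference
    cds: Optional[Span] = None         # mapped on the transcript itself
    protein_id: Optional[str] = None   # key into a peptide sequence database

    def introns(self) -> List[Span]:
        """Derive introns as the gaps between consecutive exons."""
        ordered = sorted(self.exons, key=lambda exon: exon.start)
        return [
            Span(a.reference, a.end + 1, b.start - 1)
            for a, b in zip(ordered, ordered[1:])
            if b.start > a.end + 1
        ]


@dataclass
class GeneModel:
    gene_id: str
    locus: Span                        # mapped on the genomic reference
    transcripts: List[Transcript] = field(default_factory=list)


# Hypothetical gene with a single two-exon splice variant.
model = GeneModel(
    gene_id="gene1",
    locus=Span("chr1", 100, 900),
    transcripts=[
        Transcript(
            transcript_id="gene1.t1",
            exons=[Span("chr1", 100, 300), Span("chr1", 500, 900)],
            cds=Span("gene1.t1", 51, 350),
            protein_id="gene1.p1",
        )
    ],
)

print(model.transcripts[0].introns())
```

Note that the CDS span names the transcript, not the chromosome, as its reference, which mirrors the 'natural reference' idea above.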

Build methodology

When available and sufficiently 'complete', EntrezGene from NCBI is by far the preferred starting source. In that case, gene models are extracted directly from the EntrezGene database, and reference sequences are looked up in the corresponding RefSeq/GenBank divisions. When NCBI is not considered 'mature' enough in terms of coverage for a species, an alternate source is sought and used instead, e.g. an international consortium dedicated to sequencing that species. In that case, gene models are generally extracted from a so-called GFF(*) model, and the associated sequences are taken from the same site.

(*): Other gene metamodels exist, such as the TIGR XML format, but they are not currently supported.
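
As an illustration of the GFF route, the following Python sketch groups GFF features into per-gene models. It is not the GQgene build code: it assumes GFF3-style input (nine tab-separated columns, with ID and Parent attributes), assumes that parent features are listed before their children, and uses hypothetical function names and sample records.

```python
from collections import defaultdict


def parse_attributes(column):
    """Split a GFF3 attribute column ('ID=x;Parent=y') into a dict."""
    pairs = (item.split("=", 1) for item in column.strip().split(";") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def group_by_gene(lines):
    """Return {gene_id: {transcript_id: [(feature_type, start, end), ...]}}."""
    transcript_gene = {}                       # transcript ID -> owning gene ID
    genes = defaultdict(lambda: defaultdict(list))
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue                           # skip blank lines and directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue                           # not a feature line
        ftype, start, end = cols[2], int(cols[3]), int(cols[4])
        attrs = parse_attributes(cols[8])
        if ftype == "gene":
            _ = genes[attrs["ID"]]             # register the gene itself
        elif ftype == "mRNA":
            transcript_gene[attrs["ID"]] = attrs["Parent"]
            _ = genes[attrs["Parent"]][attrs["ID"]]
        elif ftype in ("exon", "CDS"):
            # A feature may list several parent transcripts, comma-separated.
            for parent in attrs.get("Parent", "").split(","):
                gene_id = transcript_gene.get(parent)
                if gene_id is not None:
                    genes[gene_id][parent].append((ftype, start, end))
    return genes


# Hypothetical records, for illustration only.
sample = [
    "##gff-version 3",
    "chr1\tdemo\tgene\t100\t900\t.\t+\t.\tID=gene1",
    "chr1\tdemo\tmRNA\t100\t900\t.\t+\t.\tID=gene1.t1;Parent=gene1",
    "chr1\tdemo\texon\t100\t300\t.\t+\t.\tID=e1;Parent=gene1.t1",
    "chr1\tdemo\texon\t500\t900\t.\t+\t.\tID=e2;Parent=gene1.t1",
    "chr1\tdemo\tCDS\t150\t300\t.\t+\t0\tID=c1;Parent=gene1.t1",
]
models = group_by_gene(sample)
print(dict(models["gene1"]))
```

Files that list children before parents, or that use the older GFF2 attribute syntax, would need a two-pass or differently parsed variant of this sketch.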


Species               Source             Site (ftp)
Arabidopsis_thaliana  NCBI
Glycine_max           phytozome.net      ftp://ftp.jgi-psf.org/pub/JGI_data/Glycine_max/Glyma1/
Homo_sapiens          NCBI
Mus_musculus          NCBI
Rattus_norvegicus     NCBI
Oryza_sativa          NCBI
Sorghum_bicolor       phytozome.net      ftp://ftp.jgi-psf.org/pub/JGI_data/Sorghum_bicolor/(v(\d+)\.(\d+))/
Zea_mays              maizesequence.org  ftp://ftp.maizesequence.org/pub/maize/release-4a.53/sequences/
                                         ftp://ftp:agiftpguest@ftp.genome.arizona.edu/pub/fpc/maize/maize_pseudo.tar.gz
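
The Sorghum_bicolor site ends with the pattern (v(\d+)\.(\d+)) rather than a fixed directory name, which suggests that the release directory is selected by its version number. How GQgene actually uses this pattern is not stated here; the Python sketch below shows one plausible reading, picking the highest version from a directory listing. The function name and the listing are hypothetical.

```python
import re

# Same pattern as in the table, anchored, with an optional trailing slash.
VERSION_DIR = re.compile(r"^(v(\d+)\.(\d+))/?$")


def latest_version(names):
    """Return the directory name with the highest (major, minor) version, or None."""
    best = None
    for name in names:
        match = VERSION_DIR.match(name)
        if match is None:
            continue                            # not a versioned directory
        key = (int(match.group(2)), int(match.group(3)))
        if best is None or key > best[0]:
            best = (key, match.group(1))
    return None if best is None else best[1]


listing = ["README", "v1.0", "v1.4", "v1.10/"]  # hypothetical directory listing
print(latest_version(listing))                  # prints v1.10
```

Comparing the captured numbers as integers, not as strings, is what makes v1.10 rank above v1.4.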

GQ Gene collection

The GQ Gene collection currently consists of the following databases:

  • GQ Gene for Arabidopsis_thaliana
  • GQ Gene for Glycine_max
  • GQ Gene for Homo_sapiens
  • GQ Gene for Mus_musculus
  • GQ Gene for Oryza_sativa
  • GQ Gene for Rattus_norvegicus
  • GQ Gene for Sorghum_bicolor
  • GQ Gene for Zea_mays
  • GQ Gene Transcripts for Arabidopsis_thaliana
  • GQ Gene Transcripts for Glycine_max
  • GQ Gene Transcripts for Homo_sapiens
  • GQ Gene Transcripts for Mus_musculus
  • GQ Gene Transcripts for Oryza_sativa
  • GQ Gene Transcripts for Rattus_norvegicus
  • GQ Gene Transcripts for Sorghum_bicolor
  • GQ Gene Transcripts for Zea_mays
  • Glycine max genomic
  • Glycine max mRNA
  • Sorghum bicolor genomic
