Welcome to the GenomeQuest Documentation Wiki

Advanced Syntax for Keyword Search

  • The following table describes the advanced syntax for annotation (keyword) search:


Type | Example | Result | Comment
Basic Expressions | Kinase | Retrieves records containing the word “kinase” | All searches are case-insensitive
Limiting to an annotation field | OS="human" | Retrieves records containing the word “human” in the organism (OS) field | The quotation marks around human are required; omitting them is a syntax error
AND Expressions | protein AND kinase | Retrieves records containing both the word “protein” and the word “kinase”, in any order | Short form: protein kinase
AND Expressions | protein kinase | Retrieves records containing both the word “protein” and the word “kinase”, in any order | Short form of protein AND kinase
OR Expressions | kinase OR protein | Retrieves records containing the word “kinase” or the word “protein” | Short forms: kinase,protein or kinase;protein
OR Expressions | kinase, protein | Retrieves records containing the word “kinase” or the word “protein” | Short form of kinase OR protein
OR Expressions | kinase; protein | Retrieves records containing the word “kinase” or the word “protein” | Another short form of kinase OR protein
NOT Expressions | NOT kinase | Retrieves records not containing the word “kinase” | Short form: !kinase
NOT Expressions | !kinase | Retrieves records not containing the word “kinase” | Short form of NOT kinase
AND NOT | protein AND NOT kinase | Retrieves records containing the word “protein” but not the word “kinase” | Short forms: protein NOT kinase or protein !kinase
AND NOT | protein NOT kinase | Retrieves records containing the word “protein” but not the word “kinase” | Short form of protein AND NOT kinase
AND NOT | protein !kinase | Retrieves records containing the word “protein” but not the word “kinase” | Another short form of protein AND NOT kinase
Wildcard Expressions | antibod* | Returns records with words starting with “antibod” | “*” also matches the empty string
Wildcard Expressions | *ase | Returns records with words ending with “ase” | “*” also matches the empty string
Wildcard Expressions | *zym* | Returns records with words containing “zym” | “*” also matches the empty string
Phrase Searching | 'protein kinase' | Retrieves records containing the word “protein” immediately followed by the word “kinase” | In the record, the words may be separated by any separator (space, tab, new line, dot, comma, hyphen) or any combination of these
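
To illustrate the wildcard and phrase rules above, the following Python sketch mimics them on plain strings. It is not GenomeQuest code: the function names (words, matches_term, matches_phrase) are hypothetical, and treating every non-alphanumeric ASCII character as a word separator is an assumption that goes beyond the separators the table lists.

```python
import re


def words(text):
    """Split text into lowercase words (searches are case-insensitive).

    Assumption of this sketch: any run of characters other than ASCII
    letters and digits acts as a separator.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_term(text, term):
    """True if some word in text matches term; '*' matches any run of
    characters, including the empty string."""
    parts = [re.escape(part) for part in term.lower().split("*")]
    pattern = re.compile(".*".join(parts))
    return any(pattern.fullmatch(word) for word in words(text))


def matches_phrase(text, phrase):
    """True if the words of phrase occur consecutively and in order in
    text, whatever separators lie between them."""
    haystack = words(text)
    needle = words(phrase)
    n = len(needle)
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))


if __name__ == "__main__":
    print(matches_term("Monoclonal antibodies against EGFR", "antibod*"))
    print(matches_term("Hen egg-white lysozyme", "*zym*"))
    print(matches_phrase("serine/threonine protein-kinase", "protein kinase"))
    print(matches_phrase("kinase protein", "protein kinase"))
```

Under these assumptions the first three calls print True and the last prints False, because phrase searching requires the words in the given order while a plain AND search would not.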



Advanced Workarounds

Virtual databases currently have two apparent limitations. Both have workarounds that use the advanced syntax described above.

  1. No multi-select. You cannot currently choose more than one virtual database to search.
    1. Example problem. Suppose you have saved human kinase and human phosphatase as two separate virtual databases derived from RefSeq mRNA. On the sequence search page, you cannot select both of these databases for one search; their contents must first be merged.
    2. Workaround. Run both queries against RefSeq in a single search, using the query string 'human kinase, human phosphatase' (see the query syntax explained above). Then save the resulting entries as a new virtual database.
  2. Cannot combine physical and virtual. You cannot choose both a physical and a virtual database on the sequence search page.
    1. Example problem. You cannot combine a physical database, say GenBank, with a virtual database, say "human kinase from RefSeq".
    2. Workaround. Because virtual databases are often derived from physical reference databases, the meaning of combining the two can be ambiguous. The workaround here, too, is to create a new virtual database with the intended meaning. In this case, open an annotation search on a combination of GenBank and RefSeq and run the query: 'GNAME="GB_PRI" OR (GNAME="RSM" AND ALL="human kinase")'.
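
The two workaround queries above can be assembled programmatically, which may help when many virtual databases need merging. The Python sketch below only builds query strings; the helper names (field, group, all_of, any_of) are hypothetical and are not part of any GenomeQuest API. The field names GNAME and ALL and the values GB_PRI and RSM are taken from the example query above.

```python
def field(name, value):
    """Limit a term to an annotation field, e.g. OS="human"."""
    if '"' in value:
        # Escaping rules are not documented here, so refuse rather than guess.
        raise ValueError("value must not contain a double quote")
    return f'{name}="{value}"'


def group(expr):
    """Wrap a sub-expression in parentheses."""
    return f"({expr})"


def all_of(*exprs):
    """Join sub-expressions with AND."""
    return " AND ".join(exprs)


def any_of(*exprs):
    """Join sub-expressions with OR."""
    return " OR ".join(exprs)


# Workaround 1: merge two saved searches using the comma short form of OR.
merged_virtual = ", ".join(["human kinase", "human phosphatase"])

# Workaround 2: a physical database plus a filtered reference database.
physical_plus_virtual = any_of(
    field("GNAME", "GB_PRI"),
    group(all_of(field("GNAME", "RSM"), field("ALL", "human kinase"))),
)

if __name__ == "__main__":
    print(merged_virtual)
    print(physical_plus_virtual)
```

The resulting strings are meant to be pasted into the annotation search box; saving the hits as a new virtual database remains a manual step.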