geneSmash API Documentation

geneSmash is a mash-up of various sources of information about human genes. The primary sources at the time of this writing are:

  1. The gene_info file from the NCBI Entrez gene FTP site.
  2. The gene2unigene file from the NCBI Entrez gene FTP site.
  3. The refFlat.txt file from the UCSC Genome Browser.
  4. The hsa.gff file from miRBase.
  5. Human gene expression array annotation information extracted from the manufacturers' (Affymetrix, Agilent, and Illumina) websites.
    Probe annotation information for various human gene expression array platforms from these manufacturers is currently available in geneSmash.

    Note: Only microarray probes associated with an NCBI Entrez Gene are included in the geneSmash database.

Other sources may be incorporated in the future. These sources of information have been combined into a simple CouchDB database. As a consequence, we can build tools that make it possible to find the genomic location of a gene from its symbol, or to map easily between other classes of gene identifiers.

Input and Output

All requests to read data or query results from the geneSmash web service are RESTful HTTP calls. All results are returned in JavaScript Object Notation (JSON).

geneSmash follows the usual CouchDB conventions. Each object has a unique identifier, which in this case is the Entrez gene ID. For example, the Entrez gene ID of the p53 gene is "7157". To get the JSON representation of the CouchDB document for the p53 gene, send an HTTP GET request to the following URI:

http://app1.bioinformatics.mdanderson.org/genesmash/7157
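
As a minimal sketch of the request above, the following Python builds the document URI from an Entrez gene ID and, when called, fetches and decodes the JSON. The helper names `document_url` and `fetch_document` are illustrative and not part of geneSmash, and the fetch assumes the server is reachable and responds as described.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base URI taken from the example above.
BASE_URL = "http://app1.bioinformatics.mdanderson.org/genesmash"


def document_url(entrez_id):
    """Return the URI of the CouchDB document for an Entrez gene ID."""
    # Hypothetical helper: percent-encode the ID so it is safe in a URL path.
    return f"{BASE_URL}/{quote(str(entrez_id), safe='')}"


def fetch_document(entrez_id, timeout=10):
    """Send an HTTP GET request and decode the JSON body into a dict."""
    with urlopen(document_url(entrez_id), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (requires network access to the geneSmash server):
# doc = fetch_document("7157")
```

Keeping the URI construction separate from the network call makes it possible to check the URI without contacting the server.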

Views

CouchDB (and thus geneSmash) queries are also known as views. The available views define the main API. In the current version, all views are contained in the "basic" design document. You can get a copy of the design document by sending an HTTP GET request to the URI:

http://app1.bioinformatics.mdanderson.org/genesmash/_design/basic

To use the API, prefix each of the calls described below with:

http://app1.bioinformatics.mdanderson.org/genesmash/_design/basic/_view/

The example calls below show only some query parameters; every parameter is defined by the general CouchDB interface.
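
Because CouchDB expects the `key` parameter to be a JSON value, string keys carry their own double quotes (as in `by_symbol?key="TP53"`) and must be percent-encoded when sent. The sketch below shows one way to assemble such URIs in Python; `view_url` and `JSON_PARAMS` are hypothetical names introduced for illustration, not part of geneSmash.

```python
import json
from urllib.parse import urlencode

# View prefix taken from the text above.
VIEW_BASE = "http://app1.bioinformatics.mdanderson.org/genesmash/_design/basic/_view/"

# CouchDB reads these parameters as JSON values, so strings need quotes.
JSON_PARAMS = {"key", "keys", "startkey", "endkey"}


def view_url(view, **params):
    """Build the URI for a view call with CouchDB query parameters."""
    encoded = {}
    for name, value in params.items():
        if name in JSON_PARAMS or not isinstance(value, str):
            # JSON-encode keys, numbers, and booleans (10 -> 10, True -> true).
            encoded[name] = json.dumps(value, separators=(",", ":"))
        else:
            # Plain string parameters are passed through unchanged.
            encoded[name] = value
    query = urlencode(encoded)  # percent-encodes quotes, brackets, commas
    return VIEW_BASE + view + ("?" + query if query else "")


print(view_url("all", limit=10))
print(view_url("by_symbol", key="TP53"))
print(view_url("by_probe2", key=["Affymetrix", "205241_at", "HG-U133A"]))
```

A composite key, such as the one used by `by_probe2`, is passed as a Python list and serialized to a JSON array.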

UNFINISHED

HTTP Call:   all
JSON Value:  {"total_rows": ..., "offset": ..., "rows": [
               {"id": "...",
                "key": "...",
                "value": { ... }}]}
Result:      Fetch all documents from the database.

HTTP Call:   all?limit=10
JSON Value:  Same as above
Result:      Fetch the first 10 documents from the database.

HTTP Call:   by_symbol
Result:      Fetch all genes sorted by HGNC symbol.

HTTP Call:   by_symbol?key="TP53"
Result:      Fetch the document for the gene TP53.

HTTP Call:   by_alias
Result:      For all known aliases or synonyms, fetch the corresponding genes.

HTTP Call:   by_alias?key="AR"
Result:      Fetch all genes with "AR" as a synonym.

HTTP Call:   by_cytoband?key="17p13.1"
Result:      Fetch all genes mapped to the given cytoband.

HTTP Call:   by_ensembl?key="ENSG00000012048"
Result:      Fetch the gene with the given Ensembl identifier.

HTTP Call:   by_unigene?key="Hs.654481"
Result:      Fetch the gene with the given UniGene cluster ID.

HTTP Call:   by_probe2?key=["Affymetrix","205241_at","HG-U133A"]
Result:      Fetch the gene with the given microarray probe identifier.

HTTP Call:   by_mir

HTTP Call:   by_location

HTTP Call:   gene_location

HTTP Call:   maxlength

HTTP Call:   minlength
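
A view response has the `total_rows`, `offset`, and `rows` fields shown for the `all` call. As a sketch of how a client might unpack it, the Python below pulls the `id`, `key`, and `value` out of each row. The sample response is made up for illustration: its row contents are placeholders rather than real geneSmash output, and `rows_from_view` is a hypothetical helper.

```python
import json

# Made-up response with the same shape as the view output described above.
# The "value" contents are placeholders, not real geneSmash data.
SAMPLE_RESPONSE = """
{"total_rows": 2, "offset": 0, "rows": [
  {"id": "7157", "key": "TP53", "value": {"placeholder": true}},
  {"id": "example-id", "key": "EXAMPLE", "value": {"placeholder": true}}
]}
"""


def rows_from_view(text):
    """Decode a view response and return a list of (id, key, value) tuples."""
    result = json.loads(text)
    return [(row["id"], row["key"], row["value"]) for row in result["rows"]]


for doc_id, key, value in rows_from_view(SAMPLE_RESPONSE):
    print(doc_id, key)
```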