geneSmash API Documentation
geneSmash is a mash-up of various sources of information about human genes. The primary sources at the time of this writing are:
- The gene_info file from the NCBI Entrez gene FTP site.
- The gene2unigene file from the NCBI Entrez gene FTP site.
- The refFlat.txt file from the UCSC Genome Browser.
- The hsa.gff file from miRBase.
- Human gene expression array annotation, extracted from the manufacturers' (Affymetrix, Agilent, and Illumina) websites:
  - Affymetrix annotation files are obtained from the NetAffx Analysis Center.
  - Illumina probe annotation is acquired from this location.
  - Agilent annotation information is taken from the Agilent eArray portal.
Currently, geneSmash provides probe annotation for various human gene expression array platforms from the manufacturers listed above.
Note: Only microarray probes associated with an NCBI Entrez Gene are included in the geneSmash database.
Input and Output
All requests to read data or query the geneSmash web service are RESTful HTTP calls. All results are returned in JavaScript Object Notation (JSON).
geneSmash follows the usual CouchDB conventions. Each object has a unique identifier, which in this case is the Entrez gene ID. For example, the Entrez gene ID of the p53 gene is "7157". To get the JSON representation of the CouchDB document for the p53 gene, send an HTTP GET request to the following URI:
http://app1.bioinformatics.mdanderson.org/genesmash/7157
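The request above can be sketched in Python using only the standard library. The helper names `document_url` and `fetch_gene` and the 30-second timeout are illustrative assumptions, not part of geneSmash; only the base URI comes from this page.

```python
import json
from urllib.request import urlopen

# Base URI taken from the example above.
BASE_URL = "http://app1.bioinformatics.mdanderson.org/genesmash"


def document_url(entrez_id):
    """Return the URI of the CouchDB document for an Entrez gene ID."""
    return f"{BASE_URL}/{entrez_id}"


def fetch_gene(entrez_id, timeout=30):
    """Send an HTTP GET request and decode the JSON body into a dict.

    Hypothetical helper: the timeout value is an arbitrary choice.
    """
    with urlopen(document_url(entrez_id), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_gene("7157")` would request the p53 document, provided the service is reachable from your network.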
Views
CouchDB (and thus geneSmash) queries are also known as views. The available views define the main API. In the current version, all views are contained in the "basic" design document. You can get a copy of the design document by sending an HTTP GET request to the URI:
http://app1.bioinformatics.mdanderson.org/genesmash/_design/basic
To use the API, each of the calls described below should be preceded by
http://app1.bioinformatics.mdanderson.org/genesmash/_design/basic/_view/
Although we provide examples of calls that provide query parameters, every parameter is defined by the general CouchDB interface.
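As a rough illustration of how a view call is assembled from that prefix, the sketch below builds full URIs in Python. CouchDB expects `key` values to be JSON-encoded, which is why string keys carry double quotes. The function name `view_url` is hypothetical, and percent-encoding every parameter value is a choice made here for safety rather than something this page prescribes.

```python
import json
from urllib.parse import quote

# View prefix taken from the text above.
VIEW_BASE = "http://app1.bioinformatics.mdanderson.org/genesmash/_design/basic/_view/"


def view_url(view, **params):
    """Build the URI for a view call (hypothetical helper).

    Each parameter value is JSON-encoded, so strings gain double quotes,
    lists become JSON arrays, and integers stay bare. The encoded value
    is then percent-encoded for use in a query string.
    """
    query = "&".join(
        f"{name}={quote(json.dumps(value, separators=(',', ':')), safe='')}"
        for name, value in params.items()
    )
    return VIEW_BASE + view + ("?" + query if query else "")


# Example calls: a limited listing, a string key, and an array key.
first_ten = view_url("all", limit=10)
tp53 = view_url("by_symbol", key="TP53")
probe = view_url("by_probe2", key=["Affymetrix", "205241_at", "HG-U133A"])
```

Here `%22` is the percent-encoded form of the double quote, so `tp53` ends in `by_symbol?key=%22TP53%22`.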
UNFINISHED
HTTP Call | JSON Value | Result |
---|---|---|
all | {"total_rows": ..., "offset": ..., "rows" : [ {"id": "...", "key": "...", "value": { ... }}]} | Fetch all documents from the database. |
all?limit=10 | Same as above | Fetch the first 10 documents from the database. |
by_symbol | | Fetch all genes sorted by HGNC symbol. |
by_symbol?key="TP53" | | Fetch the document for the gene TP53. |
by_alias | | For all known aliases or synonyms, fetch the corresponding genes. |
by_alias?key="AR" | | Fetch all genes with "AR" as a synonym. |
by_cytoband?key="17p13.1" | | Fetch all genes mapped to the given cytoband. |
by_ensembl?key="ENSG00000012048" | | Fetch the gene with the given Ensembl identifier. |
by_unigene?key="Hs.654481" | | Fetch the gene with the given UniGene cluster ID. |
by_probe2?key=["Affymetrix","205241_at","HG-U133A"] | | Fetch the gene with the given microarray probe identifier. |
by_mir | | |
by_location | | |
gene_location | | |
maxlength | | |
minlength | | |
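The JSON shape shown for the `all` call can be consumed as in the following sketch, which groups document IDs by view key. The helper name `ids_by_key` and the sample payload are illustrative assumptions: the numbers in `total_rows` and `offset` are placeholders, and `value` is left empty because its contents are not described here.

```python
import json


def ids_by_key(view_result):
    """Map each view key to the list of document IDs that emitted it.

    Hypothetical helper; it relies only on the "rows", "key", and "id"
    fields shown in the table above.
    """
    grouped = {}
    for row in view_result.get("rows", []):
        grouped.setdefault(row["key"], []).append(row["id"])
    return grouped


# Placeholder response in the documented shape (not real server output).
sample = json.loads(
    '{"total_rows": 1, "offset": 0, '
    '"rows": [{"id": "7157", "key": "TP53", "value": {}}]}'
)

genes = ids_by_key(sample)
```

A key maps to a list because a view such as `by_alias` may return several genes for the same key.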