#genomics — oanor

Genome Assemblies API

Reference genome assemblies as an API — powered by NCBI Assembly, the registry of genome builds for organisms across the tree of life. Search assemblies by organism (or free text) and look up any assembly's metadata: its accession (GCF_… RefSeq or GCA_… GenBank), name (e.g. GRCh38.p14), organism and taxon id, assembly level (complete genome, chromosome, scaffold or contig), contiguity statistics (contig and scaffold N50), sequencing coverage, RefSeq category, UCSC and Ensembl names, the submitting organization, release date and FTP download paths. From the human reference genome to any sequenced microbe, plant or animal, it turns the genome-assembly registry into a clean search-and-fetch API. A genome-assembly registry — distinct from sequence (ENA), genome annotation (Ensembl), variant (ClinVar, dbVar) and gene-expression (GEO) databases. Open data from NCBI Assembly (public domain).
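The contiguity statistics mentioned above can be illustrated with a short sketch. N50 is the length L such that contigs (or scaffolds) of length at least L together cover at least half of the total assembly length. The function name and the lengths below are made up for illustration; this is not code from the API itself.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half the total is N50.
        if running >= half_total:
            return length


# Made-up contig lengths summing to 300; half of that is 150.
contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(contigs))  # 80 + 70 = 150 reaches the halfway point, so this prints 70
```

A higher N50 for the same total length indicates that more of the assembly sits in long, unbroken pieces.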

api.oanor.com/genomes-api

Gene Expression API

Functional-genomics experiments as an API — powered by NCBI GEO (Gene Expression Omnibus), the largest public repository of gene-expression data. GEO archives expression series and curated datasets from microarray and high-throughput-sequencing experiments across every organism. Search experiments by keyword and optionally by organism, and look up any series or dataset to get its metadata: title, summary, assay type (expression profiling by array or by sequencing), organism, number of samples, platform and the publication behind it. From β-cell stress studies to cancer transcriptomics across human and mouse, it turns the GEO archive into a simple search-and-fetch API for transcriptomics, bioinformatics and research-data discovery. A gene-expression / functional-genomics dataset repository — distinct from sequence (ENA), variant (ClinVar, dbVar), structure (PDB) and ontology databases. Open data from NCBI GEO (public domain).

api.oanor.com/geodatasets-api

Structural Variants API

Human genomic structural variation as an API — powered by NCBI dbVar, the archive of structural variants (SVs): copy-number variants (CNVs), large deletions, duplications, insertions, inversions and translocations, typically larger than 50 base pairs. This is the structural counterpart to single-nucleotide variant databases: search structural variants overlapping a gene (or by free text) and get each variant's dbVar accession, the study it came from, its type, the genes it overlaps, its genomic placement on GRCh38 and its clinical significance; then look up any variant for the full record — placements on both GRCh37 and GRCh38 assemblies, variant type, genes, clinical significance, study type, methods and variant counts. From BRCA1 CNVs to Cri-du-chat deletions, it is ideal for genomics, cytogenetics, rare-disease and bioinformatics work. A structural-variation / CNV resource — distinct from clinical single-nucleotide variant interpretation (ClinVar), population allele frequencies (gnomAD) and trait associations (GWAS). Open data from NCBI dbVar (public domain).

api.oanor.com/dbvar-api

Protein Interactions API

Protein-protein interaction networks as an API — powered by STRING, the database of known and predicted protein associations that combines evidence from laboratory experiments, curated pathway databases, gene co-expression, genomic context and automated text mining into a single confidence score, across thousands of organisms. Get a protein's top interaction partners (each with the combined confidence score and the seven evidence-channel subscores), the interaction network among any set of proteins as scored edges, and functional enrichment for a gene set — the over-represented GO terms, KEGG pathways, Pfam domains and more, each with its p-value, FDR and member genes. Pass gene symbols (TP53) or STRING/Ensembl ids, for human (default) or any species by NCBI taxon id. It is a cornerstone of systems biology — ideal for network analysis, functional genomics, pathway and bioinformatics tools. A protein-interaction-network resource — distinct from biological pathways (Reactome), curated protein complexes (Complex Portal) and Gene Ontology annotations (QuickGO). Open data from STRING (CC BY 4.0).

api.oanor.com/stringdb-api

Polygenic Scores API

Polygenic (risk) scores as an API — powered by the NHGRI-EBI PGS Catalog, the open database of published polygenic scores: weighted combinations of genetic variants used to estimate a person's genetic predisposition to a trait or disease. Search traits by name to find their ontology ids, list every polygenic score developed for a trait, and read a score's full metadata — the reported and mapped (EFO/MONDO) traits, the number of variants in the score, the development method, genome build, the ancestry distribution of the samples it was built and evaluated on, the publication behind it (title, journal, date, PubMed id), the release date, license and a direct link to the scoring file. From breast cancer and coronary artery disease to type 2 diabetes and BMI, it is ideal for statistical genetics, genomics, risk-prediction research and bioinformatics tools. A polygenic-score / genetic-risk-prediction resource — distinct from single-variant association studies (GWAS Catalog), population allele frequencies (gnomAD) and clinical variant interpretation (ClinVar). Open data from the NHGRI-EBI PGS Catalog (CC BY 4.0).
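The "weighted combination of genetic variants" described above can be sketched as a weighted sum: each variant's weight is multiplied by the number of effect alleles a person carries (0, 1 or 2), and the products are added up. The variant ids, weights and genotypes below are invented for illustration, and the handling of missing genotypes (skipping them) is only one possible choice; real scoring files carry more columns than this sketch uses.

```python
def polygenic_score(weights, dosages):
    """Sum weight * effect-allele dosage over the variants in a score.

    weights: mapping of variant id -> effect weight
    dosages: mapping of variant id -> effect-allele count (0, 1 or 2)
    Returns (score, missing), where missing lists variants without a genotype.
    """
    score = 0.0
    missing = []
    for variant, weight in weights.items():
        if variant in dosages:
            score += weight * dosages[variant]
        else:
            missing.append(variant)  # skipped here; other strategies exist
    return score, missing


# Hypothetical three-variant score and one person's genotypes.
weights = {"variant_a": 0.5, "variant_b": -0.25, "variant_c": 1.0}
dosages = {"variant_a": 2, "variant_b": 1, "variant_c": 0}
score, missing = polygenic_score(weights, dosages)
print(score, missing)  # 0.5*2 - 0.25*1 + 1.0*0 = 0.75, nothing missing
```

The raw sum is only meaningful relative to a reference distribution, which is why the ancestry of the development and evaluation samples is part of each score's metadata.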

api.oanor.com/pgs-api

Gene Ontology API

Gene function as an API — powered by EMBL-EBI's QuickGO and the Gene Ontology (GO), the standard vocabulary that describes what gene products do across three aspects: molecular function, biological process and cellular component. Given a gene or protein (a UniProt accession), list every GO annotation made for it — the GO term, its aspect, the qualifier, the evidence code, the supporting reference (e.g. a PubMed id), the organism and who assigned it — optionally filtered by aspect or organism. Look up any GO term to get its definition, aspect, synonyms and number of child terms; and search the ontology by name to find the right GO terms. GO term names are resolved automatically on annotations. From TP53 to any protein in any species, it is the backbone of functional genomics — ideal for enrichment analysis, annotation pipelines, bioinformatics and research tools. A gene-function annotation resource (which genes have which functions, with evidence) — distinct from generic ontology-term lookup. Open data from EMBL-EBI QuickGO and the GO Consortium (CC BY 4.0).

api.oanor.com/quickgo-api

GWAS Catalog API

Human genetic trait associations as an API, powered by the NHGRI-EBI GWAS Catalog, the curated reference of published genome-wide association studies. It answers the core question of statistical genetics: which genetic variants (SNPs) are associated with which traits and diseases, and how strongly. Look up a SNP to get its functional class, genomic location and mapped genes; pull every trait association reported for it, including the trait, p-value, effect size (odds ratio or beta), risk allele and frequency, and author-reported genes; and read the study behind the evidence: trait, sample sizes, ancestries, genotyping technology and the publication (PubMed id, authors, journal, date). Covering hundreds of thousands of associations, from type 2 diabetes and Crohn's disease to systemic lupus erythematosus, it is ideal for genomics, bioinformatics, statistical-genetics and biomedical research tools. A published genetic-association evidence base, distinct from population allele frequencies (gnomAD), clinical variant interpretation (ClinVar) and genome annotation (Ensembl). Open data from the NHGRI-EBI GWAS Catalog (EMBL-EBI).

api.oanor.com/gwas-api

UCSC Genome API

The UCSC Genome Browser as an API: reference genome data for hundreds of species, from the UCSC Genome Browser at UC Santa Cruz. /v1/genomes lists the 220+ genome assemblies UCSC hosts, each with its assembly id (such as hg38 for human, mm39 for mouse, danRer11 for zebrafish), organism, description and data source. /v1/chromosomes?genome=hg38 returns an assembly's chromosomes and other sequences with their sizes in base pairs, largest first. /v1/sequence?genome=hg38&chrom=chrM&start=0&end=100 retrieves the raw DNA sequence of any genomic region (0-based start, half-open end, so this example returns the first 100 bases of chrM; regions are capped at 100,000 bases per call). Assembly ids come from /v1/genomes and chromosome names look like chr1, chrX or chrM. Ideal for bioinformatics pipelines, genome-visualisation and primer-design tools, region and sequence lookups, comparative genomics and teaching. Data from the UCSC Genome Browser (free for academic, non-profit and personal use). This API covers the genome browser's assemblies and raw reference sequence, as distinct from gene-annotation, protein-sequence and nucleotide-archive resources such as Ensembl, UniProt and ENA.
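Because /v1/sequence uses 0-based, half-open coordinates and caps each call at 100,000 bases, a longer region has to be split into consecutive windows. The sketch below shows one way to do that and to build the query strings; it sends no requests. The https scheme and the helper names are assumptions, while the path and the genome, chrom, start and end parameters are taken from the description above.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.oanor.com/ucsc-api"  # scheme is an assumption
MAX_BASES = 100_000  # per-call cap stated in the description


def from_one_based(start_1, end_1):
    """Convert a 1-based inclusive interval to 0-based half-open."""
    return start_1 - 1, end_1


def chunk_region(start, end, max_bases=MAX_BASES):
    """Yield (start, end) half-open windows of at most max_bases bases."""
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    pos = start
    while pos < end:
        stop = min(pos + max_bases, end)
        yield pos, stop
        pos = stop  # half-open: the next window starts where this one ended


def sequence_urls(genome, chrom, start, end):
    """Build one /v1/sequence URL per window covering [start, end)."""
    return [
        f"{BASE_URL}/v1/sequence?"
        + urlencode({"genome": genome, "chrom": chrom, "start": s, "end": e})
        for s, e in chunk_region(start, end)
    ]


# A 250,000-base region needs three calls: 100k + 100k + 50k.
for url in sequence_urls("hg38", "chr1", 0, 250_000):
    print(url)
```

Half-open windows concatenate cleanly: the returned sequences can be joined in order with no overlapping or missing bases.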

api.oanor.com/ucsc-api

ENA API

The European Nucleotide Archive (ENA) as an API, powered by EMBL-EBI — one of the three INSDC partners alongside NCBI GenBank and DDBJ, and the comprehensive public archive of the world's nucleotide sequence data. ENA holds raw sequencing reads, assembled and annotated genomes, individual sequences, biological samples and the studies behind them, for every domain of life — the backbone resource for genomics, microbiology, ecology, evolution and clinical research. This API gives a clean three-step workflow over that archive. First, /v1/taxon resolves an organism name (e.g. "Homo sapiens") to its NCBI taxon id, scientific name, taxonomic rank and full lineage — or looks a taxon up directly by id. Then /v1/search queries the archive for that taxon's records of a chosen type: genome assemblies (with assembly name, level and base count), sequencing runs (with platform, instrument and read counts), biological samples (with collection date and country), annotated sequences, read experiments, analyses, coding and non-coding sequences, and studies — by default including all descendant taxa, or restricted to the exact taxon. Finally /v1/record returns a summary for any ENA accession — assemblies (GCA_…), studies and projects (PRJ…), samples (SAM…/ERS…), sequencing runs (ERR…/SRR…) and sequences — with its title, data type, taxon, scientific name, base and sequence counts and public status. Ideal for bioinformatics pipelines, genome-data discovery, sequencing-metadata harvesting, biodiversity and metagenomics tooling, and research reproducibility. Taxon ids look like 9606 (human); accessions like GCA_000001405. Data from EMBL-EBI ENA, an INSDC archive, free to use.
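Before calling /v1/record it can help to know what kind of record an accession refers to. The sketch below classifies an accession using only the prefixes named in the description above; the function name and the example accessions (other than GCA_000001405) are hypothetical, and the archive uses further prefixes that this table does not cover.

```python
# Prefix -> record kind, limited to the prefixes listed in the description.
PREFIX_KINDS = (
    ("GCA_", "assembly"),
    ("PRJ", "study or project"),
    ("SAM", "sample"),
    ("ERS", "sample"),
    ("ERR", "sequencing run"),
    ("SRR", "sequencing run"),
)


def accession_kind(accession):
    """Guess the record kind from an accession's prefix, or 'unknown'."""
    acc = accession.strip().upper()
    for prefix, kind in PREFIX_KINDS:
        if acc.startswith(prefix):
            return kind
    # Sequence accessions and any unlisted prefix end up here.
    return "unknown"


# GCA_000001405 comes from the description; the others are placeholders.
for acc in ["GCA_000001405", "PRJEB00000", "SAMEA0000000", "ERR0000000", "X00000"]:
    print(acc, "->", accession_kind(acc))
```

Treating anything unrecognised as "unknown" rather than guessing keeps the helper safe to use ahead of a lookup.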

api.oanor.com/ena-api

Ensembl API

The Ensembl genome database as an API, powered by the official Ensembl REST service from EMBL-EBI. Look up any gene by symbol or Ensembl stable id for its biotype, genomic location, strand, description and transcripts; resolve any feature (gene, transcript, exon) by stable id; pull external database cross-references; fetch sequence variants by rsID with their alleles, most-severe consequence, minor-allele frequency, clinical significance and genomic mappings; list the genes, transcripts, exons, variations or repeats overlapping any genomic region; retrieve genomic, cDNA, CDS or protein sequences by id; and read genome-assembly metadata including the karyotype and chromosome lengths. Coverage spans human, mouse and 300+ vertebrate species. Ideal for bioinformatics pipelines, genome browsers and variant-annotation tools, genetics research apps, clinical-genomics dashboards and life-science chatbots.

api.oanor.com/ensembl-api