#bioinformatics

DNA Melting Temperature API

DNA-oligo and PCR-primer maths as an API. The tm endpoint computes the melting temperature of a primer sequence three ways: the Wallace rule, 2·(A+T) + 4·(G+C), for short oligos up to 13 nt; the Marmur–Wallace GC formula, 64.9 + 41·(nGC − 16.4)/N, for longer ones, where nGC is the number of G and C bases and N is the sequence length; and the salt-adjusted formula, 81.5 + 0.41·%GC − 675/N + 16.6·log10[Na+], for a given sodium concentration. It also recommends the method suited to the sequence length. For example, the eight-base ATGCATGC melts at 24 °C by Wallace (2·4 + 4·4), and a 20-base, 50 %-GC primer at about 51.8 °C by the GC formula (64.9 + 41·(10 − 16.4)/20). The gc-content endpoint reports the GC and AT percentages, the per-base counts and the single-stranded molecular weight. The reverse-complement endpoint returns the complement, the reverse and the reverse complement of a strand. Sequences use A/C/G/T (case-insensitive, whitespace ignored) and [Na+] is in mol/L. Everything is computed locally and deterministically, with no key and no third-party service, so it is instant and private. Ideal for molecular-biology, biotech, PCR, primer-design, bioinformatics and lab-automation app developers, oligo and primer calculators, and LIMS software. These are estimation formulas for primer design, not a substitute for nearest-neighbour thermodynamics. Live, nothing stored. 3 endpoints. This is oligo melting temperature; for population-genetics allele frequencies use a genetics API.

api.oanor.com/dnamelt-api
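
The three formulas quoted for the tm endpoint can be reproduced offline. The following is a minimal sketch under stated assumptions, not the service's own implementation: the function names (`clean`, `wallace`, `marmur`, `salt_adjusted`, `recommend`) are hypothetical, and the 13 nt cut-off is taken from the description above.

```python
import math


def clean(seq: str) -> str:
    """Upper-case the sequence, drop whitespace and reject non-ACGT symbols."""
    s = "".join(seq.split()).upper()
    if not s or set(s) - set("ACGT"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return s


def gc_count(s: str) -> int:
    """Number of G and C bases in an already cleaned sequence."""
    return s.count("G") + s.count("C")


def wallace(seq: str) -> float:
    """Wallace rule: 2*(A+T) + 4*(G+C)."""
    s = clean(seq)
    gc = gc_count(s)
    return float(2 * (len(s) - gc) + 4 * gc)


def marmur(seq: str) -> float:
    """GC formula: 64.9 + 41*(nGC - 16.4)/N."""
    s = clean(seq)
    return 64.9 + 41 * (gc_count(s) - 16.4) / len(s)


def salt_adjusted(seq: str, na_molar: float) -> float:
    """Salt-adjusted formula: 81.5 + 0.41*%GC - 675/N + 16.6*log10[Na+]."""
    s = clean(seq)
    pct_gc = 100 * gc_count(s) / len(s)
    return 81.5 + 0.41 * pct_gc - 675 / len(s) + 16.6 * math.log10(na_molar)


def recommend(seq: str) -> str:
    """Assumed rule of thumb: Wallace up to 13 nt, the GC formula beyond that."""
    return "wallace" if len(clean(seq)) <= 13 else "marmur"


print(wallace("ATGCATGC"))                        # 24.0
print(round(marmur("ATGCATGCATGCATGCATGC"), 2))   # 51.78
```

Both printed values match the worked figures in the description (24 °C and about 51.8 °C).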

DNA Sequence API

DNA/RNA sequence-analysis maths as an API. The analyze endpoint reports the length and base composition of a sequence, the GC and AT content, the complement, the reverse and the reverse complement (the opposite strand read 5'→3'), and the approximate single-stranded molecular weight. The translate endpoint transcribes DNA to mRNA (T→U) and translates it to protein with the standard genetic code in reading frame 1, 2 or 3, giving the one-letter amino-acid sequence, the protein length and the number of stop codons. The melting endpoint estimates a primer's melting temperature with the Wallace rule, 2·(A+T) + 4·(G+C), for short oligos and a salt-adjusted basic formula for longer ones. Sequences are case- and whitespace-insensitive and accept A, C, G and T for DNA or U for RNA. Everything is computed locally and deterministically, with no key and no third-party service, so it is instant and private. Ideal for bioinformatics, molecular-biology, genomics and lab app developers, primer-design and sequence-inspection tools, and biology education. Live, nothing stored. 3 endpoints. This is sequence analysis; for genome assembly data use a genomes API.

api.oanor.com/dna-api
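
The reverse complement and the frame-based translation described for this API are easy to check by hand. The sketch below is illustrative only: the function names are hypothetical, the codon table is the standard genetic code laid out in T, C, A, G order, and translating through stop codons (marking each as `*`) is an assumption about how the service reports them.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalise(seq: str) -> str:
    """Upper-case, drop whitespace and map RNA's U back to T."""
    return "".join(seq.split()).upper().replace("U", "T")


def reverse_complement(seq: str) -> str:
    """Opposite strand read 5'->3'."""
    return normalise(seq).translate(COMPLEMENT)[::-1]


def translate(seq: str, frame: int = 1) -> str:
    """Translate in reading frame 1, 2 or 3; stop codons appear as '*'.

    Looking up DNA codons directly is equivalent to transcribing T->U first.
    """
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2 or 3")
    s = normalise(seq)
    codons = (s[i:i + 3] for i in range(frame - 1, len(s) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


protein = translate("ATGGCCTAA")
print(reverse_complement("ATGC"))   # GCAT
print(protein, protein.count("*"))  # MA* 1
```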

Genome Assemblies API

Reference genome assemblies as an API — powered by NCBI Assembly, the registry of genome builds for organisms across the tree of life. Search assemblies by organism (or free text) and look up any assembly's metadata: its accession (GCF_… RefSeq or GCA_… GenBank), name (e.g. GRCh38.p14), organism and taxon id, assembly level (complete genome, chromosome, scaffold or contig), contiguity statistics (contig and scaffold N50), sequencing coverage, RefSeq category, UCSC and Ensembl names, the submitting organization, release date and FTP download paths. From the human reference genome to any sequenced microbe, plant or animal, it turns the genome-assembly registry into a clean search-and-fetch API. A genome-assembly registry — distinct from sequence (ENA), genome annotation (Ensembl), variant (ClinVar, dbVar) and gene-expression (GEO) databases. Open data from NCBI Assembly (public domain).

api.oanor.com/genomes-api

Gene Expression API

Functional-genomics experiments as an API — powered by NCBI GEO (Gene Expression Omnibus), the largest public repository of gene-expression data. GEO archives expression series and curated datasets from microarray and high-throughput-sequencing experiments across every organism. Search experiments by keyword and optionally by organism, and look up any series or dataset to get its metadata: title, summary, assay type (expression profiling by array or by sequencing), organism, number of samples, platform and the publication behind it. From β-cell stress studies to cancer transcriptomics across human and mouse, it turns the GEO archive into a simple search-and-fetch API for transcriptomics, bioinformatics and research-data discovery. A gene-expression / functional-genomics dataset repository — distinct from sequence (ENA), variant (ClinVar, dbVar), structure (PDB) and ontology databases. Open data from NCBI GEO (public domain).

api.oanor.com/geodatasets-api

Structural Variants API

Human genomic structural variation as an API — powered by NCBI dbVar, the archive of structural variants (SVs): copy-number variants (CNVs), large deletions, duplications, insertions, inversions and translocations, typically larger than 50 base pairs. This is the structural counterpart to single-nucleotide variant databases: search structural variants overlapping a gene (or by free text) and get each variant's dbVar accession, the study it came from, its type, the genes it overlaps, its genomic placement on GRCh38 and its clinical significance; then look up any variant for the full record — placements on both GRCh37 and GRCh38 assemblies, variant type, genes, clinical significance, study type, methods and variant counts. From BRCA1 CNVs to Cri-du-chat deletions, it is ideal for genomics, cytogenetics, rare-disease and bioinformatics work. A structural-variation / CNV resource — distinct from clinical single-nucleotide variant interpretation (ClinVar), population allele frequencies (gnomAD) and trait associations (GWAS). Open data from NCBI dbVar (public domain).

api.oanor.com/dbvar-api

Protein Interactions API

Protein-protein interaction networks as an API — powered by STRING, the database of known and predicted protein associations that combines evidence from laboratory experiments, curated pathway databases, gene co-expression, genomic context and automated text mining into a single confidence score, across thousands of organisms. Get a protein's top interaction partners (each with the combined confidence score and the seven evidence-channel subscores), the interaction network among any set of proteins as scored edges, and functional enrichment for a gene set — the over-represented GO terms, KEGG pathways, Pfam domains and more, each with its p-value, FDR and member genes. Pass gene symbols (TP53) or STRING/Ensembl ids, for human (default) or any species by NCBI taxon id. It is a cornerstone of systems biology — ideal for network analysis, functional genomics, pathway and bioinformatics tools. A protein-interaction-network resource — distinct from biological pathways (Reactome), curated protein complexes (Complex Portal) and Gene Ontology annotations (QuickGO). Open data from STRING (CC BY 4.0).

api.oanor.com/stringdb-api

Polygenic Scores API

Polygenic (risk) scores as an API — powered by the NHGRI-EBI PGS Catalog, the open database of published polygenic scores: weighted combinations of genetic variants used to estimate a person's genetic predisposition to a trait or disease. Search traits by name to find their ontology ids, list every polygenic score developed for a trait, and read a score's full metadata — the reported and mapped (EFO/MONDO) traits, the number of variants in the score, the development method, genome build, the ancestry distribution of the samples it was built and evaluated on, the publication behind it (title, journal, date, PubMed id), the release date, license and a direct link to the scoring file. From breast cancer and coronary artery disease to type 2 diabetes and BMI, it is ideal for statistical genetics, genomics, risk-prediction research and bioinformatics tools. A polygenic-score / genetic-risk-prediction resource — distinct from single-variant association studies (GWAS Catalog), population allele frequencies (gnomAD) and clinical variant interpretation (ClinVar). Open data from the NHGRI-EBI PGS Catalog (CC BY 4.0).

api.oanor.com/pgs-api

Gene Ontology API

Gene function as an API — powered by EMBL-EBI's QuickGO and the Gene Ontology (GO), the standard vocabulary that describes what gene products do across three aspects: molecular function, biological process and cellular component. Given a gene or protein (a UniProt accession), list every GO annotation made for it — the GO term, its aspect, the qualifier, the evidence code, the supporting reference (e.g. a PubMed id), the organism and who assigned it — optionally filtered by aspect or organism. Look up any GO term to get its definition, aspect, synonyms and number of child terms; and search the ontology by name to find the right GO terms. GO term names are resolved automatically on annotations. From TP53 to any protein in any species, it is the backbone of functional genomics — ideal for enrichment analysis, annotation pipelines, bioinformatics and research tools. A gene-function annotation resource (which genes have which functions, with evidence) — distinct from generic ontology-term lookup. Open data from EMBL-EBI QuickGO and the GO Consortium (CC BY 4.0).

api.oanor.com/quickgo-api

GWAS Catalog API

Human genetic trait associations as an API — powered by the NHGRI-EBI GWAS Catalog, the curated reference of published genome-wide association studies. It answers the core question of statistical genetics: which genetic variants (SNPs) are associated with which traits and diseases, and how strongly. Look up a SNP to get its functional class, genomic location and mapped genes; pull every trait association reported for it — the trait, p-value, effect size (odds ratio or beta), risk allele and frequency, and author-reported genes; and read the study behind the evidence — trait, sample sizes, ancestries, genotyping technology and the publication (PubMed id, authors, journal, date). From type 2 diabetes and Crohn disease to systemic lupus erythematosus and hundreds of thousands of associations, it is ideal for genomics, bioinformatics, statistical-genetics and biomedical research tools. A published genetic-association evidence base — distinct from population allele frequencies (gnomAD), clinical variant interpretation (ClinVar) and genome annotation (Ensembl). Open data from the NHGRI-EBI GWAS Catalog (EMBL-EBI).

api.oanor.com/gwas-api

BioSamples API

BioSamples as an API, powered by EMBL-EBI — the database that stores and links the metadata of biological samples, the physical specimens behind biological experiments. A sample in BioSamples carries a stable accession (such as SAMEA3231268) and a rich set of characteristics — organism, tissue or organism part, cell type, sex, disease, developmental stage, strain and any submitter-provided attributes — and is referenced by other EBI archives including the European Nucleotide Archive (ENA), ArrayExpress and PRIDE. /v1/search?q=liver searches samples by free text and returns each match's accession, name, organism and release date. /v1/sample?id=SAMEA3231268 returns a sample's metadata — its accession, name, NCBI taxon id, organism, release and update dates, the number of relationships to other samples, and its characteristics flattened to a clean key→value map. Accessions look like SAMEA…, SAMN… or SAMD…; get one from the search endpoint. Ideal for life-science data integration, sample tracking, metadata harmonisation and linking sequencing or expression data back to its source specimen. Data from EMBL-EBI BioSamples (public). This is a biological-sample metadata registry — distinct from study (BioStudies), sequence (ENA), variant (ClinVar) and structure databases.

api.oanor.com/biosamples-api

BioStudies API

BioStudies as an API, powered by EMBL-EBI — the database that holds the descriptions of biological studies and links their data together across EBI resources, including imaging (BioImage Archive), functional genomics (ArrayExpress), proteomics, and the literature (Europe PMC). Each study has an accession, a title and abstract, the collection it belongs to and links to its underlying data and publications. /v1/search?query=covid searches the studies and returns each match's accession (e.g. S-EPMC8017430), title, author, study type, release date and link/file counts. /v1/study?id=S-EPMC8017430 returns a study's metadata — its accession, the collection it belongs to (such as EuropePMC, ArrayExpress or BioImages), title, abstract, release date, authors and the number of linked resources. Accessions look like S-EPMC8017430 or S-BSST123; get one from the search endpoint. Ideal for research-data discovery, linking literature to its underlying datasets, systematic reviews and reproducibility tooling. Data from EMBL-EBI BioStudies (public). This is a studies and datasets metadata index — distinct from the sequence (UniProt, ENA), structure (PDB, EMDB), variant (ClinVar) and ontology databases.

api.oanor.com/biostudies-api

EMDB API

The Electron Microscopy Data Bank (EMDB) as an API, powered by EMBL-EBI: the public archive of three-dimensional electron-microscopy density maps of proteins, nucleic acids and large macromolecular complexes. EMDB is the electron-microscopy counterpart of the Protein Data Bank, holding maps solved by single-particle cryo-EM, electron tomography and electron crystallography, the techniques behind the recent "resolution revolution" in structural biology. /v1/search?q=ribosome searches the archive and returns each matching entry's EMDB id (e.g. EMD-1010), title, electron-microscopy method and resolution in ångström. /v1/entry?id=EMD-1010 returns an entry's metadata: its title, the EM method (single particle, tomography, …), the aggregation state, the resolution, the biological sample studied, classification keywords, the deposition, map-release and last-update dates, and the depositing authors. EMDB ids look like EMD-1010, and you may pass just the number. Ideal for structural-biology and cryo-EM tools, structure-comparison and visualisation apps, and education. Data from EMBL-EBI EMDB (public domain). This is the archive of experimental electron-microscopy MAPS, distinct from atomic-coordinate structures (the PDB), predicted structures (AlphaFold) and protein-sequence databases (UniProt).

api.oanor.com/emdb-api
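
Because the entry endpoint accepts either EMD-1010 or the bare number, a client may want to normalise ids before caching or comparing them. This is a small sketch; the helper name `normalise_emdb_id` is hypothetical, and accepting a lower-case prefix is an assumption rather than documented behaviour.

```python
import re


def normalise_emdb_id(value: str) -> str:
    """Return the canonical 'EMD-<number>' form for 'EMD-1010' or '1010'."""
    match = re.fullmatch(r"(?:EMD-)?(\d+)", value.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not an EMDB id: {value!r}")
    return f"EMD-{match.group(1)}"


print(normalise_emdb_id("1010"))      # EMD-1010
print(normalise_emdb_id("EMD-1010"))  # EMD-1010
```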

BioModels API

BioModels as an API, powered by EMBL-EBI — the world's largest repository of curated, published mathematical models of biological systems. BioModels collects computational models (mostly in SBML, the Systems Biology Markup Language) of metabolism, cell signalling, gene-regulatory networks, the cell cycle, disease processes and physiology, each linked to the peer-reviewed publication it comes from. /v1/search?query=glycolysis searches the repository and returns each matching model's id (such as BIOMD0000000012), name, format, submitter and submission/modification dates. /v1/model?id=BIOMD0000000012 returns a model's metadata — its name and description, the encoding format, the modelling approach (e.g. ordinary differential equation model), the curation status, the publication behind it (title, journal, year, authors) and the model files. Model ids look like BIOMD0000000012 for curated models or MODEL1234567890 for non-curated submissions; get them from the search endpoint. Ideal for systems-biology and computational-modelling tools, reproducible-research and model-reuse workflows, and teaching. Data from EMBL-EBI BioModels (CC0). This is a systems-biology / computational-model repository — distinct from sequence (UniProt, ENA), structure (PDB, AlphaFold), pathway and variant (ClinVar) databases.

api.oanor.com/biomodels-api

UCSC Genome API

The UCSC Genome Browser as an API — reference genome data for hundreds of species, from the renowned UCSC Genome Browser at UC Santa Cruz. /v1/genomes lists the 220+ genome assemblies UCSC hosts, each with its assembly id (such as hg38 for human, mm39 for mouse, danRer11 for zebrafish), organism, description and data source. /v1/chromosomes?genome=hg38 returns an assembly's chromosomes and sequences with their sizes in base pairs, largest first. /v1/sequence?genome=hg38&chrom=chrM&start=0&end=100 retrieves the raw DNA sequence of any genomic region (0-based start, half-open end; regions are capped at 100,000 bases per call). Assembly ids come from /v1/genomes and chromosome names look like chr1, chrX or chrM. Ideal for bioinformatics pipelines, genome-visualisation and primer-design tools, region and sequence lookups, comparative genomics and teaching. Data from the UCSC Genome Browser (free for academic, non-profit and personal use). This is the genome browser's assemblies and raw reference sequence — distinct from gene-annotation and protein-sequence databases such as Ensembl, UniProt and ENA.

api.oanor.com/ucsc-api
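
The 0-based, half-open convention is a common source of off-by-one errors when positions come from 1-based, inclusive sources. The sketch below converts such a range and builds the query string for /v1/sequence. It makes no network call; the `https://` scheme, the way the base URL and path are joined, and the helper name `sequence_url` are assumptions.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.oanor.com/ucsc-api"  # assumed: scheme and join are not stated in the listing
MAX_REGION = 100_000                         # per-call cap quoted in the description


def sequence_url(genome: str, chrom: str, first: int, last: int) -> str:
    """Build a /v1/sequence URL from 1-based, inclusive positions first..last."""
    if first < 1 or last < first:
        raise ValueError("need 1 <= first <= last")
    start, end = first - 1, last  # 0-based start, half-open end
    if end - start > MAX_REGION:
        raise ValueError(f"region longer than {MAX_REGION} bases")
    query = urlencode({"genome": genome, "chrom": chrom, "start": start, "end": end})
    return f"{BASE_URL}/v1/sequence?{query}"


# Bases 1..100 of chrM become start=0, end=100.
print(sequence_url("hg38", "chrM", 1, 100))
```

With this convention the region length is simply `end - start`, which is why the cap check needs no `+ 1`.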

ClinVar API

ClinVar as an API, powered by the US National Library of Medicine via NCBI E-utilities. ClinVar is the public archive of the relationships between human genetic variants and health, recording the clinical significance (interpretation) of each variant — whether it is Pathogenic, Likely pathogenic, of Uncertain significance, Likely benign or Benign — together with the conditions it is associated with. /v1/search?gene=BRCA1 searches ClinVar by gene symbol, or by free text with q= (e.g. a disease or HGVS expression), returning the total number of matching variants and a list of ClinVar variation ids. /v1/variant?id=4852102 returns a variant's summary: its ClinVar accession (VCV…), title, variant type, the variation and cDNA names, the clinical classification and review status, the associated condition(s), the gene(s) and primary gene, the chromosome and location, the protein change and the molecular consequence, plus a link to the ClinVar record. Get a variation id from /v1/search, then fetch its details. Ideal for clinical-genomics and variant-annotation pipelines, rare-disease and genetic-counselling tools, and research dashboards. Data from NCBI ClinVar (public domain). This is clinical variant interpretation — distinct from population allele-frequency databases (such as gnomAD) and from protein/sequence databases. Please keep request rates modest under NCBI fair-use.

api.oanor.com/clinvar-api

ENA API

The European Nucleotide Archive (ENA) as an API, powered by EMBL-EBI — one of the three INSDC partners alongside NCBI GenBank and DDBJ, and the comprehensive public archive of the world's nucleotide sequence data. ENA holds raw sequencing reads, assembled and annotated genomes, individual sequences, biological samples and the studies behind them, for every domain of life — the backbone resource for genomics, microbiology, ecology, evolution and clinical research. This API gives a clean three-step workflow over that archive. First, /v1/taxon resolves an organism name (e.g. "Homo sapiens") to its NCBI taxon id, scientific name, taxonomic rank and full lineage — or looks a taxon up directly by id. Then /v1/search queries the archive for that taxon's records of a chosen type: genome assemblies (with assembly name, level and base count), sequencing runs (with platform, instrument and read counts), biological samples (with collection date and country), annotated sequences, read experiments, analyses, coding and non-coding sequences, and studies — by default including all descendant taxa, or restricted to the exact taxon. Finally /v1/record returns a summary for any ENA accession — assemblies (GCA_…), studies and projects (PRJ…), samples (SAM…/ERS…), sequencing runs (ERR…/SRR…) and sequences — with its title, data type, taxon, scientific name, base and sequence counts and public status. Ideal for bioinformatics pipelines, genome-data discovery, sequencing-metadata harvesting, biodiversity and metagenomics tooling, and research reproducibility. Taxon ids look like 9606 (human); accessions like GCA_000001405. Data from EMBL-EBI ENA, an INSDC archive, free to use.

api.oanor.com/ena-api

MGnify API

MGnify as an API, powered by EMBL-EBI — the world's largest free resource for the analysis and archiving of microbiome sequencing data, and the metagenomics sister to PRIDE (proteomics) and MetaboLights (metabolomics). MGnify holds tens of thousands of public metagenomics and metabarcoding studies spanning the human gut microbiome, marine and freshwater environments, soils, wastewater, the built environment and host-associated communities. Search the studies by keyword, getting each study's MGnify accession (MGYS...), name, abstract, biome, sample count and the source sequencing BioProject; read a study's full metadata including its name and abstract, biome classification, number of samples, submitting centre, public status, data origination and last-update date; and browse the GOLD-style biome classification tree — from root:Host-associated:Human:Digestive system to root:Environmental:Aquatic:Marine — with per-biome sample and study counts, for discovery by environment. Ideal for microbiome and environmental-genomics research, dataset reuse and meta-analysis, bioinformatics pipelines and teaching. Study accessions look like MGYS00006862. Data from EMBL-EBI MGnify.

api.oanor.com/mgnify-api

Cellosaurus API

Cellosaurus as an API, powered by the SIB Swiss Institute of Bioinformatics — the reference encyclopaedia of cell lines used in biomedical research. With more than 150,000 entries spanning cancer cell lines, hybridomas, induced pluripotent stem cells, and lines from hundreds of species, Cellosaurus is the authoritative catalogue researchers use to identify and validate the cell lines behind published experiments. Search the cell lines by name or keyword, getting each line's Cellosaurus accession (CVCL_…), name, category, species and disease; and read a cell line's full record — its name and synonyms, category (e.g. cancer cell line, hybridoma, stem cell), species with NCBI taxonomy id, sex, age, the disease it derives from with NCIt/ontology identifiers, the tissue or anatomical site of origin, its parent cell line and the number of derived child lines, the count of literature references and the many cross-references (to ATCC, DSMZ, ECACC, Wikidata and more), relevant web pages, and — critically for research reproducibility — whether the line is flagged PROBLEMATIC, meaning it has been misidentified or cross-contaminated, together with the explanatory notes. Ideal for laboratory quality control and cell-line authentication, biomedical and cancer research, data curation and reproducibility checks. Accessions look like CVCL_0030 (HeLa). Data from Cellosaurus (CC-BY 4.0).

api.oanor.com/cellosaurus-api

AlphaFold API

The AlphaFold Protein Structure Database as an API, powered by EMBL-EBI and Google DeepMind. AlphaFold predicts the three-dimensional structure of a protein from its amino-acid sequence, often with accuracy approaching that of experimental methods, and the database now covers over 200 million proteins, nearly every sequence in UniProt. Look up the AlphaFold model for any protein by its UniProt accession and get its gene and protein description, organism and sequence length, model version and creation date, the global confidence metric, the full amino-acid sequence, and direct download links to the predicted structure as mmCIF, PDB and BinaryCIF together with the Predicted Aligned Error (PAE) plot image and data; and read a protein's structural coverage: the AlphaFold predicted model(s) and any linked structures with their provider, model category, method and the UniProt residue range covered. Ideal for structural biology, drug discovery and target assessment, protein engineering, molecular visualisation and teaching. Proteins are identified by UniProt accession (for example P00520 or P38398). Data from the AlphaFold DB (CC-BY 4.0). For experimentally-determined 3D structures see the PDB API, for protein sequences and functional annotation the UniProt API, and for families & domains InterPro.

api.oanor.com/alphafold-api

Complex Portal API

The Complex Portal as an API, powered by EMBL-EBI — a manually curated, encyclopaedic database of stable macromolecular complexes: assemblies of two or more proteins (and sometimes nucleic acids, ligands or small molecules) that work together as a single functional unit, such as ribosomes, proteasomes, RNA and DNA polymerases, the spliceosome, respiratory-chain complexes and thousands more across many species. Search the complexes by keyword and optionally by organism, getting each complex's Complex Portal accession (CPX-…), name, organism, description and whether it is computationally predicted; read a complex's full curated record including its recommended and systematic names, synonyms, species, biological function, the participating subunits each with its molecule identifier (for example a UniProt accession) and stoichiometry, any associated ligands and diseases, the evidence type and cross-references to UniProt, Gene Ontology, Reactome, Wikidata and more; and pull just the subunit composition of a complex. Ideal for structural and systems biology, pathway and network analysis, protein-function research and bioinformatics pipelines. Complex accessions look like CPX-6036. Data from EMBL-EBI Complex Portal (IMEx consortium, CC-BY). For protein–protein interaction networks see the STRING API, for protein sequences UniProt, for biological pathways Reactome and for families & domains InterPro.

api.oanor.com/complexes-api

Rfam API

The Rfam database of non-coding RNA families as an API, powered by EMBL-EBI. Rfam groups functional RNAs that share a common evolutionary origin into families, each modelled by a covariance model built from a curated seed alignment and secondary structure. Search the families by name, description or RNA type — riboswitches and other cis-regulatory elements, ribozymes, microRNA families, ribosomal RNAs, transfer RNAs, small nuclear and small nucleolar RNAs, long non-coding RNAs and CRISPR direct repeats — getting each family's Rfam accession, name, description, RNA type and curators; read a family's full record including its description, RNA-type classification, the curators who built it, the number of sequences in its full and seed alignments, the structure source, the curator comment, the clan (group of related families) it belongs to and the Rfam release; and browse the families by RNA class. Ideal for RNA biology, bioinformatics pipelines, non-coding-RNA annotation, comparative genomics and teaching. Family accessions look like RF00005 (transfer RNA). Data from EMBL-EBI Rfam. For protein families and domains see the InterPro API, for protein sequences UniProt, for proteomics datasets PRIDE and for metabolomics MetaboLights.

api.oanor.com/rfam-api

MetaboLights API

MetaboLights as an API, powered by EMBL-EBI — the world's premier open repository for metabolomics experiments (NMR spectroscopy and mass spectrometry) and a sister resource to PRIDE for proteomics. Search the public metabolomics studies by keyword (returning each study's accession, title, description and organism); read a study's full metadata including its abstract, status, submission and release dates, study-design descriptors, experimental factors, the analytical assays with their measurement type, technology and platform, the contributors and their roles, the linked publications with DOI and PubMed identifiers, submitters, sample count, FTP download URL and data license; inspect the analytical workflow — every protocol with its name, type, description and parameters (sample collection, extraction, chromatography, NMR/MS spectroscopy, data transformation and metabolite identification); and list the organisms and organism parts studied with their ontology terms. Ideal for metabolomics and systems-biology research, dataset reuse and meta-analysis, bioinformatics pipelines and tools that integrate experimental evidence. Study accessions look like MTBLS1. Data from EMBL-EBI MetaboLights.

api.oanor.com/metabolights-api

PRIDE API

The PRIDE proteomics archive as an API, powered by the EMBL-EBI PRIDE Archive — the world's largest public repository of mass-spectrometry proteomics data and a founding member of ProteomeXchange. Search the public proteomics experiments by keyword (returning each project's accession, title, organisms, diseases and instruments); read a project's full metadata including its description, keywords, organisms and organism parts, mass-spectrometry instruments, software, the protein modifications identified, sample- and data-processing protocols, submitters, affiliations and the linked publication (DOI and PubMed); list a project's data files with their category, format, size and a direct download link; and explore facets — the diseases, organisms, instruments, experiment types, software and countries represented across matching projects — for discovery. Ideal for proteomics and systems-biology research, dataset reuse and meta-analysis, bioinformatics pipelines, and tools that integrate experimental evidence. Project accessions look like PXD000001. Data from EMBL-EBI.

api.oanor.com/pride-api

InterPro API

Protein families, domains and functional sites as an API, powered by the EBI InterPro database. InterPro classifies proteins into families and identifies the domains, repeats and important sites they contain, by combining the predictive signatures of many member databases (Pfam, SMART, PROSITE, CDD, PANTHER, SUPERFAMILY, NCBIfam and more) into a single integrated resource. Look up an InterPro entry — a family, domain, repeat, conserved/binding/active site or post-translational modification — with its description, Gene Ontology terms and the member-database signatures that define it; search entries by name and type; read a protein's metadata; and, most usefully, list the InterPro entries found on a protein together with their start–end positions, so you can see a protein's domain architecture. Ideal for protein annotation and function prediction, comparative genomics, structural-biology and bioinformatics pipelines, and research and teaching tools. Entry ids are IPR followed by six digits; protein ids are UniProt accessions. Data from EMBL-EBI.

api.oanor.com/interpro-api

Open Targets API

Drug target–disease associations as an API, powered by the Open Targets Platform. Open Targets integrates human genetics, genomics, transcriptomics, known drugs, animal models and the scientific literature to systematically score how strongly a target (gene/protein) is associated with a disease — the evidence that underpins modern drug discovery. Search across targets, diseases and drugs; read a target for its approved symbol, biotype, function, genomic location and UniProt ids together with the diseases it is most strongly associated with and their overall association scores; read a disease for its description, therapeutic areas and its top associated targets with scores; and read a drug for its modality, maximum clinical stage, trade names, synonyms and mechanisms of action. Ideal for drug-discovery and target-identification pipelines, therapeutic-area research, biomedical data science and pharma intelligence tools. Target ids are Ensembl gene ids, disease ids are EFO/MONDO/Orphanet ids, drug ids are ChEMBL ids. Data is open (CC0).

api.oanor.com/opentargets-api

KEGG API

The KEGG molecular database as an API, powered by the official KEGG REST service. KEGG (the Kyoto Encyclopedia of Genes and Genomes) links genomes, chemistry and disease. Fetch any KEGG entry parsed to JSON — a metabolic compound, KEGG Orthology group (KO), enzyme (EC number), reaction, module, drug, disease, glycan, gene or pathway map; search any KEGG database by name; list a database's entries; cross-link entries between databases (a gene to its pathways, a pathway to its compounds, an enzyme to its reactions); and convert KEGG identifiers to and from outside namespaces (NCBI Gene/Protein, UniProt, ChEBI, PubChem). Ideal for systems-biology and metabolomics pipelines, enzyme and orthology mapping, drug and disease research, gene-to-pathway annotation and bioinformatics id conversion. KEGG ids are letter-prefixed (C compound, K orthology, D drug, H disease, M module, R reaction, G glycan) or organism-coded (hsa human, eco E. coli).

api.oanor.com/kegg-api

gnomAD API

Population genetics as an API, powered by the Broad Institute's gnomAD (Genome Aggregation Database) — allele frequencies and gene constraint aggregated from over 800,000 human exomes and genomes. Look up a gene's constraint scores (pLI, LOEUF, observed vs expected loss-of-function, missense Z) and genomic location; get a variant's allele frequencies broken down by ancestry population (African/African-American, Admixed American, Ashkenazi Jewish, East Asian, Finnish, Non-Finnish European, South Asian, Middle Eastern…) across both genome and exome callsets, with rsIDs, homozygote counts and predicted consequence; search genes by symbol; read a transcript's constraint; and list the variants in a small genomic region. Supports GRCh38 and GRCh37 and the gnomAD v4/v3/v2 datasets. Ideal for clinical and population genetics, variant interpretation and prioritisation, rare-disease and pharmacogenomics research, and bioinformatics pipelines. Variant ids are chrom-pos-ref-alt.
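A minimal sketch of handling the chrom-pos-ref-alt variant id format mentioned above, plus the standard allele-frequency ratio (allele count divided by allele number). The helper names are hypothetical, the validation rules (positive integer position, alleles made of A/C/G/T) are assumptions, and the example id is illustrative only.

```python
from typing import NamedTuple, Optional


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant_id(variant_id: str) -> Variant:
    """Split a chrom-pos-ref-alt id into typed fields, validating each part."""
    parts = variant_id.strip().split("-")
    if len(parts) != 4:
        raise ValueError(f"expected chrom-pos-ref-alt, got {variant_id!r}")
    chrom, pos, ref, alt = parts
    if not chrom:
        raise ValueError("chromosome is empty")
    if not (pos.isascii() and pos.isdigit()) or int(pos) < 1:
        raise ValueError(f"position must be a positive integer, got {pos!r}")
    for allele in (ref, alt):
        # Assumption: alleles are plain A/C/G/T strings.
        if not allele or set(allele.upper()) - set("ACGT"):
            raise ValueError(f"invalid allele {allele!r}")
    return Variant(chrom, int(pos), ref.upper(), alt.upper())


def allele_frequency(allele_count: int, allele_number: int) -> Optional[float]:
    """AF = AC / AN; undefined (None) when no alleles were called."""
    if allele_number == 0:
        return None
    return allele_count / allele_number


if __name__ == "__main__":
    print(parse_variant_id("1-55051215-G-GA"))
    print(allele_frequency(5, 1000))
```

Parsing up front makes malformed ids fail locally with a clear message instead of producing an empty lookup.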

api.oanor.com/gnomad-api

STRING API

The STRING protein–protein interaction database as an API — the curated and predicted network of functional associations between proteins, powered by the official STRING API. Resolve gene or protein names to STRING identifiers with annotations; get a protein's top interaction partners with a combined confidence score and per-channel evidence (experimental, curated databases, co-expression, text-mining, gene fusion, neighbourhood and co-occurrence); build the interaction network among a set of proteins as scored edges; run functional enrichment of a gene set over Gene Ontology, KEGG, Reactome, Pfam, InterPro and more with p-values and false-discovery rates; and score homology between proteins. Covers 12,000+ organisms (default human, NCBI taxon 9606). Ideal for systems-biology and network-biology pipelines, gene-set and pathway analysis, drug-target and disease-gene research, and bioinformatics dashboards.
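To show what "a network as scored edges" can look like on the client side, here is a small sketch that turns an edge list into an undirected adjacency map and drops low-confidence edges. The tuple layout, the 0 to 1 score scale and the protein names are assumptions for illustration; 0.4 is used as the default cutoff because STRING conventionally labels that level medium confidence.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Edge = Tuple[str, str, float]  # (protein_a, protein_b, combined_score); assumed layout


def build_network(edges: Iterable[Edge], min_score: float = 0.4) -> Dict[str, Dict[str, float]]:
    """Build an undirected adjacency map from scored edges above a cutoff."""
    network: Dict[str, Dict[str, float]] = defaultdict(dict)
    for a, b, score in edges:
        if score >= min_score:
            # Store both directions so either protein can be used as the key.
            network[a][b] = score
            network[b][a] = score
    return dict(network)


if __name__ == "__main__":
    # Made-up edges, not real STRING data.
    demo_edges = [("P1", "P2", 0.95), ("P2", "P3", 0.62), ("P1", "P3", 0.21)]
    for protein, partners in build_network(demo_edges).items():
        print(protein, partners)
```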

api.oanor.com/string-api

Reactome API

The Reactome pathway knowledgebase as an API — the open, peer-reviewed database of biological pathways and reactions, powered by the official Reactome ContentService. Search the curated archive of pathways, reactions and molecules; read any entity by its Reactome stable id (a pathway, reaction, complex or protein: name, type, species, compartments, summary and disease flag); list the events (sub-pathways and reactions) contained in a pathway; list the molecules participating in a pathway or reaction with their reference identifiers; get the top-level pathways for any model organism; map a UniProt protein to the pathways it takes part in; and list the supported species. Covers human and 15+ model organisms across metabolism, signal transduction, cell cycle, immune system, disease and more. Ideal for systems-biology and bioinformatics pipelines, pathway-enrichment and drug-target tools, biomedical research apps, teaching resources and life-science chatbots.

api.oanor.com/reactome-api

PDB API

The RCSB Protein Data Bank as an API — 3D macromolecular structures of proteins, nucleic acids and complexes, powered by the official RCSB PDB data and search services. Fetch a structure entry by its 4-character PDB id for its title, experimental method (X-ray, cryo-EM, NMR), resolution, keywords, deposit and release dates, authors, primary citation and entity & assembly counts; run full-text search across the whole archive returning matching PDB ids and the total hit count; read a polymer entity for its protein or nucleic-acid name, one-letter sequence, length, source organism, chains and linked UniProt ids; read a biological assembly for its oligomeric state, symmetry and chain & atom counts; list the ligands bound in a structure with their component ids and names; and look up any chemical component (ligand) by code for its formula, weight, SMILES and InChIKey. Ideal for structural-biology and drug-discovery tools, molecular viewers, bioinformatics pipelines, education apps and research dashboards.

api.oanor.com/pdb-api

Ensembl API

The Ensembl genome database as an API, powered by the official Ensembl REST service from EMBL-EBI. Look up any gene by symbol or Ensembl stable id for its biotype, genomic location, strand, description and transcripts; resolve any feature (gene, transcript, exon) by stable id; pull external database cross-references; fetch sequence variants by rsID with their alleles, most-severe consequence, minor-allele frequency, clinical significance and genomic mappings; list the genes, transcripts, exons, variations or repeats overlapping any genomic region; retrieve genomic, cDNA, CDS or protein sequences by id; and read genome-assembly metadata including the karyotype and chromosome lengths. Covers human, mouse and 300+ vertebrate species. Ideal for bioinformatics pipelines, genome browsers and variant-annotation tools, genetics research apps, clinical-genomics dashboards and life-science chatbots.
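The region-overlap listing described above rests on a simple interval test. The sketch below shows that test for 1-based, inclusive coordinates (the convention Ensembl uses); the `Feature` type and the sample features are hypothetical and do not reflect the API's response shape.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Feature:
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def overlapping(features: Iterable[Feature], chrom: str, start: int, end: int) -> List[Feature]:
    """Return features on `chrom` that share at least one base with [start, end]."""
    if start > end:
        raise ValueError("region start must not exceed region end")
    return [
        f for f in features
        # Two closed intervals overlap when each starts before the other ends.
        if f.chrom == chrom and f.start <= end and start <= f.end
    ]


if __name__ == "__main__":
    # Made-up features for illustration.
    demo = [
        Feature("geneA", "7", 100, 200),
        Feature("geneB", "7", 201, 300),
        Feature("geneC", "8", 150, 250),
    ]
    print([f.name for f in overlapping(demo, "7", 150, 201)])
```

Because both ends are inclusive, a feature that ends exactly at the region start still counts as overlapping.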

api.oanor.com/ensembl-api

UniProt API

The UniProt protein knowledge base as an API, powered by the official UniProt REST service curated by EMBL-EBI, SIB and PIR. Look up any protein by its UniProt accession for protein and gene names, organism, length, mass, function, keywords, Gene Ontology (GO) terms and linked PDB 3D structures; run full-text protein searches filtered by organism (NCBI taxon id) and Swiss-Prot review status; fetch amino-acid sequences with FASTA, molecular weight and CRC64 checksum; list sequence features such as signal peptides, chains, domains, active and binding sites, modified residues and natural variants, with a by-type breakdown; resolve NCBI taxonomy nodes with their full lineage; and pull reference proteomes with protein counts and genome-assembly ids. Covers all kingdoms of life, from human to bacteria. Ideal for bioinformatics pipelines, drug-discovery and proteomics tools, sequence-analysis dashboards, academic research apps and life-science chatbots.
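For readers unfamiliar with the FASTA format mentioned above, this sketch formats a raw amino-acid string as a FASTA record. The 60-column line width is a common convention rather than something the listing specifies, and `to_fasta`, the header text and the sequence are all hypothetical.

```python
def to_fasta(header: str, sequence: str, width: int = 60) -> str:
    """Format a sequence as a FASTA record: '>' header line, then wrapped residues."""
    if width < 1:
        raise ValueError("width must be at least 1")
    # Drop whitespace and normalise case before wrapping.
    residues = "".join(sequence.split()).upper()
    lines = [residues[i:i + width] for i in range(0, len(residues), width)]
    return "\n".join([">" + header] + lines) + "\n"


if __name__ == "__main__":
    # Made-up header and sequence for illustration.
    print(to_fasta("demo|X00000|DEMO_PROTEIN", "mkt ayiakqr qisfvkshfs rqleerlglie vqapilsrvgd", width=20), end="")
```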

api.oanor.com/uniprot-api