#life-sciences

BioSamples API

BioSamples as an API, powered by EMBL-EBI — the database that stores and links the metadata of biological samples, the physical specimens behind biological experiments. A sample in BioSamples carries a stable accession (such as SAMEA3231268) and a rich set of characteristics — organism, tissue or organism part, cell type, sex, disease, developmental stage, strain and any submitter-provided attributes — and is referenced by other EBI archives including the European Nucleotide Archive (ENA), ArrayExpress and PRIDE. /v1/search?q=liver searches samples by free text and returns each match's accession, name, organism and release date. /v1/sample?id=SAMEA3231268 returns a sample's metadata — its accession, name, NCBI taxon id, organism, release and update dates, the number of relationships to other samples, and its characteristics flattened to a clean key→value map. Accessions look like SAMEA…, SAMN… or SAMD…; get one from the search endpoint. Ideal for life-science data integration, sample tracking, metadata harmonisation and linking sequencing or expression data back to its source specimen. Data from EMBL-EBI BioSamples (public). This is a biological-sample metadata registry — distinct from study (BioStudies), sequence (ENA), variant (ClinVar) and structure databases.

api.oanor.com/biosamples-api
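
The two BioSamples endpoints named above can be sketched as plain URL builders. This is a minimal sketch, not the service's client library: the `https://` scheme, the idea that the `/v1/...` paths hang directly off the listed base path, the helper names and the digits-only accession pattern are all assumptions made for illustration.

```python
import re
from urllib.parse import urlencode

# Assumption: the listing gives the base path without a scheme; https is assumed.
BASE_URL = "https://api.oanor.com/biosamples-api"

# Accession prefixes named in the listing: SAMEA…, SAMN…, SAMD…
# Assumption: the prefix is followed by digits only.
ACCESSION_RE = re.compile(r"SAM(EA|N|D)[0-9]+")


def search_url(text: str) -> str:
    """Build the free-text search URL (/v1/search?q=...)."""
    return f"{BASE_URL}/v1/search?{urlencode({'q': text})}"


def sample_url(accession: str) -> str:
    """Build the single-sample URL (/v1/sample?id=...) after a basic format check."""
    if not ACCESSION_RE.fullmatch(accession):
        raise ValueError(f"not a BioSamples-style accession: {accession!r}")
    return f"{BASE_URL}/v1/sample?{urlencode({'id': accession})}"


print(search_url("liver"))
print(sample_url("SAMEA3231268"))
```

Checking the accession locally before building the URL avoids a wasted request when an identifier from another archive is passed in by mistake.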

BioStudies API

BioStudies as an API, powered by EMBL-EBI — the database that holds the descriptions of biological studies and links their data together across EBI resources, including imaging (BioImage Archive), functional genomics (ArrayExpress), proteomics, and the literature (Europe PMC). Each study has an accession, a title and abstract, the collection it belongs to and links to its underlying data and publications. /v1/search?query=covid searches the studies and returns each match's accession (e.g. S-EPMC8017430), title, author, study type, release date and link/file counts. /v1/study?id=S-EPMC8017430 returns a study's metadata — its accession, the collection it belongs to (such as EuropePMC, ArrayExpress or BioImages), title, abstract, release date, authors and the number of linked resources. Accessions look like S-EPMC8017430 or S-BSST123; get one from the search endpoint. Ideal for research-data discovery, linking literature to its underlying datasets, systematic reviews and reproducibility tooling. Data from EMBL-EBI BioStudies (public). This is a studies and datasets metadata index — distinct from the sequence (UniProt, ENA), structure (PDB, EMDB), variant (ClinVar) and ontology databases.

api.oanor.com/biostudies-api
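
A similar sketch for the BioStudies endpoints, mainly to show that the search parameter here is `query`, as written in the description above. The `https://` scheme and the helper names are assumptions; no accession check is attempted because the listing only gives two example shapes.

```python
from urllib.parse import urlencode

# Assumption: the listing gives the base path without a scheme; https is assumed.
BASE_URL = "https://api.oanor.com/biostudies-api"


def search_url(query: str) -> str:
    """Build the study search URL (/v1/search?query=...)."""
    return f"{BASE_URL}/v1/search?{urlencode({'query': query})}"


def study_url(accession: str) -> str:
    """Build the single-study URL (/v1/study?id=...)."""
    return f"{BASE_URL}/v1/study?{urlencode({'id': accession})}"


print(search_url("covid"))
print(study_url("S-EPMC8017430"))
```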

ENA API

The European Nucleotide Archive (ENA) as an API, powered by EMBL-EBI — one of the three INSDC partners alongside NCBI GenBank and DDBJ, and the comprehensive public archive of the world's nucleotide sequence data. ENA holds raw sequencing reads, assembled and annotated genomes, individual sequences, biological samples and the studies behind them, for every domain of life — the backbone resource for genomics, microbiology, ecology, evolution and clinical research. This API gives a clean three-step workflow over that archive. First, /v1/taxon resolves an organism name (e.g. "Homo sapiens") to its NCBI taxon id, scientific name, taxonomic rank and full lineage — or looks a taxon up directly by id. Then /v1/search queries the archive for that taxon's records of a chosen type: genome assemblies (with assembly name, level and base count), sequencing runs (with platform, instrument and read counts), biological samples (with collection date and country), annotated sequences, read experiments, analyses, coding and non-coding sequences, and studies — by default including all descendant taxa, or restricted to the exact taxon. Finally /v1/record returns a summary for any ENA accession — assemblies (GCA_…), studies and projects (PRJ…), samples (SAM…/ERS…), sequencing runs (ERR…/SRR…) and sequences — with its title, data type, taxon, scientific name, base and sequence counts and public status. Ideal for bioinformatics pipelines, genome-data discovery, sequencing-metadata harvesting, biodiversity and metagenomics tooling, and research reproducibility. Taxon ids look like 9606 (human); accessions like GCA_000001405. Data from EMBL-EBI ENA, an INSDC archive, free to use.

api.oanor.com/ena-api
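
Before calling /v1/record it can help to know what kind of record an accession refers to. The sketch below routes an accession by the prefixes listed in the description above; the function name and the returned labels are hypothetical, and accessions with no listed prefix (such as individual sequences) are reported as unknown rather than guessed.

```python
# Prefix to record kind, taken from the accession families named in the listing.
PREFIX_KINDS = [
    ("GCA_", "assembly"),
    ("PRJ", "study/project"),
    ("SAM", "sample"),
    ("ERS", "sample"),
    ("ERR", "sequencing run"),
    ("SRR", "sequencing run"),
]


def classify_accession(accession: str) -> str:
    """Return the record kind implied by the accession prefix, or 'unknown'."""
    acc = accession.strip().upper()
    for prefix, kind in PREFIX_KINDS:
        if acc.startswith(prefix):
            return kind
    # The listing gives no prefix for plain sequence accessions.
    return "unknown"


for example in ("GCA_000001405", "SAMEA3231268", "ERR000001"):
    print(example, "->", classify_accession(example))
```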

MGnify API

MGnify as an API, powered by EMBL-EBI — the world's largest free resource for the analysis and archiving of microbiome sequencing data, and the metagenomics sister to PRIDE (proteomics) and MetaboLights (metabolomics). MGnify holds tens of thousands of public metagenomics and metabarcoding studies spanning the human gut microbiome, marine and freshwater environments, soils, wastewater, the built environment and host-associated communities. Search the studies by keyword, getting each study's MGnify accession (MGYS...), name, abstract, biome, sample count and the source sequencing BioProject; read a study's full metadata including its name and abstract, biome classification, number of samples, submitting centre, public status, data origination and last-update date; and browse the GOLD-style biome classification tree — from root:Host-associated:Human:Digestive system to root:Environmental:Aquatic:Marine — with per-biome sample and study counts, for discovery by environment. Ideal for microbiome and environmental-genomics research, dataset reuse and meta-analysis, bioinformatics pipelines and teaching. Study accessions look like MGYS00006862. Data from EMBL-EBI MGnify.

api.oanor.com/mgnify-api
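
The biome lineages quoted above are colon-separated paths starting at `root`, so discovery by environment reduces to simple path handling. A minimal sketch, with hypothetical helper names and the assumption that individual biome names never contain a colon:

```python
from typing import List


def parse_biome(lineage: str) -> List[str]:
    """Split a lineage such as 'root:Environmental:Aquatic:Marine' into its levels."""
    parts = lineage.split(":")
    if parts[0] != "root":
        raise ValueError(f"lineage does not start at root: {lineage!r}")
    return parts[1:]


def is_within(lineage: str, ancestor: str) -> bool:
    """True if `lineage` equals `ancestor` or lies beneath it in the tree."""
    return lineage == ancestor or lineage.startswith(ancestor + ":")


gut = "root:Host-associated:Human:Digestive system"
print(parse_biome(gut))
print(is_within(gut, "root:Host-associated"))
```

Comparing against `ancestor + ":"` rather than the bare prefix keeps a biome from matching a sibling whose name merely starts with the same characters.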

Cellosaurus API

Cellosaurus as an API, powered by the SIB Swiss Institute of Bioinformatics — the reference encyclopaedia of cell lines used in biomedical research. With more than 150,000 entries spanning cancer cell lines, hybridomas, induced pluripotent stem cells, and lines from hundreds of species, Cellosaurus is the authoritative catalogue researchers use to identify and validate the cell lines behind published experiments. Search the cell lines by name or keyword, getting each line's Cellosaurus accession (CVCL_…), name, category, species and disease; and read a cell line's full record — its name and synonyms, category (e.g. cancer cell line, hybridoma, stem cell), species with NCBI taxonomy id, sex, age, the disease it derives from with NCIt/ontology identifiers, the tissue or anatomical site of origin, its parent cell line and the number of derived child lines, the count of literature references and the many cross-references (to ATCC, DSMZ, ECACC, Wikidata and more), relevant web pages, and — critically for research reproducibility — whether the line is flagged PROBLEMATIC, meaning it has been misidentified or cross-contaminated, together with the explanatory notes. Ideal for laboratory quality control and cell-line authentication, biomedical and cancer research, data curation and reproducibility checks. Accessions look like CVCL_0030 (HeLa). Data from Cellosaurus (CC-BY 4.0).

api.oanor.com/cellosaurus-api

AlphaFold API

The AlphaFold Protein Structure Database as an API, powered by EMBL-EBI and Google DeepMind. AlphaFold predicts the three-dimensional structure of a protein from its amino-acid sequence, in many cases with accuracy competitive with experimental methods, and the database now covers over 200 million proteins, nearly every sequence in UniProt. Look up the AlphaFold model for any protein by its UniProt accession and get its gene and protein description, organism and sequence length, model version and creation date, the global confidence metric, the full amino-acid sequence, and direct download links to the predicted structure as mmCIF, PDB and BinaryCIF together with the Predicted Aligned Error (PAE) plot image and data; and read a protein's structural coverage: the AlphaFold predicted model(s) and any linked structures with their provider, model category, method and the UniProt residue range covered. Ideal for structural biology, drug discovery and target assessment, protein engineering, molecular visualisation and teaching. Proteins are identified by UniProt accession (for example P00520 or P38398). Data from the AlphaFold DB (CC-BY 4.0). For experimentally determined 3D structures see the PDB API, for protein sequences and functional annotation the UniProt API, and for families & domains InterPro.

api.oanor.com/alphafold-api

Complex Portal API

The Complex Portal as an API, powered by EMBL-EBI — a manually curated, encyclopaedic database of stable macromolecular complexes: assemblies of two or more proteins (and sometimes nucleic acids, ligands or small molecules) that work together as a single functional unit, such as ribosomes, proteasomes, RNA and DNA polymerases, the spliceosome, respiratory-chain complexes and thousands more across many species. Search the complexes by keyword and optionally by organism, getting each complex's Complex Portal accession (CPX-…), name, organism, description and whether it is computationally predicted; read a complex's full curated record including its recommended and systematic names, synonyms, species, biological function, the participating subunits each with its molecule identifier (for example a UniProt accession) and stoichiometry, any associated ligands and diseases, the evidence type and cross-references to UniProt, Gene Ontology, Reactome, Wikidata and more; and pull just the subunit composition of a complex. Ideal for structural and systems biology, pathway and network analysis, protein-function research and bioinformatics pipelines. Complex accessions look like CPX-6036. Data from EMBL-EBI Complex Portal (IMEx consortium, CC-BY). For protein–protein interaction networks see the STRING API, for protein sequences UniProt, for biological pathways Reactome and for families & domains InterPro.

api.oanor.com/complexes-api

Rfam API

The Rfam database of non-coding RNA families as an API, powered by EMBL-EBI. Rfam groups functional RNAs that share a common evolutionary origin into families, each modelled by a covariance model built from a curated seed alignment and secondary structure. Search the families by name, description or RNA type — riboswitches and other cis-regulatory elements, ribozymes, microRNA families, ribosomal RNAs, transfer RNAs, small nuclear and small nucleolar RNAs, long non-coding RNAs and CRISPR direct repeats — getting each family's Rfam accession, name, description, RNA type and curators; read a family's full record including its description, RNA-type classification, the curators who built it, the number of sequences in its full and seed alignments, the structure source, the curator comment, the clan (group of related families) it belongs to and the Rfam release; and browse the families by RNA class. Ideal for RNA biology, bioinformatics pipelines, non-coding-RNA annotation, comparative genomics and teaching. Family accessions look like RF00005 (transfer RNA). Data from EMBL-EBI Rfam. For protein families and domains see the InterPro API, for protein sequences UniProt, for proteomics datasets PRIDE and for metabolomics MetaboLights.

api.oanor.com/rfam-api

MetaboLights API

MetaboLights as an API, powered by EMBL-EBI — the world's premier open repository for metabolomics experiments (NMR spectroscopy and mass spectrometry) and a sister resource to PRIDE for proteomics. Search the public metabolomics studies by keyword (returning each study's accession, title, description and organism); read a study's full metadata including its abstract, status, submission and release dates, study-design descriptors, experimental factors, the analytical assays with their measurement type, technology and platform, the contributors and their roles, the linked publications with DOI and PubMed identifiers, submitters, sample count, FTP download URL and data license; inspect the analytical workflow — every protocol with its name, type, description and parameters (sample collection, extraction, chromatography, NMR/MS spectroscopy, data transformation and metabolite identification); and list the organisms and organism parts studied with their ontology terms. Ideal for metabolomics and systems-biology research, dataset reuse and meta-analysis, bioinformatics pipelines and tools that integrate experimental evidence. Study accessions look like MTBLS1. Data from EMBL-EBI MetaboLights.

api.oanor.com/metabolights-api

Europe PMC API

Europe PMC as an API, powered by EMBL-EBI — an open repository of biomedical and life-sciences literature covering 45 million+ abstracts and 9 million+ full-text articles drawn from PubMed, PubMed Central, preprint servers (bioRxiv and medRxiv), patents and Agricola. Search the literature with rich field syntax (by author, title, journal, MeSH term, publication year or open-access status), ordering results by relevance, date or citation count, and optionally restricting to preprints only; read an article's full metadata and abstract — its authors, journal, volume and pages, DOI, PubMed and PMC identifiers, MeSH terms, keywords, funding grants and links to the free full text; and walk the citation network in both directions: the articles that cite a given paper, and the works that paper itself references. Together these let you measure scholarly impact, build citation graphs, track a research topic across preprints and peer-reviewed papers, and feed evidence into bibliometric, systematic-review and research-intelligence tools. Article identifiers are PubMed ids (numeric), PMC ids (PMC…) or preprint ids (PPR…); the source defaults to PubMed (MED). Data from EMBL-EBI Europe PMC.

api.oanor.com/europepmc-api
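
The three identifier shapes described above (numeric PubMed ids, PMC…, PPR…) can be told apart with a small classifier before a lookup is made. This is a sketch only: the function name and returned labels are hypothetical, and it assumes the PMC and PPR prefixes are followed by digits.

```python
import re


def classify_article_id(article_id: str) -> str:
    """Label an identifier as 'pubmed', 'pmc' or 'preprint' by its shape."""
    value = article_id.strip()
    if re.fullmatch(r"[0-9]+", value):
        return "pubmed"      # numeric ids are PubMed ids (the default source, MED)
    if re.fullmatch(r"PMC[0-9]+", value):
        return "pmc"         # PubMed Central ids
    if re.fullmatch(r"PPR[0-9]+", value):
        return "preprint"    # preprint ids
    raise ValueError(f"unrecognised article identifier: {article_id!r}")


for example in ("12345678", "PMC1234567", "PPR123456"):
    print(example, "->", classify_article_id(example))
```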

PRIDE API

The PRIDE proteomics archive as an API, powered by the EMBL-EBI PRIDE Archive — the world's largest public repository of mass-spectrometry proteomics data and a founding member of ProteomeXchange. Search the public proteomics experiments by keyword (returning each project's accession, title, organisms, diseases and instruments); read a project's full metadata including its description, keywords, organisms and organism parts, mass-spectrometry instruments, software, the protein modifications identified, sample- and data-processing protocols, submitters, affiliations and the linked publication (DOI and PubMed); list a project's data files with their category, format, size and a direct download link; and explore facets — the diseases, organisms, instruments, experiment types, software and countries represented across matching projects — for discovery. Ideal for proteomics and systems-biology research, dataset reuse and meta-analysis, bioinformatics pipelines, and tools that integrate experimental evidence. Project accessions look like PXD000001. Data from EMBL-EBI.

api.oanor.com/pride-api

InterPro API

Protein families, domains and functional sites as an API, powered by the EBI InterPro database. InterPro classifies proteins into families and identifies the domains, repeats and important sites they contain, by combining the predictive signatures of many member databases (Pfam, SMART, PROSITE, CDD, PANTHER, SUPERFAMILY, NCBIfam and more) into a single integrated resource. Look up an InterPro entry — a family, domain, repeat, conserved/binding/active site or post-translational modification — with its description, Gene Ontology terms and the member-database signatures that define it; search entries by name and type; read a protein's metadata; and, most usefully, list the InterPro entries found on a protein together with their start–end positions, so you can see a protein's domain architecture. Ideal for protein annotation and function prediction, comparative genomics, structural-biology and bioinformatics pipelines, and research and teaching tools. Entry ids are IPR followed by six digits; protein ids are UniProt accessions. Data from EMBL-EBI.

api.oanor.com/interpro-api

Open Targets API

Drug target–disease associations as an API, powered by the Open Targets Platform. Open Targets integrates human genetics, genomics, transcriptomics, known drugs, animal models and the scientific literature to systematically score how strongly a target (gene/protein) is associated with a disease — the evidence that underpins modern drug discovery. Search across targets, diseases and drugs; read a target for its approved symbol, biotype, function, genomic location and UniProt ids together with the diseases it is most strongly associated with and their overall association scores; read a disease for its description, therapeutic areas and its top associated targets with scores; and read a drug for its modality, maximum clinical stage, trade names, synonyms and mechanisms of action. Ideal for drug-discovery and target-identification pipelines, therapeutic-area research, biomedical data science and pharma intelligence tools. Target ids are Ensembl gene ids, disease ids are EFO/MONDO/Orphanet ids, drug ids are ChEMBL ids. Data is open (CC0).

api.oanor.com/opentargets-api

gnomAD API

Population genetics as an API, powered by the Broad Institute's gnomAD (Genome Aggregation Database) — allele frequencies and gene constraint aggregated from over 800,000 human exomes and genomes. Look up a gene's constraint scores (pLI, LOEUF, observed vs expected loss-of-function, missense Z) and genomic location; get a variant's allele frequencies broken down by ancestry population (African/African-American, Admixed American, Ashkenazi Jewish, East Asian, Finnish, Non-Finnish European, South Asian, Middle Eastern…) across both genome and exome callsets, with rsIDs, homozygote counts and predicted consequence; search genes by symbol; read a transcript's constraint; and list the variants in a small genomic region. Supports GRCh38 and GRCh37 and the gnomAD v4/v3/v2 datasets. Ideal for clinical and population genetics, variant interpretation and prioritisation, rare-disease and pharmacogenomics research, and bioinformatics pipelines. Variant ids are chrom-pos-ref-alt.

api.oanor.com/gnomad-api
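
Variant ids in the chrom-pos-ref-alt form described above split cleanly on hyphens. The sketch below parses one into typed fields; the `Variant` type, the function name and the restriction of alleles to A/C/G/T are assumptions for illustration, and the example id is only meant to show the shape.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant_id(variant_id: str) -> Variant:
    """Parse 'chrom-pos-ref-alt' into a Variant, raising ValueError if malformed."""
    parts = variant_id.strip().split("-")
    if len(parts) != 4:
        raise ValueError(f"expected chrom-pos-ref-alt, got {variant_id!r}")
    chrom, pos, ref, alt = parts
    if not chrom or not re.fullmatch(r"[0-9]+", pos):
        raise ValueError(f"bad chromosome or position in {variant_id!r}")
    # Assumption: alleles are written as upper-case A/C/G/T strings.
    if not re.fullmatch(r"[ACGT]+", ref) or not re.fullmatch(r"[ACGT]+", alt):
        raise ValueError(f"bad allele in {variant_id!r}")
    return Variant(chrom, int(pos), ref, alt)


print(parse_variant_id("1-55051215-G-GA"))
```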

STRING API

The STRING protein–protein interaction database as an API — the curated and predicted network of functional associations between proteins, powered by the official STRING API. Resolve gene or protein names to STRING identifiers with annotations; get a protein's top interaction partners with a combined confidence score and per-channel evidence (experimental, curated databases, co-expression, text-mining, gene fusion, neighbourhood and co-occurrence); build the interaction network among a set of proteins as scored edges; run functional enrichment of a gene set over Gene Ontology, KEGG, Reactome, Pfam, InterPro and more with p-values and false-discovery rates; and score homology between proteins. Covers 12,000+ organisms (default human, NCBI taxon 9606). Ideal for systems-biology and network-biology pipelines, gene-set and pathway analysis, drug-target and disease-gene research, and bioinformatics dashboards.

api.oanor.com/string-api

ChEMBL API

The ChEMBL database of bioactive molecules as an API — the EBI's manually curated knowledgebase of drug-like compounds and their biological activity, powered by the official ChEMBL data API. Look up a compound by its ChEMBL id for its development phase, chemical structure (SMILES, InChIKey), molecular formula and weight, calculated properties (ALogP, polar surface area, hydrogen-bond donors/acceptors, Rule-of-Five violations, QED drug-likeness) and synonyms; search compounds by name; read a biological target with its organism and UniProt protein components; list a drug's mechanisms of action; list its approved and investigational indications (MeSH and EFO terms with development phase); and pull its measured bioactivities (IC50, Ki, EC50, potency…) with values, units, pChEMBL scores, assays and targets. Ideal for drug-discovery and cheminformatics pipelines, medicinal-chemistry and pharmacology tools, target-identification and SAR research, and life-science apps.

api.oanor.com/chembl-api

Reactome API

The Reactome pathway knowledgebase as an API — the open, peer-reviewed database of biological pathways and reactions, powered by the official Reactome ContentService. Search the curated archive of pathways, reactions and molecules; read any entity by its Reactome stable id (a pathway, reaction, complex or protein: name, type, species, compartments, summary and disease flag); list the events (sub-pathways and reactions) contained in a pathway; list the molecules participating in a pathway or reaction with their reference identifiers; get the top-level pathways for any model organism; map a UniProt protein to the pathways it takes part in; and list the supported species. Covers human and 15+ model organisms across metabolism, signal transduction, cell cycle, immune system, disease and more. Ideal for systems-biology and bioinformatics pipelines, pathway-enrichment and drug-target tools, biomedical research apps, teaching resources and life-science chatbots.

api.oanor.com/reactome-api

Ensembl API

The Ensembl genome database as an API, powered by the official Ensembl REST service from EMBL-EBI. Look up any gene by symbol or Ensembl stable id for its biotype, genomic location, strand, description and transcripts; resolve any feature (gene, transcript, exon) by stable id; pull external database cross-references; fetch sequence variants by rsID with their alleles, most-severe consequence, minor-allele frequency, clinical significance and genomic mappings; list the genes, transcripts, exons, variations or repeats overlapping any genomic region; retrieve genomic, cDNA, CDS or protein sequences by id; and read genome-assembly metadata including the karyotype and chromosome lengths. Across human, mouse and 300+ vertebrate species. Ideal for bioinformatics pipelines, genome browsers and variant-annotation tools, genetics research apps, clinical-genomics dashboards and life-science chatbots.

api.oanor.com/ensembl-api

UniProt API

The UniProt protein knowledge base as an API, powered by the official UniProt REST service curated by EMBL-EBI, SIB and PIR. Look up any protein by its UniProt accession for protein and gene names, organism, length, mass, function, keywords, Gene Ontology (GO) terms and linked PDB 3D structures; run full-text protein searches filtered by organism (NCBI taxon id) and Swiss-Prot review status; fetch amino-acid sequences with FASTA, molecular weight and CRC64 checksum; list sequence features such as signal peptides, chains, domains, active and binding sites, modified residues and natural variants, with a by-type breakdown; resolve NCBI taxonomy nodes with their full lineage; and pull reference proteomes with protein counts and genome-assembly ids. Across all kingdoms of life, from human to bacteria. Ideal for bioinformatics pipelines, drug-discovery and proteomics tools, sequence-analysis dashboards, academic research apps and life-science chatbots.

api.oanor.com/uniprot-api