#sitemap
2 APIs con questa etichetta
Sitemap API
Fetch and parse an XML sitemap (the sitemaps.org protocol). Pass a sitemap URL and the parse endpoint fetches it — following redirects and transparently gunzipping .gz sitemaps — and returns its type: a urlset with every URL and its lastmod, changefreq and priority, or a sitemapindex listing the child sitemaps, with offset/limit paging for large files. The urls endpoint goes further: when the sitemap is an index it fetches the child sitemaps too and flattens every page URL into a single list, with a configurable cap on URLs and child sitemaps and a truncated flag so you stay in control. The request is made server-side and private or internal targets are refused (SSRF-guarded). Built for SEO audits, building crawl queues and content inventories, change monitoring and migration checks. A sitemap fetcher and parser — distinct from generic XML-to-JSON conversion (xml), the robots.txt evaluator (robots) and the on-page SEO audit (seo). No upstream key, no cache.
api.oanor.com/sitemap-api
robots.txt API
Fetch and evaluate any website's robots.txt. Pass a URL and a user-agent and the check endpoint tells you whether that URL is crawlable — selecting the most-specific user-agent group and applying the RFC 9309 longest-match Allow/Disallow rules (with * and $ wildcards, where Allow wins ties), and returning the matched rule, the group's crawl-delay and the sitemaps the site declares. The parse endpoint returns the whole file structured into per-user-agent groups (their allow and disallow lists and crawl-delay) plus the list of sitemaps. A missing robots.txt (404/403) means everything is allowed, exactly as the spec requires. The request is made server-side and private or internal targets are refused (SSRF-guarded). Built for SEO audits, crawler and scraper compliance, sitemap discovery and pre-flight "am I allowed to fetch this?" checks. A robots.txt evaluator — distinct from the on-page SEO audit (seo), the XML toolkit (xml) and link unfurling/preview (url). No upstream key, no cache.
api.oanor.com/robots-api