#robots-txt

1 APIs con questa etichetta

robots.txt API

Fetch and evaluate any website's robots.txt. Pass a URL and a user-agent and the check endpoint tells you whether that URL is crawlable — selecting the most-specific user-agent group and applying the RFC 9309 longest-match Allow/Disallow rules (with * and $ wildcards, where Allow wins ties), and returning the matched rule, the group's crawl-delay and the sitemaps the site declares. The parse endpoint returns the whole file structured into per-user-agent groups (their allow and disallow lists and crawl-delay) plus the list of sitemaps. A missing robots.txt (404/403) means everything is allowed, exactly as the spec requires. The request is made server-side and private or internal targets are refused (SSRF-guarded). Built for SEO audits, crawler and scraper compliance, sitemap discovery and pre-flight "am I allowed to fetch this?" checks. A robots.txt evaluator — distinct from the on-page SEO audit (seo), the XML toolkit (xml) and link unfurling/preview (url). No upstream key, no cache.

api.oanor.com/robots-api

robots.txt API

Le tue scelte sui cookie