What this tool controls
Robots.txt tells crawlers which paths they should avoid. Some AI and search products publish their own
user-agent names, so you can write separate rules for training, search, and user-requested fetches. A
robots.txt file is public guidance, not access control: anyone can read it, and crawlers follow it voluntarily.
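To make the per-crawler idea concrete, here is a minimal sketch that checks a small robots.txt with Python's standard-library parser. The crawler names (ExampleTrainBot, ExampleSearchBot, ExampleUserFetcher) and the paths are hypothetical placeholders, not real provider tokens.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical crawler tokens, one per purpose.
# Real names must come from each provider's documentation.
ROBOTS_TXT = """\
User-agent: ExampleTrainBot
Disallow: /

User-agent: ExampleSearchBot
Allow: /

User-agent: ExampleUserFetcher
Disallow: /drafts/

User-agent: *
Disallow: /account/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, url):
    """Return True if the rules above let `agent` fetch `url`."""
    return parser.can_fetch(agent, url)
```

With these rules, allowed("ExampleTrainBot", "https://example.com/blog/") is False and the same call for ExampleSearchBot is True. In this parser, a crawler that has its own group does not also pick up the rules in the * group. The standard-library parser uses simple prefix matching, so an individual crawler may interpret more complex patterns differently.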
Safer default
For most public sites, keep normal search crawlers open, block account or checkout paths, and decide
separately whether training crawlers can read public pages. The default preset follows that pattern.
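The pattern above can be sketched as a small generator. This is an illustration only, not the tool's actual preset: the function build_robots, its parameters, the paths, and the crawler name ExampleTrainBot are all assumptions.

```python
def build_robots(private_paths, training_agents, allow_training):
    """Return robots.txt text following the 'safer default' pattern.

    Every crawler may read public pages but not `private_paths`.
    If `allow_training` is False, each agent in `training_agents`
    (hypothetical names here) is blocked from the whole site.
    """
    lines = []
    if not allow_training:
        for agent in training_agents:
            lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines.append("User-agent: *")
    if private_paths:
        lines += [f"Disallow: {path}" for path in private_paths]
    else:
        # An empty Disallow value means nothing is blocked.
        lines.append("Disallow:")
    return "\n".join(lines) + "\n"


# Example: block account and checkout paths, opt out of training.
preset = build_robots(
    ["/account/", "/checkout/"],
    ["ExampleTrainBot"],
    allow_training=False,
)
```

Keeping the training decision as a separate flag mirrors the advice to decide it independently of the search and private-path rules.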
Before publishing
Crawler names and provider rules change. Check the official docs before replacing a production robots.txt
file, especially on publisher, SaaS, or ecommerce sites.
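One way to reduce risk before replacing a live file is to compare the current and proposed rules against a list of crawlers and URLs you care about. The sketch below does this with Python's standard-library parser. The helper name access_changes and the sample agents and paths are hypothetical, and the result reflects this parser's prefix matching rather than any specific crawler's behavior.

```python
from urllib.robotparser import RobotFileParser


def _parse(text):
    """Build a parser from robots.txt text without any network access."""
    rp = RobotFileParser()
    rp.parse(text.splitlines())
    return rp


def access_changes(old_text, new_text, agents, urls):
    """List (agent, url, before, after) where access differs between files."""
    old, new = _parse(old_text), _parse(new_text)
    changes = []
    for agent in agents:
        for url in urls:
            before = old.can_fetch(agent, url)
            after = new.can_fetch(agent, url)
            if before != after:
                changes.append((agent, url, before, after))
    return changes
```

An empty result means the proposed file changes nothing for the agents and URLs you listed. It does not confirm that the crawler names themselves are current.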
Official crawler references
Use these provider pages to verify crawler names and policy behavior before changing a production robots.txt file.