Free Robots.txt Generator: Control Crawlers and Block AI Scrapers
Your robots.txt controls which crawlers can access your site. Our free generator helps you allow search engines, block AI training scrapers like GPTBot and CCBot, and protect private paths.
Your robots.txt file is a set of instructions you leave for web crawlers — search engine bots, AI training scrapers, and other automated agents that visit your site. It sits at yourdomain.com/robots.txt and tells crawlers which parts of your site they can and cannot access. Getting it right protects your crawl budget, keeps sensitive paths out of search results, and — increasingly — blocks AI companies from scraping your content without permission.
Why robots.txt matters in 2025
Two years ago, most developers set User-agent: * Allow: / and forgot about it. In 2025, the calculus has changed. AI companies are aggressively crawling the web to build training datasets. OpenAI's GPTBot, Common Crawl's CCBot, and others consume vast amounts of bandwidth and content without compensation. Many website owners are now opting out.
Our Robots.txt Generator makes it easy to block specific AI crawlers while still allowing Google and any other search engines you want.
Understanding the format
# Allow all search engines
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /api/
# Block OpenAI's training crawler
User-agent: GPTBot
Disallow: /
Sitemap: https://yourdomain.com/sitemap.xml
Each User-agent block applies rules to a specific crawler. * is a wildcard that matches all crawlers. Specific user-agents (like GPTBot) override the wildcard for that bot.
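You can sanity-check these precedence rules locally with Python's standard-library robotparser. One caveat worth knowing: urllib.robotparser applies rules within a group in file order (first match wins), whereas Google picks the longest matching rule, so for this check the specific Disallow lines go before the broad Allow.

```python
from urllib.robotparser import RobotFileParser

# Rules mirroring the example above, reordered for first-match semantics.
rules = """\
User-agent: *
Disallow: /admin/
Allow: /

User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("Googlebot", "https://example.com/blog/"))   # True
print(parser.can_fetch("Googlebot", "https://example.com/admin/"))  # False
print(parser.can_fetch("GPTBot", "https://example.com/blog/"))      # False
```

Note how the GPTBot group overrides the wildcard for that bot: GPTBot is blocked everywhere even though the wildcard group allows most paths.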
What to always disallow
Regardless of which crawlers you allow, these paths should almost always be in your Disallow list:
- /admin/ — no benefit to having admin pages indexed
- /api/ — API endpoints are not useful in search results
- /accounts/ — login, signup, and account management pages
- Any staging or draft preview paths
How to serve robots.txt in Django
Rather than a static file (which is hard to version control and easy to forget), serve it as a Django view:
# views.py
from django.http import HttpResponse

def robots_txt(request):
    lines = [
        "User-agent: *",
        "Allow: /",
        "Disallow: /admin/",
        "Sitemap: https://yourdomain.com/sitemap.xml",
    ]
    return HttpResponse("\n".join(lines), content_type="text/plain")

# urls.py
from django.urls import path
from . import views

urlpatterns = [
    path('robots.txt', views.robots_txt, name='robots_txt'),
]
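A related refinement: on staging or preview hosts you usually want a deny-all file so half-finished pages never get indexed. Here is a minimal sketch, assuming you expose some production flag (shown as a plain boolean; in Django this would typically come from settings). Factoring the line-building out of the view also makes it easy to unit-test:

```python
def robots_lines(is_production):
    """Return robots.txt lines; deny everything outside production."""
    if not is_production:
        # Staging/preview host: block all crawlers outright.
        return ["User-agent: *", "Disallow: /"]
    return [
        "User-agent: *",
        "Allow: /",
        "Disallow: /admin/",
        "Sitemap: https://yourdomain.com/sitemap.xml",
    ]

print("\n".join(robots_lines(is_production=False)))
```

The view then becomes a one-liner that joins these lines into an HttpResponse, and the environment decision lives in one obvious place.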
How to use the generator
- Visit the Robots.txt Generator
- Choose your default crawler policy
- Select which AI bots to block (GPTBot, CCBot, etc.)
- Enter any paths you want to disallow
- Add your sitemap URL
- Copy the generated file and deploy it to your root domain
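The steps above can be sketched as a small builder function. This is a hypothetical illustration of the kind of file the generator produces, not its actual implementation; the function and parameter names are made up for the example:

```python
def build_robots_txt(blocked_bots, disallowed_paths, sitemap_url):
    """Assemble a robots.txt: default-allow, per-bot deny blocks, sitemap."""
    lines = ["User-agent: *", "Allow: /"]
    lines += [f"Disallow: {path}" for path in disallowed_paths]
    for bot in blocked_bots:
        # One dedicated group per AI crawler, denying everything.
        lines += ["", f"User-agent: {bot}", "Disallow: /"]
    lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"

print(build_robots_txt(
    blocked_bots=["GPTBot", "CCBot"],
    disallowed_paths=["/admin/", "/api/", "/accounts/"],
    sitemap_url="https://yourdomain.com/sitemap.xml",
))
```

Whatever tool you use, the output should follow this shape: a wildcard group first, then one group per bot you block, then the sitemap line.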
After deploying, check your file with the robots.txt report in Google Search Console (which replaced the old standalone robots.txt tester). It shows the version of the file Googlebot last fetched and flags any syntax problems.
Written by
Mubashar Iqbal
Web developer, SEO expert, and independent maker. I build products, write about what I've learned, and create free tools for developers and marketers.