
Free Robots.txt Generator: Control Crawlers and Block AI Scrapers

Your robots.txt controls which crawlers can access your site. Our free generator helps you allow search engines, block AI training scrapers like GPTBot and CCBot, and protect private paths.


mubashar

· 2 min read

Your robots.txt file is a set of instructions you leave for web crawlers — search engine bots, AI training scrapers, and other automated agents that visit your site. It sits at yourdomain.com/robots.txt and tells crawlers which parts of your site they can and cannot access. Getting it right protects your crawl budget, keeps sensitive paths out of search results, and — increasingly — blocks AI companies from scraping your content without permission.

Why robots.txt matters in 2025

Two years ago, most developers set User-agent: * Allow: / and forgot about it. In 2025, the calculus has changed. AI companies are aggressively crawling the web to build training datasets. OpenAI's GPTBot, Common Crawl's CCBot, and others consume vast amounts of bandwidth and content without compensation. Many website owners are now opting out.

Our Robots.txt Generator makes it easy to block specific AI crawlers while keeping Google and other search engines you want.

Understanding the format

# Allow all search engines
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /api/

# Block OpenAI's training crawler
User-agent: GPTBot
Disallow: /

Sitemap: https://yourdomain.com/sitemap.xml

Each User-agent block applies rules to a specific crawler. * is a wildcard that matches all crawlers. Specific user-agents (like GPTBot) override the wildcard for that bot.
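You can sanity-check how a parser applies these rules with Python's standard-library urllib.robotparser. A minimal sketch using the rules above — note that Python's parser matches rules first-to-last, so a blanket Allow: / placed before the Disallow lines would win; omitting it has the same effect, because paths with no matching rule are allowed by default:

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /admin/
Disallow: /api/

User-agent: GPTBot
Disallow: /
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Search engines fall under the * block: normal pages are allowed,
# the disallowed prefixes are not.
print(rp.can_fetch("Googlebot", "https://example.com/blog/post"))  # allowed
print(rp.can_fetch("Googlebot", "https://example.com/admin/settings"))  # blocked

# GPTBot matches its own, more specific block and is shut out entirely.
print(rp.can_fetch("GPTBot", "https://example.com/blog/post"))  # blocked
```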

What to always disallow

Regardless of which crawlers you allow, these paths should almost always be in your Disallow list:

  • /admin/ — no benefit to having admin pages indexed
  • /api/ — API endpoints are not useful in search results
  • /accounts/ — login, signup, and account management pages
  • Any staging or draft preview paths
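Put together under a default-allow policy, those bullets translate into a block like this (the paths are illustrative — /drafts/ stands in for whatever your staging or preview prefix actually is):

```
User-agent: *
Disallow: /admin/
Disallow: /api/
Disallow: /accounts/
Disallow: /drafts/
```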

How to serve robots.txt in Django

Rather than a static file (which is hard to version control and easy to forget), serve it as a Django view:

# views.py
from django.http import HttpResponse

def robots_txt(request):
    lines = [
        "User-agent: *",
        "Allow: /",
        "Disallow: /admin/",
        "Sitemap: https://yourdomain.com/sitemap.xml",
    ]
    return HttpResponse("\n".join(lines), content_type="text/plain")

# urls.py
from django.urls import path
from . import views

urlpatterns = [
    path("robots.txt", views.robots_txt, name="robots_txt"),
]
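To fold the AI-bot blocking into the same view, one approach is a small helper that assembles the file from configuration. A sketch — build_robots_txt and its parameters are illustrative names, not a Django API:

```python
def build_robots_txt(blocked_bots, disallowed_paths, sitemap_url):
    """Assemble robots.txt: a default policy, one full-block section
    per AI bot, and a sitemap line."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {path}" for path in disallowed_paths]
    for bot in blocked_bots:
        # Each blocked bot gets its own section with a total Disallow.
        lines += ["", f"User-agent: {bot}", "Disallow: /"]
    lines += ["", f"Sitemap: {sitemap_url}"]
    return "\n".join(lines) + "\n"

body = build_robots_txt(
    blocked_bots=["GPTBot", "CCBot"],  # the AI crawlers named above
    disallowed_paths=["/admin/", "/api/"],
    sitemap_url="https://yourdomain.com/sitemap.xml",
)
print(body)
```

In the view, the hand-built list then becomes return HttpResponse(build_robots_txt(...), content_type="text/plain"), and adding or removing a blocked bot is a one-line change.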

How to use the generator

  1. Visit the Robots.txt Generator
  2. Choose your default crawler policy
  3. Select which AI bots to block (GPTBot, CCBot, etc.)
  4. Enter any paths you want to disallow
  5. Add your sitemap URL
  6. Copy the generated file and deploy it to your root domain

Test your robots.txt after deploying. Google Search Console's robots.txt report (which replaced the old standalone tester) shows when Googlebot last fetched your file, whether it parsed successfully, and any errors it found.


Written by

Mubashar Iqbal

Web developer, SEO expert, and independent maker. I build products, write about what I've learned, and create free tools for developers and marketers.