ToS;DR Crawler

The ToS;DR Crawler is important to the functionality of Phoenix.

By crawling a service, we ensure that its documents are mirrored and cannot be altered until the next crawl (verified using CRC).
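
As a rough illustration of the checksum idea, the sketch below records a checksum when a document is crawled and compares it later. It assumes CRC-32 via Python's zlib module; the post only says "CRC", so the exact variant is an assumption, and the function names are hypothetical.

```python
import zlib


def checksum(document: bytes) -> int:
    """Return the CRC-32 of a document body (assumed CRC variant)."""
    return zlib.crc32(document)


def is_unaltered(document: bytes, stored_crc: int) -> bool:
    """True when the mirrored copy still matches the checksum taken at crawl time."""
    return checksum(document) == stored_crc


# Made-up document text, used only for illustration.
original = b"Terms of Service, version 1"
stored = checksum(original)  # recorded when the crawl happens
```

Calling is_unaltered(original, stored) returns True, while any edited copy of the text fails the comparison.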

We do not index websites on our own. Every crawl is started manually by curators or staff on our site.

Identifying the ToS;DR Crawler

All ToS;DR Crawlers send an identifying user agent with every request.

Check for the following user agent:

ToSDRCrawler/1.0.0 (+https://to.tosdr.org/bot)
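
A site operator could check for this user agent in server-side code. The following is a minimal sketch, assuming only the product token before the slash stays stable while the version may change; the function name is hypothetical.

```python
# Product token taken from the user agent shown above, lowercased for comparison.
CRAWLER_TOKEN = "tosdrcrawler/"


def is_tosdr_crawler(user_agent: str) -> bool:
    """Return True if the User-Agent header starts with the ToS;DR crawler token."""
    return (user_agent or "").lower().startswith(CRAWLER_TOKEN)
```

Keep in mind that any client can send this header, so a user agent match alone does not prove a request came from ToS;DR.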

robots.txt

If you want to block the crawler, add the following directive to your robots.txt, replacing YOUR_PATH with the path to exclude:

User-Agent: TosDRCrawler
Disallow: YOUR_PATH
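
To see what such a rule blocks before deploying it, you can test it with Python's standard urllib.robotparser. This is only a sketch: "/terms" is a made-up stand-in for YOUR_PATH, example.com is a placeholder domain, and Python's parser is one interpretation of robots.txt, not necessarily the one the ToS;DR crawler uses.

```python
from urllib.robotparser import RobotFileParser

# "/terms" stands in for YOUR_PATH; it is a hypothetical example path.
ROBOTS_TXT = """\
User-Agent: TosDRCrawler
Disallow: /terms
"""

CRAWLER_UA = "ToSDRCrawler/1.0.0 (+https://to.tosdr.org/bot)"


def crawler_allowed(url: str, robots_txt: str = ROBOTS_TXT,
                    user_agent: str = CRAWLER_UA) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these rules, crawler_allowed("https://example.com/terms") is False, while other paths and other user agents remain allowed.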

Crawler Clusters

176.9.76.173
144.76.3.178
45.136.28.177
87.78.131.160
157.245.142.64
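
If you filter traffic by IP address, a check against the addresses listed above might look like the sketch below. The function name is hypothetical, and the list is copied from this post, so it needs updating whenever the published addresses change.

```python
import ipaddress

# Addresses copied from the Crawler Clusters list above.
CRAWLER_IPS = frozenset(
    ipaddress.ip_address(ip)
    for ip in (
        "176.9.76.173",
        "144.76.3.178",
        "45.136.28.177",
        "87.78.131.160",
        "157.245.142.64",
    )
)


def is_crawler_ip(remote_addr: str) -> bool:
    """Return True if remote_addr is one of the listed crawler addresses."""
    try:
        return ipaddress.ip_address(remote_addr) in CRAWLER_IPS
    except ValueError:
        # Not a valid IP address at all.
        return False
```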

Crawler problems

If you are the provider of the website, common causes of crawling issues are:

  • Cloudflare
  • robots.txt
  • iptables-based restrictions (see Crawler Clusters)
  • User-Agent based blocking

To fix these, add our server IP addresses or our user agent to the respective whitelist.
