diff --git a/draft-illyes-aipref-cbcp.md b/draft-illyes-aipref-cbcp.md
index 29613e6..adeaf3a 100644
--- a/draft-illyes-aipref-cbcp.md
+++ b/draft-illyes-aipref-cbcp.md
@@ -59,6 +59,13 @@ identified and how their behavior can be influenced. Therefore, crawler
 operators are asked to follow the best practices for crawling outlined in this
 document.
 
+For the purposes of this document, a crawler is an automated
+HTTP {{HTTP-SEMANTICS}} client that retrieves resources across one or more web
+sites without direct human initiation of individual requests. A crawler
+discovers URIs during retrieval and schedules them for later processing. It
+relies on algorithmic prioritization and protocol-level instructions such as the
+Robots Exclusion Protocol {{REP}} to govern its behavior.
+
 To further assist website owners, it should also be considered to create a
 central registry where website owners can look up well-behaved crawlers. Note
 that while self-declared research crawlers, including privacy and malware
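The crawler definition added above can be illustrated with a minimal sketch: a frontier queue of URIs discovered during retrieval, a simple (here FIFO) prioritization scheme, and a Robots Exclusion Protocol check before each fetch. This is not part of the draft; the function names, the in-memory "web", and the `ExampleBot` user agent are hypothetical, and Python's stdlib `urllib.robotparser` stands in for a real REP implementation.

```python
# Illustrative crawler loop matching the definition in the diff above.
# All names here are hypothetical; the "web" is an in-memory dict so the
# sketch runs without a network.
from collections import deque
from urllib.robotparser import RobotFileParser

# A robots.txt policy, parsed from a string instead of fetched over HTTP.
robots = RobotFileParser()
robots.parse("User-agent: *\nDisallow: /private/".splitlines())

def crawl(seed_uris, fetch, extract_links, user_agent="ExampleBot"):
    """Breadth-first crawl: discover URIs during retrieval, schedule them
    for later processing, and honor protocol-level instructions (REP)."""
    frontier = deque(seed_uris)           # scheduling queue ("frontier")
    seen = set(seed_uris)
    fetched = []
    while frontier:
        uri = frontier.popleft()          # FIFO prioritization
        if not robots.can_fetch(user_agent, uri):
            continue                      # REP says no: skip this URI
        body = fetch(uri)                 # no human initiates this request
        fetched.append(uri)
        for link in extract_links(body):  # URI discovery during retrieval
            if link not in seen:
                seen.add(link)
                frontier.append(link)     # schedule for later processing
    return fetched

# Tiny in-memory site: each "body" is just its list of outgoing links.
site = {
    "https://example.com/": ["https://example.com/a",
                             "https://example.com/private/x"],
    "https://example.com/a": [],
    "https://example.com/private/x": [],
}
visited = crawl(["https://example.com/"], site.get, lambda body: body)
print(visited)  # the /private/ URI is discovered but never fetched
```

The design point the sketch makes is the one the added paragraph makes: scheduling and REP enforcement are properties of the client's control loop, not of any individual request.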