7 changes: 7 additions & 0 deletions draft-illyes-aipref-cbcp.md
@@ -59,6 +59,13 @@ identified and how their behavior can be influenced. Therefore, crawler
operators are asked to follow the best practices for crawling outlined in this
document.

For the purposes of this document, a crawler is an automated
HTTP {{HTTP-SEMANTICS}} client that retrieves resources across one or more web
sites without direct human initiation of individual requests. A crawler
discovers URIs during retrieval and schedules them for later processing. It
relies on algorithmic prioritization and protocol-level instructions such as the
Robots Exclusion Protocol {{REP}} to govern its behavior.
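The lifecycle described above — retrieval, URI discovery, scheduling, and REP-governed fetching — can be sketched in a few lines. This is a minimal illustration, not normative text from the draft; the `fetch` callback, the `"ExampleBot"` user agent, and the in-memory `robots_lines` parameter are all hypothetical names chosen so the sketch needs no network access.

```python
from collections import deque
from urllib.robotparser import RobotFileParser

def crawl(seed_uris, robots_lines, fetch, max_pages=100):
    """Breadth-first crawl that skips URIs disallowed by robots.txt.

    fetch(uri) is assumed to return (body, discovered_uris).
    robots_lines is the robots.txt content as a list of strings.
    """
    rp = RobotFileParser()
    rp.parse(robots_lines)           # protocol-level instructions ({{REP}})
    frontier = deque(seed_uris)      # schedule of URIs awaiting retrieval
    seen = set(seed_uris)
    fetched = []
    while frontier and len(fetched) < max_pages:
        uri = frontier.popleft()
        if not rp.can_fetch("ExampleBot", uri):  # REP governs behavior
            continue
        body, links = fetch(uri)                 # retrieval
        fetched.append(uri)
        for link in links:                       # URI discovery
            if link not in seen:
                seen.add(link)
                frontier.append(link)            # schedule for later
    return fetched
```

The frontier queue is what distinguishes a crawler from a client driven by direct human initiation: each request is scheduled algorithmically from URIs discovered in earlier responses, with the REP check applied before every fetch.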

To further assist website owners, the creation of a central registry where
website owners can look up well-behaved crawlers should also be considered. Note
that while self-declared research crawlers, including privacy and malware