
Commit 3213174

Merge pull request #13 from garyillyes/crawler
Add definition for "crawler"
2 parents cbf1014 + 63ab539 commit 3213174

1 file changed

Lines changed: 7 additions & 0 deletions

File tree

draft-illyes-aipref-cbcp.md

@@ -59,6 +59,13 @@ identified and how their behavior can be influenced. Therefore, crawler
 operators are asked to follow the best practices for crawling outlined in this
 document.
 
+For the purposes of this document, a crawler is an automated
+HTTP {{HTTP-SEMANTICS}} client that retrieves resources across one or more web
+sites without direct human initiation of individual requests. A crawler
+discovers URIs during retrieval and schedules them for later processing. It
+relies on algorithmic prioritization and protocol-level instructions such as the
+Robots Exclusion Protocol {{REP}} to govern its behavior.
+
 To further assist website owners, it should also be considered to create a
 central registry where website owners can look up well-behaved crawlers. Note
 that while self-declared research crawlers, including privacy and malware
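The added definition names three behaviors: URIs are discovered during retrieval, scheduled for later processing under algorithmic prioritization, and fetched only as permitted by the Robots Exclusion Protocol. A minimal sketch of that loop, using Python's standard-library robots.txt parser; the robots.txt rules, URLs, user-agent name, and priority scheme are illustrative assumptions, not part of the draft:

```python
import heapq
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the site being crawled.
robots = RobotFileParser()
robots.parse("User-agent: *\nDisallow: /private/".splitlines())

# Frontier of scheduled URIs: a min-heap, so lower priority numbers
# are fetched sooner (one simple form of algorithmic prioritization).
frontier = []

def schedule(uri, priority):
    """Schedule a discovered URI for later retrieval, honoring robots.txt."""
    if robots.can_fetch("ExampleBot", uri):
        heapq.heappush(frontier, (priority, uri))

schedule("https://example.com/", 0)
schedule("https://example.com/private/data", 0)  # excluded by robots.txt
schedule("https://example.com/blog/post", 5)

# Process the frontier in priority order instead of fetching on discovery.
while frontier:
    _, uri = heapq.heappop(frontier)
    print(uri)
```

In this sketch the disallowed URI never enters the frontier, so the loop prints only the homepage followed by the blog post.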
