
Commit 3213174

Merge pull request #13 from garyillyes/crawler
Add definition for "crawler"
2 parents cbf1014 + 63ab539 commit 3213174

1 file changed

Lines changed: 7 additions & 0 deletions

File tree

draft-illyes-aipref-cbcp.md

@@ -59,6 +59,13 @@ identified and how their behavior can be influenced. Therefore, crawler
 operators are asked to follow the best practices for crawling outlined in this
 document.
 
+For the purposes of this document, a crawler is an automated
+HTTP {{HTTP-SEMANTICS}} client that retrieves resources across one or more web
+sites without direct human initiation of individual requests. A crawler
+discovers URIs during retrieval and schedules them for later processing. It
+relies on algorithmic prioritization and protocol-level instructions such as the
+Robots Exclusion Protocol {{REP}} to govern its behavior.
+
 To further assist website owners, it should also be considered to create a
 central registry where website owners can look up well-behaved crawlers. Note
 that while self-declared research crawlers, including privacy and malware
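The added definition names three behaviors: URIs are discovered during retrieval, scheduled for later processing under algorithmic prioritization, and fetched only as permitted by the Robots Exclusion Protocol. A minimal sketch of that loop, using Python's standard-library robots.txt parser; the robots.txt rules, URLs, user-agent name, and priority scheme are illustrative assumptions, not part of the draft:

```python
import heapq
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the site being crawled.
robots = RobotFileParser()
robots.parse("User-agent: *\nDisallow: /private/".splitlines())

# Frontier of scheduled URIs: a min-heap, so lower priority numbers
# are fetched sooner (one simple form of algorithmic prioritization).
frontier = []

def schedule(uri, priority):
    """Schedule a discovered URI for later retrieval, honoring robots.txt."""
    if robots.can_fetch("ExampleBot", uri):
        heapq.heappush(frontier, (priority, uri))

schedule("https://example.com/", 0)
schedule("https://example.com/private/data", 0)  # excluded by robots.txt
schedule("https://example.com/blog/post", 5)

# Process the frontier in priority order instead of fetching on discovery.
while frontier:
    _, uri = heapq.heappop(frontier)
    print(uri)
```

In this sketch the disallowed URI never enters the frontier, so the loop prints only the homepage followed by the blog post.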
