feat(collect): add error rate threshold #396

Merged: yolile merged 2 commits into main from 29-collect-errors-rate on May 6, 2025

Conversation

@yolile yolile commented May 6, 2025

closes #29

@yolile yolile requested a review from jpmckinney May 6, 2025 18:30

@jpmckinney jpmckinney left a comment

Looks good, just one comment then can merge.

logger.warning("%s has warnings: %s\n%s\n", self, url, "\n".join(messages))

if scrapy_log.error_rate > 0.15:  # 15%
    raise UnexpectedError(f"The crawl had a {scrapy_log.error_rate} error rate")

We do expect crawls to fail sometimes. (UnexpectedError is for cases like the Scrapy log not existing or not being parseable, i.e. things that should never happen if the system is functioning normally.)

You'll need to update the import.

Suggested change
-    raise UnexpectedError(f"The crawl had a {scrapy_log.error_rate} error rate")
+    raise IrrecoverableError(f"The crawl had a {scrapy_log.error_rate} error rate")
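
For illustration only, here is a minimal, self-contained sketch of the check with the suggestion applied. The two exception classes are hypothetical stand-ins defined locally (the real ones are imported from the project, and their definitions are not shown in this conversation), and `check_error_rate` and `ERROR_RATE_THRESHOLD` are names invented for the example. Only the 0.15 threshold and the error message come from the diff.

```python
class UnexpectedError(Exception):
    """Stand-in (assumption): something that should never happen if the system
    is functioning normally, e.g. a missing or unparseable Scrapy log."""


class IrrecoverableError(Exception):
    """Stand-in (assumption): an anticipated failure, e.g. a crawl with too
    many errors."""


# 15%, as in the diff. The constant name is hypothetical.
ERROR_RATE_THRESHOLD = 0.15


def check_error_rate(error_rate: float) -> None:
    """Raise IrrecoverableError if the crawl's error rate exceeds the threshold.

    `error_rate` plays the role of `scrapy_log.error_rate` in the diff.
    """
    if error_rate > ERROR_RATE_THRESHOLD:
        raise IrrecoverableError(f"The crawl had a {error_rate} error rate")


if __name__ == "__main__":
    check_error_rate(0.10)  # below the threshold: no exception
    try:
        check_error_rate(0.20)  # above the threshold
    except IrrecoverableError as e:
        print(f"Crawl rejected: {e}")
```

The comparison is strict, so a crawl with an error rate of exactly 0.15 still passes. Raising `IrrecoverableError` rather than `UnexpectedError` keeps anticipated crawl failures separate from genuine system faults, which is the distinction the review comment draws.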

@yolile yolile merged commit c2d8e70 into main May 6, 2025
16 of 18 checks passed
@yolile yolile deleted the 29-collect-errors-rate branch May 6, 2025 23:28

Development

Successfully merging this pull request may close these issues.

Acceptance criteria - Kingfisher Collect

2 participants