
deps(deps): update beautifulsoup4 requirement from >=4.11 to >=4.14.3#3

Open
dependabot[bot] wants to merge 1 commit into main from dependabot/pip/beautifulsoup4-gte-4.14.3


dependabot Bot commented on behalf of github May 4, 2026

Updates the requirements on beautifulsoup4 to permit the latest version.

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Updates the requirements on [beautifulsoup4](https://www.crummy.com/software/BeautifulSoup/bs4/) to permit the latest version.

---
updated-dependencies:
- dependency-name: beautifulsoup4
  dependency-version: 4.14.3
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

dependabot Bot commented on behalf of github May 4, 2026

Labels

The following labels could not be found: dependencies. Please create it before Dependabot can add it to a pull request.

Please fix the above issues or remove invalid values from dependabot.yml.

IgorShutko added a commit that referenced this pull request May 17, 2026
Root cause: the generic RU/UA dictionaries produced systematic
false positives on a hardware catalog (Arduino, CNC, automotive electronics).
The overall approach is safer matching, not larger dictionaries.

🔴 #1 consistency: placeholder detection is not domain-aware
  - «заглушка» ("plug": socket blanking plug, irrigation end cap)
    REMOVED from the list: it is a real product in hardware
  - «placeholder» removed
  - Split into STRONG (substring match, safe: lorem ipsum, тест тест)
    and WEAK (todo/tbd/tba/n/a: only as \bword\b and only when dominant
    in a short text of <60 chars)
  - tba is NO LONGER caught inside the SKU 14AM00TBAS
  - 5 unit tests pass
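
The STRONG/WEAK split in #1 could look roughly like the sketch below. This is not the code from the commit: the names `STRONG_MARKERS`, `WEAK_MARKERS`, `is_placeholder` and the 0.5 dominance ratio are assumptions made for illustration. Only the marker examples, the `\b` word-boundary rule and the 60-character limit come from the commit message.

```python
import re

# Marker lists taken from the commit message; the real lists may be longer.
STRONG_MARKERS = ("lorem ipsum", "тест тест")
WEAK_MARKERS = ("todo", "tbd", "tba", "n/a")

# Weak markers only count as whole words, so "tba" inside a SKU is ignored.
WEAK_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(m) for m in WEAK_MARKERS) + r")\b",
    re.IGNORECASE,
)

SHORT_TEXT_LIMIT = 60   # from the commit: only texts shorter than 60 chars
DOMINANCE_RATIO = 0.5   # assumption: share of the text covered by weak markers


def is_placeholder(text: str) -> bool:
    """Return True if the text looks like a leftover placeholder."""
    # STRONG markers are unambiguous, so a plain substring match is enough.
    lowered = text.lower()
    if any(marker in lowered for marker in STRONG_MARKERS):
        return True

    # WEAK markers must dominate a short text to count.
    stripped = text.strip()
    if not stripped or len(stripped) >= SHORT_TEXT_LIMIT:
        return False
    covered = sum(len(m.group(0)) for m in WEAK_RE.finditer(stripped))
    return covered / len(stripped) >= DOMINANCE_RATIO
```

With these assumptions `is_placeholder("TBA")` is true, while `is_placeholder("14AM00TBAS")` is false because there is no word boundary around `TBA` inside the SKU.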

🔴 #2 audit.py: sitemap pages were mixed in with categories
  - INFO_KEYWORDS extended (delivery_info, return_policy, contacts,
    brands, blog, faq, cart...)
  - is_home() catches language-specific home pages (/ua/, /ru/)
  - is_info() strips the language prefix before matching
  - page_label(): examples now print the title ITSELF plus its length,
    not the URL slug (previously /ua/, /g102988977-...)
🟡 #3 text-quality: CAPS/letter_repeats noise on hardware
  - caps_lock: whitelist of brand + title tokens (INKBIRD, BIGTREETECH),
    SKU-like tokens (digit/hyphen) skipped, KNOWN_ABBR extended
  - letter_repeats: the regex now matches LETTERS only, not \w:
    «30000 мАг» (mAh) is no longer a false positive, «оооочень» is still caught
  - 5 unit tests pass
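
The two checks in #3 can be sketched as follows. The regex, the threshold of three repeated letters, the minimum token length and the contents of `KNOWN_ABBR` are assumptions; the commit message only states that the repeat regex is limited to letters and that brand/title tokens and SKU-like tokens are skipped.

```python
import re

# [^\W\d_] is "a word character that is neither a digit nor an underscore",
# i.e. a letter in any script. Three or more in a row counts as a repeat.
LETTER_REPEAT_RE = re.compile(r"([^\W\d_])\1{2,}", re.IGNORECASE)

KNOWN_ABBR = {"HDMI", "GPIO", "WIFI"}      # hypothetical sample entries
SKU_LIKE_RE = re.compile(r"[\d-]")         # a digit or hyphen marks a SKU-like token
TOKEN_RE = re.compile(r"[^\W_][\w-]*")     # words, allowing inner hyphens


def find_letter_repeats(text: str) -> list:
    """Return runs of the same letter repeated three or more times."""
    return [m.group(0) for m in LETTER_REPEAT_RE.finditer(text)]


def find_caps_tokens(text: str, brand: str = "", title: str = "", min_len: int = 4) -> list:
    """Return all-caps tokens that are not brands, title words, SKUs or known abbreviations."""
    whitelist = {token.upper() for token in TOKEN_RE.findall(f"{brand} {title}")}
    hits = []
    for token in TOKEN_RE.findall(text):
        if len(token) < min_len or not token.isupper():
            continue
        if SKU_LIKE_RE.search(token):
            continue
        if token in KNOWN_ABBR or token in whitelist:
            continue
        hits.append(token)
    return hits
```

Under these assumptions `find_letter_repeats("30000 мАг")` returns an empty list, because digits no longer match, while `find_letter_repeats("оооочень")` reports the run of repeated letters.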

🟡 #4 apply_fixes: new --fix decode-entities
  - html.unescape() over description/short/marketplace_description
  - before→after preview + counter + log
  - on the test run: 44 products, 1277 entities, &mdash;→—, &laquo;→«, &sup2;→²
  - documented in fix_recipes.md as Fix 11
  - the difference from inline-styles is explained (text vs style attributes)
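
A rough sketch of what the decode-entities fix in #4 might do, assuming the product is a plain dict and the three fields are named `description`, `short_description` and `marketplace_description` (the field names are inferred from #6, not confirmed). The helpers `decode_text` and `decode_product` are hypothetical; only the use of `html.unescape()` and the per-run counter come from the commit message.

```python
import html
import re

# Semicolon-terminated named or numeric character references.
ENTITY_RE = re.compile(r"&(?:#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")

# Assumed field names for the three description fields.
TEXT_FIELDS = ("description", "short_description", "marketplace_description")


def decode_text(text: str):
    """Decode entities in text; return (decoded_text, number_of_entities_decoded)."""
    count = 0

    def replace(match):
        nonlocal count
        decoded = html.unescape(match.group(0))
        if decoded != match.group(0):   # unknown references are left untouched
            count += 1
        return decoded

    return ENTITY_RE.sub(replace, text), count


def decode_product(product: dict):
    """Return a fixed copy of the product and the total number of decoded entities."""
    fixed = dict(product)
    total = 0
    for field in TEXT_FIELDS:
        value = product.get(field)
        if isinstance(value, str):
            fixed[field], decoded_count = decode_text(value)
            total += decoded_count
    return fixed, total
```

Returning a copy instead of mutating the input makes a before→after preview straightforward: the caller can print `product[field]` next to `fixed[field]` before writing anything back.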

🟡 #5 audit.py: the product HTML phase silently returned 0
  - sitemap fallback: if the catalog sitemap is empty → parse the
    sitemap index and look for child sitemaps named catalog/product/tovar
  - catalog_phase_ok flag added to the html_audit return value
  - explicit ⚠️ warning in REPORT.md if there are 0 product URLs or
    0 were parsed successfully (instead of a silent partial result)
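
The fallback in #5 could be sketched like this, using only the standard library. `product_sitemaps_from_index`, `resolve_product_urls` and the injected `fetch_urls` callable are hypothetical names; the commit message only says that child sitemaps matching catalog/product/tovar are looked up in the sitemap index and that a `catalog_phase_ok` flag is returned. The sketch covers the "0 product URLs" case, not the "0 parsed successfully" case.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
PRODUCT_HINTS = ("catalog", "product", "tovar")   # from the commit message


def product_sitemaps_from_index(index_xml: str) -> list:
    """Return child sitemap URLs from a sitemap index that look product-related."""
    root = ET.fromstring(index_xml)
    locs = [
        loc.text.strip()
        for loc in root.findall("sm:sitemap/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    return [url for url in locs if any(hint in url.lower() for hint in PRODUCT_HINTS)]


def resolve_product_urls(catalog_urls, index_xml, fetch_urls):
    """Return (product_urls, catalog_phase_ok), falling back to the sitemap index.

    fetch_urls is a caller-supplied function that downloads one child
    sitemap and returns the page URLs it lists (hypothetical helper).
    """
    urls = list(catalog_urls)
    if not urls:
        for sitemap_url in product_sitemaps_from_index(index_xml):
            urls.extend(fetch_urls(sitemap_url))
    # The flag lets the report print an explicit warning instead of a silent zero.
    return urls, bool(urls)
```

Passing the downloader in as a parameter keeps the parsing logic testable without network access.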

🟢 #6 audit.py: finding_meta +6 types
  - description_empty, description_short, short_description_empty,
    marketplace_description_empty, no_marketplaces, no_images
  - no more «🟡 див. деталі» ("see details") entries without a recipe

Smoke test on protexttile: consistency reports 0 conflicts (clean),
text-quality has no caps/letter false positives, the decode-entities
preview is correct. All 14 .py files pass the syntax check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>