Summary
modules/category/sql/20260403000001_category_legacy01.sql introduces a JOIN key between two tables with mismatched collations. When MySQL 8 actually has to compare the two columns (i.e. the JOIN returns rows), it errors out with Error 1267 (HY000): Illegal mix of collations (utf8mb4_0900_ai_ci,IMPLICIT) and (utf8mb4_general_ci,IMPLICIT) for operation '=', breaking /v1/sidebar/sync (fail-closed for the follow tab).
Symptom
Production log:
{"level":"error","msg":"【Sidebar】sidebar sync: category query failed (follow tab fail-closed)",
"error":"Error 1267 (HY000): Illegal mix of collations (utf8mb4_0900_ai_ci,IMPLICIT) and (utf8mb4_general_ci,IMPLICIT) for operation '='"}
[GIN] | 400 | 131.347337ms | POST "/v1/sidebar/sync"
Query (modules/message/db_group_category.go:65, QueryCategorySettingsByGroupNos):
SELECT gs.group_no, gs.category_id, IFNULL(gs.category_sort,0), IFNULL(gc.sort,0)
FROM group_setting gs
LEFT JOIN group_category gc
ON gs.category_id = gc.category_id -- collation mismatch
AND gs.uid = gc.uid -- collation mismatch
AND gc.status != 2
WHERE gs.group_no IN (?) AND gs.uid = ?
Root cause
modules/category/sql/20260403000001_category_legacy01.sql does two things in the same file:
CREATE TABLE group_category ... COLLATE=utf8mb4_general_ci (explicit, follows the historical convention)
ALTER TABLE group_setting ADD COLUMN category_id VARCHAR(32) ... — no explicit COLLATE, so the column inherits the table's collation, which on MySQL 8 prod databases is utf8mb4_0900_ai_ci (silent MySQL 8 default applied to many legacy tables that were also created without an explicit COLLATE)
Result: group_setting.category_id is utf8mb4_0900_ai_ci, group_category.category_id is utf8mb4_general_ci. The JOIN compiles, runs, and works for any row where one side returns no match — and only errors out when both sides actually produce comparable rows. That is why CI/unit tests do not catch it: the fixtures don't drive the offending row combination.
Reproduction
-- on a fresh MySQL 8 prod-like DB after running all migrations
SELECT gs.group_no, gs.category_id
FROM group_setting gs
LEFT JOIN group_category gc
ON gs.category_id = gc.category_id AND gs.uid = gc.uid
WHERE gs.uid = '<any uid that has both a group_setting row and a group_category row with matching category_id>';
-- -> ERROR 1267 (HY000)
Scope of the underlying drift
Audit on our prod (im_prod):
- ~60 tables are
utf8mb4_0900_ai_ci (legacy tables created without explicit COLLATE on MySQL 8): user, group, group_setting, message, message_extra, friend, device, ...
- ~22 business tables are
utf8mb4_general_ci (newer tables that did set COLLATE per the documented convention): group_category, user_oidc_identity, thread, thread_member, thread_setting, user_api_key, user_verification, user_conversation_ext, ...
Any future JOIN between the two groups will explode the same way. group_category × group_setting is the one that surfaced first because /v1/sidebar/sync actively exercises it.
Suggested fixes (pick one)
Option A — Patch the offending migration (cleanest going forward)
Update 20260403000001_category_legacy01.sql so the new column matches the table it is being added to:
ALTER TABLE `group_setting`
ADD COLUMN `category_id` VARCHAR(32) DEFAULT NULL COLLATE utf8mb4_0900_ai_ci COMMENT '用户自定义类别ID';
This only helps fresh installs / dev. Existing databases that already ran the migration still have the wrong collation and need an additive forward-repair migration.
Option B — Add a forward-repair migration
A new migration that explicitly aligns the JOIN sides. Two valid directions; the right one depends on the project-wide convention you actually want:
-- Direction 1: align new general_ci tables down to the prod-majority 0900_ai_ci
ALTER TABLE `group_category` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
-- Direction 2: align old 0900_ai_ci tables up to the documented general_ci
ALTER TABLE `group_setting` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;
We picked Direction 1 in our prod (5,434 rows, ~1s) because fewer tables to change and matches the majority of legacy tables. A full sweep in the other direction would touch 60+ tables including hot tables like message*.
Option C — Patch the query
Add COLLATE utf8mb4_general_ci casts in db_group_category.go:65. Treats one symptom; does not fix the underlying drift and risks defeating idx_uid_category on group_setting.
Recommendations
- Apply Option A + a Direction-1 forward-repair migration, scoped to the minimum set of tables (the new general_ci ones that have a JOIN partner on the 0900_ai_ci side). Lowest blast radius, fixes prod, keeps the bigger sweep out of the hot path.
- Document the actual convention. The codebase comments say
utf8mb4_general_ci, but the runtime majority is utf8mb4_0900_ai_ci. Pick one and update the migration style guide so future CREATE TABLE / ADD COLUMN are explicit.
- CI lint: reject any
CREATE TABLE / ADD COLUMN <varchar/text> in *.sql migrations that lacks an explicit COLLATE clause. This is the recurring root cause.
Environment
- MySQL 8.0.x
- octo-server
main (commit at the time of investigation: a8f56d15 Merge branch 'test' into 'main')
Summary
modules/category/sql/20260403000001_category_legacy01.sqlintroduces a JOIN key between two tables with mismatched collations. When MySQL 8 actually has to compare the two columns (i.e. the JOIN returns rows), it errors out withError 1267 (HY000): Illegal mix of collations (utf8mb4_0900_ai_ci,IMPLICIT) and (utf8mb4_general_ci,IMPLICIT) for operation '=', breaking/v1/sidebar/sync(fail-closed for the follow tab).Symptom
Production log:
Query (
modules/message/db_group_category.go:65,QueryCategorySettingsByGroupNos):Root cause
modules/category/sql/20260403000001_category_legacy01.sqldoes two things in the same file:CREATE TABLE group_category ... COLLATE=utf8mb4_general_ci(explicit, follows the historical convention)ALTER TABLE group_setting ADD COLUMN category_id VARCHAR(32) ...— no explicit COLLATE, so the column inherits the table's collation, which on MySQL 8 prod databases isutf8mb4_0900_ai_ci(silent MySQL 8 default applied to many legacy tables that were also created without an explicitCOLLATE)Result:
group_setting.category_idisutf8mb4_0900_ai_ci,group_category.category_idisutf8mb4_general_ci. The JOIN compiles, runs, and works for any row where one side returns no match — and only errors out when both sides actually produce comparable rows. That is why CI/unit tests do not catch it: the fixtures don't drive the offending row combination.Reproduction
Scope of the underlying drift
Audit on our prod (
im_prod):utf8mb4_0900_ai_ci(legacy tables created without explicitCOLLATEon MySQL 8):user,group,group_setting,message,message_extra,friend,device, ...utf8mb4_general_ci(newer tables that did setCOLLATEper the documented convention):group_category,user_oidc_identity,thread,thread_member,thread_setting,user_api_key,user_verification,user_conversation_ext, ...Any future JOIN between the two groups will explode the same way.
group_category×group_settingis the one that surfaced first because/v1/sidebar/syncactively exercises it.Suggested fixes (pick one)
Option A — Patch the offending migration (cleanest going forward)
Update
20260403000001_category_legacy01.sqlso the new column matches the table it is being added to:This only helps fresh installs / dev. Existing databases that already ran the migration still have the wrong collation and need an additive forward-repair migration.
Option B — Add a forward-repair migration
A new migration that explicitly aligns the JOIN sides. Two valid directions; the right one depends on the project-wide convention you actually want:
We picked Direction 1 in our prod (5,434 rows, ~1s) because fewer tables to change and matches the majority of legacy tables. A full sweep in the other direction would touch 60+ tables including hot tables like
message*.Option C — Patch the query
Add
COLLATE utf8mb4_general_cicasts indb_group_category.go:65. Treats one symptom; does not fix the underlying drift and risks defeatingidx_uid_categoryongroup_setting.Recommendations
utf8mb4_general_ci, but the runtime majority isutf8mb4_0900_ai_ci. Pick one and update the migration style guide so futureCREATE TABLE/ADD COLUMNare explicit.CREATE TABLE/ADD COLUMN <varchar/text>in*.sqlmigrations that lacks an explicitCOLLATEclause. This is the recurring root cause.Environment
main(commit at the time of investigation:a8f56d15Merge branch 'test' into 'main')