new arch: remove cdc changefeed with redo failed #3761

@apollodafoni

What did you do?

  1. Create a changefeed with the following redo configuration:
     level = "eventual"
     storage = "%s"
     max-log-size = 64
  2. Initialize and run the workload.
  3. Inject a restart of all PD nodes.
  4. Inject a restart of all captures.
  5. Inject a TiKV failure.
  6. Wait a moment, then verify that the consistency check succeeds.
  7. Remove the changefeed.

What did you expect to see?

The changefeed is removed successfully.

What did you see instead?

Removing the changefeed failed:
stdout: [2025/12/22 07:59:01.768 +00:00] [ERROR] [request.go:309] [\"failed to send a http request\"] [error=\"Delete \\\"http://127.0.0.1:8301/api/v2/changefeeds/redo-enable-cdc-all-node-restart-sync?keyspace=default\\\": EOF\"]\nChangefeed remove failed.\nID: redo-enable-cdc-all-node-restart-sync\nError: Delete \"http://127.0.0.1:8301/api/v2/changefeeds/redo-enable-cdc-all-node-restart-sync?keyspace=default\": EOF, stderr: Error: Delete \"http://127.0.0.1:8301/api/v2/changefeeds/redo-enable-cdc-all-node-restart-sync?keyspace=default\": EOF, ExitCode: 1"]

[WARN] [sink.go:327] ["close mysql sink, remove changefeed meet error"] [changefeed=default/redo-enable-cdc-all-node-restart-sync] [error="[CDC:ErrMySQLTxnError]MySQL txn error: select ddl ts table: begin Tx fail;: context canceled"] [errorVerbose="[CDC:ErrMySQLTxnError]MySQL txn error: select ddl ts table: begin Tx fail;: context canceled\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\tgithub.com/pingcap/errors@v0.11.5-0.20250523034308-74f78ae071ee/normalize.go:177\ngithub.com/pingcap/ticdc/pkg/errors.WrapError\n\tgithub.com/pingcap/ticdc/pkg/errors/helper.go:35\ngithub.com/pingcap/ticdc/pkg/sink/mysql.(*Writer).RemoveDDLTsItem\n\tgithub.com/pingcap/ticdc/pkg/sink/mysql/mysql_writer_for_ddl_ts.go:398\ngithub.com/pingcap/ticdc/downstreamadapter/sink/mysql.(*Sink).Close\n\tgithub.com/pingcap/ticdc/downstreamadapter/sink/mysql/sink.go:326\ngithub.com/pingcap/ticdc/downstreamadapter/dispatchermanager.(*DispatcherManager).close\n\tgithub.com/pingcap/ticdc/downstreamadapter/dispatchermanager/dispatcher_manager.go:880\nruntime.goexit\n\truntime/asm_amd64.s:1700"]
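
The failing call in the log above is an HTTP DELETE against the TiCDC v2 API. To replay the remove step outside the test harness, a minimal Python sketch could look like the following. The host, port, changefeed ID and keyspace are taken from the log; the helper names `changefeed_url` and `remove_changefeed` and the 30-second timeout are hypothetical and not part of TiCDC.

```python
import urllib.request
from urllib.parse import quote, urlencode


def changefeed_url(base, changefeed_id, keyspace="default"):
    """Build the v2 changefeed URL in the shape seen in the error log."""
    path = "/api/v2/changefeeds/" + quote(changefeed_id, safe="")
    return base.rstrip("/") + path + "?" + urlencode({"keyspace": keyspace})


def remove_changefeed(base, changefeed_id, keyspace="default", timeout=30):
    """Send the DELETE request and return the HTTP status code.

    If the server closes the connection before replying (the EOF case
    reported in this issue), urllib raises an exception instead of returning.
    """
    req = urllib.request.Request(
        changefeed_url(base, changefeed_id, keyspace), method="DELETE"
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Values copied from the log; adjust to the cluster under test.
    print(changefeed_url("http://127.0.0.1:8301",
                         "redo-enable-cdc-all-node-restart-sync"))
```

Calling `remove_changefeed("http://127.0.0.1:8301", "redo-enable-cdc-all-node-restart-sync")` right after the consistency check should show whether the EOF comes from the server side or from the CLI wrapper.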

Versions of the cluster

Upstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

Release Version: v8.5.5
Edition: Community
Git Commit Hash: 762e2422853fe36197236ee9acea983a07871a50
Git Branch: HEAD
UTC Build Time: 2025-12-22 10:44:11
GoVersion: go1.25.5
Race Enabled: false
Check Table Before Drop: false
Store: tikv

Upstream TiKV version (execute tikv-server --version):

TiKV
Release Version:   8.5.4
Edition:           Community
Git Commit Hash:   888acc67e1ac9dcb666fcd9080df9d387936202b
Git Commit Branch: HEAD
UTC Build Time:    2025-12-18 14:17:45
Rust Version:      rustc 1.77.0-nightly (89e2160c4 2023-12-27)
Enable Features:   memory-engine pprof-fp jemalloc mem-profiling portable sse test-engine-kv-rocksdb test-engine-raft-raft-engine trace-async-tasks openssl-vendored
Profile:           dist_release

TiCDC version (execute cdc version):

Release Version: v8.5.5
Git Commit Hash: a2b203136e5a6637bf7019ae1d91ecd4f931433f
Git Branch: HEAD
UTC Build Time: 2025-12-18 14:23:34
Go Version: go version go1.24.7 linux/amd64
Failpoint Build: false
Kernel Type: Classic

Labels

contribution, severity/major, type/bug