redo(ticdc): add column info for redo ddl event#12602
Conversation
There was a problem hiding this comment.
Code Review
This pull request enhances DDL redo logs by including column metadata, enabling more accurate table schema reconstruction. The review feedback identifies a critical nil pointer risk in the log reader and highlights the need to properly persist and map unique column IDs, which are currently missing from the data structures and logic.
| ddl := item.data.RedoDDL.DDL | ||
| ddl.TableInfo.Columns = setColumns(item.data.RedoDDL.Columns) | ||
| if ddl != nil && ddl.CommitTs > cfg.startTs && ddl.CommitTs <= cfg.endTs { |
There was a problem hiding this comment.
Potential nil pointer dereference. The code accesses ddl.TableInfo on line 212 before checking if ddl is nil on line 213. Additionally, if ddl.TableInfo is initialized but its embedded timodel.TableInfo is nil (as seen in postUnmarshal), accessing Columns will also panic. The column initialization should be guarded by a nil check.
| ddl := item.data.RedoDDL.DDL | |
| ddl.TableInfo.Columns = setColumns(item.data.RedoDDL.Columns) | |
| if ddl != nil && ddl.CommitTs > cfg.startTs && ddl.CommitTs <= cfg.endTs { | |
| ddl := item.data.RedoDDL.DDL | |
| if ddl != nil { | |
| if ddl.TableInfo.TableInfo == nil { | |
| ddl.TableInfo.TableInfo = &timodel.TableInfo{} | |
| } | |
| ddl.TableInfo.Columns = setColumns(item.data.RedoDDL.Columns) | |
| } | |
| if ddl != nil && ddl.CommitTs > cfg.startTs && ddl.CommitTs <= cfg.endTs { |
| colInfo := &timodel.ColumnInfo{ | ||
| ID: int64(len(columns)), | ||
| Name: ast.NewCIStr(col.Name), | ||
| State: timodel.StatePublic, | ||
| Version: col.Version, | ||
| } |
There was a problem hiding this comment.
Setting the ID of every column to int64(len(columns)) is incorrect as column IDs should be unique within a table. Use the ID field from model.ColumnInfo (after adding it to the struct) to reconstruct the column info correctly.
| colInfo := &timodel.ColumnInfo{ | |
| ID: int64(len(columns)), | |
| Name: ast.NewCIStr(col.Name), | |
| State: timodel.StatePublic, | |
| Version: col.Version, | |
| } | |
| colInfo := &timodel.ColumnInfo{ | |
| ID: col.ID, | |
| Name: ast.NewCIStr(col.Name), | |
| State: timodel.StatePublic, | |
| Version: col.Version, | |
| } |
| type ColumnInfo struct { | ||
| Name string `msg:"name"` | ||
| OriginDefaultValue any `msg:"origin_default"` | ||
| Type byte `msg:"type"` | ||
| Version uint64 `msg:"version"` | ||
| } |
There was a problem hiding this comment.
The ColumnInfo struct is missing the column ID. Since timodel.ColumnInfo requires a unique ID for various internal operations and schema tracking, it is recommended to persist the original ID from the upstream table info to ensure accurate reconstruction during redo log playback.
| type ColumnInfo struct { | |
| Name string `msg:"name"` | |
| OriginDefaultValue any `msg:"origin_default"` | |
| Type byte `msg:"type"` | |
| Version uint64 `msg:"version"` | |
| } | |
| // ColumnInfo is for column meta in DDL event | |
| type ColumnInfo struct { | |
| ID int64 `msg:"id"` | |
| Name string `msg:"name"` | |
| OriginDefaultValue any `msg:"origin_default"` | |
| Type byte `msg:"type"` | |
| Version uint64 `msg:"version"` | |
| } |
| columns = append(columns, &ColumnInfo{ | ||
| Name: col.Name.String(), | ||
| OriginDefaultValue: col.GetOriginDefaultValue(), | ||
| Type: col.GetType(), | ||
| Version: col.Version, | ||
| }) |
There was a problem hiding this comment.
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: 3AceShowHand, hongyunyan The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
|
/retest |
|
/retest |
2 similar comments
|
/retest |
|
/retest |
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
|
In response to a cherrypick label: new pull request created to branch |
|
@wk989898: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What problem does this PR solve?
Issue Number: close #12600
What is changed and how it works?
Add column info for Redo DDL event to avoid data inconsistency if the column type is time default with origin_default
This PR #12490 will use column info in the DDL event to align the upstream current timestamp. TiCDC has to encode column info for redo ddl event, and redo apply will decode the column info for replicating the ddl event.
Check List
Tests
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note