sync-diff: replace hardcoded use of MD5 with configurable option to support FIPS-compliant environments.#12622
sync-diff: replace hardcoded use of MD5 with configurable option to support FIPS-compliant environments.#12622maxz-db wants to merge 1 commit into
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @maxz-db. Thanks for your PR. I'm waiting for a pingcap member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Welcome @maxz-db! |
|
|
There was a problem hiding this comment.
Code Review
This pull request introduces support for SHA256 checksumming alongside the existing MD5 implementation to facilitate FIPS compliance. Key changes include the addition of a ChecksumAlgorithm configuration type, updates to the Config and DataSource structs, and the modification of the core checksum utility to dynamically generate SQL using either MD5 or SHA2. Review feedback recommends making the new --checksum-algorithm flag visible to users and reverting the args parameter type from []interface{} to []any to maintain consistency with modern Go standards.
… option to support FIPS-compliant environments. (pingcap#6) Context: TiDB FIPS build mode introduced in 7.6.0: pingcap/tidb#47949. TiDB binaries built with FIPS 140-3 compliance mode disable MD5 hashing in OpenSSL library used by TiKV. Problem: sync-diff-inspector relies on hardcoded MD5() for chunk checksumming. For performance reasons, TiDB may push expression evaluation down to TiKV coprocessor (tidb_query_expr), which uses OpenSSL for cryptographic functions. In FIPS mode, TiKV's OpenSSL inner_evp_generic_fetch() tries to load MD5 algorithm and fails with error code 50856204 (EVP_R_UNSUPPORTED). As a result, sync-diff-inspector fails because TiDB rejected all MD5-based checksum queries due to OpenSSL security policy restrictions. Changes: - Added a new `checksum-algorithm` configuration flag: - Supported options: md5" and "sha256" hash functions for checksumming - Default: md5 for backwards compatibility.
3a17fe4 to
98099ee
Compare
joechenrh
left a comment
There was a problem hiding this comment.
And please update the PR description.
| if len(ds) > 0 { | ||
| checksumAlgorithm = ds[0].ChecksumAlgorithm | ||
| } |
There was a problem hiding this comment.
len(ds) is validated to be positive before entering this function.
| Conn *sql.DB | ||
| SessionConfig SessionConfig `toml:"session" json:"session"` | ||
|
|
||
| ChecksumAlgorithm ChecksumAlgorithm `toml:"checksum-algorithm" json:"checksum-algorithm"` |
There was a problem hiding this comment.
How about passing Config to buildSourceFromCfg, so we don't need store ChecksumAlgorithm in the DataSource.
| db *sql.DB, schemaName, tableName string, | ||
| tbInfo *model.TableInfo, limitRange string, indexHint string, args []any, | ||
| tbInfo *model.TableInfo, limitRange string, indexHint string, | ||
| args []any, checksumAlgorithm string, |
There was a problem hiding this comment.
can we pass the param as ChecksumAlgorithm instead of string
|
/ok-to-test |
|
@maxz-db: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What problem does this PR solve?
Issue Number: close #12627
What is changed and how it works?
Added a new checksum-algorithm configuration flag to sync-diff-inspector:
Supported options: md5" and "sha256" hash functions for checksumming
Default: md5 for backwards compatibility.
Check List
Tests
Manual test on dev environment.
Questions
Will it cause performance regression or break compatibility?
No. This only reuses the existing binlog value normalization path in validator.
Do you need to update user documentation, design documentation or monitoring documentation?
No.
Release note