Skip to content

fix(evm-ee): preserve bytecode code-hash keys in chunk witnesses#1850

Open
irnb wants to merge 3 commits into
mainfrom
fix/ee-witness-bytecode-hash-key
Open

fix(evm-ee): preserve bytecode code-hash keys in chunk witnesses#1850
irnb wants to merge 3 commits into
mainfrom
fix/ee-witness-bytecode-hash-key

Conversation

@irnb
Copy link
Copy Markdown
Contributor

@irnb irnb commented May 22, 2026

Description

Accessed-state records already identify bytecodes by the account code_hash, but range witness assembly dropped that key and rebuilt the EvmPartialState bytecode map by hashing materialized Bytecode values.

That is incorrect when the stored Bytecode contains revm legacy-analysis padding:
the replay witness can be keyed by keccak(runtime + padding) instead of the account's real keccak(runtime), causing chunk replay to miss bytecode that was actually accessed.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature/Enhancement (non-breaking change which adds functionality or enhances an existing one)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactor
  • New or updated tests
  • Dependency Update

Notes to Reviewers

Is this PR addressing any specification, design doc or external reference document?

  • Yes
  • No

If yes, please add relevant links:

Checklist

  • I have performed a self-review of my code.
  • I have commented my code where necessary.
  • I have updated the documentation if needed.
  • My changes do not introduce new warnings.
  • I have added (where necessary) tests that prove my changes are effective or that my feature works.
  • New and existing tests pass with my changes.
  • I have disclosed my use of AI in the body of this PR.

Related Issues

@storopoli
Copy link
Copy Markdown
Member

This still looks like it can preserve the padded revm representation as the bytecode value. The PR fixes the key side by carrying the AccessedStateGenerator code hash through range witness assembly, but here we persist code.bytes().to_vec() for the bytecode store:

let bytecodes: Vec<BytecodeEntry> = accessed
.accessed_contracts()
.iter()
.map(|(hash, code)| (hash_from_b256(*hash), code.bytes().to_vec()))
.collect();

For legacy analyzed bytecode, revm may append zero padding during analysis: https://github.com/bluealloy/revm/blob/main/crates/bytecode/src/legacy/analysis.rs#L42-L47. When the range witness later reconstructs the bytecode with Bytecode::new_raw(code.into()), those padded bytes become part of the reconstructed bytecode’s original length:

// Resolve bytecodes from the content-addressed store and
// merge into the accumulator.
for code_hash_bytes in &record.bytecode_hashes {
let code_hash_storage_key = Hash::from(*code_hash_bytes);
let code = handle
.block_on(
self.accessed_state_store
.get_bytecode(code_hash_storage_key),
)
.map_err(|e| eyre!("get_bytecode({code_hash_storage_key:?}): {e}"))?
.ok_or_else(|| {
eyre!(
"no bytecode for code hash {code_hash_storage_key:?} referenced \
by block {block_hash:?}",
)
})?;

That means the preserved key can make code_by_hash_ref(real_hash) succeed, but replay may still diverge for contracts that observe code size or code bytes (CODESIZE, EXTCODESIZE, and related copy paths). Should this store code.original_bytes().to_vec() instead, so the bytecode value is also reconstructed from the real runtime bytes while this PR preserves the original code-hash key?

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 22, 2026

Commit: bd88dee

SP1 Execution Results

program cycles gas
EVM EE Chunk 564,874 770,917
EVM EE Account 421,125 526,509
Checkpoint 2,241,010 2,582,478

@codecov
Copy link
Copy Markdown

codecov Bot commented May 22, 2026

Codecov Report

❌ Patch coverage is 95.65217% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.03%. Comparing base (d1d88bd) to head (f478b72).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
crates/evm-ee/src/types/partial_state.rs 33.33% 2 Missing ⚠️
@@            Coverage Diff             @@
##             main    #1850      +/-   ##
==========================================
+ Coverage   79.76%   80.03%   +0.27%     
==========================================
  Files         674      674              
  Lines       74805    75710     +905     
==========================================
+ Hits        59665    60593     +928     
+ Misses      15140    15117      -23     
Flag Coverage Δ
functional 60.29% <88.88%> (+0.09%) ⬆️
unit 66.13% <80.43%> (+0.47%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
bin/prover-perf/src/programs/alpen_chunk.rs 90.00% <100.00%> (+0.76%) ⬆️
crates/evm-ee/src/execution.rs 83.77% <100.00%> (+0.43%) ⬆️
crates/proof-impl/alpen-chunk/src/program.rs 96.96% <100.00%> (+0.12%) ⬆️
crates/reth/exex/src/accessed_state_exex.rs 75.86% <100.00%> (+2.17%) ⬆️
crates/reth/witness/src/range_witness_extractor.rs 86.81% <100.00%> (+0.29%) ⬆️
crates/evm-ee/src/types/partial_state.rs 85.47% <33.33%> (+0.91%) ⬆️

... and 19 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@irnb
Copy link
Copy Markdown
Contributor Author

irnb commented May 22, 2026

@storopoli please check b9c92bf

Copy link
Copy Markdown
Member

@storopoli storopoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK f478b72

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants