perf: replace slice bits by low bits #1840
Pull request overview
This PR optimizes the word.Uint.Slice() implementation on the zkc exec hot path by replacing the previous per-bit extraction helper with a lower-level “mask low bits” approach intended to be faster for common widths.
Changes:
- Replaced `Uint.Slice(width)` implementation with a new `lowBits(width, value)` helper.
- Removed the former `readBitSlice` loop-based implementation.
- Added a `math/bits`-based fast path to avoid touching high words for small widths.
    // lowBits returns the low `width` bits of value (i.e. value mod 2^width).
    // For example, given value 10111000 and width=4 the result is 1000.
    func lowBits(width uint, value big.Int) big.Int {
        var slice big.Int
        // Fast paths: the result fits within a single uint64.
        if width <= 64 {
            if value.IsUint64() {
                slice.SetUint64(value.Uint64() & mask64(width))
                return slice
            }
            // For a multi-word value, the least-significant machine word already
            // holds every bit we keep, so there's no need to touch the high words.
            if width <= bits.UintSize {
                slice.SetUint64(uint64(value.Bits()[0]) & mask64(width))
                return slice
            }
        }
        // General path: mask off everything at or above bit `width`.
        mask := new(big.Int).Lsh(&one, width)
        mask.Sub(mask, &one)
        slice.And(&value, mask)
        //
        return slice
    }
DavePearce left a comment
Overall, LGTM; it is definitely an improvement. However, I have to note that on two somewhat realistic benchmarks I am not observing much performance benefit. I guess it's not hot. The benchmarks are: a 50K-block keccak (ZkC), which takes around 58s vs 56s; and a run of blake on the RISC-V interpreter (not sure how many rounds, but it takes around 4s for both versions).
Signed-off-by: Gautam Botrel <gautam.botrel@gmail.com>
This function is on the hot path of `zkc exec`. A micro-benchmark on this specific function shows a x5 to x10 speedup; some specific `zkc exec` runs show x2.