
Use multiple CMPXCHG16B locks instead of 1 global lock#527

Merged
OFFTKP merged 6 commits into master from cas
May 27, 2026
@OFFTKP OFFTKP commented May 27, 2026

CMPXCHG16B previously used a single global lock to give the instruction some (but not perfect) atomicity, since we don't have hardware support for it. This caused high contention: in some games, the blocks holding the lock took up 80% of the CPU and the game lagged immensely.

This PR implements a fast per-address spinlock. One of 256 spinlocks is picked based on a hash of the address, so the choice is deterministic for a given address. This greatly reduces collisions and significantly improves performance in affected games. If two threads perform a CAS on the same address, they pick the same spinlock, which provides the same atomicity guarantees as before, assuming all the instructions operate on aligned addresses.
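For readers who want a picture of the approach, here is a minimal sketch of address-striped spinlocks guarding an emulated 16-byte compare-and-swap. It is not the code from this PR: `U128`, `g_locks`, `lock_index`, `StripeGuard`, and `emulated_cas128` are hypothetical names, and the multiplicative hash is an assumption, since the description does not say which hash is used.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical 128-bit operand; stands in for the RDX:RAX / RCX:RBX pairs.
struct U128 {
    std::uint64_t lo;
    std::uint64_t hi;
};

constexpr std::size_t kLockCount = 256;

// One spinlock flag per stripe; value-initialized to false (unlocked).
inline std::array<std::atomic<bool>, kLockCount> g_locks{};

// Map an address to one of the 256 stripes. The hash is an assumption:
// drop the low 4 bits (zero for 16-byte aligned operands), then use a
// multiplicative hash and keep the top 8 bits.
inline std::size_t lock_index(const void* address) {
    auto bits = reinterpret_cast<std::uintptr_t>(address);
    std::uint64_t key = static_cast<std::uint64_t>(bits) >> 4;
    return static_cast<std::size_t>((key * 0x9E3779B97F4A7C15ull) >> 56);
}

// RAII guard that spins until it owns the stripe and releases it on scope exit.
class StripeGuard {
public:
    explicit StripeGuard(std::atomic<bool>& flag) : flag_(flag) {
        while (flag_.exchange(true, std::memory_order_acquire)) {
            // Busy-wait: another thread holds this stripe.
        }
    }
    ~StripeGuard() { flag_.store(false, std::memory_order_release); }
    StripeGuard(const StripeGuard&) = delete;
    StripeGuard& operator=(const StripeGuard&) = delete;

private:
    std::atomic<bool>& flag_;
};

// Emulated compare-and-swap: if *target equals expected, store desired and
// return true; otherwise copy the current value into expected and return false.
inline bool emulated_cas128(U128* target, U128& expected, const U128& desired) {
    StripeGuard guard(g_locks[lock_index(target)]);
    if (target->lo == expected.lo && target->hi == expected.hi) {
        *target = desired;
        return true;
    }
    expected = *target;
    return false;
}
```

Two CAS operations on the same address always hash to the same stripe and therefore serialize, while operations on different addresses usually land on different stripes and proceed in parallel. As with a global lock, this only orders emulated CAS operations against each other; in this sketch, plain loads and stores to the same memory do not take the lock.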

@OFFTKP OFFTKP merged commit 302eb9a into master May 27, 2026
2 checks passed
@OFFTKP OFFTKP deleted the cas branch May 27, 2026 17:21