Skip to content

chilldkg: security review fixes#247

Merged
LLFourn merged 10 commits into
masterfrom
claude-security-review-omnibus
May 8, 2026
Merged

chilldkg: security review fixes#247
LLFourn merged 10 commits into
masterfrom
claude-security-review-omnibus

Conversation

@LLFourn
Copy link
Copy Markdown
Owner

@LLFourn LLFourn commented Apr 21, 2026

Summary

A security review of the chilldkg module in schnorr_fun (and adjacent code reached from its decoding paths). No critical vulnerabilities were found. The issues fall into two buckets:

  1. Decoding panics — cases where malformed serialized input could crash the process rather than return an error.
  2. Defensive hardening — places where the protocol was permissive in ways that didn't expose secrets but tolerated unintended behaviour (ghost contributions, x-only PoP replay, silently accepted unverified certs, redundant/confusing parameters, etc.).

Each commit is self-contained and independently reviewable.

Commit-by-commit

b41919d — Replace &'static str errors in chilldkg with typed error enums

Introduces simplepedpop::AddInputError, encpedpop::AddInputError, encpedpop::RecoverShareError, certpedpop::RecoverShareError. Replaces string-typed Result<_, &'static str> returns so callers can discriminate failures programmatically. Drops the unused CertifierError::DuplicateCertificate variant.

58d90d9 — Fix panics on malformed input in chilldkg and SharedKey decoding

Three panic paths reachable from untrusted input:

  • encpedpop::Contributor::verify_agg_input indexed encryption_nonces[my_index] directly. A coordinator sending a shorter vec crashed the contributor; now returns ContributionDidntMatch.
  • SharedKey::from_slice(&[]) and the bincode::Decode / serde::Deserialize impls panicked on empty polynomials via poly[0]. Now return None / a decode error.
  • simplepedpop::Contributor::gen_keygen_input indexed secret_poly[0] when called with threshold == 0. Now asserts threshold > 0 matching Coordinator::new.

b638920 — Enforce contributor count across chilldkg layers

A malicious coordinator can craft an AggKeygenInput with more key_contrib slots than any honest contributor agreed to (a "ghost" contribution whose secret the coordinator knows). Doesn't compromise secrecy but gives the coordinator extra influence over the public key and breaks the protocol's implicit "every contributor signed off" contract.

  • simplepedpop::Contributor now stores n_contributors at construction and verify_agg_input rejects mismatched key_contrib.len().
  • encpedpop::Contributor::verify_agg_input additionally rejects mismatched encryption_nonces.len().
  • certpedpop::Certifier::new becomes fallible and rejects an AggKeygenInput whose contributor count doesn't match contributor_keys.len(), with new variant CertifierError::ContributorCountMismatch.

4ff15a0 — Bind simplepedpop proof-of-possession to its slot index and parity

The PoP previously signed the empty message. Concrete replay primitives this allowed:

  • (com, pop) from one slot replayed at a different slot.
  • (−com, pop) accepted under BIP340 x-only verification even though the caller intended com.

The PoP message is now parity_byte || idx_be (5 bytes), carried under the BIP340-style 33-byte domain separator "BIP DKG/pop". Parity binding closes the x-only gap — (−A, pop_A) reconstructs a different message and fails verify. This diverges from the bip-frost-dkg Python reference's choice to bind only idx via BIP340's tag_prefix; we bind more and use the BIP340-recommended message-prefix domain separator. Regression test covers cross-slot replay and negated-key-same-slot replay.

21ce300 — Remove serialization from CertifiedKeygen

The only sound path to a CertifiedKeygen is Certifier::finish, where every stored certificate was verified as it came in. Deserialization bypassed that and could land attacker-chosen gammas in the struct; vrf_security_check would then hand back a beacon value the attacker controlled. Dropping the bincode::Encode/Decode and serde Serialize/Deserialize derives on CertifiedKeygen eliminates the unverified-state path. Apps that need to persist a certified keygen can serialize AggKeygenInput and the certificate map separately and rebuild on the receiving side.

b1e1c4e — Validate share index and encryption key in receive_secret_share

encpedpop::receive_secret_share silently tolerated two malformed inputs: a missing entry in encrypted_shares (treated as a zero ciphertext via unwrap_or_default) and a caller-supplied keypair that didn't match the registered encryption key. Both surfaced as the generic InvalidSecretShare, diverging from the sibling AggKeygenInput::recover_share path which already returned UnknownShareIndex / WrongEncryptionKey. Adds both variants to ReceiveShareError and checks them explicitly before decryption.

9c34719 — Always verify signatures in Certifier::receive_certificate

On a duplicate cert (same key already registered), the function returned Ok without calling verify_cert on the new signature. A debug_assert_eq! guarded the "sigs must match" case but vanished in release builds, so a mismatched second signature was silently discarded while the caller believed the cert was validated. For randomized signature / VRF schemes this was the always-taken path. Now every supplied signature is verified; a verified duplicate is idempotent (first stored wins), a failed verification returns InvalidSignature.

Conversation History

User: can you check for security issues in chilldkg's implementation in schnorr_fun. I am especially concerned with parties not checking their inputs and included in things.

Assistant: (Flagged 7 concerns: panic in encpedpop verify_agg_input; ghost contributor injection; receive_secret_share keypair validation; PoP not bound to context; Certifier silent duplicate-cert; vrf_security_check trusts unverified gammas; same-keypair-for-encryption-and-cert design note.)

User: cool fix #1 and then we'll discuss #2.

User: actually before moving on to #2 can you search for similar problems especially in decoding implementations that could cause panics.

Assistant: (Found SharedKey::from_slice(&[]) panic, bincode Decode / serde Deserialize empty-poly panics, gen_keygen_input threshold=0 panic.)

User: Do you think that #2 is a security vulnerability? Can you explain the attack.

Assistant: (Ghost injection: count defense missing means malicious coordinator can add slots with known-secret "ghost" contributions. Not a secrecy break but adaptive-bias on the public key; protocol doesn't enforce "N authorized contributors.")

User: right but the thing is that in practice the coordinator is always a contributor as well... But I agree it's sort of worth fixing since technically the coordinator role itself is not meant to have that ability.

User: why couldn't the attacker just echo back the signature shares contributions of the honest party too (except negated). They are homomorphic.

Assistant: (Walked through why the negation attack doesn't close into key compromise: encpedpop pads are H(dh_key, enc_key, dest) hash outputs, not linearly composable. The attack remains a targeted DoS; encryption-layer non-linearity keeps the downstream damage bounded.)

User: Ok can you look up the python referenmce for the FROST chilldkg and see what they throw in the PoP and we can see if we can follow it.

(— bip-frost-dkg signs idx.to_bytes(4, big) under tag "BIP DKG/pop message". schnorr_fun uses BIP340 message-prefix domain separation instead of tag_prefix. Bound idx here.)

User: @simplepedpop.rs I feel like [the x-only same-slot replay passing] is not a very desirable behaviour.

(— the test acknowledged that negated-key-same-slot passed PoP and was only caught by verify_agg_input's Point-equality self-check. User wanted the PoP itself to bind parity. Extended the PoP message to parity_byte || idx_be.)

User: maybe we just shouldn't allow [CertifiedKeygen] to be serialized?

(— chose the simplest fix for vrf_security_check's reliance on unverified gammas: drop the derive.)

User: A just drop it [the n_contributors param from receive_secret_share]. When you're receiving a secret share the only thing you care about is it's the correct secret share with respect to the join sharing polynomial.

(— decided receive_secret_share's n_contributors parameter was redundant with agg_input; receivers' defense against count tampering is the cert round or out-of-band comparison, not a per-receive assertion. Folded into b638920.)

User: What is the actual security implications of duplicate keys though in general... I am not really interested in adding more errors I am more interested in avoiding them when the protocol could tolerate duplicates.

(— audited for panics / spurious errors on duplicate keys across Certifier::required_keys, certificates, and related paths; found none; concluded the BTreeSet dedup is intentional and benign. No code changes.)

nickfarrow
nickfarrow previously approved these changes Apr 27, 2026
Copy link
Copy Markdown
Collaborator

@nickfarrow nickfarrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noticed that b638920 says

    - simplepedpop::receive_secret_share takes n_contributors and rejects
      length mismatches with a new ReceiveShareError::WrongContributorCount.

but that was not implemented.

I think this is ok since a malicious coordinator inflating the contributor count should achieve nothing given the security model. The dangerous situation would be if a coordinator were able to inflate the receiver count, in frostsnap we check this count in the keygen check.

Comment thread schnorr_fun/src/frost/chilldkg/simplepedpop.rs Outdated
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function verifies certificates for the expected parties in cert_keys, but propagates the caller-supplied certificate map which could contain extra (cert_key, certificate)s, which is then included in the VRF beacon.

I don't think this is exploitable with how we are currently using it in Frostsnap, devices & coordinator independently build their own certificate and only include known participants. But it is a rough edge that could be solved by filtering on cert_keys or asserting length of cert keys matches certificate map length in finalize.

/// `CertifiedKeygen` can only be constructed via [`Certifier::finish`], which
/// verifies each certificate as it is received, so this method is a sanity check
/// rather than a safety gate.
pub fn verify<S>(
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eh we don't need a "sanity check" polluting the API. We could turn this into simpler new function which just takes a list of certificiates.

) -> Self {
) -> Result<Self, CertifierError> {
if agg_input.n_encryption_nonces() != contributor_keys.len()
|| agg_input.inner().n_contributors() != contributor_keys.len()
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feels like the n_contributors and n_encryption_nonces should already be equal here. This is checking the validity of agg_input implicitly but it should already be correct. agg_input should just have a method n_contributors() so you don't have to call inner().

In order to enforce the invariant correctly we'll have to make WireAggkeygenInput that is serializable and has an invariant checker that converts it to AggKeygenInput.

IncompleteCertificates,
/// The aggregated keygen input has a different number of contributors than
/// the provided `contributor_keys` list.
ContributorCountMismatch,
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to me that this error variant is only used in one of the functions and is no tused in receive_certificate even though it returns CertifierError. We should make the errors tight.

let paired_secret_shares: Vec<(PairedSecretShare<Normal>, KeyPair)> = dual_role_receivers
.into_iter()
.map(|(receiver, enckey)| {
let CertifiedSecretShare { paired_share, .. } = receiver
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we check that the certified keygen obtained from this is eq to the one above.

Comment thread schnorr_fun/src/frost/chilldkg/encpedpop.rs
/// No encrypted share exists at the receiver's share index.
UnknownShareIndex,
/// The supplied keypair isn't the encryption key registered for this share.
WrongEncryptionKey,
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these are defined here but never used in this file. "EncryptionKey" should be a clue. They should be defined in encpedpop.

impl AggKeygenInput {
/// The number of contributors in this aggregated input.
pub fn n_contributors(&self) -> usize {
self.encryption_nonces.len()
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should call simplepedpop's AggKeygenInput's n_contributors.

/// reconstructs a different message and so fails PoP verification.
///
/// SPEC DEVIATION: we add the parity byte where as the spec only has the contribution_index.
fn pop_message_bytes(first_coeff: Point, contributor_index: u32) -> [u8; 5] {
Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just put the is_event_y as the first param to avoid the impression we're rehashing the whole point.

@LLFourn LLFourn force-pushed the claude-security-review-omnibus branch 2 times, most recently from 6cfa40e to 890a003 Compare April 30, 2026 06:13
@LLFourn LLFourn force-pushed the claude-security-review-omnibus branch 2 times, most recently from e037e96 to 8c3f869 Compare April 30, 2026 06:30
@LLFourn
Copy link
Copy Markdown
Owner Author

LLFourn commented Apr 30, 2026

@nickfarrow thanks for catching the stale commit message. I probably asked it to undo it because it wasn't doing anything. Effectively, trying to tighten security so that much more things are checked automatically was a bit of an odessy. The current implementation has a very loose API and relies on the checking symmetric view accross parties for security. Since security always rests on that check this is not really wrong but this API could do with a lot more defense in-depth. There were various and subtle ways of using it insecurely most of which we avoided in frostsnap but it would be much more confident if the API didn't have any.

I built #248 on top of this one but it might be better just to review #248 in its entireity rather than continueing tto review this one. The purpose of #248 is to create a less flexible but absoluitely impossible to fuck up API.

LLFourn added 10 commits May 1, 2026 18:58
Introduces named error types for each fallible API surface:
- simplepedpop::AddInputError
- encpedpop::AddInputError and encpedpop::RecoverShareError
- certpedpop::certificate::RecoverShareError

Replaces the prior `Result<_, &'static str>` return types across
Coordinator::add_input, AggKeygenInput::recover_share, and
CertifiedKeygen::recover_share so callers can discriminate failure
cases programmatically and formatters can produce consistent messages.

Also drops the unused CertifierError::DuplicateCertificate variant;
duplicate-cert handling is inlined in Certifier::receive_certificate.
- encpedpop::Contributor::verify_agg_input: bounds-check the
  encryption_nonces index instead of panicking on a shorter vec. An
  adversary-supplied AggKeygenInput could previously crash a
  contributor.

- SharedKey::from_slice / bincode Decode / serde Deserialize: reject
  an empty polynomial with an error instead of indexing poly[0]. An
  attacker-controlled serialized SharedKey with a zero-length
  polynomial would otherwise panic the decoder.

- simplepedpop::Contributor::gen_keygen_input: assert threshold > 0 to
  match Coordinator::new and avoid indexing an empty secret_poly.
Threads the number of contributors through simplepedpop, encpedpop,
and certpedpop so every honest role commits up-front to the size of
the contributor set and rejects any AggKeygenInput whose slot count
disagrees.

- simplepedpop::Contributor carries n_contributors from gen_keygen_input;
  verify_agg_input checks key_contrib.len() == n_contributors.
- encpedpop layers the same check onto encryption_nonces.len().
- certpedpop plumbs n_contributors through gen_keygen_input,
  SecretShareReceiver::receive_secret_share, and
  verify_receive_share_and_certify.
- Certifier::new is now fallible and verifies both lengths against
  contributor_keys.len() via new
  CertifierError::ContributorCountMismatch; CertifiedKeygen::verify
  propagates the new error.

Closes the ghost-contributor rough edge on the contribution-verification
path: a malicious coordinator who pads AggKeygenInput with extra slots
no longer passes any honest role's verify checks.
The PoP previously signed the empty message, which let a malicious
party replay an honest contributor's (com, pop) at any other slot
(and, exploiting BIP340 x-only verification, with the public key
negated). There is no actual exploit you could achieve with this but it
is a peculiar capability.

We add two protections against this kind of vulnerability -- sign the parity
bit (which the BIP doesn't) and the contributor index (which the BIP does).

Adds a regression test covering cross-slot replay and the
negation-replay variant.
The only sound way to get a CertifiedKeygen is Certifier::finish,
which has verified every stored certificate as it was received.
Deserialization bypassed that, producing a struct that looked
identical to a safely-constructed one but could carry attacker-chosen
gammas — and vrf_security_check would then hand back a beacon value
the attacker controlled.

Dropping the bincode/serde derives on CertifiedKeygen closes the
unverified-state path entirely. vrf_security_check keeps its cheap
one-pass form because the only constructor now is Certifier::finish.
verify() stays as a paranoia sanity check; its docstring is updated
to stop advertising the (no longer supported) deserialization path.

If someone needs to persist a certified keygen across processes, they
can serialize AggKeygenInput and the certificate map separately and
rebuild via Certifier on the receiving side.
encpedpop::receive_secret_share silently swallowed two malformed-input
cases: a missing entry in encrypted_shares (treated as a zero
ciphertext via unwrap_or_default) and a caller-supplied keypair that
didn't match the encryption key registered at my_index (decryption
proceeded with mismatched pads). Both surfaced later as the generic
InvalidSecretShare, which made debugging harder and diverged from the
sibling AggKeygenInput::recover_share path which already returns
UnknownShareIndex / WrongEncryptionKey for the same conditions.

Adds both variants to simplepedpop::ReceiveShareError and checks
them explicitly in encpedpop::receive_secret_share before attempting
decryption, bringing the two share-extraction paths into agreement.
On a duplicate cert (same key already registered), the function
previously returned Ok without calling verify_cert on the new
signature. A debug_assert_eq! guarded the "sigs must match" case
but vanished in release, so a mismatched second signature was
silently discarded while the caller believed the cert was validated.
For randomized signature / VRF schemes — where two legitimate
certifications from the same key produce different bytes — this
always fell into the unchecked path.

Verify every supplied signature. A verified duplicate is idempotent
(first stored wins); a failed verification returns InvalidSignature.
Verify expected certificates into a fresh map and reject leftovers with
CertificateError::Unexpected. finalize now borrows the caller's map and
clones internally, keeping unknown certificates out of the resulting
CertifiedKeygen.
Return SecretShareReceiver from verify_receive_share_and_certify instead
of a bare PairedSecretShare. Dual-role receivers now release their share
only after certificate collection and finalize succeed.
Use "BIP DKG/pop message", matching the draft/reference implementation.
@LLFourn LLFourn force-pushed the claude-security-review-omnibus branch from 8c3f869 to 4102268 Compare May 1, 2026 08:58
@LLFourn LLFourn merged commit 990c46a into master May 8, 2026
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Shouldn't messages to be signed be marked with a secrecy?

2 participants