Update Vault validate observations#22736
Conversation
|
I see you updated files related to
|
|
✅ No conflicts with other open PRs targeting |
|
|
| return gateAllows(ctx, r.lggr, r.cfg.VaultOptimizationsEnabled, "VaultOptimizationsEnabled") | ||
| } | ||
|
|
||
| func (r *ReportingPlugin) ciphertextlessObservationsEnabled(ctx context.Context) bool { |
There was a problem hiding this comment.
Nit: worth inlining IMO -- I don't think you gain much by abstracting this out
| Result: &vaultcommon.SecretResponse_Data{ | ||
| Data: &vaultcommon.SecretData{ | ||
| EncryptedValue: hex.EncodeToString(secret.EncryptedSecret), | ||
| EncryptedValue: encryptedValueForGetSecretsObservation(r.ciphertextlessObservationsEnabled(ctx), secret.EncryptedSecret), |
There was a problem hiding this comment.
I do wonder if this has some weird error modes and I think we should ask research to take a look.
One possibility that springs to mind: we generate the shares during Observation, but then a request in the same batch updates the secret. If this is processed before the get request in StateTransition, the shares will effectively be for an outdated version of the ciphertext.
It might be safer to hash the ciphertext so that we can effectively "pin" the shares to a particular ciphertext to avoid issues like this.
| } | ||
| continue | ||
| } | ||
| resp.GetData().EncryptedValue = hex.EncodeToString(stored.EncryptedSecret) |
There was a problem hiding this comment.
You shouldn't use a getter when writing; this will initialize a nil Data if it doesn't exist which may lead to unexpected results. You're guarding against that above, so you can just use resp.Data.EncryptedValue = foo





Summary
Fixes two malicious-node liveness vulnerabilities in the vault OCR plugin where observations could pass
ValidateObservationbut fail to reach consensus inStateTransitionwhen only2F+1observations are available.Fix 1 — Pending queue observation validation (no flag, needs to be merged after optimizations flag is enabled)
ValidateObservationnow re-runsappendPendingQueueObservationsusing the submittedPendingQueueItemsandSortNonceto compute the expected truncation point, then requires the submitted observations to be an ordered prefix of exactly that length.This blocks attacks where a malicious node submits too few items (e.g. only the first or last pending queue item), which would otherwise leave honest nodes one observation short of the
2F+1GetSecrets quorum.Fix 2 — Ciphertextless GetSecrets observations (feature-flagged)
GetSecrets observations previously included
EncryptedValuein the observation SHA without validating it, so a malicious node could submit fake ciphertext and split consensus.When
VaultCiphertextlessObservationsEnabledis on:Observationomits ciphertext from GetSecrets responsesshaForObservationexcludes ciphertext from the SHAStateTransitionreads the real ciphertext from KV when building outcomesThe optimizations smoke topology enables both
VaultOptimizationsEnabledandVaultCiphertextlessObservationsEnabled.