[Work in Progress] Cluster-wide Multi-key commands#369

Open

nilanshu-sharma wants to merge 1 commit intomainfrom

cluster-wide-multi-key-commands

Collaborator

nilanshu-sharma commented Mar 10, 2026

No description provided.


          Cluster-wide Multi-key commands

a873343

Signed-off-by: Nilanshu Sharma <nilanshu_sharma@apple.com>

github-actions bot commented Mar 10, 2026

✅ Pull request no significant performance differences ✅

Summary

New baseline 'pull_request' is WITHIN the 'main' baseline thresholds.

Full Benchmark Comparison

Comparing results between 'main' and 'pull_request'

Host '7e3d460d803a' with 4 'x86_64' processors with 15 GB memory, running:
#17~24.04.1-Ubuntu SMP Mon Dec  1 20:10:50 UTC 2025

ValkeyBenchmarks

Client: GET benchmark metrics

Malloc (total): results within specified thresholds, fold down for details.

Malloc (total) *	p0	p25	p50	p75	p90	p99	p100	Samples
main	52	55	55	57	58	58	58	5
pull_request	53	54	55	57	57	57	57	5
Δ	1	-1	0	0	-1	-1	-1	0
Improvement %	-2	2	0	0	2	2	2	0

Client: GET benchmark | parallel 20 | 20 concurrent connections metrics

Malloc (total): results within specified thresholds, fold down for details.

Malloc (total) *	p0	p25	p50	p75	p90	p99	p100	Samples
main	52	56	57	59	61	63	63	24
pull_request	51	54	58	59	60	64	64	22
Δ	-1	-2	1	0	-1	1	1	-2
Improvement %	2	4	-2	0	2	-2	-2	-2

Connection: GET benchmark metrics

Malloc (total): results within specified thresholds, fold down for details.

Malloc (total) *	p0	p25	p50	p75	p90	p99	p100	Samples
main	4	4	4	4	4	4	4	8
pull_request	4	4	4	4	4	4	4	7
Δ	0	0	0	0	0	0	0	-1
Improvement %	0	0	0	0	0	0	0	-1

Connection: GET benchmark – NoOpTracer metrics

Malloc (total): results within specified thresholds, fold down for details.

Malloc (total) *	p0	p25	p50	p75	p90	p99	p100	Samples
main	4	7	8	9	9	9	9	7
pull_request	4	7	8	9	9	9	9	7
Δ	0	0	0	0	0	0	0	0
Improvement %	0	0	0	0	0	0	0	0

Connection: Pipeline array benchmark metrics

Malloc (total): results within specified thresholds, fold down for details.

Malloc (total) *	p0	p25	p50	p75	p90	p99	p100	Samples
main	14	18	25	29	29	29	29	7
pull_request	14	18	25	29	29	29	29	7
Δ	0	0	0	0	0	0	0	0
Improvement %	0	0	0	0	0	0	0	0

Connection: Pipeline benchmark metrics

Malloc (total): results within specified thresholds, fold down for details.

Malloc (total) *	p0	p25	p50	p75	p90	p99	p100	Samples
main	7	24	24	25	25	25	25	7
pull_request	7	24	24	25	25	25	25	7
Δ	0	0	0	0	0	0	0	0
Improvement %	0	0	0	0	0	0	0	0

HashSlot – {user}.whatever metrics

Malloc (total): results within specified thresholds, fold down for details.

Malloc (total) *	p0	p25	p50	p75	p90	p99	p100	Samples
main	0	0	0	0	0	0	0	17
pull_request	0	0	0	0	0	0	0	18
Δ	0	0	0	0	0	0	0	1
Improvement %	0	0	0	0	0	0	0	1

ValkeyCommandEncoder – Command with 7 words metrics

Malloc (total): results within specified thresholds, fold down for details.

Malloc (total) *	p0	p25	p50	p75	p90	p99	p100	Samples
main	0	0	0	0	0	0	0	826
pull_request	0	0	0	0	0	0	0	828
Δ	0	0	0	0	0	0	0	2
Improvement %	0	0	0	0	0	0	0	2

ValkeyCommandEncoder – Simple GET metrics

Malloc (total): results within specified thresholds, fold down for details.

Malloc (total) *	p0	p25	p50	p75	p90	p99	p100	Samples
main	0	0	0	0	0	0	0	1952
pull_request	0	0	0	0	0	0	0	1950
Δ	0	0	0	0	0	0	0	-2
Improvement %	0	0	0	0	0	0	0	-2

ValkeyCommandEncoder – Simple MGET 15 keys metrics

Malloc (total): results within specified thresholds, fold down for details.

Malloc (total) *	p0	p25	p50	p75	p90	p99	p100	Samples
main	0	0	0	0	0	0	0	374
pull_request	0	0	0	0	0	0	0	377
Δ	0	0	0	0	0	0	0	3
Improvement %	0	0	0	0	0	0	0	3

adam-fowler reviewed

View reviewed changes

Collaborator

adam-fowler left a comment

I've had an initial look at this.

There is a lot of memory allocation.
The public protocol API needs some work. I'm not certain how it can be improved but if feels a bit hacky
Have you considered how this is going to work for pipelined commands?

Sources/Valkey/Cluster/ValkeyClusterClient+MultiKeyCommands.swift

+                          slotIndices[slot]!.append(i)
+                      }
+                      return slotOrder.map { (indices: slotIndices[$0]!, slot: $0) }

Collaborator

adam-fowler Mar 10, 2026

Suggested change

      
                    return slotOrder.map { (indices: slotIndices[$0]!, slot: $0) }
          
                    return slotIndices.map { (indices: $0.value, slot: $0.key) }

Sources/Valkey/Cluster/ValkeyClusterClient+MultiKeyCommands.swift

+                              slotIndices[slot] = []
+                          }
+                          slotIndices[slot]!.append(i)
+                      }

Collaborator

adam-fowler Mar 10, 2026

Do the slots indices need to be ordered? If not you can replace this code with

for (i, key) in keys.enumerated() {
    let slot = HashSlot(key: key)
    slotIndices[slot, default: []].append(i)
}

Sources/Valkey/Cluster/ValkeyClusterClient+MultiKeyCommands.swift

+                      // Map each raw Result<RESPToken, ValkeyClientError> to the typed Response.
+                      let slotResults = partitions.enumerated().map { i, partition in
+                          (indices: partition.indices, result: rawResults[i].convertFromRESP(to: Command.Response.self))

Collaborator

adam-fowler Mar 10, 2026

Is it worth not converting to the response type and parsing the RESP in the assemble function. This could avoid multiple allocations, if a response is converted to an Array. In the case of MGET you end up rebuilding a RESPToken anyway.

Sources/Valkey/Cluster/ValkeyClusterMultiKeyCommand.swift

+                  /// - Parameter indices: Positions into this command's full key list that
+                  ///   belong to a single hash slot.
+                  /// - Returns: A new command of the same type scoped to those keys.
+                  func subCommand(for indices: [Int]) -> Self

Collaborator

adam-fowler Mar 10, 2026

Suggested change

      
                func subCommand(for indices: [Int]) -> Self
          
                func createSubCommand(for indices: [Int]) -> Self

Sources/Valkey/Cluster/ValkeyClusterMultiKeyCommand.swift

+                  ///     indices for that slot and the sub-command result.
+                  /// - Returns: The fully assembled response in original key order.
+                  /// - Throws: ``ValkeyClientError`` if any sub-result is a failure.
+                  static func assemble(

Collaborator

adam-fowler Mar 10, 2026

Suggested change

      
                static func assemble(
          
                static func assembleResults(

Sources/Valkey/Cluster/ValkeyClusterClient+MultiKeyCommands.swift

+                      let partitions = partitionBySlot(keys: keys)
+                      // Build one sub-command per slot, cast to the type-erased protocol.
+                      let subCommands: [any ValkeyCommand] = partitions.map { command.subCommand(for: $0.indices) }

Collaborator

adam-fowler Mar 10, 2026

Is it worth having an early out when there is only one partition which calls the standard code path?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet