Skip to content

fix: benchmark frequency on CWF, use pcm-tpmi to count number of compute dies#676

Open
harp-intel wants to merge 2 commits intomainfrom
cwf-freq-bench
Open

fix: benchmark frequency on CWF, use pcm-tpmi to count number of compute dies#676
harp-intel wants to merge 2 commits intomainfrom
cwf-freq-bench

Conversation

@harp-intel
Copy link
Copy Markdown
Contributor

multi-die processors requires core interleaving for frequency benchmarking

This pull request updates the logic for detecting and handling specific CPU models in the internal/script/scripts.go script, particularly for GNR and CWF architectures. The main improvements include extending support to an additional CPU model, switching from lspci to pcm-tpmi for more accurate compute die detection, and updating dependencies accordingly.

CPU Model Support and Detection:

  • Extended the special handling logic to include both GNR (model 173) and CWF (model 221) CPUs, instead of just GNR.
  • Replaced the method for counting compute dies from parsing lspci output to using pcm-tpmi, which provides a more accurate count.
  • Updated the calculation of dies_per_socket to use the new compute die count.

Dependency Updates:

  • Changed the script's dependencies to require pcm-tpmi instead of lspci, reflecting the new detection method.

…ute dies

Signed-off-by: Harper, Jason M <jason.m.harper@intel.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates PerfSpect’s x86 frequency benchmark script to better handle multi-die Intel CPUs (GNR/CWF) by interleaving core IDs across dies, and switches compute-die counting from lspci-based heuristics to pcm-tpmi output parsing.

Changes:

  • Extend special interleaving logic to apply to both GNR (model 173) and CWF (model 221).
  • Replace compute-die detection from lspci device counting to parsing pcm-tpmi output, and recompute dies_per_socket accordingly.
  • Update the frequency benchmark script dependency list to include pcm-tpmi instead of lspci.

Comment on lines +1242 to +1246
output=$(pcm-tpmi 2 0x10 -d -b 26:26)
compute_die_count=0
while read -r line; do
if [[ $line == *"entry"* && $line == *"instance"* && $line == *"value 0"* ]]; then
compute_die_count=$((compute_die_count + 1))
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pcm-tpmi execution/parsing isn’t validated here. If pcm-tpmi isn’t present on the target PATH, fails, or its output format changes, compute_die_count will stay 0 and the script will later miscompute or crash. Consider checking the command exit status and that the parsed count is a positive integer before proceeding (otherwise fall back to no interleaving or emit a clear error and exit).

Copilot uses AI. Check for mistakes.
Comment on lines 1250 to 1254
num_sockets=$(lscpu | grep -E '^Socket\(s\):' | awk '{print $2}')
# echo "Number of devices: $num_devices"
# echo "Number of sockets: $num_sockets"
num_devices_per_die=2
# Calculate the number of dies per socket
dies_per_socket=$((num_devices / num_sockets / num_devices_per_die))
dies_per_socket=$(( compute_die_count / num_sockets ))
# echo "Number of dies per socket: $dies_per_socket"
# Calculate the number of cores per die
cores_per_die=$((num_cores_per_socket / dies_per_socket))
Copy link

Copilot AI Mar 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dies_per_socket=$(( compute_die_count / num_sockets )) can evaluate to 0 (or error) if num_sockets is 0/empty or compute_die_count < num_sockets, which then causes a division-by-zero in cores_per_die=$((num_cores_per_socket / dies_per_socket)). Add guards to ensure num_sockets > 0 and dies_per_socket >= 1 (and optionally that the division is exact) before using these values.

Copilot uses AI. Check for mistakes.
Signed-off-by: Harper, Jason M <jason.m.harper@intel.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment on lines +1249 to 1262
output=$(pcm-tpmi 2 0x10 -d -b 26:26)
compute_die_count=0
while read -r line; do
if [[ $line == *"entry"* && $line == *"instance"* && $line == *"value 0"* ]]; then
compute_die_count=$((compute_die_count + 1))
fi
done <<< "$output"
# echo "Number of compute dies: $compute_die_count"
num_sockets=$(lscpu | grep -E '^Socket\(s\):' | awk '{print $2}')
# echo "Number of devices: $num_devices"
# echo "Number of sockets: $num_sockets"
num_devices_per_die=2
# Calculate the number of dies per socket
dies_per_socket=$((num_devices / num_sockets / num_devices_per_die))
dies_per_socket=$(( compute_die_count / num_sockets ))
# echo "Number of dies per socket: $dies_per_socket"
# Calculate the number of cores per die
cores_per_die=$((num_cores_per_socket / dies_per_socket))
# echo "Number of cores per die: $cores_per_die"
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compute-die detection logic can produce compute_die_count=0 (e.g., pcm-tpmi failure/permission issue/unexpected output), which then makes dies_per_socket=0 and triggers a division-by-zero when computing cores_per_die. Please add explicit error handling: check pcm-tpmi exit status, validate compute_die_count and num_sockets are >0, and ensure compute_die_count is evenly divisible by num_sockets before doing arithmetic; otherwise return a clear error or skip interleaving.

Copilot uses AI. Check for mistakes.
Comment on lines +127 to +157
var numDies int
var coreMultiplier int
if strings.Contains(arch, cpus.UarchSRF) {
numDies = 4
coreMultiplier = 4
} else if strings.Contains(arch, cpus.UarchCWF) {
numDies = 3
coreMultiplier = 4
} else if strings.Contains(arch, cpus.UarchGNR_X3) {
archMultiplier = 3
numDies = 3
coreMultiplier = 1
} else if strings.Contains(arch, cpus.UarchGNR_X2) {
archMultiplier = 2
numDies = 2
coreMultiplier = 1
} else {
archMultiplier = 1
numDies = 1
coreMultiplier = 1
}
for _, count := range bucketCoreCounts {
if startRange > count {
adjustedCount := count * coreMultiplier
if startRange > adjustedCount {
break
}
if archMultiplier > 1 {
totalCoreCount := count * archMultiplier
if totalCoreStartRange > int(totalCoreCount) {
break
}
totalCoreBuckets = append(totalCoreBuckets, fmt.Sprintf("%d-%d", totalCoreStartRange, totalCoreCount))
totalCoreStartRange = int(totalCoreCount) + 1
totalCoreCount := adjustedCount * numDies
if totalCoreStartRange > int(totalCoreCount) {
break
}
dieCoreBuckets = append(dieCoreBuckets, fmt.Sprintf("%d-%d", startRange, count))
startRange = int(count) + 1
totalCoreBuckets = append(totalCoreBuckets, fmt.Sprintf("%d-%d", totalCoreStartRange, totalCoreCount))
totalCoreStartRange = int(totalCoreCount) + 1
dieCoreBuckets = append(dieCoreBuckets, fmt.Sprintf("%d-%d", startRange, adjustedCount))
startRange = int(adjustedCount) + 1
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetSpecFrequencyBuckets now has new SRF/CWF-specific bucket math (numDies/coreMultiplier) but there are no unit tests covering this function. Since internal/extract/frequency_test.go already exists for related helpers, please add cases that validate the generated bucket ranges/headers for SRF and CWF (including the new CWF 3-die behavior and the coreMultiplier scaling) to prevent regressions.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants