Skip to content

[TRTLLM-8236][infra] fix platform tag for public wheel#14616

Open
niukuo wants to merge 3 commits into
NVIDIA:mainfrom
niukuo:wheel_platform
Open

[TRTLLM-8236][infra] fix platform tag for public wheel#14616
niukuo wants to merge 3 commits into
NVIDIA:mainfrom
niukuo:wheel_platform

Conversation

@niukuo
Copy link
Copy Markdown
Collaborator

@niukuo niukuo commented May 27, 2026

Summary by CodeRabbit

  • Chores
    • Updated build configuration to support platform-specific wheel tagging across different Linux distributions and architectures (x86_64, aarch64).
    • Added platform tag option to build system for more accurate wheel metadata.

Review Change Stack

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

@niukuo niukuo requested review from a team as code owners May 27, 2026 04:00
@niukuo niukuo requested review from mlefeb01 and yiqingy0 May 27, 2026 04:00
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 27, 2026

📝 Walkthrough

Walkthrough

The PR adds platform tag support to the wheel build system. The build script now accepts a --platform-tag option and computes the target platform tag by combining the provided tag prefix with the detected machine architecture. Jenkins sanity-check configuration is updated to specify platform tags for each build environment.

Changes

Platform Tag Support

Layer / File(s) Summary
Wheel build platform tag implementation
scripts/build_wheel.py
main function signature accepts new platform_tag parameter (default 'linux'), detects current machine architecture via platform.machine() with ARM normalization, and appends --config-setting="--plat-name={platform_tag}_{arch}" to the build backend invocation. CLI argument --platform-tag is wired to expose this option to callers.
Jenkins sanity-check platform tag configuration
jenkins/L0_Test.groovy
Sanity-check config tuple schema is documented with a new platformTag slot. Both x86 and aarch64 sanity-check config arrays append concrete platform tag values: manylinux_2_28 for Ubuntu x86_64 entries, manylinux_2_39 for Ubuntu aarch64 entry, and linux for DLFW-based entries.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ⚠️ Warning PR description is entirely the template with no custom content added to address the specific changes made to platform tag handling. Add a clear Description section explaining the issue being fixed and the solution. Include Test Coverage section with test details and complete all PR Checklist items explicitly.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: fixing the platform tag for public wheel distribution, directly related to the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
jenkins/L0_Test.groovy (1)

4235-4288: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

platformTag is configured but never consumed in the sanity-check pipeline.

The new slot is added in configs, but the job logic still only uses indices up to values[6]; values[7] never reaches runLLMBuild, so wheel builds ignore the configured platform tag.

Suggested fix
-            def wheelName = ""
+            def wheelName = ""
+            def platformTag = values[7]
             def cpver = "cp312"
             def pyver = "3.12"
@@
             buildRunner("[${toStageName(values[1], key)}] Build") {
-                wheelName = runLLMBuild(pipeline, cpu_arch, values[3], "", versionLocal, cpver)
+                wheelName = runLLMBuild(pipeline, cpu_arch, values[3], "", versionLocal, cpver, platformTag)
             }

And update runLLMBuild(...) to pass it through:

-def runLLMBuild(pipeline, cpu_arch, reinstall_dependencies=false, wheel_path="", version_local="", cpver="cp312")
+def runLLMBuild(pipeline, cpu_arch, reinstall_dependencies=false, wheel_path="", version_local="", cpver="cp312", platformTag="linux")
@@
-    withCredentials([usernamePassword(credentialsId: "urm-artifactory-creds", usernameVariable: 'CONAN_LOGIN_USERNAME', passwordVariable: 'CONAN_PASSWORD')]) {
-        trtllm_utils.llmExecStepWithRetry(pipeline, script: "#!/bin/bash \n" + "cd tensorrt_llm/ && python3 scripts/build_wheel.py --use_ccache -G Ninja -j ${BUILD_JOBS} -D 'WARNING_IS_ERROR=ON' ${buildArgs}")
+    withCredentials([usernamePassword(credentialsId: "urm-artifactory-creds", usernameVariable: 'CONAN_LOGIN_USERNAME', passwordVariable: 'CONAN_PASSWORD')]) {
+        trtllm_utils.llmExecStepWithRetry(pipeline, script: "#!/bin/bash \n" + "cd tensorrt_llm/ && python3 scripts/build_wheel.py --use_ccache -G Ninja -j ${BUILD_JOBS} -D 'WARNING_IS_ERROR=ON' --platform-tag ${platformTag} ${buildArgs}")
     }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@jenkins/L0_Test.groovy` around lines 4235 - 4288, The configs add a new
eighth slot (platformTag) but the pipeline still reads only up to values[6], so
values[7] is never passed into the build and wheel logic; update the code that
unpacks config arrays (the spot using values[0]..values[6]) to also extract
values[7] as platformTag and thread that platformTag through to any downstream
calls (notably the runLLMBuild invocation) so runLLMBuild receives and uses the
platformTag parameter; also update runLLMBuild's signature/parameters to accept
platformTag and propagate it into the wheel/build steps where platform-specific
behavior is applied.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@scripts/build_wheel.py`:
- Around line 1105-1107: The platform tag (--plat-name) is only added to
extra_wheel_build_args once, so the second python -m build invocation doesn't
receive it; ensure both wheel build calls receive the same plat override by
appending the f' --config-setting="--plat-name={platform_tag}_{arch}"' to
extra_wheel_build_args before any build is executed (or explicitly pass that
string into both build invocations), referencing the existing variables
platform_tag, arch, and extra_wheel_build_args and the two python -m build calls
so both produce the same deterministic wheel filename.
- Around line 1311-1314: The --platform-tag value is interpolated into a shell
command unsafely (from parser.add_argument into a shell=True call); validate or
sanitize platform_tag before using it in any shell invocation by restricting it
to a safe pattern or a whitelist (e.g. allowed values like "linux",
"manylinux_2_28" or a regex such as ^[A-Za-z0-9_\\-]+$), and reject/raise
argparse.ArgumentTypeError for invalid values (or better: avoid shell=True and
build the command as a list). Locate the platform_tag variable from
parser.add_argument and add a validator function (passed as the type in
add_argument or run immediately after parsing) that enforces the whitelist/regex
and returns the sanitized value before any subprocess.run/call where
platform_tag is interpolated.

---

Outside diff comments:
In `@jenkins/L0_Test.groovy`:
- Around line 4235-4288: The configs add a new eighth slot (platformTag) but the
pipeline still reads only up to values[6], so values[7] is never passed into the
build and wheel logic; update the code that unpacks config arrays (the spot
using values[0]..values[6]) to also extract values[7] as platformTag and thread
that platformTag through to any downstream calls (notably the runLLMBuild
invocation) so runLLMBuild receives and uses the platformTag parameter; also
update runLLMBuild's signature/parameters to accept platformTag and propagate it
into the wheel/build steps where platform-specific behavior is applied.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4e9e6bad-7c16-4278-8202-2fc3f0d57554

📥 Commits

Reviewing files that changed from the base of the PR and between 37079f6 and 2e38273.

📒 Files selected for processing (2)
  • jenkins/L0_Test.groovy
  • scripts/build_wheel.py

Comment thread scripts/build_wheel.py Outdated
Comment on lines +1105 to +1107
machine = platform.machine()
arch = 'aarch64' if machine in ('aarch64', 'arm64') else machine
extra_wheel_build_args += f' --config-setting="--plat-name={platform_tag}_{arch}"'
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Platform tag is only applied to one of the two wheel builds.

The new --plat-name setting is appended once, but the second python -m build call (Line 1149) does not receive it. That can produce mixed wheel tags in the same output directory and break downstream consumers expecting a single, deterministic wheel filename.

Suggested fix
-        build_run(
-            f'\"{venv_python}\" -m build {project_dir} --skip-dependency-check --no-isolation --wheel --outdir "{dist_dir}"',
-            env=env)
+        build_run(
+            f'\"{venv_python}\" -m build {project_dir} --skip-dependency-check {extra_wheel_build_args} --no-isolation --wheel --outdir "{dist_dir}"',
+            env=env)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/build_wheel.py` around lines 1105 - 1107, The platform tag
(--plat-name) is only added to extra_wheel_build_args once, so the second python
-m build invocation doesn't receive it; ensure both wheel build calls receive
the same plat override by appending the f'
--config-setting="--plat-name={platform_tag}_{arch}"' to extra_wheel_build_args
before any build is executed (or explicitly pass that string into both build
invocations), referencing the existing variables platform_tag, arch, and
extra_wheel_build_args and the two python -m build calls so both produce the
same deterministic wheel filename.

Comment thread scripts/build_wheel.py Outdated
Comment on lines +1311 to +1314
parser.add_argument("--platform-tag",
type=str,
default='linux',
help="Wheel platform tag prefix; arch is appended automatically (e.g. linux, manylinux_2_28)")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate --platform-tag before shell interpolation.

platform_tag is injected into a shell command string (shell=True) without validation. A malformed or crafted value can alter command execution.

Suggested fix
+    def _platform_tag(value: str) -> str:
+        # PEP 425/600-friendly subset; keeps shell-safe and predictable.
+        if not re.fullmatch(r"[A-Za-z0-9_.]+", value):
+            raise ValueError(
+                "--platform-tag may only contain letters, digits, '_' and '.'")
+        return value
+
     parser.add_argument("--platform-tag",
-                        type=str,
+                        type=_platform_tag,
                         default='linux',
                         help="Wheel platform tag prefix; arch is appended automatically (e.g. linux, manylinux_2_28)")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/build_wheel.py` around lines 1311 - 1314, The --platform-tag value is
interpolated into a shell command unsafely (from parser.add_argument into a
shell=True call); validate or sanitize platform_tag before using it in any shell
invocation by restricting it to a safe pattern or a whitelist (e.g. allowed
values like "linux", "manylinux_2_28" or a regex such as ^[A-Za-z0-9_\\-]+$),
and reject/raise argparse.ArgumentTypeError for invalid values (or better: avoid
shell=True and build the command as a list). Locate the platform_tag variable
from parser.add_argument and add a validator function (passed as the type in
add_argument or run immediately after parsing) that enforces the whitelist/regex
and returns the sanitized value before any subprocess.run/call where
platform_tag is interpolated.

@niukuo niukuo force-pushed the wheel_platform branch 2 times, most recently from a7848bc to 5ff7502 Compare May 27, 2026 04:26
@niukuo
Copy link
Copy Markdown
Collaborator Author

niukuo commented May 27, 2026

/bot run --skip-test

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50454 [ run ] triggered by Bot. Commit: 505e41e Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50454 [ run ] completed with state FAILURE. Commit: 505e41e
/LLM/main/L0_MergeRequest_PR pipeline #39973 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50486 [ run ] triggered by Bot. Commit: 505e41e Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50486 [ run ] completed with state FAILURE. Commit: 505e41e
/LLM/main/L0_MergeRequest_PR pipeline #39994 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50499 [ run ] triggered by Bot. Commit: 1bb225c Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50520 [ run ] triggered by Bot. Commit: ea3d86b Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50499 [ run ] completed with state ABORTED. Commit: 1bb225c

Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50520 [ run ] completed with state SUCCESS. Commit: ea3d86b
/LLM/main/L0_MergeRequest_PR pipeline #40028 (Partly Tested) completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

CI Report

Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50560 [ run ] triggered by Bot. Commit: 8d3da7f Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50560 [ run ] completed with state FAILURE. Commit: 8d3da7f
/LLM/main/L0_MergeRequest_PR pipeline #40060 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50567 [ run ] triggered by Bot. Commit: 8d3da7f Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50567 [ run ] completed with state SUCCESS. Commit: 8d3da7f
/LLM/main/L0_MergeRequest_PR pipeline #40068 (Partly Tested) completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

CI Report

Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50706 [ run ] triggered by Bot. Commit: 5e33b31 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50706 [ run ] completed with state FAILURE. Commit: 5e33b31
/LLM/main/L0_MergeRequest_PR pipeline #40189 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50707 [ run ] triggered by Bot. Commit: 5e33b31 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50726 [ run ] triggered by Bot. Commit: 252a1b3 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50707 [ run ] completed with state ABORTED. Commit: 5e33b31

Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50726 [ run ] completed with state FAILURE. Commit: 252a1b3
/LLM/main/L0_MergeRequest_PR pipeline #40209 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50746 [ run ] triggered by Bot. Commit: dc3f395 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #50746 [ run ] completed with state SUCCESS. Commit: dc3f395
/LLM/main/L0_MergeRequest_PR pipeline #40225 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants