+ "details": "## Summary\n\nThe `docker.system_packages` field in `bentofile.yaml` accepts arbitrary strings that are interpolated directly into Dockerfile `RUN` commands without sanitization. Since `system_packages` is semantically a list of OS package names (data), users do not expect values to be interpreted as shell commands. A malicious `bentofile.yaml` achieves arbitrary command execution during `bentoml containerize` / `docker build`.\n\n## Affected Component\n\n- `src/_bentoml_sdk/images.py:85-89` — `.format(packages=\" \".join(packages))` into shell command\n- `src/bentoml/_internal/container/frontend/dockerfile/templates/base_debian.j2:13` — `{{ __options__system_packages | join(' ') }}`\n- `src/bentoml/_internal/bento/build_config.py:174` — No validation on `system_packages`\n- All distro install commands in `src/bentoml/_internal/container/frontend/dockerfile/__init__.py`\n\n## Affected Versions\n\nAll versions supporting `docker.system_packages` in `bentofile.yaml`, confirmed on 1.4.36.\n\n## Steps to Reproduce\n\n1. Create a project directory with:\n\n**service.py:**\n```python\nimport bentoml\n\n@bentoml.service\nclass MyService:\n    @bentoml.api\n    def predict(self) -> str:\n        return \"hello\"\n```\n\n**bentofile.yaml:**\n```yaml\nservice: \"service:MyService\"\ndocker:\n  system_packages:\n    - \"curl && id > /tmp/bentoml-pwned #\"\n```\n\n2. Run:\n```bash\nbentoml build\n```\n\n3. Examine the generated Dockerfile at `~/bentoml/bentos/my_service/<tag>/env/docker/Dockerfile`. Line 41 will contain:\n```dockerfile\nRUN apt-get install -q -y -o Dpkg::Options::=--force-confdef curl && id > /tmp/bentoml-pwned #\n```\n\n4. 
Running `bentoml containerize my_service:<tag>` will execute `id > /tmp/bentoml-pwned` as root during the Docker build.\n\n## Root Cause\n\nThe `system_packages` field values are treated as package names (data) by the user but are string-formatted directly into shell commands in the Dockerfile:\n\n```python\n# images.py:85-89\nself.commands.append(\n    CONTAINER_METADATA[self.distro][\"install_command\"].format(\n        packages=\" \".join(packages)  # No escaping\n    )\n)\n```\n\nWhere `install_command` is `\"apt-get install -q -y -o Dpkg::Options::=--force-confdef {packages}\"`.\n\nA `bash_quote` filter (wrapping `shlex.quote`) exists in the codebase and is registered in both Jinja2 environments, but it is only applied to environment variable values, never to `system_packages`.\n\n## Impact\n\n1. **Malicious repositories**: An attacker publishes an ML project with a crafted `bentofile.yaml`. Anyone who clones and builds it gets arbitrary code execution during `docker build`.\n2. **CI/CD compromise**: Automated pipelines running `bentoml containerize` on PRs that modify `bentofile.yaml` are vulnerable.\n3. **BentoCloud**: If BentoCloud builds images from user-supplied `bentofile.yaml`, this could achieve RCE on cloud infrastructure.\n4. **Supply chain**: Shared bentos or model repos in the BentoML ecosystem can contain malicious configs.\n\n## Suggested Fix\n\n### Option 1: Input validation (recommended)\n\nAdd a regex validator to `system_packages` in `build_config.py`:\n\n```python\nimport re\n\nVALID_PACKAGE_NAME = re.compile(r'^[a-zA-Z0-9][a-zA-Z0-9.+\\-_:]*$')\n\ndef _validate_system_packages(instance, attribute, value):\n    if value is None:\n        return\n    for pkg in value:\n        # fullmatch() rejects a trailing newline, which match() with `$` would accept\n        if not VALID_PACKAGE_NAME.fullmatch(pkg):\n            raise BentoMLException(\n                f\"Invalid system package name: {pkg!r}. 
\"\n                \"Package names may only contain alphanumeric characters, \"\n                \"dots, plus signs, hyphens, underscores, and colons.\"\n            )\n\nsystem_packages: t.Optional[t.List[str]] = attr.field(\n    default=None, validator=_validate_system_packages\n)\n```\n\n### Option 2: Output escaping\n\nApply `shlex.quote()` to each package name before interpolation in `images.py:system_packages()` and apply the `bash_quote` Jinja2 filter in `base_debian.j2`.",