test(golden): harness, three sentinels, semver gate (PR 1/9)#9058
Open
bnusunny wants to merge 29 commits into
Open
test(golden): harness, three sentinels, semver gate (PR 1/9)#9058bnusunny wants to merge 29 commits into
bnusunny wants to merge 29 commits into
Conversation
Three sentinel cases (SAM, CFN, LE-foreach) used by the harness in the next commit. Each case carries metadata.yaml + README.md + src/app.py. Pinned outputs (expected.*.yaml) are generated by update_goldens.py once the harness is in place.
Strips volatile Metadata.SamTransformMetrics, sort_keys=True, single trailing newline. Used by both the harness and the regenerator so they cannot diverge on serialization.
Mirrors sam build's pipeline up to the template-write step: LE expansion -> SAM transform -> local-artifact-path normalization. Calls expand_language_extensions and the upstream samtranslator.translator.translator.Translator directly; no subprocess, no AWS calls.
Walks packageable artifact properties (using RESOURCES_WITH_LOCAL_PATHS + RESOURCES_WITH_IMAGE_COMPONENT) and rewrites each to a deterministic s3://golden-bucket/<sha256>. Runs _export_global_artifacts_pass before and after rewriting to mirror PR #9030's pre-LE AWS::Include pass.
CLI supports --filter / --check / --diff / --new. Dry-run modes never
mutate disk; --new only writes missing files. Three sentinel cases
have expected.{build,package}.yaml pinned.
One parametrized test per case for build + package output. Mismatch hint points at update_goldens.py and explains the semver gate's expectations.
Pure-function check() takes a changed-files list + base/head versions and returns (exit_code, message). git interaction lives behind a thin CLI shim that reads git diff --name-status and pulls __version__ from both refs.
Triggered only when corpus pins or samcli/__init__.py change. Tiny, auditable, isolated from build.yml so it can be reasoned about independently.
Separate pytest invocation so the corpus harness doesn't enter the samcli coverage calculation.
- Type-narrow rtype before passing to RESOURCES_WITH_LOCAL_PATHS.get. - Annotate Translator.translate() return so harness's run_build_pipeline return type is preserved (no_any_return). - Use Optional[str] = None on GoldenS3Uploader keyword args (PEP 484 no-implicit-optional).
Drop unused pytest/yaml imports across the new tests, and replace the rename-detection magic number in check_semver_bump.py with a named constant. No behavior change. Surfaces from ruff check tests/golden when run directly (the project's `make lint` target restricts to samcli/schema, so these escaped CI).
…rect scripts Both scripts have docstrings and corpus-test failure hints that tell users to invoke them as `python tests/golden/<script>.py ...` from the repo root. But Python does not auto-add the repo root to sys.path for direct script invocation, so the absolute `from tests.golden...` imports fail with ModuleNotFoundError. Prepend the repo root to sys.path when each script is run as __main__, gated on `__package__ is None` so the module form continues to work.
…nt-aware artifact digest
Two correctness fixes for the regen CLI and the harness it drives:
1. update_goldens.py --new: peek at expected.{build,package}.yaml before
running _generate. When both already exist, skip the case entirely
instead of running the full build+package pipeline and discarding the
result. Add a regression test that monkeypatches _generate to raise so
any future re-introduction of the wasted call is caught.
2. harness._packageable_replacement: include resource_id in the digest
salt. After the build pass replaces local paths with
<<BUILT_ARTIFACT>>, `current` is constant — so every Lambda Code in
the corpus collapsed to the same S3Key, and a regression that swapped
two functions' artifacts would have gone undetected. Salting on
resource_id makes per-resource digests distinct (Alpha vs Beta within
one template, etc.). Pins regenerated to reflect the new digests.
…n limitations Cleanups surfaced by the code-quality review of PR 1: - harness.run_package_pipeline: drop the dead first _export_global_artifacts_pass(original) call. The original template was read, mutated by that pass, and then never referenced again — the artifact rewrite operates on a deep-copied build_output, so the first pass had no observable effect. Update the docstring to match. - harness._walk_artifact_properties: replace the bare `-> List:` return type with `-> List[Tuple[str, str, Dict[str, Any]]]` so downstream callers and reviewers can see the shape without reading the body. - harness module docstring: add a Known limitations subsection covering four issues that the next contributor should expect (image / ECR over-rewrite, _stub_local_uris_for_translator drift, GoldenS3Uploader not subclassing S3Uploader, missing AWS::Include sentinel). They will be addressed by PR 2 / corpus growth. - test_harness_package.test_package_sam_case_rewrites_code_to_s3_uri: replace the lenient `s3_uri_field == "golden-bucket"` short-circuit with two specific assertions that the dict has the expected keys. - test_harness_build: drop unused `tmp_path` fixture parameters from two tests. - update_goldens.py / check_semver_bump.py: silence mypy `[unreachable]` on the script-form sys.path bootstrap added in the previous fix.
CI's black-check runs against Python 3.10 and rejected the formatting written by black running under Python 3.13 (the venv's interpreter), which preferred longer single-line function signatures that 3.10's parser couldn't accept. Re-running black with an explicit --target-version py310 collapses the affected lines back to the multi-line shape CI accepts. Pure formatting; no behavior change. 37/37 tests still pass.
Per GitHub's security hardening guide, context expressions like
${{ github.base_ref }} are substituted into the script string before the
shell parses it. Although base_ref is constrained by git ref-name rules,
the canonical mitigation is to pass the value via env: so the shell sees
a literal variable expansion.
git diff --name-status emits C<score>\tOLD\tNEW for copies (when -C is enabled via repo config). Previously only R was special-cased; C fell through with status="C", which doesn't match A/M/D filters — so a copy of an existing pin would silently bypass the gate. Note the asymmetry: - Rename: source goes away (D) and target appears (A). - Copy: source persists, only the target is new (A only). Renamed _RENAME_DIFF_PARTS to _RENAME_OR_COPY_PARTS to reflect the broader meaning. Added parser-level tests covering both rename and copy lines, plus an end-to-end test asserting a copy is observed (not silently elided).
main() unconditionally read base/head versions before checking whether any corpus pin changed. _read_version_at_ref raises if the regex misses or if samcli/__init__.py is missing at the ref, so on every unrelated PR — the common case — it could fail for unrelated reasons even though the gate is supposed to be self-gating. Now main() short-circuits with exit 0 the moment we determine no corpus pin paths are in the diff. Added two tests: one asserting _read_version_at_ref is never called when the diff has no corpus pin changes, and a counterpart asserting it IS called (and the gate runs) when a corpus pin does change.
The parser only rejected --check + --new and --diff + --new. Passing both --check and --diff was accepted, with --diff winning on output. The docstring describes them as alternatives, so reject the combination at the parser — symmetric with the existing --new checks, and less surprise.
Pull magic numbers (rename change count, argparse usage exit code) into named constants, and let black collapse the adjacent-string formatting on the new parser tests.
CI runs without AWS_DEFAULT_REGION and without ~/.aws/config; the upstream SAM Translator's parameter SDK calls ``boto3.session.Session()`` and unconditionally raises ``NoRegionFound`` when the resulting session has no region — even when AWS::Region is already in the parameter_values dict we pass. Hand the Translator a session pinned to the same region we feed via parameter_values (us-east-1, from ``IntrinsicsSymbolTable.DEFAULT_PSEUDO_PARAM_VALUES``). Pure in-process state, no AWS calls. Confirmed by reproducing the CI failure locally with ``env -u AWS_DEFAULT_REGION -u AWS_REGION HOME=/tmp/no-aws-config pytest tests/golden`` — 16 failures match CI without the fix; 43 pass with it.
…merge
The harness was producing pinned expected.{build,package}.yaml files that
showed the LE-EXPANDED template (AlphaFunction/BetaFunction as siblings),
but real `sam build` and `sam package` preserve the original Fn::ForEach
structure with merged artifact paths inside the body — that's the design
in PR #8637.
run_build_pipeline now delegates the LE merge to
``BuildContext._get_template_for_output`` (invoked via a __new__ shell
since the method only reads ``self.*`` for other methods on the same
class). run_package_pipeline mirrors PackageContext's LE flow:
``_export_global_artifacts_pass`` → LE expand → rewrite artifacts →
``merge_language_extensions_s3_uris`` to copy URIs back into the original
ForEach body.
The SAM transform is skipped on the LE branch — the body keeps
``AWS::Serverless::Function`` with ``CodeUri`` (matching real
sam build / sam package output, where the SAM transform runs server-side
at deploy time). Update the corresponding unit test assertions to verify
``Fn::ForEach::*`` is preserved instead of expanded.
…rving harness Both files now show ``Fn::ForEach::Workers`` at the top of Resources with the body resource's ``CodeUri`` rewritten to the build placeholder (build) or an s3://golden-bucket/<sha256> URI (package). This matches what real ``sam build`` / ``sam package`` write to disk for an LE template — see PR #8637.
…ne reimplementation
The previous harness reimplemented pieces of the build/package flow
(running the SAM translator manually, walking artifact properties,
calling internal merge helpers directly), and that drift produced
two user-visible bugs in the pinned outputs:
1. ``Fn::ForEach`` was replaced with the expanded form. Real
``sam build`` preserves the ForEach so CloudFormation can re-expand
it server-side at deploy time (PR #8637).
2. ``AWS::Serverless::Function`` got SAM-transformed to
``AWS::Lambda::Function`` (with an auto-generated IAM Role). Real
``sam build`` does NOT run the SAM transform; that happens
server-side at deploy.
Both were symptoms of running steps SAM CLI itself does not run. Fix
the underlying issue by driving the same ``BuildContext`` and
``PackageContext`` classes the CLI instantiates and stubbing only the
network/compute boundaries:
- ``ApplicationBuilder.build`` returns an ``ApplicationBuildResult``
with artifacts pointing at a sentinel directory under
``<build_dir>/__sam_golden_artifact__/<full_path>/``. The placeholder
file's content is the resource's full path, which makes the
downstream content-addressed S3 URI deterministic per resource.
- ``S3Uploader.upload`` / ``upload_with_dedup`` return
``s3://golden-bucket/<sha256>`` where the digest is taken over the
uploaded file's content.
After build, the on-disk ``.aws-sam/build/template.yaml`` is read back
and any artifact-property string that resolves under the sentinel
directory is rewritten to ``<<BUILT_ARTIFACT>>``. Everything else
(LE expansion, ForEach merge, Mappings generation, AWS::Include
export) flows through the real CLI code paths.
Also: ``samcli.yamlhelper.yaml_parse`` mutates the global
``yaml.SafeLoader`` to emit ``OrderedDict``; once any caller imports
yamlhelper, every subsequent ``yaml.safe_load`` returns OrderedDict
and ``yaml.safe_dump`` rejects it. Coerce parsed templates to plain
``dict`` in both the harness and ``normalize`` so the test corpus and
pin regenerator see plain dicts.
Test signatures change: ``run_package_pipeline`` no longer takes a
``build_output`` dict (PackageContext re-loads the template from
disk); a new ``run_build_and_package`` helper produces both pins
from a single build for callers that need both.
Five sentinel pins now reflect what real ``sam build`` and
``sam package`` write to disk:
- ``sam_resources/serverless_function_zip/expected.build.yaml`` now
contains ``AWS::Serverless::Function`` (no SAM transform), with
``CodeUri: <<BUILT_ARTIFACT>>`` and a SAM-CLI-added
``Metadata.SamResourceId``. The pre-existing auto-generated
``HelloFunctionRole`` is gone (the SAM transform runs server-side
at deploy time, not at build time).
- ``sam_resources/serverless_function_zip/expected.package.yaml``
preserves the same shape, with ``CodeUri`` rewritten to
``s3://golden-bucket/<sha>``.
- ``packageable_resources/lambda_function_zip/expected.build.yaml``
now carries ``Metadata.SamResourceId`` (added by SAM CLI's stack
provider for raw CFN Lambdas, too).
- ``packageable_resources/lambda_function_zip/expected.package.yaml``
carries the same metadata; ``Code`` becomes the
``{S3Bucket, S3Key}`` dict shape that real ``sam package`` emits.
- ``language_extensions/foreach_static_zip/expected.package.yaml``
digest changes because the package URI is now content-addressed
over the harness's sentinel placeholder file (whose content is the
resource full path) rather than a synthesized hash. ``Fn::ForEach``
preservation is unchanged.
Real `sam package` zips the build artifact directory before "uploading"
it. The harness's S3Uploader.upload stub was hashing the resulting zip
bytes — but raw zip bytes embed OS-specific metadata (DOS timestamps,
Unix mode bits, central directory ordering), so the same source content
produced different digests on Linux vs Windows. The Windows runner saw
the package pins regenerate with new hashes and the corpus tests failed
on byte-exact diff:
- CodeUri: s3://golden-bucket/dd2bf2d8...
+ CodeUri: s3://golden-bucket/1c1f1e20...
Hash zip files via a normalized representation: sorted member names plus
each member's content sha256. The envelope metadata is excluded from the
digest, so identical source content produces an identical hash on every
OS. Non-zip artifacts continue to be hashed by raw bytes. Detection uses
zipfile.is_zipfile so .jar / .war and other zip-format artifacts are
treated the same as a Lambda zip.
Pins regenerated once with the new hash; from now on Linux and Windows
runners agree.
…lpath Real `sam build` writes CodeUri/Code/etc. as a relative path from the output template's directory to the artifact directory — for a top-level resource the result is just the resource id (e.g. `CodeUri: HelloFunction`) because both end up under .aws-sam/build/. The harness was substituting a synthetic <<BUILT_ARTIFACT>> placeholder instead, which obscured what real sam build actually produces and made the build pin disagree with the CLI's own output. Drop the placeholder rewriter entirely. Stage the case dir (template.yaml + src/) into the temp tree before driving BuildContext, and put .aws-sam/build/ inside the staged tree. ApplicationBuilder's relpath then resolves to .aws-sam/build/<full_path> which move_template collapses to <full_path> when writing to .aws-sam/build/template.yaml — exactly what real sam build writes. Net: harness drops the artifact-walker, the path-rewriter, the RESOURCES_WITH_LOCAL_PATHS / RESOURCES_WITH_IMAGE_COMPONENT imports, the BUILT_ARTIFACT_PLACEHOLDER constant, and the sentinel-dirname indirection (~150 lines). All sentinel build pins now show the real deterministic relpath. Tests updated to assert the relpath shape; one test renamed for accuracy (`test_build_cfn_case_replaces_code_with_placeholder` → `test_build_cfn_case_keeps_lambda_function`). 42/42 pass under CI-like env.
…erge queue This repo uses GitHub's merge queue (per build.yml's existing merge_group trigger). When a PR enters the queue, GitHub dispatches a merge_group event against an ephemeral ref, not pull_request. A required check that only declares on.pull_request never reports a status on the merge_group event, so the merge queue treats it as missing/pending and blocks the merge — the same skipped-but-required failure mode the path-filter pitfall was about, just under a different trigger. Mirror build.yml's pattern: listen on both pull_request and merge_group with the same branch filters. The check_semver_bump.py script's self-gating (exit 0 when no expected.*.yaml changed) only helps when the workflow runs at all; this fix makes it run. Add a small "resolve base ref" step because the two event types expose the target branch differently: pull_request sets github.base_ref to the bare branch name; merge_group leaves base_ref empty and exposes the full ref at github.event.merge_group.base_ref (e.g. "refs/heads/develop"), which we strip to match the pull_request shape before passing to check_semver_bump.py. Caught by review on PR #9057.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
PR 1 of the golden-templates rollout (design in #9057). Adds the in-process harness, three sentinel cases, the semver gate, and the workflow.
What lands
tests/golden/harness.py—run_build_pipelineandrun_package_pipeline. Callsexpand_language_extensions(PR feat: make AWS::LanguageExtensions local processing opt-in via --language-extensions flag #9033's explicit entry) and the SAMTranslatordirectly; replaces local artifact paths with<<BUILT_ARTIFACT>>and S3 URIs withs3://golden-bucket/<sha256-of-resource>. No subprocess, no AWS calls.tests/golden/normalize.py— deterministic YAML rendering (sorted keys, droppedMetadata.SamTransformMetrics, single trailing newline).tests/golden/update_goldens.py— CLI:--filter,--check,--diff,--new. Same harness functions used by tests, so regenerator and verifier cannot diverge. Works both aspython tests/golden/update_goldens.pyandpython -m tests.golden.update_goldens.tests/golden/check_semver_bump.py— purecheck()+ git CLI shim. Modifications/deletions of existing pins require major-version bump in the same PR. Additions ungated. Renames forced major (conservative). Self-gates: returns exit 0 when noexpected.*.yamlchanged.tests/golden/conftest.py+test_corpus.py— parametrized: one test ID per case dir (e.g. `language_extensions/foreach_static_zip`). Two tests per case — build and package output diffed against pinned files. Mismatch hint points at `update_goldens.py --diff` and explains the semver rule..github/workflows/golden-semver-gate.yml— runs unconditionally on every PR. Does NOT use `on.pull_request.paths:` filtering (per review feedback on docs(designs): golden templates and semver check (PR 0/9) #9057): path filtering at the trigger level skips the workflow on unrelated PRs, no status posts, branch protection treats a required check as missing/pending. The script self-gates instead.Makefile— `test` and `test-all` targets each include `pytest tests/golden` as a separate invocation (so it doesn't enter the 94% `samcli` coverage gate).Known limitations (will surface in PR 2 onward)
Documented inline in `harness.py`'s module docstring. Listed here so the next contributor knows to expect them:
These are deliberate scope limits, not bugs — the harness is correct for what's currently committed and covers the specific bug shapes from #9004, #9005, #9027, #9029. The wave plan has each axis extending the harness as needed.
Test plan