docs(designs): golden templates and semver check (PR 0/9)#9057
Open
bnusunny wants to merge 9 commits into
Open
docs(designs): golden templates and semver check (PR 0/9)#9057bnusunny wants to merge 9 commits into
bnusunny wants to merge 9 commits into
Conversation
PR 0 of the golden-templates rollout. Documents the corpus, the in-process harness, the daily integration tier, and the semver gate that requires a major-version bump when an existing pinned output changes. After #8637 shipped, four output-shape regressions slipped past test coverage (#9004, #9005, #9027, #9029) before #9033 made local LE processing opt-in. The design pins the expanded output of representative templates so future regressions surface in code review rather than after release. No code changes; review and discussion happen on this PR before infrastructure lands in PR 1.
Three sentinel cases (SAM, CFN, LE-foreach) used by the harness in the next commit. Each case carries metadata.yaml + README.md + src/app.py. Pinned outputs (expected.*.yaml) are generated by update_goldens.py once the harness is in place.
Strips volatile Metadata.SamTransformMetrics, sort_keys=True, single trailing newline. Used by both the harness and the regenerator so they cannot diverge on serialization.
Per review on PR #9057: a workflow that uses on.pull_request.paths: to fire only on relevant changes cannot also be a required branch- protection check. When paths don't match, the workflow doesn't run, no status posts, and branch protection treats the required check as missing/pending — blocking the PR. check_semver_bump.py already self-gates (returns exit 0 when no expected.*.yaml files changed), so dropping the paths filter and running unconditionally is the canonical fix for the "skipped but required check" pitfall. The script does the gating; the workflow just runs it.
Reverts both: - fa61962 test(golden): normalize() — deterministic YAML rendering - e0280c2 test(golden): add sentinel template directories These belong in PR 1 (the harness PR), not PR 0 (the design doc PR). They were authored on this branch in error and pushed to PR #9057 before the mistake was caught. Force-push isn't permitted on protected feat/* branches, so reverting is the cleanest reset. The reverted changes are preserved on branch feat/golden-templates-pr1 (plus uncommitted WIP harness.py/test_harness_build.py from a parallel session) for the upcoming PR 1 to build on.
8 tasks
Move PR 2's subprocess-driven tier from tests/integration/golden_templates/ to tests/regression/golden_templates/. Aligns with the existing tests/regression/deploy/ and tests/regression/package/ suites — same shape (real `sam` subprocess, real AWS credentials, daily-only). Bonus: integration-tests.yml's existing other-and-e2e matrix entry already collects tests/regression/, so no new matrix entry or workflow edit is needed for the regression tier to start running. Renamed "integration tier" → "regression tier" throughout for clarity.
…ates The previous commit moved the subprocess runner under tests/regression/, but tests/regression/ is reserved for sam-vs-aws-cli parity tests (deploy_regression compares sam deploy output to aws cloudformation deploy output; package_regression does the same for package). Goldens assert the opposite: sam CLI output matches its own pinned past output, not aws-cli's current output. Putting them under tests/regression would conflate two different test semantics and invite future contributors to misuse the suite. Move back to tests/integration/golden_templates/ alongside the existing tests/integration/buildcmd/, tests/integration/package/, and tests/integration/deploy/ — all subprocess-driven SAM CLI behavior tests, which is exactly what the integration tier is. Adds back the requirement for a new \`golden-templates\` entry in integration-tests.yml's matrix (one extra line); explicitly documents why tests/regression/ would be the wrong home so the choice stands up to future review.
The previous motivation leaned on a retroactive "we'd have caught the four bugs" claim. That claim is partially false: pre-#8637, SAM CLI didn't process AWS::LanguageExtensions locally, so a pre-existing LE corpus pinning behavior nobody had committed to would not have been written in the first place. The right moment to add the corpus is the feature PR itself, not before. Replace the motivation with the actual gap: existing tests verify internals (unit) and narrow output assertions (integration), but neither puts the rendered template in front of a human reviewer at PR time. Subtle interactions between pipeline stages — LE expansion vs. SAM transform vs. global-transform export vs. artifact-path merge — are invisible at review time. #8637 is exhibit A: the four post-release bugs are each a YAML-shape diff a reviewer would have caught on sight, if the diff had been visible. Also reframe the semver pitch around the actual customer impact: the most impactful behavior changes don't merely surprise tooling, they break customer CI/CD pipelines outright. #9004 is the canonical example — sam build raised InvalidTemplateException on any template using !FindInMap over a parameter without Default, blocking every pipeline that consumed it. The 1.159 → 1.160 bump was minor per semver, signalling "additive, backwards-compatible" — but local LE processing turning on by default broke working pipelines, which is the definition of a backwards-incompatible change. The version number didn't tell consumers to audit their pipelines, so they didn't.
bnusunny
added a commit
that referenced
this pull request
Jun 2, 2026
…erge queue This repo uses GitHub's merge queue (per build.yml's existing merge_group trigger). When a PR enters the queue, GitHub dispatches a merge_group event against an ephemeral ref, not pull_request. A required check that only declares on.pull_request never reports a status on the merge_group event, so the merge queue treats it as missing/pending and blocks the merge — the same skipped-but-required failure mode the path-filter pitfall was about, just under a different trigger. Mirror build.yml's pattern: listen on both pull_request and merge_group with the same branch filters. The check_semver_bump.py script's self-gating (exit 0 when no expected.*.yaml changed) only helps when the workflow runs at all; this fix makes it run. Add a small "resolve base ref" step because the two event types expose the target branch differently: pull_request sets github.base_ref to the bare branch name; merge_group leaves base_ref empty and exposes the full ref at github.event.merge_group.base_ref (e.g. "refs/heads/develop"), which we strip to match the pull_request shape before passing to check_semver_bump.py. Caught by review on PR #9057.
Per review on PR #9057: the workflow as proposed only declared on.pull_request, but this repo uses GitHub's merge queue (see build.yml's existing merge_group trigger). When a PR enters the queue, GitHub dispatches a merge_group event against an ephemeral ref, not pull_request. A required check that only listens on pull_request never reports a status on the merge_group event, so the merge queue treats it as missing/pending and blocks the merge — the same skipped-but-required failure mode the path-filter pitfall was about, just under a different trigger. Mirror build.yml: listen on both pull_request and merge_group with the same branch filters. The check_semver_bump.py script's self-gating (exit 0 when no expected.*.yaml changed) only helps when the workflow runs at all; this fix makes it run. Add a "resolve base ref" step because the two event types expose the target branch differently: pull_request sets github.base_ref to the bare branch name; merge_group leaves base_ref empty and exposes the full ref at github.event.merge_group.base_ref (e.g. "refs/heads/develop"), which we strip to match the pull_request shape before passing to check_semver_bump.py. Document both flavors of the pitfall (path-filter and single-trigger) in the design's "skipped but required" callout so future maintainers see why we chose this trigger configuration.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
PR 0 of the golden-templates rollout. Documents the corpus, the in-process harness, the daily integration tier, and the semver gate that requires a major-version bump when an existing pinned output changes.
Why
After #8637 shipped, four output-shape regressions slipped past test coverage (#9004, #9005, #9027, #9029) before #9033 made local LE processing opt-in. The design pins the expanded output of representative templates so future regressions surface in code review rather than after release.
Wave plan
Test plan