Skip to content

Add release workflow#264

Open
blakeembrey wants to merge 8 commits into
masterfrom
be/ci-release
Open

Add release workflow#264
blakeembrey wants to merge 8 commits into
masterfrom
be/ci-release

Conversation

@blakeembrey
Copy link
Copy Markdown
Member

Adds a release workflow in GitHub Actions.

@blakeembrey blakeembrey requested review from a team and sheplu February 23, 2026 21:17
Copy link
Copy Markdown
Member

@jonchurch jonchurch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idk that I have anything new to add, so I suppose I'll list the tradeoffs as I've clarified my understanding over the past few months.

The threat model that I want to see us operate under is that a compromise of a repo admin's Github PAT cannot lead to an npm publish. E.g. npm i in the 2 hour window when there's malware in someone's dep tree somewhere and I get my github cli PAT exfiled.

Today we are protected via local publishing w/ 2fa npm gate.

The tradeoff w/ TP is that we gain release automation, but we become vulnerable to the above threat. We do not currently use any npm tokens in CI, so the TP benefit of obseleting them does not apply to us.

I went down the rabbit hole on all the Github 2fa gates for Trusted Publishing I could think of, envionments w/ required approvals, verifying tags are signed by a known key, etc.

The core issue with gating at Github is a "fox in the the hen house" problem where:

  • Any gate we can configure lives in Github settings, or workflows
  • A compromised repo admin PAT is sufficient to disable all those gates, via API calls to reconfigure environments (dropping required reviewers, allowing self approval, tampering w/ workflows, rewriting branch protection).
  • No 2fa is required to do the above via the API (I verified the environment tampering via gh cli with a dummy repo)

(claude has more technical terms for the structural issue: "GitHub's auth domain and config domain are the same trust domain. A token that can act on the repo can also rewrite the rules governing how the repo can be acted upon.")

I know none of that is in this PR, but I have been trying to figure out how to harden a TP model bc the workflow in the PR (which is the recommended setup by githhub/npm) is susceptible today to someone w/ tag push access having their GH PAT compromised.

I don't think TP can satisfy the threat model I desire of "a compromise of a repo admin's Github PAT cannot lead to an npm publish."

I evaluated the step security wait-for-secrets approach of eschewing TP and bringing in an external workflow to bring back 2fa prompt. It relies on long (90 day) lived tokens in CI, but it at least closes the risk of a gh token exfil leading to publish package. Assuming that packages and user accounts are all 2fa publishing protected, an attacker can do maximal damage within a repo w/ a repo admin's stolen token but cannot cause a publish.

I dont like any of our options to be honest. My wishlist is just that there was first party CI publishing w/ a npm 2fa challenge. The Step Security workflow is closest to that, but is not TP, and has overhead that is gross (keeping people's npm tokens in repo secrets, rotating, 3rd party touching 2fa tokens). I dont like it, but it does at least pass the threat model. So idk that I'd recommend it over local publish, but if CI publish is a requirement, that does seeem like the safest one.

I wrote this all up to get my thoughts out there and see how we feel. Am I wrong about any facts here? Is there agreement on the threat model I outlined about preventing publish on repo admin compromise? Do we want to operate under a different threat model? Are we already operating under a different threat model?

@blakeembrey
Copy link
Copy Markdown
Member Author

You're absolutely right on GitHub token exploitation and I haven't come up with anything better, so I don't think there's any acceptable solution today without some sort of additional feature development from NPM. The best case is that they enable either a staging area or 2FA that can be done async, because Trusted Publishing is preferable to managing any kind of token rotation ourselves or potentially exposing ourselves to new third parties that could be exploited. I'd prefer to keep the surface area of attack minimal and, at least with TP, that's only GitHub.

Comment thread .github/workflows/release.yml Outdated
Comment thread .github/workflows/release.yml Outdated
Comment thread .github/workflows/release.yml Outdated
Copy link
Copy Markdown
Member

@bjohansebas bjohansebas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, that new command definitely simplifies everything

Comment thread .github/workflows/release.yml Outdated
@blakeembrey
Copy link
Copy Markdown
Member Author

@jonchurch Just a heads up but you are still blocking this PR.

That said, I don't intend to merge it until latest node comes with npm@11.15.0. Currently it's getting set up with 11.13.0 so this won't work unless we add a step to update npm.

Comment thread .github/workflows/release.yml Outdated
Comment on lines +26 to +28
- run: npm ci
- run: npm test
- run: npm stage publish # Runs prepublish script.
Copy link
Copy Markdown
Member

@jonchurch jonchurch May 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do yall think about:

  • dropping the test run from the publish job
  • splitting out the pack and publish steps, so the publish job doesn't need to run npm ci? That firewalls the OIDC token from the npm i which could steal mint a token if somehow a poisoned install script got into the tree. Idk how duplicate stage attempts work, aka if they attempted to stage a version before us and then we stage the same version. Maybe it is extra complexity without buying us much?
name: release

on:
  push:
    tags:
      - "v*.*.*"

jobs:
  # No id-token, no environment. ALL arbitrary code runs here, where
  # there's nothing to steal: dependency install scripts, your tests,
  # and your prepack/prepare build (npm pack triggers those).
  pack:
    name: Test & pack
    runs-on: ubuntu-latest
    permissions:
      contents: read                       # checkout only
    steps:
      - uses: actions/checkout@93cb6efe18208431cddfb8368fd83d5badbf9bfd # v5.0.1
      - uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4.4.0
        with:
          node-version: latest
      - run: npm ci
      - run: npm test # idk still if we want test at all? doesn't hurt in this step I don't think
      - run: npm pack --pack-destination ./tarball   # runs prepack/prepare
      - uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
        with:
          name: npm-tarball
          path: ./tarball/*.tgz
          if-no-files-found: error
          retention-days: 1

  # Holds id-token + environment. Runs NO arbitrary code: downloads the
  # prebuilt tarball and stages it. `npm publish <tarball>` treats the
  # tarball as opaque bytes, so no lifecycle scripts run in this job.
  publish:
    name: Publish to npm
    needs: pack
    runs-on: ubuntu-latest
    environment:
      name: release
    permissions:
      id-token: write                      # OIDC claim for npm trusted publishing
    steps:
      - uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4.4.0
        with:
          node-version: latest
      - uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
        with:
          name: npm-tarball
          path: ./tarball
      - run: npm stage publish ./tarball/*.tgz

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's unnecessary. If the vulnerability is from a dependency, won't it still end up modifying things to get into the tarball and be published? Is the concern the id-token getting exfiltrated? I'm trying to think about what this is attempting to protect against and that's all I can think of.

Copy link
Copy Markdown
Member Author

@blakeembrey blakeembrey May 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized in my original proposal I had npm ci --ignore-scripts but it wasn't in this PR, so I added it.

Another note is that using pack does make the things a bit more complex because I need to refactor how it "builds". It isn't a big deal independently but every package I maintain uses prepublishOnly, which runs before publish (and not before pack?), but it would need to be removed for this workflow to work. It's not an issue, but it is a footgun that makes it easy for someone manually publishing to forget to build the code therefore publishing an outdated or old version.

Edit: prepublishOnly. We can switch to prepare which resolves this but I explicitly use prepublishOnly because I don't want to build every time I run npm install. All easily solvable but I just don't know if it's even adding any extra protection to need refactoring here.

@jonchurch jonchurch self-requested a review May 26, 2026 21:06
@jonchurch jonchurch dismissed their stale review May 26, 2026 21:12

not blocking

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants