7 changes: 7 additions & 0 deletions .gitignore
@@ -10,6 +10,13 @@ out
packages/*/dist/**
**/.pnpm-store/**

# Vendored upstream sources are NOT build output. Anything under
# packages/*/src/vendor/*/upstream/ is a verbatim copy of an npm tarball and
# must ship with the package; the `dist` and `build` matchers above would
# otherwise silently exclude their `dist/` subdirs from git.
!packages/*/src/vendor/**


# dependencies
node_modules
**/node_modules/**
5 changes: 5 additions & 0 deletions eslint.config.mjs
@@ -351,6 +351,11 @@ export default tseslint.config([
message: "Please always import from '@clerk/shared/<module>' instead of '@clerk/shared'.",
name: '@clerk/shared',
},
{
name: 'base-64',
message:
"base-64 is vendored at packages/expo/src/vendor/base-64. Import { encode, decode } from '../vendor/base-64' instead. See packages/expo/src/vendor/base-64/README.md.",
},
Comment on lines +354 to +358
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Global base-64 restriction conflicts with parity tests and can break lint.

This rule is global, but the PR includes parity tests that intentionally import the upstream base-64 package for behavioral comparison. As written, those tests will be lint-blocked unless excluded.

Scope this restriction to non-test Expo source files (or add a test-file override exception).

Suggested fix (scope restriction away from test files)
   {
     name: 'repo/global',
@@
       'no-restricted-imports': [
         'error',
         {
           paths: [
             {
               message: "Please always import from '@clerk/shared/<module>' instead of '@clerk/shared'.",
               name: '@clerk/shared',
             },
-            {
-              name: 'base-64',
-              message:
-                "base-64 is vendored at packages/expo/src/vendor/base-64. Import { encode, decode } from '../vendor/base-64' instead. See packages/expo/src/vendor/base-64/README.md.",
-            },
           ],
@@
       ],
     },
   },
+  {
+    name: 'packages/expo base-64 restriction',
+    files: ['packages/expo/src/**/*.{ts,tsx,js,jsx}'],
+    ignores: ['packages/expo/src/**/__tests__/**', 'packages/expo/src/**/*.test.{ts,tsx,js,jsx}'],
+    rules: {
+      'no-restricted-imports': [
+        'error',
+        {
+          paths: [
+            {
+              name: 'base-64',
+              message:
+                "base-64 is vendored at packages/expo/src/vendor/base-64. Import { encode, decode } from '../vendor/base-64' instead. See packages/expo/src/vendor/base-64/README.md.",
+            },
+          ],
+        },
+      ],
+    },
+  },
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@eslint.config.mjs` around lines 354 - 358, The rule object with name
'base-64' currently blocks all imports of upstream base-64 and will lint-break
parity tests that intentionally import the package; update the ESLint
configuration so this restriction applies only to non-test Expo source (e.g.,
scope the rule to packages/expo/src/** or add an overrides entry that exempts
test files like **/*.test.* and **/__tests__/**), or add a negative pattern to
the rule's target to exclude test paths; locate the rule by the object with name
'base-64' and apply the file-globbing change or add an overrides block to allow
imports in test files.

],
patterns: [
{
2 changes: 1 addition & 1 deletion packages/expo/package.json
@@ -116,14 +116,14 @@
"@clerk/clerk-js": "workspace:^",
"@clerk/react": "workspace:^",
"@clerk/shared": "workspace:^",
- "base-64": "^1.0.0",
"react-native-url-polyfill": "2.0.0",
"tslib": "catalog:repo"
},
"devDependencies": {
"@clerk/expo-passkeys": "workspace:*",
"@expo/config-plugins": "^54.0.4",
"@types/base-64": "^1.0.2",
+ "base-64": "^1.0.0",
"esbuild": "^0.28.0",
"expo-apple-authentication": "^7.2.4",
"expo-auth-session": "^5.5.2",
6 changes: 5 additions & 1 deletion packages/expo/src/polyfills/base64Polyfill.ts
@@ -1,4 +1,8 @@
- import { decode, encode } from 'base-64';
// Vendored from base-64@1.0.0 — see ../vendor/base-64/README.md for the
// supply-chain rationale (this code becomes global.btoa/global.atob inside
// every Clerk-Expo customer app, which makes it an unusually high-leverage
// upstream dep to leave externalized).
+ import { decode, encode } from '../vendor/base-64';

import { isHermes } from '../utils';

89 changes: 89 additions & 0 deletions packages/expo/src/vendor/base-64/README.md
@@ -0,0 +1,89 @@
# Vendored: `base-64`

- **Upstream:** https://github.com/mathiasbynens/base64
- **Vendored version:** 1.0.0 (published 2020-12-12)
- **License:** MIT (see `upstream/LICENSE-MIT.txt`)
- **Maintainer (npm):** `mathias` (single)
- **Owner inside Clerk:** `@clerk/expo` maintainers

## Why this is vendored

`base-64` is the userland `atob`/`btoa` implementation that `@clerk/expo` polyfills onto `global` for Hermes engines that lack native versions (see `packages/expo/src/polyfills/base64Polyfill.ts`). When `@clerk/expo` is installed by a customer, the published tarball declares `base-64` as a runtime external. The customer's package manager resolves `^1.0.0` against the npm registry at install time and fetches `base-64` fresh — Clerk's own `pnpm-lock.yaml` is not in the published tarball and does not participate in the customer's install.

That externality, combined with the fact that `base-64`'s exports become **`global.btoa` and `global.atob` inside every Clerk-Expo customer's running app**, makes this dependency a high-leverage supply-chain target. Two attack chains motivate vendoring:

### Chain 1 — Publisher account compromise

`base-64` is single-maintainer. If `mathias`'s npm account is compromised (phishing, token theft, hostile transfer, social engineering) and a malicious `base-64@1.0.1` is published, every customer install of `@clerk/expo` after the publish resolves `^1.0.0` to `1.0.1` and pulls the compromised bytes. The polyfill assigns the compromised `encode`/`decode` to `global.btoa`/`global.atob`. Every subsequent `btoa()` or `atob()` call anywhere in the customer's app — including third-party libraries and Clerk's own runtime — runs through the compromised implementation, silently.
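To make the blast radius concrete, here is a minimal sketch of how a polyfill of this kind hands control of `btoa`/`atob` to whatever package the resolver picked. It is an illustration under assumptions, not the contents of `base64Polyfill.ts`: `installBase64Polyfill`, `Base64Globals`, and the stand-in codecs are hypothetical names, and the real polyfill additionally gates on Hermes.

```typescript
type Codec = (input: string) => string;

interface Base64Globals {
  btoa?: Codec;
  atob?: Codec;
}

// Hypothetical installer: fills in btoa/atob only where the engine lacks them.
function installBase64Polyfill(target: Base64Globals, encode: Codec, decode: Codec): void {
  if (typeof target.btoa !== 'function') {
    target.btoa = encode;
  }
  if (typeof target.atob !== 'function') {
    target.atob = decode;
  }
}

// Stand-in for an engine's global object without native btoa/atob.
const fakeGlobal: Base64Globals = {};

// Stand-ins for whatever encode/decode the resolved package exports.
const resolvedEncode: Codec = input => `encoded(${input})`;
const resolvedDecode: Codec = input => `decoded(${input})`;

installBase64Polyfill(fakeGlobal, resolvedEncode, resolvedDecode);

// Any later caller, including third-party code, now runs the resolved package's function.
const seenByThirdParty = fakeGlobal.btoa?.('session-token');
```

Here `seenByThirdParty` is produced entirely by `resolvedEncode`, which is why the provenance of those two functions matters more than for a typical leaf dependency.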

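Historical precedents for unexpected code reaching consumers through a trusted package name: `event-stream` (2018, publish rights handed to a new maintainer who added a malicious dependency), `ua-parser-js` (2021, maintainer npm account hijacked), `colors.js`/`faker.js` (2022, sabotaged by their own maintainer), and `xz-utils` (2024, backdoor inserted by a long-term co-maintainer; distributed outside npm). The mechanisms differ, but in each case a new release carried code that downstream consumers did not expect.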

Pinning to an exact version (`"base-64": "1.0.0"` instead of `"^1.0.0"`) would close Chain 1 for direct deps — the customer's resolver would never pick up `1.0.1`. But:

### Chain 2 — Registry-level same-version substitution

If the npm registry itself serves substituted bytes for `base-64@1.0.0` (registry compromise, npm-internal account compromise, or any other path around npm's rule that an unpublished version number cannot be reused), the customer's first install of `@clerk/expo` fetches the substituted bytes, computes their hash, and records it as the trusted reference in their lockfile. There is no prior hash to compare against. Future installs with `--frozen-lockfile` "verify" against the now-poisoned hash.
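The first-install gap is a trust-on-first-use problem, which a short sketch can illustrate. It assumes the lockfile stores a `sha512-<base64>` integrity string for each tarball, as npm and pnpm lockfiles do; `Lockfile`, `integrityOf`, and `install` are hypothetical names for illustration, not package-manager APIs.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical in-memory lockfile: "name@version" -> integrity string.
type Lockfile = Map<string, string>;

// Integrity string in the `sha512-<base64 digest>` form used by lockfiles.
function integrityOf(tarball: Uint8Array): string {
  return 'sha512-' + createHash('sha512').update(tarball).digest('base64');
}

// Records the hash on first sight; verifies against the recorded hash afterwards.
function install(lock: Lockfile, spec: string, served: Uint8Array): 'recorded' | 'verified' {
  const actual = integrityOf(served);
  const expected = lock.get(spec);
  if (expected === undefined) {
    // First install: there is nothing to compare against, so the served bytes become the reference.
    lock.set(spec, actual);
    return 'recorded';
  }
  if (expected !== actual) {
    throw new Error(`integrity mismatch for ${spec}`);
  }
  return 'verified';
}
```

If the first call to `install` receives substituted bytes, every later call "verifies" successfully against them, and the genuine tarball is the one that fails the check.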

Exact-pinning does not address Chain 2 — the resolver still routes through the registry for `1.0.0`, and whatever bytes the registry serves are what the customer gets. Vendoring is the only mechanism that closes both chains: the customer's resolver never fetches `base-64` from the npm registry because the bytes ship inside the `@clerk/expo` npm tarball.

| | Caret range | Exact pin | Vendored |
|---|---|---|---|
| Chain 1 (new version) | ❌ | ✅ | ✅ |
| Chain 2 (same-version substitution, first install) | ❌ | ❌ | ✅ |
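The difference between the first two columns comes down to how a resolver matches versions. The sketch below assumes a simplified caret rule for versions at or above 1.0.0 (same major, not lower than the base); real resolvers implement the full semver range grammar, and `parse`, `satisfies`, and `resolve` are hypothetical helpers rather than npm APIs.

```typescript
type Version = [major: number, minor: number, patch: number];

// Parses a plain "x.y.z" string; pre-release tags are out of scope for this sketch.
function parse(v: string): Version {
  const parts = v.split('.').map(Number);
  if (parts.length !== 3 || parts.some(n => !Number.isInteger(n) || n < 0)) {
    throw new Error(`not a plain x.y.z version: ${v}`);
  }
  const [major = 0, minor = 0, patch = 0] = parts;
  return [major, minor, patch];
}

// Negative if a < b, zero if equal, positive if a > b.
function compare(a: Version, b: Version): number {
  return a[0] - b[0] || a[1] - b[1] || a[2] - b[2];
}

// Supports two range forms only: "^x.y.z" (with x >= 1) and an exact "x.y.z".
function satisfies(candidate: string, range: string): boolean {
  const c = parse(candidate);
  if (range.startsWith('^')) {
    const base = parse(range.slice(1));
    if (base[0] < 1) {
      throw new Error('this sketch only covers caret ranges with major >= 1');
    }
    return c[0] === base[0] && compare(c, base) >= 0;
  }
  return compare(c, parse(range)) === 0;
}

// Picks the highest published version that satisfies the range.
function resolve(published: string[], range: string): string | undefined {
  return published
    .filter(v => satisfies(v, range))
    .sort((a, b) => compare(parse(a), parse(b)))
    .pop();
}
```

With `published = ['1.0.0', '1.0.1']`, `resolve(published, '^1.0.0')` selects `'1.0.1'`, while `resolve(published, '1.0.0')` stays on `'1.0.0'`. Neither form says anything about which bytes the registry serves for the selected version, which is the Chain 2 row.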

See `Sessions/S161/PROPOSAL.md` for the broader proposal.

## What's in `upstream/`

`upstream/` is a **byte-for-byte copy of the published `base-64@1.0.0` npm tarball.** Nothing in that directory has been modified by Clerk.

```
upstream/
├── base64.js (~164 lines, single-file UMD; exports {encode, decode, version})
├── package.json (upstream's; see "inert fields" below)
├── LICENSE-MIT.txt (MIT)
└── README.md (upstream's README)
```

### Inert fields in `upstream/package.json`

The upstream `package.json` is preserved so future refresh diffs against new `base-64` versions match byte-for-byte against `npm pack` output. These fields are **inert in this location** — they do nothing here:

- `scripts.*` — not executed; no install lifecycle runs against vendored code.
- `main: "base64.js"` — bundlers do not walk inner `package.json` of nested `src/vendor/` directories for relative imports; the Clerk-side `index.ts` (in this directory) handles resolution explicitly.

## How consumers import it

Inside `@clerk/expo`:

```ts
import { decode, encode } from '../vendor/base-64';
```

The Clerk-side `index.ts` shim re-exports from `./upstream/base64.js` with typed signatures, abstracting the bundler-resolution detail (see `index.ts`).

## Refreshing from upstream

`upstream/` is intentionally frozen. Don't routinely sync.

Refresh **only** when:

- A CVE is reported against `mathiasbynens/base64` upstream, OR
- A spec-relevant bug is discovered.

`base-64@1.0.0` has been the latest release since 2020-12-12. Any newer upstream release should be treated as anomalous and investigated before adoption.

Procedure:

1. `npm pack base-64@<new-version>` in a scratch directory; extract.
2. `diff -r` against `upstream/`.
3. Read every changed line. Apply Clerk's threat model — is this a fix you want, or a behavior change you don't?
4. If accepting: replace `upstream/` with the new tarball contents in one commit (no other changes).
5. Re-run `parity.spec.ts` to confirm behavioral equivalence still holds.
6. Update the vendored version in this README.
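Steps 2 and 3 can be partly mechanized to produce the list of files that need a line-by-line read. This is a minimal sketch, assuming both trees are already extracted on disk; `hashTree` and `changedFiles` are hypothetical helpers, and the output is a reading list, not a substitute for reviewing the diff.

```typescript
import { createHash } from 'node:crypto';
import { readdirSync, readFileSync } from 'node:fs';
import { join, relative } from 'node:path';

// Maps each file's path (relative to root) to the SHA-256 of its contents.
function hashTree(root: string): Map<string, string> {
  const hashes = new Map<string, string>();
  const walk = (dir: string): void => {
    for (const entry of readdirSync(dir, { withFileTypes: true })) {
      const full = join(dir, entry.name);
      if (entry.isDirectory()) {
        walk(full);
      } else if (entry.isFile()) {
        const digest = createHash('sha256').update(readFileSync(full)).digest('hex');
        hashes.set(relative(root, full), digest);
      }
    }
  };
  walk(root);
  return hashes;
}

// Lists every path that was added, removed, or whose contents differ.
function changedFiles(vendored: Map<string, string>, candidate: Map<string, string>): string[] {
  const paths = new Set(Array.from(vendored.keys()).concat(Array.from(candidate.keys())));
  return Array.from(paths)
    .filter(p => vendored.get(p) !== candidate.get(p))
    .sort();
}
```

A call such as `changedFiles(hashTree('upstream'), hashTree('scratch/package'))` (both paths hypothetical) returns an empty array when the candidate matches the vendored copy byte for byte.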

## Tests

`__tests__/parity.spec.ts` asserts that identical inputs produce identical outputs from the upstream npm package (kept as a `devDependency` of `@clerk/expo`) and from this vendored copy. It covers the RFC 4648 test vectors, Latin-1 and binary edge cases, a deterministic fuzz loop, and agreement on invalid input.

The upstream `base-64` stays a `devDependency` of `@clerk/expo` for as long as this parity test exists. Removing the devDep would mean giving up the empirical comparator.
135 changes: 135 additions & 0 deletions packages/expo/src/vendor/base-64/__tests__/parity.spec.ts
@@ -0,0 +1,135 @@
/**
* Vendor parity test for base-64@1.0.0.
*
* Loads encode/decode from BOTH the upstream npm package (kept in
* @clerk/expo devDependencies for as long as this test exists) AND the
* vendored copy (via the ../index.ts shim over ../upstream/base64.js).
* Asserts that identical inputs produce identical outputs.
*
* What this test buys:
* - The byte-equivalence check (tools/verify-vendor.sh) proves the bytes
* on disk match the upstream tarball.
* - This test proves that loading those bytes through our bundler /
* test runtime produces upstream's behavior — closing the gap between
* "the bytes are correct" and "the runtime does what upstream does."
*
* When to remove this test:
* - When the upstream `base-64` devDependency is removed from
* packages/expo/package.json, this file must be removed too (the
* `from 'base-64'` import would fail to resolve). Removing the devDep
* means losing the comparator; don't do that unless the vendoring
* approach is fully accepted.
*
* See packages/expo/src/vendor/base-64/README.md for the broader vendoring
* rationale and the customer-side attack chains that motivate it.
*/

// eslint-disable-next-line no-restricted-imports -- intentional: comparator for vendor parity
import { decode as upstreamDecode, encode as upstreamEncode } from 'base-64';
import { describe, expect, it } from 'vitest';

import { decode as vendoredDecode, encode as vendoredEncode } from '../';

/**
* RFC 4648 §10 test vectors — canonical base64 fixtures from the spec.
* Every base64 implementation should handle these identically.
*/
const RFC4648_VECTORS: Array<[plain: string, encoded: string]> = [
['', ''],
['f', 'Zg=='],
['fo', 'Zm8='],
['foo', 'Zm9v'],
['foob', 'Zm9vYg=='],
['fooba', 'Zm9vYmE='],
['foobar', 'Zm9vYmFy'],
];

/**
* Cases beyond the RFC vectors — the polyfill use case is hijacking
* global.btoa / global.atob, so the parity surface must cover everything
* an arbitrary caller (third-party library) might throw at it.
*/
const EXTRA_VECTORS: Array<[label: string, plain: string]> = [
['empty', ''],
['single null byte', '\x00'],
['Latin-1 high', '\xff'],
['arbitrary binary', '\x00\x01\x02\x03\x04\xfd\xfe\xff'],
['ASCII letters', 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'],
['ASCII symbols', '!@#$%^&*()_+-=[]{}|;:,.<>?/~`\'"\\'],
['long string', 'a'.repeat(1024)],
['exactly 3 bytes', 'abc'],
['exactly 4 bytes (forces == padding)', 'abcd'],
['exactly 6 bytes (no padding)', 'abcdef'],
['JSON shape', '{"foo":"bar","baz":[1,2,3]}'],
];

describe('base-64 vendor parity — RFC 4648 fixtures', () => {
it.each(RFC4648_VECTORS)('encode(%j) === %j', (plain, encoded) => {
expect(vendoredEncode(plain)).toBe(upstreamEncode(plain));
expect(vendoredEncode(plain)).toBe(encoded); // canonical anchor
});

it.each(RFC4648_VECTORS)('decode(%j) === %j', (plain, encoded) => {
expect(vendoredDecode(encoded)).toBe(upstreamDecode(encoded));
expect(vendoredDecode(encoded)).toBe(plain); // canonical anchor
});
});

describe('base-64 vendor parity — extra fixtures', () => {
it.each(EXTRA_VECTORS)('encode/decode roundtrip: %s', (_label, plain) => {
const vEnc = vendoredEncode(plain);
const uEnc = upstreamEncode(plain);
expect(vEnc).toBe(uEnc);
expect(vendoredDecode(vEnc)).toBe(upstreamDecode(uEnc));
expect(vendoredDecode(vEnc)).toBe(plain);
});
});

describe('base-64 vendor parity — deterministic fuzz', () => {
it('matches upstream for 512 random binary strings of varying length', () => {
for (let seed = 0; seed < 512; seed++) {
const len = (seed * 37) % 256;
const chars: string[] = [];
for (let i = 0; i < len; i++) {
// Latin-1 range only (0-255) — what base-64 contracts on.
chars.push(String.fromCharCode((seed + i * 13) & 0xff));
}
const plain = chars.join('');
const vEnc = vendoredEncode(plain);
const uEnc = upstreamEncode(plain);
expect(vEnc, `seed=${seed}`).toBe(uEnc);
expect(vendoredDecode(vEnc), `seed=${seed}`).toBe(upstreamDecode(uEnc));
}
});
});

describe('base-64 vendor parity — error handling', () => {
// Both upstream and vendored should throw on invalid input. We don't pin
// the error message, just that they agree on which inputs throw.
const INVALID: Array<[label: string, invalid: string]> = [
['truncated padding', 'Zm9'],
['invalid char', 'Zm$9v'],
['stray padding', 'Zm9v='],
];
it.each(INVALID)('decode throws-or-matches on invalid input: %s', (_label, invalid) => {
let vErr: unknown = null;
let uErr: unknown = null;
let vResult: string | null = null;
let uResult: string | null = null;
try {
vResult = vendoredDecode(invalid);
} catch (e) {
vErr = e;
}
try {
uResult = upstreamDecode(invalid);
} catch (e) {
uErr = e;
}
// Either both threw or both produced the same output.
expect(Boolean(vErr)).toBe(Boolean(uErr));
if (!vErr) {
expect(vResult).toBe(uResult);
}
});
});
47 changes: 47 additions & 0 deletions packages/expo/src/vendor/base-64/index.ts
@@ -0,0 +1,47 @@
/**
* Clerk-side entry for the vendored `base-64` package.
*
* Why this file exists:
* - `upstream/` is a verbatim copy of the base-64@1.0.0 npm tarball and
* must not be modified (byte-equivalence is the security claim — see
* ./README.md).
* - The upstream ships as a UMD wrapper — its exports are assigned to
* `module.exports` inside an IIFE that TypeScript cannot trace
* statically. tsc therefore infers an empty export surface for the
* `.js` file even with `allowJs: true`. The cast below asserts the
* export shape; the parity test (`./__tests__/parity.spec.ts`)
* verifies the assertion empirically against the upstream npm
* package.
* - Consumers import from this file (`../vendor/base-64`); they should
* not reach into `upstream/` directly.
*/

import * as upstreamModule from './upstream/base64.js';

interface Base64Module {
encode: (input: string) => string;
decode: (input: string) => string;
version: string;
}

// tsc infers `typeof upstreamModule` as `{}` because base-64's UMD wrapper
// hides the module.exports assignment inside an IIFE. The shape asserted
// here is verified empirically by __tests__/parity.spec.ts against the
// upstream npm package (kept as a devDependency for this purpose).
const upstream: Base64Module = upstreamModule as unknown as Base64Module;

/**
* Encode a binary-safe string to base64. Compatible with the WHATWG
* `btoa()` algorithm (RFC 4648 §4). Throws on non-Latin-1 input.
*
* Vendored from base-64@1.0.0 — see ./README.md.
*/
export const encode: (input: string) => string = upstream.encode;

/**
* Decode a base64-encoded string back to a binary string. Compatible with
* the WHATWG `atob()` algorithm (RFC 4648 §4). Throws on invalid input.
*
* Vendored from base-64@1.0.0 — see ./README.md.
*/
export const decode: (input: string) => string = upstream.decode;
20 changes: 20 additions & 0 deletions packages/expo/src/vendor/base-64/upstream/LICENSE-MIT.txt
@@ -0,0 +1,20 @@
Copyright Mathias Bynens <https://mathiasbynens.be/>

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.