Commit Graph

1910 Commits

Author SHA1 Message Date
Bilal Godil
e50c9293a2 fix(hexclave): update clientApp.version regex for @hexclave/js sentinel
apps/e2e/tests/js/auth-like.test.ts:68 asserts clientApp.version matches
`^js @stackframe/js@…`, but the build-time sentinel is stamped from
packages/js/package.json's `name` field (configs/tsdown/plugins.ts:10),
which PR 3 renamed @stackframe/js → @hexclave/js. Post-rename the
sentinel is `js @hexclave/js@1.0.0` (or 1.0.1 after auto-bump), so the
old regex fails.

Caught by a follow-up reviewer pass — the published-artifact sentinel
is the most direct observable that's affected by a source-name rename,
and this is the one e2e test that asserts on it directly.
2026-05-23 17:41:53 -07:00
Bilal Godil
e8f71c1d19 fix(hexclave): address PR 3 multi-agent review findings
Six fixes from the four parallel reviewers on the rename PR:

1. **Backend .well-known/ routes** — the sweep's directory walker had a
   bug that skipped every dot-prefixed dir (intended to exclude .git /
   .turbo / etc.), which also caught Next.js .well-known/ route folders.
   Two route handlers under apps/backend/.well-known/ still imported
   @stackframe/stack-shared/dist/* — flipped to @hexclave/shared/dist/*.

2. **Legacy docs/ folder excluded from workspace** — docs/ is the legacy
   fumadocs site, no longer maintained (replaced by docs-mintlify/).
   Per user direction, kept on disk for migration reference but dropped
   from pnpm-workspace.yaml so it no longer gates install / typecheck /
   lint. This is the right call given the typecheck failures in
   docs/src/ from the sweep carve-out were never going to be fixed.

3. **Root package.json scripts** — removed every `--filter=@hexclave/docs`
   reference now that docs/ isn't in the workspace: build:docs (rerouted
   to @hexclave/docs-mintlify), dev / dev:tui / dev:docs (dropped the
   filter), and the dead 'fern' script (was @hexclave/docs-only).

4. **build:demo filter** — fixed pre-existing bug where the script
   filtered package name 'demo-app', but the package is
   '@hexclave/example-demo-app'. Never resolved before, fixed now.

5. **github-config-push.test.ts legacy fallback** — the sweep flipped the
   test 'preserves the existing @stackframe/* import package…' from
   @stackframe/react to @hexclave/react, which made it a duplicate of
   the test above it and eliminated all coverage of the legacy regex
   branch in detectImportPackage. Renamed the modified test to reflect
   what it now tests, and added a new parallel test that feeds an
   @stackframe/react import and asserts the legacy import is preserved
   on output. Both branches of the dual-name regex are now covered.

6. **examples/react-example version** — the only package the sweep
   missed for the 1.0.0 version reset (unscoped name 'react-example'
   wasn't in the rename map). Bumped 2.8.103 → 1.0.0 for consistency.

Verification on a clean install:
- `pnpm install --frozen-lockfile` — clean (only pre-existing
  @vercel/mcp-adapter bin warnings).
- `pnpm typecheck` — 28/28 tasks green across the whole workspace.
- `pnpm lint` — 28/28 tasks green.

Reviewers flagged but I did NOT change (out of scope or non-actionable):
- npm-publish.yaml GH Environment name still says 'hexclave/stack-auth'
  — env names are managed in repo settings, not in YAML; cosmetic.
- RENAME-TO-HEXCLAVE.md references the deleted rewrite script — it's
  a planning doc / historical record, leaving as-is.
- code-examples and migration.mdx user-facing references to
  @stackframe/* — these are documentation that teaches the rename, by
  design they mention both names.
2026-05-23 17:41:53 -07:00
Bilal Godil
7125a9eff4 feat(hexclave): rename @stackframe/* → @hexclave/* (PR 3)
Source rename across the monorepo. Every publishable package now ships
under its @hexclave/* name natively, no rewrite-at-publish indirection.

Workflow + tooling:
- Delete scripts/rewrite-packages-to-hexclave.ts (one-shot mirror).
- Remove the mirror-publish block from .github/workflows/npm-publish.yaml.
  The remaining `pnpm publish -r` step publishes @hexclave/* natively.
- Flip the auto-bump changeset target from @stackframe/stack to
  @hexclave/next so 'Update package versions on dev' keeps working.
- Delete packages/template/src/internal/deprecation-warning.ts and its
  imports — @hexclave/* never warns about itself, and after PR 3 no
  @stackframe/* artifact is ever built from source again.

Package renames (publishable):
  @stackframe/react              → @hexclave/react
  @stackframe/stack              → @hexclave/next
  @stackframe/js                 → @hexclave/js
  @stackframe/stack-shared       → @hexclave/shared
  @stackframe/stack-ui           → @hexclave/ui
  @stackframe/stack-sc           → @hexclave/sc
  @stackframe/stack-cli          → @hexclave/cli
  @stackframe/tanstack-start     → @hexclave/tanstack-start
  @stackframe/dashboard-ui-components → @hexclave/dashboard-ui-components

Internal monorepo packages (private, never published) also renamed for
brand consistency: backend, dashboard, docs, mcp, skills, e2e-tests,
example apps, the swift-sdk, the monorepo root, etc. Cost is mechanical;
payoff is no stray @stackframe/* names left under apps/, examples/, sdks/.

Carve-outs intentionally kept under their legacy names:
- @stackframe/emails — virtual module imported by customer-stored email
  templates; the renderer in apps/backend/src/lib/email-rendering.tsx
  dual-aliases both names to the same backing module indefinitely.
- @stackframe/template — internal codegen source, never published; per
  docs-mintlify/migration.mdx 'internal packages keep names'.
- @stackframe/init-stack — deprecated; now marked private: true so the
  last published version on npm continues to serve old install commands
  but the workspace stops publishing it.

Backward-compat detection (so projects still on the last @stackframe/*
release keep working):
- packages/stack-shared/src/config-rendering.ts — CONFIG_IMPORT_PACKAGES
  table includes both @hexclave/* (canonical, first match wins) and
  legacy @stackframe/* names. Function renamed
  detectStackframeImportPackage → detectConfigImportPackage.
- apps/dashboard/src/lib/github-config-push.ts — import detection regex
  now matches both @hexclave/<name> and @stackframe/<name>, hexclave
  preferred.

Versions: every renamed package reset to 1.0.0 in source. The repo's
existing 'bump versions before merging to main' flow will move them to
1.0.1 on the first publish run, so the dual-publish 1.0.0 from PR 2 is
not overwritten.

Other touch-ups discovered during sweep:
- Root package.json: 'fern' script filter was @stackframe/docs (legacy
  typo, never resolved) → @hexclave/docs.
- README.md contributor note: @stackframe/XYZ → @hexclave/XYZ.
- packages/stack-cli/package.json: register `hexclave` bin alongside
  the legacy `stack` bin so `npx @hexclave/cli init` works on the
  natively-published artifact (PR 1481's rewrite script did this at
  publish time; now it's in source).
- packages/template/package-template.json: per-platform names + version
  flipped to hexclave + 1.0.0 to stay in sync with generated package.json.
- docs/package.json (legacy fumadocs folder, otherwise carved out of the
  brand sweep): workspace deps and name updated minimally so `pnpm
  install` resolves — content (MDX) intentionally untouched per the
  PR 2 scoping decision.

Carve-out files (skipped entirely by the sweep, intentional history):
- docs-mintlify/migration.mdx — teaches the rename, references both.
- RENAME-TO-HEXCLAVE.md — planning doc, references both indefinitely.
- legacy docs/ folder — content untouched per PR 2 carve-out.

generate-sdks regenerated packages/{react,stack,js} from template.
pnpm-lock.yaml regenerated. Typecheck green on stack-shared, stack, js,
react. Dashboard typecheck has pre-existing 'X is of type unknown'
errors that need to be investigated separately (likely a local
node_modules build state issue, not source).
2026-05-23 17:41:53 -07:00
Bilal Godil
d4f6f58735 feat(hexclave): PR 2 — visible rebrand to Hexclave
Rebased onto dev after PR 1475 (cl/hexclave-pr1) was squash-merged.
Squashes the original 46-commit branch (including PR1-duplicate commits
that arrived via cherry-picks/merges) into a single commit containing
only PR2's net delta over dev.

Original PR 1481 head: 94872de407873a1cabd4085deb21b69afe8d7699
(kept locally at backup/cl-romantic-mendel-5a2c25-pre-rebase)
2026-05-23 17:35:08 -07:00
BilalG1
f7e389809e
feat(hexclave): PR 1 — wire compatibility layer (invisible) (#1475)
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
DB migration compat / Check if migrations changed (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests (Local Emulator) / E2E Tests (Local Emulator, Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Runs E2E Fallback Tests / E2E Fallback Tests (Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Lint & build / lint_and_build (24) (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
DB migration compat / No migration changes (skipped) (push) Has been cancelled
## Summary

**Stacked on #1468** (`docs/hexclave-rename-plan` — the plan doc). Diff
vs that base = the actual PR 1 code.

This is **PR 1 of the Hexclave rebrand: the invisible compatibility
layer**. Everything is additive. Old SDKs, old wire identifiers, and old
env var names keep working unchanged. The backend dual-accepts and
dual-emits; new SDK code emits `x-hexclave-*` headers and the
`hexclave_` Bearer prefix; cookies dual-write; env vars dual-read across
every category. **No user-visible rebranding lands here** — that's PR 2.

See [`RENAME-TO-HEXCLAVE.md`](./RENAME-TO-HEXCLAVE.md) → *"PR 1
implementation guide"* for the full per-work-area spec, file pointers,
and chosen approach.

## What's implemented (all 14 PR-1 work-areas)

- **SDK export aliases** — `Hexclave*` aliases for the user-facing
`Stack*` exports added in `packages/template`; codegen propagates them
to `@stackframe/{js,stack,react,tanstack-start}`. React-only aliases
correctly excluded from `@stackframe/js`. (`e60550a2`)
- **JWT issuer dual-accept** — `decodeAccessToken` accepts both
`api.stack-auth.com` and `api.hexclave.com` issuers. Signing unchanged.
(`fc781def`)
- **Request-header dual-accept** — backend + dashboard proxies normalize
`x-hexclave-*` → `x-stack-*` at the existing empty proxy hook (so
`smart-request.tsx` and every route schema keep working unchanged); CORS
allowlists extended via a derive-once helper. (`2a056eac`)
- **MCP `ask_hexclave`** — registered alongside `ask_stack_auth` via a
shared helper; `ask_stack_auth` behavior byte-identical. (`30ffd604`)
- **Dev-tool** — DOM ids + header emit switched.
`window.HexclaveDevTool` exposed alongside `window.StackDevTool`.
(`32131ea7`)
- **The big consolidated commit** (`7fed864a`):
- **Env vars** — central `getEnvVariable` prefix-transform (HEXCLAVE
first, STACK fallback); dashboard + template client env files dual-read;
`turbo.json` globalEnv; `NEXT_PUBLIC_STACK_PORT_PREFIX` renamed outright
across ~82 files including docker.
- **Cookies** — dual-write/dual-read auth (`stack-access`/`-refresh-*`
and custom-domain variants), OAuth-state
(`stack-oauth-{inner,outer}-*`), and low-risk cookies (`stack-is-https`,
`stack-last-seen-changelog-version`). Bypass sites patched (backend
OAuth callback, dashboard remote-dev auth route, impersonation snippets,
snapshot serializer).
- **Bearer prefix** — SDK token parser accepts both `stackauth_` and
`hexclave_`; emits `hexclave_`. Discovery correction: this is purely
SDK-internal — the backend never parses it.
- **Response headers** — backend dual-emits
`x-hexclave-{request-id,actual-status,known-error}`; SDKs dual-read (new
first, stack fallback).
- **SDK request-header emit switch** —
`client/server/admin-interface.ts` + dashboard `api-headers.ts` +
`internal-project-headers.ts` + `feedback-form.tsx` switched to
`x-hexclave-*`. Plus `stack_response_mode` query param.
- **Storage keys** — dev-tool / cli-auth / oauth-button / docs keys
renamed (straight); `stack:session-replay:v1` dual-read so in-progress
recordings survive SDK upgrades; `stack_mfa_attempt_code` dual-read.
- **Query params** — cross-domain params dual-emit/dual-accept via
shared helpers; backend `oauth/authorize` accepts
`hexclave_response_mode` and `stack_response_mode`; `stack-init-id`
renamed.
- **`Symbol.for`** — app-internals symbol gets a parallel
`Symbol.for("Hexclave--app-internals")` getter on each attach site (no
read-site churn — old symbol still attached). 3 file-private symbols
renamed outright.
- **Config discovery** — prefer `hexclave.config.ts`, fall back to
`stack.config.ts` at every discovery site (CLI / dashboard / backend /
local-emulator); `init` writes the new filename; CLI credentials path
migrates.
- **Internal renames** — `StackAssertionError`,
`StackClient/Server/AdminInterface` renamed outright (no alias, per the
"internal-only → rename" rule). ~264 files touched.
- **Review-pass fixes** (`21217fbe`) — three real bugs found by parallel
review agents and fixed:
- `snapshot-serializer.ts` was interpolating the whole
`keyedCookieNamePrefixes` array (`${arr}`) — adding a second prefix
would have corrupted **every** OAuth-cookie snapshot, not just new ones.
- **Docker port-prefix producer/consumer mismatch** —
`entrypoint.sh`/`run-emulator.sh`/cloud-init `user-data` were still
producing `NEXT_PUBLIC_STACK_PORT_PREFIX` while the dashboard sentinel +
consumers had been renamed; silent self-host regression (custom port
prefix would be ignored).
- **Missing `hexclave-oauth-inner-*` dual-write** in the OAuth authorize
route — callback's fallback masked it but the dual-write was specified
by the plan.
- Plus: `mcp.test.ts` tool-list assertions updated to include
`ask_hexclave`; two dashboard header-emit sites switched to
`x-hexclave-*` for consistency.
- **E2E snapshot serializer follow-up** (`4b16cc5d`) —
`x-hexclave-request-id` added to the hidden-headers list (mirroring
`x-stack-request-id` treatment), and 2 sample inline snapshots
regenerated in `projects.test.ts` to include the new dual-emitted
headers.

## Verification

- **`pnpm typecheck`** — clean (the fresh-worktree `@/.source` / Prisma
codegen gap in `stack-docs` is pre-existing and unrelated).
- **`pnpm lint`** — 29/29 packages green.
- **`pnpm exec turbo run build --filter=./packages/*`** — 13/13 packages
build (including `@stackframe/stack-cli` once the dashboard standalone
is present).
- **Live E2E** against a running backend on `cl/hexclave-pr1`:
- `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/internal/mcp.test.ts` — **6/6
pass** (verifies the new `ask_hexclave` tool — the hand-written inline
snapshot matched actual MCP server output).
- `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/internal/projects.test.ts` —
**11/11 pass** (verifies wire dual-accept + dual-emit end-to-end; the
snapshot serializer fix was found and applied during this check).

A four-agent parallel **review pass** also audited the full diff for
logic/runtime bugs across the work-areas (wire headers + JWT, cookies +
bearer + symbols, env vars, query params + config + MCP + aliases). All
in-slice review verdicts were ✓ except the three bugs listed above,
which are now fixed.

## Known follow-ups (out of scope for this PR)

- **E2E snapshots across the rest of the suite** — backend now
dual-emits `x-hexclave-{known-error,actual-status}` alongside
`x-stack-*`, which legitimately appears in inline snapshots throughout
`apps/e2e`. Two were regenerated here as a sample; the rest should regen
with `vitest -u` in CI.
- **Docker shell env vars beyond `PORT_PREFIX`** — `entrypoint.sh` still
reads `STACK_*` env vars directly (the JS-side `getEnvVariable`
transform doesn't help the shell). JS consumers dual-read so it works in
practice; full shell-level dual-read is a deeper self-host follow-up.
- **`@stackframe/stack-cli` build ordering** — pre-existing; needs
`build:rde-standalone` first. Not affected by this PR.

## Test plan

- [ ] CI runs full e2e suite (with `vitest -u` to absorb dual-emit
snapshot deltas, then committed back)
- [ ] Spot-check: an old SDK build (emitting only `x-stack-*`) still
authenticates against the new backend
- [ ] Spot-check: a new SDK (emitting `x-hexclave-*` / `Bearer
hexclave_*`) still authenticates against an old backend during deploy
ordering
- [ ] Manual: `npx @stackframe/stack-cli@latest init` (new onboarding
entrypoint) generates `hexclave.config.ts`
- [ ] Manual: existing `stack.config.ts`-only project still resolves (no
migration required)

---------

Co-authored-by: bilal <bilal@stack-auth.com>
2026-05-23 17:24:55 -07:00
Konstantin Wohlwend
1044144091 More test fixes 2026-05-23 13:55:15 -07:00
Mantra
10df9b2b7b
Fix dashboard sandbox compile errors and switch smart model to Grok (#1476)
## Summary
- Forward Babel/JSX compile errors, runtime throws, and unhandled
rejections from the AI dashboard sandbox iframe to the parent composer
via `postMessage`, so users see actionable errors instead of a blank
preview
- Compile AI-generated dashboard source explicitly with
`Babel.transform` + try/catch (stored in `text/plain` to avoid Babel's
auto-handler swallowing parse errors) and add `crossorigin="anonymous"`
on the Babel script for readable cross-origin error messages
- Switch authenticated smart-tier model from
`moonshotai/kimi-k2.6:nitro` to `x-ai/grok-build-0.1`

## Test plan
- [ ] Generate a dashboard with valid AI code and confirm the preview
still renders
- [ ] Generate a dashboard with invalid JSX and confirm the composer
shows the compile error (not a blank iframe)
- [ ] Trigger a runtime error in generated dashboard code and confirm it
reaches the parent error boundary
- [ ] Verify authenticated smart-tier requests route to
`x-ai/grok-build-0.1`


Made with [Cursor](https://cursor.com)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Embedded dashboards now show a clear “Dashboard failed to compile”
message on compilation errors instead of a blank iframe.
* Dashboard runtime errors and unhandled promise rejections are captured
earlier and forwarded to the parent for improved visibility.

* **Updates**
* The authenticated AI model used for the "smart" quality has been
changed, affecting model selection for authenticated requests.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1476?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-23 12:34:37 -07:00
Konstantin Wohlwend
725f2da886 Fix tests 2026-05-23 12:29:20 -07:00
Konstantin Wohlwend
281701e99d Fix CI 2026-05-23 11:17:51 -07:00
github-actions[bot]
957a33a651 chore: update package versions 2026-05-23 18:13:12 +00:00
Konstantin Wohlwend
866e618a20 Fix CI/CD
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests (Local Emulator) / E2E Tests (Local Emulator, Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Runs E2E Fallback Tests / E2E Fallback Tests (Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Lint & build / lint_and_build (24) (push) Has been cancelled
Publish npm packages / publish (push) Has been cancelled
Publish Swift SDK to prerelease repo / publish (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
2026-05-23 10:48:41 -07:00
Konstantin Wohlwend
d4663fbe7d Capture more errors on failures 2026-05-23 10:34:55 -07:00
github-actions[bot]
6a0ded1340 chore: update package versions 2026-05-23 16:45:36 +00:00
Konstantin Wohlwend
760b866fea Fix CI/CD 2026-05-23 09:36:23 -07:00
Mantra
9b1851dd54
Managed email domain deletion and Cloudflare DNS import UX (#1442)
## Summary
- Add an admin-only delete endpoint and SDK method to remove managed
email domains, with Resend/DNSimple cleanup and a guard against deleting
domains currently in use for sending.
- Add dashboard UI to remove unused managed domains (with confirmation)
and improve the DNS setup step with Cloudflare detection, zone file
download, and import instructions.
- Add E2E coverage for delete auth, success, in-use rejection,
post-switch deletion, and 404 cases.

## Test plan
- [ ] Run `pnpm test run managed-email-onboarding`
- [ ] In dashboard email settings, add a managed domain and verify
Cloudflare hint appears when NS records point to Cloudflare
- [ ] Remove an unused managed domain and confirm it disappears from the
list
- [ ] Verify active (in-use) managed domains cannot be deleted until
email provider is switched away


Made with [Cursor](https://cursor.com)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Delete managed email domains from the dashboard with a confirmation
flow and success notification
* Cloudflare-aware domain setup: detection banner, quick links to
Cloudflare DNS, downloadable zone file, and import instructions
  * Admin API and admin-app method to perform managed-domain deletion

* **Bug Fixes**
* Deletion blocked with a clear error when a domain is actively used for
sending

* **Tests**
* Added end-to-end coverage for managed-domain delete scenarios
(success, in-use conflict, auth rejection, and 404)

* **Style**
* Data grid layout adjusted to prevent unintended full-height stretching
across various tables

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1442?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-23 09:22:29 -07:00
Konstantin Wohlwend
f6ef49a3dc Remove source-of-truth logic 2026-05-23 01:06:42 -07:00
Konstantin Wohlwend
884f8dfcea Better logs for unknown email errors 2026-05-22 21:08:31 -07:00
github-actions[bot]
01948e2313 chore: update package versions 2026-05-23 03:30:16 +00:00
github-actions[bot]
62aa8616d5 chore: update package versions 2026-05-23 03:25:52 +00:00
Konsti Wohlwend
6a8b45e6c5
fix: CI failures — Node 22, test timeouts, QEMU transaction timeout (#1479) 2026-05-22 20:24:13 -07:00
Konsti Wohlwend
b48902fcf2
docs: restart changelog with weekly updates (#1478) 2026-05-22 20:18:05 -07:00
github-actions[bot]
70999df64e chore: update package versions 2026-05-23 01:02:03 +00:00
Konsti Wohlwend
7e70b11d80
Fix onboarding shared OAuth provider persistence (#1477) 2026-05-22 17:54:11 -07:00
Konsti Wohlwend
1f0c27b644
fix: fix failing CI/CD on dev branch (#1473) 2026-05-22 16:25:57 -07:00
github-actions[bot]
9355c8665c chore: update package versions 2026-05-22 23:02:49 +00:00
github-actions[bot]
cd29811456 chore: update package versions 2026-05-22 22:58:20 +00:00
BilalG1
fb9f77843e
Populate ClickHouse analytics tables when seeding preview projects (#1471)
## Summary

In preview-mode deployments (`NEXT_PUBLIC_STACK_IS_PREVIEW=true`) the
project overview dashboard reported **0 total users, 0 monthly active
users, and no live users** on the globe. The internal metrics endpoint
reads user/team totals from the ClickHouse `analytics_internal.*` tables
and "live users" from recent `$token-refresh` events — but those tables
are normally filled by the external-db-sync pipeline, which does not run
in preview deployments, so they were empty.

This makes the preview/demo dummy-data seeder populate ClickHouse
directly:

- **`seedDummyAnalyticsMirrorTables`** — mirrors the seeded users /
teams / contact channels into `analytics_internal.users` / `teams` /
`contact_channels` so the metrics endpoint reports real totals.
- **`seedDummyLiveTokenRefreshEvents`** — emits recent `$token-refresh`
events across distinct countries so the overview globe shows live users.
- **Timestamp clamping** — `bulkRandomTimestampOnDay` and the
page-view/click timestamps are clamped so seeded events are never dated
in the future (future-dated events permanently matched the unbounded
"live users" query).
- **`buildTokenRefreshClickhouseRow`** — shared helper for the
`$token-refresh` ClickHouse row shape.
- **`create-project`** — pre-warms the ClickHouse connection so the
seeding inserts don't pay the cold-start cost.
- **`projects-metrics`** — types the ClickHouse `.json()` results (fixes
a `tsc` error).

Also bundles a seeding performance optimization that skips redundant
idempotency lookups when seeding a brand-new project.

Notes:
- Seeded mirror rows use `sync_sequence_id = 0` so that if the
external-db-sync pipeline ever does run for the project, any real update
supersedes the seeded placeholder under `ReplacingMergeTree` + `FINAL`.
- "Live users" naturally decays out of the ~2-minute window a couple of
minutes after project creation; preview creates a fresh project per
visit, so the initial overview always shows them.

## Test plan

- [x] `pnpm --filter @stackframe/backend typecheck` passes
- [x] `pnpm --filter @stackframe/backend lint` passes
- [x] Created fresh preview projects; overview shows non-zero Total
Users / Monthly Active Users
- [x] `analytics_internal.users` / `teams` / `contact_channels`
populated for the seeded project
- [x] Globe shows 8 live users across 8 distinct countries (verified via
the metrics 2-minute query)
- [x] No future-dated `$token-refresh` events in
`analytics_internal.events`

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Refactor**
* Faster preview project creation by pre-warming the analytics database
and reusing the warmed connection.
* Reduced initialization delays and redundant checks when seeding
brand-new projects; creation paths now skip needless probes.
* More efficient, parallelized seeding of teams/users/events with
deterministic handling of token-refresh and session-replay data.
* Safer timestamp generation to avoid future-dated events and deferred
background processing for long-running tasks like payments.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1471?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Konsti Wohlwend <n2d4xc@gmail.com>
2026-05-22 15:54:24 -07:00
Konstantin Wohlwend
1effedbc42 Fix various cross-domain auth bugs 2026-05-22 13:40:39 -07:00
Konsti Wohlwend
197ad09eea
Route read-only raw Prisma queries to read replica (#1467) 2026-05-22 11:16:11 -07:00
github-actions[bot]
0c6e135c30 chore: update package versions
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
DB migration compat / Check if migrations changed (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests (Local Emulator) / E2E Tests (Local Emulator, Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Runs E2E Fallback Tests / E2E Fallback Tests (Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Lint & build / lint_and_build (24) (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
DB migration compat / No migration changes (skipped) (push) Has been cancelled
2026-05-22 01:35:39 +00:00
Konstantin Wohlwend
99f07e9516 Trust hosted domains
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests (Local Emulator) / E2E Tests (Local Emulator, Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Runs E2E Fallback Tests / E2E Fallback Tests (Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Lint & build / lint_and_build (24) (push) Has been cancelled
Publish npm packages / publish (push) Has been cancelled
Publish Swift SDK to prerelease repo / publish (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
2026-05-21 18:23:23 -07:00
Konsti Wohlwend
9fc7332ead
Restore sign-up setting descriptions (#1469) 2026-05-21 17:58:24 -07:00
github-actions[bot]
d12968eb3d chore: update package versions 2026-05-22 00:46:34 +00:00
Konsti Wohlwend
f993e92b68
fix: flaky E2E tests and backend build TypeScript error (#1462) 2026-05-21 17:28:35 -07:00
Konsti Wohlwend
c6d59d0288
Cross domain handoffs (#1458) 2026-05-21 17:15:12 -07:00
Konsti Wohlwend
600a0d6fcf
fix: handle race condition in recordExternalDbSyncDeletion (#1466) 2026-05-21 17:02:20 -07:00
github-actions[bot]
03e7b61308 chore: update package versions 2026-05-21 23:29:36 +00:00
Konstantin Wohlwend
bf8d0ece28 chore: update package versions 2026-05-21 16:23:12 -07:00
Konstantin Wohlwend
35f3e699dd Fix types 2026-05-21 15:56:17 -07:00
Konstantin Wohlwend
4ff24dea9b chore: update package versions 2026-05-21 14:54:23 -07:00
Konstantin Wohlwend
a6762b00fa Update projects-metrics to use more Clickhouse 2026-05-21 14:52:25 -07:00
Armaan Jain
e42ec65c88
Payments app design fixes (#1375)
<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->

## Summary

This PR brings the Payments dashboard surfaces in line with the shared
design system: product creation, product-line / included-item dialogs,
auth-method toggles, payments empty states, and related layout polish.
Dialogs migrate from raw shadcn `Dialog` to `DesignDialog` with
consistent headers, footers, inputs, and selector dropdowns.

**Base:** `dev` → **Head:** `Payments-app-design-fixes`  
**Scope:** 31 files, ~+1.4k / −1.3k lines  
**Captured on:** local dev server (`internal` project), signed in as
`admin@example.com`

## Screenshots

Captured from `http://localhost:8101` (viewport: **1920×1200** standard,
**2560×1440** widescreen). Assets hosted in [this
gist](https://gist.github.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf).

> Red outlines on the **after** shots mark the new or changed UI
introduced by this PR.

### Create Product — payments form redesign

| | Before | After |
| --- | --- | --- |
| Light |
![payments-products-new-before-light](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/payments-products-new-before-light.png)
|
![payments-products-new-after-light](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/payments-products-new-after-light.png)
|
| Dark |
![payments-products-new-before-dark](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/payments-products-new-before-dark.png)
|
![payments-products-new-after-dark](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/payments-products-new-after-dark.png)
|

Widescreen:

| | Before | After |
| --- | --- | --- |
| Light |
![payments-products-new-before-light-wide](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/payments-products-new-before-light-wide.png)
|
![payments-products-new-after-light-wide](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/payments-products-new-after-light-wide.png)
|
| Dark |
![payments-products-new-before-dark-wide](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/payments-products-new-before-dark-wide.png)
|
![payments-products-new-after-dark-wide](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/payments-products-new-after-dark-wide.png)
|

### Product Lines onboarding — vertical centering fix

| | Before | After |
| --- | --- | --- |
| Light |
![payments-product-lines-before-light](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/payments-product-lines-before-light.png)
|
![payments-product-lines-after-light](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/payments-product-lines-after-light.png)
|
| Dark |
![payments-product-lines-before-dark](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/payments-product-lines-before-dark.png)
|
![payments-product-lines-after-dark](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/payments-product-lines-after-dark.png)
|

### Create Product Line dialog — `DesignDialog` migration

| | Before | After |
| --- | --- | --- |
| Light | *(legacy shadcn dialog on `dev` — open via Product Line →
Create new)* |
![dialog-create-product-line-after-light](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/dialog-create-product-line-after-light.png)
|
| Dark | |
![dialog-create-product-line-after-dark](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/dialog-create-product-line-after-dark.png)
|

### Auth Methods — toggle row accessibility

| | Before | After |
| --- | --- | --- |
| Light |
![auth-methods-before-light](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/auth-methods-before-light.png)
|
![auth-methods-after-light](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/auth-methods-after-light.png)
|
| Dark |
![auth-methods-before-dark](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/auth-methods-before-dark.png)
|
![auth-methods-after-dark](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/auth-methods-after-dark.png)
|

### Other migrated surfaces (after only)

| Page | Light | Dark |
| --- | --- | --- |
| Payments settings |
![payments-settings-after-light](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/payments-settings-after-light.png)
|
![payments-settings-after-dark](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/payments-settings-after-dark.png)
|
| Sign-up rules |
![sign-up-rules-after-light](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/sign-up-rules-after-light.png)
|
![sign-up-rules-after-dark](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/sign-up-rules-after-dark.png)
|
| Projects list (Create Project button) |
![projects-after-light](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/projects-after-light.png)
|
![projects-after-dark](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/projects-after-dark.png)
|
| Playground / DesignDialog |
![playground-dialog-after-light](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/playground-dialog-after-light.png)
|
![playground-dialog-after-dark](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/playground-dialog-after-dark.png)
|
| Included Item dialog |
![dialog-included-item-after-light](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/dialog-included-item-after-light.png)
|
![dialog-included-item-after-dark](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/dialog-included-item-after-dark.png)
|

### Scroll behaviour — Sign-up Rules

| | Light | Dark |
| --- | --- | --- |
| Scroll |
![sign-up-rules-scroll-light](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/sign-up-rules-scroll-light.gif)
|
![sign-up-rules-scroll-dark](https://gist.githubusercontent.com/mantrakp04/ca3483d2b66b8e28f0872488df573ccf/raw/sign-up-rules-scroll-dark.gif)
|

## What's new

- **`DesignDialog`** extended with `customHeader`, `noBodyPadding`, and
section `className` hooks; Playground updated to showcase them.
- **Payments dialogs** (`CreateProductLineDialog`, `IncludedItemDialog`,
price edit, item dialog) migrated to design-system components.
- **Create Product** page uses `DesignButton`, `DesignInput`,
`DesignSelectorDropdown`, and refreshed header actions.
- **Auth Methods** toggle rows use semantic `<Label htmlFor>` instead of
click-capture divs.
- **Payments layout** empty-state card centers correctly; product-lines
onboarding slideshow vertically centers.
- **Backend** seed invariant for Growth product price; removed unused
import in product switch route.

## Notes for reviewers

- Dialog migrations preserve validation + async error handling
(`runAsynchronouslyWithAlert` where applicable).
- Included-item dialog uses a sentinel value for “Create new item” to
avoid colliding with real item IDs.
- `packages/stack` / `packages/js` are untouched; template +
dashboard-ui-components carry SDK-facing dialog changes.

## Test plan

- [x] Visual capture on `internal` project (`admin@example.com`) —
light/dark, standard + widescreen
- [ ] Create product flow: customer type → product line dropdown →
create line dialog
- [ ] Add included item dialog from create/edit product
- [ ] Auth Methods toggles (label click + switch)
- [ ] Payments product-lines onboarding slideshow at varied viewport
heights
- [ ] `pnpm typecheck` / `pnpm lint` / targeted E2E if API surface
changed

---------

Co-authored-by: nams1570 <amanganapathy@gmail.com>
Co-authored-by: mantrakp04 <mantrakp@gmail.com>
Co-authored-by: Mantra <87142457+mantrakp04@users.noreply.github.com>
2026-05-21 14:48:56 -07:00
Aman Ganapathy
be2ad595ad
fix: checkout flow for 0 dollar subscription (#1465)
### Context
There was a small bug via dashboard checkout flow where it would fail on
trying to create a checkout flow for a free product subscription because
no client secret is generated for a 0 dollar subscription.

### Summary of Changes
The flow should be fine now. There's special carve out logic for it.
That being said, users attempting to mimic a free plan grant are
encouraged to follow the `ensureFreePlan` pattern.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Free subscription selections now bypass Stripe payment processing,
streamlining checkout for zero-cost offerings.
* Purchase return flow now properly recognizes and activates free
subscriptions without requiring payment confirmation.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1465?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-21 14:33:55 -07:00
Mantra
e3bd5a6638
Move internal metrics queries to ClickHouse replica (#1463)
## Summary
- Move `loadTotalUsers`, `loadAuthOverview`, and
`loadRecentlyActiveUsers` off direct Postgres queries to read from the
ClickHouse `analytics_internal` tables.
- Route the remaining `projectUser.findMany` reads in
`loadActiveUsersByCountry` and `loadRecentlyActiveUsers` through
`$replica()`.
- `loadRecentlyActiveUsers` falls back to an empty list on ClickHouse
query failure (captured via `captureError`) rather than failing the
whole metrics endpoint.

## Test plan
- [ ] Hit the internal metrics endpoint on a tenancy with users/teams
and confirm totals, daily series, and recently-active users match the
previous Postgres-backed numbers.
- [ ] Verify the 30-day daily-users series fills zero-activity days
correctly.
- [ ] Simulate a ClickHouse failure for the recently-active query and
confirm the endpoint still responds with the rest of the payload.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes & Improvements**
  * Improved metrics aggregation for more consistent reporting.
* More accurate active-user and total-user time series with missing days
zero-filled.
* Authentication overview updated with clearer counts for verified,
unverified, and anonymous users.
* Performance improvements: recently-active and overview calculations
run more efficiently and in parallel.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1463?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-21 14:31:58 -07:00
Konstantin Wohlwend
5edccc322c Increase replication timeout to 2s 2026-05-21 14:18:39 -07:00
Konsti Wohlwend
0df594abe7
Handle payout.created and payout.reconciliation_completed Stripe webhook events (#1461) 2026-05-21 14:11:36 -07:00
Konsti Wohlwend
3b2e991c78
Fix browser compatibility: guard requestIdleCallback and startViewTransition (#1464) 2026-05-21 14:10:51 -07:00
BilalG1
b8fc04bdbd
feat: link Stack Auth projects to GitHub and push config from the dashboard (#1450)
End-to-end flow for managing Stack Auth config via GitHub: link a repo
during onboarding, edit settings in the dashboard, and have the change
committed to your repo + synced back via a GitHub Actions workflow.


![demo](https://gist.githubusercontent.com/BilalG1/29d1188fc581e87d1311baec6e2ae770/raw/demo-2x.gif)

## What this adds

- **CLI** — `stack config push --source github --source-repo
--source-path --source-workflow-path`. Records the source on the config
row so the dashboard knows where the file lives. Reads `GITHUB_SHA` /
`GITHUB_REF_NAME` for commit + branch.
- **Onboarding "Link existing project"** — searchable repo/branch
comboboxes, auto-detects candidate `stack.config.{ts,js}` paths, writes
`STACK_AUTH_PROJECT_ID` + `STACK_AUTH_SECRET_SERVER_KEY` secrets, and
commits a generated workflow YAML that re-runs `stack config push` on
every change to the config file.
- **Dashboard "Push to GitHub" dialog** — replaces the prior TODO
buttons. Pre-flights `repo`+`workflow` scopes on the user's GitHub
connection; if missing, the button flips to "Reconnect with GitHub". On
push, commits the dashboard's edit straight to the linked repo/branch
via the Contents API (with `cache: "no-store"` to dodge GitHub's 60s GET
cache so consecutive pushes don't 409). Suspense boundary scoped to the
dialog body so opening it doesn't blank the dashboard.
- **Project settings** — surface the linked workflow file as a clickable
GitHub link when the source carries `workflow_path`.

## Test plan

- `pnpm lint` (29/29) ✓
- `pnpm typecheck` (29/29) ✓
- `pnpm --filter @stackframe/stack-cli test` (111/111) ✓
- Dashboard vitest on the three relevant files
(`link-existing-onboarding-workflow`, `github-api`,
`github-config-push`) — 37/37 ✓
- Live end-to-end: `BilalG1/lex-lookup` linked to a local dev project;
passkey toggled, push committed `0bb958bd`
([commit](0bb958bda3)).

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
  * Persist workflow file paths for GitHub-backed config sync
* Dashboard “Push” flow to commit config updates with trimmed/default
commit messages
* CLI options to declare GitHub source (repo/path/workflow) and persist
selectable package runner for manual pushes
  * Show workflow-file link in project configuration when present

* **Improvements**
* Robust config-path normalization, existence checks, debounced
repo/branch search, and better GitHub rate-limit handling
* New GitHub API utilities for safe file read/commit and import-package
detection

* **Tests**
* Expanded tests covering GitHub API, config rendering/merge, and push
behaviors

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1450?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-21 13:47:46 -07:00
BilalG1
91b8e4caa4
Fix /internal/metrics ClickHouse OOM (#1457)
Fixes Sentry
[STACK-BACKEND-16H](https://stackframe-pw.sentry.io/issues/STACK-BACKEND-16H)
— the `/api/v1/internal/metrics` endpoint was triggering the cluster's
10.8 GiB OvercommitTracker kill on tenants with months of
`$token-refresh` history.

## Root cause

Three queries in `loadAnalyticsOverview` plus `loadUsersByCountry` did
`GROUP BY user_id` over the events table with **no lower `event_at`
bound**, so their hash table working set scaled with
cumulative-distinct-users-ever-seen instead of the 30-day metrics
window.

## Changes

- Add 30-day `event_at` lower bound to `loadUsersByCountry` and to the
`analyticsUserJoin` inner subquery (used by `dailyEvents`,
`totalVisitors`, `topReferrers`).
- New `getClickhouseAdminClientForMetrics()` factory in
`lib/clickhouse.tsx` with connection-level safety net: per-query +
per-user memory caps, external GROUP BY spill, and `join_algorithm:
'grace_hash,parallel_hash,hash'` (grace_hash measured to give 48% memory
reduction at zero latency cost — see benchmark notes in the file).
- Inline comment + concrete next steps for the long-term fix (option C:
stamp `is_anonymous` at ingest on page-view/click events, then drop the
join entirely).
- Extend `scripts/benchmark-internal-metrics.ts` with the
historical-seed knob and three new modes (`BENCH_BACKFILL_COMPARE`,
`BENCH_JOIN_ALGO_COMPARE`, plus the existing `BENCH_ROUTE_QUERIES`
updated) used to validate the choices above.

## Benchmark — pre-PR vs post-PR

Synthetic seed: 300k users × 9 events spread over 365 days (~2.7M
events).

| | pre-PR | post-PR | delta |
|---|---:|---:|---:|
| Sum peak memory | 2.18 GiB | 515 MiB | **4.3× less** |
| Max query duration | 1293 ms | 101 ms | **12.8× faster** |
| Sum CPU duration | 5119 ms | 394 ms | 13× less work |
| Sum bytes read | 3.87 GiB | 929 MiB | 4.3× less I/O |

Per-query at 300k users:
- `analyticsOverview:dailyEvents` 561 → 44 MiB (12.8× less)
- `analyticsOverview:totalVisitors` 560 → 50 MiB (11.2× less)
- `analyticsOverview:topReferrers` 546 → 50 MiB (10.9× less)
- `loadUsersByCountry` 388 → 44 MiB (8.9× less)

## Caveats

- `loadDailyActiveSplitFromClickhouse` still scans all-history on its
`min(event_at)` subquery. It can't be naively bounded — `first_date` is
used to classify entities as new vs reactivated, and a 30d bound would
silently mislabel old-but-active entities as "new." The new SETTINGS
cap+spill it; the proper fix is option C (documented inline).
- A user with a page-view but no `$token-refresh` in the last 30 days
now falls through to `coalesce(NULL, 0)` and is classified
non-anonymous. Token-refresh fires every few minutes per active session,
so this is rare but not impossible (embedded SDKs that poll less
frequently, sessions straddling the 30d boundary).
- `max_memory_usage_for_user: 9 GB` trades "cluster-wide
OvercommitTracker kill of a random query" for "clean per-user memory
error attributed to the specific query." After our 30d bounds, no query
is anywhere near 9 GB.

## Test plan

- [x] `pnpm typecheck` passes
- [x] `pnpm lint` passes
- [x] `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/internal-metrics.test.ts` — 9/10
pass; the 1 failure (`risk_scores` snapshot drift) reproduces on clean
`dev` and is unrelated
- [x] `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/analytics-{events,events-batch,query}.test.ts
apps/e2e/tests/backend/endpoints/api/v1/token-refresh-events.test.ts
apps/e2e/tests/backend/performance/metrics.test.ts` — all passing tests
pass; 10 pre-existing `PRODUCT_DOES_NOT_EXIST` setup failures reproduce
on clean `dev`
- [x] Benchmark `BENCH_ROUTE_QUERIES=1` at 300k users shows the deltas
above

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* Improved internal metrics collection to use metrics-specific DB
settings for more reliable, safer analytical reads.
* Added guardrails to metrics queries to enforce time-window bounds and
avoid unbounded scans.
* Expanded benchmark modes (backfill and join-algo comparisons),
extended perf seeding, and improved logging/retry behavior to capture
more complete stats and reduce missing log rows.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1457?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-21 13:47:32 -07:00
Konsti Wohlwend
002692e519
Compress oversized images client-side in AI chat (#1456) 2026-05-21 12:57:20 -07:00