Commit Graph

386 Commits

Author SHA1 Message Date
Konstantin Wohlwend
fa4f25bcdd Rename port prefix envvar 2026-05-27 18:09:52 -07:00
Konstantin Wohlwend
eeac70e48b Fix fallback URL JWT issuer 2026-05-27 17:33:31 -07:00
BilalG1
6b07c43fe9
feat(backend): derive JWT issuer and OAuth redirect_uri from request host (#1498)
## Summary

When the backend serves both `api.stack-auth.com` and `api.hexclave.com`
from the same deployment, signed JWT `iss` claims and OAuth
`redirect_uri` values need to match the host the customer's SDK actually
talks to — otherwise customers with hardcoded issuer checks or
registered OAuth callback URLs break when their SDK upgrades. This PR
makes both follow the request host.

Closes the "Interpretation B" plan from our earlier discussion.

## Changes

### Backend — request-host-derived `iss` and `redirect_uri`

- **New helper**
[`apps/backend/src/lib/request-api-url.ts`](apps/backend/src/lib/request-api-url.ts)
exports `getApiUrlForRequest(req)` and `getApiUrlForHost(host)`. A
consolidated `CLOUD_HOST_PAIRS` constant is the single source of truth
for the stack-auth ↔ hexclave host pairs (prod, dev, staging). Both the
allowlist here and the validator alias map in `tokens.tsx` derive from
it, so they can never drift again.
- **JWT issuer per request**
([`apps/backend/src/lib/tokens.tsx`](apps/backend/src/lib/tokens.tsx)) —
`getIssuer` now takes `apiUrl`.
`generateAccessTokenFromRefreshTokenIfValid` and `createAuthTokens`
accept an `apiUrl` parameter that flows into the `iss` claim.
`getAllowedIssuers` stays env-driven with the bidirectional alias map,
so tokens cross-validate across hosts.
- **OAuth `redirect_uri` per request** — all 12 providers +
`MockProvider` now take `apiUrl` and use it to build `redirect_uri =
apiUrl + "/api/v1/auth/oauth/callback/<provider>"`. `getProvider()`
accepts an `{ apiUrl }` option and forwards it.
- **OAuth2Server factory** — the module-level `oauthServer` singleton
became a per-request `createOAuthServer({ apiUrl })` factory so
`OAuthModel.generateAccessToken` mints tokens with the right `iss`. Used
in the callback route, the token route, and the cross-domain-authorize
helper.
- **Token-minting call sites updated** — all 10 `createAuthTokens`
invocations (password sign-up/sign-in, sessions create,
sessions/current/refresh, anonymous sign-up, apple-native callback,
MFA/OTP/passkey sign-in, OAuth model token exchange), plus the
CLI-complete `generateAccessTokenFromRefreshTokenIfValid` direct call,
now pass `getApiUrlForRequest(fullReq)`.
- **`createVerificationCodeHandler` refactored** to pass `apiUrl` as a
6th positional arg to the user's handler, so MFA/OTP/passkey sign-in
flows get the same per-request `iss` as the rest. The other 8 callers
(password-reset, contact-channels-verify, etc.) don't need changes —
they accept fewer args and TS function-arity compatibility makes that
fine.

### SDK — freeze `@stackframe/*` defaults at stack-auth.com

-
[`packages/template/src/lib/stack-app/apps/implementations/common.ts`](packages/template/src/lib/stack-app/apps/implementations/common.ts)
reverts `defaultBaseUrl` and `defaultAnalyticsBaseUrl` to
`https://api.stack-auth.com` / `https://r.stack-auth.com`. A customer
who upgrades their `@stackframe/*` package to the latest version without
explicitly migrating to `@hexclave/*` keeps hitting `api.stack-auth.com`
and never sees their JWT `iss` or OAuth `redirect_uri` change.
- The `@hexclave/*` mirror packages are unaffected because they were
published from source when `defaultBaseUrl =
"https://api.hexclave.com"`; v1.0.0 already targets the hexclave host on
npm. Extending `scripts/rewrite-packages-to-hexclave.ts` to substitute
these literals during future republishes is a separate follow-up.

### Docs — migration guide rewrite

- [`docs-mintlify/migration.mdx`](docs-mintlify/migration.mdx) rewritten
concisely. Spells out the two host-visible changes that require
pre-deploy action when migrating to `@hexclave/*`: updating manual JWT
verifier code (with the array-of-issuers pattern) and updating OAuth
callback URLs at each provider (with the GitHub-OAuth-Apps single-URL
caveat explicitly called out).

## What was deliberately left out

- **Rewriter pipeline extension** for `@hexclave/*` republishes —
separate follow-up.
- **Cross-SDK defaults** (Swift, stack-cli, init-stack still default to
`api.hexclave.com`) — out of scope per discussion.
- **Dashboard launch-checklist host-awareness** — out of scope per
discussion (callback URLs stay hexclave-branded in the dashboard UI).

## Verification

- `pnpm typecheck` — 29/29 packages pass.
- `pnpm lint` — 29/29 packages pass.
- Five parallel review agents (JWT, OAuth, helper, migration guide, SDK
defaults) + an external review of the commit caught four real issues —
all resolved in the same commit before push:
- Missing `api.staging.*` entries in `issuerHostAliases` (would have
broken cross-host token validation on staging).
  - Stale comment in `apps/backend/src/stack.tsx`.
  - Misleading "backward-compat" comment in `getHardcodedFallbackUrls`.
- MFA/OTP/passkey using env-var fallback for `iss` instead of the
request host.

<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Makes JWT issuers and OAuth redirect_uri values follow the incoming
request’s host so tokens and redirects always match `api.stack-auth.com`
or `api.hexclave.com`. Also freezes `@stackframe/*` SDK defaults to
Stack Auth and tightens the migration guide to focus on OAuth callbacks.

- **New Features**
- Added `request-api-url` helper with an allowlist of cloud hosts;
unknown hosts fall back to the deployment’s API URL.
- JWT `iss` now uses the per-request API URL; validation accepts both
brands via aliases derived from one `CLOUD_HOST_PAIRS` source.
- All OAuth providers build `redirect_uri` from the request host;
`getProvider()` now takes `{ apiUrl }`.
- Replaced the OAuth2Server singleton with per-request
`createOAuthServer({ apiUrl })` so token exchange mints with the right
issuer.
- Updated token-minting and OAuth routes to pass the API URL;
verification-code flows receive it; connected-accounts refresh paths
safely pin the deployment default.
- SDK: `@stackframe/*` defaults point to `https://api.stack-auth.com`;
`@hexclave/*` mirrors remain hexclave-branded.
- Shared: updated fallback API host lists to include both stack-auth and
hexclave hosts.

- **Migration**
- Update each provider’s OAuth callback URL to
`https://api.hexclave.com/api/v1/auth/oauth/callback/<provider>` when
migrating to `@hexclave/*`.
- If you verify JWTs manually, update the expected issuer to the
hexclave host (including anonymous/restricted variants).

<sup>Written for commit e42dfaf05b.
Summary will update on new commits. <a
href="https://cubic.dev/pr/hexclave/stack-auth/pull/1498?utm_source=github">Review
in cubic</a></sup>

<!-- End of auto-generated description by cubic. -->

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Per-request API host awareness so tokens and OAuth flows reflect the
incoming request’s canonical API URL.

* **Updates**
* OAuth providers and servers now derive callback/redirect URLs from
request context rather than a static URL.
* Token issuance/validation now embeds and accepts host-specific issuer
URLs and paired host aliases for rebrand compatibility.

* **Documentation**
* Migration guide updated for branding, JWT issuer expectations, and
OAuth callback guidance.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1498?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-27 17:06:15 -07:00
BilalG1
c0fefd3b7a
feat(backend): dual-accept hexclave-mobile-oauth-url:// alongside legacy scheme (#1501)
## What

1. **Backend dual-accept**: `isAcceptedNativeAppUrl()` accepts both
`stack-auth-mobile-oauth-url://` (legacy) and
`hexclave-mobile-oauth-url://` (canonical).
2. **Swift SDK switches to the canonical scheme**: `StackAuth` Swift SDK
now emits and intercepts `hexclave-mobile-oauth-url://` for native-app
OAuth callbacks.

Before this PR, `hexclave-mobile-oauth-url` existed only inside
`RENAME-TO-HEXCLAVE.md` — not in any code.

## Why the Swift SDK change is safe

The Swift SDK uses
`ASWebAuthenticationSession(url:callbackURLScheme:completion:)`
([StackClientApp.swift:197-199](sdks/implementations/swift/Sources/StackAuth/StackClientApp.swift#L197)).
With this API, iOS intercepts the callback scheme **ephemerally** — no
`Info.plist` registration is required. The Swift SDK source has no
`Info.plist`, and the example apps' `pbxproj` registers no
`CFBundleURLSchemes`. So:

- New customer builds against the updated SDK → emit new scheme →
backend accepts → `ASWebAuthenticationSession` intercepts on new scheme
→ works.
- Already-shipped customer App Store binaries on older SDK versions →
emit old scheme → backend still accepts → works.
- **No customer ever has to update an `Info.plist`.**

The only real backward-compat constraint is that the backend can never
drop the old scheme (already-shipped customer binaries have the constant
baked into them). Hence the dual-accept.

(Note: `RENAME-TO-HEXCLAVE.md` line 88 incorrectly attributes the
constraint to `Info.plist` registration. That's not how the SDK works —
the scheme is baked into the SDK binary, not the customer's plist. The
fix described in that doc is essentially the right shape; only the
mechanism description is wrong.)

## Changes

| File | Change |
|---|---|
| `packages/stack-shared/src/utils/redirect-urls.tsx` |
`isAcceptedNativeAppUrl()` accepts either protocol. |
| `apps/backend/src/lib/redirect-urls.test.tsx` | Adds positive
assertions for the new scheme in `isAcceptedNativeAppUrl`; parity
negative assertions in `validateRedirectUrl`. |
| `sdks/implementations/swift/Sources/StackAuth/StackClientApp.swift` |
`callbackScheme` → `"hexclave-mobile-oauth-url"`; fatalError example
strings updated. |
| `sdks/implementations/swift/Tests/StackAuthTests/OAuthTests.swift` |
Test fixture URLs updated (no assertions depend on the scheme literal).
|
|
`sdks/implementations/swift/Examples/StackAuthiOS/.../StackAuthiOSApp.swift`
| Default values in the example UI. |
|
`sdks/implementations/swift/Examples/StackAuthMacOS/.../StackAuthMacOSApp.swift`
| Default values in the example UI. |
| `sdks/implementations/swift/README.md` | Documents the new canonical
scheme; compat note for the legacy one. |
| `sdks/spec/src/apps/client-app.spec.md` | New scheme is canonical;
legacy is "accepted indefinitely for already-shipped customer app
binaries built against older SDK versions." |

## Verification

- `pnpm test run apps/backend/src/lib/redirect-urls.test.tsx` — 34/34
passing (was 33; one new `it` block plus parity assertions).
- `pnpm --filter @stackframe/stack-shared --filter @stackframe/backend
run lint` — clean.
- `pnpm --filter @stackframe/stack-shared --filter @stackframe/backend
run typecheck` — clean.
- Swift assertions in `OAuthTests.swift` do not check the scheme literal
— they only check `oauth/authorize/<provider>`, state/verifier
non-emptiness, and that `redirectUrl` round-trips. The fixture-value
change is mechanical.

## Risk

Low. Backend behavior strictly widens (every URL accepted before is
still accepted). Swift SDK change is internal to OAuth callback
handling, requires no customer migration, and is paired with the backend
dual-accept landing in the same PR.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Adopted the canonical OAuth callback scheme
"hexclave-mobile-oauth-url://" for native apps while continuing to
accept the legacy "stack-auth-mobile-oauth-url://".

* **Documentation**
* Updated SDK docs, examples, and spec guidance to reference the
canonical callback scheme and clarify legacy acceptance.

* **Tests & Samples**
* Updated tests and example apps to use and validate the canonical
scheme.

* **Style**
* Rebranded the dev-tool trigger icon to the new Hexclave monochrome
logo.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1501?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-27 15:44:06 -07:00
BilalG1
57ff5d3ce9
feat(hexclave): PR 2 — visible rebrand (Hexclave brand goes public) (#1481)
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
DB migration compat / Check if migrations changed (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests (Local Emulator) / E2E Tests (Local Emulator, Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Runs E2E Fallback Tests / E2E Fallback Tests (Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Lint & build / lint_and_build (24) (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
DB migration compat / No migration changes (skipped) (push) Has been cancelled
## Summary

**Stacked on [#1475](https://github.com/hexclave/stack-auth/pull/1475)**
(`cl/hexclave-pr1`, the invisible compatibility layer). Diff vs that
base = the actual PR 2 code.

This is **PR 2 of the Stack Auth → Hexclave rebrand: the visible flip**.
Old wire identifiers (cookies, request/response headers, Bearer prefix,
JWT issuers, MCP tool name) keep working indefinitely via PR 1's
dual-accept. This PR flips every user-visible surface — package names
taught in docs, SDK class names in code examples, dashboard setup
snippets, page titles, error messages, email content, CLI binary,
default base URLs, GitHub repo slug, contributor guidance — to the
Hexclave brand.

See [`RENAME-TO-HEXCLAVE.md`](./RENAME-TO-HEXCLAVE.md) → *"PR 2: Rebrand
to Hexclave (visible)"* for the full per-work-area spec.

## What's implemented (per the plan's PR 2 scope)

- **SDK base URLs** flipped: `defaultBaseUrl` and
`defaultAnalyticsBaseUrl` in
[common.ts](packages/template/src/lib/stack-app/apps/implementations/common.ts:127)
→ `https://api.hexclave.com` / `https://r.hexclave.com`. PR 1's
[`getHardcodedFallbackUrls`](packages/stack-shared/src/utils/urls.tsx:199)
table now keys on the Hexclave domain.

- **Domain inventory sweep** (16 subdomains from the plan): every
`api/app/docs/discord/demo/mcp/skill/feedback/test/preview/r/api2/api.staging/idp-jwk-audience/built-with.stack-auth.com`
reference in production code, docs-mintlify, examples, READMEs, and
contributor guidance flipped to `*.hexclave.com`. Carve-outs: PR 1's
intentional JWT issuer dual-accept table in
[tokens.tsx](apps/backend/src/lib/tokens.tsx), the legacy `./docs/`
folder, the `unified-docs-widget` allowlist (deliberately accepts both
during DNS transition), and `url-targets.ts` hosted-component default
(baked into existing customer deploys).

- **`@deprecated` JSDoc** on every `Stack*` public export
([packages/template/src/lib/stack-app/index.ts](packages/template/src/lib/stack-app/index.ts)
+ [packages/template/src/index.ts](packages/template/src/index.ts)) —
`StackClientApp`, `StackServerApp`, `StackAdminApp` + every
constructor/options/JSON type, `StackHandler`, `StackProvider`,
`StackTheme`, `useStackApp`, `defineStackConfig`, `StackConfig`.
Hexclave\* aliases are now canonical.

- **Runtime `console.warn`**
([packages/template/src/internal/deprecation-warning.ts](packages/template/src/internal/deprecation-warning.ts))
— once-per-process when the SDK is loaded from a `@stackframe/*`
artifact. Detection uses the existing
`STACK_COMPILE_TIME_CLIENT_PACKAGE_VERSION_SENTINEL` (rewritten at build
time to e.g. `js @stackframe/stack@2.8.92` or `js
@hexclave/next@1.0.0`); `@hexclave/*` mirror artifacts short-circuit the
warning.

- **Tier 3 data migration**: new idempotent SQL migration
[`20260523000000_rename_internal_project_to_hexclave`](apps/backend/prisma/migrations/20260523000000_rename_internal_project_to_hexclave/migration.sql)
— updates the internal Project `displayName` 'Stack Dashboard' →
'Hexclave Dashboard' and `description` only if both still hold the
pre-rebrand defaults. Operator-renamed projects untouched, missing row
no-ops, re-runs are no-ops. [`seed.ts`](apps/backend/prisma/seed.ts:87)
default flipped. `getSharedEmailConfig("Stack Auth")` → `("Hexclave")`.

- **Tier 4 brand strings** (mechanical sweep, ~340 files):
- Page + OpenAPI titles (Hexclave API / Dashboard / REST API / Webhooks
API / Documentation). OpenAPI `info.description` documents
`X-Hexclave-*` headers as canonical with compat note on `X-Stack-*`.
- `HexclaveAssertionError` message text
([errors.tsx:71](packages/stack-shared/src/utils/errors.tsx:71)) — "an
error in Stack." → "an error in Hexclave."
- Known-error message templates
([known-errors.tsx](packages/stack-shared/src/known-errors.tsx)) flipped
to lead with `x-hexclave-*` + the new `docs.hexclave.com` URL; legacy
`x-stack-*` mentioned as compat aliases. **25 e2e test files updated in
lockstep**.
- Email content: failed-emails-digest body, sendTestEmail recipient (now
`sent-with-hexclave.com`), test-email-recipient default.
  - `CHANGELOG.md` title → "Hexclave Changelog".
- `AGENTS.md` env var convention: new vars prefix `HEXCLAVE_` /
`NEXT_PUBLIC_HEXCLAVE_` for Category A/B; legacy `STACK_*` explicitly
noted as accepted via PR 1's dual-read.

- **CLI / init wizard**:
- Every dashboard setup snippet, init-stack template, and docs-mintlify
page teaches `npx @hexclave/cli@latest init` (was
`@stackframe/stack-cli`).
[setup-page.tsx](apps/dashboard/src/app/(main)/(protected)/projects/[projectId]/(overview)/setup-page.tsx)
+
[link-existing-onboarding](apps/dashboard/src/app/(main)/(protected)/(outside-dashboard)/new-project/page-client-parts/link-existing-onboarding.tsx).
- [init-stack](packages/init-stack/src/index.ts:634)
`STACK_*_INSTALL_PACKAGE_NAME_OVERRIDE` defaults flipped to
`@hexclave/*`.
- Generated `stack/client.ts` / `stack/server.ts` import from
`@hexclave/next` and reference `HexclaveClientApp` /
`HexclaveServerApp`.
- Internal `StackAuthKeys` dashboard component renamed to
`HexclaveKeys`.

- **docs-mintlify rewrite** (legacy `./docs/` intentionally untouched
per scoping decision):
- **78 MDX files swept**.
`@stackframe/{react,stack,js,tanstack-start,...}` →
`@hexclave/{react,stack,js,...}` in install snippets and code blocks;
`Stack*` SDK class names → `Hexclave*` in all code examples; 'Stack
Auth' brand phrase → 'Hexclave'.
- `openapi/{server,admin,client,webhooks}.json` titles → 'Hexclave REST
API' / 'Hexclave Webhooks API'.

- **Generators flipped before regeneration**:
-
[`packages/stack-shared/src/helpers/init-prompt.ts`](packages/stack-shared/src/helpers/init-prompt.ts),
[`/ai/prompts.ts`](packages/stack-shared/src/ai/prompts.ts),
[`apps/backend/src/lib/ai/prompts.ts`](apps/backend/src/lib/ai/prompts.ts),
[`apps/backend/src/lib/ai/tools/create-email-{template,draft}.ts`](apps/backend/src/lib/ai/tools/create-email-template.ts),
[`apps/skills/src/app/route.ts`](apps/skills/src/app/route.ts) (taught
MCP tool → `ask_hexclave` with compat note; CLI binary teach →
`hexclave`),
[`docs-mintlify/snippets/home-prompt-island.jsx`](docs-mintlify/snippets/home-prompt-island.jsx),
[`packages/template/README.md`](packages/template/README.md) +
integrations/convex/component/README.md.
  - `generate-sdks` propagated changes to `packages/{react,stack,js}`.

- **OpenAPI dual-documentation**:
[`apps/backend/src/app/api/latest/route.ts`](apps/backend/src/app/api/latest/route.ts)
now lists `X-Hexclave-*` headers as primary documented schemas with
`X-Stack-*` duplicates marked `.optional()` (both accepted at runtime by
PR 1's normalize-at-proxy shim).

- **`@stackframe/emails` virtual module**: dual-aliased to
`@hexclave/emails` at the bundler boundary
([email-rendering.tsx:89](apps/backend/src/lib/email-rendering.tsx:89)).
Stored email templates continue to import from either name; new
AI-generated templates and the system prompt teach `@hexclave/emails`.

- **Tier 2 mirror-publish wiring** (new this PR, lays the groundwork for
`@hexclave/*` first publish):
-
[`scripts/rewrite-packages-to-hexclave.ts`](scripts/rewrite-packages-to-hexclave.ts)
— rewrites 9 publishable `@stackframe/*` → `@hexclave/*` `package.json`
files (reads `HEXCLAVE_VERSION` env or `--version=` flag), pins
cross-deps to the shared `@hexclave` version, registers `hexclave` bin
alongside `stack` for `@hexclave/cli`.
-
[`.github/workflows/npm-publish.yaml`](.github/workflows/npm-publish.yaml)
appended with rewrite-then-republish step. `pnpm publish` skips
already-on-npm versions so reruns are safe.

- **Sender email domain**: `noreply@stackframe.co` →
`noreply@sent-with-hexclave.com` (the dedicated transactional-sender
domain split per the plan, to isolate bulk deliverability from
`hexclave.com` reputation); `security@` / `team@stack-auth.com` inbound
mailboxes → `@hexclave.com`.

- **Self-host docs**: docker network / container names in the bash
examples flipped from `stack-auth` to `hexclave` (`hexclave-postgres`,
`hexclave-clickhouse`, `hexclave.env`). The docker image tag
`stackauth/server:latest` stays per the plan's locked decision.

- **GitHub repo slug**: `hexclave/stack-auth` → `hexclave/hexclave` in
every `package.json` `repository` field, README link, CHANGELOG
raw-asset URL.

## Carve-outs (deliberately untouched)

-
**[`apps/backend/src/lib/tokens.tsx`](apps/backend/src/lib/tokens.tsx)**
JWT issuer dual-accept table — PR 1 intentional infrastructure, kept
indefinitely.
- **Legacy `./docs/` folder** — per scoping decision (only
`docs-mintlify/` rewritten).
- **`unified-docs-widget` hostname allowlist** — accepts both
`.hexclave.com` (canonical) and `.stack-auth.com` (transition window)
for DNS rollout.
- **`url-targets.ts`** hosted-domain default
`.built-with-stack-auth.com` — wire identifier baked into existing
customer deploys; indefinite read-fallback.
- **Binary visual assets** (logos, favicons, OG images, README
screenshots) — out of scope for this PR. Need design work; tracked
separately.

## Verification

- **`pnpm typecheck`** on
`packages/{template,stack-shared,react,stack,js}` + `apps/dashboard`:
**all green**. The remaining backend / e-commerce-demo typecheck errors
are pre-existing (Prisma codegen output +
`./generated/api-versions.json` not present in fresh worktrees without
`pnpm run codegen-prisma` + a live DB) and unrelated to this diff.
- **`pnpm lint`** on the same 6 packages: all green.
- **Final grep** for residual `Stack Auth` / `stack-auth.com` /
`@stackframe/stack-cli@latest` references: zero outside the intentional
carve-outs above.
- **25 e2e test files updated in lockstep** with the known-error message
changes (asserted strings flipped to match the new x-hexclave-* +
compat-note messages).

## Deploy blockers (ops sequencing before this rebrand goes live)

This PR is code-complete, but the rebrand's visible surfaces (SDK
default URLs, dashboard links, npm READMEs, REST error messages, runtime
deprecation warning) all point at `*.hexclave.com` / `@hexclave/*`
resources that don't exist yet. None of these are fixable from a PR —
they're ops/registrar/npm work that has to be sequenced before merging
this to a release tag.

Suggested ordering, hardest blockers first:

### Tier 1 — required before customer-facing deploy (everything below
this line *will visibly break customers on day 1* if skipped)

1. **DNS + TLS for `api.hexclave.com` + `api1./api2.hexclave.com`** →
must point at the same backend that serves `api.stack-auth.com` (or a
backend that mirrors PR 1's dual-accept). The SDK's new `defaultBaseUrl`
is `https://api.hexclave.com`; every customer that relied on the old
default and upgrades to a post-PR2 SDK build sends API requests here.
Until this resolves, every default-config customer's API call NXDOMAINs.
2. **DNS for `app.hexclave.com`** → the dashboard. Referenced in the
SDK's default-error messages ("Please create a project on the Hexclave
dashboard at https://app.hexclave.com"), the init-stack flow's
`wizard-congrats` redirect, and the OAuth dashboard handoff.
3. **DNS for `docs.hexclave.com`** + Mintlify deploy → the SDK runtime
deprecation warning (`https://docs.hexclave.com/migration`), every
README, every "Learn more" link in the dashboard, and every REST API
error body (`/api/overview#authentication`) points here. The MDX is in
this PR; the docs build target needs DNS.
4. **DNS for `mcp.hexclave.com`** → the MCP server endpoint that every
taught agent integration (`claude mcp add ...`, `cursor`, `codex`,
`vscode`) registers. Until this resolves, every `npx
@hexclave/cli@latest init` MCP-registration step fails.
5. **Reserve the `@hexclave` npm scope + set repo variable
`HEXCLAVE_VERSION`** → the mirror-publish step in
`.github/workflows/npm-publish.yaml` is gated on this variable. Without
it, the entire taught onboarding command `npx @hexclave/cli@latest init`
404s from the npm registry, *and* every README that says "install
`@hexclave/next`" leads to install failure. Pick the initial version
intentionally (`1.0.0` or aligned to `@stackframe/stack`); don't accept
a silent default.

### Tier 2 — required before announcing the rebrand publicly (lookalike
or low-traffic surfaces, but visibly broken)

6. **DNS for `r.hexclave.com`** → the analytics beacon
`defaultAnalyticsBaseUrl`. Silent failure if missing (analytics drops),
but should land alongside Tier 1.
7. **Register `sent-with-hexclave.com` + full email auth (SPF / DKIM /
DMARC)** → the new default sender domain for shared-sender transactional
emails. Without it the dashboard "send test email" path emits bounces,
and shared-sender flows (`getSharedEmailConfig("Hexclave")`) deliver to
spam at best.
8. **MX + SPF / DMARC for `hexclave.com`** → `team@hexclave.com` and
`security@hexclave.com` mailboxes. The security disclosure mailbox is
referenced in [`.github/SECURITY.md`](.github/SECURITY.md);
`team@hexclave.com` is the actual recipient of internal feedback emails
sent at runtime by
[`apps/backend/src/lib/internal-feedback-emails.tsx`](apps/backend/src/lib/internal-feedback-emails.tsx).
Today, every runtime feedback email bounces.
9. **DNS for `skill.hexclave.com`** → the canonical AI-agent skill fetch
URL (the agent bootstrap pivot). Without it, the entire "agent downloads
`SKILL.md` from a known URL" flow taught in
[`packages/stack-shared/src/helpers/init-prompt.ts`](packages/stack-shared/src/helpers/init-prompt.ts)
fails.
10. **Create `github.com/hexclave/hexclave` as a public repo** (even as
a redirect to `hexclave/stack-auth`) **OR** rewrite every `package.json`
`"repository"` field + dashboard footer "view on GitHub" link to point
at `hexclave/stack-auth` (which already exists). Currently every npm
package page's "Repository" link is dead, and the dashboard's GitHub
button + dev-tool repo link are dead.

### Tier 3 — broken but low-visibility / low-traffic

11. **DNS for `discord.hexclave.com`** → Discord invite redirect, used
in every README's chip and the dashboard footer.
12. **DNS for `demo.hexclave.com`** → " Demo" badge in every npm
package README. Broken-image badge on the package page.
13. **DNS + TLS for `built-with-hexclave.com`** → optional
hosted-handler domain (the default reverted to
`.built-with-stack-auth.com` in this PR's carve-outs, so this only
matters for projects that manually flip).

## Other follow-ups (not deploy-blocking)

- **E2E snapshot regen across the full suite** for the dual-emitted
`x-hexclave-*` response headers (PR 1 follow-up; `vitest -u` in CI
absorbs).
- **Binary visual assets** — logos, favicons, OG images, README
screenshots; need design pass.
- **Backend OpenAPI fumadocs regen** in CI flow — the JSON files in
`docs-mintlify/openapi/` are committed but regen runs in CI. Verify the
workflow that does this still works against the post-PR2 source.
- **Backend typecheck infra debt** — needs `codegen-prisma` +
`codegen-route-info` to clear; pre-existing, unaffected by this PR.

## Test plan

- [ ] CI runs full e2e suite (with `vitest -u` to absorb residual
snapshot deltas, then committed back).
- [ ] Spot-check: new `@hexclave/cli init` (once published) generates
`hexclave.config.ts` and works against a fresh project.
- [ ] Spot-check: existing customer with `@stackframe/stack` import sees
the once-per-process `console.warn` recommending `@hexclave/next` on SDK
init.
- [ ] Manual: dashboard setup page renders the `npx @hexclave/cli@latest
init` snippet and the `x-hexclave-publishable-client-key` API header in
the curl example.
- [ ] Manual: a fresh `pnpm run prisma migrate` against a clean DB sets
the internal project displayName to 'Hexclave Dashboard'.

---------

Co-authored-by: Konstantin Wohlwend <n2d4xc@gmail.com>
2026-05-26 19:18:20 -07:00
BilalG1
a4ae7edecd
fix(ci): repair two pre-existing test failures on dev (#1488)
Both failures are pre-existing on `dev` (confirmed by checking the most
recent dev run
[26434368271](https://github.com/hexclave/stack-auth/actions/runs/26434368271)
— same two annotations, same line numbers). Neither is caused by an open
PR.

## Failure 1 — \`apps/backend/src/lib/redirect-urls.test.tsx:75\`

\`\`\`
AssertionError: expected false to be true
\`\`\`

The \`withHostedHandlerEnv\` helper set/cleared only the
\`STACK_*\`-prefixed env vars. CI's
[e2e-custom-base-port-api-tests.yaml:21](.github/workflows/e2e-custom-base-port-api-tests.yaml#L21)
sets only the \`HEXCLAVE_*\`-prefixed sibling
(\`NEXT_PUBLIC_HEXCLAVE_PORT_PREFIX=67\`), and the dual-read shim in
[packages/stack-shared/src/utils/env.tsx#L53-L55](packages/stack-shared/src/utils/env.tsx#L53-L55)
prefers \`HEXCLAVE_*\` over \`STACK_*\`:

\`\`\`ts
const hexclaveName = getHexclaveEnvVarName(name);
let value = (hexclaveName ? process.env[hexclaveName] : undefined) ??
process.env[name];
\`\`\`

So \`getEnvVariable(\"NEXT_PUBLIC_STACK_PORT_PREFIX\", \"81\")\`
returned \`\"67\"\` instead of the test's \`\"92\"\`, the template
resolved to port \`6709\` instead of \`9209\`, and the assertion at line
75 failed.

**Fix:** mirror every \`STACK_*\` key managed by the helper to its
\`HEXCLAVE_*\` sibling. The dual-read then resolves to the
test-controlled value regardless of which key it checks first.

## Failure 2 —
\`apps/backend/prisma/migrations/20260526060000_nullable_oauth_access_token_expires_at/tests/nullable-expires-at.ts:58\`

\`\`\`
PostgresError: null value in column \"updatedAt\" of relation
\"OAuthAccessToken\" violates not-null constraint
\`\`\`

The migration test's raw INSERT omits \`\"updatedAt\"\`. The Prisma
model declares \`updatedAt DateTime @updatedAt\` with no
\`@default(now())\`, so the DB column is \`NOT NULL\` with no default —
Prisma populates it at the ORM layer on insert, but this test bypasses
Prisma via \`postgres.js\`.

**Fix:** add the \`\"updatedAt\"\` column to the INSERT, set to
\`NOW()\`, with a comment noting why raw SQL must set it explicitly.

## Verification

- **Failure 1, before fix:** ran \`NEXT_PUBLIC_HEXCLAVE_PORT_PREFIX=67
pnpm test run apps/backend/src/lib/redirect-urls.test.tsx\` locally →
reproduces the exact line-75 assertion failure from CI.
- **Failure 1, after fix:** same command → 33/33 pass.
- **Failure 2:** local reproduction requires the migration-test postgres
harness; the fix is one column matching how every other raw SQL insert
in this repo handles \`@updatedAt\` fields. CI on this branch will
confirm.

<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Fixes two failing tests on dev CI by aligning env var handling in
redirect URL tests and by setting the missing updatedAt in a migration
test. Restores green CI with no runtime changes.

- **Bug Fixes**
- Redirect URL tests: `withHostedHandlerEnv` now mirrors `STACK_*`
values to their `HEXCLAVE_*` siblings and restores both, so
`getEnvVariable` reads the test-controlled values even when CI sets only
`HEXCLAVE_*` (e.g. `NEXT_PUBLIC_HEXCLAVE_PORT_PREFIX`).
- Migration test: the raw insert into `OAuthAccessToken` now sets
`"updatedAt" = NOW()` since `Prisma`’s `@updatedAt` isn’t applied when
using `postgres.js` and the column is NOT NULL.

<sup>Written for commit 75c8e4343e.
Summary will update on new commits. <a
href="https://cubic.dev/pr/hexclave/stack-auth/pull/1488?utm_source=github">Review
in cubic</a></sup>

<!-- End of auto-generated description by cubic. -->
2026-05-26 12:59:44 -07:00
BilalG1
f7e389809e
feat(hexclave): PR 1 — wire compatibility layer (invisible) (#1475)
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
DB migration compat / Check if migrations changed (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests (Local Emulator) / E2E Tests (Local Emulator, Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Runs E2E Fallback Tests / E2E Fallback Tests (Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Lint & build / lint_and_build (24) (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
DB migration compat / No migration changes (skipped) (push) Has been cancelled
## Summary

**Stacked on #1468** (`docs/hexclave-rename-plan` — the plan doc). Diff
vs that base = the actual PR 1 code.

This is **PR 1 of the Hexclave rebrand: the invisible compatibility
layer**. Everything is additive. Old SDKs, old wire identifiers, and old
env var names keep working unchanged. The backend dual-accepts and
dual-emits; new SDK code emits `x-hexclave-*` headers and the
`hexclave_` Bearer prefix; cookies dual-write; env vars dual-read across
every category. **No user-visible rebranding lands here** — that's PR 2.

See [`RENAME-TO-HEXCLAVE.md`](./RENAME-TO-HEXCLAVE.md) → *"PR 1
implementation guide"* for the full per-work-area spec, file pointers,
and chosen approach.

## What's implemented (all 14 PR-1 work-areas)

- **SDK export aliases** — `Hexclave*` aliases for the user-facing
`Stack*` exports added in `packages/template`; codegen propagates them
to `@stackframe/{js,stack,react,tanstack-start}`. React-only aliases
correctly excluded from `@stackframe/js`. (`e60550a2`)
- **JWT issuer dual-accept** — `decodeAccessToken` accepts both
`api.stack-auth.com` and `api.hexclave.com` issuers. Signing unchanged.
(`fc781def`)
- **Request-header dual-accept** — backend + dashboard proxies normalize
`x-hexclave-*` → `x-stack-*` at the existing empty proxy hook (so
`smart-request.tsx` and every route schema keep working unchanged); CORS
allowlists extended via a derive-once helper. (`2a056eac`)
- **MCP `ask_hexclave`** — registered alongside `ask_stack_auth` via a
shared helper; `ask_stack_auth` behavior byte-identical. (`30ffd604`)
- **Dev-tool** — DOM ids + header emit switched.
`window.HexclaveDevTool` exposed alongside `window.StackDevTool`.
(`32131ea7`)
- **The big consolidated commit** (`7fed864a`):
- **Env vars** — central `getEnvVariable` prefix-transform (HEXCLAVE
first, STACK fallback); dashboard + template client env files dual-read;
`turbo.json` globalEnv; `NEXT_PUBLIC_STACK_PORT_PREFIX` renamed outright
across ~82 files including docker.
- **Cookies** — dual-write/dual-read auth (`stack-access`/`-refresh-*`
and custom-domain variants), OAuth-state
(`stack-oauth-{inner,outer}-*`), and low-risk cookies (`stack-is-https`,
`stack-last-seen-changelog-version`). Bypass sites patched (backend
OAuth callback, dashboard remote-dev auth route, impersonation snippets,
snapshot serializer).
- **Bearer prefix** — SDK token parser accepts both `stackauth_` and
`hexclave_`; emits `hexclave_`. Discovery correction: this is purely
SDK-internal — the backend never parses it.
- **Response headers** — backend dual-emits
`x-hexclave-{request-id,actual-status,known-error}`; SDKs dual-read (new
first, stack fallback).
- **SDK request-header emit switch** —
`client/server/admin-interface.ts` + dashboard `api-headers.ts` +
`internal-project-headers.ts` + `feedback-form.tsx` switched to
`x-hexclave-*`. Plus `stack_response_mode` query param.
- **Storage keys** — dev-tool / cli-auth / oauth-button / docs keys
renamed (straight); `stack:session-replay:v1` dual-read so in-progress
recordings survive SDK upgrades; `stack_mfa_attempt_code` dual-read.
- **Query params** — cross-domain params dual-emit/dual-accept via
shared helpers; backend `oauth/authorize` accepts
`hexclave_response_mode` and `stack_response_mode`; `stack-init-id`
renamed.
- **`Symbol.for`** — app-internals symbol gets a parallel
`Symbol.for("Hexclave--app-internals")` getter on each attach site (no
read-site churn — old symbol still attached). 3 file-private symbols
renamed outright.
- **Config discovery** — prefer `hexclave.config.ts`, fall back to
`stack.config.ts` at every discovery site (CLI / dashboard / backend /
local-emulator); `init` writes the new filename; CLI credentials path
migrates.
- **Internal renames** — `StackAssertionError`,
`StackClient/Server/AdminInterface` renamed outright (no alias, per the
"internal-only → rename" rule). ~264 files touched.
- **Review-pass fixes** (`21217fbe`) — three real bugs found by parallel
review agents and fixed:
- `snapshot-serializer.ts` was interpolating the whole
`keyedCookieNamePrefixes` array (`${arr}`) — adding a second prefix
would have corrupted **every** OAuth-cookie snapshot, not just new ones.
- **Docker port-prefix producer/consumer mismatch** —
`entrypoint.sh`/`run-emulator.sh`/cloud-init `user-data` were still
producing `NEXT_PUBLIC_STACK_PORT_PREFIX` while the dashboard sentinel +
consumers had been renamed; silent self-host regression (custom port
prefix would be ignored).
- **Missing `hexclave-oauth-inner-*` dual-write** in the OAuth authorize
route — callback's fallback masked it but the dual-write was specified
by the plan.
- Plus: `mcp.test.ts` tool-list assertions updated to include
`ask_hexclave`; two dashboard header-emit sites switched to
`x-hexclave-*` for consistency.
- **E2E snapshot serializer follow-up** (`4b16cc5d`) —
`x-hexclave-request-id` added to the hidden-headers list (mirroring
`x-stack-request-id` treatment), and 2 sample inline snapshots
regenerated in `projects.test.ts` to include the new dual-emitted
headers.

## Verification

- **`pnpm typecheck`** — clean (the fresh-worktree `@/.source` / Prisma
codegen gap in `stack-docs` is pre-existing and unrelated).
- **`pnpm lint`** — 29/29 packages green.
- **`pnpm exec turbo run build --filter=./packages/*`** — 13/13 packages
build (including `@stackframe/stack-cli` once the dashboard standalone
is present).
- **Live E2E** against a running backend on `cl/hexclave-pr1`:
- `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/internal/mcp.test.ts` — **6/6
pass** (verifies the new `ask_hexclave` tool — the hand-written inline
snapshot matched actual MCP server output).
- `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/internal/projects.test.ts` —
**11/11 pass** (verifies wire dual-accept + dual-emit end-to-end; the
snapshot serializer fix was found and applied during this check).

A four-agent parallel **review pass** also audited the full diff for
logic/runtime bugs across the work-areas (wire headers + JWT, cookies +
bearer + symbols, env vars, query params + config + MCP + aliases). All
in-slice review verdicts were ✓ except the three bugs listed above,
which are now fixed.

## Known follow-ups (out of scope for this PR)

- **E2E snapshots across the rest of the suite** — backend now
dual-emits `x-hexclave-{known-error,actual-status}` alongside
`x-stack-*`, which legitimately appears in inline snapshots throughout
`apps/e2e`. Two were regenerated here as a sample; the rest should regen
with `vitest -u` in CI.
- **Docker shell env vars beyond `PORT_PREFIX`** — `entrypoint.sh` still
reads `STACK_*` env vars directly (the JS-side `getEnvVariable`
transform doesn't help the shell). JS consumers dual-read so it works in
practice; full shell-level dual-read is a deeper self-host follow-up.
- **`@stackframe/stack-cli` build ordering** — pre-existing; needs
`build:rde-standalone` first. Not affected by this PR.

## Test plan

- [ ] CI runs full e2e suite (with `vitest -u` to absorb dual-emit
snapshot deltas, then committed back)
- [ ] Spot-check: an old SDK build (emitting only `x-stack-*`) still
authenticates against the new backend
- [ ] Spot-check: a new SDK (emitting `x-hexclave-*` / `Bearer
hexclave_*`) still authenticates against an old backend during deploy
ordering
- [ ] Manual: `npx @stackframe/stack-cli@latest init` (new onboarding
entrypoint) generates `hexclave.config.ts`
- [ ] Manual: existing `stack.config.ts`-only project still resolves (no
migration required)

---------

Co-authored-by: bilal <bilal@stack-auth.com>
2026-05-23 17:24:55 -07:00
Mantra
10df9b2b7b
Fix dashboard sandbox compile errors and switch smart model to Grok (#1476)
## Summary
- Forward Babel/JSX compile errors, runtime throws, and unhandled
rejections from the AI dashboard sandbox iframe to the parent composer
via `postMessage`, so users see actionable errors instead of a blank
preview
- Compile AI-generated dashboard source explicitly with
`Babel.transform` + try/catch (stored in `text/plain` to avoid Babel's
auto-handler swallowing parse errors) and add `crossorigin="anonymous"`
on the Babel script for readable cross-origin error messages
- Switch authenticated smart-tier model from
`moonshotai/kimi-k2.6:nitro` to `x-ai/grok-build-0.1`

## Test plan
- [ ] Generate a dashboard with valid AI code and confirm the preview
still renders
- [ ] Generate a dashboard with invalid JSX and confirm the composer
shows the compile error (not a blank iframe)
- [ ] Trigger a runtime error in generated dashboard code and confirm it
reaches the parent error boundary
- [ ] Verify authenticated smart-tier requests route to
`x-ai/grok-build-0.1`


Made with [Cursor](https://cursor.com)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Embedded dashboards now show a clear “Dashboard failed to compile”
message on compilation errors instead of a blank iframe.
* Dashboard runtime errors and unhandled promise rejections are captured
earlier and forwarded to the parent for improved visibility.

* **Updates**
* The authenticated AI model used for the "smart" quality has been
changed, affecting model selection for authenticated requests.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1476?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-23 12:34:37 -07:00
Konstantin Wohlwend
725f2da886 Fix tests 2026-05-23 12:29:20 -07:00
Konstantin Wohlwend
d4663fbe7d Capture more errors on failures 2026-05-23 10:34:55 -07:00
Konstantin Wohlwend
760b866fea Fix CI/CD 2026-05-23 09:36:23 -07:00
Mantra
9b1851dd54
Managed email domain deletion and Cloudflare DNS import UX (#1442)
## Summary
- Add an admin-only delete endpoint and SDK method to remove managed
email domains, with Resend/DNSimple cleanup and a guard against deleting
domains currently in use for sending.
- Add dashboard UI to remove unused managed domains (with confirmation)
and improve the DNS setup step with Cloudflare detection, zone file
download, and import instructions.
- Add E2E coverage for delete auth, success, in-use rejection,
post-switch deletion, and 404 cases.

## Test plan
- [ ] Run `pnpm test run managed-email-onboarding`
- [ ] In dashboard email settings, add a managed domain and verify
Cloudflare hint appears when NS records point to Cloudflare
- [ ] Remove an unused managed domain and confirm it disappears from the
list
- [ ] Verify active (in-use) managed domains cannot be deleted until
email provider is switched away


Made with [Cursor](https://cursor.com)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Delete managed email domains from the dashboard with a confirmation
flow and success notification
* Cloudflare-aware domain setup: detection banner, quick links to
Cloudflare DNS, downloadable zone file, and import instructions
  * Admin API and admin-app method to perform managed-domain deletion

* **Bug Fixes**
* Deletion blocked with a clear error when a domain is actively used for
sending

* **Tests**
* Added end-to-end coverage for managed-domain delete scenarios
(success, in-use conflict, auth rejection, and 404)

* **Style**
* Data grid layout adjusted to prevent unintended full-height stretching
across various tables

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1442?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-23 09:22:29 -07:00
Konstantin Wohlwend
f6ef49a3dc Remove source-of-truth logic 2026-05-23 01:06:42 -07:00
Konstantin Wohlwend
884f8dfcea Better logs for unknown email errors 2026-05-22 21:08:31 -07:00
Konsti Wohlwend
6a8b45e6c5
fix: CI failures — Node 22, test timeouts, QEMU transaction timeout (#1479) 2026-05-22 20:24:13 -07:00
BilalG1
fb9f77843e
Populate ClickHouse analytics tables when seeding preview projects (#1471)
## Summary

In preview-mode deployments (`NEXT_PUBLIC_STACK_IS_PREVIEW=true`) the
project overview dashboard reported **0 total users, 0 monthly active
users, and no live users** on the globe. The internal metrics endpoint
reads user/team totals from the ClickHouse `analytics_internal.*` tables
and "live users" from recent `$token-refresh` events — but those tables
are normally filled by the external-db-sync pipeline, which does not run
in preview deployments, so they were empty.

This makes the preview/demo dummy-data seeder populate ClickHouse
directly:

- **`seedDummyAnalyticsMirrorTables`** — mirrors the seeded users /
teams / contact channels into `analytics_internal.users` / `teams` /
`contact_channels` so the metrics endpoint reports real totals.
- **`seedDummyLiveTokenRefreshEvents`** — emits recent `$token-refresh`
events across distinct countries so the overview globe shows live users.
- **Timestamp clamping** — `bulkRandomTimestampOnDay` and the
page-view/click timestamps are clamped so seeded events are never dated
in the future (future-dated events permanently matched the unbounded
"live users" query).
- **`buildTokenRefreshClickhouseRow`** — shared helper for the
`$token-refresh` ClickHouse row shape.
- **`create-project`** — pre-warms the ClickHouse connection so the
seeding inserts don't pay the cold-start cost.
- **`projects-metrics`** — types the ClickHouse `.json()` results (fixes
a `tsc` error).

Also bundles a seeding performance optimization that skips redundant
idempotency lookups when seeding a brand-new project.

Notes:
- Seeded mirror rows use `sync_sequence_id = 0` so that if the
external-db-sync pipeline ever does run for the project, any real update
supersedes the seeded placeholder under `ReplacingMergeTree` + `FINAL`.
- "Live users" naturally decays out of the ~2-minute window a couple of
minutes after project creation; preview creates a fresh project per
visit, so the initial overview always shows them.

## Test plan

- [x] `pnpm --filter @stackframe/backend typecheck` passes
- [x] `pnpm --filter @stackframe/backend lint` passes
- [x] Created fresh preview projects; overview shows non-zero Total
Users / Monthly Active Users
- [x] `analytics_internal.users` / `teams` / `contact_channels`
populated for the seeded project
- [x] Globe shows 8 live users across 8 distinct countries (verified via
the metrics 2-minute query)
- [x] No future-dated `$token-refresh` events in
`analytics_internal.events`

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Refactor**
* Faster preview project creation by pre-warming the analytics database
and reusing the warmed connection.
* Reduced initialization delays and redundant checks when seeding
brand-new projects; creation paths now skip needless probes.
* More efficient, parallelized seeding of teams/users/events with
deterministic handling of token-refresh and session-replay data.
* Safer timestamp generation to avoid future-dated events and deferred
background processing for long-running tasks like payments.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1471?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Konsti Wohlwend <n2d4xc@gmail.com>
2026-05-22 15:54:24 -07:00
Konsti Wohlwend
197ad09eea
Route read-only raw Prisma queries to read replica (#1467) 2026-05-22 11:16:11 -07:00
Konstantin Wohlwend
99f07e9516 Trust hosted domains
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests (Local Emulator) / E2E Tests (Local Emulator, Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Runs E2E Fallback Tests / E2E Fallback Tests (Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Lint & build / lint_and_build (24) (push) Has been cancelled
Publish npm packages / publish (push) Has been cancelled
Publish Swift SDK to prerelease repo / publish (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
2026-05-21 18:23:23 -07:00
Konsti Wohlwend
c6d59d0288
Cross domain handoffs (#1458) 2026-05-21 17:15:12 -07:00
Konsti Wohlwend
600a0d6fcf
fix: handle race condition in recordExternalDbSyncDeletion (#1466) 2026-05-21 17:02:20 -07:00
BilalG1
b8fc04bdbd
feat: link Stack Auth projects to GitHub and push config from the dashboard (#1450)
End-to-end flow for managing Stack Auth config via GitHub: link a repo
during onboarding, edit settings in the dashboard, and have the change
committed to your repo + synced back via a GitHub Actions workflow.


![demo](https://gist.githubusercontent.com/BilalG1/29d1188fc581e87d1311baec6e2ae770/raw/demo-2x.gif)

## What this adds

- **CLI** — `stack config push --source github --source-repo
--source-path --source-workflow-path`. Records the source on the config
row so the dashboard knows where the file lives. Reads `GITHUB_SHA` /
`GITHUB_REF_NAME` for commit + branch.
- **Onboarding "Link existing project"** — searchable repo/branch
comboboxes, auto-detects candidate `stack.config.{ts,js}` paths, writes
`STACK_AUTH_PROJECT_ID` + `STACK_AUTH_SECRET_SERVER_KEY` secrets, and
commits a generated workflow YAML that re-runs `stack config push` on
every change to the config file.
- **Dashboard "Push to GitHub" dialog** — replaces the prior TODO
buttons. Pre-flights `repo`+`workflow` scopes on the user's GitHub
connection; if missing, the button flips to "Reconnect with GitHub". On
push, commits the dashboard's edit straight to the linked repo/branch
via the Contents API (with `cache: "no-store"` to dodge GitHub's 60s GET
cache so consecutive pushes don't 409). Suspense boundary scoped to the
dialog body so opening it doesn't blank the dashboard.
- **Project settings** — surface the linked workflow file as a clickable
GitHub link when the source carries `workflow_path`.

## Test plan

- `pnpm lint` (29/29) ✓
- `pnpm typecheck` (29/29) ✓
- `pnpm --filter @stackframe/stack-cli test` (111/111) ✓
- Dashboard vitest on the three relevant files
(`link-existing-onboarding-workflow`, `github-api`,
`github-config-push`) — 37/37 ✓
- Live end-to-end: `BilalG1/lex-lookup` linked to a local dev project;
passkey toggled, push committed `0bb958bd`
([commit](0bb958bda3)).

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
  * Persist workflow file paths for GitHub-backed config sync
* Dashboard “Push” flow to commit config updates with trimmed/default
commit messages
* CLI options to declare GitHub source (repo/path/workflow) and persist
selectable package runner for manual pushes
  * Show workflow-file link in project configuration when present

* **Improvements**
* Robust config-path normalization, existence checks, debounced
repo/branch search, and better GitHub rate-limit handling
* New GitHub API utilities for safe file read/commit and import-package
detection

* **Tests**
* Expanded tests covering GitHub API, config rendering/merge, and push
behaviors

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1450?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-21 13:47:46 -07:00
BilalG1
91b8e4caa4
Fix /internal/metrics ClickHouse OOM (#1457)
Fixes Sentry
[STACK-BACKEND-16H](https://stackframe-pw.sentry.io/issues/STACK-BACKEND-16H)
— the `/api/v1/internal/metrics` endpoint was triggering the cluster's
10.8 GiB OvercommitTracker kill on tenants with months of
`$token-refresh` history.

## Root cause

Three queries in `loadAnalyticsOverview` plus `loadUsersByCountry` did
`GROUP BY user_id` over the events table with **no lower `event_at`
bound**, so their hash table working set scaled with
cumulative-distinct-users-ever-seen instead of the 30-day metrics
window.

## Changes

- Add 30-day `event_at` lower bound to `loadUsersByCountry` and to the
`analyticsUserJoin` inner subquery (used by `dailyEvents`,
`totalVisitors`, `topReferrers`).
- New `getClickhouseAdminClientForMetrics()` factory in
`lib/clickhouse.tsx` with connection-level safety net: per-query +
per-user memory caps, external GROUP BY spill, and `join_algorithm:
'grace_hash,parallel_hash,hash'` (grace_hash measured to give 48% memory
reduction at zero latency cost — see benchmark notes in the file).
- Inline comment + concrete next steps for the long-term fix (option C:
stamp `is_anonymous` at ingest on page-view/click events, then drop the
join entirely).
- Extend `scripts/benchmark-internal-metrics.ts` with the
historical-seed knob and three new modes (`BENCH_BACKFILL_COMPARE`,
`BENCH_JOIN_ALGO_COMPARE`, plus the existing `BENCH_ROUTE_QUERIES`
updated) used to validate the choices above.

## Benchmark — pre-PR vs post-PR

Synthetic seed: 300k users × 9 events spread over 365 days (~2.7M
events).

| | pre-PR | post-PR | delta |
|---|---:|---:|---:|
| Sum peak memory | 2.18 GiB | 515 MiB | **4.3× less** |
| Max query duration | 1293 ms | 101 ms | **12.8× faster** |
| Sum CPU duration | 5119 ms | 394 ms | 13× less work |
| Sum bytes read | 3.87 GiB | 929 MiB | 4.3× less I/O |

Per-query at 300k users:
- `analyticsOverview:dailyEvents` 561 → 44 MiB (12.8× less)
- `analyticsOverview:totalVisitors` 560 → 50 MiB (11.2× less)
- `analyticsOverview:topReferrers` 546 → 50 MiB (10.9× less)
- `loadUsersByCountry` 388 → 44 MiB (8.9× less)

## Caveats

- `loadDailyActiveSplitFromClickhouse` still scans all-history on its
`min(event_at)` subquery. It can't be naively bounded — `first_date` is
used to classify entities as new vs reactivated, and a 30d bound would
silently mislabel old-but-active entities as "new." The new SETTINGS
cap+spill it; the proper fix is option C (documented inline).
- A user with a page-view but no `$token-refresh` in the last 30 days
now falls through to `coalesce(NULL, 0)` and is classified
non-anonymous. Token-refresh fires every few minutes per active session,
so this is rare but not impossible (embedded SDKs that poll less
frequently, sessions straddling the 30d boundary).
- `max_memory_usage_for_user: 9 GB` trades "cluster-wide
OvercommitTracker kill of a random query" for "clean per-user memory
error attributed to the specific query." After our 30d bounds, no query
is anywhere near 9 GB.

## Test plan

- [x] `pnpm typecheck` passes
- [x] `pnpm lint` passes
- [x] `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/internal-metrics.test.ts` — 9/10
pass; the 1 failure (`risk_scores` snapshot drift) reproduces on clean
`dev` and is unrelated
- [x] `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/analytics-{events,events-batch,query}.test.ts
apps/e2e/tests/backend/endpoints/api/v1/token-refresh-events.test.ts
apps/e2e/tests/backend/performance/metrics.test.ts` — all passing tests
pass; 10 pre-existing `PRODUCT_DOES_NOT_EXIST` setup failures reproduce
on clean `dev`
- [x] Benchmark `BENCH_ROUTE_QUERIES=1` at 300k users shows the deltas
above

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* Improved internal metrics collection to use metrics-specific DB
settings for more reliable, safer analytical reads.
* Added guardrails to metrics queries to enforce time-window bounds and
avoid unbounded scans.
* Expanded benchmark modes (backfill and join-algo comparisons),
extended perf seeding, and improved logging/retry behavior to capture
more complete stats and reduce missing log rows.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1457?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-21 13:47:32 -07:00
Konstantin Wohlwend
104f347cbf Update AI chat models 2026-05-20 13:43:28 -07:00
Armaan Jain
055304d3fd
Onboarding app redesign (#1370)
# Onboarding app redesign

Rolls out a unified dashboard visual language centered on `DesignCard`
groupings, a new canonical `DesignDialog`, and an inline live-preview
pattern. Touches the project listing, project overview, auth methods,
design language, onboarding, and sign-up rules surfaces. Reusable
primitives (`DesignCard`, `DesignDialog`, `MethodToggleRow`) replace
one-off layouts, and the project card now leads with **total users +
30-day signups** instead of a weekly-users tile.

**Base:** `dev` → **Head:** `onboarding-app-redesign`

> Red outlines on the "after" shots highlight the UI that changed in
this PR. Empty outlines = layout/chrome change with no data delta.

---

## Flagship: Project listing (`/projects`)

Project cards swap the weekly-users widget for a `ProjectUsersMetric`
(total user count + 30-day signups sparkline). Hover lifts the card; the
metrics row is now part of the card body instead of a footer strip.

|        | Light | Dark |
|--------|-------|------|
| Before | ![before
light](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/projects-before-light.png)
| ![before
dark](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/projects-before-dark.png)
|
| After | ![after
light](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/projects-after-light.png)
| ![after
dark](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/projects-after-dark.png)
|

## Flagship: Auth methods (`/projects/[id]/auth-methods`)

Full restructure: the horizontal `SettingCard` strips are replaced by
stacked `DesignCard` sections (Sign-in methods · Sign-up policies · User
deletion), with a sticky **live sign-in preview** column on the right.
Provider rows become `MethodToggleRow`s with inline configure actions.

|        | Light | Dark |
|--------|-------|------|
| Before | ![before
light](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/auth-methods-before-light.png)
| ![before
dark](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/auth-methods-before-dark.png)
|
| After | ![after
light](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/auth-methods-after-light.png)
| ![after
dark](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/auth-methods-after-dark.png)
|

## Flagship: Project overview (`/projects/[id]`)

Line + donut charts migrate to the shared `AnalyticsChart` component.
Referrers list gains a max-height + scroll affordance so it no longer
pushes neighbouring tiles off-screen.

|        | Light | Dark |
|--------|-------|------|
| Before | ![before
light](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/overview-before-light.png)
| ![before
dark](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/overview-before-dark.png)
|
| After | ![after
light](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/overview-after-light.png)
| ![after
dark](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/overview-after-dark.png)
|

## Other migrated surfaces

| Surface | Before (dark) | After (dark) | What changed |
|---------|---------------|--------------|--------------|
| `/projects/[id]/onboarding` |
![](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/onboarding-before-dark.png)
|
![](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/onboarding-after-dark.png)
| Email-verification toggle adopts the new `MethodToggleRow` +
confirmation `DesignDialog` variant |
| `/projects/[id]/sign-up-rules` |
![](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/sign-up-rules-before-dark.png)
|
![](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/sign-up-rules-after-dark.png)
| Rule builder rewrapped in `DesignCard`/`DesignAlert`/`DesignButton`
primitives |
| `/projects/[id]/design-language` |
![](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/design-language-before-dark.png)
|
![](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/design-language-after-dark.png)
| Adds a `DesignDialog` showcase section so consumers can see the
canonical modal styling |
| `/playground` |
![](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/playground-before-dark.png)
|
![](https://gist.githubusercontent.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7/raw/playground-after-dark.png)
| New `dialog` playground entry exercising the size/variant/icon-chip
permutations |

Light-mode counterparts for the long-tail surfaces are in the [companion
gist](https://gist.github.com/mantrakp04/ff6b32969cb08510860e94be7d67dbf7).

---

## What's new

- **`DesignDialog`**
(`packages/dashboard-ui-components/src/components/dialog.tsx`) —
canonical modal with configurable size/variant, optional icon chip, and
split header/body/footer regions. Replaces ad-hoc `Dialog` +
`DialogContent` usage across the dashboard.
- **`MethodToggleRow`** — shared row primitive used by auth-methods and
onboarding for "thing with a toggle and an inline configure CTA".
- **`ProjectUsersMetric`** — total users + 30-day signups sparkline;
powers the new project card metric and reuses the
`projects-weekly-users` backend route renamed to `projects-metrics`.
- **`action-dialog`** gains `keepOpenOnOutsideInteraction` and
`contentClassName` props so variant chrome can ride along through the
existing helper.
- Backend: new internal `projects-metrics` route + test;
`seed-dummy-data.ts` updated to populate the new metric.

## Notes for reviewers

- Reusable primitives (`DesignCard`, `DesignDialog`, `MethodToggleRow`)
live in `packages/dashboard-ui-components` — please flag any inline
duplications you spot.
- The auth-methods live-preview only renders at `lg+`. Below that
breakpoint the page falls back to the stacked card layout.
- The OAuth provider config dialogs adopt the new pill toggle for
**Shared keys / Custom OAuth credentials**; the underlying form fields
are unchanged.

## Test plan

- [ ] `/projects` — verify the metric tile renders both empty-state and
populated (Demo Project has 584 users seeded)
- [ ] `/projects/[id]/auth-methods` — toggle each method on/off, confirm
live preview updates in real time
- [ ] `/projects/[id]/auth-methods` — open a provider dialog, switch
between Shared / Custom keys, verify form state preserved
- [ ] `/projects/[id]/onboarding` — toggle email verification, confirm
the confirmation dialog variant
- [ ] `/projects/[id]/sign-up-rules` — verify rule builder still saves
correctly under the new chrome
- [ ] Mobile/`md` breakpoint — auth-methods falls back to stacked
layout, no overflow
- [ ] Dark mode parity on every flagship surface

<sub>Visuals captured via local dev server (`localhost:8101`) on
`admin@example.com` seeded account. Red outlines mark new/changed UI on
the "after" pass.</sub>

---------

Co-authored-by: mantrakp04 <mantrakp@gmail.com>
Co-authored-by: Mantra <87142457+mantrakp04@users.noreply.github.com>
2026-05-20 13:10:22 -07:00
BilalG1
512099ed23
Speed up dummy-project seeding (preview create-project ~15s → ~1.3s) (#1437)
## Summary

The internal `preview/create-project` endpoint was taking ~15s because
`seedDummyProject` created its dummy users one at a time through the
full `usersCrudHandlers.adminCreate` CRUD pipeline (one DB transaction +
config render per user, ~86 users). This reworks the seeding path to use
bulk inserts.

End-to-end, the endpoint's server-side handler time drops from
**~15,100ms → ~1,300ms** (~11× faster).

## Seeding changes (`seed-dummy-data.ts`)

- **`seedDummyUsers` — bulk insert.** Build every row (`ProjectUser`,
`ContactChannel`, `AuthMethod`, `ProjectUserOAuthAccount`,
`OAuthAuthMethod`, default permissions) up front with pre-generated
UUIDs, then insert via one `createMany` per table inside a single
transaction — replacing ~86 sequential `adminCreate` transactions.
Named-user team memberships are bulk-inserted the same way (`TeamMember`
+ `TeamMemberDirectPermission`). Idempotency is preserved with a single
up-front email lookup, so re-runs against an existing project still skip
existing users.
- **Native `randomUUID`.** The seed paths now use `node:crypto`'s
`randomUUID()` instead of stack-shared's `generateUuid()`. The
browser-safe polyfill calls `crypto.getRandomValues` ~31× per UUID (once
per template char, each with a fresh `Uint8Array(1)`); generating
thousands of seed UUIDs made that ~800ms of pure CPU in the
activity-event build alone.
- **`seedBulkSignupsAndActivity`.** Skip the redundant back-date
`UPDATE` for freshly-inserted users (`createMany` already writes correct
`createdAt`/`signedUpAt`), and flush ClickHouse events in larger,
parallel batches.
- **`seedDummyProject`.** Run `seedBulkSignupsAndActivity` concurrently
with the lighter remaining steps, and fold `seedDummyTransactions` into
the emails/activity/replays `Promise.all`.
- Removed the now-unused `syncSeedUserOauthProviders` helper.

The bulk path produces the same rows as the CRUD-handler path (verified
row-count equality during development). Webhooks / soft-limit checks are
intentionally not fired for seed data, consistent with the rest of the
seed.

## Also in this PR — preview-mode 404 fix
(`preview-project-redirect.tsx`)

While testing the above, the dashboard 404'd right after a preview
project was created. In preview mode the `/projects` page renders
`PreviewProjectRedirect`, which `POST`s
`/internal/preview/create-project` and then `router.push()`es to
`/projects/<new-id>` — but it never refreshed the client-side
owned-projects cache, so the `[projectId]` route's `useAdminApp()` read
a stale list, failed to find the just-created project, and called
`notFound()`.

Fixed by refreshing the owned-projects cache before navigating, matching
what the normal create-project flow in `page-client.tsx` already does.
(Pre-existing bug, not caused by the seeding change — but it surfaces
the seeding path, so it's bundled here.)

## Testing

`pnpm typecheck` and `pnpm lint` pass for both backend and dashboard.
The preview endpoint was exercised repeatedly during development (HTTP
200, projects created and populated correctly).

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Performance**
* Much faster bulk user and event seeding via larger, parallelized
batches and optimized backfilling.

* **Refactor**
* Dummy data seeding redesigned to be idempotent, deterministic, and
bulk-oriented; seeding tasks now overlap where safe.

* **Bug Fixes**
* Preview project flow validates client capabilities and refreshes the
local project list to avoid stale navigation.
  * Auto-login guarded to run only once to prevent duplicate sign-ins.

* **UI/UX**
* Walkthrough steps and sidebar behavior improved; walkthrough labels
and search keywords updated.

* **Chore**
* CLI identity command now resolves session authentication more
reliably.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1437?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-20 10:10:22 -07:00
Konsti Wohlwend
ffbd09dc57
Fix flaky tests and preexisting CI failures (#1443) 2026-05-20 10:00:11 -07:00
Konsti Wohlwend
29cea48beb
Remote dev envs (#1435) 2026-05-19 15:54:18 -07:00
BilalG1
d0202eeef9
payments: rework refund flow to three-knob API (#1429)
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
DB migration compat / Check if migrations changed (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests (Local Emulator) / E2E Tests (Local Emulator, Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Runs E2E Fallback Tests / E2E Fallback Tests (Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Lint & build / lint_and_build (24) (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
DB migration compat / No migration changes (skipped) (push) Has been cancelled
## Summary
- Replaces per-entry refund schema with a flat `{ amount_usd,
revoke_product, end_subscription? }` shape; refund state is now derived
from bulldozer ledger rows (`refund:<sourceTxnId>:<uuid>`) instead of
the legacy `refundedAt` column, enabling multiple partial refunds up to
the remaining cap.
- Adds `invoice_id` for refunding any subscription invoice (start or
renewal), Stripe idempotency keys derived from `(tenancyId, sourceTxnId,
amount, prior_refunded)` so retries dedupe but intentional partials
don't collide, and a legacy backstop that rejects pre-rework
`refundedAt` purchases.
- Dashboard refund dialog rebuilt around the three toggles (revoke→end
coupling cascades into the UI); refund rows surface in the listing as
`type: "refund"` with `adjusted_by` linkage handling both new and legacy
formats.

## Implements
[STA2-52 — Build in refund logic for
payments](https://linear.app/stack-auth/issue/STA2-52/build-in-refund-logic-for-payments)

## Documented limitations (planned follow-up work)
These are called out in code comments and intentionally deferred to a
follow-up PR:
- **Cap-check race under concurrent refunds.** Bulldozer's embedded
`BEGIN/COMMIT` prevents an outer Prisma tx from scoping the writes, so
two concurrent refunds can both pass the cap check. Needs a
bulldozer-aware mutex or pending-refund-intent pattern. In practice
refunds are admin-only and rare, so the race window is small.
- **Stripe + DB non-atomicity on the DB-success → response-loss path.**
The Stripe idempotency key is keyed on `(tenancyId, sourceTxnId, amount,
priorRefunded)`, so a retry after Stripe-success → DB-fail self-heals
(Stripe dedupes; the next attempt writes the bulldozer row). The hole is
the reverse direction: if the bulldozer row commits but the response is
lost, a retry sees a higher `priorRefunded` and generates a fresh key —
Stripe would issue a second real refund. No out-of-band reconciliation
today.
- **Dashboard can't reach the `invoice_id` path.** Refund actions are
only enabled on `purchase` rows and the submit call never passes
`invoice_id`, so admins refunding a renewal must use the API directly.
Follow-up: enable the action on `subscription-renewal` rows and thread
`invoice_id` through.

## Architectural note
`active-subscription-end` and `item-quantity-expire` entries are **not**
emitted on the refund row itself. They're produced by the derived
sub-end transaction (`transactions.ts:158-228`) once Prisma
`subscription.endedAt` is updated, keeping the `expiresWhen` /
`when-repeated` semantics in one place. This is the main structural
divergence from the ticket's literal entry recipe.

## Review follow-ups addressed in this PR

**First-pass review:**
- **KnownError back-compat preserved**: `SubscriptionAlreadyRefunded` /
`OneTimePurchaseAlreadyRefunded` are once again thrown by the
legacy-`refundedAt` backstop, and `TestModePurchaseNonRefundable` is
thrown when an admin sends `amount_usd > 0` against a test-mode
purchase. Callers catching by error code keep working through the
rework.
- **Idempotency-key comment corrected**: now accurately describes the
`(tenancyId, sourceTxnId, amount, priorRefunded)` key and its
self-healing behaviour on the Stripe-success → DB-fail retry path (see
Documented limitations above for the remaining hole).
- **Renewal-invoice e2e coverage added**: new test sets up a live-mode
subscription via Stripe webhooks (`subscription_create` +
`subscription_cycle` invoices), refunds the renewal invoice via
`invoice_id`, and asserts the resulting `refund_transaction_id` starts
with `refund:sub-renewal:` and is linked back via `adjusted_by` on the
*renewal* row (not the start row). Plus negative cases:
cross-subscription `invoice_id` → 404, `invoice_id` on a one-time
purchase → SchemaError.

**Second-pass review:**
- **Idempotent sub-cancel error-code string fix**: the Stripe code for
re-cancelling an already-canceled sub is
`subscription_already_canceled`, not `subscription_canceled` — the
previous catch would have re-thrown.
- **End-only sub refund replay rejected**: when `amount=0, revoke=false,
end=true` and the sub is already `cancelAtPeriodEnd` or `endedAt`, throw
SchemaError. Otherwise `readPriorRefundSummary` doesn't see end-only
events and the call would be a forever-no-op accumulating empty refund
rows.
- **`revoke_product=true` with renewal `invoice_id` rejected**: the
product grant lives on the sub-start txn, not on renewal txns — a
renewal-scoped revocation would write a back-reference to a non-existent
entry. Forces admin to revoke against the start invoice (or the default
no-`invoice_id` call).
- **Refund row `id` matches the linkage**: the listing route now returns
the full refund txnId as `id` for `type: "refund"` rows so it matches
`adjusted_by.transaction_id` — the dashboard can join source rows to
their refund rows.
- **+2 e2e tests** for the above (end-only replay rejection,
revoke+renewal rejection).

**Third-pass review:**
- **Dashboard refund dialog seeds state on open**: previously the reset
block lived in `ActionDialog`'s `onOpenChange`, which doesn't fire on
the open transition for a controlled dialog. As a result the dialog
opened with the initial `useState` defaults (`amountUsd = '0'`), and an
admin submitting unchanged on a paid purchase would revoke/end at $0
instead of refunding the charged amount. The seed now runs in the menu
`onClick` before `setIsDialogOpen(true)`.
- **`SUBSCRIPTION_START_PRODUCT_GRANT_ENTRY_INDEX` corrected from 1 →
0**: the constant is persisted as `adjustedEntryIndex` on
product-revocation entries and copied through verbatim by
`mapLedgerEntry`. That mapper drops the hidden
`active-subscription-start` entry, so the public-API layout puts the
product grant at index 0. The prior value of `1` pointed at the
money-transfer entry (or out of range on test-mode subs) through the
public listing.
- **`amountTotal` cap gated behind a USD pre-flight**:
`SubscriptionInvoice` doesn't persist invoice currency, and the previous
code took `invoice.amountTotal` as USD cents directly. Now
`getTotalUsdStripeUnits` (which throws on non-USD pricing) is always
called first; `amountTotal` is only preferred as the actual cap after
that pre-flight succeeds.

## Test plan
- [x] `pnpm typecheck` — 28/28 pass
- [x] `pnpm lint` — 28/28 pass
- [x] `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/internal/transactions-refund.test.ts`
— **19/19 pass** (was 14/14 on the original PR; +3 for `invoice_id`
path: renewal refund happy path, unrelated `invoice_id` rejection,
`invoice_id` on OTP rejection; +2 for second-pass: end-only replay
rejection, revoke+renewal rejection)
- [x] curl smoke against
`/api/latest/internal/payments/transactions/refund` — unknown purchase →
404, no-op → 400, negative → 400, sub-revoke-without-end → 400
- [x] **Dashboard UI end-to-end re-run pending** — the original
agent-browser pass ran before the third-pass dialog-seed fix, so any
"money + revoke" submissions may have actually sent `amount_usd = "0"`.
Re-test before un-drafting: open the refund dialog from the menu,
confirm the amount field pre-fills with the charged amount, exercise
validation (negative / exceeds-cap / no-op), and submit both an
end-subscription-only sub refund and a money+revoke OTP refund; verify
bulldozer rows and Prisma `cancelAtPeriodEnd` updates.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Ledger-driven refund flow with stable refund IDs, invoice-aware
refunds, OTP/product-revocation support, tri-state end_action (now /
at-period-end / none), and API responses that include
refund_transaction_id.

* **Bug Fixes / Improvements**
* Deterministic Stripe idempotency, stronger replay protection,
refundable-amount caps, test-mode constraints, and transactions listing
updated to surface refunds.

* **Tests**
* Expanded unit and E2E coverage for new request shape, invoice paths,
money-unit conversion, and edge cases.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1429)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-15 19:29:21 -07:00
Mantra
c808e23b7d
Data-grid overhaul + session-replays / team-payments dashboard surfaces (#1424)
## Summary

Refactors the dashboard data-grid into a smaller, URL-state-aware
primitive and lands several new dashboard surfaces around it: per-user
session replays, team-level analytics and payments, and pagination for
permission definitions. Also moves session replays out from under
`/analytics` to a top-level surface and adds a
`project_user.last_active_at` index that the new weekly-active metrics
depend on.

**Base:** `dev` → **Head:** `refactor/data-grid-and-dashboard-surfaces`
**Scope:** 91 files, +5,644 / −1,858. Assets in [this
gist](https://gist.github.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7).

## Screenshots

Captured from a local dev server (dashboard at `:8101`, dummy project
seeded with 26 users). Standard viewport **1920×1200**, widescreen
**2560×1440**.

### Users list — data-grid overhaul in context

| Light | Dark |
| --- | --- |
|
![users-list-light](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/users-list-light.png)
|
![users-list-dark](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/users-list-dark.png)
|

Widescreen:

| Light | Dark |
| --- | --- |
|
![users-list-light-wide](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/users-list-light-wide.png)
|
![users-list-dark-wide](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/users-list-dark-wide.png)
|

### User detail — new session-replays card + weekly metrics

| Light | Dark |
| --- | --- |
|
![user-detail-light](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/user-detail-light.png)
|
![user-detail-dark](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/user-detail-dark.png)
|

Widescreen:

| Light | Dark |
| --- | --- |
|
![user-detail-light-wide](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/user-detail-light-wide.png)
|
![user-detail-dark-wide](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/user-detail-dark-wide.png)
|

### Session replays — moved out of `/analytics`

| Light | Dark |
| --- | --- |
|
![session-replays-light](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/session-replays-light.png)
|
![session-replays-dark](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/session-replays-dark.png)
|

Widescreen:

| Light | Dark |
| --- | --- |
|
![session-replays-light-wide](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/session-replays-light-wide.png)
|
![session-replays-dark-wide](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/session-replays-dark-wide.png)
|

### Project permissions — new pagination

| Light | Dark |
| --- | --- |
|
![project-permissions-light](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/project-permissions-light.png)
|
![project-permissions-dark](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/project-permissions-dark.png)
|

Widescreen:

| Light | Dark |
| --- | --- |
|
![project-permissions-light-wide](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/project-permissions-light-wide.png)
|
![project-permissions-dark-wide](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/project-permissions-dark-wide.png)
|

### Other migrated surfaces

| Page | Light | Dark |
| --- | --- | --- |
| Project picker |
![projects-light](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/projects-light.png)
|
![projects-dark](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/projects-dark.png)
|
| Overview / setup |
![overview-light](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/overview-light.png)
|
![overview-dark](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/overview-dark.png)
|
| Teams list |
![teams-list-light](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/teams-list-light.png)
|
![teams-list-dark](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/teams-list-dark.png)
|
| Team permissions |
![team-permissions-light](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/team-permissions-light.png)
|
![team-permissions-dark](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/team-permissions-dark.png)
|
| API keys |
![api-keys-light](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/api-keys-light.png)
|
![api-keys-dark](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/api-keys-dark.png)
|

### Scroll behaviour — new data-grid on the users list

| Light | Dark |
| --- | --- |
|
![users-list-scroll-light](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/users-list-scroll-light.gif)
|
![users-list-scroll-dark](https://gist.githubusercontent.com/mantrakp04/01bf8db4c71ec7a119b73d6ee60717a7/raw/users-list-scroll-dark.gif)
|

## What's new

- **`packages/dashboard-ui-components/src/components/data-grid`** —
rewritten. Trimmed `data-grid.tsx` from ~1.7k LOC, split sizing logic
into `data-grid-sizing.ts`, added `use-url-state.ts` for URL-synced
state, and added `data-grid.test.tsx`.
- **Session replays** moved from `…/analytics/replays` to
`…/session-replays` (top-level surface). New `user-session-replays.tsx`
card on the user detail page; new internal `route.tsx` to feed it.
- **Teams** detail page gains `team-analytics.tsx` and
`team-payments.tsx`.
- **Permissions** — new shared `permission-definitions-pagination.ts`
consumed by both project and team permission CRUD routes.
- **Backend** — Prisma migration `add_project_user_last_active_at_idx` +
a `lastActiveAt` index that backs the new weekly-active metrics.
- **Polish** — `editable-input`, `inline-save-discard`, `settings.tsx`,
walkthrough steps, and several data-table components touched in line
with the data-grid rewrite.

## Notes for reviewers

- The data-grid rewrite changes the *shape* of state (now URL-synced),
not just internals. Consumers in
`apps/dashboard/src/components/data-table/*` were updated to match —
please scan those for any missed knobs.
- The `analytics/replays` → `session-replays` rename is git-tracked as
renames; diffs should be small in those files.
- New SDK surface in
`packages/template/src/lib/stack-app/session-replays/index.ts` and
additions in `admin-app-impl.ts` / `server-app-impl.ts` mean OpenAPI
specs (`docs-mintlify/openapi/{admin,client}.json`) regenerate; the diff
is mostly mechanical.

## Test plan

- [ ] `pnpm typecheck` clean
- [ ] `pnpm lint` clean
- [ ] Data-grid unit tests pass (`packages/dashboard-ui-components`)
- [ ] Manual: users list — column resize, sort, filter, paginate; URL
state reflects each change and survives reload
- [ ] Manual: user detail — session-replays card lists replays;
weekly-metrics card renders without `lastActiveAt` index migration
applied (i.e. on a fresh DB) and after applying it
- [ ] Manual: project + team permissions — pagination cursor advances
and stays consistent under search
- [ ] Manual: session-replays top-level page loads; old
`/analytics/replays/...` URL path is no longer expected to be linked
anywhere


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
  * Session Replays app (embedded mode, search, sorting, share links)
  * Tabbed Team pages with Team Analytics and Team Payments dashboards
* Server-backed cursor pagination, debounced search, and infinite-scroll
for teams/users/permissions

* **UX**
* Permission and member tables refresh after edits; permission creation
triggers table refresh
  * Users list supports sorting by last-active

* **Performance**
  * Index added to speed ProjectUser last-active queries

* **Documentation**
  * API/SDK docs updated for pagination and new query params
* Contributor guidance: explicit git-safety rules added (no destructive
git ops without consent)

* **Tests**
  * Added e2e tests for pagination and filtering on list endpoints
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-15 14:16:47 -07:00
Aman Ganapathy
a9623d976a
[Refactor] [Fix] Remove default prod creation (#1350)
With the new bulldozer rework we dont support default products anymore.
Users are encouraged to currently manually handle granting products to
their end users.

We block api requests and new product creations that attempt to set no
price, and we remove any options to set include-by-default. We also
migrate users' existing product snapshots in `Subscriptions`,
`OneTimePurchases`, and `ProductVersions` to have no price set if it's
an include-by-default product. This will make it so that next time a
user goes onto their products page, they will be informed that the
pricing is invalid and it is no longer delivered by default.

Note, however, that these products will still be providing items and the
like to the users who have them.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Migrated legacy product snapshots so missing included-items no longer
break readers.
* Removed deprecated "include-by-default" pricing sentinel; pricing now
requires explicit price entries and write validation rejects the old
sentinel.

* **Chores**
* Simplified dashboard pricing flows: create/edit/save now use explicit
prices and surface an alert when a formerly implicit free plan needs an
explicit $0 price.
* Config overrides and stored data are auto-normalized to explicit price
objects.

* **Tests**
* Updated and added tests covering migration, validation, and switching
behavior for explicit prices.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: mantrakp04 <mantrakp@gmail.com>
Co-authored-by: Mantra <87142457+mantrakp04@users.noreply.github.com>
2026-05-15 10:38:33 -07:00
BilalG1
15faf709f3
stack-cli: explicit --cloud-project-id / --config-file across exec, config, project (#1422)
## Summary

Reworks the `stack` CLI surface so the cloud-vs-local choice is
**explicit at every invocation**, removing the global `--project-id` /
`STACK_PROJECT_ID` env var and the local-default `exec` behavior
introduced earlier in this branch.

### `stack exec`
- Removes `--cloud`, `STACK_EXEC_DEFAULT_TARGET`, and the implicit local
default. The CLI now requires **exactly one** of:
  - `--cloud-project-id <id>` — run against the Stack Auth cloud API
- `--config-file <path>` — run against the local emulator project mapped
to that absolute config-file path
- The `--config-file` branch resolves the project id by calling the
existing `GET /api/latest/internal/local-emulator/project` endpoint and
matching `absolute_file_path` client-side. No new backend endpoint
introduced.

### `stack config pull` / `stack config push`
- Both now take `--cloud-project-id <id>` per-command instead of the
global flag / `STACK_PROJECT_ID` env.
- `config pull --config-file` is **optional**: when omitted, the CLI
uses `./stack.config.ts` from the current directory. If neither flag nor
cwd file is present, it exits with a clear hint to pass `--config-file`
or `cd` into a directory containing `stack.config.ts`.

### `stack project list`
- Default (no flags) lists both **cloud and local emulator** projects.
Each entry carries a `target: "cloud" | "dev"` field (text format:
`<id>\t<displayName>\t[<target>]`).
- `--cloud` / `--dev` filter to a single source (mutually exclusive —
passing both errors).
- On the default code path, an unreachable local emulator emits a single
stderr warning (`warning: skipping dev projects — local emulator not
reachable …`) and the command still succeeds with cloud results. With
`--dev` explicit, the unreachable case hard-errors.

### `stack project create`
- Now requires `--cloud` to make the cloud-vs-local choice explicit.
There is no local alternative today; the flag exists to surface the
decision so a future local-project create doesn't silently change
behavior.

### Backend
- Bumps the `LIMIT` on `GET /api/latest/internal/local-emulator/project`
from 20 → 100 so `project list --dev` doesn't silently truncate.

### Refactors (from earlier in this branch, unchanged here)
- Local-emulator paths/ports/PCK polling live in
`packages/stack-cli/src/lib/emulator-paths.ts`.
- Shared local-emulator admin credentials live in
`packages/stack-shared/src/local-emulator.ts`.
- `resolveAuth` / `resolveLocalEmulatorAuth` take an explicit
`projectId: string` (no more `Flags` parameter).
- New `packages/stack-cli/src/lib/local-emulator-client.ts` encapsulates
the GET-and-match flow used by both `exec --config-file` and `project
list --dev`.

## Breaking changes

**Scripts that relied on any of the following must be updated:**

| Removed | Replacement |
| --- | --- |
| Global `--project-id <id>` flag | Per-command `--cloud-project-id
<id>` |
| `STACK_PROJECT_ID` env var | Per-command `--cloud-project-id <id>` |
| `stack exec --cloud` | `stack exec --cloud-project-id <id>` |
| `STACK_EXEC_DEFAULT_TARGET=cloud\|local` | `--cloud-project-id <id>`
or `--config-file <path>` |
| `stack exec` defaulting to local emulator | Explicit `--config-file
<path>` required |
| `stack project create` without a flag | `stack project create --cloud
…` required |

## Test plan
- [x] `pnpm lint` (stack-cli, backend, e2e) — clean
- [x] `pnpm --filter @stackframe/stack-cli typecheck` — clean
- [x] `pnpm --filter @stackframe/stack-cli exec vitest run` — **72/72
passing** (new unit tests: `parseExecTarget`,
`resolveConfigFilePathForPull`, `resolveProjectListSources`,
`formatProjectList`)
- [x] `pnpm test run apps/e2e/tests/general/cli.test.ts` — **73 passing,
4 skipped, 0 failing**. New e2e cases cover:
  - `exec` with neither flag → errors with "Specify a target"
  - `exec` with both flags → errors with "not both"
- `exec --config-file` with missing file / missing PCK / unreachable API
- `exec --config-file` happy path against a real local-emulator backend
(gated on `NEXT_PUBLIC_STACK_IS_LOCAL_EMULATOR=true`)
  - `config pull` cwd fallback to `./stack.config.ts`
- `config pull` with no `--config-file` and no cwd `stack.config.ts` →
errors with `Pass --config-file …`
  - `project list --cloud --dev` together → errors
- `project list` default with unreachable emulator → cloud results +
single stderr warning
  - `project create` without `--cloud` → errors
  - All previously-`--cloud` exec cases ported to `--cloud-project-id`
- [x] Manual smoke: `stack exec --help`, `stack project list --cloud
--dev`, `stack project create` all emit the expected friendly errors /
help text.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **New Features**
* CLI `exec`, `config`, and `project` commands now require explicit
targeting via `--cloud-project-id` (cloud) or `--config-file` (local
emulator).
* `project list` now supports `--cloud` and `--dev` flags to display
projects from both sources with target indicators.
* Enhanced environment variable validation for emulator service ports
with proper fallback handling.

* **Bug Fixes**
* `project list` now gracefully handles unreachable emulator with
warning fallback instead of failure.

* **Tests**
* Expanded test coverage for project targeting, config file resolution,
and emulator connectivity scenarios.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-14 17:20:40 -07:00
BilalG1
024da3cacb
[Fix] freestyle-mock honors $PORT, drop server.listen string-patch (#1432)
## Summary

The multi-worker freestyle-mock rewrite
([#1430](https://github.com/hexclave/stack-auth/pull/1430)) hardcoded
`server.listen(8080)`, which collides with qstash inside the
local-emulator container. Supervisord sets `PORT=8180` for
freestyle-mock specifically to avoid this clash, but the new source
ignores `process.env.PORT`.

The local-emulator Dockerfile previously bridged this with a
`server.replace('server.listen(8080)', ...)` string-patch on the
embedded source. The new code is `server.listen(8080, () => { ... })` —
the literal `'server.listen(8080)'` substring no longer matches, so the
replace silently no-ops and freestyle-mock binds 8080. qstash then can't
start (`address already in use: 127.0.0.1:8080` → FATAL), the backend
(which depends on qstash) never comes up, and the emulator smoke test
times out.

Observed in [this
run](https://github.com/hexclave/stack-auth/actions/runs/25832479377):

```
smoke-test: FTL address already in use: 127.0.0.1:8080
smoke-test: WARN exited: qstash (exit status 1; not expected)
smoke-test: INFO gave up: qstash entered FATAL state, too many start retries too quickly
[603s] SMOKE TEST FAILED: backend /health?db=1 did not return 200 within 300s
```

## Changes

- `docker/dependencies/freestyle-mock/Dockerfile`: `server.listen(PORT)`
where `PORT = process.env.PORT || 8080`, plus the startup log reflects
the actual port.
- `docker/local-emulator/Dockerfile`: drop the now-redundant
string-replace for the listen call. The two remaining replaces
(`fs/promises` import + node_modules symlink) are unrelated and kept.

## Test plan

- [ ] QEMU emulator build workflow passes on this branch (smoke test
reaches healthy backend).
- [ ] Verify locally that supervisord's `PORT=8180` is honored by
freestyle-mock and qstash binds 8080 cleanly.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
  * Server listening port is now configurable via PORT (default 8080).
* Local emulator startup adjusted to better handle dependencies and
create a node_modules symlink for smoother local runs.
  * Seed/process transaction timeout increased to 90s for reliability.
* Local database statement timeout changed to 0 (no statement timeout).
* **CI**
  * Added step to enable and validate KVM access during emulator builds.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1432)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-14 15:31:16 -07:00
Madison
2cf0f6f981
[Apps] Adding support app alpha and dogfooding (#1368)
<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Support app: inbox UI to create, view, reply, and manage conversations
(status, priority, assignee, tags, internal notes).
* Dashboard pages: Conversations and Support Settings; feedback can
create managed conversations.
* Public/internal APIs for listing, creating, updating, and fetching
conversation details; client-side helpers.

* **SLA**
* Configurable first/next response targets, urgency classification, and
timing logic.

* **Data**
* New conversation persistence (conversations, entry points, messages)
and migration tests; preserves conversations on user/team deletion and
anonymizes sender data.

* **Tests**
  * Unit, migration, and end-to-end tests added.

* **Documentation**
  * Updated docs describing conversation model and workflow rules.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-13 11:36:11 -05:00
Aman Ganapathy
3385d6e2b0
[Fix] recover stale external db requests (#1428)
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
DB migration compat / Check if migrations changed (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests (Local Emulator) / E2E Tests (Local Emulator, Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Runs E2E Fallback Tests / E2E Fallback Tests (Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Lint & build / lint_and_build (24) (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
Mirror main branch to main-mirror-for-wdb / lint_and_build (push) Has been cancelled
Publish npm packages / publish (push) Has been cancelled
Publish Swift SDK to prerelease repo / publish (push) Has been cancelled
Sync Main to Dev / sync-commits (push) Has been cancelled
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
DB migration compat / No migration changes (skipped) (push) Has been cancelled
Failures between claiming and the deletion of outgoing requests from the
handler can leave requests stale and never clean them up. Some of these
requests may also have duplicates that are fresh in the outgoing queue.
These requests need to be deleted or retried.

It's important to still log the stale requests to sentry so the root
cause can be investigated.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Improved detection and recovery of stale outgoing requests; telemetry
now records precise reset/deleted counts and includes sampled affected
IDs.
* Added an early fast path to skip unnecessary external calls when there
are no pending requests.

* **Refactor**
* Consolidated stale-request handling into a dedicated helper and
optimized recovery logic; poller telemetry now includes claim-limit
attributes.

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1428)
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-12 17:55:44 -07:00
Aman Ganapathy
4648fc1899
[Feat] new scripts on migrate/seed/init run for internal (#1421)
### Context

One script grants free plan to any team which is a customer of the
internal project who doesnt have it already.
We also want to migrate our users (internal) to the latest version of
their products.

Needed because some subs on dev right now dont have a plan. And internal
isnt using latest version of its own growth plan.

### Describing the Paths we want to Account for

1. Users on production who currently don't have a plan should get free
plans, since this script is run with every migrate
2. Users on production should get the latest version of each plan of
ours. So a forced migration to latest version of internal project plans
3. No other project's products/product lines should be affected. They
will continue to have product versioning
4. 2 should apply to test mode subscriptions as well, on top of stripe
subscriptions. All of them should be refreshed
5. Internal project itself should get latest version of its own growth
plan
6. If the bulldozer write fails, we should be able to recover on next
migration (this should already be handled by init bulldozer script,
because it checks if prisma db and bulldozer db are out of sync)
7. if the regenerate or backfill fail, we should be able to recover just
by rerunning the script
8. Product version table should not balloon. No table should really
balloon



### What I've tested on local
1. Put in 1000 db subscription rows, made them all stale and then ran
the regen script. It took about 6 minutes to update all of them, and it
was idempotent so rerunning it again did nothing.
2. With proper stripe keys I switched off of test mode on the internal
app, granted a product to a new team and updated the product's item
list. At this point I checked and the new team had the outdated version
of the product. Then I ran the regen script and the new team was moved
to latest product version.
3. Tried the above with the internal team's growth plan too and it
worked as well.
4. Backfill actually grants free plan


### Deployment strategy in prod
Run the backfill and the regen scripts once each after your migrations
on the prod db.
`pnpm db:backfill-internal-free-plans` will make sure every team has a
free plan at least if they dont have an existing plan (and it is
idempotent).
After that, run `pnpm db:regen-internal-subscriptions-to-latest` which
will migrate every user to the latest version of their plan (i.e latest
snapshot). This should also be idempotent.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Automated backfill to grant internal free plans to qualifying billing
teams.
* Regeneration tool to refresh internal subscription snapshots to the
latest product versions.

* **Chores**
* Added CLI commands and package scripts to run backfill and regen jobs.
  * Database init now runs payment initialization before backfill/regen.

* **Tests**
* Integration and unit tests added/updated to validate backfill,
regeneration, and free-plan idempotency.

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1421)
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-12 16:05:45 -07:00
Mantra
e50358710a
fix(tests): use sql.json in onboarding migration test and refresh metrics snapshot (#1420)
## Summary

Two small test-maintenance fixes that came up while running the suite:

- **Onboarding migration test**
(`apps/backend/prisma/migrations/20260420000000_add_project_onboarding_state/tests/default-and-updates.ts`):
switch the JSON insert from `\${JSON.stringify(onboardingState)}::jsonb`
to `\${sql.json(onboardingState)}`. This matches the pattern used by
every other migration test in the repo (see
`20260214000000_fix_trusted_domains_config/tests/*`) and lets the
`postgres` driver handle serialization and parameter binding
consistently rather than relying on a manual `::jsonb` cast.
- **Internal metrics snapshot**
(`apps/e2e/tests/backend/endpoints/api/v1/__snapshots__/internal-metrics.test.ts.snap`):
update `active_users_by_country.AQ` to list `mailbox-2` before
`mailbox-1`. The `should return metrics data with users` test signs in
`mailbox-1` (mailboxes[0]) into AQ first, then later signs `mailbox-2`
(mailboxes[1]) into AQ, so sorted by `last_active_at_millis desc`
`mailbox-2` should come first. The snapshot now matches that ordering.

No production code is touched — both changes are limited to test
fixtures.

## Test plan

- [ ] `pnpm -C apps/backend test run` (migration tests)
- [ ] `pnpm -C apps/e2e test run internal-metrics` (snapshot test)
- [ ] `pnpm lint`
- [ ] `pnpm typecheck`


Made with [Cursor](https://cursor.com)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Tests**
* No user-facing behavior changed; test flows made more robust and less
flaky (migration validation, metrics ingestion polling, CLI expiry
checks, failed-emails digest expectations).
* **API / Documentation**
* CLI auth default expiration reduced from 2 hours to 2 minutes (updated
OpenAPI defaults and related test expectations).
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-05-12 10:06:29 -07:00
Konstantin Wohlwend
7acbd8d56d Improved StackAssertionError error logging 2026-05-07 13:29:01 -07:00
Aman Ganapathy
bc6347e3c3
[Feat]: set flag to disable billing (#1417)
### Context
There are some kinks to work out with deploying plan limits onto prod,
so we'd like to disable it temporarily.

### Summary of Changes
We update all call sites of the item quantity things with a flag based
check. Idea is when flag is set to true, it should function as if there
are no limits.
2026-05-06 14:58:06 -07:00
Konsti Wohlwend
765b0f4e29
New setup (#1413) 2026-05-06 12:03:06 -07:00
Mantra
3a5153f4db
feat(payments): collect 0.9% platform fee on every stripe money movement (#1378)
## Summary

Charges the platform 0.9% on both legs of each transaction on
non-internal projects.

- **Charge leg** — rides along via Stripe's native
\`application_fee_amount\` / \`application_fee_percent\` params on the
PaymentIntent / Subscription.
- **Refund leg** — Stripe's default reverses our charge-leg fee on
refund, netting us zero. We disable that with \`refund_application_fee:
false\`

## Refs
-
https://docs.stripe.com/api/subscriptions/create#create_subscription-application_fee_percent
-
https://docs.stripe.com/api/payment_intents/object#payment_intent_object-application_fee_amount

---------

Co-authored-by: nams1570 <amanganapathy@gmail.com>
2026-05-05 09:22:53 -07:00
Aman Ganapathy
c01c052ac9
[Refactor][Feat] Implement Plan Limits for Hard-and-Soft Item Caps (#1215)
### Suggested Review Areas
Please see `plans.ts` and `seed.ts` to verify whether the item caps are
where they should be. Outside of that, each commit should be atomic so
stepping through the commits should give you an idea of how I
implemented each limit.

### Discussion
Something to discuss: when a user cancels team/growth we regrant free
fine, but any extra-seats they had just keeps billing. So they end up
paying ~$29/mo per extra-seat on top of free's 1 seat, which is strictly
worse than just staying on team. This surfaced while manually testing
this PR, we only enforce the add-on base requirement at purchase time,
nothing cascades on cancel. Should we cascade cancel add ons?

### Context
Now that we have a stable suite of products for stack-auth, we want to
limit the items under each product a customer has access to based on
their plan. So for example, a free plan user has a certain amount of
emails they can send out each month, and so on. We try to implement
limits in this PR.

### Summary of Changes
Implemented hard limits for dashboard admins, analytics per-query
timeouts, sent email monthly capacity, events, and session replays.
Implemented a soft cap for auth users (where if there's a signup beyond
the limit, we log it to sentry so we can manually choose to email that
user/team).

For auth users, we do not block new user sign ups once plan limit has
been hit. We also don't degrade or impact the customer experience. It
logs to sentry and it is up to us to take manual action to email the
user to upgrade the plan. Also, implementation wise, we count all the
users across all the projects for this team and compare it to their plan
item limit, rather than debiting items like we do for other approaches.
As a soft cap, this should be fine plus this is a better source of
truth.

For email capacity, we operate a monthly limit of emails. Once this is
hit, no more emails can be sent until the next month/ a plan upgrade.
These emails will be treated as a send error, so they can be manually
resent once the capacity is reset. With respect to the `email-queue`
state engine, they go from `SENDING`->`SERVER_ERROR`, hooking into the
existing state engine flow, with an external error that shows it's
because of the rate limit. This is cleaner than inventing a new state
that is identical for all intents and purposes to `SERVER_ERROR`. We
check in processSingleEmail since that maps to the sending state.

For analytics query timeouts, the backend route accepts a timeout
parameter with the request. The way we implement the timeout for each
query is by taking the `min(request_timeout,plan_timeout)` and using
that. This determines how long a query can run for.

For analytics events, there are server-side events (like refresh token
refreshes or sign up rule triggers) and client side events (like page
views or clicks). When these events occur, they are written to the
events table in clickhouse. We choose to implement a hard cap for the
total events, not just server side or client side. Once the cap is hit,
we stop storing the events and display a banner on the analytics page. A
different banner renders when we are at >=80% of total plan capacity.

For session replays, we stop creating new session replays when the limit
is hit. Old replays can still have chunks appended to them. The source
of truth here is the session replay table- a new replay corresponds to a
new row in the table. We have similar banners as to the events.

Dashboard admins should be 4 for both team and unlimited.

#### Implementation Caveats

For debiting items across these limits, we now use `tryDecreaseQuantity`
at the beginning. This means we debit first if possible before
conducting the action (like writing events to clickhouse). In practice,
this means that if clickhouse fails, then the user is debited for
something that doesn't happen. However trying to build a refund
workaround would be very clunky, and also, clickhouse is reliable. For
debits that are very small in the order of things (say, 200 items on a
100k plan), it doesn't mean much.

For emails, we don't debit items if it's a retry. This prevents the user
for being charged multiple times for effectively one email.


### UI Changes
The only UI changes in this PR are having certain banners render in
analytics when a customer is approaching/ is at their monthly limit of
session replays or events.


### Out of Scope for this PR
We do not have metered pricing yet, so events/session replays/ email use
beyond the limits cannot be charged yet. This is why for this
implementation, we rely on hard and soft caps.
We do not implement payment per-transaction pricing yet. That is
deferred to a followup PR.
The UI for the onboarding call will be set up as part of the overall
onboarding flow which doesn't exist yet, so it has been deferred.
Since the UI for the dashboard home page and project/account settings is
currently being reworked, finding a better spot for plan upgrades is not
handled in this PR.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Session replays added as a monthly included entitlement; onboarding
calls added to Team/Growth plans. Dashboard banners warn about
analytics-event and session-replay limits. Projects page adds extra-seat
flow and improved invitation error handling.

* **Behavior Changes**
* Monthly renewal semantics for emails-per-month and analytics-events;
analytics query timeouts now respect plan limits and are clamped. Email
sends, analytics events, and new session creation are blocked when
quotas are exhausted. Growth plan seats set to 4.

* **Tests**
* E2E and unit tests added to verify quota enforcement and free-plan
regranting.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Mantra <87142457+mantrakp04@users.noreply.github.com>
2026-05-04 18:25:13 -07:00
Mantra
d2f2fb0e42
[codex] Fix preview dummy payments customer types (#1398)
## Summary

Fixes preview dummy payments seed data so seeded products and items
match their team-scoped product lines.

## Root Cause

The preview seed configured `workspace` and `add_ons` product lines with
`customerType: "team"`, but the products inside those lines (`starter`,
`growth`, and `regression-addon`) were configured as `customerType:
"user"`. Environment override writes validate against the rendered
branch config, so unrelated environment updates could fail with a
product/product-line customer type warning.

## Changes

- Mark preview dummy payments products and included items as
team-scoped.
- Export the dummy payments setup helper for focused validation.
- Add a regression test that validates the generated branch payments
override has no config override errors or incomplete config warnings.

## Validation

Passed in the original checkout with dependencies installed:

- `STACK_SKIP_TEMPLATE_GENERATION=true pnpm exec vitest run --config
vitest.config.ts src/lib/seed-dummy-data.test.ts --reporter=verbose
--maxWorkers=1 --minWorkers=1`
- `pnpm -C apps/backend lint src/lib/seed-dummy-data.ts
src/lib/seed-dummy-data.test.ts`
- `pnpm -C apps/backend typecheck`

The temporary clean worktree used for this PR did not have
`node_modules`, so dependency-backed commands were not rerun there.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Improvements**
* Strengthened payment product configuration with tighter typing and
validation
* Normalized product customer types (switched relevant dummy data from
user to team) for consistency

* **Tests**
* Added tests validating dummy payments configuration and
branch/override validation

* **Documentation**
* Added Q&A documenting a configuration validation failure mode and
required consistency for dummy payments data
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-01 09:44:30 -07:00
Mantra
e831972c4c
Move internal MCP server to backend, use Mintlify MCP for docs tools (#1389)
## Summary
- Move the `/api/internal/[transport]` MCP route from the docs app to
the backend, so the public `ask_stack_auth` MCP tool is served from the
same origin as the AI query API it proxies to.
- Replace the bespoke docs-tools HTTP client in
`apps/backend/src/lib/ai/tools/docs.ts` with an `@ai-sdk/mcp` client
that talks to Mintlify's generated MCP server. The backend AI agent now
consumes Mintlify's lower-level search/fetch tools directly instead of
going through the docs app.
- Swap `STACK_DOCS_INTERNAL_BASE_URL` for `STACK_MINTLIFY_MCP_URL`
(defaults to the Mintlify-hosted MCP URL).
- Move the `@vercel/mcp-adapter` dependency from `docs` to
`apps/backend`.

## Test plan
- [ ] `pnpm typecheck`
- [ ] `pnpm lint`
- [ ] e2e: new
`apps/e2e/tests/backend/endpoints/api/v1/internal/mcp.test.ts` covers
`tools/list` and validation on `tools/call`
- [ ] Manual: hit `POST /api/internal/mcp` on the backend and confirm
`ask_stack_auth` is listed and callable
- [ ] Manual: confirm backend AI agent docs tools resolve via the
Mintlify MCP URL

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Backend docs tooling now uses a Mintlify MCP server for documentation
tools and discovery.

* **Chores**
* Development environment variables updated to point to the Mintlify MCP
endpoint.
* Backend dependency added to support MCP integration; docs package
dependency removed.

* **Tests**
* Added end-to-end tests for the internal MCP endpoint and tool
validation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-29 09:45:52 -07:00
Mantra
65d87a4836
Dashboard: DataGrid refactor + layout (stacked on overview-revamp) (#1338)
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
DB migration compat / Check if migrations changed (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests (Local Emulator) / E2E Tests (Local Emulator, Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Runs E2E Fallback Tests / E2E Fallback Tests (Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Lint & build / lint_and_build (24) (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
DB migration compat / No migration changes (skipped) (push) Has been cancelled
## Summary

Stacked on `overview-revamp` (now rebased against `dev`). Introduces a
first-class `DataGrid` component in
`@stackframe/dashboard-ui-components`, migrates every dashboard table
off the legacy `DesignDataTable` / hand-rolled `<Table>` pattern to it,
and ships a matching dashboard design guide.

Since the last writeup the `DataGrid` runtime has been substantially
rewritten: the virtualizer now supports `rowHeight="auto"` with
`estimatedRowHeight`, every column can opt into `cellOverflow: "wrap"`,
the toolbar + header stick under a configurable `stickyTop`, and the
seeded dummy data has been fleshed out so the migrated surfaces render
with realistic density. The AI-analytics prompt was also extended with
full schema docs for the auth / team / email / payments tables so
natural-language queries produce better SQL.

**Base:** `dev` → **Head:** `ui-fixes-minor`
**Scope:** 39 files, ~+6.5k / -2.4k

## Screenshots

Captured against the seeded Demo Project on the local dashboard
(`admin@example.com` via mock GitHub OAuth). Viewport: **1920×1200**
(standard) and **2560×1440** (widescreen). Assets hosted in [this
gist](https://gist.github.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9).

### Overview — revamped metrics + line chart

| Light | Dark |
| --- | --- |
|
![overview-light](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/overview-light.jpg)
|
![overview-dark](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/overview-dark.jpg)
|

Widescreen:

| Light | Dark |
| --- | --- |
|
![overview-wide-light](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/overview-wide-light.jpg)
|
![overview-wide-dark](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/overview-wide-dark.jpg)
|

### Users — DataGrid with seeded rows

| Light | Dark |
| --- | --- |
|
![users-light](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/users-light.jpg)
|
![users-dark](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/users-dark.jpg)
|

Widescreen:

| Light | Dark |
| --- | --- |
|
![users-wide-light](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/users-wide-light.jpg)
|
![users-wide-dark](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/users-wide-dark.jpg)
|

### Transactions — new DataGridToolbar + sticky chrome

| Light | Dark |
| --- | --- |
|
![transactions-light](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/transactions-light.jpg)
|
![transactions-dark](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/transactions-dark.jpg)
|

Widescreen:

| Light | Dark |
| --- | --- |
|
![transactions-wide-light](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/transactions-wide-light.jpg)
|
![transactions-wide-dark](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/transactions-wide-dark.jpg)
|

### Teams

| Light | Dark |
| --- | --- |
|
![teams-light](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/teams-light.jpg)
|
![teams-dark](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/teams-dark.jpg)
|

Widescreen:

| Light | Dark |
| --- | --- |
|
![teams-wide-light](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/teams-wide-light.jpg)
|
![teams-wide-dark](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/teams-wide-dark.jpg)
|

### Email Outbox

| Light | Dark |
| --- | --- |
|
![email-outbox-light](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/email-outbox-light.jpg)
|
![email-outbox-dark](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/email-outbox-dark.jpg)
|

Widescreen:

| Light | Dark |
| --- | --- |
|
![email-outbox-wide-light](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/email-outbox-wide-light.jpg)
|
![email-outbox-wide-dark](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/email-outbox-wide-dark.jpg)
|

### Payments — Customers

| Light | Dark |
| --- | --- |
|
![payments-customers-light](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/payments-customers-light.jpg)
|
![payments-customers-dark](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/payments-customers-dark.jpg)
|

Widescreen:

| Light | Dark |
| --- | --- |
|
![payments-customers-wide-light](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/payments-customers-wide-light.jpg)
|
![payments-customers-wide-dark](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/payments-customers-wide-dark.jpg)
|

### Sticky behaviour — scrolled views

Grids scrolled down ~600px. The page header is still pinned, and the
`DataGrid` toolbar + column header row stay put under it (backdrop-blur
+ `stickyTop` offset) while the virtualized body rows scroll past.
Compare the scrolled view against the top-of-page view above.

| Page | Light | Dark |
| --- | --- | --- |
| Users |
![users-light-scrolled](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/users-light-scrolled.jpg)
|
![users-dark-scrolled](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/users-dark-scrolled.jpg)
|
| Teams |
![teams-light-scrolled](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/teams-light-scrolled.jpg)
|
![teams-dark-scrolled](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/teams-dark-scrolled.jpg)
|
| Transactions |
![transactions-light-scrolled](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/transactions-light-scrolled.jpg)
|
![transactions-dark-scrolled](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/transactions-dark-scrolled.jpg)
|
| Payments Customers |
![payments-customers-light-scrolled](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/payments-customers-light-scrolled.jpg)
|
![payments-customers-dark-scrolled](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/payments-customers-dark-scrolled.jpg)
|
| Email Outbox |
![email-outbox-light-scrolled](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/email-outbox-light-scrolled.jpg)
|
![email-outbox-dark-scrolled](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/email-outbox-dark-scrolled.jpg)
|
| Analytics Tables |
![analytics-tables-light-scrolled](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/analytics-tables-light-scrolled.jpg)
|
![analytics-tables-dark-scrolled](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/analytics-tables-dark-scrolled.jpg)
|

### Other migrated surfaces

| Page | Light | Dark |
| --- | --- | --- |
| Analytics Tables |
![analytics-tables-light](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/analytics-tables-light.jpg)
|
![analytics-tables-dark](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/analytics-tables-dark.jpg)
|
| Emails |
![emails-light](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/emails-light.jpg)
|
![emails-dark](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/emails-dark.jpg)
|
| Email Sent |
![email-sent-light](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/email-sent-light.jpg)
|
![email-sent-dark](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/email-sent-dark.jpg)
|
| Domains |
![domains-light](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/domains-light.jpg)
|
![domains-dark](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/domains-dark.jpg)
|
| Webhooks |
![webhooks-light](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/webhooks-light.jpg)
|
![webhooks-dark](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/webhooks-dark.jpg)
|
| External DB Sync |
![external-db-sync-light](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/external-db-sync-light.jpg)
|
![external-db-sync-dark](https://gist.githubusercontent.com/mantrakp04/2fe05ddbb2d2d7cd2d237027c909c1b9/raw/external-db-sync-dark.jpg)
|

## What's new

### `DataGrid` in `@stackframe/dashboard-ui-components`

A new, fully-typed, fully-controlled grid component under
`packages/dashboard-ui-components/src/components/data-grid/`. Single
source of truth for tabular UI across the dashboard.

Package files:
- `data-grid.tsx` — main grid renderer (virtualized rows, sticky toolbar
+ header)
- `data-grid-toolbar.tsx` — built-in toolbar (search, columns, density,
export)
- `data-grid-sizing.ts` — column width / flex / min-width resolution
- `state.ts` — state helpers (`createDefaultDataGridState`, sort /
select / paginate utilities, `exportToCsv`, date formatters)
- `strings.ts` — i18n string table + `resolveDataGridStrings`
- `types.ts` — public types (`DataGridColumnDef`, `DataGridProps`,
`DataGridState`, `DataGridDataSource`, etc.)
- `use-data-source.ts` — `useDataSource` hook with `client` / `server` /
`infinite` modes
- `index.ts` — package entrypoint

Features:
- Controlled state (`state` + `onChange`) covering sorting, pagination,
column visibility, column widths, column pinning, selection,
date-display mode, and quick search.
- Column definitions with `string` / `number` / `date` / `dateTime` /
`boolean` / `singleSelect` / `custom` types, custom `renderCell`, custom
sort comparators, per-column `parseValue` / `dateFormat`, pinning,
align, flex / min / max width.
- **Cell overflow control** — new `cellOverflow: "truncate" | "wrap"`
per column. `"wrap"` + `rowHeight="auto"` lets rows grow to fit
multi-line content.
- **Dynamic row heights** — `rowHeight` now accepts `"auto"` with an
`estimatedRowHeight` hint for the virtualizer, eliminating
scroll-position jank while rows are still being measured.
- **Sticky chrome with `stickyTop`** — the toolbar and header stick
under a caller-provided offset (matching the page header height) with a
proper blur backdrop. See the _Sticky behaviour — scrolled views_
section above for the visual.
- Client-side sort + quick-search + pagination via `useDataSource` —
consumer never pre-sorts / paginates.
- Server-side and async-generator data sources for streaming / cursor
pagination.
- Paginated and infinite-scroll UI modes.
- CSV export + clipboard copy.
- Row single / multi selection with shift-range anchor.
- Row + cell click / double-click callbacks.
- Pluggable toolbar / footer / empty / loading states and i18n strings.

### Dashboard design guide

New `apps/dashboard/DESIGN-GUIDE.md`: prescriptive, AI-readable source
of truth for dashboard UI. Documents when to use each
`design-components` primitive, the `DataGrid` canonical pattern, color /
typography / spacing / motion rules, route-specific guidance, and the
migration priority. Now also documents the new `cellOverflow` and
dynamic-`rowHeight` patterns, and marks `DesignDataTable` as deprecated
in favor of `DataGrid` + `useDataSource` + `createDefaultDataGridState`.

### Overview page revamp


`apps/dashboard/src/app/(main)/(protected)/projects/[projectId]/(overview)/line-chart.tsx`
— line chart rewritten on top of the shared `AnalyticsChart` /
`DonutChartDisplay` primitives, feeding the revamped Overview.

### Data-table migrations

Every shared table under `apps/dashboard/src/components/data-table/` has
been rewritten on top of `DataGrid`:

- `api-key-table.tsx`
- `payment-product-table.tsx`
- `permission-table.tsx`
- `team-member-search-table.tsx`
- `team-member-table.tsx`
- `team-search-table.tsx`
- `team-table.tsx`
- `transaction-table.tsx` — now also wires in `DataGridToolbar` with
search / column visibility
- `user-search-picker.tsx`
- `user-table.tsx` — extracted `USER_TABLE_COLUMNS` for readability /
reuse

### Page adoption

Page-level tables migrated to `DataGrid` (or the new `useDataSource` +
`createDefaultDataGridState` pattern):

- `(overview)/line-chart.tsx`
- `analytics/tables/query-data-grid.tsx` (now with sticky header)
- `domains/page-client.tsx`
- `email-drafts/[draftId]/page-client.tsx`
- `email-outbox/page-client.tsx` (with `DataGridToolbar`)
- `email-sent/page-client.tsx`, `grouped-email-table.tsx`,
`sent-emails-view.tsx`
- `emails/page-client.tsx`
- `external-db-sync/page-client.tsx`
- `payments/layout.tsx`, `payments/customers/page-client.tsx`,
`payments/products/[productId]/page-client.tsx`
- `users/[userId]/page-client.tsx`
- `webhooks/page-client.tsx`, `webhooks/[endpointId]/page-client.tsx`
- `design-language/page-client.tsx`,
`design-language/realistic-demo/page-client.tsx`
- `playground/page-client.tsx`

### Backend & supporting changes

- `apps/backend/src/lib/ai/prompts.ts` — extends the AI-analytics prompt
with detailed schema docs for `contact_channels`, `teams`,
`team_member_profiles`, `team_permissions`, `team_invitations`,
`email_outboxes`, `project_permissions`, `notification_preferences`,
`refresh_tokens`, and `connected_accounts`, so natural-language queries
have richer context to compile against.
- `apps/backend/src/lib/seed-dummy-data.ts` — additional OAuth providers
on seed users, improving dummy-data coverage for the migrated tables
(visible on the Users grid).
- `apps/dashboard/src/app/globals.css` — adds `--data-grid-sticky-top`
token used to derive the grid's sticky offset under the page header.
- `packages/template/src/dev-tool/dev-tool-core.ts` — persist the
"closed" state when the user closes the dev-tool panel so it doesn't
reopen on next load.

## Notes for reviewers

- Rebased onto latest `dev`; conflict in `api-key-table.tsx` resolved by
keeping the `DataGrid` implementation (consistent with the other
migrated tables).
- `DesignDataTable` is still in the codebase but marked deprecated in
the design guide — new code must use `DataGrid`.
- `DataGrid` is fully controlled: consumers must pass state + onChange,
must feed `rows` from `useDataSource` (never raw arrays), and must
define columns outside the component or via `useMemo`. The guide's §4.12
spells this out.
- `rowHeight="auto"` is opt-in; the default fixed-height virtualization
path is unchanged and remains the fast path for dense, single-line grids
(users, transactions, etc.).
- Screenshots are JPEG this round — the local capture tooling's PNG path
was producing blank frames, so the new set is `.jpg` end-to-end. Same
viewports, same seeded project.

## Test plan

- [ ] `pnpm lint` passes
- [ ] `pnpm typecheck` passes
- [ ] Load the dashboard and verify every migrated surface renders,
sorts, searches, paginates, and handles row-click navigation:
  - [ ] Overview (line chart + donut metrics)
- [ ] Users list + user detail (teams, sessions, permissions, API keys)
  - [ ] Teams list + team detail (members, permissions)
  - [ ] Domains
  - [ ] Emails, email-sent, email-outbox, email-drafts
  - [ ] Webhooks list + endpoint detail
  - [ ] Payments customers, product detail, transactions (new toolbar)
  - [ ] External DB sync
  - [ ] Analytics query table (sticky header)
- [ ] Verify infinite-scroll surfaces (domains, etc.) load additional
rows on scroll
- [ ] Verify sticky header stays below the page header in light and dark
themes
- [ ] Verify CSV export produces correct output on a representative
table
- [ ] Verify column resize, visibility toggle, and sort work across
themes
- [ ] Verify `cellOverflow: "wrap"` rows grow to fit when
`rowHeight="auto"` and clip when `rowHeight` is numeric
- [ ] Spot-check AI analytics queries against the new schema context
(contact_channels, teams, email_outboxes, …)


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **New Features**
* Unified table components across dashboard with improved infinite
pagination and quick search.

* **Improvements**
* Enhanced table performance with sticky headers and better row height
handling.
* Improved sorting, filtering, and data loading with consistent state
management.
  * Better visual consistency across all data grids and table layouts.

* **UI/Styling**
* Refined table styling for better text truncation and content wrapping.
  * Optimized layout spacing and alignment across dashboard tables.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Developing-Gamer <maxcodes11110@gmail.com>
Co-authored-by: Armaan Jain <84474476+Developing-Gamer@users.noreply.github.com>
Co-authored-by: Konstantin Wohlwend <n2d4xc@gmail.com>
2026-04-27 13:50:24 -07:00
BilalG1
2f719903b1
Redesign Email Server settings + managed domain flow (#1373)
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
DB migration compat / Check if migrations changed (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests (Local Emulator) / E2E Tests (Local Emulator, Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Runs E2E Fallback Tests / E2E Fallback Tests (Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Lint & build / lint_and_build (24) (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
DB migration compat / No migration changes (skipped) (push) Has been cancelled
## Summary

Rewrites the **Email Server** section of the project email settings page
and the managed-domain setup flow. Replaces the dropdown +
conditional-fields layout with a visual four-card picker, a clearer
unsaved-state model, a stepper dialog for managed-domain onboarding, and
a consistent tracked-domains list. Also fixes two data-correctness bugs
in the managed-domain backend.

## Walkthrough (2×, dead-frames trimmed)


![walkthrough](https://raw.githubusercontent.com/stack-auth/stack-auth/pr-assets-email-ui/pr-assets-walkthrough.gif)

## Before

The saved state was a minimal dropdown, but choosing Custom SMTP /
Resend revealed a long conditional form with a hidden gear toggle for
server config, no clear "what is saved" signal, and a separate dialog
pattern for managed domains.

| Saved (Managed) | Custom SMTP selected |
|---|---|
|
![before-managed](https://raw.githubusercontent.com/stack-auth/stack-auth/pr-assets-email-ui/pr-assets-01-before-shared.png)
|
![before-smtp](https://raw.githubusercontent.com/stack-auth/stack-auth/pr-assets-email-ui/pr-assets-02-before-smtp.png)
|

## After — Provider cards

Four visual cards (Stack Shared, Managed Domain, Resend, Custom SMTP)
with updated copy. The saved provider shows a green **Current** pill;
the card the user is previewing shows an amber dashed **Draft** pill. An
amber unsaved-changes banner appears between the picker and the form
when state diverges from saved, so it is unambiguous that a click is not
yet committed.

| Saved state | Previewing a different provider |
|---|---|
|
![after-saved](https://raw.githubusercontent.com/stack-auth/stack-auth/pr-assets-email-ui/pr-assets-03-after-saved.png)
|
![after-draft](https://raw.githubusercontent.com/stack-auth/stack-auth/pr-assets-email-ui/pr-assets-04-after-draft.png)
|

Copy changes:
- **Stack Shared** — "Only default emails — no custom templates, themes,
or sender identity." (was: "Shared (noreply@stackframe.co)")
- **Managed Domain** — "Bring your own domain. You add DNS records; we
handle signing & delivery." (was: "Managed (via managed domain setup)")
- **Resend** uses the official Resend brand mark (light/dark variants in
`apps/dashboard/public/assets/`)

## After — Managed domain list + stepper dialog

Selecting **Managed Domain** immediately shows the tracked-domain list
with an **Add domain** button. Each row reflects real status (Active /
Verified / Waiting for DNS / Verifying / Failed). Exactly one domain can
be **Active** — the one matching the saved email config; every other
verified/applied domain shows a **Use this domain** button so switching
is always possible.

Adding a domain opens a 3-stage dialog with a horizontal stepper (Verify
is right-aligned for the final step). Stage 2 replaces the old bare
NS-list with a proper **Type / Name / Content** DNS records table with
per-row copy buttons.

| Tracked domains list | DNS records table |
|---|---|
|
![after-list](https://raw.githubusercontent.com/stack-auth/stack-auth/pr-assets-email-ui/pr-assets-05-after-managed-list.png)
|
![after-dns-table](https://raw.githubusercontent.com/stack-auth/stack-auth/pr-assets-email-ui/pr-assets-06-after-dns-table.png)
|

## Bug fixes

- **Backend: applying a managed domain did not demote previously-applied
ones.** Multiple rows could end up with status `APPLIED` even though
only one could be in the saved config. New helper
`demoteOtherAppliedManagedEmailDomains({ tenancyId, keepId })` runs
inside `applyManagedEmailProvider` to demote all other applied rows in
the tenancy back to `VERIFIED` before marking the new one.
- **Frontend: "Use this domain" only appeared for `status ===
verified`.** A domain that had been applied then replaced could never be
re-applied from the UI. Button now appears for any `verified` or
`applied` row that is not currently in use; the **Active** label is
derived from config match instead of DB status.
- **Dev mock onboarding now mirrors production timing.**
`shouldUseMockManagedEmailOnboarding()` used to insert domains as
`verified` synchronously. Now the domain is created as
`pending_verification`, and a fire-and-forget `runAsynchronously(() =>
wait(1000))` updates it to `verified` — mirroring the real Resend
webhook flow so the UI states (pending → verifying → verified) are
exercised in local dev.

## Test plan
- [ ] Cards: clicking each card shows `Draft` pill + amber banner;
Discard restores; Save commits and flips `Current` to the new card
- [ ] Managed: Add domain → stage 1 input → stage 2 DNS table + copy →
Check verification flips to stage 3 → Use this domain sets it Active and
demotes the previously-active domain in the list
- [ ] Managed: clicking **Use this domain** on a non-active verified row
makes it Active and the previously-active row back to Verified
- [ ] Shared / Resend / SMTP: existing save + test-email flows still
work (logic preserved verbatim)
- [ ] `pnpm typecheck` (dashboard + backend) and `pnpm lint` pass

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Redesigned email domain setup flow with multi-step verification dialog
  * Added copy-to-clipboard for DNS records
* Enhanced provider selection interface with improved visual
presentation
* Onboarding now shows initial "pending verification" state and
completes verification asynchronously

* **Bug Fixes**
* Ensures only one managed domain becomes active when applying a domain
  * Improved error handling for email configuration saves

* **Tests**
  * Updated end-to-end tests to reflect async verification timing
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-24 13:35:03 -07:00
BilalG1
4a2595d9f7
Classify ClickHouse NO_COMMON_TYPE (386) as unsafe (#1380)
## Summary
- Add ClickHouse error code `386` (`NO_COMMON_TYPE`) to
`UNSAFE_CLICKHOUSE_ERROR_CODES` in
`apps/backend/src/lib/clickhouse-errors.ts`. This stops the Sentry
`StackAssertionError` (`Unknown Clickhouse error: code 386 not in safe
or unsafe codes`) that was firing whenever an admin wrote a query like
`SELECT [1, 'a']` or `SELECT if(1, 'a', 1)`, while keeping the raw error
message out of prod responses.
- Add two e2e regression tests: one against the cross-project
`analytics_internal.users` table, and one against `system.query_log`, to
pin that 386 is wrapped with the generic `Error during execution of this
query.` message in prod (full detail only surfaces in dev/test).

## Why unsafe, not safe
Both callers of `getSafeClickhouseErrorMessage`
(`apps/backend/src/app/api/latest/internal/analytics/query/route.ts:59`
and `apps/backend/src/lib/ai/tools/sql-query.ts:80`) execute
caller-authored SQL under `readonly: "1"` with
`SQL_project_id`/`SQL_branch_id` scoping. The ClickHouse client runs
under a `limited_user` whose grants restrict most tables — but
ClickHouse resolves types **before** enforcing ACL. That means a query
like `SELECT if(1, query, 1) FROM system.query_log` surfaces code 386
with a message like `There is no supertype for types String, UInt8 ...`,
leaking that `system.query_log.query` is a `String` — schema info from a
table the caller can't actually read.

This is the same type-before-ACL class as code 43
(`ILLEGAL_TYPE_OF_ARGUMENT`), which is already classified unsafe.
Classifying 386 as unsafe keeps the defense-in-depth consistent: if
per-customer tables are ever introduced and grants don't block
reference-resolution in time, 386 won't leak their schema.

Cost: in prod, an admin writing a malformed type-mismatch query sees
only `Error during execution of this query.` instead of the supertype
hint. Dev and test environments still show the full error via the
existing `getNodeEnvironment()` branch, so local iteration is
unaffected.

## Test plan
- [x] `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/analytics-query.test.ts` — all
64 tests pass, including the two 386 regression tests.
- [ ] Monitor Sentry after deploy to confirm the
`unknown-clickhouse-error-for-query` events for code 386 stop firing.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Bug Fixes**
* Improved handling of a ClickHouse type-mismatch error to prevent
exposure of sensitive data and ensure sanitized error responses.

* **Tests**
* Added regression tests that verify error responses are sanitized,
return consistent error codes, and include expected headers without
leaking internal details.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-24 12:07:16 -07:00
Mantra
a132dd23f9
fix: refresh-token P2025 race with concurrent sign-out (#1372)
## Summary
- Fixes Sentry
[STACK-BACKEND-146](https://stackframe-pw.sentry.io/issues/7377768662/):
`PrismaClientKnownRequestError` P2025 on
`projectUserRefreshToken.update()` during token refresh.
- Root cause: `generateAccessTokenFromRefreshTokenIfValid`
(`apps/backend/src/lib/tokens.tsx`) reads the refresh-token row
upstream, then issues `.update(...)` on it (and on `projectUser`) inside
a `Promise.all`. If a concurrent sign-out (`DELETE
/auth/sessions/current`), session revoke, password change, or user
deletion removes the row between the read and the update, Prisma throws
P2025 and the refresh endpoint 500s.

## Changes
- `apps/backend/src/lib/tokens.tsx` — swap the two `.update(...)`s for
`.updateMany(...)` so a missing row is a no-op, then re-check the
refresh token still exists; return `null` if it doesn't. The refresh
route already maps `null` -> `KnownErrors.RefreshTokenNotFoundOrExpired`
(401), which is the correct user-facing behavior for a just-revoked
session.
- `apps/backend/src/oauth/model.tsx` — in `generateAccessToken`, replace
the "ultra-rare race condition" `throwErr` fallback with `throw new
KnownErrors.RefreshTokenNotFoundOrExpired()` so concurrent sign-out
during an OAuth `refresh_token` grant returns a clean 401 instead of
500.
-
`apps/e2e/tests/backend/endpoints/api/v1/auth/sessions/current/refresh-race.test.ts`
— new regression test that fires `POST /auth/sessions/current/refresh`
and `DELETE /auth/sessions/current` concurrently with the same refresh
token. Before the fix it 500s on the first iteration; after, it passes
in ~12s.

## Test plan
- [x] New regression test passes locally.
- [x] Existing `auth/sessions/**` + `auth/oauth/token.test.ts` still
pass (27 tests, 3 todo, 0 failed).
- [ ] CI green.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Refresh flows now detect a revoked or removed refresh token during
concurrent operations and stop cleanly, preventing issuance of an access
token from stale data.
* A specific refresh-token-not-found/expired error is returned instead
of a generic failure when refresh cannot proceed.

* **Tests**
* Added E2E tests exercising concurrent refresh vs sign-out to prevent
race-condition crashes and validate safe handling of competing requests.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-24 18:44:39 +00:00
Mantra
7957de4182
fix(email-queue): recover stuck sending without duplicate retry (#1356)
## Summary

Email outbox rows can get stuck in `SENDING` if a worker dies after
setting `startedSendingAt` but before finishing or unclaiming. This
change adds `recoverEmailsStuckInSending`, which runs each email queue
step and marks rows past the stuck timeout as **terminal server errors**
with delivery status unknown, **without** scheduling an automatic retry
(to avoid duplicate sends if the provider already accepted the message).

## Changes

- **`recoverEmailsStuckInSending`**: updates stuck rows with
`finishedSendingAt`, `canHaveDeliveryInfo: false`, and server error
fields; emits Sentry via `captureError` when any rows are recovered.
- **Tests**: `email-queue-step.test.tsx` covers recovery of old
`startedSendingAt`, no-op for recent sends, and idempotency (second pass
does not re-queue).

## Test plan

- [ ] `pnpm` / vitest for
`apps/backend/src/lib/email-queue-step.test.tsx` (requires dev DB like
other integration tests in this package)

Made with [Cursor](https://cursor.com)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Email reliability: messages that remained stuck in sending are now
automatically marked as terminal failures, assigned standardized error
details, cleared from retry scheduling, prevented from receiving
delivery info, and recovery emits an alert only when actual work occurs.
Recovery is safe to run repeatedly (idempotent).

* **Tests**
* Added integration tests validating recovery behavior, proper field
updates, and idempotency.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-24 11:00:46 -07:00
BilalG1
37ee5ec320
Fast-start local emulator via RAM snapshot + live secret rotation (#1340)
## Summary

`stack emulator start` now resumes a fully-warm VM snapshot instead of
cold-booting, bringing startup from 30–120s down to ~5–8s with
per-install secret rotation, or ~2.5s with rotation opt-out. The
snapshot is captured **locally on first `stack emulator pull`**, not
shipped from CI — QEMU migration state isn't portable across
accelerators (KVM/HVF/TCG) or `-cpu max` feature sets, so a CI-captured
snapshot couldn't resume reliably on arbitrary user hardware.

Also bundles a pile of CLI QoL fixes (progress bars, PR/run artifact
pulls, PR-build download, native-TS ISO writer replacing
`hdiutil`/`mkisofs`/`genisoimage` host dep, unit tests).

| Scenario | Before | After |
|---|---|---|
| Cold boot (no snapshot) | 30–120s | same, works as fallback |
| `stack emulator pull` (one-time, includes local snapshot capture) |
~30s download | ~30s download + ~1–3 min cold-boot capture |
| Snapshot resume, normal start | — | **~5–8s** |
| Snapshot resume, `EMULATOR_NO_ROTATION=1` | — | **~2.5s** |

Backend (`/health?db=1`) and dashboard (`/handler/sign-in`) return 200
on all paths. Two successive snapshot resumes produce different rotated
PCK/SSK/SAK/CRON_SECRET values per install.

## How it works

**Build (CI)** — `docker/local-emulator/qemu/build-image.sh`:

1. Cloud-init provisioning runs to completion (migrations, seed,
slim-image) producing `stack-emulator-<arch>.qcow2`.
2. Image is built with a topology compatible with later snapshot capture
(pinned SMP=4, phantom seed/bundle ISOs, STACKCFG runtime ISO mounted at
build time, qemu-guest-agent running, placeholder hex secrets baked in
under `STACK_EMULATOR_BUILD_SNAPSHOT=1`).
3. CI publishes **only the qcow2** — no `.savevm.zst` ships.

**Pull (user's machine)** —
`packages/stack-cli/src/commands/emulator.ts` + `run-emulator.sh
capture`:

1. `stack emulator pull` downloads the qcow2 with a progress bar (or
from a PR / workflow run via `--pr` / `--run`).
2. CLI invokes `run-emulator.sh capture`: cold-boots the qcow2 with a
matching device layout (phantom ISOs, fsdev, pcie-root-port, virtfs
detached — migration-incompatible), waits for backend+dashboard health,
then drives QMP: `stop` → set `mapped-ram` + `multifd` caps → `migrate
file:state.raw` → poll `query-migrate` → `quit`. Raw mapped-ram file is
zstd-compressed to `stack-emulator-<arch>.savevm.zst` in the images dir.
3. `--skip-snapshot` opts out (first `start` will then cold-boot).

**Runtime** — `run-emulator.sh start`:

1. Launch QEMU with `-incoming defer` when a `.savevm.zst` is present;
decompress on first use, keep the `.raw` cached for subsequent starts.
2. QMP: same `mapped-ram` + `multifd` caps → `migrate-incoming
file:<.raw>` → poll for `paused` → `cont`.
3. Generate fresh per-install secrets on the host; pipe them
base64-encoded through QGA `guest-exec input-data` →
`trigger-fast-rotate` in the guest → `docker exec -e … rotate-secrets`.
4. `rotate-secrets` in the container: validate keys (hex-only), targeted
`sed` on the placeholder PCK across built JS, `UPDATE ApiKeySet`,
`supervisorctl restart stack-app cron-jobs` (with
`stopasgroup`/`killasgroup` so the Node children actually die and
release their ports).
5. Poll backend+dashboard health; if anything fails, clean up and fall
back to cold boot transparently.

**Security model**: placeholder hex values are baked into the snapshot
(`00…ff` PCK, `00…ee` SSK, `00…dd` SAK, `00…cc` CRON_SECRET). They are
non-secret by construction. Real per-install secrets are generated at
each `emulator start` and never leave the host.

## CLI changes (`packages/stack-cli`)

- **`src/lib/iso.ts`** (new): native TypeScript ISO 9660 + Joliet
writer, replacing the host-side `hdiutil`/`mkisofs`/`genisoimage`
dependency for generating the STACKCFG runtime config disk. Unit tests
in `src/lib/iso.test.ts`.
- **`src/commands/emulator.ts`**:
- `pull`: streamed downloads with progress bar + ETA; `--pr <number>`
and `--run <id>` to pull from a PR build's CI artifacts (uses
`extract-zip` for the nested zip); `--skip-snapshot` to opt out of the
one-time local capture.
- `start` (existing, extended): auto-pulls AND auto-captures when no
image exists, so first-ever `start` is self-bootstrapping; emits
`STACK_EMULATOR_CLI_WROTE_ISO=1` so the shell helper skips its own ISO
regen (avoids the genisoimage host dep).
- `capture` (new, invoked by `pull` and the auto-pull path of `start`):
drives the local snapshot capture via `run-emulator.sh`.
- `status`, `stop`, `reset`, `list-releases`: preflight +
path-resolution tightening (`STACK_EMULATOR_HOME` → images/run dirs).
  - Unit tests in `src/commands/emulator.test.ts`.
- **`EMULATOR_NO_ROTATION=1`** env var skips the post-resume rotation
(intended for tests/CI where the placeholder secrets are fine — comes
with a loud warning).

## CI (`.github/workflows/qemu-emulator-build.yaml`)

- Builds **QEMU 10.2.2 from source** (cached), because
`mapped-ram`/`multifd` migration capabilities aren't available in the
distro's QEMU. Enables KVM on ubicloud runners so amd64 boots at
hardware speed.
- amd64 + arm64 both build on the same amd64 matrix
(`ubicloud-standard-8`); arm64 runs under cross-arch TCG (provisioning
only — boot/verify smoke test is amd64-only).
- Verification now runs through the CLI: `emulator start` → `emulator
status` → `emulator stop` against the freshly-built qcow2 (via
`STACK_EMULATOR_HOME` pointing at the workspace, so the CLI doesn't
silently auto-pull a prior release).
- Packages **only** the qcow2. No `.savevm.zst` upload / publish.
- Release notes updated.

## Key files

**Shell / guest:**
- `docker/local-emulator/qemu/build-image.sh` — snapshot-compatible
device topology + STACKCFG runtime ISO at build time
- `docker/local-emulator/qemu/run-emulator.sh` — `start`, `capture`,
`stop`, `reset`, `status`; `-incoming defer`, `.raw` cache, QGA-driven
rotation, cold-boot fallback
- `docker/local-emulator/qemu/common.sh` (new) — shared `qmp_session` +
`capture_vm_state` (factored out so build-image.sh and run-emulator.sh
share the capture path)
- `docker/local-emulator/qemu/cloud-init/emulator/user-data` —
placeholder secrets in snapshot mode, `wait-for-stack-ready`,
`trigger-fast-rotate`, qemu-guest-agent enabled
- `docker/local-emulator/rotate-secrets.sh` (new) — in-container
rotation (sed + UPDATE + supervisorctl)
- `docker/local-emulator/supervisord.conf` — `stopasgroup`/`killasgroup`
on `stack-app` and `cron-jobs`
- `docker/local-emulator/entrypoint.sh` — only mint CRON_SECRET if unset
(placeholder supplied in snapshot mode via --env-file)
- `docker/local-emulator/Dockerfile` — ships `rotate-secrets` to
`/usr/local/bin`
- `docker/server/entrypoint.sh` — source
`/run/stack-auth/rotated-secrets.env`; skip full-tree sentinel scan on
warm restarts via marker

**CLI:**
- `packages/stack-cli/src/lib/iso.ts` (new) + `iso.test.ts` (new)
- `packages/stack-cli/src/commands/emulator.ts` + `emulator.test.ts`
(new)
- `packages/stack-cli/vitest.config.ts` (new)

**CI:**
- `.github/workflows/qemu-emulator-build.yaml`

## Test plan

- [x] `docker/local-emulator/qemu/build-image.sh {amd64,arm64}` produces
`stack-emulator-<arch>.qcow2` with snapshot-compatible topology
- [x] `stack emulator pull` downloads qcow2 with progress, then captures
locally (~1–3 min) and writes `stack-emulator-<arch>.savevm.zst` in the
images dir
- [x] `stack emulator pull --skip-snapshot` stops after download
- [x] `stack emulator pull --pr <n>` / `--run <id>` pull from PR /
workflow run artifacts
- [x] `stack emulator start` on a fresh dir auto-pulls **and**
auto-captures, then starts; subsequent starts fast-resume in ~5–8s;
backend + dashboard return 200
- [x] `EMULATOR_NO_ROTATION=1 stack emulator start` completes in ~2.5s;
backend + dashboard return 200 with warning printed
- [x] Two consecutive `emulator start` invocations produce different PCK
values in the internal `ApiKeySet` row
- [x] `stack emulator status` / `stop` / `reset` resolve paths from
`STACK_EMULATOR_HOME`
- [x] Verified end-to-end on arm64 macOS under HVF (capture ~50s,
fast-resume ~6.5s)
- [x] `pnpm lint` and `pnpm typecheck` pass; stack-cli unit tests (iso +
emulator) pass
- [ ] CI green on this PR (qemu-emulator-build matrix, smoke test)
- [ ] `gh release download emulator-<branch>-latest` contains only
`stack-emulator-<arch>.qcow2` once this PR merges and publish runs

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Snapshot fast-start/resume with optional warm-snapshot assets, runtime
ISO generation, and a cached QEMU build to speed emulator setup.
* CLI: streamed artifact downloads with progress, improved release/asset
handling, stronger preflight checks, and start/status/stop emulator
commands.
* Automated secret rotation and ability to apply rotated secrets at
container startup; supervisor control socket enabled.

* **Bug Fixes**
* More robust start/stop/resume flows with automatic fallback to cold
boot and improved process-group shutdown behavior.

* **Tests**
  * New tests for CLI utilities and ISO image generation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-20 14:24:49 -07:00
BilalG1
0621ad2032
ai proxy fix (#1343)
<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Refactor**
* Request sanitization now includes an extra proxy-specific
preprocessing step for safer AI proxying.
* **New Features**
* Initialization prompts centralized into a shared helper, with a
web-specific prompt variant.
* Authenticated requests can optionally route via a provided external
API key to access alternate models.
* **Chores**
* Added and exposed a preprocessing hook with a default no-op
implementation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-19 22:57:38 -07:00