DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
## Summary
Two small test-maintenance fixes that came up while running the suite:
- **Onboarding migration test**
(`apps/backend/prisma/migrations/20260420000000_add_project_onboarding_state/tests/default-and-updates.ts`):
switch the JSON insert from `\${JSON.stringify(onboardingState)}::jsonb`
to `\${sql.json(onboardingState)}`. This matches the pattern used by
every other migration test in the repo (see
`20260214000000_fix_trusted_domains_config/tests/*`) and lets the
`postgres` driver handle serialization and parameter binding
consistently rather than relying on a manual `::jsonb` cast.
- **Internal metrics snapshot**
(`apps/e2e/tests/backend/endpoints/api/v1/__snapshots__/internal-metrics.test.ts.snap`):
update `active_users_by_country.AQ` to list `mailbox-2` before
`mailbox-1`. The `should return metrics data with users` test signs in
`mailbox-1` (mailboxes[0]) into AQ first, then later signs `mailbox-2`
(mailboxes[1]) into AQ, so sorted by `last_active_at_millis desc`
`mailbox-2` should come first. The snapshot now matches that ordering.
No production code is touched — both changes are limited to test
fixtures.
## Test plan
- [ ] `pnpm -C apps/backend test run` (migration tests)
- [ ] `pnpm -C apps/e2e test run internal-metrics` (snapshot test)
- [ ] `pnpm lint`
- [ ] `pnpm typecheck`
Made with [Cursor](https://cursor.com)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Tests**
* No user-facing behavior changed; test flows made more robust and less
flaky (migration validation, metrics ingestion polling, CLI expiry
checks, failed-emails digest expectations).
* **API / Documentation**
* CLI auth default expiration reduced from 2 hours to 2 minutes (updated
OpenAPI defaults and related test expectations).
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
- Introduced a new API endpoint to fetch weekly and daily user metrics
for managed projects.
- Updated the dashboard to utilize this new endpoint, replacing the
previous daily active users data.
- Created a new component to visualize weekly users metrics in the
project cards.
- Refactored existing components to accommodate the new data structure
and ensure proper rendering of user activity charts.
This change enhances the analytics capabilities of the dashboard,
providing better insights into user engagement over time.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* New internal endpoint providing per-project weekly user totals and
7-day daily activity series.
* **Updates**
* Dashboard and project cards switched from DAU to weekly user metrics;
main metric shows weekly users and label reads "users/wk".
* Charts now display weekly-user-aware sparklines alongside daily
activity.
* **Tests**
* Added unit tests covering weekly aggregation and daily-series merging.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
- **Self-hosted CLI**: read `STACK_API_URL` / `STACK_DASHBOARD_URL` from
env in `stack-cli` so the published CLI can talk to self-hosted Stack
Auth installs without a custom build. The existing
`STACK_CLI_PUBLISHABLE_CLIENT_KEY` override is kept as-is.
- **Docker example**: surface the three CLI-relevant vars in
`docker/server/.env.example` so self-host operators see them.
- **Tighter polling-code TTL**: default `2h -> 2min`, max `24h -> 15min`
for the CLI auth polling code. The code is only valid while a user is
actively waiting in `stack login`, so a tight window limits the blast
radius of a leaked code.
- **Raw-SQL poll handler**: convert
`apps/backend/src/app/api/latest/auth/cli/poll/route.tsx` from
`prisma.cliAuthAttempt.*` to raw SQL targeted at the tenancy
source-of-truth schema, matching the pattern already used by the
initiate handler in
`apps/backend/src/app/api/latest/auth/cli/route.tsx`.
## Test plan
- [ ] `pnpm typecheck`
- [ ] `pnpm lint`
- [ ] `pnpm test run` (focus on CLI-auth tests if any)
- [ ] Manual: `stack login` against a local backend
- polling code now expires after ~2 minutes by default
- `waiting` / `success` / `used` / `expired` branches still return
correct status codes and bodies
- [ ] Manual: published `stack-cli` against a self-hosted backend with
`STACK_API_URL` / `STACK_DASHBOARD_URL` set, end-to-end login
Made with [Cursor](https://cursor.com)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Improvements**
* More robust CLI authentication polling with atomic database updates to
prevent races; returns explicit statuses (waiting/expired/used/success)
and provides the refresh token on success.
* **Changes**
* Default CLI auth token TTL reduced to 2 minutes and capped at 15
minutes.
* Anonymous refresh token is considered present only when not null; null
expiry is treated as not-expired.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
### Summary
Reworks `stack init` UX, adds Sentry error reporting to the CLI,
polishes the emulator start flow, and overhauls the local-emulator
dashboard's "Open config file" dialog.
#### `stack init` flow
- **New top-level flow.** Drops the old "link existing vs. create new
local" fork. `init` now asks *where* to create the project — "Stack Auth
Cloud" or "Local". Adds a new `create-cloud` mode that logs the user in,
creates a cloud project, mints keys, and writes `.env` — no round-trip
through the dashboard.
- **Conditional emulator-install warning.** The "Local" choice label
only shows "(requires local emulator installation, ~1.3gb storage
required)" when the QEMU image isn't already on disk; otherwise it shows
"(emulator already installed)". Driven by a new
`isEmulatorImageInstalled()` helper in `commands/emulator.ts`.
- **Auto-create on zero-projects.** When the link-from-cloud path hits
an empty project list, the CLI now prompts *"You don't have any Stack
Auth projects yet. Would you like to create one?"* and, on yes, runs the
same flow as `stack project create`. Skips the pointless "select a
project" prompt when we just created one.
- **MCP-server notice.** Before invoking the coding agent, the CLI
announces that it's also registering the Stack Auth MCP server
(`mcp.stack-auth.com`) so the agent can answer Stack-specific questions
going forward.
- **Local-emulator env header.** When `writeProjectKeysToEnv` runs in
`local` mode it writes a 3-line comment header above the keys explaining
they're emulator-only and only valid while the emulator is running.
- **"What's next" footer.** After setup finishes, prints a short
orientation block: where the sign-up/sign-in routes live
(`/handler/sign-up`, `/handler/sign-in`), how to start the local
emulator (for `create` mode), a dashboard deep link for cloud projects
(respects `STACK_DASHBOARD_URL`), and a docs link.
#### Sentry error reporting (`lib/sentry.ts`, `index.ts`,
`tsdown.config.ts`)
- New `lib/sentry.ts` initializes `@sentry/node` with PII scrubbing
(Stack key prefixes, JWTs, home-dir paths, sensitive field names like
`token`/`secret`/`password`/`dsn`).
- DSN is baked at build time via a tsdown `define` sentinel
(`__STACK_CLI_SENTRY_DSN__`) — no DSN in source, no runtime env-var
dependency for installed users. CI sets `STACK_CLI_SENTRY_DSN_BUILD`
before `pnpm build`.
- Disabled when `NODE_ENV=development` or `CI`. No user opt-out.
- Wired into `main()`'s catch (only for unexpected errors —
`CliError`/`AuthError` still print and exit cleanly) plus
`uncaughtException` and `unhandledRejection` handlers via a
`handleFatal` helper.
#### `stack emulator start` welcome
- After a fresh start (not when reusing a running VM, not when
`--config-file` keeps stdout JSON-only), prints a short "Emulator is up"
block with service URLs (dashboard / backend / inbucket) and common
commands (`status`, `stop`, `reset`, `run`).
#### Local-emulator dashboard "Open config file" dialog
The dialog at `http://localhost:26700` (when no project is loaded) used
to be a single text input asking for an absolute path, with no
explanation of where that path comes from.
**Backend**
(`apps/backend/src/app/api/latest/internal/local-emulator/project/route.tsx`):
- POST is now tolerant of directory paths or paths that don't end in
`.ts`/`.js`/`.mjs` — it appends `stack.config.ts` and creates the file
if missing (`writeConfigToFile` mkdir's parents). Lets users paste a
project folder instead of hunting for the config file.
- New GET endpoint returns up to 20 most-recent `LocalEmulatorProject`
rows joined with their display names, sorted by `updatedAt` desc. Same
`isLocalEmulatorEnabled()` + client-auth gating as POST.
**Dashboard**
(`apps/dashboard/src/app/(main)/(protected)/(outside-dashboard)/projects/page-client.tsx`):
- Title changed to "Open your Stack Auth project". Description now
explicitly ties the file to `stack init`: *"Point the local dashboard at
the `stack.config.ts` in your project. If you just ran `stack init`, it
was created at the root of that project."*
- Added: *"Don't have one yet? Paste your project folder path instead
and we'll create stack.config.ts for you."*
- Recent-projects list (clickable rows that prefill the input) fetched
from the new GET endpoint when the dialog opens.
- OS-specific copy-path tip below the input (macOS ⌥-Copy as Pathname,
Windows Shift+RC Copy as path, Linux `realpath`).
- "Open project" button is disabled when the input is empty.
- All error paths (empty input, non-absolute path, server errors,
exceptions) surface via destructive toasts instead of throwing.
Why no native file picker: browsers do not expose absolute filesystem
paths from `<input type="file">`, drag-and-drop, or the File System
Access API. The backend requires an absolute path, so a Finder-style
picker isn't possible from a web page. The recent list + OS tips are the
workaround.
### Goal
The previous `init` flow dead-ended new users: if you had no project you
got an error telling you to go create one in the dashboard and come
back. The happy path also forced a choice between "link existing" and
"create local emulator" — not the question most users are trying to
answer. The emulator dashboard's open-project dialog had similar
friction: an unexplained path field with no recall of previously-opened
projects. And the CLI silently swallowed unexpected errors with no
telemetry. This branch makes the first-run path work end-to-end from the
terminal, gives the emulator dashboard a usable open-project surface,
and turns CLI crashes into actionable bug reports.
### How to review
- Start with `packages/stack-cli/src/commands/init.ts` — the whole
user-facing flow lives in `runInit`. Mode dispatch at the top,
`handleCreateCloud` is the new cloud branch, `printNextSteps` is the
footer, the MCP notice prints right before `runClaudeAgent`.
- `packages/stack-cli/src/lib/sentry.ts` is small and self-contained;
the sentinel-replacement contract is in `tsdown.config.ts`'s `define`
block. Confirm `dist/index.js` contains zero `__STACK_CLI_SENTRY_DSN__`
occurrences after a build with the env var unset, and the actual DSN
host after a build with it set.
- `packages/stack-cli/src/commands/emulator.ts` —
`printEmulatorWelcome()` is the welcome block;
`isEmulatorImageInstalled()` is the new exported helper used by
`init.ts`.
-
`apps/backend/src/app/api/latest/internal/local-emulator/project/route.tsx`
— the directory-tolerance branch is in the POST handler around the
`looksLikeConfigFile` check; the GET handler is appended at the bottom.
-
`apps/dashboard/src/app/(main)/(protected)/(outside-dashboard)/projects/page-client.tsx`
— dialog markup, recent-list fetch effect, `pathCopyTip` memo, and the
toast-based error handling in `handleOpenConfigFile`.
- Non-interactive (CI) paths stay strict: empty-project list still
errors with a pointer to `stack project create --display-name`. No
surprise project creation in CI.
- No tests. The CLI has no harness for the interactive flow;
verification is manual.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Recent local emulator projects listed in the config dialog for quick
selection.
* New CLI create-cloud mode and --display-name flag; interactive cloud
project creation and clearer next steps.
* Emulator start shows a welcome banner with service URLs when a new
instance starts.
* **Improvements**
* Config dialog UX, validation, error-toasting, and platform-aware copy
refined; “Open project” disabled for empty/invalid paths.
* CLI: centralized interactive project creation and improved fatal error
handling.
* **Chores**
* Sentry added and initialized for CLI error reporting.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Bilal Godil <bg2002@gmail.com>
## Summary
Splits the Stack Auth MCP server out of `apps/backend` and into a
dedicated Next.js app at `apps/mcp/`, served on port `:42` (suffixed via
`NEXT_PUBLIC_STACK_PORT_PREFIX`) and exposed in production at
`https://mcp.stack-auth.com/mcp`. The backend no longer carries the MCP
transport route; clients now point at the new host.
Base: `dev` → Head: `chore/move-mcp-to-a-sep-app`
Scope: 34 files, +1425 / −353
## What changed
- **New app** `apps/mcp/` — standalone Next.js + `@vercel/mcp-adapter`,
with:
- `src/app/api/internal/[transport]/route.ts` — MCP transport handler
(moved from backend)
- `src/app/mcp/route.ts`, `src/app/route.ts` — public landing + setup
page
- `src/app/health/route.ts` — health check
- `src/mcp-handler.ts`, `src/setup-page.ts`, `src/analytics.ts`
- **Backend** drops
`apps/backend/src/app/api/internal/[transport]/route.ts` (−105) — MCP
code is gone from the backend image.
- **Dashboard** install hint updated to point at
`https://mcp.stack-auth.com/mcp` (was `/`).
- **Dev launchpad** gets an MCP tile so the new service shows up
alongside the rest of the local stack.
- **CI** workflows (`db-migration-backwards-compatibility`,
`e2e-api-tests*`) start the MCP service in the background before running
tests.
- **Docs** (`docs-mintlify`, `docs/`) and `init-stack` / `init-prompt`
updated to reference the new URL.
- **E2E** `apps/e2e/tests/backend/endpoints/api/v1/internal/mcp.test.ts`
reworked to hit the new host; `helpers.ts` and env files gain an MCP
base-URL var.
## Visuals
### New `apps/mcp` setup page (`https://mcp.stack-auth.com/`)
The standalone app's root now serves a self-contained MCP setup guide
with per-client instructions (Cursor, VS Code, Codex, Claude Code,
Claude Desktop, Windsurf, ChatGPT, Gemini CLI):

### Dev launchpad now lists the MCP service
New tile at port suffix `:42`, importance 2, alongside Backend /
Dashboard / Demo app:

## Notes for reviewers
- The MCP transport endpoint moved path: it was mounted under
`/api/internal/[transport]` in the backend; in the new app it's at the
same path but on the dedicated host. The public-facing URL is
`https://mcp.stack-auth.com/mcp`.
- `apps/mcp` ships its own PostHog analytics client (`src/analytics.ts`)
so the backend doesn't have to proxy events for it anymore.
- Port allocation: `${PORT_PREFIX}42` (default `8142` in dev). Picked to
fit the existing dev-launchpad importance-2 row.
- No DB migrations.
## Test plan
- [x] `apps/mcp` builds and `pnpm dev` serves on `:8142`
- [x] Dev launchpad renders the new MCP tile (screenshot above)
- [x] MCP setup page renders client tabs (screenshot above)
- [x] E2E `mcp.test.ts` updated to hit the new host
- [ ] CI green on `e2e-api-tests*` and
`db-migration-backwards-compatibility` workflows (they were touched to
start the MCP service)
- [ ] `init-stack` / `mcp.ts` install flow lands users on the new URL
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Standalone MCP app added with a public /mcp endpoint and health check.
* MCP appears in the dev-launchpad apps list.
* **Documentation**
* MCP endpoint updated to https://mcp.stack-auth.com/mcp in all setup
guides and installer snippets.
* Setup page enhanced with detailed client install tabs and
instructions.
* **Chores**
* MCP service integrated into CI/e2e workflows and local env configs.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
### Context
There are some kinks to work out with deploying plan limits onto prod,
so we'd like to disable it temporarily.
### Summary of Changes
We update all call sites of the item quantity things with a flag based
check. Idea is when flag is set to true, it should function as if there
are no limits.
## Summary
- Renames the env var `STACK_SEED_INTERNAL_PROJECT_SECRET_SERVER_KEY` to
`STACK_INTERNAL_PROJECT_SECRET_SERVER_KEY` everywhere it is used (20
occurrences across 8 files), covering backend env files, the Prisma seed
script, runtime config, and the docker entrypoint/local-emulator
scripts.
- Mirrors the prior publishable-client-key rename in #1411.
## Test plan
- [x] `pnpm lint`
- [x] `pnpm typecheck`
- [ ] Verify local emulator still boots with the renamed variable
- [ ] Verify any deploy/CI configs that set the old name are updated
alongside this change
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Chores**
* Updated internal environment variable naming for API key management
and server configuration consistency across backend systems, Docker
deployment, and local development setup.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
- Renames the env var
`STACK_SEED_INTERNAL_PROJECT_PUBLISHABLE_CLIENT_KEY` to
`STACK_INTERNAL_PROJECT_PUBLISHABLE_CLIENT_KEY` everywhere it is used
(24 occurrences across 9 files), covering backend env files, the Prisma
seed script, runtime config, and the docker entrypoint/local-emulator
scripts.
## Test plan
- [x] `pnpm lint`
- [x] `pnpm typecheck`
- [ ] Verify local emulator still boots with the renamed variable
- [ ] Verify any deploy/CI configs that set the old name are updated
alongside this change
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Chores**
* Updated internal environment variable naming for consistency across
backend configuration files and deployment scripts.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
The `POST /api/latest/analytics/events/batch` endpoint was being dropped
by content-blocking browser extensions (adblockers) because the JSON
request body literally contains the substring `$click`. Many filter
lists pattern-match on tokens like that and silently kill the request —
analytics events from anyone with an adblocker enabled never reached our
backend.
This PR encodes the request body so keyword-matching filters can't see
those tokens, while keeping the URL path unchanged (only the body was
being matched here) and keeping older SDK clients working.
## Approach
- **Client**: gzip the JSON payload via the browser-native
`CompressionStream("gzip")` API and POST it as
`application/octet-stream`. Falls back to plain JSON if
`CompressionStream` isn't available (very old browsers / non-browser
runtimes).
- **Server**: a yup `.transform()` on the body schema detects an
`ArrayBuffer`/`Uint8Array` input, gunzips it, and `JSON.parse`s before
normal schema validation runs. The existing JSON path is untouched, so
requests from older SDK versions in the wild continue to work without
changes — and all existing schema-error snapshot tests still pass
verbatim.
- **Safety**: hard caps on compressed (1 MB) and decompressed (8 MB)
sizes guard against zip-bomb shaped abuse. `node:zlib`'s
`maxOutputLength` enforces the latter at the C++ layer.
Bonus: gzip also gives a meaningful bandwidth win — click/page-view
events compress very well — and keepalive bodies (which have a 64 KB cap
in browsers) get more headroom.
## Files
- `apps/backend/src/app/api/latest/analytics/events/batch/route.tsx` —
body schema gains `.transform()` that gunzips binary inputs; size limits
added; everything else unchanged.
- `packages/stack-shared/src/interface/client-interface.ts` —
`sendAnalyticsEventBatch` now routes through a new module-level
`encodeAnalyticsBody` helper that gzips and switches Content-Type. Same
outer signature; encoding is internal.
- `apps/e2e/tests/backend/backend-helpers.ts` — `niceBackendFetch` gains
optional `rawBody`/`rawContentType` params so tests can send non-JSON
payloads. Existing JSON callers unaffected.
-
`apps/e2e/tests/backend/endpoints/api/v1/analytics-events-batch.test.ts`
— adds two tests:
- happy path: gzipped binary body returns `inserted: 1`
- sad path: garbage bytes return 400
## Out of scope (intentional)
- **URL path renaming**: not all adblockers match on `/analytics/`, but
some do. We're shipping the body fix first and will revisit if requests
still get blocked after deployment.
- **Encryption**: gzip is enough to defeat keyword filters. Encryption
adds key-management cost with no real adversary.
- **SDK regen**: only `client-interface.ts` (in `stack-shared`) was
touched; `event-tracker.ts` (the caller) is unchanged because it already
passes a JSON string. No `pnpm -w run generate-sdks` needed.
## Test plan
- [x] `pnpm typecheck` — green
- [x] `pnpm lint` — green
- [ ] Manually verify in dev: enable adblocker, click around with
analytics enabled, confirm batch requests now go through
- [ ] Spot-check ClickHouse `analytics_internal.events` shows the
expected rows
- [ ] Run the new e2e tests (`pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/analytics-events-batch.test.ts`)
and confirm both new cases plus all preexisting snapshots pass
- [ ] Confirm the JSON back-compat path still works by hitting the route
with the existing JSON-body curl/test payloads
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Analytics batch uploads now accept gzipped binary payloads; clients
can send compressed bytes and the server will detect and decompress.
* Client sender can gzip event batches (falls back to JSON) and uses
keepalive to choose JSON vs compressed bytes.
* **Bug Fixes**
* Malformed, non-gzip, or overly-large compressed payloads now return a
clear 400 response.
* **Tests**
* Added E2E and unit tests plus test-helper support for raw/gzipped
request bodies and encoding behaviors.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!--
Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Tabbed user profile with Activity (30-day analytics, KPIs, daily
chart, top lists, recent events), Payments (transactions, subscriptions,
product/item balances) and an activity heatmap sidebar.
* New internal user-activity API and admin-facing activity hook; admin
API client can fetch per-user activity.
* **UI/UX Improvements**
* Unified menus, cards and tables; inline editable user details with
accept/revert; metadata editor validates JSON; country-code input has
draft editing; tabs support optional icons.
* **API**
* Transactions endpoint and admin transaction queries now support
optional customer-scoped filtering.
* **Tests**
* End-to-end coverage for the user-activity endpoint.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<img width="1326" height="752" alt="image"
src="https://github.com/user-attachments/assets/97c04dca-db59-4357-98b1-8eae5a7a3673"
/>
<img width="1142" height="251" alt="image"
src="https://github.com/user-attachments/assets/e1aa44fc-0d7e-436d-90a5-c7cb15155e24"
/>
<img width="1170" height="1125" alt="image"
src="https://github.com/user-attachments/assets/bf6659fd-a9b5-4ae6-a13d-dab9956ad650"
/>
### Suggested Review Areas
Please see `plans.ts` and `seed.ts` to verify whether the item caps are
where they should be. Outside of that, each commit should be atomic so
stepping through the commits should give you an idea of how I
implemented each limit.
### Discussion
Something to discuss: when a user cancels team/growth we regrant free
fine, but any extra-seats they had just keeps billing. So they end up
paying ~$29/mo per extra-seat on top of free's 1 seat, which is strictly
worse than just staying on team. This surfaced while manually testing
this PR, we only enforce the add-on base requirement at purchase time,
nothing cascades on cancel. Should we cascade cancel add ons?
### Context
Now that we have a stable suite of products for stack-auth, we want to
limit the items under each product a customer has access to based on
their plan. So for example, a free plan user has a certain amount of
emails they can send out each month, and so on. We try to implement
limits in this PR.
### Summary of Changes
Implemented hard limits for dashboard admins, analytics per-query
timeouts, sent email monthly capacity, events, and session replays.
Implemented a soft cap for auth users (where if there's a signup beyond
the limit, we log it to sentry so we can manually choose to email that
user/team).
For auth users, we do not block new user sign ups once plan limit has
been hit. We also don't degrade or impact the customer experience. It
logs to sentry and it is up to us to take manual action to email the
user to upgrade the plan. Also, implementation wise, we count all the
users across all the projects for this team and compare it to their plan
item limit, rather than debiting items like we do for other approaches.
As a soft cap, this should be fine plus this is a better source of
truth.
For email capacity, we operate a monthly limit of emails. Once this is
hit, no more emails can be sent until the next month/ a plan upgrade.
These emails will be treated as a send error, so they can be manually
resent once the capacity is reset. With respect to the `email-queue`
state engine, they go from `SENDING`->`SERVER_ERROR`, hooking into the
existing state engine flow, with an external error that shows it's
because of the rate limit. This is cleaner than inventing a new state
that is identical for all intents and purposes to `SERVER_ERROR`. We
check in processSingleEmail since that maps to the sending state.
For analytics query timeouts, the backend route accepts a timeout
parameter with the request. The way we implement the timeout for each
query is by taking the `min(request_timeout,plan_timeout)` and using
that. This determines how long a query can run for.
For analytics events, there are server-side events (like refresh token
refreshes or sign up rule triggers) and client side events (like page
views or clicks). When these events occur, they are written to the
events table in clickhouse. We choose to implement a hard cap for the
total events, not just server side or client side. Once the cap is hit,
we stop storing the events and display a banner on the analytics page. A
different banner renders when we are at >=80% of total plan capacity.
For session replays, we stop creating new session replays when the limit
is hit. Old replays can still have chunks appended to them. The source
of truth here is the session replay table- a new replay corresponds to a
new row in the table. We have similar banners as to the events.
Dashboard admins should be 4 for both team and unlimited.
#### Implementation Caveats
For debiting items across these limits, we now use `tryDecreaseQuantity`
at the beginning. This means we debit first if possible before
conducting the action (like writing events to clickhouse). In practice,
this means that if clickhouse fails, then the user is debited for
something that doesn't happen. However trying to build a refund
workaround would be very clunky, and also, clickhouse is reliable. For
debits that are very small in the order of things (say, 200 items on a
100k plan), it doesn't mean much.
For emails, we don't debit items if it's a retry. This prevents the user
for being charged multiple times for effectively one email.
### UI Changes
The only UI changes in this PR are having certain banners render in
analytics when a customer is approaching/ is at their monthly limit of
session replays or events.
### Out of Scope for this PR
We do not have metered pricing yet, so events/session replays/ email use
beyond the limits cannot be charged yet. This is why for this
implementation, we rely on hard and soft caps.
We do not implement payment per-transaction pricing yet. That is
deferred to a followup PR.
The UI for the onboarding call will be set up as part of the overall
onboarding flow which doesn't exist yet, so it has been deferred.
Since the UI for the dashboard home page and project/account settings is
currently being reworked, finding a better spot for plan upgrades is not
handled in this PR.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Session replays added as a monthly included entitlement; onboarding
calls added to Team/Growth plans. Dashboard banners warn about
analytics-event and session-replay limits. Projects page adds extra-seat
flow and improved invitation error handling.
* **Behavior Changes**
* Monthly renewal semantics for emails-per-month and analytics-events;
analytics query timeouts now respect plan limits and are clamped. Email
sends, analytics events, and new session creation are blocked when
quotas are exhausted. Growth plan seats set to 4.
* **Tests**
* E2E and unit tests added to verify quota enforcement and free-plan
regranting.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Mantra <87142457+mantrakp04@users.noreply.github.com>
## Summary
Two authorization fixes in the backend. Both are pre-existing in `dev`
and were found during a security audit of `apps/backend/src`.
### 1. Team invitation accept — email not validated
[`team-invitations/accept/verification-code-handler.tsx`](https://github.com/stack-auth/stack-auth/blob/dev/apps/backend/src/app/api/latest/team-invitations/accept/verification-code-handler.tsx)
destructured the invited email as `{}` and only used `data.team_id` +
the accepting `user`. Any signed-in user in the tenancy who possessed
the 45-char code could join the team as themselves — the invitation was
not actually bound to the email it was addressed to.
**Attack scenarios that work without this fix**
- Forwarded invitation email (shared inbox, assistant inbox,
auto-forward rules).
- Screenshot of the invitation link pasted into Slack / Notion.
- Insider with server-access reading the email outbox (`GET
/api/latest/emails/outbox` returns rendered `html` +
`variables.teamInvitationLink`).
- Stale invite still sitting in spam after the invitee forwarded it
elsewhere.
**Fix.** The accept handler now requires that the accepting user owns
the invited email as a *verified* contact channel on their account.
Matches the invariant already used by the "list invitations for me"
endpoint
([`team-invitations/crud.tsx:41-66`](https://github.com/stack-auth/stack-auth/blob/dev/apps/backend/src/app/api/latest/team-invitations/crud.tsx#L41-L66)).
Rejections return a new `TEAM_INVITATION_EMAIL_MISMATCH` (403) error.
### 2. Verification-code handler TOCTOU
[`route-handlers/verification-code-handler.tsx`](https://github.com/stack-auth/stack-auth/blob/dev/apps/backend/src/route-handlers/verification-code-handler.tsx)
had a classic read-then-write TOCTOU:
```ts
const verificationCode = await prisma.verificationCode.findUnique(...);
if (verificationCode.usedAt) throw new KnownErrors.VerificationCodeAlreadyUsed();
// ... validation ...
await prisma.verificationCode.update({ data: { usedAt: new Date() } }); // unconditional
return await options.handler(...);
```
Five concurrent requests with the same code all pass the `if (usedAt)`
gate, all mark the code used, all run the post-handler. For OTP sign-in
the handler calls `createAuthTokens` which writes a fresh
`projectUserRefreshToken` row per call — so **one OTP → N refresh
tokens**. `auth/sessions/current` only revokes by `id: refreshTokenId`
and there is no bulk-revoke for passwordless users (only password change
in
[`users/crud.tsx:1210`](https://github.com/stack-auth/stack-auth/blob/dev/apps/backend/src/app/api/latest/users/crud.tsx#L1210)
does `deleteMany`). A phished OTP therefore becomes a
session-persistence primitive.
**Fix.** Replace the unconditional `update` with a conditional
`updateMany({ where: { …, usedAt: null } })` executed before
`options.handler`; if `count === 0` the race was already lost and we
throw `VERIFICATION_CODE_ALREADY_USED` (409). This also benefits MFA
sign-in and passkey sign-in, which share the same handler.
## Changes
| File | Change |
|---|---|
| `team-invitations/accept/verification-code-handler.tsx` | Require
verified contact channel matching `method.email` |
| `route-handlers/verification-code-handler.tsx` | Atomic `updateMany`
claim gated on `usedAt: null` |
| `stack-shared/src/known-errors.tsx` | New
`TeamInvitationEmailMismatch` (403) |
| `e2e/.../team-invitations.test.ts` | Two new tests (mismatch + happy
path) |
| `e2e/.../auth/otp/sign-in.test.ts` | One new test: 5 parallel
redemptions of one OTP → 1× 200 + 4× 409 |
## Test plan
- [x] `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/team-invitations.test.ts` —
27/27 pass
- [x] `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/auth/otp/sign-in.test.ts` —
12/12 (+ 4 pre-existing `it.todo`)
- [x] `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/auth/password` — 33/33 (+ 7
pre-existing todos)
- [x] `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/contact-channels` — 24/24
- [x] `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/auth/passkey
apps/e2e/tests/backend/endpoints/api/v1/auth/mfa` — 16/16
- [x] `pnpm --filter @stackframe/backend typecheck` — clean
- [x] `pnpm --filter @stackframe/backend lint` + `pnpm --filter
@stackframe/stack-shared lint` — clean
## Notes
- The broader "plaintext credentials in DB + Sentry logs every header"
finding from the same audit is **not** in this PR — a scrubber for
`Sentry.setContext` request headers + unit tests is prepared on a local
stash and will go out as a separate PR.
- The team-invitation fix does not require any config change; fresh
signups via the OTP / password flows that set `primary_email_verified:
true` during creation already land the user with a verified channel
matching the invited email, so the happy path is unaffected.
### Follow-up review (Codex)
Addressed in follow-up commit `954cddb`:
- **Finding 1 (High)**: mismatched invite acceptance was consuming the
invitation before rejecting. Moved the email-ownership check into the
pre-claim `options.validate` hook so a wrong-email attempt leaves
`usedAt` untouched and the real recipient can still redeem. New test
asserts this end-to-end.
- **Finding 3 (Medium)**: invitation stored `body.email` raw but contact
channels are stored via `normalizeEmail`, so case-varied invites (e.g.
`Alice@Example.com`) wouldn't match a `alice@example.com` channel.
`send-code` now normalizes on storage and `accept` normalizes on compare
for back-compat with already-issued invites. New test covers the
mixed-case path.
- **Finding 2 (partial)**: added `expiresAt > now` to the atomic claim
predicate for the boundary case where a code expires between the read
and the claim. The reviewer's broader point about the `attemptCount`
rate-limit check being non-atomic with its own increment **pre-dates
this PR** (it reads the in-memory `verificationCode.attemptCount` from
line 150, not a fresh read) and exists independently of the `usedAt`
TOCTOU I'm fixing here. Tracking that as a separate follow-up so this PR
stays scoped to the two originally-flagged issues.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Invite acceptance now requires the invitee’s verified, normalized
(case‑insensitive) email; mismatches return HTTP 403
(TEAM_INVITATION_EMAIL_MISMATCH).
* Client APIs now surface the new email-mismatch error alongside
verification errors.
* **Bug Fixes**
* OTP verification codes are now guarded against parallel double‑redeem
so only one request succeeds.
* **Tests**
* Added E2E tests for invitation email validation, non‑consuming
rejection, case‑insensitive matching, and OTP concurrency.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
Fixes preview dummy payments seed data so seeded products and items
match their team-scoped product lines.
## Root Cause
The preview seed configured `workspace` and `add_ons` product lines with
`customerType: "team"`, but the products inside those lines (`starter`,
`growth`, and `regression-addon`) were configured as `customerType:
"user"`. Environment override writes validate against the rendered
branch config, so unrelated environment updates could fail with a
product/product-line customer type warning.
## Changes
- Mark preview dummy payments products and included items as
team-scoped.
- Export the dummy payments setup helper for focused validation.
- Add a regression test that validates the generated branch payments
override has no config override errors or incomplete config warnings.
## Validation
Passed in the original checkout with dependencies installed:
- `STACK_SKIP_TEMPLATE_GENERATION=true pnpm exec vitest run --config
vitest.config.ts src/lib/seed-dummy-data.test.ts --reporter=verbose
--maxWorkers=1 --minWorkers=1`
- `pnpm -C apps/backend lint src/lib/seed-dummy-data.ts
src/lib/seed-dummy-data.test.ts`
- `pnpm -C apps/backend typecheck`
The temporary clean worktree used for this PR did not have
`node_modules`, so dependency-backed commands were not rerun there.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Improvements**
* Strengthened payment product configuration with tighter typing and
validation
* Normalized product customer types (switched relevant dummy data from
user to team) for consistency
* **Tests**
* Added tests validating dummy payments configuration and
branch/override validation
* **Documentation**
* Added Q&A documenting a configuration validation failure mode and
required consistency for dummy payments data
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
- Move the `/api/internal/[transport]` MCP route from the docs app to
the backend, so the public `ask_stack_auth` MCP tool is served from the
same origin as the AI query API it proxies to.
- Replace the bespoke docs-tools HTTP client in
`apps/backend/src/lib/ai/tools/docs.ts` with an `@ai-sdk/mcp` client
that talks to Mintlify's generated MCP server. The backend AI agent now
consumes Mintlify's lower-level search/fetch tools directly instead of
going through the docs app.
- Swap `STACK_DOCS_INTERNAL_BASE_URL` for `STACK_MINTLIFY_MCP_URL`
(defaults to the Mintlify-hosted MCP URL).
- Move the `@vercel/mcp-adapter` dependency from `docs` to
`apps/backend`.
## Test plan
- [ ] `pnpm typecheck`
- [ ] `pnpm lint`
- [ ] e2e: new
`apps/e2e/tests/backend/endpoints/api/v1/internal/mcp.test.ts` covers
`tools/list` and validation on `tools/call`
- [ ] Manual: hit `POST /api/internal/mcp` on the backend and confirm
`ask_stack_auth` is listed and callable
- [ ] Manual: confirm backend AI agent docs tools resolve via the
Mintlify MCP URL
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Backend docs tooling now uses a Mintlify MCP server for documentation
tools and discovery.
* **Chores**
* Development environment variables updated to point to the Mintlify MCP
endpoint.
* Backend dependency added to support MCP integration; docs package
dependency removed.
* **Tests**
* Added end-to-end tests for the internal MCP endpoint and tool
validation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
# Shareable Session Replay Links
Adds the ability to share individual session replays via unique, direct
URLs.
https://www.loom.com/share/1e3298a19b114fc38af4bc43dcd5ec48
## What changed
- New admin endpoint — GET /api/v1/internal/session-replays/:id
- Fetches a single session replay by ID with user metadata (display
name, primary email) and chunk/event counts
- Returns 404 if the replay doesn't exist
- Admin-only access, consistent with the existing list endpoint
## New standalone replay page —
/projects/:projectId/analytics/replays/:replayId
- Thin server page wrapper that passes the replay ID to the existing
PageClient
- PageClient detects standalone mode via initialReplayId prop and
fetches replay metadata directly instead of loading the full session
list
- Sidebar is hidden; the replay viewer takes the full width
- "Back to all replays" link shown under the page title
## Copy link button
- Moved from per-session sidebar items to the replay viewer header (next
to the settings gear)
- Copies a direct URL to the currently selected replay
## SDK plumbing
- AdminGetSessionReplayResponse type in stack-shared
- getSessionReplay() on StackAdminInterface, StackAdminApp interface,
and _StackAdminAppImplIncomplete
## Tests
- Happy path: fetch single replay by ID with inline snapshot
- 404 for nonexistent replay ID
- 401 for non-admin access (client and server)
## Test plan
- [ ] Open /analytics/replays, select a replay, click the link icon in
the header — verify URL is copied to clipboard
- [ ] Paste that URL in a new tab — verify the standalone replay page
loads and plays the correct replay
- [ ] Verify "Back to all replays" link navigates back to the list page
- [ ] Verify the original /analytics/replays list page still works as
before (selecting, filtering, pagination)
- [ ] Run pnpm test run session-replays
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Backend: internal endpoint to fetch a single session replay with user
info, millisecond timestamps, and chunk/event counts.
* Admin SDK/App: added response type and admin method to retrieve a
single session replay; admin app maps response into the app model.
* Dashboard: standalone session-replay page, UI adjustments for
standalone mode, and a “copy replay link” button.
* **Tests**
* Added end-to-end tests for retrieval, not-found, and access-control
scenarios.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
## Summary
Rewrites the **Email Server** section of the project email settings page
and the managed-domain setup flow. Replaces the dropdown +
conditional-fields layout with a visual four-card picker, a clearer
unsaved-state model, a stepper dialog for managed-domain onboarding, and
a consistent tracked-domains list. Also fixes two data-correctness bugs
in the managed-domain backend.
## Walkthrough (2×, dead-frames trimmed)

## Before
The saved state was a minimal dropdown, but choosing Custom SMTP /
Resend revealed a long conditional form with a hidden gear toggle for
server config, no clear "what is saved" signal, and a separate dialog
pattern for managed domains.
| Saved (Managed) | Custom SMTP selected |
|---|---|
|

|

|
## After — Provider cards
Four visual cards (Stack Shared, Managed Domain, Resend, Custom SMTP)
with updated copy. The saved provider shows a green **Current** pill;
the card the user is previewing shows an amber dashed **Draft** pill. An
amber unsaved-changes banner appears between the picker and the form
when state diverges from saved, so it is unambiguous that a click is not
yet committed.
| Saved state | Previewing a different provider |
|---|---|
|

|

|
Copy changes:
- **Stack Shared** — "Only default emails — no custom templates, themes,
or sender identity." (was: "Shared (noreply@stackframe.co)")
- **Managed Domain** — "Bring your own domain. You add DNS records; we
handle signing & delivery." (was: "Managed (via managed domain setup)")
- **Resend** uses the official Resend brand mark (light/dark variants in
`apps/dashboard/public/assets/`)
## After — Managed domain list + stepper dialog
Selecting **Managed Domain** immediately shows the tracked-domain list
with an **Add domain** button. Each row reflects real status (Active /
Verified / Waiting for DNS / Verifying / Failed). Exactly one domain can
be **Active** — the one matching the saved email config; every other
verified/applied domain shows a **Use this domain** button so switching
is always possible.
Adding a domain opens a 3-stage dialog with a horizontal stepper (Verify
is right-aligned for the final step). Stage 2 replaces the old bare
NS-list with a proper **Type / Name / Content** DNS records table with
per-row copy buttons.
| Tracked domains list | DNS records table |
|---|---|
|

|

|
## Bug fixes
- **Backend: applying a managed domain did not demote previously-applied
ones.** Multiple rows could end up with status `APPLIED` even though
only one could be in the saved config. New helper
`demoteOtherAppliedManagedEmailDomains({ tenancyId, keepId })` runs
inside `applyManagedEmailProvider` to demote all other applied rows in
the tenancy back to `VERIFIED` before marking the new one.
- **Frontend: "Use this domain" only appeared for `status ===
verified`.** A domain that had been applied then replaced could never be
re-applied from the UI. Button now appears for any `verified` or
`applied` row that is not currently in use; the **Active** label is
derived from config match instead of DB status.
- **Dev mock onboarding now mirrors production timing.**
`shouldUseMockManagedEmailOnboarding()` used to insert domains as
`verified` synchronously. Now the domain is created as
`pending_verification`, and a fire-and-forget `runAsynchronously(() =>
wait(1000))` updates it to `verified` — mirroring the real Resend
webhook flow so the UI states (pending → verifying → verified) are
exercised in local dev.
## Test plan
- [ ] Cards: clicking each card shows `Draft` pill + amber banner;
Discard restores; Save commits and flips `Current` to the new card
- [ ] Managed: Add domain → stage 1 input → stage 2 DNS table + copy →
Check verification flips to stage 3 → Use this domain sets it Active and
demotes the previously-active domain in the list
- [ ] Managed: clicking **Use this domain** on a non-active verified row
makes it Active and the previously-active row back to Verified
- [ ] Shared / Resend / SMTP: existing save + test-email flows still
work (logic preserved verbatim)
- [ ] `pnpm typecheck` (dashboard + backend) and `pnpm lint` pass
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Redesigned email domain setup flow with multi-step verification dialog
* Added copy-to-clipboard for DNS records
* Enhanced provider selection interface with improved visual
presentation
* Onboarding now shows initial "pending verification" state and
completes verification asynchronously
* **Bug Fixes**
* Ensures only one managed domain becomes active when applying a domain
* Improved error handling for email configuration saves
* **Tests**
* Updated end-to-end tests to reflect async verification timing
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
- Add ClickHouse error code `386` (`NO_COMMON_TYPE`) to
`UNSAFE_CLICKHOUSE_ERROR_CODES` in
`apps/backend/src/lib/clickhouse-errors.ts`. This stops the Sentry
`StackAssertionError` (`Unknown Clickhouse error: code 386 not in safe
or unsafe codes`) that was firing whenever an admin wrote a query like
`SELECT [1, 'a']` or `SELECT if(1, 'a', 1)`, while keeping the raw error
message out of prod responses.
- Add two e2e regression tests: one against the cross-project
`analytics_internal.users` table, and one against `system.query_log`, to
pin that 386 is wrapped with the generic `Error during execution of this
query.` message in prod (full detail only surfaces in dev/test).
## Why unsafe, not safe
Both callers of `getSafeClickhouseErrorMessage`
(`apps/backend/src/app/api/latest/internal/analytics/query/route.ts:59`
and `apps/backend/src/lib/ai/tools/sql-query.ts:80`) execute
caller-authored SQL under `readonly: "1"` with
`SQL_project_id`/`SQL_branch_id` scoping. The ClickHouse client runs
under a `limited_user` whose grants restrict most tables — but
ClickHouse resolves types **before** enforcing ACL. That means a query
like `SELECT if(1, query, 1) FROM system.query_log` surfaces code 386
with a message like `There is no supertype for types String, UInt8 ...`,
leaking that `system.query_log.query` is a `String` — schema info from a
table the caller can't actually read.
This is the same type-before-ACL class as code 43
(`ILLEGAL_TYPE_OF_ARGUMENT`), which is already classified unsafe.
Classifying 386 as unsafe keeps the defense-in-depth consistent: if
per-customer tables are ever introduced and grants don't block
reference-resolution in time, 386 won't leak their schema.
Cost: in prod, an admin writing a malformed type-mismatch query sees
only `Error during execution of this query.` instead of the supertype
hint. Dev and test environments still show the full error via the
existing `getNodeEnvironment()` branch, so local iteration is
unaffected.
## Test plan
- [x] `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/analytics-query.test.ts` — all
64 tests pass, including the two 386 regression tests.
- [ ] Monitor Sentry after deploy to confirm the
`unknown-clickhouse-error-for-query` events for code 386 stop firing.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Improved handling of a ClickHouse type-mismatch error to prevent
exposure of sensitive data and ensure sanitized error responses.
* **Tests**
* Added regression tests that verify error responses are sanitized,
return consistent error codes, and include expected headers without
leaking internal details.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->