Commit Graph

3008 Commits

Author SHA1 Message Date
BilalG1
0532a18c36
fix(dashboard): wrap "Block new purchases" toggle in a Card (#1364)
## Summary

The **Block new purchases** toggle on the Payments → Settings page was
visually out of place: it rendered as a bare `SettingSwitch` outside the
`max-w-3xl` settings column, while every neighboring setting (Stripe
Connection, Test Mode, Payment Methods, Platform-Managed Methods) was a
full-width `Card`.

This PR wraps it in a `Card` that matches the existing `TestModeToggle`
pattern so it inherits the same width constraint, border, padding,
title/description structure, and state-colored icon badge.

**File changed:**
[`apps/dashboard/src/app/(main)/(protected)/projects/[projectId]/payments/settings/page-client.tsx`](https://github.com/stack-auth/stack-auth/blob/fix/payments-block-new-purchases-card/apps/dashboard/src/app/(main)/(protected)/projects/%5BprojectId%5D/payments/settings/page-client.tsx)

## What was wrong

Two concrete mismatches with the rest of the page:

1. **Wrong container.** The `SettingSwitch` was a direct child of
`<PageLayout>` rather than the `<div className="space-y-6 max-w-3xl">`
column that wraps the other settings — so it stretched to the full page
width instead of the 3xl column and broke the vertical rhythm (no
consistent `space-y-6` gap from the card above).
2. **Wrong style primitive.** It used the bare `SettingSwitch` row
component instead of a `Card` +
`CardHeader`/`CardTitle`/`CardDescription`/`CardContent` structure — so
there was no border, no heading hierarchy, and no state-colored icon
badge, which every other setting on the page has.

## Fix

- Moved the block inside the `space-y-6 max-w-3xl` column so it's
constrained and spaced like its siblings.
- Replaced the `SettingSwitch` with a `Card` mirroring `TestModeToggle`:
- `CardHeader` with `CardTitle` (\"Block New Purchases\") and
`CardDescription` (\"Stops new checkouts while keeping existing
subscriptions active.\").
- `CardContent` with an icon badge (`ProhibitIcon`) that turns red when
blocking is active, plus a short \"Block new purchases\" label and the
`Switch`.
- Copy is intentionally minimal: one title, one sentence of description,
one label next to the switch. No two-state narration.

## Visual comparison

### Pixel diff (changed pixels tinted red over the after image)
4.7% of pixels changed, all concentrated in the bottom of the settings
column — everything else is pixel-identical, confirming the fix is
scoped.

![pixel
diff](https://gist.githubusercontent.com/BilalG1/faacb21aea28bc6acae0f527f232c38c/raw/compare_pixel_diff.png)

### Cropped before/after toggle (zoomed to the changed region)
Full-viewport comparisons are noisy when the delta is a single component
at the bottom. This one is cropped to the changed bbox so the card fix
is the whole frame — 1s before, 1s after, looped.

![crop
toggle](https://gist.githubusercontent.com/BilalG1/faacb21aea28bc6acae0f527f232c38c/raw/compare_crop_toggle.gif)

### Wipe reveal (before on the left, after swept in from the left)
A vertical red sweeps across the full page, revealing the after state
over the before state. Useful for spotting any unintended drift
elsewhere on the page (there is none).


![wipe](https://gist.githubusercontent.com/BilalG1/faacb21aea28bc6acae0f527f232c38c/raw/compare_wipe.gif)

## Test plan

- [ ] Open `/projects/<id>/payments/settings` in the dashboard.
- [ ] Verify \"Block New Purchases\" renders as a `Card` with the same
width as Stripe Connection / Test Mode / Payment Methods.
- [ ] Toggle the switch on — icon badge turns red, config write fires
(`payments.blockNewPurchases = true`, `pushable: true`).
- [ ] Toggle off — icon returns to muted gray, config write fires with
`false`.
- [ ] Reload the page and confirm the persisted state matches the
toggle.
- [ ] `pnpm lint` and `pnpm typecheck` pass.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Improvements**
* Redesigned the "Block New Purchases" toggle in payment settings with a
new card-based interface and visual prohibit indicator for improved
clarity and user experience.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-22 17:28:09 -07:00
BilalG1
4f198bd55b
Fix dashboard UI bugs: webhook detail crash and http domain silent https upgrade (#1362)
## Summary

Fixes two dashboard UI bugs surfaced while auditing the project area for
large user-visible issues:

1. **Webhook detail page completely broken** — the page shows a blank
screen because the SvixProvider token was being set to the string
`"[object Object]"`.
2. **Editing a trusted domain with an `http://` base URL silently
upgrades it to `https://`** — saving the edit dialog without changing
anything changes the protocol, breaking callbacks to the original host.

Both are corrected with minimal, targeted changes in the dashboard app.
No API, schema, or shared package changes are required.

---

## Bug 1 — Webhook detail page crashes because `svixToken + ''` yields
`"[object Object]"`

### Where


`apps/dashboard/src/app/(main)/(protected)/projects/[projectId]/webhooks/[endpointId]/page-client.tsx`

### Root cause

`stackAdminApp.useSvixToken()` returns an object of shape `{ token:
string, url: string | null }` (see
`packages/template/src/lib/stack-app/apps/implementations/admin-app-impl.ts`).
The page was doing:

```ts
const svixToken = stackAdminApp.useSvixToken();
const [updateCounter, setUpdateCounter] = useState(0);

// This is a hack to make sure svix hooks update when content changes
const svixTokenUpdated = useMemo(() => {
  return svixToken + '';
}, [svixToken, updateCounter]);

// …
<SvixProvider token={svixTokenUpdated} …>
```

`svixToken + ''` coerces the object to the string `"[object Object]"`,
which is then passed to `<SvixProvider>` as the auth token. Every nested
Svix hook (`useEndpoint`, `useEndpointSecret`,
`useEndpointMessageAttempts`) authenticates with that bogus token, gets
a `401 {"code":"authentication_failed","detail":"Invalid token"}` from
Svix, and `getSvixResult`
(`apps/dashboard/src/app/(main)/(protected)/projects/[projectId]/webhooks/utils.tsx`)
throws, crashing the page.

Additional notes while in there:
- `setUpdateCounter` was declared but never called anywhere, so the
surrounding `useMemo`/`useState` was dead weight as well as broken.
Removing it removes the dead code too.
- The neighbouring list page (`webhooks/page-client.tsx`) already uses
the correct shape (`svixToken.token`, `svixToken.url`), which is why the
list page rendered correctly while the detail page didn't.

### Fix

Pass `svixToken.token` directly to `<SvixProvider>` and drop the unused
counter/memo.

```ts
export default function PageClient(props: { endpointId: string }) {
  const stackAdminApp = useAdminApp();
  const svixToken = stackAdminApp.useSvixToken();

  return (
    <AppEnabledGuard appId="webhooks">
      <SvixProvider
        token={svixToken.token}
        appId={stackAdminApp.projectId}
        options={{ serverUrl: getPublicEnvVar('NEXT_PUBLIC_STACK_SVIX_SERVER_URL') }}
      >
        <PageInner endpointId={props.endpointId} />
      </SvixProvider>
    </AppEnabledGuard>
  );
}
```

### Reproduction (before fix)

1. Enable the Webhooks app on a project.
2. Create an endpoint with any URL.
3. Open the row's action menu and click **View Details**.
4. The page renders blank (Svix hooks throw 401 Invalid token; the error
boundary unmounts the detail tree). URL, Description, Verification
Secret, and Events History never appear.

### Before / After

| Before | After |
| --- | --- |
| ![Webhook detail blank before
fix](https://gist.githubusercontent.com/BilalG1/f31b7631cb914ea8fd0113b97d26319e/raw/bug1-webhook-detail-before.png)
| ![Webhook detail renders after
fix](https://gist.githubusercontent.com/BilalG1/f31b7631cb914ea8fd0113b97d26319e/raw/bug1-webhook-detail-after.png)
|

---

## Bug 2 — Editing an `http://` trusted domain silently upgrades it to
`https://`

### Where


`apps/dashboard/src/app/(main)/(protected)/projects/[projectId]/domains/page-client.tsx`

### Root cause

In `EditDialog`, the form's `defaultValues` always set `insecureHttp:
false`, regardless of the protocol of the domain being edited:

```ts
defaultValues={{
  addWww: props.type === 'create',
  domain: props.type === 'update' ? props.defaultDomain.replace(/^https?:\/\//, "") : undefined,
  handlerPath: props.type === 'update' ? props.defaultHandlerPath : "/handler",
  insecureHttp: false, // ← ignores the existing protocol
}}
```

The `domain` field strips `http(s)://` for display but the protocol
itself is only tracked through the `insecureHttp` switch, which lives
inside the collapsed-by-default **Advanced** accordion. On submit:

```ts
const protocol = values.insecureHttp ? 'http://' : 'https://';
const baseUrl = protocol + values.domain;
```

So an `http://myapp.test` entry reopens with `insecureHttp: false`, the
Advanced section stays collapsed, the user sees nothing wrong, and
hitting **Save** (even with zero visible changes) writes
`https://myapp.test` back to config. Existing redirects from SSO / email
verification flows that depend on the original `http://` host stop
working.

### Fix

Derive `insecureHttp` from the existing `defaultDomain` when editing:

```ts
insecureHttp: props.type === 'update' ? props.defaultDomain.startsWith('http://') : false,
```

This makes the switch in the Advanced panel pre-check itself correctly
and the submit path emits the preserved protocol.

### Reproduction (before fix)

1. Go to **Project Settings → Trusted Domains**.
2. Add a new domain, expand **Advanced**, toggle **Use HTTP instead of
HTTPS** on, enter `myapp.test`, click **Create**. The list now shows
`http://myapp.test`.
3. Click the row's **⋯ → Edit**, then **Save** without changing
anything.
4. Observe the list now shows `https://myapp.test`.

### Before / After

**Domain list after an edit+save:**

| Before (http silently became https) | After (http preserved) |
| --- | --- |
| ![Domain list
before](https://gist.githubusercontent.com/BilalG1/f31b7631cb914ea8fd0113b97d26319e/raw/bug4-domain-list-before.png)
| ![Domain list
after](https://gist.githubusercontent.com/BilalG1/f31b7631cb914ea8fd0113b97d26319e/raw/bug4-domain-list-after.png)
|

In the "before" screenshot, `http://myapp.test` was edited with no
changes and silently became `https://myapp.test`.
`http://www.myapp.test` (not edited) stayed `http://`, confirming the
bug is triggered only through the edit-save path.

**Edit dialog (Advanced expanded):**

| Before (HTTP switch always off) | After (reflects stored protocol) |
| --- | --- |
| ![Edit dialog
before](https://gist.githubusercontent.com/BilalG1/f31b7631cb914ea8fd0113b97d26319e/raw/bug4-edit-dialog-before.png)
| ![Edit dialog
after](https://gist.githubusercontent.com/BilalG1/f31b7631cb914ea8fd0113b97d26319e/raw/bug4-edit-dialog-after.png)
|

The "after" dialog also shows the protocol prefix label flip from
`https://` to `http://` next to the input — a second visual cue that the
user is editing an HTTP domain.

---

## Scope / out of scope

In scope here:
- The two fixes above, plus a small amount of dead-code cleanup adjacent
to the first fix (the unused `updateCounter` / `useMemo` hack).

Intentionally **not** included (tracked separately from the same audit —
see internal notes):
- Cursor pagination cache wipe across Users/Teams/Transactions tables
(`data-table/common/cursor-pagination.tsx`)
- Email Outbox "Scheduled At" input being reset on every keystroke and
rendered in the wrong timezone (`email-outbox/page-client.tsx`)
- Latent empty-group handling in the sign-up rule builder (validator +
CEL emitter), which is real in code but not currently reachable through
the editor UI

These are broader and deserve their own PRs.

## Test plan

- [ ] **Bug 1 (webhook detail):** Enable Webhooks on a project, create
an endpoint, open **View Details**. Confirm URL, Description,
Verification Secret, and Events History render (no 401s in the console,
no blank page). Confirm the Copy button on the verification secret still
copies the key.
- [ ] **Bug 2 (domain edit preserves http):** Add an `http://` trusted
domain. Edit it and save with no changes — list should still show
`http://`. Edit again, flip the Advanced switch to HTTPS, save — list
should show `https://`. Repeat with the inverse direction (start https,
flip to http).
- [ ] **Regression sweep:** Webhooks list page, create/delete endpoint,
copy signing secret; Trusted Domains add/delete; auth-methods callbacks
against an `http://localhost` domain continue to work.
- [ ] `pnpm typecheck` passes locally. (`pnpm lint` was also run against
the dashboard app and is clean.)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **Bug Fixes**
* Domain editing now correctly initializes and preserves the protocol
type (HTTP or HTTPS) based on the existing domain configuration.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-22 17:27:37 -07:00
BilalG1
a22d5a49c1
Reduce Jaeger memory usage in local dev (#1357)
## Summary

Jaeger's `all-in-one` image uses an unbounded in-memory span store by
default, so on long-running local dev sessions it quietly grows until
the Docker VM is under memory pressure. This caps it in two ways:

- `MEMORY_MAX_TRACES=50000` — tells Jaeger to evict old traces once the
in-memory ring buffer hits 50k. Plenty for local debugging without
retaining every trace forever.
- `mem_limit: 1g` — hard Docker memory cap as a backstop so a runaway
collector can't starve the rest of the compose stack (Postgres,
ClickHouse, Svix, etc.).

Only touches `docker/dependencies/docker.compose.yaml` — no app code, no
prod config.

## Test plan

- [ ] `docker compose -f docker/dependencies/docker.compose.yaml up
jaeger` starts cleanly
- [ ] Jaeger UI reachable at `localhost:8107`
- [ ] OTLP endpoint on `localhost:8131` still ingests spans from the
API/dashboard
- [ ] `docker stats` shows the jaeger container capped at ~1GB under
load
- [ ] Old traces get evicted once trace count exceeds 50k (verify via UI
search)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **Chores**
* Configured memory constraints and trace buffering optimization for the
tracing service to improve resource management and system stability.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-22 17:27:29 -07:00
Mantra
000634607a
fix(internal-tool): continue dev startup when spacetime publish fails (#1371)
## Summary
- pre-dev.mjs now warns and exits 0 when the local SpacetimeDB publish
fails, instead of aborting `next dev`
- Lets contributors without a running local SpacetimeDB server still
start the internal-tool dev server
- Updates the header comment to reflect the new behavior

## Test plan
- [ ] Run `pnpm dev` in `apps/internal-tool` with no SpacetimeDB server
running — dev server should still start, with a warning
- [ ] Run with SpacetimeDB server running and `spacetime` CLI installed
— publish still runs and dev proceeds
- [ ] Run without `spacetime` CLI installed — existing warn-and-continue
path still works

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* Updated local publishing configuration to derive server settings from
environment variables for improved flexibility and easier customization.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-22 22:35:27 +00:00
BilalG1
f89b97bc54
fix connected accounts tokens (#1358)
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
DB migration compat / Check if migrations changed (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests (Local Emulator) / E2E Tests (Local Emulator, Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Runs E2E Fallback Tests / E2E Fallback Tests (Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Lint & build / lint_and_build (24) (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
DB migration compat / No migration changes (skipped) (push) Has been cancelled
<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* OAuth flows now consistently block extra scopes and access tokens for
shared OAuth keys, enforcing restrictions earlier in the request
processing and across all environments.
* **Tests**
* Added end-to-end regression tests to verify requests with extra scopes
against shared OAuth providers return a 400 response indicating extra
scopes/access tokens are not allowed.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-20 19:33:47 -07:00
Konstantin Wohlwend
3ea8052d35 chore: update package versions 2026-04-20 19:06:56 -07:00
Konstantin Wohlwend
d9492ac5f1 Update submodules 2026-04-20 19:01:16 -07:00
Konstantin Wohlwend
6f1df1a0c7 Update submodules 2026-04-20 18:53:36 -07:00
BilalG1
37ee5ec320
Fast-start local emulator via RAM snapshot + live secret rotation (#1340)
## Summary

`stack emulator start` now resumes a fully-warm VM snapshot instead of
cold-booting, bringing startup from 30–120s down to ~5–8s with
per-install secret rotation, or ~2.5s with rotation opt-out. The
snapshot is captured **locally on first `stack emulator pull`**, not
shipped from CI — QEMU migration state isn't portable across
accelerators (KVM/HVF/TCG) or `-cpu max` feature sets, so a CI-captured
snapshot couldn't resume reliably on arbitrary user hardware.

Also bundles a pile of CLI QoL fixes (progress bars, PR/run artifact
pulls, PR-build download, native-TS ISO writer replacing
`hdiutil`/`mkisofs`/`genisoimage` host dep, unit tests).

| Scenario | Before | After |
|---|---|---|
| Cold boot (no snapshot) | 30–120s | same, works as fallback |
| `stack emulator pull` (one-time, includes local snapshot capture) |
~30s download | ~30s download + ~1–3 min cold-boot capture |
| Snapshot resume, normal start | — | **~5–8s** |
| Snapshot resume, `EMULATOR_NO_ROTATION=1` | — | **~2.5s** |

Backend (`/health?db=1`) and dashboard (`/handler/sign-in`) return 200
on all paths. Two successive snapshot resumes produce different rotated
PCK/SSK/SAK/CRON_SECRET values per install.

## How it works

**Build (CI)** — `docker/local-emulator/qemu/build-image.sh`:

1. Cloud-init provisioning runs to completion (migrations, seed,
slim-image) producing `stack-emulator-<arch>.qcow2`.
2. Image is built with a topology compatible with later snapshot capture
(pinned SMP=4, phantom seed/bundle ISOs, STACKCFG runtime ISO mounted at
build time, qemu-guest-agent running, placeholder hex secrets baked in
under `STACK_EMULATOR_BUILD_SNAPSHOT=1`).
3. CI publishes **only the qcow2** — no `.savevm.zst` ships.

**Pull (user's machine)** —
`packages/stack-cli/src/commands/emulator.ts` + `run-emulator.sh
capture`:

1. `stack emulator pull` downloads the qcow2 with a progress bar (or
from a PR / workflow run via `--pr` / `--run`).
2. CLI invokes `run-emulator.sh capture`: cold-boots the qcow2 with a
matching device layout (phantom ISOs, fsdev, pcie-root-port, virtfs
detached — migration-incompatible), waits for backend+dashboard health,
then drives QMP: `stop` → set `mapped-ram` + `multifd` caps → `migrate
file:state.raw` → poll `query-migrate` → `quit`. Raw mapped-ram file is
zstd-compressed to `stack-emulator-<arch>.savevm.zst` in the images dir.
3. `--skip-snapshot` opts out (first `start` will then cold-boot).

**Runtime** — `run-emulator.sh start`:

1. Launch QEMU with `-incoming defer` when a `.savevm.zst` is present;
decompress on first use, keep the `.raw` cached for subsequent starts.
2. QMP: same `mapped-ram` + `multifd` caps → `migrate-incoming
file:<.raw>` → poll for `paused` → `cont`.
3. Generate fresh per-install secrets on the host; pipe them
base64-encoded through QGA `guest-exec input-data` →
`trigger-fast-rotate` in the guest → `docker exec -e … rotate-secrets`.
4. `rotate-secrets` in the container: validate keys (hex-only), targeted
`sed` on the placeholder PCK across built JS, `UPDATE ApiKeySet`,
`supervisorctl restart stack-app cron-jobs` (with
`stopasgroup`/`killasgroup` so the Node children actually die and
release their ports).
5. Poll backend+dashboard health; if anything fails, clean up and fall
back to cold boot transparently.

**Security model**: placeholder hex values are baked into the snapshot
(`00…ff` PCK, `00…ee` SSK, `00…dd` SAK, `00…cc` CRON_SECRET). They are
non-secret by construction. Real per-install secrets are generated at
each `emulator start` and never leave the host.

## CLI changes (`packages/stack-cli`)

- **`src/lib/iso.ts`** (new): native TypeScript ISO 9660 + Joliet
writer, replacing the host-side `hdiutil`/`mkisofs`/`genisoimage`
dependency for generating the STACKCFG runtime config disk. Unit tests
in `src/lib/iso.test.ts`.
- **`src/commands/emulator.ts`**:
- `pull`: streamed downloads with progress bar + ETA; `--pr <number>`
and `--run <id>` to pull from a PR build's CI artifacts (uses
`extract-zip` for the nested zip); `--skip-snapshot` to opt out of the
one-time local capture.
- `start` (existing, extended): auto-pulls AND auto-captures when no
image exists, so first-ever `start` is self-bootstrapping; emits
`STACK_EMULATOR_CLI_WROTE_ISO=1` so the shell helper skips its own ISO
regen (avoids the genisoimage host dep).
- `capture` (new, invoked by `pull` and the auto-pull path of `start`):
drives the local snapshot capture via `run-emulator.sh`.
- `status`, `stop`, `reset`, `list-releases`: preflight +
path-resolution tightening (`STACK_EMULATOR_HOME` → images/run dirs).
  - Unit tests in `src/commands/emulator.test.ts`.
- **`EMULATOR_NO_ROTATION=1`** env var skips the post-resume rotation
(intended for tests/CI where the placeholder secrets are fine — comes
with a loud warning).

## CI (`.github/workflows/qemu-emulator-build.yaml`)

- Builds **QEMU 10.2.2 from source** (cached), because
`mapped-ram`/`multifd` migration capabilities aren't available in the
distro's QEMU. Enables KVM on ubicloud runners so amd64 boots at
hardware speed.
- amd64 + arm64 both build on the same amd64 matrix
(`ubicloud-standard-8`); arm64 runs under cross-arch TCG (provisioning
only — boot/verify smoke test is amd64-only).
- Verification now runs through the CLI: `emulator start` → `emulator
status` → `emulator stop` against the freshly-built qcow2 (via
`STACK_EMULATOR_HOME` pointing at the workspace, so the CLI doesn't
silently auto-pull a prior release).
- Packages **only** the qcow2. No `.savevm.zst` upload / publish.
- Release notes updated.

## Key files

**Shell / guest:**
- `docker/local-emulator/qemu/build-image.sh` — snapshot-compatible
device topology + STACKCFG runtime ISO at build time
- `docker/local-emulator/qemu/run-emulator.sh` — `start`, `capture`,
`stop`, `reset`, `status`; `-incoming defer`, `.raw` cache, QGA-driven
rotation, cold-boot fallback
- `docker/local-emulator/qemu/common.sh` (new) — shared `qmp_session` +
`capture_vm_state` (factored out so build-image.sh and run-emulator.sh
share the capture path)
- `docker/local-emulator/qemu/cloud-init/emulator/user-data` —
placeholder secrets in snapshot mode, `wait-for-stack-ready`,
`trigger-fast-rotate`, qemu-guest-agent enabled
- `docker/local-emulator/rotate-secrets.sh` (new) — in-container
rotation (sed + UPDATE + supervisorctl)
- `docker/local-emulator/supervisord.conf` — `stopasgroup`/`killasgroup`
on `stack-app` and `cron-jobs`
- `docker/local-emulator/entrypoint.sh` — only mint CRON_SECRET if unset
(placeholder supplied in snapshot mode via --env-file)
- `docker/local-emulator/Dockerfile` — ships `rotate-secrets` to
`/usr/local/bin`
- `docker/server/entrypoint.sh` — source
`/run/stack-auth/rotated-secrets.env`; skip full-tree sentinel scan on
warm restarts via marker

**CLI:**
- `packages/stack-cli/src/lib/iso.ts` (new) + `iso.test.ts` (new)
- `packages/stack-cli/src/commands/emulator.ts` + `emulator.test.ts`
(new)
- `packages/stack-cli/vitest.config.ts` (new)

**CI:**
- `.github/workflows/qemu-emulator-build.yaml`

## Test plan

- [x] `docker/local-emulator/qemu/build-image.sh {amd64,arm64}` produces
`stack-emulator-<arch>.qcow2` with snapshot-compatible topology
- [x] `stack emulator pull` downloads qcow2 with progress, then captures
locally (~1–3 min) and writes `stack-emulator-<arch>.savevm.zst` in the
images dir
- [x] `stack emulator pull --skip-snapshot` stops after download
- [x] `stack emulator pull --pr <n>` / `--run <id>` pull from PR /
workflow run artifacts
- [x] `stack emulator start` on a fresh dir auto-pulls **and**
auto-captures, then starts; subsequent starts fast-resume in ~5–8s;
backend + dashboard return 200
- [x] `EMULATOR_NO_ROTATION=1 stack emulator start` completes in ~2.5s;
backend + dashboard return 200 with warning printed
- [x] Two consecutive `emulator start` invocations produce different PCK
values in the internal `ApiKeySet` row
- [x] `stack emulator status` / `stop` / `reset` resolve paths from
`STACK_EMULATOR_HOME`
- [x] Verified end-to-end on arm64 macOS under HVF (capture ~50s,
fast-resume ~6.5s)
- [x] `pnpm lint` and `pnpm typecheck` pass; stack-cli unit tests (iso +
emulator) pass
- [ ] CI green on this PR (qemu-emulator-build matrix, smoke test)
- [ ] `gh release download emulator-<branch>-latest` contains only
`stack-emulator-<arch>.qcow2` once this PR merges and publish runs

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Snapshot fast-start/resume with optional warm-snapshot assets, runtime
ISO generation, and a cached QEMU build to speed emulator setup.
* CLI: streamed artifact downloads with progress, improved release/asset
handling, stronger preflight checks, and start/status/stop emulator
commands.
* Automated secret rotation and ability to apply rotated secrets at
container startup; supervisor control socket enabled.

* **Bug Fixes**
* More robust start/stop/resume flows with automatic fallback to cold
boot and improved process-group shutdown behavior.

* **Tests**
  * New tests for CLI utilities and ISO image generation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-20 14:24:49 -07:00
Madison
9f3ea9a766 Tutorial style docs init 2026-04-20 14:09:27 -05:00
Madison
24f32e6845 init tutorial docs 2026-04-20 14:09:27 -05:00
BilalG1
dad752a1cb
Add pr-visual-writeup skill to .agents/skills (#1354)
Shared agent skill under the cross-client `.agents/skills/` convention
so Claude Code, Codex, and other compliant agents can discover it.

<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Added an end-to-end workflow skill for automating visual-rich PR
descriptions with screenshot capture, asset processing, and upload
capabilities.

* **Documentation**
* Added reference guides covering screenshot capture patterns, asset
upload workflows, and structured PR body templates.

* **Chores**
* Added helper scripts for development server detection, media
conversion, and GitHub gist uploads.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-20 11:54:55 -07:00
BilalG1
6bc1836e66
fix(dashboard): resolve UI issues across email-* pages (#1345)
## Summary

Six UI issues found across the email-* dashboard pages, ranked by
impact, fixed here:

1. **email-sent layout** — the email log table and domain reputation
card were forced side-by-side at all widths. A fixed-width sidebar plus
a flex-1 table meant that on tablet the table got crushed, and on mobile
the row overflowed horizontally. Fix: stack vertically below `lg`, and
let the reputation card span full width on narrow viewports.
2. **Domain status enum leaks to the UI** — `<span>Status:
{domain.status}</span>` rendered raw values like `pending_dns` /
`pending_verification`. Added a `MANAGED_DOMAIN_STATUS_LABELS` map and
route through it before rendering.
3. **email-themes dialog grid cramped on mobile** — the Change Theme
dialog hardcoded `grid-cols-2`, so at 375px each theme card had ~150px
and the preview images were illegible. Changed to `grid-cols-1
sm:grid-cols-2`.
4. **Template name row overflow** — long template names pushed the Edit
Template button off the right edge of the card because the flex row had
no `min-w-0` / `truncate`. Fixed both, and made the action column
`shrink-0`.
5. **Boosted-capacity label was color-only** — during an active boost
the label used a red strikethrough for the base value and a blue number
for the boosted value with no non-color cue. Added an explicit `→` arrow
between the two numbers, `title` tooltips on each, and a visible
\"(boosted)\" marker after `/h max`.
6. **Draft progress bar overflowed at mobile width** — the 4-step
progress bar used fixed 80px connectors, giving a minimum width of
~400px that clipped off both ends at 375px. Changed connectors to `w-8
sm:w-20` (32px on mobile, 80px otherwise) so all four steps and their
labels fit below 640px.

## Before / after

Each GIF below loops \"before\" (1s) → \"after\" (1s) with a red pill in
the top-right indicating which frame is which. Full-size stills (before
+ after + extra viewports) are listed under **All screenshots** at the
bottom.

### 1. email-sent — two-column layout collapses on narrow viewports

Mobile (375px):

![email-sent
mobile](https://gist.githubusercontent.com/BilalG1/edb04740a19c3f2d048da6e602209d45/raw/gif-01-email-sent-mobile.gif)

Tablet (900px):

![email-sent
tablet](https://gist.githubusercontent.com/BilalG1/edb04740a19c3f2d048da6e602209d45/raw/gif-01-email-sent-tablet.gif)

### 2. email-settings — managed-domain status label

![domain
status](https://gist.githubusercontent.com/BilalG1/edb04740a19c3f2d048da6e602209d45/raw/gif-02-domain-status.gif)

### 3. email-themes — Change Theme dialog on mobile

![themes
mobile](https://gist.githubusercontent.com/BilalG1/edb04740a19c3f2d048da6e602209d45/raw/gif-03-themes-mobile.gif)

### 4. email-templates — long name overflow

![templates
overflow](https://gist.githubusercontent.com/BilalG1/edb04740a19c3f2d048da6e602209d45/raw/gif-04-templates-overflow.gif)

### 5. email-sent — boosted capacity label

![capacity
label](https://gist.githubusercontent.com/BilalG1/edb04740a19c3f2d048da6e602209d45/raw/gif-05-capacity-label.gif)

### 7. email-drafts — draft progress bar on mobile

![draft progress
bar](https://gist.githubusercontent.com/BilalG1/edb04740a19c3f2d048da6e602209d45/raw/gif-07-draft-progress-mobile.gif)

## Test plan

- [x] \`pnpm --filter @stackframe/dashboard lint\` — clean
- [x] \`pnpm --filter @stackframe/dashboard typecheck\` — clean
- [x] Manual verification in a browser at 375px / 900px / 1440px, light
+ dark mode, for each fixed page
- [ ] Reviewer sanity check of the remaining email-* pages
(email-outbox, email-viewer) for similar responsive regressions

## Notes

- The initial review flagged a \"white-on-white capacity boost timer\" —
on closer look the label sits on a deliberately dark `bg-zinc-900/0.82`
overlay inside the boost card, so it reads fine in light and dark mode.
Not fixing; that part of the review was a false positive.
- The initial review also flagged a missing empty state on
email-templates. Because Stack seeds built-in templates, the empty
branch is unreachable in practice — skipping that fix to avoid dead
code.

## All screenshots

Gist with all the individual before/after PNGs and the GIFs themselves:
https://gist.github.com/BilalG1/edb04740a19c3f2d048da6e602209d45

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **New Features**
* Added human-readable status labels for managed domains in domain
settings

* **Improvements**
* Enhanced responsive layouts across dashboard pages for improved mobile
experience
* Improved email capacity display with visual indicators and tooltips
for boost status
* Refined template and theme selection layouts with better text handling
and spacing

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-19 22:58:07 -07:00
BilalG1
85ae4b1c9e
Fix ClickHouse OOM in MAU query + optimize /internal/metrics route (#1344)
## Summary

Fixes the Sentry `StackAssertionError: Failed to load monthly active
users for internal metrics` crash (ClickHouse OOM at the 7.2 GiB
per-query cap) and applies two related optimizations to other queries in
the same route while here. Adds a local benchmark harness that validates
correctness and measures peak memory / duration before & after.

## Root cause (the original Sentry error)

`loadMonthlyActiveUsers` was written as `SELECT user_id … GROUP BY
user_id` and then counting in Node via a `Set`. On a large project that
ships back millions of user_ids. Two failure modes stacked:

1. **Result materialization** — every distinct user_id had to be
buffered in the server before streaming to Node (~20 MiB of result for
450k users; much more at real scale).
2. **`JSONExtract(toJSONString(data), 'is_anonymous', 'UInt8')`** — the
`toJSONString(data)` per-row re-serialization of the entire nested JSON
column, billions of times, just to pull one boolean. Dominates
bytes-read.

Combined, on a single partition read from S3-backed MergeTree, this can
exceed ClickHouse's 7.2 GiB per-query memory cap. That's exactly what
the Sentry trace showed.

## Changes

### 1. Fix MAU query (`loadMonthlyActiveUsers`)

Moved counting to the server with
`uniqExact(sipHash64(normalized_user_id))` and pulled the JS-side
normalization (`lower`, `trim`, `isUuid`) into SQL. Picked `sipHash64`
after benchmarking 7 variants — it's exact (at <<2³² users) and halves
the uniqExact hash-state vs. raw string keys.

### 2. Fix 1 — `JSONExtract(toJSONString(data), …)` → direct
`CAST(data.is_anonymous, …)`

Applied everywhere the pattern appeared in the metrics route:
- `loadDailyActiveUsers`
- the `analyticsUserJoin` subquery
- the `nonAnonymousAnalyticsUserFilter`
- `analyticsOverview:topRegion`
- `analyticsOverview:online`

Semantics preserved (`coalesce(CAST(data.is_anonymous,
'Nullable(UInt8)'), 0)` matches `JSONExtract(…, 'UInt8')` behavior when
the field is missing).

### 3. Fix 3 — server-aggregate the split queries

`loadDailyActiveUsersSplit` and `loadDailyActiveTeamsSplit` used to ship
1.2M+ `(day, user_id)` rows back to Node just so the JS could bucket
them into new / retained / reactivated. Rewrote both as one CTE-style
query that returns 31 rows (one per day in the 30-day window) with the
counts precomputed.

**Minor semantic shift** (documented inline in `route.tsx`): \"new\" is
now based on the user's first-ever `\$token-refresh` event rather than
their Postgres `signedUpAt`. Agrees for users who log in immediately
after sign-up (the common case). Disagrees for the rare edge case of an
account that existed pre-window but never generated a `\$token-refresh`
until now — old code classified as \"reactivated,\" new code classifies
as \"new.\" Judged acceptable; can be revisited.

Postgres round-trips for `ProjectUser.signedUpAt` / `Team.createdAt` are
no longer needed for the split, and the 76 MiB-ish wire ship is gone.

### 4. Benchmark harness
(`apps/backend/scripts/benchmark-internal-metrics.ts`)

Local-only tool. Three modes:
- **MAU equivalence matrix** — 13 edge cases (empty, dedup, anonymous
filter, window boundary, null user_id, non-UUID user_id, case variation,
project isolation, missing/null `is_anonymous`, wrong event_type).
Asserts OLD pipeline and NEW query return the **same set** of users, not
just the same count.
- **MAU perf** — OLD vs NEW plus 6 other candidate variants (inline
regex, UUID keys, sipHash64, HLL sketches), reads `memory_usage` /
`read_rows` / `result_bytes` from `system.query_log` for each, prints a
ranked table.
- **Full-route benchmark** (`BENCH_ROUTE_QUERIES=1`) — runs every
ClickHouse query in `/internal/metrics` in three stages (BEFORE, AFTER,
candidate OPTIMIZED) against the same seed and prints per-query deltas
plus endpoint-level totals.

Seeds under a synthetic `project_id` so real data is never touched;
cleans up on exit via `ALTER TABLE … DELETE`.

## Benchmark results

### MAU query alone

Ran at two scales; set-equality verified (new query identifies the same
individual users, not just the same count).

| seed | MAU | peak memory (old → new) | bytes read | duration |
|---|---|---|---|---|
| 500k events | 89,939 | 158.7 MiB → 46.7 MiB (**3.4×**, −70%) | 175.7
MiB → 63.0 MiB (2.8×) | 483 ms → 76 ms (**6.4×**) |
| 2.5M events | 449,990 | 439.2 MiB → 281.4 MiB (1.56×, −36%) | 865.0
MiB → 310.9 MiB (2.8×) | 783 ms → 126 ms (**6.2×**) |

MAU variant bake-off at 2.5M events (all exact, all set-equal to OLD):

| variant | memory | duration | notes |
|---|---|---|---|
| v0_old (baseline) | 440 MiB | 567 ms | — |
| v1_uniqExact_string | 284 MiB | 110 ms | naive fix |
| v3_uniqExact_toUUID | 244 MiB | 153 ms | UUID keys, slower per-row |
| **v4_uniqExact_sipHash64** | **125 MiB** | **95 ms** | **shipped** |
| v5_uniq (HLL) ~approx | 30 MiB | 86 ms | −0.25% error |
| v6_uniqCombined ~approx | 31 MiB | 67 ms | −0.15% error |

### Full `/internal/metrics` route (2.7M events, 300k users + page-views
+ clicks + teams)

Ranked by BEFORE peak memory:

| query | mem BEFORE | mem AFTER | Δ mem | dur BEFORE | dur AFTER | Δ
dur |
|---|---|---|---|---|---|---|
| analyticsOverview:topReferrers | 588.1 MiB | 411.1 MiB | 1.43× | 1833
ms | 110 ms | **16.66×** |
| analyticsOverview:totalVisitors | 584.3 MiB | 403.5 MiB | 1.45× | 1829
ms | 121 ms | 15.12× |
| analyticsOverview:dailyEvents | 584.1 MiB | 403.7 MiB | 1.45× | 1897
ms | 140 ms | 13.55× |
| loadUsersByCountry | 393.1 MiB | 385.4 MiB | ≈same | 74 ms | 80 ms |
≈same |
| loadDailyActiveUsersSplit | 363.4 MiB | 396.8 MiB | *+9%* | 1966 ms |
356 ms | 5.52× |
| analyticsOverview:topRegion | 269.9 MiB | 106.4 MiB | 2.54× | 1602 ms
| 65 ms | 24.65× |
| loadDailyActiveUsers | 268.3 MiB | 84.0 MiB | 3.19× | 1111 ms | 44 ms
| 25.25× |
| loadDailyActiveTeamsSplit | 59.6 MiB | 78.1 MiB | *+31%* | 70 ms | 123
ms | *+76%* |
| loadMonthlyActiveUsers | 54.9 MiB | 54.9 MiB | ≈same | 68 ms | 56 ms |
≈same |
| analyticsOverview:online | 18.4 MiB | 5.8 MiB | 3.17× | 58 ms | 4 ms |
14.50× |

**Endpoint-level totals**

| metric | BEFORE | AFTER | Δ |
|---|---|---|---|
| Sum peak ClickHouse memory | 3.11 GiB | 2.28 GiB | **−27%** |
| **Max query duration** (endpoint wall-clock floor) | **1966 ms** |
**356 ms** | **−82%** (5.5×) |
| Sum query duration (total CPU) | 10508 ms | 1099 ms | **−90%** (9.6×)
|
| Bytes read | 10.70 GiB | 4.55 GiB | −57% |
| Bytes shipped to Node | 94.8 MiB | 44.2 KiB | **−99.95%** |

Both split queries show a small memory *regression* at this seed size
(the new server-side window-function + self-join has its own state cost
that's near break-even with \"materialize + ship\" at 300k users); at
prod scale the 76 MiB-ship saving dominates. Duration is unambiguously
better.

## Why we don't need to drop the `analyticsUserJoin` in this PR

The benchmark includes an OPTIMIZED stage that drops the LEFT JOIN and
trusts `e.data.is_anonymous` directly, which would shave another **1.2
GiB / 1.9× duration** off the endpoint. **But we can't ship that here**
— an audit of the client tracker
(`packages/js/src/lib/stack-app/apps/implementations/event-tracker.ts`)
confirmed `is_anonymous` is never set on client-emitted `$page-view` /
`$click` events. The JOIN is currently load-bearing. A follow-up PR will
enrich `is_anonymous` at the batch ingest endpoint using
`auth.user.is_anonymous`; after one metrics-window cycle (~30 days) the
JOIN can be dropped.

## Follow-up work (out of scope for this PR)

- **Batch-endpoint enrichment** + drop the analytics-overview LEFT JOIN
(est. further −53% endpoint memory, −46% duration per the benchmark).
- **Teams-split hash-variant count mismatch** — `sipHash64(team_id)`
variant of the teams split shows a count discrepancy vs. the
string-keyed version in the benchmark. Not blocking since teams-split is
only #8 by memory; needs a root-cause pass before shipping that
particular optimization.
- **`loadUsersByCountry` window bound** — currently scans every
`$token-refresh` event ever for the tenancy (no time filter). Bounding
to 30 days would bound memory growth with project age, but changes
semantics (\"country of latest login ever\" → \"in last 30 days\").
Deferred because it's product-facing.

## Snapshot changes in `internal-metrics.test.ts.snap`

The `should return metrics data with users` test signs in 10 users
today, then deletes one of them mid-test. Two small snapshot values
change on today's date; both are just a reclassification of that single
deleted user — the total (10 active users) is unchanged.

- **`daily_active_users_split.new[today]`: 9 → 10**
All 10 users really did sign in for the first time today. The old code
only counted 9 because the deleted user's Postgres row was gone by the
time the metrics query ran, so the old classifier couldn't see they were
created today. The new query looks at ClickHouse events directly, sees
the deleted user's first event was today, and counts them as new like
everyone else.

- **`daily_active_users_split.reactivated[today]`: 1 → 0**
No user was "reactivated" today — nobody was active on an earlier day
and came back. The old "1" was the deleted user falling into this bucket
by default (the old classifier had no other rule that fit them). The new
code correctly reports zero.

Totals match either way (9 + 1 = 10 + 0). We're moving one deleted user
out of the "returning visitor" bucket and into the "brand-new user"
bucket, which is what they actually were.

## Test plan

- [x] `pnpm typecheck` and `pnpm lint` pass on the backend package
- [x] MAU equivalence matrix: 13/13 cases return the same set of users
(not just the same count) between OLD and NEW pipelines
- [x] Set-equality verified at 500k-MAU perf scale
- [x] Full-route benchmark confirms the expected memory / duration
improvements
- [ ] Sanity-check the dashboard rendering after deploy (split charts,
MAU counter, analytics overview)
- [ ] Monitor Sentry for the assertion error — should drop to zero

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Performance Improvements**
* Monthly and daily active metrics are now computed entirely server-side
for faster queries and reduced client-side processing.

* **Bug Fixes**
* More consistent handling of anonymous/missing IDs and stricter ID
filtering to improve accuracy across edge cases.

* **Tests**
* Added a comprehensive benchmark and validation harness to measure
query performance and verify result equivalence across variants.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-19 22:57:46 -07:00
BilalG1
0621ad2032
ai proxy fix (#1343)
<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Refactor**
* Request sanitization now includes an extra proxy-specific
preprocessing step for safer AI proxying.
* **New Features**
* Initialization prompts centralized into a shared helper, with a
web-specific prompt variant.
* Authenticated requests can optionally route via a provided external
API key to access alternate models.
* **Chores**
* Added and exposed a preprocessing hook with a default no-op
implementation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-19 22:57:38 -07:00
Konstantin Wohlwend
be6970cd81 Fix Docker build scripts
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
DB migration compat / Check if migrations changed (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests (Local Emulator) / E2E Tests (Local Emulator, Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Runs E2E Fallback Tests / E2E Fallback Tests (Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Lint & build / lint_and_build (24) (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
DB migration compat / No migration changes (skipped) (push) Has been cancelled
2026-04-18 23:50:50 -07:00
Konstantin Wohlwend
f0bbdb1c34 Make access token warning just a log 2026-04-18 23:14:27 -07:00
Konstantin Wohlwend
82c923e03c waitUntil Sentry flush is complete 2026-04-18 22:28:02 -07:00
Konstantin Wohlwend
560ee4c16e Fix memory leak 2026-04-18 22:21:05 -07:00
Konstantin Wohlwend
d568ad5149 Increase Clickhouse request timeout 2026-04-18 21:46:10 -07:00
Konstantin Wohlwend
cf67d37611 Don't override 5xx errors 2026-04-18 19:31:13 -07:00
Konstantin Wohlwend
ac9707b89e Update metrics endpoint to no longer trigger global error boundary on failure
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests (Local Emulator) / E2E Tests (Local Emulator, Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Runs E2E Fallback Tests / E2E Fallback Tests (Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Lint & build / lint_and_build (24) (push) Has been cancelled
Mirror main branch to main-mirror-for-wdb / lint_and_build (push) Has been cancelled
Publish npm packages / publish (push) Has been cancelled
Publish Swift SDK to prerelease repo / publish (push) Has been cancelled
Sync Main to Dev / sync-commits (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
2026-04-18 19:08:52 -07:00
Konstantin Wohlwend
ee68908111 Skip Swift tests temporarily 2026-04-18 17:32:41 -07:00
Konstantin Wohlwend
1594ed94d5 Speed up seed script by a lot 2026-04-18 17:29:21 -07:00
Konstantin Wohlwend
f85b4f3997 Make Bulldozer SQL statements deterministic 2026-04-18 16:43:26 -07:00
Konstantin Wohlwend
8046a7dd8f Fix dashboard sidebar hover states 2026-04-18 16:18:55 -07:00
Konstantin Wohlwend
2e247dd06d Improve dashboard sidebar styling 2026-04-18 14:54:40 -07:00
Konstantin Wohlwend
fd68701097 Fix bigint serialization error on tracing 2026-04-18 14:46:03 -07:00
Konstantin Wohlwend
91fbf63f7f chore: update package versions 2026-04-18 14:20:39 -07:00
Aman Ganapathy
847d14df70
[Fix]: Assortment of Bugs with Timefold Table and Payments (#1348) 2026-04-18 14:17:24 -07:00
Konstantin Wohlwend
f4ca6cb4c7 More tracing for replication-related functions 2026-04-17 17:57:34 -07:00
Aman Ganapathy
665870a144
[Fix] Bulldozer Studio and SpaceTime DB port conflict (#1346) 2026-04-17 17:56:11 -07:00
Konstantin Wohlwend
22ae47fe73 Replace Cmd with Ctrl on Windows computers 2026-04-17 17:04:30 -07:00
Aman Ganapathy
1de8a17183
Payments bulldozer txn rework (#1315)
### Object of this PR
This PR is NOT a monolithic series of fixes for the payments suite + a
complete rework. Its aims were
a) introducing and robustly testing the bulldozer db system 
b) reworking the payments underlying architecture to use bulldozer for
correctness and scalability
c) Achieving parity with the old payments system excepting a few changes
like ensuring correctness of the ledger algo
There may still be some work to do with handling refunds, decoupling the
concepts of purchases from that of products, and some other things.

### Ledger Algorithm
This has been tuned and fixed. Item removals i.e negative item quantity
changes will apply to the soonest expiring item grant i.e positive item
quantity change. This is what is best for the user. Item grants can also
expire, and when they expire we obviate whatever is left of their
original capacity (meaning after all the removals that were applied to
it). Our ledger algo is applied via Bulldozer, so automatic
re-computation is handled when a new grant/ removal is inserted in the
middle of the existing ones.

### Things we got rid of 
* No more automatic support for default products. You can use $0 plan
provisions to accomplish the same effect but it's manual
* Negative item quantity changes (i.e item removals) no longer can have
expiries



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Enhanced payment processing pipeline with improved data consistency
and state management.
  * Advanced refund handling with comprehensive transaction tracking.
* Better tracking and management of customer item quantities and owned
products.
* Improved subscription lifecycle management including period-end
handling.

* **Bug Fixes**
  * Fixed payment data integrity verification.
  * Improved handling of edge cases in refund scenarios.

* **Chores**
  * Updated cSpell configuration with additional words.
  * Expanded developer documentation for linting workflows.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Konstantin Wohlwend <n2d4xc@gmail.com>
Co-authored-by: Aadesh Kheria <kheriaaadesh@gmail.com>
Co-authored-by: Mantra <87142457+mantrakp04@users.noreply.github.com>
2026-04-17 22:11:21 +00:00
BilalG1
8af48c1e94
fix(dashboard): correct keyboard shortcut display and HTML entity rendering (#1342)
## Summary

Two small UI bugs found while auditing `apps/dashboard` for visible
defects.

### 1. Dashboards empty state hardcoded `Cmd+K`


`apps/dashboard/src/app/(main)/(protected)/projects/[projectId]/dashboards/page-client.tsx:80`

The empty state copy referenced the command palette as `Cmd+K`. The rest
of the dashboard renders the shortcut as the `⌘ K` keycap (see
`cmdk-search.tsx:1062`), so this one string was inconsistent. Replaced
with `⌘ K` to match the convention.

**Before/after flicker:**

![bug 5
flicker](https://gist.githubusercontent.com/BilalG1/adf11369b93cf280e1af49428a5b4f89/raw/bug5-flicker.gif)

**Pixel diff** — 3,500 diff pixels (0.270%). Changed regions: the "No
dashboards yet" description line (the Cmd+K text) and the "DEV" badge in
the bottom-right.

![bug 5 pixel
diff](https://gist.githubusercontent.com/BilalG1/adf11369b93cf280e1af49428a5b4f89/raw/bug5-pixeldiff.png)

| Before | After |
|---|---|
|
![before](https://gist.githubusercontent.com/BilalG1/adf11369b93cf280e1af49428a5b4f89/raw/before-bug5-cmdk.png)
|
![after](https://gist.githubusercontent.com/BilalG1/adf11369b93cf280e1af49428a5b4f89/raw/after-bug5-cmdk.png)
|

### 2. Vercel page rendered `&apos;` as raw text


`apps/dashboard/src/app/(main)/(protected)/projects/[projectId]/vercel/page-client.tsx:168`,
`:169`, `:414`

Three string literals contained `&apos;`:

```tsx
? "You&apos;ll receive a publishable client key and a secret server key for this project."
: "You&apos;ll receive a secret server key for this project."
…
subtitle="See Vercel&apos;s documentation on environment variables for more details."
```

These are JS strings passed into props, not JSX text nodes — React only
decodes HTML entities in JSX text, so the literal characters `&apos;`
ended up in the DOM. Verified via `document.querySelector` — actual text
content was `You&apos;ll receive a secret server key for this project.`.
Replaced with a plain ASCII apostrophe.

**Before/after flicker:**

![bug 7
flicker](https://gist.githubusercontent.com/BilalG1/adf11369b93cf280e1af49428a5b4f89/raw/bug7-flicker.gif)

**Pixel diff** — 1,252 diff pixels (0.163%). Changed region: the
`You&apos;ll` → `You'll` line.

![bug 7 pixel
diff](https://gist.githubusercontent.com/BilalG1/adf11369b93cf280e1af49428a5b4f89/raw/bug7-pixeldiff.png)

| Before | After |
|---|---|
|
![before](https://gist.githubusercontent.com/BilalG1/adf11369b93cf280e1af49428a5b4f89/raw/before-bug7-apos.png)
|
![after](https://gist.githubusercontent.com/BilalG1/adf11369b93cf280e1af49428a5b4f89/raw/after-bug7-apos.png)
|

## Test plan

- [x] Visited `/projects/<id>/dashboards` with no dashboards — empty
state now reads `(⌘ K)`
- [x] Visited `/projects/<id>/vercel` — both the "API keys generated"
subtitle and the "Need more detail?" subtitle render `'` as a real
apostrophe
- [x] `eslint` clean on both touched files
2026-04-17 14:13:49 -07:00
Konstantin Wohlwend
b5273f7326 Clicking a dashboard category now opens its first page 2026-04-17 12:15:29 -07:00
aadesh18
5341371782
LLM MCP Flow (#1321)
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
DB migration compat / Check if migrations changed (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests (Local Emulator) / E2E Tests (Local Emulator, Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Runs E2E Fallback Tests / E2E Fallback Tests (Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Lint & build / lint_and_build (24) (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
Mirror main branch to main-mirror-for-wdb / lint_and_build (push) Has been cancelled
Publish npm packages / publish (push) Has been cancelled
Publish Swift SDK to prerelease repo / publish (push) Has been cancelled
Sync Main to Dev / sync-commits (push) Has been cancelled
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
DB migration compat / No migration changes (skipped) (push) Has been cancelled
<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Automated AI QA review pipeline and human-verified knowledge base
consulted first
* Internal MCP review tool: call log viewer, conversation replay,
add/edit/publish Q&A, knowledge editor, and analytics
  * Docs search now preserves follow-up conversation context

* **Documentation**
  * Added “Ask DeepWiki” badge to README

* **Chores**
* Added local SpacetimeDB background service and internal-tool app
scaffolding
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: mantrakp04 <mantrakp@gmail.com>
Co-authored-by: Mantra <87142457+mantrakp04@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Konsti Wohlwend <n2d4xc@gmail.com>
2026-04-15 17:57:08 +00:00
Konstantin Wohlwend
c5436be6ac Skip setup tests on non-dev branches 2026-04-15 10:21:26 -07:00
Armaan Jain
94dd22c1c5
Overview revamp (#1238) 2026-04-15 09:36:00 -07:00
Armaan Jain
654c97c56e
Onboarding redo (#1308) 2026-04-15 09:35:48 -07:00
Mantra
74f2df9c79
fix(ai): Accept header for docs-tools MCP endpoint (#1334) 2026-04-14 21:36:31 -07:00
Konstantin Wohlwend
d21bdb0ea8 Skip diagnostics for analytics requests 2026-04-14 20:35:22 -07:00
BilalG1
c66bdfb5ae
Fix five dashboard UI issues (#1337)
## Summary

Fixes five independent UI bugs in the dashboard. Each is a narrow,
localized fix — no changes to shared table / card primitives.

### 1. Auth methods preview didn't update until save
Toggling Email/password, Magic link, or Passkey updated the switch UI
but the right-hand sign-in preview kept rendering the pre-save config
until "Save changes" was clicked. The preview was reading
`project.config` instead of the local pending state.

**Fix:** pass the computed local state (`passwordEnabled`, `otpEnabled`,
`passkeyEnabled`) into `AuthPage`'s `mockProject.config` so the preview
reflects toggles immediately.

| Before | After |
|---|---|
|
![before](b6d4f39f66/01-auth-methods-before.gif)
|
![after](b6d4f39f66/01-auth-methods-after.gif)
|

---

### 2. Email-drafts "New Draft" dropdown items stacked on two rows
Icon rendered above text in the dropdown because the icon was a child of
a non-flex inner wrapper inside `DropdownMenuItem` and phosphor icons
default to `display: block`.

**Fix:** use `DropdownMenuItem`'s built-in `icon` prop (which
absolute-positions the icon) instead of passing it as a child.

| Before | After |
|---|---|
|
![before](b6d4f39f66/02-email-drafts-before.png)
|
![after](b6d4f39f66/02-email-drafts-after.png)
|

---

### 3. Project-keys status filter: clicking options did nothing visible
`DesignDataTable` renders the toolbar outside the card when
`glassmorphic && !insideDesignCard`. The table instance was captured
once via `onTableReady`; filter clicks updated the table's internal
state (rows actually filtered to "No results") but the toolbar's parent
never re-rendered, so checkboxes, chip count, and button label stayed
frozen.

**Fix:** wrap `InternalApiKeyTable` in `DesignCard` so
`useInsideDesignCard()` returns true, `needsOwnCard` becomes false, and
the toolbar renders inside the `DataTable` where it re-renders normally.
No changes to the shared `DesignDataTable` component.

| Before | After |
|---|---|
|
![before](b6d4f39f66/03-project-keys-before.gif)
|
![after](b6d4f39f66/03-project-keys-after.gif)
|

---

### 4. Analytics "Tables" page only listed Events
`AVAILABLE_TABLES` was hardcoded to a single entry.

**Fix:** registered all 12 ClickHouse views that exist in the `default`
schema (events, users, contact_channels, teams, team_member_profiles,
team_permissions, team_invitations, email_outboxes, project_permissions,
notification_preferences, refresh_tokens, connected_accounts) with
sensible default sort columns. Widened `TableId` to `string`.

| Before | After |
|---|---|
|
![before](b6d4f39f66/04-analytics-tables-before.png)
|
![after](b6d4f39f66/04-analytics-tables-after.png)
|

---

### 5. Price input `$` prefix overlapped the number on prod
The Input composed `h-9 px-3 ... pl-7`. In production's CSS bundle order
`.px-3` declared after `.pl-7`, so `padding-left` resolved to 12px —
same as the prefix's `left-3` position — making `$` overlap the first
digit. The emulator's bundle happened to order them the other way, which
is why it only reproduced in prod. Verified with a devtools injection
that mimics the prod CSS ordering.

**Fix:** change `pl-7` → `!pl-7` in `repeating-input.tsx` so the prefix
padding wins regardless of CSS order.

| Before (prod CSS ordering) | After (same ordering) |
|---|---|
|
![before](b6d4f39f66/05-price-overlap-before.png)
|
![after](b6d4f39f66/05-price-overlap-after.png)
|

---

## Test plan

- [x] `pnpm --filter @stackframe/dashboard typecheck`
- [x] `pnpm --filter @stackframe/dashboard lint`
- [x] Manual verification of each issue against the local dev dashboard
at localhost:8101
- [ ] Reviewer: confirm no visual regressions on other `DesignDataTable`
usages (api-key-table is the only one wrapped here)
- [ ] Reviewer: confirm analytics queries on added tables work with the
signed-in user's permissions


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

## Release Notes

* **New Features**
* Added 12 new analytics tables to the dashboard for enhanced data
visibility and tracking.

* **Bug Fixes**
  * Fixed input styling issue with prefix alignment.

* **Style**
* Improved visual presentation of data tables with enhanced card
styling.
  * Refined dropdown menu icon display for better UI consistency.
* Enhanced authentication preview settings to reflect current
configuration state.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-14 19:38:52 -07:00
Konstantin Wohlwend
b68710e98e chore: update package versions 2026-04-14 18:06:36 -07:00
BilalG1
88d3317b22
local emulator security and features fixes (#1247)
<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Release Notes

* **New Features**
* Added Stripe, OAuth, and Freestyle mock services to the local emulator
* Introduced `emulator run` CLI command to execute applications with
emulator credentials automatically injected
  * Enhanced credential management for local development

* **Improvements**
  * Improved ARM64 QEMU emulation with cross-architecture support
  * Better error detection and logging during emulator provisioning
  * Added example middleware configuration with authentication support
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-14 15:36:24 -07:00
Konstantin Wohlwend
e68015909d Fix lint 2026-04-14 13:43:33 -07:00
Konstantin Wohlwend
7f9eac40c5 Downgrade Next.js to 16.1.7 2026-04-14 12:39:55 -07:00
Konstantin Wohlwend
3ca2fae3e1 Revert commit 2026-04-14 10:03:53 -07:00
Konstantin Wohlwend
e63daf8606 Make backend not module 2026-04-14 09:51:39 -07:00
BilalG1
2af2a591b4
Skip analytics init on apps without persistent token store (#1336)
Owned admin apps are constructed with `tokenStore: null`, which caused
EventTracker/SessionRecorder flushes to throw from
_ensurePersistentTokenStore() after #1331 removed the silencing.

<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Improved analytics stability and privacy by restricting session
recording and event tracking to environments with required persistent
storage.
* **Tests**
* Adjusted a few end-to-end tests to skip when running against a local
emulator to reduce spurious failures.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-14 09:43:37 -07:00