## Summary
Six UI issues found across the email-* dashboard pages, ranked by
impact, fixed here:
1. **email-sent layout** — the email log table and domain reputation
card were forced side-by-side at all widths. A fixed-width sidebar plus
a flex-1 table meant that on tablet the table got crushed, and on mobile
the row overflowed horizontally. Fix: stack vertically below `lg`, and
let the reputation card span full width on narrow viewports.
2. **Domain status enum leaks to the UI** — `<span>Status:
{domain.status}</span>` rendered raw values like `pending_dns` /
`pending_verification`. Added a `MANAGED_DOMAIN_STATUS_LABELS` map and
route through it before rendering.
3. **email-themes dialog grid cramped on mobile** — the Change Theme
dialog hardcoded `grid-cols-2`, so at 375px each theme card had ~150px
and the preview images were illegible. Changed to `grid-cols-1
sm:grid-cols-2`.
4. **Template name row overflow** — long template names pushed the Edit
Template button off the right edge of the card because the flex row had
no `min-w-0` / `truncate`. Fixed both, and made the action column
`shrink-0`.
5. **Boosted-capacity label was color-only** — during an active boost
the label used a red strikethrough for the base value and a blue number
for the boosted value with no non-color cue. Added an explicit `→` arrow
between the two numbers, `title` tooltips on each, and a visible
\"(boosted)\" marker after `/h max`.
6. **Draft progress bar overflowed at mobile width** — the 4-step
progress bar used fixed 80px connectors, giving a minimum width of
~400px that clipped off both ends at 375px. Changed connectors to `w-8
sm:w-20` (32px on mobile, 80px otherwise) so all four steps and their
labels fit below 640px.
## Before / after
Each GIF below loops \"before\" (1s) → \"after\" (1s) with a red pill in
the top-right indicating which frame is which. Full-size stills (before
+ after + extra viewports) are listed under **All screenshots** at the
bottom.
### 1. email-sent — two-column layout collapses on narrow viewports
Mobile (375px):

Tablet (900px):

### 2. email-settings — managed-domain status label

### 3. email-themes — Change Theme dialog on mobile

### 4. email-templates — long name overflow

### 5. email-sent — boosted capacity label

### 7. email-drafts — draft progress bar on mobile

## Test plan
- [x] \`pnpm --filter @stackframe/dashboard lint\` — clean
- [x] \`pnpm --filter @stackframe/dashboard typecheck\` — clean
- [x] Manual verification in a browser at 375px / 900px / 1440px, light
+ dark mode, for each fixed page
- [ ] Reviewer sanity check of the remaining email-* pages
(email-outbox, email-viewer) for similar responsive regressions
## Notes
- The initial review flagged a \"white-on-white capacity boost timer\" —
on closer look the label sits on a deliberately dark `bg-zinc-900/0.82`
overlay inside the boost card, so it reads fine in light and dark mode.
Not fixing; that part of the review was a false positive.
- The initial review also flagged a missing empty state on
email-templates. Because Stack seeds built-in templates, the empty
branch is unreachable in practice — skipping that fix to avoid dead
code.
## All screenshots
Gist with all the individual before/after PNGs and the GIFs themselves:
https://gist.github.com/BilalG1/edb04740a19c3f2d048da6e602209d45
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
## Release Notes
* **New Features**
* Added human-readable status labels for managed domains in domain
settings
* **Improvements**
* Enhanced responsive layouts across dashboard pages for improved mobile
experience
* Improved email capacity display with visual indicators and tooltips
for boost status
* Refined template and theme selection layouts with better text handling
and spacing
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
Fixes the Sentry `StackAssertionError: Failed to load monthly active
users for internal metrics` crash (ClickHouse OOM at the 7.2 GiB
per-query cap) and applies two related optimizations to other queries in
the same route while here. Adds a local benchmark harness that validates
correctness and measures peak memory / duration before & after.
## Root cause (the original Sentry error)
`loadMonthlyActiveUsers` was written as `SELECT user_id … GROUP BY
user_id` and then counting in Node via a `Set`. On a large project that
ships back millions of user_ids. Two failure modes stacked:
1. **Result materialization** — every distinct user_id had to be
buffered in the server before streaming to Node (~20 MiB of result for
450k users; much more at real scale).
2. **`JSONExtract(toJSONString(data), 'is_anonymous', 'UInt8')`** — the
`toJSONString(data)` per-row re-serialization of the entire nested JSON
column, billions of times, just to pull one boolean. Dominates
bytes-read.
Combined, on a single partition read from S3-backed MergeTree, this can
exceed ClickHouse's 7.2 GiB per-query memory cap. That's exactly what
the Sentry trace showed.
## Changes
### 1. Fix MAU query (`loadMonthlyActiveUsers`)
Moved counting to the server with
`uniqExact(sipHash64(normalized_user_id))` and pulled the JS-side
normalization (`lower`, `trim`, `isUuid`) into SQL. Picked `sipHash64`
after benchmarking 7 variants — it's exact (at <<2³² users) and halves
the uniqExact hash-state vs. raw string keys.
### 2. Fix 1 — `JSONExtract(toJSONString(data), …)` → direct
`CAST(data.is_anonymous, …)`
Applied everywhere the pattern appeared in the metrics route:
- `loadDailyActiveUsers`
- the `analyticsUserJoin` subquery
- the `nonAnonymousAnalyticsUserFilter`
- `analyticsOverview:topRegion`
- `analyticsOverview:online`
Semantics preserved (`coalesce(CAST(data.is_anonymous,
'Nullable(UInt8)'), 0)` matches `JSONExtract(…, 'UInt8')` behavior when
the field is missing).
### 3. Fix 3 — server-aggregate the split queries
`loadDailyActiveUsersSplit` and `loadDailyActiveTeamsSplit` used to ship
1.2M+ `(day, user_id)` rows back to Node just so the JS could bucket
them into new / retained / reactivated. Rewrote both as one CTE-style
query that returns 31 rows (one per day in the 30-day window) with the
counts precomputed.
**Minor semantic shift** (documented inline in `route.tsx`): \"new\" is
now based on the user's first-ever `\$token-refresh` event rather than
their Postgres `signedUpAt`. Agrees for users who log in immediately
after sign-up (the common case). Disagrees for the rare edge case of an
account that existed pre-window but never generated a `\$token-refresh`
until now — old code classified as \"reactivated,\" new code classifies
as \"new.\" Judged acceptable; can be revisited.
Postgres round-trips for `ProjectUser.signedUpAt` / `Team.createdAt` are
no longer needed for the split, and the 76 MiB-ish wire ship is gone.
### 4. Benchmark harness
(`apps/backend/scripts/benchmark-internal-metrics.ts`)
Local-only tool. Three modes:
- **MAU equivalence matrix** — 13 edge cases (empty, dedup, anonymous
filter, window boundary, null user_id, non-UUID user_id, case variation,
project isolation, missing/null `is_anonymous`, wrong event_type).
Asserts OLD pipeline and NEW query return the **same set** of users, not
just the same count.
- **MAU perf** — OLD vs NEW plus 6 other candidate variants (inline
regex, UUID keys, sipHash64, HLL sketches), reads `memory_usage` /
`read_rows` / `result_bytes` from `system.query_log` for each, prints a
ranked table.
- **Full-route benchmark** (`BENCH_ROUTE_QUERIES=1`) — runs every
ClickHouse query in `/internal/metrics` in three stages (BEFORE, AFTER,
candidate OPTIMIZED) against the same seed and prints per-query deltas
plus endpoint-level totals.
Seeds under a synthetic `project_id` so real data is never touched;
cleans up on exit via `ALTER TABLE … DELETE`.
## Benchmark results
### MAU query alone
Ran at two scales; set-equality verified (new query identifies the same
individual users, not just the same count).
| seed | MAU | peak memory (old → new) | bytes read | duration |
|---|---|---|---|---|
| 500k events | 89,939 | 158.7 MiB → 46.7 MiB (**3.4×**, −70%) | 175.7
MiB → 63.0 MiB (2.8×) | 483 ms → 76 ms (**6.4×**) |
| 2.5M events | 449,990 | 439.2 MiB → 281.4 MiB (1.56×, −36%) | 865.0
MiB → 310.9 MiB (2.8×) | 783 ms → 126 ms (**6.2×**) |
MAU variant bake-off at 2.5M events (all exact, all set-equal to OLD):
| variant | memory | duration | notes |
|---|---|---|---|
| v0_old (baseline) | 440 MiB | 567 ms | — |
| v1_uniqExact_string | 284 MiB | 110 ms | naive fix |
| v3_uniqExact_toUUID | 244 MiB | 153 ms | UUID keys, slower per-row |
| **v4_uniqExact_sipHash64** | **125 MiB** | **95 ms** | **shipped** |
| v5_uniq (HLL) ~approx | 30 MiB | 86 ms | −0.25% error |
| v6_uniqCombined ~approx | 31 MiB | 67 ms | −0.15% error |
### Full `/internal/metrics` route (2.7M events, 300k users + page-views
+ clicks + teams)
Ranked by BEFORE peak memory:
| query | mem BEFORE | mem AFTER | Δ mem | dur BEFORE | dur AFTER | Δ
dur |
|---|---|---|---|---|---|---|
| analyticsOverview:topReferrers | 588.1 MiB | 411.1 MiB | 1.43× | 1833
ms | 110 ms | **16.66×** |
| analyticsOverview:totalVisitors | 584.3 MiB | 403.5 MiB | 1.45× | 1829
ms | 121 ms | 15.12× |
| analyticsOverview:dailyEvents | 584.1 MiB | 403.7 MiB | 1.45× | 1897
ms | 140 ms | 13.55× |
| loadUsersByCountry | 393.1 MiB | 385.4 MiB | ≈same | 74 ms | 80 ms |
≈same |
| loadDailyActiveUsersSplit | 363.4 MiB | 396.8 MiB | *+9%* | 1966 ms |
356 ms | 5.52× |
| analyticsOverview:topRegion | 269.9 MiB | 106.4 MiB | 2.54× | 1602 ms
| 65 ms | 24.65× |
| loadDailyActiveUsers | 268.3 MiB | 84.0 MiB | 3.19× | 1111 ms | 44 ms
| 25.25× |
| loadDailyActiveTeamsSplit | 59.6 MiB | 78.1 MiB | *+31%* | 70 ms | 123
ms | *+76%* |
| loadMonthlyActiveUsers | 54.9 MiB | 54.9 MiB | ≈same | 68 ms | 56 ms |
≈same |
| analyticsOverview:online | 18.4 MiB | 5.8 MiB | 3.17× | 58 ms | 4 ms |
14.50× |
**Endpoint-level totals**
| metric | BEFORE | AFTER | Δ |
|---|---|---|---|
| Sum peak ClickHouse memory | 3.11 GiB | 2.28 GiB | **−27%** |
| **Max query duration** (endpoint wall-clock floor) | **1966 ms** |
**356 ms** | **−82%** (5.5×) |
| Sum query duration (total CPU) | 10508 ms | 1099 ms | **−90%** (9.6×)
|
| Bytes read | 10.70 GiB | 4.55 GiB | −57% |
| Bytes shipped to Node | 94.8 MiB | 44.2 KiB | **−99.95%** |
Both split queries show a small memory *regression* at this seed size
(the new server-side window-function + self-join has its own state cost
that's near break-even with \"materialize + ship\" at 300k users); at
prod scale the 76 MiB-ship saving dominates. Duration is unambiguously
better.
## Why we don't need to drop the `analyticsUserJoin` in this PR
The benchmark includes an OPTIMIZED stage that drops the LEFT JOIN and
trusts `e.data.is_anonymous` directly, which would shave another **1.2
GiB / 1.9× duration** off the endpoint. **But we can't ship that here**
— an audit of the client tracker
(`packages/js/src/lib/stack-app/apps/implementations/event-tracker.ts`)
confirmed `is_anonymous` is never set on client-emitted `$page-view` /
`$click` events. The JOIN is currently load-bearing. A follow-up PR will
enrich `is_anonymous` at the batch ingest endpoint using
`auth.user.is_anonymous`; after one metrics-window cycle (~30 days) the
JOIN can be dropped.
## Follow-up work (out of scope for this PR)
- **Batch-endpoint enrichment** + drop the analytics-overview LEFT JOIN
(est. further −53% endpoint memory, −46% duration per the benchmark).
- **Teams-split hash-variant count mismatch** — `sipHash64(team_id)`
variant of the teams split shows a count discrepancy vs. the
string-keyed version in the benchmark. Not blocking since teams-split is
only #8 by memory; needs a root-cause pass before shipping that
particular optimization.
- **`loadUsersByCountry` window bound** — currently scans every
`$token-refresh` event ever for the tenancy (no time filter). Bounding
to 30 days would bound memory growth with project age, but changes
semantics (\"country of latest login ever\" → \"in last 30 days\").
Deferred because it's product-facing.
## Snapshot changes in `internal-metrics.test.ts.snap`
The `should return metrics data with users` test signs in 10 users
today, then deletes one of them mid-test. Two small snapshot values
change on today's date; both are just a reclassification of that single
deleted user — the total (10 active users) is unchanged.
- **`daily_active_users_split.new[today]`: 9 → 10**
All 10 users really did sign in for the first time today. The old code
only counted 9 because the deleted user's Postgres row was gone by the
time the metrics query ran, so the old classifier couldn't see they were
created today. The new query looks at ClickHouse events directly, sees
the deleted user's first event was today, and counts them as new like
everyone else.
- **`daily_active_users_split.reactivated[today]`: 1 → 0**
No user was "reactivated" today — nobody was active on an earlier day
and came back. The old "1" was the deleted user falling into this bucket
by default (the old classifier had no other rule that fit them). The new
code correctly reports zero.
Totals match either way (9 + 1 = 10 + 0). We're moving one deleted user
out of the "returning visitor" bucket and into the "brand-new user"
bucket, which is what they actually were.
## Test plan
- [x] `pnpm typecheck` and `pnpm lint` pass on the backend package
- [x] MAU equivalence matrix: 13/13 cases return the same set of users
(not just the same count) between OLD and NEW pipelines
- [x] Set-equality verified at 500k-MAU perf scale
- [x] Full-route benchmark confirms the expected memory / duration
improvements
- [ ] Sanity-check the dashboard rendering after deploy (split charts,
MAU counter, analytics overview)
- [ ] Monitor Sentry for the assertion error — should drop to zero
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Performance Improvements**
* Monthly and daily active metrics are now computed entirely server-side
for faster queries and reduced client-side processing.
* **Bug Fixes**
* More consistent handling of anonymous/missing IDs and stricter ID
filtering to improve accuracy across edge cases.
* **Tests**
* Added a comprehensive benchmark and validation harness to measure
query performance and verify result equivalence across variants.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!--
Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Refactor**
* Request sanitization now includes an extra proxy-specific
preprocessing step for safer AI proxying.
* **New Features**
* Initialization prompts centralized into a shared helper, with a
web-specific prompt variant.
* Authenticated requests can optionally route via a provided external
API key to access alternate models.
* **Chores**
* Added and exposed a preprocessing hook with a default no-op
implementation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
### Object of this PR
This PR is NOT a monolithic series of fixes for the payments suite + a
complete rework. Its aims were
a) introducing and robustly testing the bulldozer db system
b) reworking the payments underlying architecture to use bulldozer for
correctness and scalability
c) Achieving parity with the old payments system excepting a few changes
like ensuring correctness of the ledger algo
There may still be some work to do with handling refunds, decoupling the
concepts of purchases from that of products, and some other things.
### Ledger Algorithm
This has been tuned and fixed. Item removals i.e negative item quantity
changes will apply to the soonest expiring item grant i.e positive item
quantity change. This is what is best for the user. Item grants can also
expire, and when they expire we obviate whatever is left of their
original capacity (meaning after all the removals that were applied to
it). Our ledger algo is applied via Bulldozer, so automatic
re-computation is handled when a new grant/ removal is inserted in the
middle of the existing ones.
### Things we got rid of
* No more automatic support for default products. You can use $0 plan
provisions to accomplish the same effect but it's manual
* Negative item quantity changes (i.e item removals) no longer can have
expiries
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Enhanced payment processing pipeline with improved data consistency
and state management.
* Advanced refund handling with comprehensive transaction tracking.
* Better tracking and management of customer item quantities and owned
products.
* Improved subscription lifecycle management including period-end
handling.
* **Bug Fixes**
* Fixed payment data integrity verification.
* Improved handling of edge cases in refund scenarios.
* **Chores**
* Updated cSpell configuration with additional words.
* Expanded developer documentation for linting workflows.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Konstantin Wohlwend <n2d4xc@gmail.com>
Co-authored-by: Aadesh Kheria <kheriaaadesh@gmail.com>
Co-authored-by: Mantra <87142457+mantrakp04@users.noreply.github.com>
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled