Commit Graph

82 Commits

Author SHA1 Message Date
BilalG1
91b8e4caa4
Fix /internal/metrics ClickHouse OOM (#1457)
Fixes Sentry
[STACK-BACKEND-16H](https://stackframe-pw.sentry.io/issues/STACK-BACKEND-16H)
— the `/api/v1/internal/metrics` endpoint was triggering the cluster's
10.8 GiB OvercommitTracker kill on tenants with months of
`$token-refresh` history.

## Root cause

Three queries in `loadAnalyticsOverview` plus `loadUsersByCountry` did
`GROUP BY user_id` over the events table with **no lower `event_at`
bound**, so their hash table working set scaled with
cumulative-distinct-users-ever-seen instead of the 30-day metrics
window.

## Changes

- Add 30-day `event_at` lower bound to `loadUsersByCountry` and to the
`analyticsUserJoin` inner subquery (used by `dailyEvents`,
`totalVisitors`, `topReferrers`).
- New `getClickhouseAdminClientForMetrics()` factory in
`lib/clickhouse.tsx` with connection-level safety net: per-query +
per-user memory caps, external GROUP BY spill, and `join_algorithm:
'grace_hash,parallel_hash,hash'` (grace_hash measured to give 48% memory
reduction at zero latency cost — see benchmark notes in the file).
- Inline comment + concrete next steps for the long-term fix (option C:
stamp `is_anonymous` at ingest on page-view/click events, then drop the
join entirely).
- Extend `scripts/benchmark-internal-metrics.ts` with the
historical-seed knob and three new modes (`BENCH_BACKFILL_COMPARE`,
`BENCH_JOIN_ALGO_COMPARE`, plus the existing `BENCH_ROUTE_QUERIES`
updated) used to validate the choices above.

## Benchmark — pre-PR vs post-PR

Synthetic seed: 300k users × 9 events spread over 365 days (~2.7M
events).

| | pre-PR | post-PR | delta |
|---|---:|---:|---:|
| Sum peak memory | 2.18 GiB | 515 MiB | **4.3× less** |
| Max query duration | 1293 ms | 101 ms | **12.8× faster** |
| Sum CPU duration | 5119 ms | 394 ms | 13× less work |
| Sum bytes read | 3.87 GiB | 929 MiB | 4.3× less I/O |

Per-query at 300k users:
- `analyticsOverview:dailyEvents` 561 → 44 MiB (12.8× less)
- `analyticsOverview:totalVisitors` 560 → 50 MiB (11.2× less)
- `analyticsOverview:topReferrers` 546 → 50 MiB (10.9× less)
- `loadUsersByCountry` 388 → 44 MiB (8.9× less)

## Caveats

- `loadDailyActiveSplitFromClickhouse` still scans all-history on its
`min(event_at)` subquery. It can't be naively bounded — `first_date` is
used to classify entities as new vs reactivated, and a 30d bound would
silently mislabel old-but-active entities as "new." The new SETTINGS
cap+spill it; the proper fix is option C (documented inline).
- A user with a page-view but no `$token-refresh` in the last 30 days
now falls through to `coalesce(NULL, 0)` and is classified
non-anonymous. Token-refresh fires every few minutes per active session,
so this is rare but not impossible (embedded SDKs that poll less
frequently, sessions straddling the 30d boundary).
- `max_memory_usage_for_user: 9 GB` trades "cluster-wide
OvercommitTracker kill of a random query" for "clean per-user memory
error attributed to the specific query." After our 30d bounds, no query
is anywhere near 9 GB.

## Test plan

- [x] `pnpm typecheck` passes
- [x] `pnpm lint` passes
- [x] `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/internal-metrics.test.ts` — 9/10
pass; the 1 failure (`risk_scores` snapshot drift) reproduces on clean
`dev` and is unrelated
- [x] `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/analytics-{events,events-batch,query}.test.ts
apps/e2e/tests/backend/endpoints/api/v1/token-refresh-events.test.ts
apps/e2e/tests/backend/performance/metrics.test.ts` — all passing tests
pass; 10 pre-existing `PRODUCT_DOES_NOT_EXIST` setup failures reproduce
on clean `dev`
- [x] Benchmark `BENCH_ROUTE_QUERIES=1` at 300k users shows the deltas
above

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* Improved internal metrics collection to use metrics-specific DB
settings for more reliable, safer analytical reads.
* Added guardrails to metrics queries to enforce time-window bounds and
avoid unbounded scans.
* Expanded benchmark modes (backfill and join-algo comparisons),
extended perf seeding, and improved logging/retry behavior to capture
more complete stats and reduce missing log rows.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1457?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-21 13:47:32 -07:00
Aman Ganapathy
a9623d976a
[Refactor] [Fix] Remove default prod creation (#1350)
With the new bulldozer rework we dont support default products anymore.
Users are encouraged to currently manually handle granting products to
their end users.

We block api requests and new product creations that attempt to set no
price, and we remove any options to set include-by-default. We also
migrate users' existing product snapshots in `Subscriptions`,
`OneTimePurchases`, and `ProductVersions` to have no price set if it's
an include-by-default product. This will make it so that next time a
user goes onto their products page, they will be informed that the
pricing is invalid and it is no longer delivered by default.

Note, however, that these products will still be providing items and the
like to the users who have them.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Bug Fixes**
* Migrated legacy product snapshots so missing included-items no longer
break readers.
* Removed deprecated "include-by-default" pricing sentinel; pricing now
requires explicit price entries and write validation rejects the old
sentinel.

* **Chores**
* Simplified dashboard pricing flows: create/edit/save now use explicit
prices and surface an alert when a formerly implicit free plan needs an
explicit $0 price.
* Config overrides and stored data are auto-normalized to explicit price
objects.

* **Tests**
* Updated and added tests covering migration, validation, and switching
behavior for explicit prices.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: mantrakp04 <mantrakp@gmail.com>
Co-authored-by: Mantra <87142457+mantrakp04@users.noreply.github.com>
2026-05-15 10:38:33 -07:00
Aman Ganapathy
4648fc1899
[Feat] new scripts on migrate/seed/init run for internal (#1421)
### Context

One script grants free plan to any team which is a customer of the
internal project who doesnt have it already.
We also want to migrate our users (internal) to the latest version of
their products.

Needed because some subs on dev right now dont have a plan. And internal
isnt using latest version of its own growth plan.

### Describing the Paths we want to Account for

1. Users on production who currently don't have a plan should get free
plans, since this script is run with every migrate
2. Users on production should get the latest version of each plan of
ours. So a forced migration to latest version of internal project plans
3. No other project's products/product lines should be affected. They
will continue to have product versioning
4. 2 should apply to test mode subscriptions as well, on top of stripe
subscriptions. All of them should be refreshed
5. Internal project itself should get latest version of its own growth
plan
6. If the bulldozer write fails, we should be able to recover on next
migration (this should already be handled by init bulldozer script,
because it checks if prisma db and bulldozer db are out of sync)
7. if the regenerate or backfill fail, we should be able to recover just
by rerunning the script
8. Product version table should not balloon. No table should really
balloon



### What I've tested on local
1. Put in 1000 db subscription rows, made them all stale and then ran
the regen script. It took about 6 minutes to update all of them, and it
was idempotent so rerunning it again did nothing.
2. With proper stripe keys I switched off of test mode on the internal
app, granted a product to a new team and updated the product's item
list. At this point I checked and the new team had the outdated version
of the product. Then I ran the regen script and the new team was moved
to latest product version.
3. Tried the above with the internal team's growth plan too and it
worked as well.
4. Backfill actually grants free plan


### Deployment strategy in prod
Run the backfill and the regen scripts once each after your migrations
on the prod db.
`pnpm db:backfill-internal-free-plans` will make sure every team has a
free plan at least if they dont have an existing plan (and it is
idempotent).
After that, run `pnpm db:regen-internal-subscriptions-to-latest` which
will migrate every user to the latest version of their plan (i.e latest
snapshot). This should also be idempotent.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Automated backfill to grant internal free plans to qualifying billing
teams.
* Regeneration tool to refresh internal subscription snapshots to the
latest product versions.

* **Chores**
* Added CLI commands and package scripts to run backfill and regen jobs.
  * Database init now runs payment initialization before backfill/regen.

* **Tests**
* Integration and unit tests added/updated to validate backfill,
regeneration, and free-plan idempotency.

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/hexclave/stack-auth/pull/1421)
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-05-12 16:05:45 -07:00
Konsti Wohlwend
765b0f4e29
New setup (#1413) 2026-05-06 12:03:06 -07:00
BilalG1
85ae4b1c9e
Fix ClickHouse OOM in MAU query + optimize /internal/metrics route (#1344)
## Summary

Fixes the Sentry `StackAssertionError: Failed to load monthly active
users for internal metrics` crash (ClickHouse OOM at the 7.2 GiB
per-query cap) and applies two related optimizations to other queries in
the same route while here. Adds a local benchmark harness that validates
correctness and measures peak memory / duration before & after.

## Root cause (the original Sentry error)

`loadMonthlyActiveUsers` was written as `SELECT user_id … GROUP BY
user_id` and then counting in Node via a `Set`. On a large project that
ships back millions of user_ids. Two failure modes stacked:

1. **Result materialization** — every distinct user_id had to be
buffered in the server before streaming to Node (~20 MiB of result for
450k users; much more at real scale).
2. **`JSONExtract(toJSONString(data), 'is_anonymous', 'UInt8')`** — the
`toJSONString(data)` per-row re-serialization of the entire nested JSON
column, billions of times, just to pull one boolean. Dominates
bytes-read.

Combined, on a single partition read from S3-backed MergeTree, this can
exceed ClickHouse's 7.2 GiB per-query memory cap. That's exactly what
the Sentry trace showed.

## Changes

### 1. Fix MAU query (`loadMonthlyActiveUsers`)

Moved counting to the server with
`uniqExact(sipHash64(normalized_user_id))` and pulled the JS-side
normalization (`lower`, `trim`, `isUuid`) into SQL. Picked `sipHash64`
after benchmarking 7 variants — it's exact (at <<2³² users) and halves
the uniqExact hash-state vs. raw string keys.

### 2. Fix 1 — `JSONExtract(toJSONString(data), …)` → direct
`CAST(data.is_anonymous, …)`

Applied everywhere the pattern appeared in the metrics route:
- `loadDailyActiveUsers`
- the `analyticsUserJoin` subquery
- the `nonAnonymousAnalyticsUserFilter`
- `analyticsOverview:topRegion`
- `analyticsOverview:online`

Semantics preserved (`coalesce(CAST(data.is_anonymous,
'Nullable(UInt8)'), 0)` matches `JSONExtract(…, 'UInt8')` behavior when
the field is missing).

### 3. Fix 3 — server-aggregate the split queries

`loadDailyActiveUsersSplit` and `loadDailyActiveTeamsSplit` used to ship
1.2M+ `(day, user_id)` rows back to Node just so the JS could bucket
them into new / retained / reactivated. Rewrote both as one CTE-style
query that returns 31 rows (one per day in the 30-day window) with the
counts precomputed.

**Minor semantic shift** (documented inline in `route.tsx`): \"new\" is
now based on the user's first-ever `\$token-refresh` event rather than
their Postgres `signedUpAt`. Agrees for users who log in immediately
after sign-up (the common case). Disagrees for the rare edge case of an
account that existed pre-window but never generated a `\$token-refresh`
until now — old code classified as \"reactivated,\" new code classifies
as \"new.\" Judged acceptable; can be revisited.

Postgres round-trips for `ProjectUser.signedUpAt` / `Team.createdAt` are
no longer needed for the split, and the 76 MiB-ish wire ship is gone.

### 4. Benchmark harness
(`apps/backend/scripts/benchmark-internal-metrics.ts`)

Local-only tool. Three modes:
- **MAU equivalence matrix** — 13 edge cases (empty, dedup, anonymous
filter, window boundary, null user_id, non-UUID user_id, case variation,
project isolation, missing/null `is_anonymous`, wrong event_type).
Asserts OLD pipeline and NEW query return the **same set** of users, not
just the same count.
- **MAU perf** — OLD vs NEW plus 6 other candidate variants (inline
regex, UUID keys, sipHash64, HLL sketches), reads `memory_usage` /
`read_rows` / `result_bytes` from `system.query_log` for each, prints a
ranked table.
- **Full-route benchmark** (`BENCH_ROUTE_QUERIES=1`) — runs every
ClickHouse query in `/internal/metrics` in three stages (BEFORE, AFTER,
candidate OPTIMIZED) against the same seed and prints per-query deltas
plus endpoint-level totals.

Seeds under a synthetic `project_id` so real data is never touched;
cleans up on exit via `ALTER TABLE … DELETE`.

## Benchmark results

### MAU query alone

Ran at two scales; set-equality verified (new query identifies the same
individual users, not just the same count).

| seed | MAU | peak memory (old → new) | bytes read | duration |
|---|---|---|---|---|
| 500k events | 89,939 | 158.7 MiB → 46.7 MiB (**3.4×**, −70%) | 175.7
MiB → 63.0 MiB (2.8×) | 483 ms → 76 ms (**6.4×**) |
| 2.5M events | 449,990 | 439.2 MiB → 281.4 MiB (1.56×, −36%) | 865.0
MiB → 310.9 MiB (2.8×) | 783 ms → 126 ms (**6.2×**) |

MAU variant bake-off at 2.5M events (all exact, all set-equal to OLD):

| variant | memory | duration | notes |
|---|---|---|---|
| v0_old (baseline) | 440 MiB | 567 ms | — |
| v1_uniqExact_string | 284 MiB | 110 ms | naive fix |
| v3_uniqExact_toUUID | 244 MiB | 153 ms | UUID keys, slower per-row |
| **v4_uniqExact_sipHash64** | **125 MiB** | **95 ms** | **shipped** |
| v5_uniq (HLL) ~approx | 30 MiB | 86 ms | −0.25% error |
| v6_uniqCombined ~approx | 31 MiB | 67 ms | −0.15% error |

### Full `/internal/metrics` route (2.7M events, 300k users + page-views
+ clicks + teams)

Ranked by BEFORE peak memory:

| query | mem BEFORE | mem AFTER | Δ mem | dur BEFORE | dur AFTER | Δ
dur |
|---|---|---|---|---|---|---|
| analyticsOverview:topReferrers | 588.1 MiB | 411.1 MiB | 1.43× | 1833
ms | 110 ms | **16.66×** |
| analyticsOverview:totalVisitors | 584.3 MiB | 403.5 MiB | 1.45× | 1829
ms | 121 ms | 15.12× |
| analyticsOverview:dailyEvents | 584.1 MiB | 403.7 MiB | 1.45× | 1897
ms | 140 ms | 13.55× |
| loadUsersByCountry | 393.1 MiB | 385.4 MiB | ≈same | 74 ms | 80 ms |
≈same |
| loadDailyActiveUsersSplit | 363.4 MiB | 396.8 MiB | *+9%* | 1966 ms |
356 ms | 5.52× |
| analyticsOverview:topRegion | 269.9 MiB | 106.4 MiB | 2.54× | 1602 ms
| 65 ms | 24.65× |
| loadDailyActiveUsers | 268.3 MiB | 84.0 MiB | 3.19× | 1111 ms | 44 ms
| 25.25× |
| loadDailyActiveTeamsSplit | 59.6 MiB | 78.1 MiB | *+31%* | 70 ms | 123
ms | *+76%* |
| loadMonthlyActiveUsers | 54.9 MiB | 54.9 MiB | ≈same | 68 ms | 56 ms |
≈same |
| analyticsOverview:online | 18.4 MiB | 5.8 MiB | 3.17× | 58 ms | 4 ms |
14.50× |

**Endpoint-level totals**

| metric | BEFORE | AFTER | Δ |
|---|---|---|---|
| Sum peak ClickHouse memory | 3.11 GiB | 2.28 GiB | **−27%** |
| **Max query duration** (endpoint wall-clock floor) | **1966 ms** |
**356 ms** | **−82%** (5.5×) |
| Sum query duration (total CPU) | 10508 ms | 1099 ms | **−90%** (9.6×)
|
| Bytes read | 10.70 GiB | 4.55 GiB | −57% |
| Bytes shipped to Node | 94.8 MiB | 44.2 KiB | **−99.95%** |

Both split queries show a small memory *regression* at this seed size
(the new server-side window-function + self-join has its own state cost
that's near break-even with \"materialize + ship\" at 300k users); at
prod scale the 76 MiB-ship saving dominates. Duration is unambiguously
better.

## Why we don't need to drop the `analyticsUserJoin` in this PR

The benchmark includes an OPTIMIZED stage that drops the LEFT JOIN and
trusts `e.data.is_anonymous` directly, which would shave another **1.2
GiB / 1.9× duration** off the endpoint. **But we can't ship that here**
— an audit of the client tracker
(`packages/js/src/lib/stack-app/apps/implementations/event-tracker.ts`)
confirmed `is_anonymous` is never set on client-emitted `$page-view` /
`$click` events. The JOIN is currently load-bearing. A follow-up PR will
enrich `is_anonymous` at the batch ingest endpoint using
`auth.user.is_anonymous`; after one metrics-window cycle (~30 days) the
JOIN can be dropped.

## Follow-up work (out of scope for this PR)

- **Batch-endpoint enrichment** + drop the analytics-overview LEFT JOIN
(est. further −53% endpoint memory, −46% duration per the benchmark).
- **Teams-split hash-variant count mismatch** — `sipHash64(team_id)`
variant of the teams split shows a count discrepancy vs. the
string-keyed version in the benchmark. Not blocking since teams-split is
only #8 by memory; needs a root-cause pass before shipping that
particular optimization.
- **`loadUsersByCountry` window bound** — currently scans every
`$token-refresh` event ever for the tenancy (no time filter). Bounding
to 30 days would bound memory growth with project age, but changes
semantics (\"country of latest login ever\" → \"in last 30 days\").
Deferred because it's product-facing.

## Snapshot changes in `internal-metrics.test.ts.snap`

The `should return metrics data with users` test signs in 10 users
today, then deletes one of them mid-test. Two small snapshot values
change on today's date; both are just a reclassification of that single
deleted user — the total (10 active users) is unchanged.

- **`daily_active_users_split.new[today]`: 9 → 10**
All 10 users really did sign in for the first time today. The old code
only counted 9 because the deleted user's Postgres row was gone by the
time the metrics query ran, so the old classifier couldn't see they were
created today. The new query looks at ClickHouse events directly, sees
the deleted user's first event was today, and counts them as new like
everyone else.

- **`daily_active_users_split.reactivated[today]`: 1 → 0**
No user was "reactivated" today — nobody was active on an earlier day
and came back. The old "1" was the deleted user falling into this bucket
by default (the old classifier had no other rule that fit them). The new
code correctly reports zero.

Totals match either way (9 + 1 = 10 + 0). We're moving one deleted user
out of the "returning visitor" bucket and into the "brand-new user"
bucket, which is what they actually were.

## Test plan

- [x] `pnpm typecheck` and `pnpm lint` pass on the backend package
- [x] MAU equivalence matrix: 13/13 cases return the same set of users
(not just the same count) between OLD and NEW pipelines
- [x] Set-equality verified at 500k-MAU perf scale
- [x] Full-route benchmark confirms the expected memory / duration
improvements
- [ ] Sanity-check the dashboard rendering after deploy (split charts,
MAU counter, analytics overview)
- [ ] Monitor Sentry for the assertion error — should drop to zero

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Performance Improvements**
* Monthly and daily active metrics are now computed entirely server-side
for faster queries and reduced client-side processing.

* **Bug Fixes**
* More consistent handling of anonymous/missing IDs and stricter ID
filtering to improve accuracy across edge cases.

* **Tests**
* Added a comprehensive benchmark and validation harness to measure
query performance and verify result equivalence across variants.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-19 22:57:46 -07:00
Konstantin Wohlwend
f85b4f3997 Make Bulldozer SQL statements deterministic 2026-04-18 16:43:26 -07:00
Aman Ganapathy
665870a144
[Fix] Bulldozer Studio and SpaceTime DB port conflict (#1346) 2026-04-17 17:56:11 -07:00
Aman Ganapathy
1de8a17183
Payments bulldozer txn rework (#1315)
### Object of this PR
This PR is NOT a monolithic series of fixes for the payments suite + a
complete rework. Its aims were
a) introducing and robustly testing the bulldozer db system 
b) reworking the payments underlying architecture to use bulldozer for
correctness and scalability
c) Achieving parity with the old payments system excepting a few changes
like ensuring correctness of the ledger algo
There may still be some work to do with handling refunds, decoupling the
concepts of purchases from that of products, and some other things.

### Ledger Algorithm
This has been tuned and fixed. Item removals i.e negative item quantity
changes will apply to the soonest expiring item grant i.e positive item
quantity change. This is what is best for the user. Item grants can also
expire, and when they expire we obviate whatever is left of their
original capacity (meaning after all the removals that were applied to
it). Our ledger algo is applied via Bulldozer, so automatic
re-computation is handled when a new grant/ removal is inserted in the
middle of the existing ones.

### Things we got rid of 
* No more automatic support for default products. You can use $0 plan
provisions to accomplish the same effect but it's manual
* Negative item quantity changes (i.e item removals) no longer can have
expiries



<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Enhanced payment processing pipeline with improved data consistency
and state management.
  * Advanced refund handling with comprehensive transaction tracking.
* Better tracking and management of customer item quantities and owned
products.
* Improved subscription lifecycle management including period-end
handling.

* **Bug Fixes**
  * Fixed payment data integrity verification.
  * Improved handling of edge cases in refund scenarios.

* **Chores**
  * Updated cSpell configuration with additional words.
  * Expanded developer documentation for linting workflows.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Konstantin Wohlwend <n2d4xc@gmail.com>
Co-authored-by: Aadesh Kheria <kheriaaadesh@gmail.com>
Co-authored-by: Mantra <87142457+mantrakp04@users.noreply.github.com>
2026-04-17 22:11:21 +00:00
BilalG1
9e342da0f2
Fix cron jobs using dev env instead of test env in CI workflows (#1319)
The custom-base-port and db-migration-backwards-compatibility workflows
were running cron jobs with `with-env:dev` instead of `with-env:test`,
causing ClickHouse sync mismatches in verify-data-integrity.

<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Chores**
* Streamlined CI test workflows to standardize background cron job
startup for more consistent test runs.
* **Tests**
* Improved end-to-end test reliability by aligning background process
behavior across suites.
* **Bug Fixes**
* Enhanced data verification reliability by ensuring external database
sync before integrity checks and tightening comparison ordering for
certain records, reducing false mismatch detections.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-09 21:27:18 -07:00
BilalG1
4f99c469fe
stack auth preview mode (#1307)
<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Preview mode: sandboxed experience with mock projects, placeholder
data, and disabled external integrations (payments, webhooks, email
rendering, session replays).
* One-click preview project creation and automatic preview sign-in for
quick access.

* **New Features — Walkthrough**
* Interactive guided walkthroughs with spotlight, animated cursor,
step-driven navigation, and targeted element hooks.

* **Style**
* UI/UX adjustments for preview: theme behavior, conditional
banners/alerts, informational alerts, and walkthrough attributes added
across pages.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-08 16:57:42 -07:00
Madison
63296fd3e0 chore(backend): align OpenAPI output with Mintlify and mirror specs to docs-mintlify.
- Normalize empty route path to / for valid OpenAPI path keys
- Drop invalid OAS3 top-level type on header parameters
- Write client/server/admin/webhooks JSON to docs-mintlify/openapi on codegen
2026-04-08 17:12:27 -05:00
BilalG1
8857dbaa48
clickhouse new syncs and verify-data (#1304)
<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* External DB sync now covers teams, team members, permissions,
invitations, email outbox, session replays, refresh tokens, and
connected accounts.
* New sequence ID fields and automatic change-flagging added to many
record types to enable incremental sync.

* **Improvements**
* Added concurrent indexes, faster/parallelized sync pipelines,
verification tooling, and richer observability.
* Dashboard sequencer stats expanded and end-to-end sync tests
significantly extended.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-04-08 14:43:22 -07:00
Mantra
d22593d535
private files n sm build shit (#1276)
- Introduced a fallback mechanism for the private sign-up risk engine,
allowing for zero-score assessments when the primary engine is
unavailable.
- Updated Next.js configuration to support dynamic resolution of the
private risk engine, including aliasing for both Turbopack and Webpack.
- Added a new fallback implementation in
`private-sign-up-risk-engine-fallback.ts` to ensure consistent behavior
during builds.
- Adjusted `risk-scores.tsx` to utilize the new compiled engine,
improving error handling and logging for risk assessment failures.

This update improves the robustness of the sign-up risk scoring system
and enhances the development experience by streamlining engine
resolution.

<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Improvements**
* Sign-up risk engine is initialized and validated at startup for more
predictable performance.
* If the risk engine is unavailable or invalid, the system immediately
returns safe zero-risk scores to avoid runtime failures.
* **Tests**
* End-to-end tests updated to match the new engine initialization and
detection behavior.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Konstantin Wohlwend <n2d4xc@gmail.com>
2026-03-23 12:31:36 -07:00
Aman Ganapathy
1d00ed2c64
[Fix]: Investigate Memory Leak on Verify Data Integrity (#1269)
### Context
We encountered an out of memory error when running verify-data-integrity
against the prod database. This was the error:
`FATAL ERROR: Ineffective mark-compacts near heap limit Allocation
failed - JavaScript heap out of memory`. This was one of the things
preventing verify-data-integrity from running successfully in prod.

### Summary of Changes
Local stress testing with constrained heap and memory telemetry revealed
that the rise in used heap memory was directly proportional to the
number of api calls. Investigation revealed that the `currentOutputData`
array was growing with each api call and was kept in memory. Since it
was still being appended to, it was actively kept in the heap. We
refactor the script to no longer use it, and for the two flags
`--save-output` and `--verify-output` that used it before, we refactor
them to not need to. `--save-output` now streams responses to disk as
JSONL and `--verify-output` now compares each response immediately and
discards it.
We also note a potential source of a future memory leak in the
`allUsers` array that is populated in memory for each project. We
refactor to paginate instead. Note that this didn't cause a memory leak
on local, this is a preventive measure.

### Out of Scope
fetching all transactions in the payments section of the script is
another potential cause for concern, but since the payments section of
the script will be refactored soon, we defer that discussion.
2026-03-23 08:55:10 -07:00
Konstantin Wohlwend
10a03a31ad Fix Docker build 2026-03-09 10:49:42 -07:00
Konstantin Wohlwend
00fd0eb4c8 Revert Docker build fix 2026-03-09 10:06:14 -07:00
Konstantin Wohlwend
48ac83e858 Fix Docker script 2026-03-08 14:34:55 -07:00
Konstantin Wohlwend
973e190875 Don't bundle @prisma/client
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
DB migration compat / Check if migrations changed (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Lint & build / lint_and_build (latest) (push) Has been cancelled
Dev Environment Test With Custom Base Port / restart-dev-and-test-with-custom-base-port (push) Has been cancelled
Dev Environment Test / restart-dev-and-test (push) Has been cancelled
Run setup tests with custom base port / setup-tests-with-custom-base-port (push) Has been cancelled
Run setup tests / setup-tests (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
DB migration compat / No migration changes (skipped) (push) Has been cancelled
2026-03-02 18:01:21 -08:00
Konstantin Wohlwend
ba51f19d6f Fix lint 2026-02-27 09:59:26 -08:00
Konstantin Wohlwend
37dea79fda Another build issue 2026-02-27 02:04:02 -08:00
Konstantin Wohlwend
74a4f5a601 More build stuff 2026-02-27 01:55:43 -08:00
Konstantin Wohlwend
48f0e998d5 More fix build? 2026-02-27 01:47:01 -08:00
Konstantin Wohlwend
48a8f0b072 Fix build 2026-02-27 00:48:07 -08:00
Konstantin Wohlwend
e0ea6834d0 Upgrade TypeScript 2026-02-27 00:28:35 -08:00
Konstantin Wohlwend
d63db64e19 Migrate from tsup to tsdown 2026-02-26 17:42:09 -08:00
BilalG1
145bcb7e92
Analytics event tracking (#1208)
<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Browser-side event tracker with batching, navigation & click capture
and background/keepalive delivery
* Server endpoint to accept batched analytics events and associate them
with session replay segments
* Client APIs to send analytics batches and integrate with session
replay

* **Bug Fixes / UX**
* Pausing replay now uses the UI-facing playback time for more accurate
pause positions
* Replay endpoint now returns a clear analytics-disabled error
(ANALYTICS_NOT_ENABLED) when analytics is off

* **Tests**
* End-to-end tests covering batch ingestion, validation, and replay
timing behavior
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-02-17 18:33:01 -08:00
BilalG1
fa27c80319
rename tabId to sessionReplaySegmentId (#1206)
<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Added new session replay analytics columns to ClickHouse for enhanced
tracking and reporting

* **Refactor**
* Renamed session recording segment identifier across APIs and data
models from `tab_id` to `session_replay_segment_id`
* Updated internal data structures and type definitions to align with
new naming convention

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-02-17 11:00:07 -08:00
Konsti Wohlwend
d319285403
Queries view (#1145) 2026-02-16 11:39:21 -08:00
BilalG1
907a98320a
Clickhouse sync fixing (#1198)
<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->
2026-02-16 11:30:38 -08:00
BilalG1
5b149bebaa
fix clickhouse flaky tests (#1196)
<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->
2026-02-13 13:05:35 -08:00
BilalG1
d09a180dfe
clickhouse user sync (#1159)
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
DB migrations are backwards-compatible / Check if migrations changed (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Lint & build / lint_and_build (latest) (push) Has been cancelled
Dev Environment Test With Custom Base Port / restart-dev-and-test-with-custom-base-port (push) Has been cancelled
Dev Environment Test / restart-dev-and-test (push) Has been cancelled
Run setup tests with custom base port / setup-tests-with-custom-base-port (push) Has been cancelled
Run setup tests / setup-tests (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
DB migrations are backwards-compatible / Test migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migrations are backwards-compatible / No migration changes (skipped) (push) Has been cancelled
<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Real-time AI search with project-scoped analytics and dynamic query
execution; streaming AI responses replace the placeholder flow.
* External DB sync adds ClickHouse support: users sync, sync metadata
tracking, tenancy-aware status, and per-mapping throttling.
* AI assistant UI shows expandable tool-invocation results and streams
via the real AI pipeline.

* **Chores**
* Dashboard dependencies and workspace exclusions updated; development
OpenAI env var added; editor config flag toggled.

* **Tests**
* E2E coverage extended to validate ClickHouse user sync and analytics
queries.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: aadesh18 <110230993+aadesh18@users.noreply.github.com>
Co-authored-by: Konsti Wohlwend <n2d4xc@gmail.com>
2026-02-12 16:52:20 -08:00
aadesh18
2055d98dea
External db sync (#1036)
<img width="1920" height="969" alt="Screenshot 2026-02-04 at 9 47 16 AM"
src="https://github.com/user-attachments/assets/d7d0cd04-0051-4fc4-b857-e6f87ee97a59"
/>

**This PR revolves around the following components**
1. Sequencer - sequences the updates in the internal db
2. Poller - polls for the latest updates to sync with the external db
3. Outgoing Request Handler - essentially a trigger that can make http
requests based on a change in the internal db
4. Sync Engine - syncs with the latest changes from the internal db to
the external db

**What has been done**
- Added a global sequence id for ProjectUser, ContactChannel and
DeletedRow.
- Added the deletedRow table to keep track of the rows that were deleted
across ProjectUser and ContactChannel.
- Added the OutgoingRequest table to keep track of the outgoing requests
- Added function for the sequencer to call to sequence updates
- Added a sequencer that sequences all the changes in the internal db
every 50 ms
- Added a poller that polls for the latest changes in the internal db
every 50 ms, and adds to a queue
- Added a Vercel cron that calls sequencer and poller every minute
- Added a queue that fulfills the outgoing requests by making http calls
(for external db sync, it calls the sync engine endpoint)
- Added a sync engine that uses the defined sql mapping query in the
user's schema to pull in the changes for the user, and sync them with
the external db
- Added tests to test out each functionality


**How to review this PR:**
1. Review the migrations (sequence id, deletedRow, triggers, backlog
sync) (all files created under the migrations folder)
2. Review sequencer
3. Review poller
4. Review the changes in schema
5. Review sync-engine (the function, and it's helper file)
6. Review the schema changes, and query mappings
7. Review the tests (basic, advanced and race, along with the helper
file)
8. Review the changes made in Dockerfile to support local testing using
the postgres docker

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Introduces a cron-driven external DB sync pipeline with global
sequencing, internal poller and webhook sync engine, new DB
tables/functions, config schema/mappings, and comprehensive e2e tests.
> 
> - **Database (Prisma/Migrations)**:
> - Add global sequence (`global_seq_id`) and
`sequenceId`/`shouldUpdateSequenceId` to `ProjectUser`,
`ContactChannel`, `DeletedRow` with partial indexes.
> - Create `DeletedRow` (capture deletes) and `OutgoingRequest` (queue)
tables; add unique/indexes.
> - Add triggers/functions: `log_deleted_row`,
`reset_sequence_id_on_update`, `backfill_null_sequence_ids`,
`enqueue_tenant_sync`.
> - **Backend/API**:
> - New internal routes: `GET
/api/latest/internal/external-db-sync/sequencer`, `GET /poller`, `POST
/sync-engine` (Upstash-verified) for sync orchestration.
> - Add cron wiring: `vercel.json` schedules and local
`scripts/run-cron-jobs.ts`; start in dev via `dev` script.
> - Tweak route handler (remove noisy logging) without behavior change.
> - **Sync Engine**:
> - Implement `src/lib/external-db-sync.ts` to read tenant mappings and
upsert to external Postgres (schema bootstrap, param checks,
sequencing).
> - Add default mappings `DEFAULT_DB_SYNC_MAPPINGS` and config schema
`dbSync.externalDatabases` in shared config.
> - **Testing/Infra**:
> - Add extensive e2e tests (basics, advanced, race conditions) for
sequencing, idempotency, deletes, pagination, multi-mapping, and
permissions.
> - Docker compose: add `external-db-test` Postgres for tests; e2e deps
for `pg` types.
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
3f2a8efcfb. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* External PostgreSQL sync: automatic, batched replication with
mappings, resume/idempotency, and on-demand enqueueing.

* **Admin UI**
* Real-time External DB Sync dashboard and status API showing
per-mapping backlog, sequencer/poller/sync-engine telemetry, and fusebox
controls.

* **Tests**
* Large e2e suite: basic, advanced, race, high-volume tests and test
utilities for external DB sync.

* **Chores**
* DB migrations, CI/workflow updates, background cron runner and
local/dev test support.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Konsti Wohlwend <n2d4xc@gmail.com>
Co-authored-by: Bilal Godil <bg2002@gmail.com>
2026-02-05 12:04:31 -08:00
Konstantin Wohlwend
097c0310c4 Check all users when verifying data integrity 2026-02-03 10:00:30 -08:00
Konstantin Wohlwend
4c22b37fdf --no-bail for verify-data-integrity script 2026-01-28 13:53:28 -08:00
Konstantin Wohlwend
8fd5b13a3b TokenRefreshEventType 2026-01-28 11:18:15 -08:00
BilalG1
484c3a6332
clickhouse setup (#1032) 2026-01-28 09:12:33 -08:00
BilalG1
e439bd0b7e
verify payment transactions integrity (#1128)
<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Added a comprehensive payments data-integrity verifier, Stripe payout
reconciliation, API validation helpers, and a throttled progress utility
for long-running checks.

* **Bug Fixes**
* Improved subscription/product filtering to correctly respect customer
type during verification.

* **Chores**
* Reorganized verification scripts and updated the verification
entrypoint invocation.

* **Tests**
* Enhanced test fixtures to include full product data for subscriptions.

<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2026-01-27 21:17:43 +00:00
Konstantin Wohlwend
e574f526fa Import fixes 2026-01-23 11:52:54 -08:00
Konstantin Wohlwend
0aeb120aa8 Make DB migration script interactive 2026-01-23 11:52:25 -08:00
Konsti Wohlwend
8f74949a7f
Speed up tests (#1063) 2025-12-28 11:25:04 -08:00
Konsti Wohlwend
b4ae80874e
Upgrade Prisma to v7 (#1064) 2025-12-26 08:13:34 -08:00
Konstantin Wohlwend
7bd91dcf93 fixes? 2025-12-12 17:29:57 -08:00
Konsti Wohlwend
e7e792d462
Email outbox backend (#1030) 2025-12-12 10:26:38 -08:00
BilalG1
b5b311554b
Metrics Endpoint Speed (#966)
<img width="567" height="249" alt="Screenshot 2025-10-20 at 11 23 10 AM"
src="https://github.com/user-attachments/assets/340df844-f619-489f-8d41-cc26bc165018"
/>
<img width="595" height="255" alt="Screenshot 2025-10-20 at 11 24 00 AM"
src="https://github.com/user-attachments/assets/9321bda1-e6f0-4f53-8c6b-e29d0fc16038"
/>

<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->

<!-- RECURSEML_SUMMARY:START -->
## High-level PR Summary
This PR optimizes the performance of user list and metrics endpoints by
refactoring SQL queries to use more efficient patterns. The changes
include rewriting queries to use `LATERAL` joins and CTEs with proper
filtering, extracting common user mapping logic into reusable functions,
and adding performance tests with SQL scripts to generate realistic test
data (10,000 mock users and activity events across 100 countries).

⏱️ Estimated Review Time: 30-90 minutes

<details>
<summary>💡 Review Order Suggestion</summary>

| Order | File Path |
|-------|-----------|
| 1 | `apps/e2e/tests/backend/performance/mock-users.sql` |
| 2 | `apps/e2e/tests/backend/performance/mock-metric-events.sql` |
| 3 | `apps/e2e/tests/backend/performance/users-list.test.ts` |
| 4 | `apps/backend/src/app/api/latest/users/crud.tsx` |
| 5 | `apps/backend/src/app/api/latest/internal/metrics/route.tsx` |
</details>



[![Need help? Join our
Discord](https://img.shields.io/badge/Need%20help%3F%20Join%20our%20Discord-5865F2?style=plastic&logo=discord&logoColor=white)](https://discord.gg/n3SsVDAW6U)


[![Analyze latest
changes](f22b2c44a1/?repo_owner=stack-auth&repo_name=stack-auth&pr_number=966)
<!-- RECURSEML_SUMMARY:END -->
<!-- ELLIPSIS_HIDDEN -->


----

> [!IMPORTANT]
> Optimize metrics and user list endpoints with SQL refactoring,
caching, and performance tests, adding a `CacheEntry` model and mock
data scripts.
> 
>   - **Performance Optimization**:
> - Refactor SQL queries in `route.tsx` to use `LATERAL` joins and CTEs
for efficient data retrieval.
> - Implement caching in `route.tsx` using `getOrSetCacheValue()` to
reduce database load.
>   - **Database Changes**:
> - Add `CacheEntry` model to `schema.prisma` and create corresponding
table and index in `migration.sql`.
> - Remove auto-migration metadata step from
`check-prisma-migrations.yaml`.
>   - **Testing**:
> - Add performance tests in `metrics.test.ts` to benchmark metrics and
user endpoints.
> - Create mock data scripts `mock-users.sql` and
`mock-metric-events.sql` for testing with 10,000 users and events across
100 countries.
>   - **Miscellaneous**:
> - Update `db-migrations.ts` to include new migration file generation
logic.
>     - Add `cache.tsx` for caching logic implementation.
> 
> <sup>This description was created by </sup>[<img alt="Ellipsis"
src="https://img.shields.io/badge/Ellipsis-blue?color=175173">](https://www.ellipsis.dev?ref=stack-auth%2Fstack-auth&utm_source=github&utm_medium=referral)<sup>
for 4d9be71063. You can
[customize](https://app.ellipsis.dev/stack-auth/settings/summaries) this
summary. It will automatically update as commits are pushed.</sup>

----


<!-- ELLIPSIS_HIDDEN -->

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Metrics now use a cache layer with per-entry TTL and tenancy-aware
loaders.

* **Bug Fixes**
* Improved accuracy of daily active and related metrics with
tenancy-aware counting and more robust last-active computation.

* **Performance**
* Faster metrics responses via batched reads and cache-backed endpoints.

* **Tests**
* Added end-to-end performance benchmarks and SQL seed scripts for
metrics/user load testing.

* **Chores**
* DB migration added support for cached entries; CI migration check flow
adjusted; migration tooling improved.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Konsti Wohlwend <n2d4xc@gmail.com>
2025-11-05 16:24:04 -08:00
Konstantin Wohlwend
cd02113441 several changes 2025-10-14 12:23:22 -07:00
Konstantin Wohlwend
1ed9c6150f Use custom migration script for self-hosting container 2025-10-14 11:29:41 -07:00
Konsti Wohlwend
8a77e07f19
Rename offer to product, offer group to product catalog (#914)
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
Docker Emulator Test / docker (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Test / docker (push) Has been cancelled
Runs E2E API Tests / build (22.x) (push) Has been cancelled
Runs E2E API Tests with external source of truth / build (22.x) (push) Has been cancelled
Lint & build / lint_and_build (latest) (push) Has been cancelled
Dev Environment Test / restart-dev-and-test (push) Has been cancelled
Run setup tests / setup-tests (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
<!--

Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md

-->

<!-- RECURSEML_SUMMARY:START -->
## High-level PR Summary
This PR implements a comprehensive renaming of "offer" to "product" and
"offer group" to "product catalog" throughout the codebase. The changes
include database migrations, schema updates, API compatibility layers,
function renames, and updates to client and server implementations.
Backwards compatibility is maintained through migration layers that
handle requests using the old terminology, translating them to the new
terminology before processing. The PR includes documentation of this
approach in CLAUDE-KNOWLEDGE.md. This rename affects multiple parts of
the system including the database schema, API endpoints, error types,
and SDK interfaces.

⏱️ Estimated Review Time: 1-3 hours

<details>
<summary>💡 Review Order Suggestion</summary>

| Order | File Path |
|-------|-----------|
| 1 |
`apps/backend/prisma/migrations/20250923191615_rename_offers_to_products/migration.sql`
|
| 2 |
`apps/backend/src/app/api/migrations/v2beta1/payments/purchases/offers-compat.ts`
|
| 3 |
`apps/backend/src/app/api/migrations/v2beta1/payments/purchases/create-purchase-url/route.ts`
|
| 4 |
`apps/backend/src/app/api/migrations/v2beta1/payments/purchases/validate-code/route.ts`
|
| 5 | `apps/backend/src/lib/payments.tsx` |
| 6 | `.claude/CLAUDE-KNOWLEDGE.md` |
| 7 | `packages/stack-shared/src/schema-fields.ts` |
| 8 | `packages/stack-shared/src/known-errors.tsx` |
| 9 | `packages/stack-shared/src/config/schema.ts` |
| 10 | `packages/template/src/lib/stack-app/customers/index.ts` |
| 11 |
`packages/template/src/lib/stack-app/apps/implementations/client-app-impl.ts`
|
| 12 |
`packages/template/src/lib/stack-app/apps/implementations/server-app-impl.ts`
|
</details>



[![Need help? Join our
Discord](https://img.shields.io/badge/Need%20help%3F%20Join%20our%20Discord-5865F2?style=plastic&logo=discord&logoColor=white)](https://discord.gg/n3SsVDAW6U)

<!-- RECURSEML_SUMMARY:END -->
<!-- ELLIPSIS_HIDDEN -->


----

> [!IMPORTANT]
> Renames 'offer' to 'product' and 'offer group' to 'product catalog'
across the codebase, updating database schema, API endpoints, and
application logic for consistency and backward compatibility.
> 
>   - **Database**:
> - Rename columns `offer` to `product` and `offerId` to `productId` in
`OneTimePurchase` and `Subscription` tables in `migration.sql`.
>   - **API & Migrations**:
> - Update API endpoints to accept `product_id`/`product_inline` instead
of `offer_id`/`offer_inline`.
> - Add `v2beta5` compatibility layer to map legacy `offer` fields to
`product` equivalents.
>   - **Shared Schemas**:
> - Rename `offerSchema` to `productSchema` and related schemas in
`schema-fields.ts`.
>   - **Server Implementation**:
> - Update `createCheckoutUrl` method in `server-app-impl.ts` to use
`productId`/`InlineProduct`.
>   - **Tests**:
> - Update tests to reflect renaming in `backend-helpers.ts` and other
test files.
>   - **Miscellaneous**:
>     - Remove dummy data related to offers in `dummy-data.tsx`.
> - Update documentation and comments to reflect terminology changes.
> 
> <sup>This description was created by </sup>[<img alt="Ellipsis"
src="https://img.shields.io/badge/Ellipsis-blue?color=175173">](https://www.ellipsis.dev?ref=stack-auth%2Fstack-auth&utm_source=github&utm_medium=referral)<sup>
for e3227bcbd2. You can
[customize](https://app.ellipsis.dev/stack-auth/settings/summaries) this
summary. It will automatically update as commits are pushed.</sup>

----


<!-- ELLIPSIS_HIDDEN -->

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Backwards-compatibility: legacy offer_id/offer_inline requests are
accepted, normalized, and routed to product-based handlers.

* **Refactor**
* Global rename from Offer/Group → Product/Catalog across UI, APIs,
types, client/server interfaces, and error codes.

* **Bug Fixes**
* Responses, webhooks and UI consistently surface product_display_name
and product-related metadata.

* **Documentation**
* Migration notes and docs updated to explain compatibility and
parameter changes.

* **Tests**
  * Unit and E2E suites updated to cover product/catalog flows.

* **Chores**
  * Database schema migration, seed and config updates applied.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Renames offers→products and groups→catalogs end-to-end (DB, APIs,
schemas, UI, SDK, docs), adding v2beta5 compatibility to accept legacy
offer fields while updating all internals.
> 
> - **Backend/DB**:
> - Prisma migration: rename `offer`/`offerId`→`product`/`productId` in
`OneTimePurchase` and `Subscription`.
> - Update Stripe webhook, purchase-session, and internal test-mode
flows to use `product*` metadata/fields.
> - **API & Migrations**:
>   - Latest endpoints now accept `product_id`/`product_inline`.
> - Add `v2beta5` compat layer mapping legacy `offer_id`/`offer_inline`
to product equivalents; responses alias conflicting products.
> - **Shared Schemas/Errors/Config**:
> - `offerSchema`→`productSchema`,
`inlineOfferSchema`→`inlineProductSchema`, prices/types renamed.
>   - KnownErrors renamed (e.g., `PRODUCT_DOES_NOT_EXIST`).
> - Config: `groups`→`catalogs`, defaults/migrations updated; improved
override validation messages; ID regex loosened; formatter tweaks; add
schema fuzzer tests.
> - **Payments Lib**:
> - Rename APIs and logic (`offers`→`products`, `groupId`→`catalogId`),
subscription and item-quantity computation updated.
> - **Dashboard/UI**:
> - Routes, dialogs, editors, tables, and code samples switched to
products/catalogs; removed offers dummy data.
> - **SDK/Template**:
> - Client/server `createCheckoutUrl` now uses
`productId`/`InlineProduct`.
> - **Tests/Docs/Utilities**:
>   - E2E and unit tests updated; add legacy (pre-rename) tests.
> - Docs and knowledge base revised; minor script tweaks (recent-first,
limits).
> 
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
e6e20ecd72. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: BilalG1 <bg2002@gmail.com>
2025-10-04 02:28:28 -07:00
Konstantin Wohlwend
914942dd2f Migration summary 2025-08-25 10:13:18 -07:00
Zai Shi
e1c5018bd8 fix db:migration-gen 2025-07-30 09:46:27 -07:00
Konsti Wohlwend
7c0417d7d9
Several project config improvements (#811) 2025-07-29 04:13:46 -07:00