stack

mirror of https://github.com/stack-auth/stack.git synced 2026-06-04 21:04:37 +08:00

Author	SHA1	Message	Date
BilalG1	85ae4b1c9e	Fix ClickHouse OOM in MAU query + optimize /internal/metrics route (#1344)

## Summary

Fixes the Sentry `StackAssertionError: Failed to load monthly active users for internal metrics` crash (ClickHouse OOM at the 7.2 GiB per-query cap) and applies two related optimizations to other queries in the same route while here. Adds a local benchmark harness that validates correctness and measures peak memory / duration before & after.

## Root cause (the original Sentry error)

`loadMonthlyActiveUsers` was written as `SELECT user_id … GROUP BY user_id` and then counting in Node via a `Set`. On a large project that ships back millions of user_ids. Two failure modes stacked:

1. Result materialization: every distinct user_id had to be buffered in the server before streaming to Node (~20 MiB of result for 450k users; much more at real scale).
2. `JSONExtract(toJSONString(data), 'is_anonymous', 'UInt8')`: the `toJSONString(data)` per-row re-serialization of the entire nested JSON column, billions of times, just to pull one boolean. Dominates bytes-read.

Combined, on a single partition read from S3-backed MergeTree, this can exceed ClickHouse's 7.2 GiB per-query memory cap. That's exactly what the Sentry trace showed.

## Changes

### 1. Fix MAU query (`loadMonthlyActiveUsers`)

Moved counting to the server with `uniqExact(sipHash64(normalized_user_id))` and pulled the JS-side normalization (`lower`, `trim`, `isUuid`) into SQL. Picked `sipHash64` after benchmarking 7 variants: it's exact (at <<2³² users) and halves the uniqExact hash-state vs. raw string keys.

### 2. Fix 1: `JSONExtract(toJSONString(data), …)` → direct `CAST(data.is_anonymous, …)`

Applied everywhere the pattern appeared in the metrics route:

- `loadDailyActiveUsers`
- the `analyticsUserJoin` subquery
- the `nonAnonymousAnalyticsUserFilter`
- `analyticsOverview:topRegion`
- `analyticsOverview:online`

Semantics preserved (`coalesce(CAST(data.is_anonymous, 'Nullable(UInt8)'), 0)` matches `JSONExtract(…, 'UInt8')` behavior when the field is missing).

### 3. Fix 3: server-aggregate the split queries

`loadDailyActiveUsersSplit` and `loadDailyActiveTeamsSplit` used to ship 1.2M+ `(day, user_id)` rows back to Node just so the JS could bucket them into new / retained / reactivated. Rewrote both as one CTE-style query that returns 31 rows (one per day in the 30-day window) with the counts precomputed.

Minor semantic shift (documented inline in `route.tsx`): "new" is now based on the user's first-ever `$token-refresh` event rather than their Postgres `signedUpAt`. Agrees for users who log in immediately after sign-up (the common case). Disagrees for the rare edge case of an account that existed pre-window but never generated a `$token-refresh` until now: old code classified as "reactivated," new code classifies as "new." Judged acceptable; can be revisited.

Postgres round-trips for `ProjectUser.signedUpAt` / `Team.createdAt` are no longer needed for the split, and the 76 MiB-ish wire ship is gone.

### 4. Benchmark harness (`apps/backend/scripts/benchmark-internal-metrics.ts`)

Local-only tool. Three modes:

- MAU equivalence matrix: 13 edge cases (empty, dedup, anonymous filter, window boundary, null user_id, non-UUID user_id, case variation, project isolation, missing/null `is_anonymous`, wrong event_type). Asserts OLD pipeline and NEW query return the same set of users, not just the same count.
- MAU perf: OLD vs NEW plus 6 other candidate variants (inline regex, UUID keys, sipHash64, HLL sketches), reads `memory_usage` / `read_rows` / `result_bytes` from `system.query_log` for each, prints a ranked table.
- Full-route benchmark (`BENCH_ROUTE_QUERIES=1`): runs every ClickHouse query in `/internal/metrics` in three stages (BEFORE, AFTER, candidate OPTIMIZED) against the same seed and prints per-query deltas plus endpoint-level totals.

Seeds under a synthetic `project_id` so real data is never touched; cleans up on exit via `ALTER TABLE … DELETE`.

## Benchmark results

### MAU query alone

Ran at two scales; set-equality verified (new query identifies the same individual users, not just the same count).

| seed | MAU | peak memory (old → new) | bytes read | duration |
|---|---|---|---|---|
| 500k events | 89,939 | 158.7 MiB → 46.7 MiB (3.4×, −70%) | 175.7 MiB → 63.0 MiB (2.8×) | 483 ms → 76 ms (6.4×) |
| 2.5M events | 449,990 | 439.2 MiB → 281.4 MiB (1.56×, −36%) | 865.0 MiB → 310.9 MiB (2.8×) | 783 ms → 126 ms (6.2×) |

MAU variant bake-off at 2.5M events (all exact, all set-equal to OLD):

| variant | memory | duration | notes |
|---|---|---|---|
| v0_old (baseline) | 440 MiB | 567 ms | - |
| v1_uniqExact_string | 284 MiB | 110 ms | naive fix |
| v3_uniqExact_toUUID | 244 MiB | 153 ms | UUID keys, slower per-row |
| v4_uniqExact_sipHash64 | 125 MiB | 95 ms | shipped |
| v5_uniq (HLL) ~approx | 30 MiB | 86 ms | −0.25% error |
| v6_uniqCombined ~approx | 31 MiB | 67 ms | −0.15% error |

### Full `/internal/metrics` route (2.7M events, 300k users + page-views + clicks + teams)

Ranked by BEFORE peak memory:

| query | mem BEFORE | mem AFTER | Δ mem | dur BEFORE | dur AFTER | Δ dur |
|---|---|---|---|---|---|---|
| analyticsOverview:topReferrers | 588.1 MiB | 411.1 MiB | 1.43× | 1833 ms | 110 ms | 16.66× |
| analyticsOverview:totalVisitors | 584.3 MiB | 403.5 MiB | 1.45× | 1829 ms | 121 ms | 15.12× |
| analyticsOverview:dailyEvents | 584.1 MiB | 403.7 MiB | 1.45× | 1897 ms | 140 ms | 13.55× |
| loadUsersByCountry | 393.1 MiB | 385.4 MiB | ≈same | 74 ms | 80 ms | ≈same |
| loadDailyActiveUsersSplit | 363.4 MiB | 396.8 MiB | +9% | 1966 ms | 356 ms | 5.52× |
| analyticsOverview:topRegion | 269.9 MiB | 106.4 MiB | 2.54× | 1602 ms | 65 ms | 24.65× |
| loadDailyActiveUsers | 268.3 MiB | 84.0 MiB | 3.19× | 1111 ms | 44 ms | 25.25× |
| loadDailyActiveTeamsSplit | 59.6 MiB | 78.1 MiB | +31% | 70 ms | 123 ms | +76% |
| loadMonthlyActiveUsers | 54.9 MiB | 54.9 MiB | ≈same | 68 ms | 56 ms | ≈same |
| analyticsOverview:online | 18.4 MiB | 5.8 MiB | 3.17× | 58 ms | 4 ms | 14.50× |

Endpoint-level totals

| metric | BEFORE | AFTER | Δ |
|---|---|---|---|
| Sum peak ClickHouse memory | 3.11 GiB | 2.28 GiB | −27% |
| Max query duration (endpoint wall-clock floor) | 1966 ms | 356 ms | −82% (5.5×) |
| Sum query duration (total CPU) | 10508 ms | 1099 ms | −90% (9.6×) |
| Bytes read | 10.70 GiB | 4.55 GiB | −57% |
| Bytes shipped to Node | 94.8 MiB | 44.2 KiB | −99.95% |

Both split queries show a small memory regression at this seed size (the new server-side window-function + self-join has its own state cost that's near break-even with "materialize + ship" at 300k users); at prod scale the 76 MiB-ship saving dominates. Duration is unambiguously better.

## Why we don't need to drop the `analyticsUserJoin` in this PR

The benchmark includes an OPTIMIZED stage that drops the LEFT JOIN and trusts `e.data.is_anonymous` directly, which would shave another 1.2 GiB / 1.9× duration off the endpoint. But we can't ship that here: an audit of the client tracker (`packages/js/src/lib/stack-app/apps/implementations/event-tracker.ts`) confirmed `is_anonymous` is never set on client-emitted `$page-view` / `$click` events. The JOIN is currently load-bearing.
A follow-up PR will enrich `is_anonymous` at the batch ingest endpoint using `auth.user.is_anonymous`; after one metrics-window cycle (~30 days) the JOIN can be dropped.

## Follow-up work (out of scope for this PR)

- Batch-endpoint enrichment + drop the analytics-overview LEFT JOIN (est. further −53% endpoint memory, −46% duration per the benchmark).
- Teams-split hash-variant count mismatch: the `sipHash64(team_id)` variant of the teams split shows a count discrepancy vs. the string-keyed version in the benchmark. Not blocking since teams-split is only #8 by memory; needs a root-cause pass before shipping that particular optimization.
- `loadUsersByCountry` window bound: currently scans every `$token-refresh` event ever for the tenancy (no time filter). Bounding to 30 days would bound memory growth with project age, but changes semantics ("country of latest login ever" → "in last 30 days"). Deferred because it's product-facing.

## Snapshot changes in `internal-metrics.test.ts.snap`

The `should return metrics data with users` test signs in 10 users today, then deletes one of them mid-test. Two small snapshot values change on today's date; both are just a reclassification of that single deleted user, and the total (10 active users) is unchanged.

- `daily_active_users_split.new[today]`: 9 → 10. All 10 users really did sign in for the first time today. The old code only counted 9 because the deleted user's Postgres row was gone by the time the metrics query ran, so the old classifier couldn't see they were created today. The new query looks at ClickHouse events directly, sees the deleted user's first event was today, and counts them as new like everyone else.
- `daily_active_users_split.reactivated[today]`: 1 → 0. No user was "reactivated" today: nobody was active on an earlier day and came back. The old "1" was the deleted user falling into this bucket by default (the old classifier had no other rule that fit them). The new code correctly reports zero.

Totals match either way (9 + 1 = 10 + 0). We're moving one deleted user out of the "returning visitor" bucket and into the "brand-new user" bucket, which is what they actually were.

## Test plan

- [x] `pnpm typecheck` and `pnpm lint` pass on the backend package
- [x] MAU equivalence matrix: 13/13 cases return the same set of users (not just the same count) between OLD and NEW pipelines
- [x] Set-equality verified at 500k-MAU perf scale
- [x] Full-route benchmark confirms the expected memory / duration improvements
- [ ] Sanity-check the dashboard rendering after deploy (split charts, MAU counter, analytics overview)
- [ ] Monitor Sentry for the assertion error, which should drop to zero

## Summary by CodeRabbit

* Performance Improvements
  * Monthly and daily active metrics are now computed entirely server-side for faster queries and reduced client-side processing.
* Bug Fixes
  * More consistent handling of anonymous/missing IDs and stricter ID filtering to improve accuracy across edge cases.
* Tests
  * Added a comprehensive benchmark and validation harness to measure query performance and verify result equivalence across variants.	2026-04-19 22:57:46 -07:00
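The MAU equivalence check described in the commit above compares sets of users, not counts. As a rough illustration only, the reference semantics (trim, lowercase, UUID check, anonymous filter, dedup) could be sketched in TypeScript as below. The `AnalyticsEvent` shape, its field names, the UUID regex, and the half-open window are assumptions made for this sketch; they are not taken from `route.tsx` or the ClickHouse schema.

```typescript
// Hypothetical event shape for illustration; not the real ClickHouse schema.
type AnalyticsEvent = {
  userId: string | null;
  isAnonymous?: boolean | null; // missing or null is treated as "not anonymous"
  eventAtMs: number;
};

// Assumed UUID shape check, applied after trimming and lowercasing.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/;

// Normalize a raw id; returns null for null ids and for anything that is not a UUID.
function normalizeUserId(raw: string | null): string | null {
  if (raw === null) return null;
  const normalized = raw.trim().toLowerCase();
  return UUID_RE.test(normalized) ? normalized : null;
}

// Set of distinct non-anonymous users with an event in [windowStartMs, windowEndMs).
// The half-open window is an assumption of this sketch.
function monthlyActiveUsers(
  events: AnalyticsEvent[],
  windowStartMs: number,
  windowEndMs: number,
): Set<string> {
  const users = new Set<string>();
  for (const event of events) {
    if (event.eventAtMs < windowStartMs || event.eventAtMs >= windowEndMs) continue;
    if (event.isAnonymous === true) continue;
    const id = normalizeUserId(event.userId);
    if (id !== null) users.add(id);
  }
  return users;
}
```

A server-side count is equivalent to this reference when it equals the size of the returned set and, for the stricter check, when the two sets contain the same ids.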
Konstantin Wohlwend	f85b4f3997	Make Bulldozer SQL statements deterministic	2026-04-18 16:43:26 -07:00
Aman Ganapathy	665870a144	[Fix] Bulldozer Studio and SpaceTime DB port conflict (#1346)	2026-04-17 17:56:11 -07:00
Aman Ganapathy	1de8a17183	Payments bulldozer txn rework (#1315)

### Object of this PR

This PR is NOT a monolithic series of fixes for the payments suite plus a complete rework. Its aims were:

a) introducing and robustly testing the bulldozer db system
b) reworking the underlying payments architecture to use bulldozer for correctness and scalability
c) achieving parity with the old payments system, except for a few changes such as ensuring correctness of the ledger algo

There may still be some work to do with handling refunds, decoupling the concepts of purchases from that of products, and some other things.

### Ledger Algorithm

This has been tuned and fixed. Item removals, i.e. negative item quantity changes, apply to the soonest-expiring item grant, i.e. positive item quantity change. This is what is best for the user. Item grants can also expire, and when they expire we void whatever is left of their original capacity (meaning what remains after all the removals that were applied to it). Our ledger algo is applied via Bulldozer, so automatic re-computation is handled when a new grant/removal is inserted in the middle of the existing ones.

### Things we got rid of

* No more automatic support for default products. You can use $0 plan provisions to accomplish the same effect, but it's manual
* Negative item quantity changes (i.e. item removals) can no longer have expiries

## Summary by CodeRabbit

* New Features
  * Enhanced payment processing pipeline with improved data consistency and state management.
  * Advanced refund handling with comprehensive transaction tracking.
  * Better tracking and management of customer item quantities and owned products.
  * Improved subscription lifecycle management including period-end handling.
* Bug Fixes
  * Fixed payment data integrity verification.
  * Improved handling of edge cases in refund scenarios.
* Chores
  * Updated cSpell configuration with additional words.
  * Expanded developer documentation for linting workflows.

---------

Co-authored-by: Konstantin Wohlwend <n2d4xc@gmail.com>
Co-authored-by: Aadesh Kheria <kheriaaadesh@gmail.com>
Co-authored-by: Mantra <87142457+mantrakp04@users.noreply.github.com>	2026-04-17 22:11:21 +00:00
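The ledger rule in the commit above (removals consume the soonest-expiring grant, and an expired grant forfeits whatever capacity it has left) can be illustrated with a small TypeScript sketch. This is not the Bulldozer implementation: the `Grant` and `Removal` types, the function name `balanceAt`, and the choice to clamp a removal that exceeds the available grants are all assumptions made for illustration.

```typescript
// Hypothetical types; the real payments schema is not shown in the commit.
type Grant = { id: string; quantity: number; grantedAtMs: number; expiresAtMs: number | null };
type Removal = { quantity: number; removedAtMs: number };

// Grants without an expiry sort last, so they are consumed only after expiring ones.
const expiryKey = (g: Grant): number => g.expiresAtMs ?? Number.POSITIVE_INFINITY;

function isActive(g: Grant, atMs: number): boolean {
  return g.grantedAtMs <= atMs && (g.expiresAtMs === null || g.expiresAtMs > atMs);
}

// Item balance at nowMs. Recomputed from scratch, so inserting a grant or removal
// "in the middle" just means calling this again with the updated lists.
function balanceAt(grants: Grant[], removals: Removal[], nowMs: number): number {
  const remaining = new Map<string, number>();
  for (const g of grants) remaining.set(g.id, g.quantity);

  const ordered = removals
    .filter((r) => r.removedAtMs <= nowMs)
    .sort((a, b) => a.removedAtMs - b.removedAtMs);

  for (const removal of ordered) {
    // Only grants active at the time of the removal can absorb it.
    const eligible = grants
      .filter((g) => isActive(g, removal.removedAtMs))
      .sort((x, y) => {
        const kx = expiryKey(x);
        const ky = expiryKey(y);
        return kx === ky ? 0 : kx < ky ? -1 : 1;
      });
    let left = removal.quantity;
    for (const g of eligible) {
      if (left === 0) break;
      const available = remaining.get(g.id) ?? 0;
      const take = Math.min(available, left);
      remaining.set(g.id, available - take);
      left -= take;
    }
    // Assumption: any part of the removal that no grant can absorb is dropped.
  }

  // Expired grants contribute nothing, which voids their leftover capacity.
  let balance = 0;
  for (const g of grants) {
    if (isActive(g, nowMs)) balance += remaining.get(g.id) ?? 0;
  }
  return balance;
}
```

With one grant of 10 that expires and one grant of 10 that does not, a removal of 4 is charged to the expiring grant, so the customer still holds the full non-expiring 10 after the first grant lapses. Charging the non-expiring grant instead would leave only 6, which is why the soonest-expiring rule favors the user.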
BilalG1	9e342da0f2	Fix cron jobs using dev env instead of test env in CI workflows (#1319)

The custom-base-port and db-migration-backwards-compatibility workflows were running cron jobs with `with-env:dev` instead of `with-env:test`, causing ClickHouse sync mismatches in verify-data-integrity.

## Summary by CodeRabbit

* Chores
  * Streamlined CI test workflows to standardize background cron job startup for more consistent test runs.
* Tests
  * Improved end-to-end test reliability by aligning background process behavior across suites.
* Bug Fixes
  * Enhanced data verification reliability by ensuring external database sync before integrity checks and tightening comparison ordering for certain records, reducing false mismatch detections.	2026-04-09 21:27:18 -07:00
BilalG1	4f99c469fe	stack auth preview mode (#1307)

## Summary by CodeRabbit

* New Features
  * Preview mode: sandboxed experience with mock projects, placeholder data, and disabled external integrations (payments, webhooks, email rendering, session replays).
  * One-click preview project creation and automatic preview sign-in for quick access.
* New Features: Walkthrough
  * Interactive guided walkthroughs with spotlight, animated cursor, step-driven navigation, and targeted element hooks.
* Style
  * UI/UX adjustments for preview: theme behavior, conditional banners/alerts, informational alerts, and walkthrough attributes added across pages.	2026-04-08 16:57:42 -07:00
Madison	63296fd3e0	chore(backend): align OpenAPI output with Mintlify and mirror specs to docs-mintlify.

- Normalize empty route path to / for valid OpenAPI path keys
- Drop invalid OAS3 top-level type on header parameters
- Write client/server/admin/webhooks JSON to docs-mintlify/openapi on codegen	2026-04-08 17:12:27 -05:00
BilalG1	8857dbaa48	clickhouse new syncs and verify-data (#1304)

## Summary by CodeRabbit

* New Features
  * External DB sync now covers teams, team members, permissions, invitations, email outbox, session replays, refresh tokens, and connected accounts.
  * New sequence ID fields and automatic change-flagging added to many record types to enable incremental sync.
* Improvements
  * Added concurrent indexes, faster/parallelized sync pipelines, verification tooling, and richer observability.
  * Dashboard sequencer stats expanded and end-to-end sync tests significantly extended.	2026-04-08 14:43:22 -07:00
Mantra	d22593d535	private files n sm build shit (#1276)

- Introduced a fallback mechanism for the private sign-up risk engine, allowing for zero-score assessments when the primary engine is unavailable.
- Updated Next.js configuration to support dynamic resolution of the private risk engine, including aliasing for both Turbopack and Webpack.
- Added a new fallback implementation in `private-sign-up-risk-engine-fallback.ts` to ensure consistent behavior during builds.
- Adjusted `risk-scores.tsx` to utilize the new compiled engine, improving error handling and logging for risk assessment failures.

This update improves the robustness of the sign-up risk scoring system and enhances the development experience by streamlining engine resolution.

## Summary by CodeRabbit

* Improvements
  * Sign-up risk engine is initialized and validated at startup for more predictable performance.
  * If the risk engine is unavailable or invalid, the system immediately returns safe zero-risk scores to avoid runtime failures.
* Tests
  * End-to-end tests updated to match the new engine initialization and detection behavior.

---------

Co-authored-by: Konstantin Wohlwend <n2d4xc@gmail.com>	2026-03-23 12:31:36 -07:00
Aman Ganapathy	1d00ed2c64	[Fix]: Investigate Memory Leak on Verify Data Integrity (#1269)

### Context

We encountered an out-of-memory error when running verify-data-integrity against the prod database. This was the error: `FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory`. This was one of the things preventing verify-data-integrity from running successfully in prod.

### Summary of Changes

Local stress testing with a constrained heap and memory telemetry revealed that the rise in used heap memory was directly proportional to the number of API calls. Investigation showed that the `currentOutputData` array grew with each API call and, because it was still being appended to, stayed reachable on the heap. We refactored the script to no longer use it, and reworked the two flags that depended on it: `--save-output` now streams responses to disk as JSONL, and `--verify-output` now compares each response immediately and discards it.

We also note a potential source of a future memory leak in the `allUsers` array that is populated in memory for each project. We refactored it to paginate instead. Note that this didn't cause a memory leak locally; it is a preventive measure.

### Out of Scope

Fetching all transactions in the payments section of the script is another potential cause for concern, but since the payments section of the script will be refactored soon, we defer that discussion.	2026-03-23 08:55:10 -07:00
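The memory fix described in the commit above replaces an ever-growing array with per-response handling. A minimal TypeScript sketch of that pattern follows; it is not the actual verify-data-integrity script. The `ApiResponseRecord` shape, the `LineSink` callback, and both function names are hypothetical. In a real script the sink would presumably write to a file stream, which is left out here so the sketch stays self-contained.

```typescript
// Hypothetical record shape; the real script's response type is not shown in the commit.
type ApiResponseRecord = { endpoint: string; body: unknown };

// A sink that receives one JSONL line at a time (for example, a file stream's write method).
type LineSink = (line: string) => void;

// Save mode: serialize each response as one JSON line and hand it to the sink.
// Nothing is accumulated, so heap usage does not grow with the number of API calls.
function saveResponse(write: LineSink, record: ApiResponseRecord): void {
  write(JSON.stringify(record) + "\n");
}

// Verify mode: compare one response against its expected JSONL line right away.
// The record can be garbage-collected as soon as this returns.
function matchesExpected(expectedLine: string, record: ApiResponseRecord): boolean {
  return expectedLine.trimEnd() === JSON.stringify(record);
}
```

The design point is that both modes touch one response at a time: the save path never holds more than the current line, and the verify path needs only the matching expected line rather than the whole saved output.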
Konstantin Wohlwend	10a03a31ad	Fix Docker build	2026-03-09 10:49:42 -07:00
Konstantin Wohlwend	00fd0eb4c8	Revert Docker build fix	2026-03-09 10:06:14 -07:00
Konstantin Wohlwend	48ac83e858	Fix Docker script	2026-03-08 14:34:55 -07:00
Konstantin Wohlwend	973e190875	Don't bundle @prisma/client	2026-03-02 18:01:21 -08:00
Konstantin Wohlwend	ba51f19d6f	Fix lint	2026-02-27 09:59:26 -08:00
Konstantin Wohlwend	37dea79fda	Another build issue	2026-02-27 02:04:02 -08:00
Konstantin Wohlwend	74a4f5a601	More build stuff	2026-02-27 01:55:43 -08:00
Konstantin Wohlwend	48f0e998d5	More fix build?	2026-02-27 01:47:01 -08:00
Konstantin Wohlwend	48a8f0b072	Fix build	2026-02-27 00:48:07 -08:00
Konstantin Wohlwend	e0ea6834d0	Upgrade TypeScript	2026-02-27 00:28:35 -08:00
Konstantin Wohlwend	d63db64e19	Migrate from tsup to tsdown	2026-02-26 17:42:09 -08:00
BilalG1	145bcb7e92	Analytics event tracking (#1208)

## Summary by CodeRabbit

* New Features
  * Browser-side event tracker with batching, navigation & click capture and background/keepalive delivery
  * Server endpoint to accept batched analytics events and associate them with session replay segments
  * Client APIs to send analytics batches and integrate with session replay
* Bug Fixes / UX
  * Pausing replay now uses the UI-facing playback time for more accurate pause positions
  * Replay endpoint now returns a clear analytics-disabled error (ANALYTICS_NOT_ENABLED) when analytics is off
* Tests
  * End-to-end tests covering batch ingestion, validation, and replay timing behavior	2026-02-17 18:33:01 -08:00
BilalG1	fa27c80319	rename tabId to sessionReplaySegmentId (#1206)

## Summary by CodeRabbit

* New Features
  * Added new session replay analytics columns to ClickHouse for enhanced tracking and reporting
* Refactor
  * Renamed session recording segment identifier across APIs and data models from `tab_id` to `session_replay_segment_id`
  * Updated internal data structures and type definitions to align with new naming convention	2026-02-17 11:00:07 -08:00
Konsti Wohlwend	d319285403	Queries view (#1145)	2026-02-16 11:39:21 -08:00
BilalG1	907a98320a	Clickhouse sync fixing (#1198)	2026-02-16 11:30:38 -08:00
BilalG1	5b149bebaa	fix clickhouse flaky tests (#1196)	2026-02-13 13:05:35 -08:00
BilalG1	d09a180dfe	clickhouse user sync (#1159)

## Summary by CodeRabbit

* New Features
  * Real-time AI search with project-scoped analytics and dynamic query execution; streaming AI responses replace the placeholder flow.
  * External DB sync adds ClickHouse support: users sync, sync metadata tracking, tenancy-aware status, and per-mapping throttling.
  * AI assistant UI shows expandable tool-invocation results and streams via the real AI pipeline.
* Chores
  * Dashboard dependencies and workspace exclusions updated; development OpenAI env var added; editor config flag toggled.
* Tests
  * E2E coverage extended to validate ClickHouse user sync and analytics queries.

---------

Co-authored-by: aadesh18 <110230993+aadesh18@users.noreply.github.com>
Co-authored-by: Konsti Wohlwend <n2d4xc@gmail.com>	2026-02-12 16:52:20 -08:00
aadesh18	2055d98dea	External db sync (#1036)

This PR revolves around the following components:

1. Sequencer - sequences the updates in the internal db
2. Poller - polls for the latest updates to sync with the external db
3. Outgoing Request Handler - essentially a trigger that can make http requests based on a change in the internal db
4. Sync Engine - syncs the latest changes from the internal db to the external db

What has been done:

- Added a global sequence id for ProjectUser, ContactChannel and DeletedRow.
- Added the deletedRow table to keep track of the rows that were deleted across ProjectUser and ContactChannel.
- Added the OutgoingRequest table to keep track of the outgoing requests
- Added function for the sequencer to call to sequence updates
- Added a sequencer that sequences all the changes in the internal db every 50 ms
- Added a poller that polls for the latest changes in the internal db every 50 ms, and adds to a queue
- Added a Vercel cron that calls sequencer and poller every minute
- Added a queue that fulfills the outgoing requests by making http calls (for external db sync, it calls the sync engine endpoint)
- Added a sync engine that uses the defined sql mapping query in the user's schema to pull in the changes for the user, and sync them with the external db
- Added tests to test out each functionality

How to review this PR:

1. Review the migrations (sequence id, deletedRow, triggers, backlog sync) (all files created under the migrations folder)
2. Review sequencer
3. Review poller
4. Review the changes in schema
5. Review sync-engine (the function, and its helper file)
6. Review the schema changes, and query mappings
7. Review the tests (basic, advanced and race, along with the helper file)
8. Review the changes made in Dockerfile to support local testing using the postgres docker

---

> [!NOTE]
> Introduces a cron-driven external DB sync pipeline with global sequencing, internal poller and webhook sync engine, new DB tables/functions, config schema/mappings, and comprehensive e2e tests.
>
> - Database (Prisma/Migrations):
>   - Add global sequence (`global_seq_id`) and `sequenceId`/`shouldUpdateSequenceId` to `ProjectUser`, `ContactChannel`, `DeletedRow` with partial indexes.
>   - Create `DeletedRow` (capture deletes) and `OutgoingRequest` (queue) tables; add unique/indexes.
>   - Add triggers/functions: `log_deleted_row`, `reset_sequence_id_on_update`, `backfill_null_sequence_ids`, `enqueue_tenant_sync`.
> - Backend/API:
>   - New internal routes: `GET /api/latest/internal/external-db-sync/sequencer`, `GET /poller`, `POST /sync-engine` (Upstash-verified) for sync orchestration.
>   - Add cron wiring: `vercel.json` schedules and local `scripts/run-cron-jobs.ts`; start in dev via `dev` script.
>   - Tweak route handler (remove noisy logging) without behavior change.
> - Sync Engine:
>   - Implement `src/lib/external-db-sync.ts` to read tenant mappings and upsert to external Postgres (schema bootstrap, param checks, sequencing).
>   - Add default mappings `DEFAULT_DB_SYNC_MAPPINGS` and config schema `dbSync.externalDatabases` in shared config.
> - Testing/Infra:
>   - Add extensive e2e tests (basics, advanced, race conditions) for sequencing, idempotency, deletes, pagination, multi-mapping, and permissions.
>   - Docker compose: add `external-db-test` Postgres for tests; e2e deps for `pg` types.

## Summary by CodeRabbit

* New Features
  * External PostgreSQL sync: automatic, batched replication with mappings, resume/idempotency, and on-demand enqueueing.
* Admin UI
  * Real-time External DB Sync dashboard and status API showing per-mapping backlog, sequencer/poller/sync-engine telemetry, and fusebox controls.
* Tests
  * Large e2e suite: basic, advanced, race, high-volume tests and test utilities for external DB sync.
* Chores
  * DB migrations, CI/workflow updates, background cron runner and local/dev test support.

---------

Co-authored-by: Konsti Wohlwend <n2d4xc@gmail.com>
Co-authored-by: Bilal Godil <bg2002@gmail.com>	2026-02-05 12:04:31 -08:00
Konstantin Wohlwend	097c0310c4	Check all users when verifying data integrity	2026-02-03 10:00:30 -08:00
Konstantin Wohlwend	4c22b37fdf	--no-bail for verify-data-integrity script	2026-01-28 13:53:28 -08:00
Konstantin Wohlwend	8fd5b13a3b	TokenRefreshEventType	2026-01-28 11:18:15 -08:00
BilalG1	484c3a6332	clickhouse setup (#1032)	2026-01-28 09:12:33 -08:00
BilalG1	e439bd0b7e	verify payment transactions integrity (#1128 ) <!-- Make sure you've read the CONTRIBUTING.md guidelines: https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md --> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Added a comprehensive payments data-integrity verifier, Stripe payout reconciliation, API validation helpers, and a throttled progress utility for long-running checks. * Bug Fixes * Improved subscription/product filtering to correctly respect customer type during verification. * Chores * Reorganized verification scripts and updated the verification entrypoint invocation. * Tests * Enhanced test fixtures to include full product data for subscriptions. <sub>✏️ Tip: You can customize this high-level summary in your review settings.</sub> <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2026-01-27 21:17:43 +00:00
Konstantin Wohlwend	e574f526fa	Import fixes	2026-01-23 11:52:54 -08:00
Konstantin Wohlwend	0aeb120aa8	Make DB migration script interactive	2026-01-23 11:52:25 -08:00
Konsti Wohlwend	8f74949a7f	Speed up tests (#1063)	2025-12-28 11:25:04 -08:00
Konsti Wohlwend	b4ae80874e	Upgrade Prisma to v7 (#1064)	2025-12-26 08:13:34 -08:00
Konstantin Wohlwend	7bd91dcf93	fixes?	2025-12-12 17:29:57 -08:00
Konsti Wohlwend	e7e792d462	Email outbox backend (#1030)	2025-12-12 10:26:38 -08:00
BilalG1	b5b311554b	Metrics Endpoint Speed (#966 ) <img width="567" height="249" alt="Screenshot 2025-10-20 at 11 23 10 AM" src="https://github.com/user-attachments/assets/340df844-f619-489f-8d41-cc26bc165018" /> <img width="595" height="255" alt="Screenshot 2025-10-20 at 11 24 00 AM" src="https://github.com/user-attachments/assets/9321bda1-e6f0-4f53-8c6b-e29d0fc16038" /> <!-- Make sure you've read the CONTRIBUTING.md guidelines: https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md --> <!-- RECURSEML_SUMMARY:START --> ## High-level PR Summary This PR optimizes the performance of user list and metrics endpoints by refactoring SQL queries to use more efficient patterns. The changes include rewriting queries to use `LATERAL` joins and CTEs with proper filtering, extracting common user mapping logic into reusable functions, and adding performance tests with SQL scripts to generate realistic test data (10,000 mock users and activity events across 100 countries). ⏱️ Estimated Review Time: 30-90 minutes <details> <summary>💡 Review Order Suggestion</summary> \| Order \| File Path \| \|-------\|-----------\| \| 1 \| `apps/e2e/tests/backend/performance/mock-users.sql` \| \| 2 \| `apps/e2e/tests/backend/performance/mock-metric-events.sql` \| \| 3 \| `apps/e2e/tests/backend/performance/users-list.test.ts` \| \| 4 \| `apps/backend/src/app/api/latest/users/crud.tsx` \| \| 5 \| `apps/backend/src/app/api/latest/internal/metrics/route.tsx` \| </details> [![Need help? Join our Discord](https://img.shields.io/badge/Need%20help%3F%20Join%20our%20Discord-5865F2?style=plastic&logo=discord&logoColor=white)](https://discord.gg/n3SsVDAW6U) [![Analyze latest changes](`f22b2c44a1`/?repo_owner=stack-auth&repo_name=stack-auth&pr_number=966) <!-- RECURSEML_SUMMARY:END --> <!-- ELLIPSIS_HIDDEN --> ---- > [!IMPORTANT] > Optimize metrics and user list endpoints with SQL refactoring, caching, and performance tests, adding a `CacheEntry` model and mock data scripts. 
> > - Performance Optimization: > - Refactor SQL queries in `route.tsx` to use `LATERAL` joins and CTEs for efficient data retrieval. > - Implement caching in `route.tsx` using `getOrSetCacheValue()` to reduce database load. > - Database Changes: > - Add `CacheEntry` model to `schema.prisma` and create corresponding table and index in `migration.sql`. > - Remove auto-migration metadata step from `check-prisma-migrations.yaml`. > - Testing: > - Add performance tests in `metrics.test.ts` to benchmark metrics and user endpoints. > - Create mock data scripts `mock-users.sql` and `mock-metric-events.sql` for testing with 10,000 users and events across 100 countries. > - Miscellaneous: > - Update `db-migrations.ts` to include new migration file generation logic. > - Add `cache.tsx` for caching logic implementation. > > <sup>This description was created by </sup>[<img alt="Ellipsis" src="https://img.shields.io/badge/Ellipsis-blue?color=175173">](https://www.ellipsis.dev?ref=stack-auth%2Fstack-auth&utm_source=github&utm_medium=referral)<sup> for `4d9be71063`. You can [customize](https://app.ellipsis.dev/stack-auth/settings/summaries) this summary. It will automatically update as commits are pushed.</sup> ---- <!-- ELLIPSIS_HIDDEN --> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Metrics now use a cache layer with per-entry TTL and tenancy-aware loaders. * Bug Fixes * Improved accuracy of daily active and related metrics with tenancy-aware counting and more robust last-active computation. * Performance * Faster metrics responses via batched reads and cache-backed endpoints. * Tests * Added end-to-end performance benchmarks and SQL seed scripts for metrics/user load testing. * Chores * DB migration added support for cached entries; CI migration check flow adjusted; migration tooling improved. 
<!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Konsti Wohlwend <n2d4xc@gmail.com>	2025-11-05 16:24:04 -08:00
Konstantin Wohlwend	cd02113441	several changes	2025-10-14 12:23:22 -07:00
Konstantin Wohlwend	1ed9c6150f	Use custom migration script for self-hosting container	2025-10-14 11:29:41 -07:00
Konsti Wohlwend	8a77e07f19	Rename offer to product, offer group to product catalog (#914) <!-- Make sure you've read the CONTRIBUTING.md guidelines: https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md --> <!-- RECURSEML_SUMMARY:START --> ## High-level PR Summary This PR implements a comprehensive renaming of "offer" to "product" and "offer group" to "product catalog" throughout the codebase. The changes include database migrations, schema updates, API compatibility layers, function renames, and updates to client and server implementations. Backwards compatibility is maintained through migration layers that handle requests using the old terminology, translating them to the new terminology before processing. The PR includes documentation of this approach in CLAUDE-KNOWLEDGE.md. This rename affects multiple parts of the system including the database schema, API endpoints, error types, and SDK interfaces.
⏱️ Estimated Review Time: 1-3 hours <details> <summary>💡 Review Order Suggestion</summary> \| Order \| File Path \| \|-------\|-----------\| \| 1 \| `apps/backend/prisma/migrations/20250923191615_rename_offers_to_products/migration.sql` \| \| 2 \| `apps/backend/src/app/api/migrations/v2beta1/payments/purchases/offers-compat.ts` \| \| 3 \| `apps/backend/src/app/api/migrations/v2beta1/payments/purchases/create-purchase-url/route.ts` \| \| 4 \| `apps/backend/src/app/api/migrations/v2beta1/payments/purchases/validate-code/route.ts` \| \| 5 \| `apps/backend/src/lib/payments.tsx` \| \| 6 \| `.claude/CLAUDE-KNOWLEDGE.md` \| \| 7 \| `packages/stack-shared/src/schema-fields.ts` \| \| 8 \| `packages/stack-shared/src/known-errors.tsx` \| \| 9 \| `packages/stack-shared/src/config/schema.ts` \| \| 10 \| `packages/template/src/lib/stack-app/customers/index.ts` \| \| 11 \| `packages/template/src/lib/stack-app/apps/implementations/client-app-impl.ts` \| \| 12 \| `packages/template/src/lib/stack-app/apps/implementations/server-app-impl.ts` \| </details> [![Need help? Join our Discord](https://img.shields.io/badge/Need%20help%3F%20Join%20our%20Discord-5865F2?style=plastic&logo=discord&logoColor=white)](https://discord.gg/n3SsVDAW6U) <!-- RECURSEML_SUMMARY:END --> <!-- ELLIPSIS_HIDDEN --> ---- > [!IMPORTANT] > Renames 'offer' to 'product' and 'offer group' to 'product catalog' across the codebase, updating database schema, API endpoints, and application logic for consistency and backward compatibility. > > - Database: > - Rename columns `offer` to `product` and `offerId` to `productId` in `OneTimePurchase` and `Subscription` tables in `migration.sql`. > - API & Migrations: > - Update API endpoints to accept `product_id`/`product_inline` instead of `offer_id`/`offer_inline`. > - Add `v2beta5` compatibility layer to map legacy `offer` fields to `product` equivalents. > - Shared Schemas: > - Rename `offerSchema` to `productSchema` and related schemas in `schema-fields.ts`. 
> - Server Implementation: > - Update `createCheckoutUrl` method in `server-app-impl.ts` to use `productId`/`InlineProduct`. > - Tests: > - Update tests to reflect renaming in `backend-helpers.ts` and other test files. > - Miscellaneous: > - Remove dummy data related to offers in `dummy-data.tsx`. > - Update documentation and comments to reflect terminology changes. > > <sup>This description was created by </sup>[<img alt="Ellipsis" src="https://img.shields.io/badge/Ellipsis-blue?color=175173">](https://www.ellipsis.dev?ref=stack-auth%2Fstack-auth&utm_source=github&utm_medium=referral)<sup> for `e3227bcbd2`. You can [customize](https://app.ellipsis.dev/stack-auth/settings/summaries) this summary. It will automatically update as commits are pushed.</sup> ---- <!-- ELLIPSIS_HIDDEN --> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * New Features * Backwards-compatibility: legacy offer_id/offer_inline requests are accepted, normalized, and routed to product-based handlers. * Refactor * Global rename from Offer/Group → Product/Catalog across UI, APIs, types, client/server interfaces, and error codes. * Bug Fixes * Responses, webhooks and UI consistently surface product_display_name and product-related metadata. * Documentation * Migration notes and docs updated to explain compatibility and parameter changes. * Tests * Unit and E2E suites updated to cover product/catalog flows. * Chores * Database schema migration, seed and config updates applied. <!-- end of auto-generated comment: release notes by coderabbit.ai --> <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Renames offers→products and groups→catalogs end-to-end (DB, APIs, schemas, UI, SDK, docs), adding v2beta5 compatibility to accept legacy offer fields while updating all internals. > > - Backend/DB: > - Prisma migration: rename `offer`/`offerId`→`product`/`productId` in `OneTimePurchase` and `Subscription`. 
> - Update Stripe webhook, purchase-session, and internal test-mode flows to use `product` metadata/fields. > - API & Migrations: > - Latest endpoints now accept `product_id`/`product_inline`. > - Add `v2beta5` compat layer mapping legacy `offer_id`/`offer_inline` to product equivalents; responses alias conflicting products. > - Shared Schemas/Errors/Config: > - `offerSchema`→`productSchema`, `inlineOfferSchema`→`inlineProductSchema`, prices/types renamed. > - KnownErrors renamed (e.g., `PRODUCT_DOES_NOT_EXIST`). > - Config: `groups`→`catalogs`, defaults/migrations updated; improved override validation messages; ID regex loosened; formatter tweaks; add schema fuzzer tests. > - Payments Lib: > - Rename APIs and logic (`offers`→`products`, `groupId`→`catalogId`), subscription and item-quantity computation updated. > - Dashboard/UI: > - Routes, dialogs, editors, tables, and code samples switched to products/catalogs; removed offers dummy data. > - SDK/Template: > - Client/server `createCheckoutUrl` now uses `productId`/`InlineProduct`. > - Tests/Docs/Utilities*: > - E2E and unit tests updated; add legacy (pre-rename) tests. > - Docs and knowledge base revised; minor script tweaks (recent-first, limits). > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `e6e20ecd72`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> --------- Co-authored-by: BilalG1 <bg2002@gmail.com>	2025-10-04 02:28:28 -07:00
Konstantin Wohlwend	914942dd2f	Migration summary	2025-08-25 10:13:18 -07:00
Zai Shi	e1c5018bd8	fix db:migration-gen	2025-07-30 09:46:27 -07:00
Konsti Wohlwend	7c0417d7d9	Several project config improvements (#811)	2025-07-29 04:13:46 -07:00
Konstantin Wohlwend	5ee342af7c	Remove mcp-server	2025-07-28 22:38:28 -07:00
Zai Shi	d9a9ad65ec	fix db migration script	2025-07-25 10:08:38 -07:00
Zai Shi	a7acab4646	Auto migration (#526 ) <!-- Make sure you've read the CONTRIBUTING.md guidelines: https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md --> <!-- ELLIPSIS_HIDDEN --> ---- > [!IMPORTANT] > Introduces an automated database migration system, replacing manual Prisma commands with new scripts and updating workflows, configurations, and tests accordingly. > > - Auto-Migration System: > - Introduces `db-migrations.ts` script for handling database migrations automatically. > - Adds utility functions in `utils.tsx` for managing migration files. > - Implements `applyMigrations` and `runMigrationNeeded` in `index.tsx` for executing migrations. > - Workflow and Scripts: > - Updates GitHub workflows (`check-prisma-migrations.yaml`, `e2e-api-tests.yaml`) to use new migration commands. > - Replaces `prisma migrate` commands with `db:init`, `db:migrate`, etc., in `package.json` and `README.md`. > - Testing: > - Adds `auto-migration.tests.ts` for testing migration logic and concurrency handling. > - Configuration: > - Updates `.env.development` and `vitest.config.ts` for new environment variables and paths. > - Modifies `turbo.json` and `package.json` to include new migration tasks and scripts. > > <sup>This description was created by </sup>[<img alt="Ellipsis" src="https://img.shields.io/badge/Ellipsis-blue?color=175173">](https://www.ellipsis.dev?ref=stack-auth%2Fstack-auth&utm_source=github&utm_medium=referral)<sup> for `2c24183879`. You can [customize](https://app.ellipsis.dev/stack-auth/settings/summaries) this summary. It will automatically update as commits are pushed.</sup> <!-- ELLIPSIS_HIDDEN --> --------- Co-authored-by: Konsti Wohlwend <n2d4xc@gmail.com> Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>	2025-07-24 02:38:37 +02:00
Konsti Wohlwend	a7ef394be4	Source of Truth (#660) Co-authored-by: Zai Shi <zaishi00@outlook.com>	2025-07-11 17:59:03 -07:00

78 Commits