<!--
Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Browser-side event tracker with batching, navigation & click capture
and background/keepalive delivery
* Server endpoint to accept batched analytics events and associate them
with session replay segments
* Client APIs to send analytics batches and integrate with session
replay
* **Bug Fixes / UX**
* Pausing replay now uses the UI-facing playback time for more accurate
pause positions
* Replay endpoint now returns a clear analytics-disabled error
(ANALYTICS_NOT_ENABLED) when analytics is off
* **Tests**
* End-to-end tests covering batch ingestion, validation, and replay
timing behavior
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!--
Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Added new session replay analytics columns to ClickHouse for enhanced
tracking and reporting
* **Refactor**
* Renamed session recording segment identifier across APIs and data
models from `tab_id` to `session_replay_segment_id`
* Updated internal data structures and type definitions to align with
new naming convention
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
https://www.loom.com/share/3b7c9288149e4f878693281778c9d7e0
## Todos (future PRs)
- Fix pre-login recording
- Better session search (filters, cmd-k, etc)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Analytics → Replays: session recording & multi-tab replay with
timeline, speed, seek, and playback settings; dashboard UI for listing
and viewing replays.
* **Admin APIs**
* Admin endpoints to list recordings, list chunks, fetch chunk events,
and retrieve all events (paginated).
* **Client**
* Client-side rrweb recording with batching, deduplication, upload API
and a send-batch client method.
* **Configuration**
* New STACK_S3_PRIVATE_BUCKET for private session storage.
* **Tests**
* Extensive unit and end-to-end tests for replay logic, streams,
playback, and APIs.
* **Chores**
* Removed an E2E API test GitHub Actions workflow.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
### Context
Some of our users' emails were getting stuck in sending. The long delays
in processing the retries caused a vercel function timeout.
### Summary of Changes
We refactor the low level email sending functions to remove the retry
logic there. We kick it up to the email queue step. Additionally, we
flag emails to be retried when they encounter issues but leave it for a
future iteration to actually perform the retry. We perform an
exponential backoff with a random component to decide when they have to
be retried. We also make some small adjustments to the queuing function
to not queue skipped emails.
When an email fails to send during the sending function, we check to see
if it is a retryable error or not. Some errors are transient and trying
again may succeed while others indicate deeper issues. If it is
retryable, and the max number of retry attempts hasn't been reached, we
set `nextSendRetryAt` to a time determined by an exponential backoff
calculation function. When the queuing function looks for emails to
queue, it doesn't just pick up the `SCHEDULED`. emails whose
`scheduledAt` time <= `NOW()`, but also those emails whose
`nextSendRetryAt` time <= `NOW()`. What this means in practice is that
one iteration of the `email-queue-step` will mark emails as retryable
while another iteration will perform the retry. This should be cleaner
and prevent long delays in the `email-queue-step` process due to
retries. This also makes it easier to scale up the number of retries if
need be.
DB migrations are backwards-compatible / Test migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
<!--
Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Real-time AI search with project-scoped analytics and dynamic query
execution; streaming AI responses replace the placeholder flow.
* External DB sync adds ClickHouse support: users sync, sync metadata
tracking, tenancy-aware status, and per-mapping throttling.
* AI assistant UI shows expandable tool-invocation results and streams
via the real AI pipeline.
* **Chores**
* Dashboard dependencies and workspace exclusions updated; development
OpenAI env var added; editor config flag toggled.
* **Tests**
* E2E coverage extended to validate ClickHouse user sync and analytics
queries.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: aadesh18 <110230993+aadesh18@users.noreply.github.com>
Co-authored-by: Konsti Wohlwend <n2d4xc@gmail.com>
<!--
Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Improved pricing accuracy by implementing proper rounding in unit
price calculations during checkout. This ensures correct cent-level
precision in purchase computations, preventing potential rounding errors
in transaction totals.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
DB migrations are backwards-compatible / Test migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migrations are backwards-compatible / Test migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
### Context
A small amount of email monitor requests take slightly less than 3
minutes to complete. This meant our old timeout of 2 minutes was
flagging a few kuma pulls as "down" when they weren't.
### Summary of Changes
We just bumped the timeout. This should be ok in prod because the uptime
kuma timeout is set to 200 seconds.
<img width="1920" height="969" alt="Screenshot 2026-02-04 at 9 47 16 AM"
src="https://github.com/user-attachments/assets/d7d0cd04-0051-4fc4-b857-e6f87ee97a59"
/>
**This PR revolves around the following components**
1. Sequencer - sequences the updates in the internal db
2. Poller - polls for the latest updates to sync with the external db
3. Outgoing Request Handler - essentially a trigger that can make http
requests based on a change in the internal db
4. Sync Engine - syncs with the latest changes from the internal db to
the external db
**What has been done**
- Added a global sequence id for ProjectUser, ContactChannel and
DeletedRow.
- Added the deletedRow table to keep track of the rows that were deleted
across ProjectUser and ContactChannel.
- Added the OutgoingRequest table to keep track of the outgoing requests
- Added function for the sequencer to call to sequence updates
- Added a sequencer that sequences all the changes in the internal db
every 50 ms
- Added a poller that polls for the latest changes in the internal db
every 50 ms, and adds to a queue
- Added a Vercel cron that calls sequencer and poller every minute
- Added a queue that fulfills the outgoing requests by making http calls
(for external db sync, it calls the sync engine endpoint)
- Added a sync engine that uses the defined sql mapping query in the
user's schema to pull in the changes for the user, and sync them with
the external db
- Added tests to test out each functionality
**How to review this PR:**
1. Review the migrations (sequence id, deletedRow, triggers, backlog
sync) (all files created under the migrations folder)
2. Review sequencer
3. Review poller
4. Review the changes in schema
5. Review sync-engine (the function, and it's helper file)
6. Review the schema changes, and query mappings
7. Review the tests (basic, advanced and race, along with the helper
file)
8. Review the changes made in Dockerfile to support local testing using
the postgres docker
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Introduces a cron-driven external DB sync pipeline with global
sequencing, internal poller and webhook sync engine, new DB
tables/functions, config schema/mappings, and comprehensive e2e tests.
>
> - **Database (Prisma/Migrations)**:
> - Add global sequence (`global_seq_id`) and
`sequenceId`/`shouldUpdateSequenceId` to `ProjectUser`,
`ContactChannel`, `DeletedRow` with partial indexes.
> - Create `DeletedRow` (capture deletes) and `OutgoingRequest` (queue)
tables; add unique/indexes.
> - Add triggers/functions: `log_deleted_row`,
`reset_sequence_id_on_update`, `backfill_null_sequence_ids`,
`enqueue_tenant_sync`.
> - **Backend/API**:
> - New internal routes: `GET
/api/latest/internal/external-db-sync/sequencer`, `GET /poller`, `POST
/sync-engine` (Upstash-verified) for sync orchestration.
> - Add cron wiring: `vercel.json` schedules and local
`scripts/run-cron-jobs.ts`; start in dev via `dev` script.
> - Tweak route handler (remove noisy logging) without behavior change.
> - **Sync Engine**:
> - Implement `src/lib/external-db-sync.ts` to read tenant mappings and
upsert to external Postgres (schema bootstrap, param checks,
sequencing).
> - Add default mappings `DEFAULT_DB_SYNC_MAPPINGS` and config schema
`dbSync.externalDatabases` in shared config.
> - **Testing/Infra**:
> - Add extensive e2e tests (basics, advanced, race conditions) for
sequencing, idempotency, deletes, pagination, multi-mapping, and
permissions.
> - Docker compose: add `external-db-test` Postgres for tests; e2e deps
for `pg` types.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
3f2a8efcfb. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* External PostgreSQL sync: automatic, batched replication with
mappings, resume/idempotency, and on-demand enqueueing.
* **Admin UI**
* Real-time External DB Sync dashboard and status API showing
per-mapping backlog, sequencer/poller/sync-engine telemetry, and fusebox
controls.
* **Tests**
* Large e2e suite: basic, advanced, race, high-volume tests and test
utilities for external DB sync.
* **Chores**
* DB migrations, CI/workflow updates, background cron runner and
local/dev test support.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Konsti Wohlwend <n2d4xc@gmail.com>
Co-authored-by: Bilal Godil <bg2002@gmail.com>
### Context
Via the email-monitor and sentry traces, we found out that we were
running the "without fallback" path in production. This was because the
right environment variables weren't populated in Vercel, but it exposed
how our system let this slide. This occurred because the check we did
for fallback vs non-fallback routes relied on checking whether a certain
environment variable was empty.
### Summary of Changes
We now use a sentinel value for the check instead of an empty value.
Additionally, we remove default values from the `getEnvVariable` checks
to ensure they throw if not populated. We also guard against the
sentinel value being used in production
### Config Changes
We need to update and make sure the vercel sandbox environment variables
are populated in production.