## Summary
Jaeger's `all-in-one` image uses an unbounded in-memory span store by
default, so on long-running local dev sessions it quietly grows until
the Docker VM is under memory pressure. This caps it in two ways:
- `MEMORY_MAX_TRACES=50000` — tells Jaeger to evict old traces once the
in-memory ring buffer hits 50k. Plenty for local debugging without
retaining every trace forever.
- `mem_limit: 1g` — hard Docker memory cap as a backstop so a runaway
collector can't starve the rest of the compose stack (Postgres,
ClickHouse, Svix, etc.).
Only touches `docker/dependencies/docker.compose.yaml` — no app code, no
prod config.
## Test plan
- [ ] `docker compose -f docker/dependencies/docker.compose.yaml up
jaeger` starts cleanly
- [ ] Jaeger UI reachable at `localhost:8107`
- [ ] OTLP endpoint on `localhost:8131` still ingests spans from the
API/dashboard
- [ ] `docker stats` shows the jaeger container capped at ~1GB under
load
- [ ] Old traces get evicted once trace count exceeds 50k (verify via UI
search)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
## Release Notes
* **Chores**
* Configured memory constraints and trace buffering optimization for the
tracing service to improve resource management and system stability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
## Summary
`stack emulator start` now resumes a fully-warm VM snapshot instead of
cold-booting, bringing startup from 30–120s down to ~5–8s with
per-install secret rotation, or ~2.5s with rotation opt-out. The
snapshot is captured **locally on first `stack emulator pull`**, not
shipped from CI — QEMU migration state isn't portable across
accelerators (KVM/HVF/TCG) or `-cpu max` feature sets, so a CI-captured
snapshot couldn't resume reliably on arbitrary user hardware.
Also bundles a pile of CLI QoL fixes (progress bars, PR/run artifact
pulls, PR-build download, native-TS ISO writer replacing
`hdiutil`/`mkisofs`/`genisoimage` host dep, unit tests).
| Scenario | Before | After |
|---|---|---|
| Cold boot (no snapshot) | 30–120s | same, works as fallback |
| `stack emulator pull` (one-time, includes local snapshot capture) |
~30s download | ~30s download + ~1–3 min cold-boot capture |
| Snapshot resume, normal start | — | **~5–8s** |
| Snapshot resume, `EMULATOR_NO_ROTATION=1` | — | **~2.5s** |
Backend (`/health?db=1`) and dashboard (`/handler/sign-in`) return 200
on all paths. Two successive snapshot resumes produce different rotated
PCK/SSK/SAK/CRON_SECRET values per install.
## How it works
**Build (CI)** — `docker/local-emulator/qemu/build-image.sh`:
1. Cloud-init provisioning runs to completion (migrations, seed,
slim-image) producing `stack-emulator-<arch>.qcow2`.
2. Image is built with a topology compatible with later snapshot capture
(pinned SMP=4, phantom seed/bundle ISOs, STACKCFG runtime ISO mounted at
build time, qemu-guest-agent running, placeholder hex secrets baked in
under `STACK_EMULATOR_BUILD_SNAPSHOT=1`).
3. CI publishes **only the qcow2** — no `.savevm.zst` ships.
**Pull (user's machine)** —
`packages/stack-cli/src/commands/emulator.ts` + `run-emulator.sh
capture`:
1. `stack emulator pull` downloads the qcow2 with a progress bar (or
from a PR / workflow run via `--pr` / `--run`).
2. CLI invokes `run-emulator.sh capture`: cold-boots the qcow2 with a
matching device layout (phantom ISOs, fsdev, pcie-root-port, virtfs
detached — migration-incompatible), waits for backend+dashboard health,
then drives QMP: `stop` → set `mapped-ram` + `multifd` caps → `migrate
file:state.raw` → poll `query-migrate` → `quit`. Raw mapped-ram file is
zstd-compressed to `stack-emulator-<arch>.savevm.zst` in the images dir.
3. `--skip-snapshot` opts out (first `start` will then cold-boot).
**Runtime** — `run-emulator.sh start`:
1. Launch QEMU with `-incoming defer` when a `.savevm.zst` is present;
decompress on first use, keep the `.raw` cached for subsequent starts.
2. QMP: same `mapped-ram` + `multifd` caps → `migrate-incoming
file:<.raw>` → poll for `paused` → `cont`.
3. Generate fresh per-install secrets on the host; pipe them
base64-encoded through QGA `guest-exec input-data` →
`trigger-fast-rotate` in the guest → `docker exec -e … rotate-secrets`.
4. `rotate-secrets` in the container: validate keys (hex-only), targeted
`sed` on the placeholder PCK across built JS, `UPDATE ApiKeySet`,
`supervisorctl restart stack-app cron-jobs` (with
`stopasgroup`/`killasgroup` so the Node children actually die and
release their ports).
5. Poll backend+dashboard health; if anything fails, clean up and fall
back to cold boot transparently.
**Security model**: placeholder hex values are baked into the snapshot
(`00…ff` PCK, `00…ee` SSK, `00…dd` SAK, `00…cc` CRON_SECRET). They are
non-secret by construction. Real per-install secrets are generated at
each `emulator start` and never leave the host.
## CLI changes (`packages/stack-cli`)
- **`src/lib/iso.ts`** (new): native TypeScript ISO 9660 + Joliet
writer, replacing the host-side `hdiutil`/`mkisofs`/`genisoimage`
dependency for generating the STACKCFG runtime config disk. Unit tests
in `src/lib/iso.test.ts`.
- **`src/commands/emulator.ts`**:
- `pull`: streamed downloads with progress bar + ETA; `--pr <number>`
and `--run <id>` to pull from a PR build's CI artifacts (uses
`extract-zip` for the nested zip); `--skip-snapshot` to opt out of the
one-time local capture.
- `start` (existing, extended): auto-pulls AND auto-captures when no
image exists, so first-ever `start` is self-bootstrapping; emits
`STACK_EMULATOR_CLI_WROTE_ISO=1` so the shell helper skips its own ISO
regen (avoids the genisoimage host dep).
- `capture` (new, invoked by `pull` and the auto-pull path of `start`):
drives the local snapshot capture via `run-emulator.sh`.
- `status`, `stop`, `reset`, `list-releases`: preflight +
path-resolution tightening (`STACK_EMULATOR_HOME` → images/run dirs).
- Unit tests in `src/commands/emulator.test.ts`.
- **`EMULATOR_NO_ROTATION=1`** env var skips the post-resume rotation
(intended for tests/CI where the placeholder secrets are fine — comes
with a loud warning).
## CI (`.github/workflows/qemu-emulator-build.yaml`)
- Builds **QEMU 10.2.2 from source** (cached), because
`mapped-ram`/`multifd` migration capabilities aren't available in the
distro's QEMU. Enables KVM on ubicloud runners so amd64 boots at
hardware speed.
- amd64 + arm64 both build on the same amd64 matrix
(`ubicloud-standard-8`); arm64 runs under cross-arch TCG (provisioning
only — boot/verify smoke test is amd64-only).
- Verification now runs through the CLI: `emulator start` → `emulator
status` → `emulator stop` against the freshly-built qcow2 (via
`STACK_EMULATOR_HOME` pointing at the workspace, so the CLI doesn't
silently auto-pull a prior release).
- Packages **only** the qcow2. No `.savevm.zst` upload / publish.
- Release notes updated.
## Key files
**Shell / guest:**
- `docker/local-emulator/qemu/build-image.sh` — snapshot-compatible
device topology + STACKCFG runtime ISO at build time
- `docker/local-emulator/qemu/run-emulator.sh` — `start`, `capture`,
`stop`, `reset`, `status`; `-incoming defer`, `.raw` cache, QGA-driven
rotation, cold-boot fallback
- `docker/local-emulator/qemu/common.sh` (new) — shared `qmp_session` +
`capture_vm_state` (factored out so build-image.sh and run-emulator.sh
share the capture path)
- `docker/local-emulator/qemu/cloud-init/emulator/user-data` —
placeholder secrets in snapshot mode, `wait-for-stack-ready`,
`trigger-fast-rotate`, qemu-guest-agent enabled
- `docker/local-emulator/rotate-secrets.sh` (new) — in-container
rotation (sed + UPDATE + supervisorctl)
- `docker/local-emulator/supervisord.conf` — `stopasgroup`/`killasgroup`
on `stack-app` and `cron-jobs`
- `docker/local-emulator/entrypoint.sh` — only mint CRON_SECRET if unset
(placeholder supplied in snapshot mode via --env-file)
- `docker/local-emulator/Dockerfile` — ships `rotate-secrets` to
`/usr/local/bin`
- `docker/server/entrypoint.sh` — source
`/run/stack-auth/rotated-secrets.env`; skip full-tree sentinel scan on
warm restarts via marker
**CLI:**
- `packages/stack-cli/src/lib/iso.ts` (new) + `iso.test.ts` (new)
- `packages/stack-cli/src/commands/emulator.ts` + `emulator.test.ts`
(new)
- `packages/stack-cli/vitest.config.ts` (new)
**CI:**
- `.github/workflows/qemu-emulator-build.yaml`
## Test plan
- [x] `docker/local-emulator/qemu/build-image.sh {amd64,arm64}` produces
`stack-emulator-<arch>.qcow2` with snapshot-compatible topology
- [x] `stack emulator pull` downloads qcow2 with progress, then captures
locally (~1–3 min) and writes `stack-emulator-<arch>.savevm.zst` in the
images dir
- [x] `stack emulator pull --skip-snapshot` stops after download
- [x] `stack emulator pull --pr <n>` / `--run <id>` pull from PR /
workflow run artifacts
- [x] `stack emulator start` on a fresh dir auto-pulls **and**
auto-captures, then starts; subsequent starts fast-resume in ~5–8s;
backend + dashboard return 200
- [x] `EMULATOR_NO_ROTATION=1 stack emulator start` completes in ~2.5s;
backend + dashboard return 200 with warning printed
- [x] Two consecutive `emulator start` invocations produce different PCK
values in the internal `ApiKeySet` row
- [x] `stack emulator status` / `stop` / `reset` resolve paths from
`STACK_EMULATOR_HOME`
- [x] Verified end-to-end on arm64 macOS under HVF (capture ~50s,
fast-resume ~6.5s)
- [x] `pnpm lint` and `pnpm typecheck` pass; stack-cli unit tests (iso +
emulator) pass
- [ ] CI green on this PR (qemu-emulator-build matrix, smoke test)
- [ ] `gh release download emulator-<branch>-latest` contains only
`stack-emulator-<arch>.qcow2` once this PR merges and publish runs
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Snapshot fast-start/resume with optional warm-snapshot assets, runtime
ISO generation, and a cached QEMU build to speed emulator setup.
* CLI: streamed artifact downloads with progress, improved release/asset
handling, stronger preflight checks, and start/status/stop emulator
commands.
* Automated secret rotation and ability to apply rotated secrets at
container startup; supervisor control socket enabled.
* **Bug Fixes**
* More robust start/stop/resume flows with automatic fallback to cold
boot and improved process-group shutdown behavior.
* **Tests**
* New tests for CLI utilities and ISO image generation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
### Object of this PR
This PR is NOT a monolithic series of fixes for the payments suite + a
complete rework. Its aims were
a) introducing and robustly testing the bulldozer db system
b) reworking the payments underlying architecture to use bulldozer for
correctness and scalability
c) Achieving parity with the old payments system excepting a few changes
like ensuring correctness of the ledger algo
There may still be some work to do with handling refunds, decoupling the
concepts of purchases from that of products, and some other things.
### Ledger Algorithm
This has been tuned and fixed. Item removals i.e negative item quantity
changes will apply to the soonest expiring item grant i.e positive item
quantity change. This is what is best for the user. Item grants can also
expire, and when they expire we obviate whatever is left of their
original capacity (meaning after all the removals that were applied to
it). Our ledger algo is applied via Bulldozer, so automatic
re-computation is handled when a new grant/ removal is inserted in the
middle of the existing ones.
### Things we got rid of
* No more automatic support for default products. You can use $0 plan
provisions to accomplish the same effect but it's manual
* Negative item quantity changes (i.e item removals) no longer can have
expiries
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Enhanced payment processing pipeline with improved data consistency
and state management.
* Advanced refund handling with comprehensive transaction tracking.
* Better tracking and management of customer item quantities and owned
products.
* Improved subscription lifecycle management including period-end
handling.
* **Bug Fixes**
* Fixed payment data integrity verification.
* Improved handling of edge cases in refund scenarios.
* **Chores**
* Updated cSpell configuration with additional words.
* Expanded developer documentation for linting workflows.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Konstantin Wohlwend <n2d4xc@gmail.com>
Co-authored-by: Aadesh Kheria <kheriaaadesh@gmail.com>
Co-authored-by: Mantra <87142457+mantrakp04@users.noreply.github.com>
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
<!--
Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
## Release Notes
* **New Features**
* Added Stripe, OAuth, and Freestyle mock services to the local emulator
* Introduced `emulator run` CLI command to execute applications with
emulator credentials automatically injected
* Enhanced credential management for local development
* **Improvements**
* Improved ARM64 QEMU emulation with cross-architecture support
* Better error detection and logging during emulator provisioning
* Added example middleware configuration with authentication support
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
… V8 --jitless
2.6 GB to 1.3 GB final image
Flip arm64 matrix back to ubicloud-standard-8 so both arches share one
runner fleet. Cross-arch TCG on an amd64 host previously SIGTRAP'd in
migrations because V8's JIT emitted arm64 instructions that QEMU's
cross-arch translator mis-handled; pair the existing -cpu cortex-a72
fallback with NODE_OPTIONS=--jitless on the migration docker exec to
force V8 to stay on the interpreter. Does not affect amd64 migrations
(KVM, no TCG).
<!--
Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Chores**
* Optimized emulator images with binary stripping, compression, and
preservation of standalone runtime dependencies.
* Improved multi-architecture build matrix, added optional KVM
detection/fallback, and gated certain emulator runtime steps for arm64.
* Enhanced build scripts to generate and include env files and persist
richer logs and artifacts.
* **New Features**
* Centralized provision entrypoint to streamline install → migrations →
slimming sequence.
* **Tests**
* Added a fast QEMU serial boot test for architecture validation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
- Added support for `@opentelemetry/sdk-node` in the backend.
- Updated various dependencies including AWS SDK and OpenTelemetry
packages.
- Implemented graceful shutdown handling for non-Vercel runtimes in
`prisma-client.tsx`.
- Enhanced AWS credentials retrieval to support GCP Workload Identity
Federation.
- Introduced a Dockerfile for Cloud Run deployment, optimizing the
backend build process.
- Updated `.gitignore` to include Terraform runtime files and secrets.
This commit improves the backend's observability and deployment
flexibility, particularly for Cloud Run environments.
<!--
Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* OpenTelemetry observability with dynamic provider selection per
deployment.
* Cloud Run trusted-proxy support for accurate client IP handling.
* Graceful shutdown that waits for in-flight background work.
* New background-task handling to improve async webhook/email delivery
reliability.
* AWS credential providers added (Vercel OIDC & GCP Workload Identity
Federation).
* Dockerized backend image for Cloud Run / self-host deployments.
* **Chores**
* Updated dependencies for OpenTelemetry and AWS SDK support.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Konstantin Wohlwend <n2d4xc@gmail.com>
s3mock 5.0.0 was released today and latest now points to it. The major
version bump breaks initialBuckets env var handling, causing
NoSuchBucket errors in CI.
<!--
Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Chores**
* Pinned service dependency to a specific version for improved
deployment consistency and stability.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
commit 5d43722575b826a8ed8dbb6b828f48eae4bca02c
Author: mantrakp04 <mantrakp@gmail.com>
Date: Wed Mar 18 12:27:01 2026 -0700
Add QEMU emulator snapshot functionality and reset command
- Introduced a new `emulator-qemu:reset` command in package.json to
clear snapshots and force a fresh boot of the emulator.
- Enhanced the `run-emulator.sh` script to support saving and restoring
snapshots, significantly reducing restart time from ~62s to ~4s.
- Implemented logic to check for existing snapshots and restore them
during startup, improving the emulator's efficiency.
- Updated documentation in CLAUDE-KNOWLEDGE.md to explain the new
snapshot restore process and its benefits.
These changes enhance the QEMU emulator's performance and usability for
developers, providing a more efficient workflow during development.
commit 3877445bdd83cb8690da18c8520bf260d2795172
Author: mantrakp04 <mantrakp@gmail.com>
Date: Wed Mar 18 11:55:18 2026 -0700
Enhance QEMU emulator performance and configuration management
- Added optimizations to the QEMU emulator's app container startup
process, reducing startup time from ~92s to ~62s by using qcow2 backing
files and setting the working directory to /app.
- Updated the build-image.sh script to conditionally wait for background
processes, improving robustness.
- Modified the run-emulator.sh script to create the disk image using
qcow2 format instead of copying, enhancing efficiency.
- Adjusted the cloud-init user-data to set STACK_RUNTIME_WORK_DIR to
/app, streamlining file operations during container initialization.
- Improved the entrypoint script to avoid unnecessary file copying when
the working directory is set to /app.
These changes significantly enhance the performance and usability of the
QEMU emulator for developers.
commit e0b86d3f1d5c08e46d0d343bc632e2a8c5777845
Author: mantrakp04 <mantrakp@gmail.com>
Date: Wed Mar 18 11:07:55 2026 -0700
Refactor local emulator configuration management and enhance Docker
setup
- Removed redundant comments and improved code clarity in the local
emulator's route handling.
- Streamlined the Dockerfile and docker-compose.yaml for better
readability and maintenance.
- Updated entrypoint and initialization scripts to enhance service
startup processes.
- Introduced a new common script for QEMU emulator to centralize
architecture detection and firmware handling.
- Enhanced error handling in the host file bridge for improved
robustness.
- Removed obsolete country code utilities to clean up the codebase.
These changes significantly improve the local emulator's configuration
management and overall setup experience for developers.
commit 4fb0f93c6cc4f749a14acf0228c261e180875609
Author: mantrakp04 <mantrakp@gmail.com>
Date: Wed Mar 18 10:24:53 2026 -0700
Implement local emulator file bridge for enhanced configuration
management
- Introduced a new host file bridge to facilitate reading and writing
configuration files between the local emulator and the host system.
- Refactored the local-emulator module to utilize the file bridge for
file operations, improving error handling and response validation.
- Added tests to ensure the file bridge functionality works as expected,
including handling of non-existent files and writing configurations.
- Updated the run-emulator script to start the file bridge
automatically, ensuring seamless integration during emulator startup.
- Enhanced documentation to reflect the new file bridge capabilities and
usage instructions.
These changes significantly improve the local emulator's ability to
manage configuration files, enhancing the development experience.
commit 3d18a7ce5bbf00a62a40a3f48f27856e79ecc62f
Author: mantrakp04 <mantrakp@gmail.com>
Date: Tue Mar 17 22:36:46 2026 -0700
Refactor QEMU local emulator setup and enhance app bundle handling
- Introduced a new script for packaging Docker images into a compressed
app bundle, improving the emulator's deployment process.
- Updated build-image.sh to create a runtime configuration ISO, ensuring
better management of environment settings.
- Enhanced cloud-init user-data scripts for both dev-server and deps
guests, streamlining service setup and configuration.
- Improved the run-emulator.sh script to facilitate better handling of
runtime configurations and dependencies.
- Adjusted the .gitignore to include .DS_Store and removed obsolete
entries, cleaning up the repository.
These changes significantly enhance the local emulator's functionality
and reliability for developers.
commit 8a35fb1ce79898d73e2259e256c11b6fd9b0a584
Author: mantrakp04 <mantrakp@gmail.com>
Date: Tue Mar 17 21:52:24 2026 -0700
Enhance local emulator functionality and configuration
- Updated package.json to improve the start-emulator command, providing
clearer dashboard and backend URLs.
- Added a new wait-until-emulator-is-ready command to ensure the
emulator is fully operational before proceeding.
- Refactored the local-emulator project route to streamline file
existence checks and default config creation.
- Enhanced user guidance in the dashboard for local Stack config file
handling.
- Updated tests to reflect changes in config file handling, ensuring
non-existent files are created with default settings.
- Improved Docker configurations for the local emulator, including new
environment variables and service dependencies.
These changes significantly enhance the local development experience and
emulator reliability.
commit 3910ed4bc40bbb37340c1c316c24c2826ba372bd
Author: mantrakp04 <mantrakp@gmail.com>
Date: Tue Mar 17 19:59:36 2026 -0700
Remove unused stash-0.patch file to clean up the repository.
commit 74146d974458037a7a9590120a524629a1a6a162
Author: mantrakp04 <mantrakp@gmail.com>
Date: Tue Mar 17 19:58:46 2026 -0700
Enhance QEMU local emulator with app bundle support and runtime
configuration
- Introduced a new script to package the backend and dashboard assets
into a standalone app bundle for the QEMU emulator.
- Updated the build-image.sh script to create an ISO containing the app
bundle, ensuring the guest image includes the full runtime.
- Modified cloud-init user-data to handle the new app bundle and runtime
configuration, improving the setup process for local development.
- Enhanced the run-emulator.sh script to prepare and mount the runtime
configuration ISO, facilitating better environment management for the
emulator.
- Updated the user-data to include necessary environment variables for
the stack application, ensuring seamless integration during startup.
These changes significantly improve the local emulator's functionality
and ease of use for developers.
commit 9e865a1cf524398bc58f00e0836278775c4ae936
Author: mantrakp04 <mantrakp@gmail.com>
Date: Tue Mar 17 16:50:45 2026 -0700
Enhance local emulator setup with new services and configurations
- Added Docker support for a local emulator, integrating PostgreSQL,
Redis, Inbucket, Svix, ClickHouse, MinIO, and QStash.
- Introduced new scripts for managing the emulator lifecycle, including
build and run commands.
- Implemented cloud-init provisioning for automatic service setup on
first boot.
- Updated package.json with new commands for emulator management and
added dotenv-cli for environment variable management.
- Added tests for OAuth authorization flow to return JSON responses.
- Included configuration files for ClickHouse and user management.
This commit significantly improves the local development experience by
providing a comprehensive emulator environment.
<!--
Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
## Release Notes
* **New Features**
* Introduced a local QEMU-based emulator for development with bundled
services (PostgreSQL, Redis, ClickHouse, MinIO, Inbucket, Svix, QStash).
* Added CLI commands to manage the emulator (start, stop, reset, status,
pull images).
* Added emulator status dashboard to monitor service health.
* Introduced new configuration system via `stack.config.ts`.
* **Tests**
* Added configuration read/write tests for the emulator.
* Added emulator CLI validation tests.
* **Documentation**
* Added emulator setup and usage guide.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Enhances sign-up process with Turnstile integration for fraud
protection. Builds on top of fraud-protection-temp-emails.
Made with [Cursor](https://cursor.com)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Cloudflare Turnstile bot-protection across signup/sign-in flows
(including SDK JSON mode).
* Email deliverability checks via Emailable.
* Sign-up risk scoring with persisted risk metrics and country code
tracking.
* UI: country-code selector, risk-score editing in user details, users
list refresh button, and Turnstile signup demo pages.
* **Bug Fixes**
* Use actual sign-up timestamp for reporting/metrics.
* **Documentation**
* Expanded knowledge base on Turnstile, risk scoring, and env
configuration.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Konstantin Wohlwend <n2d4xc@gmail.com>
Co-authored-by: BilalG1 <bg2002@gmail.com>
Co-authored-by: Armaan Jain <84474476+Developing-Gamer@users.noreply.github.com>
Co-authored-by: nams1570 <amanganapathy@gmail.com>
<!--
Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Provision local-emulator projects from a local config file and return
emulator credentials via a new internal endpoint.
* Dashboard: "Open config file" flow to open local projects and refresh
owned projects.
* **Changes**
* Branch config can prefer/read/write local files for emulator projects.
* Environment config updates/resets are blocked for local-emulator
projects.
* Dashboard UI shows read-only notices and disables project creation in
emulator mode.
* Added DB mapping and a standard env flag to identify local-emulator
projects.
* **Tests**
* New E2E tests covering provisioning and config restrictions.
* **Chores**
* Removed legacy emulator docs and compose; added CI workflow for
local-emulator E2E runs.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Konstantin Wohlwend <n2d4xc@gmail.com>
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
https://www.loom.com/share/3b7c9288149e4f878693281778c9d7e0
## Todos (future PRs)
- Fix pre-login recording
- Better session search (filters, cmd-k, etc)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Analytics → Replays: session recording & multi-tab replay with
timeline, speed, seek, and playback settings; dashboard UI for listing
and viewing replays.
* **Admin APIs**
* Admin endpoints to list recordings, list chunks, fetch chunk events,
and retrieve all events (paginated).
* **Client**
* Client-side rrweb recording with batching, deduplication, upload API
and a send-batch client method.
* **Configuration**
* New STACK_S3_PRIVATE_BUCKET for private session storage.
* **Tests**
* Extensive unit and end-to-end tests for replay logic, streams,
playback, and APIs.
* **Chores**
* Removed an E2E API test GitHub Actions workflow.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<img width="1920" height="969" alt="Screenshot 2026-02-04 at 9 47 16 AM"
src="https://github.com/user-attachments/assets/d7d0cd04-0051-4fc4-b857-e6f87ee97a59"
/>
**This PR revolves around the following components**
1. Sequencer - sequences the updates in the internal db
2. Poller - polls for the latest updates to sync with the external db
3. Outgoing Request Handler - essentially a trigger that can make http
requests based on a change in the internal db
4. Sync Engine - syncs with the latest changes from the internal db to
the external db
**What has been done**
- Added a global sequence id for ProjectUser, ContactChannel and
DeletedRow.
- Added the deletedRow table to keep track of the rows that were deleted
across ProjectUser and ContactChannel.
- Added the OutgoingRequest table to keep track of the outgoing requests
- Added function for the sequencer to call to sequence updates
- Added a sequencer that sequences all the changes in the internal db
every 50 ms
- Added a poller that polls for the latest changes in the internal db
every 50 ms, and adds to a queue
- Added a Vercel cron that calls sequencer and poller every minute
- Added a queue that fulfills the outgoing requests by making http calls
(for external db sync, it calls the sync engine endpoint)
- Added a sync engine that uses the defined sql mapping query in the
user's schema to pull in the changes for the user, and sync them with
the external db
- Added tests to test out each functionality
**How to review this PR:**
1. Review the migrations (sequence id, deletedRow, triggers, backlog
sync) (all files created under the migrations folder)
2. Review sequencer
3. Review poller
4. Review the changes in schema
5. Review sync-engine (the function, and it's helper file)
6. Review the schema changes, and query mappings
7. Review the tests (basic, advanced and race, along with the helper
file)
8. Review the changes made in Dockerfile to support local testing using
the postgres docker
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Introduces a cron-driven external DB sync pipeline with global
sequencing, internal poller and webhook sync engine, new DB
tables/functions, config schema/mappings, and comprehensive e2e tests.
>
> - **Database (Prisma/Migrations)**:
> - Add global sequence (`global_seq_id`) and
`sequenceId`/`shouldUpdateSequenceId` to `ProjectUser`,
`ContactChannel`, `DeletedRow` with partial indexes.
> - Create `DeletedRow` (capture deletes) and `OutgoingRequest` (queue)
tables; add unique/indexes.
> - Add triggers/functions: `log_deleted_row`,
`reset_sequence_id_on_update`, `backfill_null_sequence_ids`,
`enqueue_tenant_sync`.
> - **Backend/API**:
> - New internal routes: `GET
/api/latest/internal/external-db-sync/sequencer`, `GET /poller`, `POST
/sync-engine` (Upstash-verified) for sync orchestration.
> - Add cron wiring: `vercel.json` schedules and local
`scripts/run-cron-jobs.ts`; start in dev via `dev` script.
> - Tweak route handler (remove noisy logging) without behavior change.
> - **Sync Engine**:
> - Implement `src/lib/external-db-sync.ts` to read tenant mappings and
upsert to external Postgres (schema bootstrap, param checks,
sequencing).
> - Add default mappings `DEFAULT_DB_SYNC_MAPPINGS` and config schema
`dbSync.externalDatabases` in shared config.
> - **Testing/Infra**:
> - Add extensive e2e tests (basics, advanced, race conditions) for
sequencing, idempotency, deletes, pagination, multi-mapping, and
permissions.
> - Docker compose: add `external-db-test` Postgres for tests; e2e deps
for `pg` types.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
3f2a8efcfb. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* External PostgreSQL sync: automatic, batched replication with
mappings, resume/idempotency, and on-demand enqueueing.
* **Admin UI**
* Real-time External DB Sync dashboard and status API showing
per-mapping backlog, sequencer/poller/sync-engine telemetry, and fusebox
controls.
* **Tests**
* Large e2e suite: basic, advanced, race, high-volume tests and test
utilities for external DB sync.
* **Chores**
* DB migrations, CI/workflow updates, background cron runner and
local/dev test support.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: Konsti Wohlwend <n2d4xc@gmail.com>
Co-authored-by: Bilal Godil <bg2002@gmail.com>
<!-- CURSOR_SUMMARY -->
> [!NOTE]
> **High Risk**
> Touches core sign-up/auth flows and user restriction semantics
(including new DB constraints) and introduces dynamic rule
evaluation/logging; misconfiguration or CEL/parser bugs could block
sign-ups or incorrectly restrict users.
>
> **Overview**
> Introduces **CEL-based sign-up rules** (config-driven) that are
evaluated during password/OTP/OAuth sign-ups and anonymous upgrades;
matching rules can reject sign-ups or mark users as admin-restricted,
and triggers are logged for analytics.
>
> Extends `ProjectUser` with `restrictedByAdmin` plus public/private
restriction details, updates restriction computation/filtering, and
exposes these fields via user CRUD (including validation + DB constraint
enforcing consistency when unrestricted).
>
> Adds a new dashboard **Sign-up Rules** page with a visual condition
builder (CEL <-> visual tree), drag-reorder by priority, per-rule 48h
sparkline analytics via a new hidden internal endpoint, and adds
user-page UI to view/edit manual restrictions. Also refactors ClickHouse
client initialization to require env vars (removing
`isClickhouseConfigured` checks) and adjusts CI container startup wait
time.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
2141e689e8c1b72303b805e9234f996010d0880. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Sign-up Rules: visual rule builder, in-project CRUD with drag-reorder,
per-rule analytics, backend evaluation, and admin UI.
* Admin user restrictions: dashboard controls, banners/status,
public/private admin details surfaced in user views.
* **APIs & Schema**
* Config and user schemas extended; new SignUpRejected error and sign-up
rule types added.
* **Tests**
* Extensive unit and E2E coverage for rules, parser, evaluator,
analytics, and restricted-user flows.
* **Docs**
* Editorial guidance added to AGENTS.md.
* **Chores**
* DB statement timeout, updated clean script, minor dependency
additions.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
### Context
We noticed some errors pop up on sentry related to email rendering.
These errors seem to have been triggered by the same issue, and could be
categorized as follows:
1. Sanity test mismatch, even when the errors from freestyle and vercel
sandbox were broadly similar. This occurred due to stack traces
differing in different execution environments.
2. Rendering errors from freestyle and vercel sandbox caused by the
theme not being imported/ empty theme component.
Upon investigation, this occurred because hitting save on the email
themes page with an invalid theme (ex: deleting the `export` keyword, or
renaming the `EmailTheme` component) still triggers `bundleAndExecute`
with the invalid themes. This will obviously fail and cause the errors
to be logged, however there is no cause for concern here because the
error is returned and the save is denied because an error is returned.
It's more of a matter of noisy error logs and too strict sanity test
comparisons.
Beyond that, `js-execution` is a little opaque and hard to understand,
and this can mask errors in logic.
We also noticed a new issue: manually throwing an error in the email
theme code editor, and then trying to save was actually successful. This
was because the version of `react-email/components` we were using had
faulty error handling, and fell back to client side rendering, masking
the error. This wasn't caught by our `try-catch` safeguards because it
was a render time issue that was masked. More specifically, this was
what `react-email` was doing: `Switched to client rendering because the
server rendering errored`.
### Summary of Changes
We loosen the sanity test comparison between engine execution results in
case of errors. We then refactor the `js-execution` and
`email-rendering` files to read better, and to only `captureError` when
a service is down, but not for runtime errors in the user submitted
code.
To deal with the other bug, we bumped `react-email/components` to the
latest version. However, doing so exposed a gap between real `freestyle`
and our `freestyle-mock`: with the mock, the errors that were now raised
were treated as uncaught exceptions, crashing the mock server.
Consequently, we switched to using `node` over `bun`.
We also expanded test coverage to account for different error paths.
Co-authored-by: Konstantin Wohlwend <n2d4xc@gmail.com>
<!--
Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md
-->
Conditionally generate secrets. This stops docker image from generating
new secrets upon every restart.
Originally reported in #578.
This fix aims to resolve this issue.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Bug Fixes**
* Secret values can now be externally injected during startup without
being overwritten. Pre-configured secrets are preserved instead of being
regenerated.
<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
We update the sdk dependencies (the ones present in
`package-template.json`) to the latest versions. Since several packages
have major version bumps, this results in a variety of breaking changes
that have been handled here. Incidentally, when possible, we update
similar dependencies across the codebase.
We decide to defer the tailwind update to another PR owing to its scale.
The rest of the updates and changes have been catalogued below:
1.
[Bumping](https://github.com/panva/oauth4webapi/blob/v3.x/CHANGELOG.md)
`oauth4webapi` to 3.8.3: this was a major version changed. While there
were no compatibility issues in the sdk, there were several breaking
changes in `stack-shared`. Namely:
a. The removal of `isOauth2Error`. We used this to check if the results
of our `oauth4webapi` api invocations had issues. The functions were
changed to explicitly throw either `ResponseBodyErrors` or
`AuthorizationResponseErrors`, so the code was reworked to account for
that with no loss in error handling.
b. Dropping of support for http broadly: `oauth4webapi` now only accepts
https. This is desired, but I add a carve out for our test environments
only.
c. `refreshTokenGrantRequest` and `authorizationCodeGrantRequest` now
require `clientAuthentication` to be passed explicitly to them.
d. Changes in how we handle our `MultiFactorAuthenticationRequired`
error: This is an error that we created and is passed to the
`oauth4webapi` API if there are MFA issues. Since the
`processAuthorizationCodeResponse` now explicitly throws a
`ResponseBodyError`, we access the error cause from the body of the
error instead.
2. [Bumping](https://github.com/Qix-/color/releases) `color` to 5.0.4:
this was a major version bump. Simple type checking change, I checked
the API for the correct interface.
3.
[Bumping](https://github.com/MasterKale/SimpleWebAuthn/blob/master/CHANGELOG.md)
`simplewebauthn` to 13.2.2: two major version bumps, but no
incompatibilities surprisingly
4. [Bumping](https://github.com/jshttp/cookie/releases) `cookie` to
1.1.1: this was a major version bump.
a. Changing `parse` to `parseCookie`. In the most recent version,
`parse` is still maintained as an alias for `parseCookie` for backwards
compatibility, but I thought it would be best to change it over now. No
change in functionality.
b. Typing is now strongly enforced. A cookie can be `string |
undefined`, and the `Cookies` are now `Record<string, string |
undefined>`. We already have code to handle if a cookie is returned as
undefined/ null, so the changes here were more to ensure type
compatibility rather than big changes in functionality.
5. [Bumping ](https://github.com/isaacs/rimraf#readme)`rimraf` to 6.1.2:
No breaking changes, mostly just bug fixes.
6. [Bumping](https://github.com/panva/jose/releases?page=1) `jose` to
6.1.3: This is another major version bump. We update it across the
codebase to ensure compatibility. We use this for importing and
processing jwk tokens. There are a few big changes in the version bump,
but the only one that applies to us is that `importJwk` now yields a
`CryptoKey` instead of a `KeyObject` in Node.js. However, this doesn't
appear to break our code. We use `importJwk` in
`stack-auth/packages/stack-shared/src/utils/jwt.tsx`.
7. [Bumping](https://github.com/react-hook-form/resolvers/releases)
`hookform/resolvers` to 5.2.2 (two major version jumps), and
consequently bumping `react-hook-form` to 7.70.0: We already use the
patterns that `hookform/resolvers`' latest versions seem to be
enforcing. The only other breaking change is that it requires version
7.55.0+ of `react-hook-form`. Though we should pay attention to any
interactions with zod and `hookform/resolvers`, some people have
reported compatibility issues if they aren't using the latest compatible
versions of both.
8. [Bumping](https://github.com/jquense/yup/blob/master/CHANGELOG.md)
`yup` to 1.7.1: this was a minor version change, but we had
incompatibility issues with this change. Versions 1.4.1 and 1.7.1 cannot
exist in the same codebase due to incompatibility, so we bumped it up
across the codebase, including in peer dependencies.
9. Some minor version changes for some packages, but these were mostly
bug fixes.
10. **Edited to add**: Bumping freestyle to 0.1.6, and reworking the
freestyle mock server. In 0.1.6, freestyle changed their API in two
ways:
a. We're now supposed to hit their `execute/v2/...` endpoint and
b. They've flattened the `config` argument to `serverless.runs.create`.
These changes are minor, but are important. As part of a general suite
of dependency bumps, this was judged to fit here.
We have linked the changelogs for the packages on each line.
<!--
Make sure you've read the CONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Added environment variable handling and logging interception for
improved script execution visibility.
* **Bug Fixes**
* Enhanced error handling with structured error responses and logs.
* **Chores**
* Updated dependencies: arktype and react-dom.
* Improved temporary work directory management and cleanup process.
* Optimized script execution model for better performance.
<sub>✏️ Tip: You can customize this high-level summary in your review
settings.</sub>
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!--
ONTRIBUTING.md guidelines:
https://github.com/stack-auth/stack-auth/blob/dev/CONTRIBUTING.md
-->
<!-- RECURSEML_SUMMARY:START -->
## High-level PR Summary
This PR changes the default development ports for several background
services to avoid conflicts. PostgreSQL moves from port `5432` to
`8128`, Inbucket SMTP from `2500` to `8129`, Inbucket POP3 from `1100`
to `8130`, and the OpenTelemetry collector from `4318` to `8131`. All
references across configuration files, Docker Compose setups,
environment files, CI/CD workflows, test files, and documentation have
been updated to reflect these new port assignments. A knowledge base
document has been added to document the new port mappings.
⏱️ Estimated Review Time: 15-30 minutes
<details>
<summary>💡 Review Order Suggestion</summary>
| Order | File Path |
| --- | --- |
| 1 | `claude/CLAUDE-KNOWLEDGE.md` |
| 2 | `apps/dev-launchpad/public/index.html` |
| 3 | `docker/dependencies/docker.compose.yaml` |
| 4 | `docker/emulator/docker.compose.yaml` |
| 5 | `apps/backend/.env` |
| 6 | `apps/backend/.env.development` |
| 7 | `docker/server/.env.example` |
| 8 | `package.json` |
| 9 | `.devcontainer/devcontainer.json` |
| 10 | `apps/e2e/.env.development` |
| 11 | `.github/workflows/check-prisma-migrations.yaml` |
| 12 | `.github/workflows/docker-server-test.yaml` |
| 13 | `.github/workflows/e2e-api-tests.yaml` |
| 14 | `.github/workflows/e2e-source-of-truth-api-tests.yaml` |
| 15 | `.github/workflows/restart-dev-and-test.yaml` |
| 16 |
`apps/e2e/tests/backend/endpoints/api/v1/internal/email-drafts.test.ts`
|
| 17 | `apps/e2e/tests/backend/endpoints/api/v1/internal/email.test.ts`
|
| 18 | `apps/e2e/tests/backend/endpoints/api/v1/send-email.test.ts` |
| 19 |
`apps/e2e/tests/backend/endpoints/api/v1/unsubscribe-link.test.ts` |
| 20 | `apps/e2e/tests/backend/workflows.test.ts` |
| 21 | `docs/templates/others/self-host.mdx` |
</details>
[](https://discord.gg/n3SsVDAW6U)
[
<!-- RECURSEML_SUMMARY:END -->
<!-- ELLIPSIS_HIDDEN -->
----
> [!IMPORTANT]
> This PR introduces customizable development ports using
`NEXT_PUBLIC_STACK_PORT_PREFIX`, updating configurations, documentation,
and tests accordingly.
>
> - **Behavior**:
> - Default development ports for services are now customizable via
`NEXT_PUBLIC_STACK_PORT_PREFIX`.
> - PostgreSQL port changed from `5432` to
`${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}28`.
> - Inbucket SMTP port changed from `2500` to
`${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}29`.
> - Inbucket POP3 port changed from `1100` to
`${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}30`.
> - OpenTelemetry collector port changed from `4318` to
`${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}31`.
> - **Configuration**:
> - Updated `docker.compose.yaml` to use new port variables for services
like PostgreSQL, Inbucket, and OpenTelemetry.
> - Environment files in `apps/backend`, `apps/dashboard`, and
`apps/e2e` updated to use `NEXT_PUBLIC_STACK_PORT_PREFIX`.
> - `package.json` scripts updated to reflect new port configurations.
> - **Documentation**:
> - Added `CLAUDE-KNOWLEDGE.md` to document new port mappings.
> - Updated `self-host.mdx` to reflect new port configurations.
> - **Testing**:
> - Updated test files in `apps/e2e/tests` to use new port
configurations.
> - Added `helpers/ports.ts` for port-related utilities in tests.
>
> <sup>This description was created by </sup>[<img alt="Ellipsis"
src="https://img.shields.io/badge/Ellipsis-blue?color=175173">](https://www.ellipsis.dev?ref=stack-auth%2Fstack-auth&utm_source=github&utm_medium=referral)<sup>
for 76ef55f58f. You can
[customize](https://app.ellipsis.dev/stack-auth/settings/summaries) this
summary. It will automatically update as commits are pushed.</sup>
----
<!-- ELLIPSIS_HIDDEN -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
- Enable configurable development ports via a
NEXT_PUBLIC_STACK_PORT_PREFIX, allowing parallel local environments with
custom port prefixes.
- **Bug Fixes**
- Updated local service port mappings and CI/workflow settings so
tooling and tests use the new prefixed ports consistently.
- **Documentation**
- Added docs and contributor guidance for running multiple parallel
workspaces with custom port prefixes.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: N2D4 <N2D4@users.noreply.github.com>