stack/docker/dependencies/docker.compose.yaml
aadesh18 2055d98dea
External db sync (#1036)

**This PR revolves around the following components**
1. Sequencer - sequences the updates in the internal db
2. Poller - polls for the latest updates to sync with the external db
3. Outgoing Request Handler - essentially a trigger that can make HTTP
requests based on a change in the internal db
4. Sync Engine - syncs the latest changes from the internal db to
the external db

**What has been done**
- Added a global sequence id for ProjectUser, ContactChannel and
DeletedRow.
- Added the DeletedRow table to keep track of the rows that were deleted
across ProjectUser and ContactChannel.
- Added the OutgoingRequest table to keep track of the outgoing requests.
- Added a function for the sequencer to call to sequence updates.
- Added a sequencer that sequences all the changes in the internal db
every 50 ms.
- Added a poller that polls for the latest changes in the internal db
every 50 ms and adds them to a queue.
- Added a Vercel cron that calls the sequencer and poller every minute.
- Added a queue that fulfills the outgoing requests by making HTTP calls
(for external db sync, it calls the sync engine endpoint).
- Added a sync engine that uses the SQL mapping query defined in the
user's schema to pull in the changes for the user and sync them with
the external db.
- Added tests for each piece of functionality.
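
The sequencer and poller steps above can be pictured with a small in-memory sketch. This is not the code from this PR: the real implementation works against Postgres tables and SQL functions, and the names `TrackedRow`, `runSequencer`, `runPoller`, and the per-tenant `lastSynced` cursor are hypothetical. The sync-engine path is assumed from the route list in the summary further down.

```typescript
// Hypothetical in-memory model of the sequencer/poller flow (assumption, not the PR's code).
type TrackedRow = {
  id: string;
  tenantId: string;
  sequenceId: number | null;       // null until the sequencer has seen the row
  shouldUpdateSequenceId: boolean; // set when the row is created or updated
};

type OutgoingRequest = { tenantId: string; url: string };

// Stands in for the global database sequence shared by all tracked tables.
let globalSeq = 0;

// Sequencer pass: give every pending row a fresh, strictly increasing sequence id.
function runSequencer(rows: TrackedRow[], batchSize: number): number {
  let assigned = 0;
  for (const row of rows) {
    if (assigned >= batchSize) break;
    if (!row.shouldUpdateSequenceId) continue;
    globalSeq += 1;
    row.sequenceId = globalSeq;
    row.shouldUpdateSequenceId = false;
    assigned += 1;
  }
  return assigned;
}

// Poller pass: enqueue at most one sync request per tenant that has unsynced changes.
function runPoller(
  rows: TrackedRow[],
  lastSynced: Record<string, number | undefined>,
  queue: OutgoingRequest[],
): void {
  for (const row of rows) {
    if (row.sequenceId === null) continue; // not sequenced yet
    const syncedUpTo = lastSynced[row.tenantId] ?? 0;
    const alreadyQueued = queue.some((q) => q.tenantId === row.tenantId);
    if (row.sequenceId > syncedUpTo && !alreadyQueued) {
      // Assumed path, derived from the routes listed in the PR summary.
      queue.push({ tenantId: row.tenantId, url: "/api/latest/internal/external-db-sync/sync-engine" });
    }
  }
}

const rows: TrackedRow[] = [
  { id: "u1", tenantId: "t1", sequenceId: null, shouldUpdateSequenceId: true },
  { id: "u2", tenantId: "t1", sequenceId: null, shouldUpdateSequenceId: true },
  { id: "u3", tenantId: "t2", sequenceId: null, shouldUpdateSequenceId: true },
];
const queue: OutgoingRequest[] = [];

const sequenced = runSequencer(rows, 100);
runPoller(rows, {}, queue);
console.log(sequenced, queue.map((q) => q.tenantId));
```

Deduplicating per tenant keeps the queue small even when many rows change between two poller runs; whether the real poller deduplicates in exactly this way is an assumption.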


**How to review this PR:**
1. Review the migrations (sequence id, deletedRow, triggers, backlog
sync) (all files created under the migrations folder)
2. Review sequencer
3. Review poller
4. Review the changes in schema
5. Review sync-engine (the function, and its helper file)
6. Review the schema changes, and query mappings
7. Review the tests (basic, advanced and race, along with the helper
file)
8. Review the changes made in Dockerfile to support local testing using
the postgres docker

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Introduces a cron-driven external DB sync pipeline with global
sequencing, internal poller and webhook sync engine, new DB
tables/functions, config schema/mappings, and comprehensive e2e tests.
> 
> - **Database (Prisma/Migrations)**:
>   - Add global sequence (`global_seq_id`) and
`sequenceId`/`shouldUpdateSequenceId` to `ProjectUser`,
`ContactChannel`, `DeletedRow` with partial indexes.
>   - Create `DeletedRow` (capture deletes) and `OutgoingRequest` (queue)
tables; add unique/indexes.
>   - Add triggers/functions: `log_deleted_row`,
`reset_sequence_id_on_update`, `backfill_null_sequence_ids`,
`enqueue_tenant_sync`.
> - **Backend/API**:
>   - New internal routes: `GET
/api/latest/internal/external-db-sync/sequencer`, `GET /poller`, `POST
/sync-engine` (Upstash-verified) for sync orchestration.
>   - Add cron wiring: `vercel.json` schedules and local
`scripts/run-cron-jobs.ts`; start in dev via `dev` script.
>   - Tweak route handler (remove noisy logging) without behavior change.
> - **Sync Engine**:
>   - Implement `src/lib/external-db-sync.ts` to read tenant mappings and
upsert to external Postgres (schema bootstrap, param checks,
sequencing).
>   - Add default mappings `DEFAULT_DB_SYNC_MAPPINGS` and config schema
`dbSync.externalDatabases` in shared config.
> - **Testing/Infra**:
>   - Add extensive e2e tests (basics, advanced, race conditions) for
sequencing, idempotency, deletes, pagination, multi-mapping, and
permissions.
>   - Docker compose: add `external-db-test` Postgres for tests; e2e deps
for `pg` types.
> 
<!-- /CURSOR_SUMMARY -->
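
To illustrate why sequence ids make the sync idempotent, here is a minimal sketch of applying a batch of changes to an external store. It is an assumption-laden model, not the PR's `external-db-sync.ts`: the `Change` and `ExternalTable` types, the `applyChanges` function, and the single `lastAppliedSequenceId` cursor are hypothetical, and a plain object stands in for the external Postgres table.

```typescript
// Hypothetical change record: upserts come from live rows, deletes from DeletedRow.
type Change =
  | { kind: "upsert"; id: string; sequenceId: number; data: { email: string } }
  | { kind: "delete"; id: string; sequenceId: number };

// Stand-in for an external table plus the highest sequence id applied so far.
type ExternalTable = {
  rows: Record<string, { email: string }>;
  lastAppliedSequenceId: number;
};

// Apply changes in sequence order, skipping anything already applied.
function applyChanges(target: ExternalTable, changes: Change[]): number {
  const ordered = changes.slice().sort((a, b) => a.sequenceId - b.sequenceId);
  let applied = 0;
  for (const change of ordered) {
    if (change.sequenceId <= target.lastAppliedSequenceId) continue; // replayed change
    if (change.kind === "upsert") {
      target.rows[change.id] = change.data;
    } else {
      delete target.rows[change.id];
    }
    target.lastAppliedSequenceId = change.sequenceId;
    applied += 1;
  }
  return applied;
}

const external: ExternalTable = { rows: {}, lastAppliedSequenceId: 0 };
const batch: Change[] = [
  { kind: "delete", id: "u1", sequenceId: 3 },
  { kind: "upsert", id: "u1", sequenceId: 1, data: { email: "a@example.com" } },
  { kind: "upsert", id: "u2", sequenceId: 2, data: { email: "b@example.com" } },
];

const firstRun = applyChanges(external, batch); // applies all three changes
const replay = applyChanges(external, batch);   // same batch again: nothing to do
console.log(firstRun, replay, Object.keys(external.rows));
```

Because every change carries a globally increasing sequence id, redelivering the same batch (for example after a retried HTTP request) leaves the external store unchanged.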

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
  * External PostgreSQL sync: automatic, batched replication with
    mappings, resume/idempotency, and on-demand enqueueing.

* **Admin UI**
  * Real-time External DB Sync dashboard and status API showing
    per-mapping backlog, sequencer/poller/sync-engine telemetry, and fusebox
    controls.

* **Tests**
  * Large e2e suite: basic, advanced, race, high-volume tests and test
    utilities for external DB sync.

* **Chores**
  * DB migrations, CI/workflow updates, background cron runner and
    local/dev test support.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Konsti Wohlwend <n2d4xc@gmail.com>
Co-authored-by: Bilal Godil <bg2002@gmail.com>
2026-02-05 12:04:31 -08:00

323 lines
9.2 KiB
YAML

services:
  # ================= PostgreSQL =================
  db:
    build: ../dev-postgres-with-extensions
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: PASSWORD-PLACEHOLDER--uqfEC1hmmv
      # Note: A readonly user is also created with password PASSWORD-PLACEHOLDER--readonlyuqfEC1hmmv
      # for read replica emulation. See the Dockerfile for details.
      POSTGRES_DB: stackframe
      POSTGRES_DELAY_MS: ${POSTGRES_DELAY_MS:-0}
      POSTGRES_INITDB_ARGS: --nosync
    # Increase max_connections for E2E tests that create many databases
    command: postgres -c max_connections=500
    ports:
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}28:5432"
    volumes:
      - postgres-data:/var/lib/postgresql/data
    cap_add:
      - NET_ADMIN # required for the fake latency during dev
  # ================= PostgreSQL Replica (with replication lag) =================
  db-replica:
    build: ../dev-postgres-replica
    environment:
      PGDATA: /var/lib/postgresql/data
      PRIMARY_HOST: db
      PRIMARY_PORT: 5432
      REPLICATOR_USER: replicator
      REPLICATOR_PASSWORD: PASSWORD-PLACEHOLDER--replicatorpass
      RECOVERY_MIN_APPLY_DELAY: ${STACK_DATABASE_REPLICA_LAG_MS:-15}ms
    ports:
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}34:5432"
    volumes:
      - postgres-replica-data:/var/lib/postgresql/data
    depends_on:
      - db
  # ================= PgHero =================
  pghero:
    image: ankane/pghero:latest
    environment:
      DATABASE_URL: postgres://postgres:PASSWORD-PLACEHOLDER--uqfEC1hmmv@db:5432/stackframe
    ports:
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}16:8080"
  # ================= WAL Info =================
  wal-info:
    build: ./wal-info
    environment:
      PRIMARY_HOST: db
      PRIMARY_PORT: 5432
      REPLICA_HOST: db-replica
      REPLICA_PORT: 5432
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: PASSWORD-PLACEHOLDER--uqfEC1hmmv
      POSTGRES_DB: stackframe
    ports:
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}38:8080"
    depends_on:
      - db
      - db-replica
  # ================= PgHero (Replica) =================
  pghero-replica:
    image: ankane/pghero:latest
    environment:
      DATABASE_URL: postgres://postgres:PASSWORD-PLACEHOLDER--uqfEC1hmmv@db-replica:5432/stackframe
    ports:
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}35:8080"
    depends_on:
      - db-replica
  # ================= PgAdmin =================
  pgadmin:
    image: dpage/pgadmin4
    environment:
      PGADMIN_DEFAULT_EMAIL: admin@example.com
      PGADMIN_DEFAULT_PASSWORD: PASSWORD-PLACEHOLDER--vu9p2iy3f
      PGADMIN_CONFIG_SERVER_MODE: "False"
      PGADMIN_CONFIG_MASTER_PASSWORD_REQUIRED: "False"
    configs:
      - source: pgadmin_servers
        target: /pgadmin4/servers.json
    ports:
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}17:80"
  # ================= Supabase Studio =================
  supabase-studio:
    image: supabase/studio:20241202-71e5240
    restart: unless-stopped
    healthcheck:
      test:
        [
          "CMD",
          "node",
          "-e",
          "fetch('http://studio:3000/api/profile').then((r) => {if (r.status !== 200) throw new Error(r.status)})"
        ]
      timeout: 10s
      interval: 5s
      retries: 3
    environment:
      STUDIO_PG_META_URL: http://supabase-meta:8080
      POSTGRES_PASSWORD: PASSWORD-PLACEHOLDER--uqfEC1hmmv
      OPENAI_API_KEY: ${OPENAI_API_KEY:-}
      NEXT_PUBLIC_ENABLE_LOGS: true
      NEXT_ANALYTICS_BACKEND_PROVIDER: postgres
    ports:
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}18:3000"
  supabase-meta:
    image: supabase/postgres-meta:v0.84.2
    restart: unless-stopped
    environment:
      PG_META_PORT: 8080
      PG_META_DB_HOST: db
      PG_META_DB_PORT: 5432
      PG_META_DB_NAME: stackframe
      PG_META_DB_USER: postgres
      PG_META_DB_PASSWORD: PASSWORD-PLACEHOLDER--uqfEC1hmmv
  # ================= Inbucket =================
  inbucket:
    image: inbucket/inbucket:3.1.0
    ports:
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}29:2500"
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}05:9000"
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}30:1100"
    volumes:
      - inbucket-data:/data
  # ================= OpenTelemetry & Jaeger =================
  jaeger:
    image: jaegertracing/all-in-one:latest
    environment:
      - COLLECTOR_OTLP_ENABLED=true
    ports:
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}07:16686" # Jaeger UI
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}31:4318" # OTLP Endpoint
    restart: always
  # ================= svix =================
  svix-db:
    image: "docker.io/postgres:16.1"
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: PASSWORD-PLACEHOLDER--KsoIMcchtp
      POSTGRES_DB: svix
    volumes:
      - svix-postgres-data:/var/lib/postgresql/data
  svix-redis:
    image: docker.io/redis:7-alpine
    command: --save 60 500 --appendonly yes --appendfsync everysec --requirepass PASSWORD-PLACEHOLDER--oVn8GSD6b9
    volumes:
      - svix-redis-data:/data
  svix-server:
    image: svix/svix-server
    environment:
      WAIT_FOR: 'true'
      SVIX_REDIS_DSN: redis://:PASSWORD-PLACEHOLDER--oVn8GSD6b9@svix-redis:6379
      SVIX_DB_DSN: postgres://postgres:PASSWORD-PLACEHOLDER--KsoIMcchtp@svix-db:5432/svix
      SVIX_CACHE_TYPE: memory
      SVIX_JWT_SECRET: secret
      SVIX_LOG_LEVEL: trace
      SVIX_QUEUE_TYPE: redis
    ports:
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}13:8071"
    depends_on:
      - svix-redis
      - svix-db
  # ================= Adobe S3 Mock =================
  s3mock:
    image: adobe/s3mock:latest
    ports:
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}21:9090"
    environment:
      - initialBuckets=stack-storage
      - root=s3mockroot
      - debug=false
    volumes:
      - s3mock-data:/tmp/s3mock
    healthcheck:
      test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost:9090/"]
      interval: 30s
      timeout: 10s
      retries: 3
  # ================= LocalStack =================
  localstack:
    image: localstack/localstack:4.7
    ports:
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}24:4566" # LocalStack Gateway
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}50-${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}99:4510-4559" # external services port range
    environment:
      # LocalStack configuration: https://docs.localstack.cloud/references/configuration/
      - DEBUG=${DEBUG:-0}
    volumes:
      - localstack-data:/var/lib/localstack
      - "/var/run/docker.sock:/var/run/docker.sock"
  # ================= Freestyle mock =================
  freestyle-mock:
    build:
      context: ./freestyle-mock
      dockerfile: Dockerfile
    image: freestyle-mock
    ports:
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}22:8080" # POST http://localhost:${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}19/execute/v1/script
    extra_hosts:
      - "host.docker.internal:host-gateway" # noop on Docker Desktop/Orbstack, enables host.docker.internal on Linux
    environment:
      DENO_DIR: /deno-cache
      HOST_ON_HOST: host.docker.internal
    volumes:
      - deno-cache:/deno-cache
  # ================= Stripe =================
  stripe-mock:
    image: stripe/stripe-mock:v0.195.0
    ports:
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}23:12111"
    environment:
      - STRIPE_API_KEY=sk_test_1234567890
  # ================= QStash =================
  qstash:
    image: bgodil/qstash:latest
    ports:
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}25:8080"
    command: qstash dev
    extra_hosts:
      - "host.docker.internal:host-gateway" # noop on Docker Desktop/Orbstack, enables host.docker.internal on Linux
    environment:
      HOST_ON_HOST: host.docker.internal
  # ================= ClickHouse =================
  clickhouse:
    image: clickhouse/clickhouse-server:25.10
    ports:
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}36:8123" # HTTP interface
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}37:9000" # Native interface
    environment:
      CLICKHOUSE_DB: analytics
      CLICKHOUSE_USER: stackframe
      CLICKHOUSE_PASSWORD: PASSWORD-PLACEHOLDER--9gKyMxJeMx
      CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT: 1
    volumes:
      - clickhouse-data:/var/lib/clickhouse
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
  # ================= Drizzle Gateway =================
  drizzle-gateway:
    image: ghcr.io/drizzle-team/gateway:latest
    restart: always
    ports:
      - "${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}33:1133"
    environment:
      PORT: 1133
      STORE_PATH: /app
    volumes:
      - drizzle-gateway:/app
# ================= volumes =================
volumes:
  postgres-data:
  postgres-replica-data:
  inbucket-data:
  svix-redis-data:
  svix-postgres-data:
  s3mock-data:
  deno-cache:
  localstack-data:
  clickhouse-data:
  drizzle-gateway:
# ================= configs =================
configs:
  pgadmin_servers:
    content: |
      {
        "Servers": {
          "1": {
            "Name": "Local Postgres DB",
            "Group": "Servers",
            "Host": "db",
            "Port": 5432,
            "Username": "postgres",
            "PasswordExecCommand": "echo 'PASSWORD-PLACEHOLDER--uqfEC1hmmv'",
            "MaintenanceDB": "stackframe"
          }
        }
      }