mirror of
https://github.com/stack-auth/stack.git
synced 2026-06-30 21:01:54 +08:00
dev
4 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
38ae913fc9
|
Rename STACK_* env vars to HEXCLAVE_* in env templates, with legacy dual-read (#1588)
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
DB migration compat / Check if migrations changed (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests (Local Emulator) / E2E Tests (Local Emulator, Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Runs E2E Fallback Tests / E2E Fallback Tests (Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Lint & build / lint_and_build (24) (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
DB migration compat / No migration changes (skipped) (push) Has been cancelled
## Summary
Completes the env-var side of the Hexclave rebrand: every
`STACK_*`-prefixed variable (including `NEXT_PUBLIC_STACK_*` and
`VITE_STACK_*`) is renamed to `HEXCLAVE_*` across all checked-in `.env`,
`.env.development`, and `.env.example` files (30 files, ~135 keys).
Legacy `STACK_*` names keep working everywhere via dual-read, so
**existing deployments, `.env.local` files, and self-hosted setups need
no immediate migration**.
## How legacy names keep working
- **Server code** already resolves `HEXCLAVE_*` first with `STACK_*`
fallback via `getEnvVariable`. Direct `process.env.STACK_X` readers fed
by the renamed files (prisma seed, e2e tests/helpers, internal-tool
scripts, examples, `prisma.config.ts`) now read `HEXCLAVE_X || STACK_X`.
- **Client code** (Next.js build-time inlining) uses literal dual-read
expressions; the dashboard's `_inlineEnvVars` already had them.
- **Docker/self-hosting**: `docker/server/entrypoint.sh` (shared by the
server and local-emulator images) gets a generic two-way
`HEXCLAVE_`↔`STACK_` env mirror — runs at startup and again before
sentinel replacement — replacing the previous URL-trio-only mirror.
Operators can use either prefix.
## The empty-placeholder trap (`||` vs `??`)
The checked-in templates define empty placeholders (`HEXCLAVE_X=#
comment` parses to `""` via dotenv). With `?? `-based fallbacks, that
empty string would silently shadow a real value under the legacy name —
including legacy vars set in Vercel/CI env at build time, since the
tracked `.env` is present during builds. All fallback chains therefore
treat empty-as-unset (`||`):
- `getEnvVariable` and `getProcessEnv` in `packages/shared`
- the dashboard/docs/example literal dual-reads
- the generated SDK env getters (via
`packages/template/scripts/generate-env.ts`; the generated
`src/generated/env.ts` files are gitignored and regenerate at build)
## Other notable changes
- Tests that override env now set the canonical `HEXCLAVE_*` name (it
wins over `STACK_*`): e2e `cross-domain-auth`, backend
`internal-feedback-emails` in-source test.
- e2e `helpers.ts` port-prefix expansion loop also matches the
`HEXCLAVE_` prefixes.
- `docker/local-emulator/generate-env-development.mjs` reads source keys
canonically (legacy fallback) and emits canonical keys; regenerated
output matches.
- `rotate-secrets.sh` falls back to
`HEXCLAVE_DATABASE_CONNECTION_STRING`.
- Docs code snippets (`docs/code-examples`) renamed outright to
canonical names, consistent with #1571.
- OAuth callback `console.warn` in `packages/template/src/lib/auth.ts`
now says Hexclave.
## Migration note for the team
Local `.env.local` files with legacy `STACK_*` overrides keep working
**unless** the override targets a var that `.env.development` now sets
to a real (non-empty) `HEXCLAVE_*` value — the canonical name wins over
file precedence. Rename those keys in your `.env.local` once.
## Verification
- `typecheck` + `lint` pass on every touched package (shared, backend,
dashboard, e2e, internal-tool, cli, docs, template). Pre-existing
failures on dev (`admin-app-impl.ts` typecheck, dashboard metrics-page
errors) are unchanged (identical error counts with/without this change).
- `getEnvVariable`/`getProcessEnv` fallback semantics smoke-tested
directly (empty-HEXCLAVE → legacy fallback, HEXCLAVE wins when set,
defaults intact).
- `internal-feedback-emails` in-source vitest passes; emulator env
generator `--check` passes; `bash -n` on touched shell scripts.
- Two independent review agents audited the diff for correctness bugs
and coverage gaps; all confirmed findings are fixed in the third commit.
<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Renamed all `STACK_*` env vars (including
`NEXT_PUBLIC_STACK_*`/`VITE_STACK_*`) to `HEXCLAVE_*` across env
templates and code, with dual‑read that treats empty as unset, detects
conflicts, ignores post‑build sentinels, and falls back to legacy names.
All GitHub Actions now use `HEXCLAVE_*`; local‑emulator e2e is fixed by
setting `NEXT_PUBLIC_HEXCLAVE_IS_LOCAL_EMULATOR` in CI.
- **Refactors**
- Added conflict‑aware dual‑read helpers (prefer `HEXCLAVE_*`,
empty‑as‑unset, ignore post‑build sentinels, preserve empty passthrough)
and used them across `packages/shared` (resolver + tests),
`apps/dashboard` inline/public envs (with tests), `apps/backend` Prisma
config/seed and vitest (accept both prefixes), `packages/cli`
(API/Dashboard URLs, project ID, `HEXCLAVE_EMULATOR_HOME`; tests),
Docker (`entrypoint.sh` mirroring + `rotate-secrets.sh` DB URL),
docs/components (`docs/src/lib/env.ts`), and examples; hosted/Vite apps
now error if both spellings differ.
- Port‑prefix expansion includes `HEXCLAVE_*`; backend tests use a new
helper to resolve DB connection strings; Prisma prefers
`HEXCLAVE_DATABASE_CONNECTION_STRING` with legacy fallback.
- Generated SDK env getters use plain `HEXCLAVE_*` || `STACK_*` (no
conflict throw); dashboard inline resolver preserves empty/sentinel
passthrough to avoid build failures; docs/examples include dual‑read
utilities.
- Tests now stub canonical `HEXCLAVE_*` flags (e.g., plan limits, bot
challenge, OAuth tokens, hosted handler) to avoid shadowing/conflict
with committed defaults.
- **Migration**
- No immediate action; legacy `STACK_*` names still work.
- If both names are set with different values, builds/scripts error. Set
only `HEXCLAVE_*` or make both equal.
- SDK consumers won’t see conflict throws; update env names to
`HEXCLAVE_*` over time.
<sup>Written for commit
|
||
|
|
8901a93b55
|
Rename STACK_SEED_INTERNAL_PROJECT_SECRET_SERVER_KEY to STACK_INTERNAL_PROJECT_SECRET_SERVER_KEY (#1415)
## Summary - Renames the env var `STACK_SEED_INTERNAL_PROJECT_SECRET_SERVER_KEY` to `STACK_INTERNAL_PROJECT_SECRET_SERVER_KEY` everywhere it is used (20 occurrences across 8 files), covering backend env files, the Prisma seed script, runtime config, and the docker entrypoint/local-emulator scripts. - Mirrors the prior publishable-client-key rename in #1411. ## Test plan - [x] `pnpm lint` - [x] `pnpm typecheck` - [ ] Verify local emulator still boots with the renamed variable - [ ] Verify any deploy/CI configs that set the old name are updated alongside this change <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Chores** * Updated internal environment variable naming for API key management and server configuration consistency across backend systems, Docker deployment, and local development setup. <!-- end of auto-generated comment: release notes by coderabbit.ai --> |
||
|
|
24245ae54e
|
Rename STACK_SEED_INTERNAL_PROJECT_PUBLISHABLE_CLIENT_KEY to STACK_INTERNAL_PROJECT_PUBLISHABLE_CLIENT_KEY (#1411)
## Summary - Renames the env var `STACK_SEED_INTERNAL_PROJECT_PUBLISHABLE_CLIENT_KEY` to `STACK_INTERNAL_PROJECT_PUBLISHABLE_CLIENT_KEY` everywhere it is used (24 occurrences across 9 files), covering backend env files, the Prisma seed script, runtime config, and the docker entrypoint/local-emulator scripts. ## Test plan - [x] `pnpm lint` - [x] `pnpm typecheck` - [ ] Verify local emulator still boots with the renamed variable - [ ] Verify any deploy/CI configs that set the old name are updated alongside this change <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Chores** * Updated internal environment variable naming for consistency across backend configuration files and deployment scripts. <!-- end of auto-generated comment: release notes by coderabbit.ai --> |
||
|
|
37ee5ec320
|
Fast-start local emulator via RAM snapshot + live secret rotation (#1340)
## Summary
`stack emulator start` now resumes a fully-warm VM snapshot instead of
cold-booting, bringing startup from 30–120s down to ~5–8s with
per-install secret rotation, or ~2.5s with rotation opt-out. The
snapshot is captured **locally on first `stack emulator pull`**, not
shipped from CI — QEMU migration state isn't portable across
accelerators (KVM/HVF/TCG) or `-cpu max` feature sets, so a CI-captured
snapshot couldn't resume reliably on arbitrary user hardware.
Also bundles a pile of CLI QoL fixes (progress bars, PR/run artifact
pulls, PR-build download, native-TS ISO writer replacing
`hdiutil`/`mkisofs`/`genisoimage` host dep, unit tests).
| Scenario | Before | After |
|---|---|---|
| Cold boot (no snapshot) | 30–120s | same, works as fallback |
| `stack emulator pull` (one-time, includes local snapshot capture) |
~30s download | ~30s download + ~1–3 min cold-boot capture |
| Snapshot resume, normal start | — | **~5–8s** |
| Snapshot resume, `EMULATOR_NO_ROTATION=1` | — | **~2.5s** |
Backend (`/health?db=1`) and dashboard (`/handler/sign-in`) return 200
on all paths. Two successive snapshot resumes produce different rotated
PCK/SSK/SAK/CRON_SECRET values per install.
## How it works
**Build (CI)** — `docker/local-emulator/qemu/build-image.sh`:
1. Cloud-init provisioning runs to completion (migrations, seed,
slim-image) producing `stack-emulator-<arch>.qcow2`.
2. Image is built with a topology compatible with later snapshot capture
(pinned SMP=4, phantom seed/bundle ISOs, STACKCFG runtime ISO mounted at
build time, qemu-guest-agent running, placeholder hex secrets baked in
under `STACK_EMULATOR_BUILD_SNAPSHOT=1`).
3. CI publishes **only the qcow2** — no `.savevm.zst` ships.
**Pull (user's machine)** —
`packages/stack-cli/src/commands/emulator.ts` + `run-emulator.sh
capture`:
1. `stack emulator pull` downloads the qcow2 with a progress bar (or
from a PR / workflow run via `--pr` / `--run`).
2. CLI invokes `run-emulator.sh capture`: cold-boots the qcow2 with a
matching device layout (phantom ISOs, fsdev, pcie-root-port, virtfs
detached — migration-incompatible), waits for backend+dashboard health,
then drives QMP: `stop` → set `mapped-ram` + `multifd` caps → `migrate
file:state.raw` → poll `query-migrate` → `quit`. Raw mapped-ram file is
zstd-compressed to `stack-emulator-<arch>.savevm.zst` in the images dir.
3. `--skip-snapshot` opts out (first `start` will then cold-boot).
**Runtime** — `run-emulator.sh start`:
1. Launch QEMU with `-incoming defer` when a `.savevm.zst` is present;
decompress on first use, keep the `.raw` cached for subsequent starts.
2. QMP: same `mapped-ram` + `multifd` caps → `migrate-incoming
file:<.raw>` → poll for `paused` → `cont`.
3. Generate fresh per-install secrets on the host; pipe them
base64-encoded through QGA `guest-exec input-data` →
`trigger-fast-rotate` in the guest → `docker exec -e … rotate-secrets`.
4. `rotate-secrets` in the container: validate keys (hex-only), targeted
`sed` on the placeholder PCK across built JS, `UPDATE ApiKeySet`,
`supervisorctl restart stack-app cron-jobs` (with
`stopasgroup`/`killasgroup` so the Node children actually die and
release their ports).
5. Poll backend+dashboard health; if anything fails, clean up and fall
back to cold boot transparently.
**Security model**: placeholder hex values are baked into the snapshot
(`00…ff` PCK, `00…ee` SSK, `00…dd` SAK, `00…cc` CRON_SECRET). They are
non-secret by construction. Real per-install secrets are generated at
each `emulator start` and never leave the host.
## CLI changes (`packages/stack-cli`)
- **`src/lib/iso.ts`** (new): native TypeScript ISO 9660 + Joliet
writer, replacing the host-side `hdiutil`/`mkisofs`/`genisoimage`
dependency for generating the STACKCFG runtime config disk. Unit tests
in `src/lib/iso.test.ts`.
- **`src/commands/emulator.ts`**:
- `pull`: streamed downloads with progress bar + ETA; `--pr <number>`
and `--run <id>` to pull from a PR build's CI artifacts (uses
`extract-zip` for the nested zip); `--skip-snapshot` to opt out of the
one-time local capture.
- `start` (existing, extended): auto-pulls AND auto-captures when no
image exists, so first-ever `start` is self-bootstrapping; emits
`STACK_EMULATOR_CLI_WROTE_ISO=1` so the shell helper skips its own ISO
regen (avoids the genisoimage host dep).
- `capture` (new, invoked by `pull` and the auto-pull path of `start`):
drives the local snapshot capture via `run-emulator.sh`.
- `status`, `stop`, `reset`, `list-releases`: preflight +
path-resolution tightening (`STACK_EMULATOR_HOME` → images/run dirs).
- Unit tests in `src/commands/emulator.test.ts`.
- **`EMULATOR_NO_ROTATION=1`** env var skips the post-resume rotation
(intended for tests/CI where the placeholder secrets are fine — comes
with a loud warning).
## CI (`.github/workflows/qemu-emulator-build.yaml`)
- Builds **QEMU 10.2.2 from source** (cached), because
`mapped-ram`/`multifd` migration capabilities aren't available in the
distro's QEMU. Enables KVM on ubicloud runners so amd64 boots at
hardware speed.
- amd64 + arm64 both build on the same amd64 matrix
(`ubicloud-standard-8`); arm64 runs under cross-arch TCG (provisioning
only — boot/verify smoke test is amd64-only).
- Verification now runs through the CLI: `emulator start` → `emulator
status` → `emulator stop` against the freshly-built qcow2 (via
`STACK_EMULATOR_HOME` pointing at the workspace, so the CLI doesn't
silently auto-pull a prior release).
- Packages **only** the qcow2. No `.savevm.zst` upload / publish.
- Release notes updated.
## Key files
**Shell / guest:**
- `docker/local-emulator/qemu/build-image.sh` — snapshot-compatible
device topology + STACKCFG runtime ISO at build time
- `docker/local-emulator/qemu/run-emulator.sh` — `start`, `capture`,
`stop`, `reset`, `status`; `-incoming defer`, `.raw` cache, QGA-driven
rotation, cold-boot fallback
- `docker/local-emulator/qemu/common.sh` (new) — shared `qmp_session` +
`capture_vm_state` (factored out so build-image.sh and run-emulator.sh
share the capture path)
- `docker/local-emulator/qemu/cloud-init/emulator/user-data` —
placeholder secrets in snapshot mode, `wait-for-stack-ready`,
`trigger-fast-rotate`, qemu-guest-agent enabled
- `docker/local-emulator/rotate-secrets.sh` (new) — in-container
rotation (sed + UPDATE + supervisorctl)
- `docker/local-emulator/supervisord.conf` — `stopasgroup`/`killasgroup`
on `stack-app` and `cron-jobs`
- `docker/local-emulator/entrypoint.sh` — only mint CRON_SECRET if unset
(placeholder supplied in snapshot mode via --env-file)
- `docker/local-emulator/Dockerfile` — ships `rotate-secrets` to
`/usr/local/bin`
- `docker/server/entrypoint.sh` — source
`/run/stack-auth/rotated-secrets.env`; skip full-tree sentinel scan on
warm restarts via marker
**CLI:**
- `packages/stack-cli/src/lib/iso.ts` (new) + `iso.test.ts` (new)
- `packages/stack-cli/src/commands/emulator.ts` + `emulator.test.ts`
(new)
- `packages/stack-cli/vitest.config.ts` (new)
**CI:**
- `.github/workflows/qemu-emulator-build.yaml`
## Test plan
- [x] `docker/local-emulator/qemu/build-image.sh {amd64,arm64}` produces
`stack-emulator-<arch>.qcow2` with snapshot-compatible topology
- [x] `stack emulator pull` downloads qcow2 with progress, then captures
locally (~1–3 min) and writes `stack-emulator-<arch>.savevm.zst` in the
images dir
- [x] `stack emulator pull --skip-snapshot` stops after download
- [x] `stack emulator pull --pr <n>` / `--run <id>` pull from PR /
workflow run artifacts
- [x] `stack emulator start` on a fresh dir auto-pulls **and**
auto-captures, then starts; subsequent starts fast-resume in ~5–8s;
backend + dashboard return 200
- [x] `EMULATOR_NO_ROTATION=1 stack emulator start` completes in ~2.5s;
backend + dashboard return 200 with warning printed
- [x] Two consecutive `emulator start` invocations produce different PCK
values in the internal `ApiKeySet` row
- [x] `stack emulator status` / `stop` / `reset` resolve paths from
`STACK_EMULATOR_HOME`
- [x] Verified end-to-end on arm64 macOS under HVF (capture ~50s,
fast-resume ~6.5s)
- [x] `pnpm lint` and `pnpm typecheck` pass; stack-cli unit tests (iso +
emulator) pass
- [ ] CI green on this PR (qemu-emulator-build matrix, smoke test)
- [ ] `gh release download emulator-<branch>-latest` contains only
`stack-emulator-<arch>.qcow2` once this PR merges and publish runs
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Snapshot fast-start/resume with optional warm-snapshot assets, runtime
ISO generation, and a cached QEMU build to speed emulator setup.
* CLI: streamed artifact downloads with progress, improved release/asset
handling, stronger preflight checks, and start/status/stop emulator
commands.
* Automated secret rotation and ability to apply rotated secrets at
container startup; supervisor control socket enabled.
* **Bug Fixes**
* More robust start/stop/resume flows with automatic fallback to cold
boot and improved process-group shutdown behavior.
* **Tests**
* New tests for CLI utilities and ISO image generation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
|