stack/docker/server/entrypoint.sh
BilalG1 f7e389809e
Some checks failed
all-good: Did all the other checks pass? / all-good (push) Has been cancelled
Ensure Prisma migrations are in sync with the schema / check_prisma_migrations (22.x) (push) Has been cancelled
DB migration compat / Check if migrations changed (push) Has been cancelled
Docker Server Build and Push / Docker Build and Push Server (push) Has been cancelled
Docker Server Build and Run / docker (push) Has been cancelled
Runs E2E API Tests (Local Emulator) / E2E Tests (Local Emulator, Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (mock, 22.x) (push) Has been cancelled
Runs E2E API Tests / E2E Tests (Node ${{ matrix.node-version }}, Freestyle ${{ matrix.freestyle-mode }}) (prod, 22.x) (push) Has been cancelled
Runs E2E API Tests with custom port prefix / build (22.x) (push) Has been cancelled
Runs E2E Fallback Tests / E2E Fallback Tests (Node ${{ matrix.node-version }}) (22.x) (push) Has been cancelled
Lint & build / lint_and_build (24) (push) Has been cancelled
TOC Generator / TOC Generator (push) Has been cancelled
DB migration compat / Back-compat — Current branch migrations with ${{ needs.check-migrations-changed.outputs.base_branch }} branch code (push) Has been cancelled
DB migration compat / Forward-compat — Current branch code with ${{ needs.check-migrations-changed.outputs.base_branch }} branch migrations (push) Has been cancelled
DB migration compat / No migration changes (skipped) (push) Has been cancelled
feat(hexclave): PR 1 — wire compatibility layer (invisible) (#1475)
## Summary

**Stacked on #1468** (`docs/hexclave-rename-plan` — the plan doc). Diff
vs that base = the actual PR 1 code.

This is **PR 1 of the Hexclave rebrand: the invisible compatibility
layer**. Everything is additive. Old SDKs, old wire identifiers, and old
env var names keep working unchanged. The backend dual-accepts and
dual-emits; new SDK code emits `x-hexclave-*` headers and the
`hexclave_` Bearer prefix; cookies dual-write; env vars dual-read across
every category. **No user-visible rebranding lands here** — that's PR 2.

See [`RENAME-TO-HEXCLAVE.md`](./RENAME-TO-HEXCLAVE.md) → *"PR 1
implementation guide"* for the full per-work-area spec, file pointers,
and chosen approach.

## What's implemented (all 14 PR-1 work-areas)

- **SDK export aliases** — `Hexclave*` aliases for the user-facing
`Stack*` exports added in `packages/template`; codegen propagates them
to `@stackframe/{js,stack,react,tanstack-start}`. React-only aliases
correctly excluded from `@stackframe/js`. (`e60550a2`)
- **JWT issuer dual-accept** — `decodeAccessToken` accepts both
`api.stack-auth.com` and `api.hexclave.com` issuers. Signing unchanged.
(`fc781def`)
- **Request-header dual-accept** — backend + dashboard proxies normalize
`x-hexclave-*` → `x-stack-*` at the existing empty proxy hook (so
`smart-request.tsx` and every route schema keep working unchanged); CORS
allowlists extended via a derive-once helper. (`2a056eac`)
- **MCP `ask_hexclave`** — registered alongside `ask_stack_auth` via a
shared helper; `ask_stack_auth` behavior byte-identical. (`30ffd604`)
- **Dev-tool** — DOM ids + header emit switched.
`window.HexclaveDevTool` exposed alongside `window.StackDevTool`.
(`32131ea7`)
- **The big consolidated commit** (`7fed864a`):
- **Env vars** — central `getEnvVariable` prefix-transform (HEXCLAVE
first, STACK fallback); dashboard + template client env files dual-read;
`turbo.json` globalEnv; `NEXT_PUBLIC_STACK_PORT_PREFIX` renamed outright
across ~82 files including docker.
- **Cookies** — dual-write/dual-read auth (`stack-access`/`-refresh-*`
and custom-domain variants), OAuth-state
(`stack-oauth-{inner,outer}-*`), and low-risk cookies (`stack-is-https`,
`stack-last-seen-changelog-version`). Bypass sites patched (backend
OAuth callback, dashboard remote-dev auth route, impersonation snippets,
snapshot serializer).
- **Bearer prefix** — SDK token parser accepts both `stackauth_` and
`hexclave_`; emits `hexclave_`. Discovery correction: this is purely
SDK-internal — the backend never parses it.
- **Response headers** — backend dual-emits
`x-hexclave-{request-id,actual-status,known-error}`; SDKs dual-read (new
first, stack fallback).
- **SDK request-header emit switch** —
`client/server/admin-interface.ts` + dashboard `api-headers.ts` +
`internal-project-headers.ts` + `feedback-form.tsx` switched to
`x-hexclave-*`. Plus `stack_response_mode` query param.
- **Storage keys** — dev-tool / cli-auth / oauth-button / docs keys
renamed (straight); `stack:session-replay:v1` dual-read so in-progress
recordings survive SDK upgrades; `stack_mfa_attempt_code` dual-read.
- **Query params** — cross-domain params dual-emit/dual-accept via
shared helpers; backend `oauth/authorize` accepts
`hexclave_response_mode` and `stack_response_mode`; `stack-init-id`
renamed.
- **`Symbol.for`** — app-internals symbol gets a parallel
`Symbol.for("Hexclave--app-internals")` getter on each attach site (no
read-site churn — old symbol still attached). 3 file-private symbols
renamed outright.
- **Config discovery** — prefer `hexclave.config.ts`, fall back to
`stack.config.ts` at every discovery site (CLI / dashboard / backend /
local-emulator); `init` writes the new filename; CLI credentials path
migrates.
- **Internal renames** — `StackAssertionError`,
`StackClient/Server/AdminInterface` renamed outright (no alias, per the
"internal-only → rename" rule). ~264 files touched.
- **Review-pass fixes** (`21217fbe`) — three real bugs found by parallel
review agents and fixed:
- `snapshot-serializer.ts` was interpolating the whole
`keyedCookieNamePrefixes` array (`${arr}`) — adding a second prefix
would have corrupted **every** OAuth-cookie snapshot, not just new ones.
- **Docker port-prefix producer/consumer mismatch** —
`entrypoint.sh`/`run-emulator.sh`/cloud-init `user-data` were still
producing `NEXT_PUBLIC_STACK_PORT_PREFIX` while the dashboard sentinel +
consumers had been renamed; silent self-host regression (custom port
prefix would be ignored).
- **Missing `hexclave-oauth-inner-*` dual-write** in the OAuth authorize
route — callback's fallback masked it but the dual-write was specified
by the plan.
- Plus: `mcp.test.ts` tool-list assertions updated to include
`ask_hexclave`; two dashboard header-emit sites switched to
`x-hexclave-*` for consistency.
- **E2E snapshot serializer follow-up** (`4b16cc5d`) —
`x-hexclave-request-id` added to the hidden-headers list (mirroring
`x-stack-request-id` treatment), and 2 sample inline snapshots
regenerated in `projects.test.ts` to include the new dual-emitted
headers.

## Verification

- **`pnpm typecheck`** — clean (the fresh-worktree `@/.source` / Prisma
codegen gap in `stack-docs` is pre-existing and unrelated).
- **`pnpm lint`** — 29/29 packages green.
- **`pnpm exec turbo run build --filter=./packages/*`** — 13/13 packages
build (including `@stackframe/stack-cli` once the dashboard standalone
is present).
- **Live E2E** against a running backend on `cl/hexclave-pr1`:
- `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/internal/mcp.test.ts` — **6/6
pass** (verifies the new `ask_hexclave` tool — the hand-written inline
snapshot matched actual MCP server output).
- `pnpm test run
apps/e2e/tests/backend/endpoints/api/v1/internal/projects.test.ts` —
**11/11 pass** (verifies wire dual-accept + dual-emit end-to-end; the
snapshot serializer fix was found and applied during this check).

A four-agent parallel **review pass** also audited the full diff for
logic/runtime bugs across the work-areas (wire headers + JWT, cookies +
bearer + symbols, env vars, query params + config + MCP + aliases). All
in-slice review verdicts were ✓ except the three bugs listed above,
which are now fixed.

## Known follow-ups (out of scope for this PR)

- **E2E snapshots across the rest of the suite** — backend now
dual-emits `x-hexclave-{known-error,actual-status}` alongside
`x-stack-*`, which legitimately appears in inline snapshots throughout
`apps/e2e`. Two were regenerated here as a sample; the rest should regen
with `vitest -u` in CI.
- **Docker shell env vars beyond `PORT_PREFIX`** — `entrypoint.sh` still
reads `STACK_*` env vars directly (the JS-side `getEnvVariable`
transform doesn't help the shell). JS consumers dual-read so it works in
practice; full shell-level dual-read is a deeper self-host follow-up.
- **`@stackframe/stack-cli` build ordering** — pre-existing; needs
`build:rde-standalone` first. Not affected by this PR.

## Test plan

- [ ] CI runs full e2e suite (with `vitest -u` to absorb dual-emit
snapshot deltas, then committed back)
- [ ] Spot-check: an old SDK build (emitting only `x-stack-*`) still
authenticates against the new backend
- [ ] Spot-check: a new SDK (emitting `x-hexclave-*` / `Bearer
hexclave_*`) still authenticates against an old backend during deploy
ordering
- [ ] Manual: `npx @stackframe/stack-cli@latest init` (new onboarding
entrypoint) generates `hexclave.config.ts`
- [ ] Manual: existing `stack.config.ts`-only project still resolves (no
migration required)

---------

Co-authored-by: bilal <bilal@stack-auth.com>
2026-05-23 17:24:55 -07:00

253 lines
12 KiB
Bash
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

#!/bin/bash
set -e
# ============= ROTATED SECRETS OVERLAY =============
# On emulator snapshot resume, the host injects freshly-generated secrets into
# /run/stack-auth/rotated-secrets.env before supervisorctl restarts us. Sourcing
# here lets a fast-restart pick up new values without a full container restart.
if [ -f /run/stack-auth/rotated-secrets.env ]; then
set -a
# shellcheck disable=SC1091
source /run/stack-auth/rotated-secrets.env
set +a
fi
# ============= FORWARD MOCK OAUTH SERVER =============
# Start socat to forward port 32202 for mock-oauth-server if enabled
if [ "$STACK_FORWARD_MOCK_OAUTH_SERVER" = "true" ]; then
socat TCP-LISTEN:32202,fork,reuseaddr TCP:host.docker.internal:32202 &
fi
# ============= ENV VARS =============
if [ "$NEXT_PUBLIC_STACK_IS_LOCAL_EMULATOR" = "true" ]; then
for v in STACK_INTERNAL_PROJECT_PUBLISHABLE_CLIENT_KEY STACK_INTERNAL_PROJECT_SECRET_SERVER_KEY STACK_SEED_INTERNAL_PROJECT_SUPER_SECRET_ADMIN_KEY; do
if [ -z "${!v:-}" ]; then
echo "$v must be set in local-emulator mode (injected by the QEMU VM)." >&2
exit 1
fi
done
export STACK_INTERNAL_PROJECT_PUBLISHABLE_CLIENT_KEY STACK_INTERNAL_PROJECT_SECRET_SERVER_KEY STACK_SEED_INTERNAL_PROJECT_SUPER_SECRET_ADMIN_KEY
else
export STACK_INTERNAL_PROJECT_PUBLISHABLE_CLIENT_KEY=${STACK_INTERNAL_PROJECT_PUBLISHABLE_CLIENT_KEY:-$(openssl rand -base64 32)}
export STACK_INTERNAL_PROJECT_SECRET_SERVER_KEY=${STACK_INTERNAL_PROJECT_SECRET_SERVER_KEY:-$(openssl rand -base64 32)}
export STACK_SEED_INTERNAL_PROJECT_SUPER_SECRET_ADMIN_KEY=${STACK_SEED_INTERNAL_PROJECT_SUPER_SECRET_ADMIN_KEY:-$(openssl rand -base64 32)}
fi
export NEXT_PUBLIC_STACK_PROJECT_ID=internal
export NEXT_PUBLIC_STACK_PUBLISHABLE_CLIENT_KEY=${STACK_INTERNAL_PROJECT_PUBLISHABLE_CLIENT_KEY}
if [ -n "${STACK_INTERNAL_PROJECT_SECRET_SERVER_KEY:-}" ]; then
export STACK_SECRET_SERVER_KEY=${STACK_INTERNAL_PROJECT_SECRET_SERVER_KEY}
fi
if [ -n "${STACK_SEED_INTERNAL_PROJECT_SUPER_SECRET_ADMIN_KEY:-}" ]; then
export STACK_SUPER_SECRET_ADMIN_KEY=${STACK_SEED_INTERNAL_PROJECT_SUPER_SECRET_ADMIN_KEY}
fi
# ============= HEXCLAVE ↔ STACK URL ENV MIRROR =============
# The dashboard bundle inlines BOTH process.env.NEXT_PUBLIC_HEXCLAVE_* and
# process.env.NEXT_PUBLIC_STACK_* references as sentinels (dual-read). At
# runtime the sentinel-replace loop only substitutes a sentinel when the
# corresponding env var is set — but the dashboard's fallback chain
# (`HEXCLAVE_X ?? STACK_X`) treats an unreplaced sentinel as truthy, so it
# would pick the literal sentinel string instead of the real URL whenever
# only one of the two env names is set by the self-host operator.
# Mirror the URL trio HEXCLAVE → STACK and STACK → HEXCLAVE before the
# sentinel-replace runs, so both sentinels resolve to the same real value
# regardless of which name the operator chose.
for _legacy in STACK_API_URL STACK_DASHBOARD_URL STACK_SVIX_SERVER_URL; do
_new=HEXCLAVE_${_legacy#STACK_}
_legacy_full=NEXT_PUBLIC_${_legacy}
_new_full=NEXT_PUBLIC_${_new}
_legacy_val=${!_legacy_full:-}
_new_val=${!_new_full:-}
if [ -n "$_new_val" ] && [ -z "$_legacy_val" ]; then
export "$_legacy_full=$_new_val"
elif [ -n "$_legacy_val" ] && [ -z "$_new_val" ]; then
export "$_new_full=$_legacy_val"
fi
done
export NEXT_PUBLIC_BROWSER_STACK_DASHBOARD_URL=${NEXT_PUBLIC_STACK_DASHBOARD_URL}
# Hexclave rebrand: the port-prefix var was renamed outright to
# NEXT_PUBLIC_HEXCLAVE_PORT_PREFIX. The dashboard bundle's post-build sentinel
# is STACK_ENV_VAR_SENTINEL_NEXT_PUBLIC_HEXCLAVE_PORT_PREFIX, and the sentinel
# substitution loop below derives the env var name from the sentinel — so this
# MUST export NEXT_PUBLIC_HEXCLAVE_PORT_PREFIX or the sentinel never resolves.
# Accept the legacy NEXT_PUBLIC_STACK_PORT_PREFIX as input for back-compat with
# existing self-host configs.
export NEXT_PUBLIC_HEXCLAVE_PORT_PREFIX=${NEXT_PUBLIC_HEXCLAVE_PORT_PREFIX:-${NEXT_PUBLIC_STACK_PORT_PREFIX:-81}}
PORT_PREFIX=${NEXT_PUBLIC_HEXCLAVE_PORT_PREFIX}
export NEXT_PUBLIC_SERVER_STACK_DASHBOARD_URL="http://localhost:${PORT_PREFIX}01"
export NEXT_PUBLIC_BROWSER_STACK_API_URL=${NEXT_PUBLIC_STACK_API_URL}
export NEXT_PUBLIC_SERVER_STACK_API_URL="http://localhost:${PORT_PREFIX}02"
export BACKEND_PORT=${BACKEND_PORT:-${PORT_PREFIX}02}
export DASHBOARD_PORT=${DASHBOARD_PORT:-${PORT_PREFIX}01}
export USE_INLINE_ENV_VARS=true
if [ -z "${NEXT_PUBLIC_STACK_SVIX_SERVER_URL}" ]; then
export NEXT_PUBLIC_STACK_SVIX_SERVER_URL=${STACK_SVIX_SERVER_URL}
fi
# ============= MIGRATIONS =============
should_run_migrations=true
if [ "$STACK_SKIP_MIGRATIONS" = "true" ] || [ "$STACK_RUN_MIGRATIONS" = "false" ]; then
should_run_migrations=false
fi
if [ "$should_run_migrations" = "false" ]; then
echo "Skipping migrations."
else
echo "Running migrations..."
cd apps/backend
node dist/db-migrations.mjs migrate
cd ../..
fi
should_run_seed_script=true
if [ "$STACK_SKIP_SEED_SCRIPT" = "true" ] || [ "$STACK_RUN_SEED_SCRIPT" = "false" ]; then
should_run_seed_script=false
fi
if [ "$should_run_seed_script" = "false" ]; then
echo "Skipping seed script."
else
echo "Running seed script..."
cd apps/backend
node dist/db-migrations.mjs seed
cd ../..
fi
# ============= LOCAL EMULATOR: BOOTSTRAP INTERNAL API KEY SET =============
# The build-time seed ran without any keys (the VM generates random ones on
# first boot). The slim image strips apps/backend/dist so we can't re-run the
# full seed here. Instead, targeted-upsert the internal api key set with the
# VM-supplied keys:
# - pck: used by stack-cli to auth against /api/v1/internal/local-emulator/project
# - ssk/sak: required by the emulator's own dashboard (StackServerApp ctor
# throws without ssk). User-app flows don't use these — per-project
# credentials come from the /local-emulator/project route.
if [ "$NEXT_PUBLIC_STACK_IS_LOCAL_EMULATOR" = "true" ] && [ -n "${STACK_INTERNAL_PROJECT_PUBLISHABLE_CLIENT_KEY:-}" ] && [ -n "${STACK_DATABASE_CONNECTION_STRING:-}" ]; then
# Validate the keys are hex-only to defuse any SQL-injection risk (the VM
# generates them via `openssl rand -hex 32`, so this is an assert, not a filter).
for varname in STACK_INTERNAL_PROJECT_PUBLISHABLE_CLIENT_KEY STACK_INTERNAL_PROJECT_SECRET_SERVER_KEY STACK_SEED_INTERNAL_PROJECT_SUPER_SECRET_ADMIN_KEY; do
val="${!varname:-}"
if [ -z "$val" ]; then
echo "ERROR: $varname is not set; refusing to bootstrap internal api key set." >&2
exit 1
fi
if ! printf '%s' "$val" | grep -Eq '^[0-9a-fA-F]+$'; then
echo "ERROR: $varname is not hex-only; refusing to bootstrap internal api key set." >&2
exit 1
fi
done
echo "Bootstrapping internal API key set (emulator runtime)..."
psql "$STACK_DATABASE_CONNECTION_STRING" -v ON_ERROR_STOP=1 <<SQL
INSERT INTO "ApiKeySet" ("projectId", id, description, "expiresAt", "createdAt", "updatedAt", "publishableClientKey", "secretServerKey", "superSecretAdminKey")
VALUES ('internal', '3142e763-b230-44b5-8636-aa62f7489c26', 'Internal API key set', '2099-12-31T23:59:59Z', NOW(), NOW(),
'${STACK_INTERNAL_PROJECT_PUBLISHABLE_CLIENT_KEY}',
'${STACK_INTERNAL_PROJECT_SECRET_SERVER_KEY}',
'${STACK_SEED_INTERNAL_PROJECT_SUPER_SECRET_ADMIN_KEY}')
ON CONFLICT ("projectId", id) DO UPDATE SET
"publishableClientKey" = EXCLUDED."publishableClientKey",
"secretServerKey" = EXCLUDED."secretServerKey",
"superSecretAdminKey" = EXCLUDED."superSecretAdminKey",
"updatedAt" = NOW();
SQL
fi
# ============= ENV VARS =============
# Create a working directory for our processed files.
# Keep this off /tmp so local-emulator config sharing can bind-mount /tmp
# without pushing the whole runtime copy step onto the host filesystem.
WORK_DIR="${STACK_RUNTIME_WORK_DIR:-/var/tmp/stack-runtime}"
mkdir -p "$WORK_DIR"
if [ "$WORK_DIR" != "/app" ]; then
echo "Copying files to working directory..."
cp -r /app/. "$WORK_DIR"/.
fi
# The full-tree sentinel scan is expensive (several seconds over the whole built
# app tree). On a fast-restart — triggered by the emulator snapshot rotation
# path — the placeholders have already been sed-replaced by rotate-secrets,
# and no new sentinels need substitution. Skip the scan in that case. Marker
# lives in WORK_DIR because the docker/server image runs as the unprivileged
# `node` user and cannot write to /var/run.
SENTINEL_MARKER="$WORK_DIR/.stack-sentinels-replaced"
if [ -f "$SENTINEL_MARKER" ]; then
echo "Sentinels already replaced on a previous start; skipping scan."
else
# Find all files in the apps directory that contain a STACK_ENV_VAR_SENTINEL and extract the unique sentinel strings.
# Require at least one character after `STACK_ENV_VAR_SENTINEL_` — a bare
# `STACK_ENV_VAR_SENTINEL_` (trailing underscore but no suffix) makes env_var
# empty below, which would crash `${!env_var}` with "invalid variable name"
# under `set -e`. The dashboard bundle's sentinel-construction code embeds
# the prefix as a literal string, so this case occurs in practice.
echo "Finding unhandled sentinels..."
unhandled_sentinels=$(find "$WORK_DIR/apps" -type f -exec grep -l "STACK_ENV_VAR_SENTINEL" {} + | \
xargs grep -h "STACK_ENV_VAR_SENTINEL" | \
grep -oE "STACK_ENV_VAR_SENTINEL_[A-Z_]*[A-Z]+[A-Z_]*" | \
sort -u)
# Choose an uncommon delimiter here, we use the ASCII Unit Separator (0x1F)
delimiter=$(printf '\037')
echo "Replacing sentinels..."
for sentinel in $unhandled_sentinels; do
# The sentinel is like "STACK_ENV_VAR_SENTINEL_MY_VAR", so extract the env var name.
env_var=${sentinel#STACK_ENV_VAR_SENTINEL_}
# Defense in depth: skip if env_var name is empty. The regex above already
# excludes bare-prefix matches, but `${!env_var}` with an empty name aborts
# the whole script under `set -e`, so guard it explicitly.
if [ -z "$env_var" ]; then
continue
fi
# Get the corresponding environment variable value.
value="${!env_var}"
# If the env var is not set, skip replacement.
if [ -z "$value" ]; then
continue
fi
# Although the sentinel only contains [A-Z_] we still escape it for any regex meta-characters.
escaped_sentinel=$(printf '%s\n' "$sentinel" | sed -e 's/\\/\\\\/g' -e 's/[][\/.^$*]/\\&/g')
# For the replacement value, first escape backslashes, then escape any occurrence of
# the chosen delimiter and the '&' (which has special meaning in sed replacements).
escaped_value=$(printf '%s\n' "$value" | sed -e 's/\\/\\\\/g' -e "s/[${delimiter}&]/\\\\&/g")
# Hexclave rebrand: only sed files that actually contain the sentinel. The previous
# `find … -exec sed -i … {} +` ran sed across the ENTIRE standalone build for every
# sentinel (22 sentinels × thousands of files), and got unworkable once the dashboard
# bundle grew to include dual-literal _inlineEnvVars references. Restrict to matching
# files; also log per-sentinel so a hang at any specific sentinel is visible.
echo " - Replacing $sentinel"
files=$(grep -rl "$sentinel" "$WORK_DIR/apps" 2>/dev/null || true)
if [ -n "$files" ]; then
echo "$files" | xargs sed -i "s${delimiter}${escaped_sentinel}${delimiter}${escaped_value}${delimiter}g"
fi
done
echo "Sentinel replacement complete."
touch "$SENTINEL_MARKER"
fi
# ============= START BACKEND AND DASHBOARD =============
echo "Starting backend on port $BACKEND_PORT..."
cd "$WORK_DIR"
PORT=$BACKEND_PORT HOSTNAME=0.0.0.0 node apps/backend/server.js &
echo "Starting dashboard on port $DASHBOARD_PORT..."
PORT=$DASHBOARD_PORT HOSTNAME=0.0.0.0 node apps/dashboard/server.js &
# Wait for both to finish
wait -n