An open-source user management solution with built-in frontend components and an admin dashboard.
BilalG1 969bf03c5a
perf(platform-analytics): cut ClickHouse query peak memory (#1632)
## What

Performance pass on the internal **platform-analytics** route. All 17
ClickHouse queries fire in a single `Promise.all` on the shared
`stackframe` admin user, which is subject to a **9 GB per-user** memory
cap — so the worst case is the *sum* of per-query peaks, not the max.
Benchmarked at 10k projects / 1M users / 50M events (power-law, top
project ≈100k users), the sum of peaks was ~6.7 GiB. This PR brings it
down to ~3.8 GiB.

## Changes

**ClickHouse — `sipHash64(user_id)` as the distinct key** (exact,
verified byte-identical):

| query | peak mem | Δ |
|---|---|---|
| `dauSeries` | 949 → 373 MiB | −61% |
| `mauProjects` | 715 → 313 MiB | −56% |
| `activeByProject` | 635 → 374 MiB | −41% |
| `sparkByProject` | 1165 → 809 MiB | −31% |
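
As a quick arithmetic check of the table, and of why the sum of peaks (rather than the max) is the relevant figure when queries run concurrently under one user's cap, here is a minimal TypeScript sketch. The helper names are illustrative and not part of the route. Only the four queries listed above are included, so the totals cover a subset of the 17 queries.

```typescript
// Peak memory per query in MiB, copied from the table above.
const peaks = [
  { query: "dauSeries", before: 949, after: 373 },
  { query: "mauProjects", before: 715, after: 313 },
  { query: "activeByProject", before: 635, after: 374 },
  { query: "sparkByProject", before: 1165, after: 809 },
];

// Relative change in percent, rounded to the nearest integer.
function deltaPercent(before: number, after: number): number {
  return Math.round(((after - before) / before) * 100);
}

// Sum helper: with all queries in flight at once, a per-user cap
// applies to the sum of their peaks.
function sumOf(values: number[]): number {
  return values.reduce((acc, v) => acc + v, 0);
}

const deltas = peaks.map((p) => deltaPercent(p.before, p.after));
const sumBefore = sumOf(peaks.map((p) => p.before)); // concurrent worst case
const sumAfter = sumOf(peaks.map((p) => p.after));
const maxBefore = Math.max(...peaks.map((p) => p.before)); // if run one at a time

console.log({ deltas, sumBefore, sumAfter, maxBefore });
```

For these four queries alone the concurrent worst case is 3464 MiB before and 1869 MiB after, while running them one at a time would peak at only 1165 MiB.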

A 64-bit hash has negligible collision probability over 1M users; the
benchmark confirmed identical output. (Same trick already used in the
internal-metrics MAU query.)
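
The "negligible" claim can be made concrete with the standard birthday bound. This is a rough sketch that assumes the hash behaves like a uniformly random 64-bit function; the function name is illustrative.

```typescript
// Birthday-bound estimate: chance that at least two of `n` distinct keys
// share a hash value when hashes are uniform over `bits` bits.
function collisionProbability(n: number, bits: number): number {
  const pairs = (n * (n - 1)) / 2; // number of distinct key pairs
  const space = Math.pow(2, bits); // number of possible hash values
  // 1 - exp(-pairs / space), computed stably for very small arguments.
  return -Math.expm1(-pairs / space);
}

const pCollision = collisionProbability(1_000_000, 64);
console.log(pCollision.toExponential(2));
```

Under that assumption the probability of any collision among 1M user IDs is about 2.7e-8, which is consistent with the benchmark producing identical output.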

**ClickHouse — sample the activity split**
(`new`/`retained`/`reactivated`):
The split was the single heaviest query (~1.3 GiB) — its cost is a
window function over ~25.8M `(user, day)` rows plus an all-history scan,
which `sipHash` alone barely helped (−7%). It now uses **consistent
1-in-4 user sampling** (same `cityHash64(user_id) % 4` bucket applied to
both subqueries so each sampled user's full activity sequence is
preserved; counts scaled ×4):

- **317 MiB (−78%)** peak memory, **~0.4% mean error** (max 1.4% on the
smallest day) vs the exact result.

This is an **approximation** — the dashboard "Growth quality" chart now
notes it (`subtitle: "… · sampled estimate (~0.4%)"`).
`ACTIVITY_SPLIT_SAMPLE` is a single constant in the route; set it to `1`
to go back to exact.
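
The sampling mechanics can be sketched in TypeScript as follows. This is not the route's query: it uses a 32-bit FNV-1a hash as a stand-in for ClickHouse's `cityHash64`, and it only counts distinct users per day rather than classifying them as new/retained/reactivated. `ActivityRow`, `inSample`, and `estimateDailyUsers` are hypothetical names. The point it illustrates is that bucket membership depends only on the user ID, so a sampled user keeps every one of their rows wherever the filter is applied.

```typescript
// Constant name taken from the description above; 1 means "no sampling".
const ACTIVITY_SPLIT_SAMPLE = 4;

// Stand-in hash (FNV-1a, 32-bit). The real query uses cityHash64 in ClickHouse.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Deterministic per-user decision: the same user is always in or out.
function inSample(userId: string, sampleRate: number): boolean {
  return fnv1a32(userId) % sampleRate === 0;
}

type ActivityRow = { userId: string; day: string };

// Count distinct sampled users per day, then scale back up by the rate.
function estimateDailyUsers(rows: ActivityRow[], sampleRate: number): Map<string, number> {
  const usersByDay = new Map<string, Set<string>>();
  for (const row of rows) {
    if (!inSample(row.userId, sampleRate)) continue;
    const users = usersByDay.get(row.day) ?? new Set<string>();
    users.add(row.userId);
    usersByDay.set(row.day, users);
  }
  const estimates = new Map<string, number>();
  usersByDay.forEach((users, day) => estimates.set(day, users.size * sampleRate));
  return estimates;
}
```

Calling `estimateDailyUsers(rows, 1)` returns the exact counts, which mirrors setting `ACTIVITY_SPLIT_SAMPLE` to `1`.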

## What I tried that did NOT make the cut (documented in the harnesses)

- `country` — peak memory is dominated by the per-user `argMax(country,
event_at)` payload, not the key, so hashing does nothing. Left
exact/unchanged.
- PG `authMethods` / `email` — with the production composite PK indexes
the original plans are already best; correlated-subquery / anti-join
rewrites were far worse. No PG query changes in this PR.

## Benchmark harnesses (added)

- `apps/backend/scripts/benchmark-platform-analytics.ts` — full-route
baseline (per-query time/memory/rows).
- `apps/backend/scripts/optimize-platform-analytics.ts` — sipHash & PG
variant comparison with byte-equality checks.
- `apps/backend/scripts/optimize-split.ts` — exact vs sampled split
variants with accuracy measurement.

They seed isolated `bench_pa` databases (server-side, auto-cleaned) and
read `system.query_log` / `EXPLAIN (ANALYZE, BUFFERS)`. Run, for example:
`pnpm --filter @hexclave/backend run with-env:dev tsx scripts/optimize-split.ts`

## Testing

- Backend `typecheck` passes. (Dashboard has pre-existing typecheck
errors on the base branch in unrelated files — auth-methods,
team-analytics, user-emails, RDE config — not touched here.)
- All exact rewrites verified byte-identical to the originals by the
harnesses; the sampled split measured at ~0.4% mean error.
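
For reference, a mean and max relative error between an exact and a sampled series could be computed along these lines. This is a sketch, not the harness code, and the numbers in the usage lines are invented for illustration rather than taken from the benchmark.

```typescript
// Per-point relative error |estimate - exact| / exact.
// Points where the exact value is 0 are skipped to avoid division by zero.
function relativeErrors(exact: number[], estimate: number[]): number[] {
  if (exact.length !== estimate.length) {
    throw new Error("series lengths differ");
  }
  const errors: number[] = [];
  exact.forEach((value, i) => {
    if (value > 0) errors.push(Math.abs(estimate[i] - value) / value);
  });
  return errors;
}

// Mean and max of the per-point errors (both 0 for an empty input).
function summarizeErrors(errors: number[]): { mean: number; max: number } {
  if (errors.length === 0) return { mean: 0, max: 0 };
  const total = errors.reduce((acc, e) => acc + e, 0);
  return { mean: total / errors.length, max: Math.max(...errors) };
}

// Invented example data: three days of exact vs. sampled counts.
const exampleExact = [1000, 2000, 500];
const exampleSampled = [1004, 1992, 507];
const errorSummary = summarizeErrors(relativeErrors(exampleExact, exampleSampled));
console.log(errorSummary);
```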

Numbers are local warm-cache (relative shape, not production latency).

<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Cuts worst-case ClickHouse memory for the internal platform analytics
route by switching to hashed distinct keys and sampling the heaviest
query. On a 10k projects / 1M users / 50M events benchmark, the sum of
per-query peaks drops from ~6.7 GiB to ~3.8 GiB with exact results (or
~0.4% error on the sampled chart).

- **Performance**
  - Use `sipHash64(user_id)` as the distinct key in `uniqExact`/`uniqExactIf`
    for DAU series, MAU/projects, active-by-project, and sparkline. Exact
    results (verified). Peak memory down 31–61% per query.
  - Sample the new/retained/reactivated split at 1-in-4 users (consistent
    `cityHash64` bucket across subqueries, counts ×4). Peak memory ~−78%
    (~1.3 GiB → ~0.3 GiB) with ~0.4% mean error. Toggle via
    `ACTIVITY_SPLIT_SAMPLE` (set to 4; set to 1 for exact). Dashboard
    subtitle now notes “sampled estimate (~0.4%).”
  - Added local harnesses to seed isolated data and measure
    time/memory/equality:
    `apps/backend/scripts/internal-analytics/benchmark-platform-analytics.ts`,
    `optimize-platform-analytics.ts`, `optimize-split.ts`.

<sup>Written for commit 60ccf1a06f.
Summary will update on new commits.</sup>

<!-- End of auto-generated description by cubic. -->

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

## Updates

* **Improvements**
  * Enhanced platform analytics calculations for more consistent and
    efficient user counting across key performance indicators (DAU, MAU,
    per-project metrics).
  * Updated the Growth Quality chart to indicate that its user counts are
    sampled estimates (approximately 0.4% mean error), a trade-off made to
    reduce query memory.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: mantrakp04 <mantrakp@gmail.com>
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: mantra <mantra@stack-auth.com>
2026-06-19 12:44:28 -07:00
.agents/skills feat(hexclave): PR 2 — visible rebrand (Hexclave brand goes public) (#1481) 2026-05-26 19:18:20 -07:00
.changeset Disable changesets changelogs 2026-01-12 15:21:56 -08:00
.claude Support local dashboard in remote SSH and GH Codespaces (#1538) 2026-06-04 16:36:17 -07:00
.cursor Update pre-push.md 2026-06-04 10:44:39 -07:00
.devcontainer feat(hexclave): PR 2 — visible rebrand (Hexclave brand goes public) (#1481) 2026-05-26 19:18:20 -07:00
.github feat(hexclave): PR 5 — internal symbol/path/package renames + brand strings (#1547) 2026-06-03 18:57:09 -07:00
.vscode Make it clear there are more SDK packages 2026-06-16 10:37:58 -07:00
apps perf(platform-analytics): cut ClickHouse query peak memory (#1632) 2026-06-19 12:44:28 -07:00
configs [Fix] Infinite Loop on handler/sign-in due to useStackApp not being able to find the StackProvider given context (#1248) 2026-03-12 22:28:47 -07:00
docker [codex] Add analytics overview filters (#1496) 2026-06-10 17:50:35 -07:00
docs chore: update package versions 2026-06-17 20:31:22 +00:00
docs-mintlify [codex] Add skill context to Ask Hexclave (#1605) 2026-06-18 11:40:02 -07:00
examples chore: update package versions 2026-06-17 20:31:22 +00:00
packages Fix typecheck in template cross-domain test (#1628) 2026-06-18 17:55:17 -07:00
patches Fix MS OAuth (#457) 2025-02-21 19:39:22 +01:00
scripts [codex] Add skill context to Ask Hexclave (#1605) 2026-06-18 11:40:02 -07:00
sdks chore: update package versions 2026-06-17 20:31:22 +00:00
skills/hexclave feat(hexclave): PR 5 — internal symbol/path/package renames + brand strings (#1547) 2026-06-03 18:57:09 -07:00
.dockerignore feat(hexclave): PR 5 — internal symbol/path/package renames + brand strings (#1547) 2026-06-03 18:57:09 -07:00
.gitignore feat(hexclave): PR 5 — internal symbol/path/package renames + brand strings (#1547) 2026-06-03 18:57:09 -07:00
.gitmodules Update GitHub URL 2026-05-19 10:27:53 -07:00
AGENTS.md Make it clear there are more SDK packages 2026-06-16 10:37:58 -07:00
CHANGELOG.md Add 6/12/26 changelog entry (#1589) 2026-06-16 16:44:03 -07:00
CLAUDE.md feat(hexclave): PR 2 — visible rebrand (Hexclave brand goes public) (#1481) 2026-05-26 19:18:20 -07:00
CONTRIBUTING.md Rename port prefix envvar 2026-05-27 18:09:52 -07:00
LICENSE feat(hexclave): PR 2 — visible rebrand (Hexclave brand goes public) (#1481) 2026-05-26 19:18:20 -07:00
package.json feat(hexclave): PR 5 — internal symbol/path/package renames + brand strings (#1547) 2026-06-03 18:57:09 -07:00
pnpm-lock.yaml add platform analytics route to the dashboard (#1626) 2026-06-19 11:11:06 -07:00
pnpm-workspace.yaml Fix Docker builds for pnpm v11 (#1532) 2026-06-02 14:41:18 -07:00
README.md Add "For AI agents" section to README (#1563) 2026-06-05 17:01:13 -07:00
skills-lock.json Auth app redesign (#1367) 2026-05-19 23:03:46 -07:00
turbo.json Replace writeConfigObject with AI-aware updateConfigObject (#1537) 2026-06-15 12:00:24 -07:00
vitest.shared.ts Fix tests 2026-02-17 19:57:08 -08:00
vitest.workspace.ts Hosted components (#1229) 2026-03-10 11:29:05 -07:00

Hexclave

The user infrastructure platform.

Hexclave handles everything around your users: authentication, teams, payments, emails, analytics, and much more. Start in minutes on the hosted cloud. Your data is always yours to export and self-host.

Website · Docs · Dashboard · Discord


Where Hexclave fits in the infrastructure stack

Get started

Setting up Hexclave is one prompt. Paste this into your coding agent of choice:

Read skill.hexclave.com and help me setup hexclave in this project

For AI agents

If you're an AI coding agent, use your web fetch tool or curl the skill documentation for up-to-date integration instructions:

curl -sSL "https://skill.hexclave.com?question=<your-question>&context=<your-context>"
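
The `<your-question>` and `<your-context>` placeholders need URL-encoding when they contain spaces or punctuation. A minimal TypeScript sketch of building that URL follows; `buildSkillUrl` is a hypothetical helper, and only the endpoint and the two query parameter names come from the command above.

```typescript
// Endpoint taken from the curl command above.
const SKILL_ENDPOINT = "https://skill.hexclave.com";

// Build the query URL, percent-encoding both free-text parameters.
function buildSkillUrl(question: string, context: string): string {
  const q = encodeURIComponent(question);
  const c = encodeURIComponent(context);
  return `${SKILL_ENDPOINT}?question=${q}&context=${c}`;
}

const exampleUrl = buildSkillUrl("How do I add sign-in?", "Next.js app");
console.log(exampleUrl);
```

The resulting string can be passed to curl or to a web fetch tool as-is.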

What's included

Hexclave ships as a catalog of apps you switch on as your product needs them. Each one is built on the same user model, and new apps land regularly.

  Authentication

Authentication that just works with passkeys, OAuth, and CLI auth. Drop in one component and ship the whole flow; auth methods toggle from the dashboard with no code changes needed.

  Teams

Build for teams, not just users, with workspaces, email invites, and roles that actually gate the work. The workspace switcher remembers selection, invites auto sign up new users, and permissions hold up under audit.

  RBAC

Permissions, sorted: roles that nest and one permission check that works the same on server or client. Define them in the dashboard, check them anywhere in your code.

  API Keys

API keys without the footguns: leaked keys get auto-revoked, work for users and teams, and show the full secret only once. We never keep the plaintext after creation.

  Payments

Payments without the plumbing for subscriptions, one-time charges, and usage metering with credits. Bill a person or a whole team with one model, no separate codepath.

  Emails

Email that delivers and tells you so, handling transactional and marketing sends from one API. Edit templates with an AI editor, theme once, and track every open and click.

  Analytics

Know your users with no data stack required, with live active user counts and session replays out of the box. Ask in plain English to build dashboards or write SQL to save queries, all with one flag enabled.

  Webhooks

React to every user event in real time with signed, tamper-proof webhooks. Retries and backoff are handled for you; verify in five lines and manage endpoints from the dashboard.

  Data Vault

A safe for the secrets your users hand you, locked with your secret so we never see the plaintext. Store and retrieve tokens in two lines each, server-only by design.

  Launch Checklist

Run through the must-do checks before flipping to production: domain setup, callbacks locked, secrets rotated. The progress tracker keeps your team aligned so nothing critical slips through on launch day.

Contributing

Hexclave is open source, and contributions are welcome. Read CONTRIBUTING.md to get started, and say hello in Discord before picking up anything large. Found a security issue? Email security@hexclave.com.

❤ Contributors
