fix: reduce recurring production Sentry errors (Stripe webhooks, email, session replay) (#1667)

## Summary

A cleanup pass over recurring production errors triaged from Sentry
(`stackframe-pw` org). The common thread: expected or edge-case
conditions were thrown as `HexclaveAssertionError` or reported via
`captureError`, so Sentry filed them as errors (and, for the Stripe
ones, Stripe kept redelivering the webhook). Each is now handled at the
source or logged at the correct severity.

| Sentry issue | Fix | Risk |
|---|---|---|
| [STACK-BACKEND-1F5](https://stackframe-pw.sentry.io/issues/STACK-BACKEND-1F5): `Unknown stripe webhook type` (`invoice_payment.paid`, `payout.paid`) | Add both to `ignoredEvents`. They fell through to the throwing `else` and Stripe redelivered them. (`payout.failed`/`canceled`/`updated` intentionally left unhandled for now.) | Trivial |
| [STACK-SERVER-1ZV](https://stackframe-pw.sentry.io/issues/STACK-SERVER-1ZV): session-replay `413 Request body too large` | Measure event size in UTF-8 bytes (was UTF-16 `.length`, which undercounts multibyte content); drop a single oversized event with a warning instead of shipping a doomed request | Low |
| [STACK-BACKEND-140](https://stackframe-pw.sentry.io/issues/STACK-BACKEND-140) + [STACK-BACKEND-1F1](https://stackframe-pw.sentry.io/issues/STACK-BACKEND-1F1): `Unknown error while sending (test) email` | Classify refused SMTP connections (`ECONNREFUSED`, surfaced by nodemailer as `code: 'ESOCKET'`) as a typed `CONNECTION_REFUSED` error with a real user-facing message, instead of falling through to the `UNKNOWN` catch-all in both the low-level sender and the send-test-email route. Marked `canRetry` so the queued-email path reschedules with backoff. | Low |
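
For illustration, the 1F5 fix can be sketched as an allow-list check in front of the throwing branch. This is a minimal sketch, not the actual route: `isIgnoredEvent`, `routeStripeEvent`, `handledEvents`, and the `"customer.subscription.updated"` entry are hypothetical names chosen for the example, and the list only contains the entries visible in the diff hunk.

```typescript
// Entries copied from the visible part of the diff hunk; the real list may be longer.
const ignoredEvents = [
  "balance.available",
  "customer.updated",
  "customer.created",
  "invoice_payment.paid",
  "payout.created",
  "payout.paid",
  "payout.reconciliation_completed",
] as const;

type IgnoredEvent = (typeof ignoredEvents)[number];

// Hypothetical helper: narrows an arbitrary event type to the ignored set.
function isIgnoredEvent(type: string): type is IgnoredEvent {
  return (ignoredEvents as readonly string[]).includes(type);
}

// Hypothetical stand-in for the event types the real handler processes.
const handledEvents = new Set<string>(["customer.subscription.updated"]);

type WebhookOutcome = "ignored" | "handled";

// Hypothetical dispatcher: ignored events return normally (so the route can
// acknowledge them), and only genuinely unknown types reach the throwing branch.
function routeStripeEvent(type: string): WebhookOutcome {
  if (isIgnoredEvent(type)) {
    return "ignored";
  }
  if (handledEvents.has(type)) {
    return "handled";
  }
  throw new Error(`Unknown stripe webhook type: ${type}`);
}
```

Under these assumptions, `routeStripeEvent("payout.paid")` returns `"ignored"`, while `routeStripeEvent("payout.failed")` still throws, matching the decision to leave those events unhandled for now.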

## Notes

- **Session replay (1ZV):** edited the `packages/template`
source-of-truth; the generated SDK copies are gitignored and regenerated
by CI (`pnpm -w run generate-sdks`). The `TextEncoder` is hoisted out of
the rrweb emit hot path to avoid per-event allocation.
- **Email classification (140/1F1):** the new `CONNECTION_REFUSED`
errorType is additive — other consumers only read `errorType` for
logging, and the send-test-email route only special-cases `UNKNOWN`, so
the new type cleanly bypasses both assertion captures. `canRetry: true`
is safe because the connection is refused before any SMTP exchange (no
message handed off → no duplicate-delivery risk); transient refusals
recover, and a persistent misconfig still fails after
`MAX_SEND_ATTEMPTS`. The one-shot send-test-email path ignores
`canRetry`, so its immediate feedback is unchanged.
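
The classification and retry reasoning above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions: `classifySmtpError`, `nextRetryDelayMs`, the `AUTH_FAILED` type name, the value of `MAX_SEND_ATTEMPTS`, and the exponential backoff formula are all hypothetical; only the `ECONNREFUSED` check, the `535`/`EAUTH` check, `CONNECTION_REFUSED`, `UNKNOWN`, and `canRetry` come from the change itself.

```typescript
// "AUTH_FAILED" is a placeholder name; the real errorType for auth failures is not shown in the diff.
type SendErrorType = "CONNECTION_REFUSED" | "AUTH_FAILED" | "UNKNOWN";

type ClassifiedError = {
  errorType: SendErrorType;
  canRetry: boolean;
  message: string;
};

// Assumed shape of the fields read from a nodemailer-style error.
type SmtpLikeError = { code?: string; responseCode?: number; message: string };

function classifySmtpError(error: SmtpLikeError): ClassifiedError {
  // A refused connection may arrive as code 'ESOCKET' with 'ECONNREFUSED' in the message.
  if (error.code === "ECONNREFUSED" || error.message.includes("ECONNREFUSED")) {
    return {
      errorType: "CONNECTION_REFUSED",
      canRetry: true, // nothing was handed off, so a retry cannot duplicate delivery
      message: "The email server refused the connection.",
    };
  }
  if (error.responseCode === 535 || error.code === "EAUTH") {
    return { errorType: "AUTH_FAILED", canRetry: false, message: "Authentication failed." };
  }
  return { errorType: "UNKNOWN", canRetry: false, message: "Unknown error while sending email." };
}

const MAX_SEND_ATTEMPTS = 5; // assumed value for the example
const BASE_DELAY_MS = 1_000; // assumed backoff base

// Returns the delay before the next attempt, or null when the send should give up.
function nextRetryDelayMs(attemptsSoFar: number, error: ClassifiedError): number | null {
  if (!error.canRetry || attemptsSoFar >= MAX_SEND_ATTEMPTS) {
    return null;
  }
  return BASE_DELAY_MS * 2 ** (attemptsSoFar - 1);
}
```

In this sketch a transient refusal is rescheduled with growing delays, while a persistent misconfiguration stops once `attemptsSoFar` reaches `MAX_SEND_ATTEMPTS`. A one-shot caller would simply not call `nextRetryDelayMs`.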

## Investigated but intentionally NOT changed here

These were initially included, then reverted so we keep getting Sentry
signal while the root causes are still under investigation:

- **[STACK-BACKEND-1GM](https://stackframe-pw.sentry.io/issues/STACK-BACKEND-1GM)**
— `Stripe webhook bad customer id`. A subscription-changed event with no
customer (the observed case was a Stripe-CLI test
`payment_intent.succeeded` against a dev-connected account). Skipping is
likely the right long-term fix, but kept the throw for now to keep
observing. Note: in live mode the same path could fire on legitimate
customerless one-time payments / guest checkouts.
- **[STACK-BACKEND-1CN](https://stackframe-pw.sentry.io/issues/STACK-BACKEND-1CN)**
— `Recovered N stale outgoing request(s)`. This is a self-healing
recovery notice (0 user impact); the underlying cause is the poller
process dying between the claim `UPDATE` and the delete. Kept at
`captureError` to keep collecting data on how often / why it happens.

## Verification
- `typecheck` clean: `@hexclave/backend`, `@hexclave/template`,
`@hexclave/js`, `@hexclave/react`, `@hexclave/next`,
`@hexclave/tanstack-start`
- `eslint` clean on all touched files
BilalG1 2026-06-25 14:48:49 -07:00 committed by GitHub
parent b8e384493d
commit d2a84f5a28
3 changed files with 31 additions and 1 deletions

@@ -51,7 +51,9 @@ const ignoredEvents = [
"balance.available",
"customer.updated",
"customer.created",
"invoice_payment.paid",
"payout.created",
"payout.paid",
"payout.reconciliation_completed",
] as const satisfies Stripe.Event.Type[];

@@ -141,6 +141,19 @@ async function _lowLevelSendEmailWithoutRetries(options: LowLevelSendEmailOption
} as const);
}
// nodemailer surfaces a refused connection as code 'ESOCKET' with 'ECONNREFUSED' in the message.
// Safe to retry: the connection was refused before any SMTP exchange, so the message was never
// handed off — there's no duplicate-delivery risk, and a transient refusal (server restarting /
// overloaded) can recover. A persistent misconfig still fails after MAX_SEND_ATTEMPTS.
if (code === 'ECONNREFUSED' || error.message.includes('ECONNREFUSED')) {
return Result.error({
rawError: error,
errorType: 'CONNECTION_REFUSED',
canRetry: true,
message: 'The email server refused the connection. Please make sure the email host and port configuration are correct.',
} as const);
}
if (responseCode === 535 || code === 'EAUTH') {
return Result.error({
rawError: error,

@@ -110,6 +110,9 @@ const MAX_APPROX_BYTES_PER_BATCH = 512_000;
// envelope overhead (browser_session_id, timestamps, wrapper keys, etc.).
const MAX_FLUSH_PAYLOAD_BYTES = 900_000;
// Reused across the emit hot path to avoid per-event allocation.
const textEncoder = new TextEncoder();
export type StoredSession = {
session_id: string,
created_at_ms: number,
@@ -286,6 +289,17 @@ export class SessionRecorder {
// When _flushInProgress blocked earlier flushes, events can accumulate
// well past MAX_APPROX_BYTES_PER_BATCH; sending them all at once would
// exceed the server's 1MB body limit (413).
// A single event over the limit can't be sent (rrweb events aren't splittable); drop it and move on.
const firstSize = allSizes[offset] ?? throwErr("_eventSizes out of sync with _events — this should never happen");
if (firstSize > MAX_FLUSH_PAYLOAD_BYTES) {
captureWarning(
"SessionRecorder.flush",
new Error(`Dropping oversized session replay event (${firstSize} bytes > ${MAX_FLUSH_PAYLOAD_BYTES} byte limit); it cannot be sent without a 413.`),
);
offset += 1;
continue;
}
let batchBytes = 0;
let batchEnd = offset;
for (let i = offset; i < allEvents.length; i++) {
@@ -384,7 +398,8 @@ export class SessionRecorder {
}
}
-const eventSize = JSON.stringify(event).length;
+// Measure UTF-8 byte length to match the server's byte limit (.length counts UTF-16 units, undercounting multibyte content).
+const eventSize = textEncoder.encode(JSON.stringify(event)).byteLength;
this._events.push(event);
this._eventSizes.push(eventSize);
this._approxBytes += eventSize;
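
To make the size-measurement change concrete, here is a small standalone sketch of UTF-8 byte counting combined with batch planning that drops unsendable events. It is a simplification under assumptions, not the `SessionRecorder` logic: `utf8ByteLength`, `planBatches`, and `BatchPlan` are hypothetical names, envelope overhead is ignored, and the real code reports drops through `captureWarning` rather than returning them.

```typescript
const MAX_FLUSH_PAYLOAD_BYTES = 900_000;

// Reused encoder, mirroring the hoisting described in the change.
const textEncoder = new TextEncoder();

// Size of the serialized value in UTF-8 bytes, which is what a byte-based body limit sees.
function utf8ByteLength(value: unknown): number {
  return textEncoder.encode(JSON.stringify(value)).byteLength;
}

type BatchPlan<T> = { batches: T[][]; dropped: T[] };

// Hypothetical planner: groups events into batches that stay within `limit` bytes
// and sets aside any single event that could never fit on its own.
function planBatches<T>(events: T[], limit: number = MAX_FLUSH_PAYLOAD_BYTES): BatchPlan<T> {
  const batches: T[][] = [];
  const dropped: T[] = [];
  let current: T[] = [];
  let currentBytes = 0;

  for (const event of events) {
    const size = utf8ByteLength(event);
    if (size > limit) {
      dropped.push(event); // cannot be split, so sending it would only produce a 413
      continue;
    }
    if (current.length > 0 && currentBytes + size > limit) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(event);
    currentBytes += size;
  }
  if (current.length > 0) {
    batches.push(current);
  }
  return { batches, dropped };
}
```

For the string `"héllo 😀"`, `JSON.stringify(...).length` is 10 UTF-16 units while `utf8ByteLength` reports 13 bytes, which is the kind of undercount that let payloads slip past the old check.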