chatwoot

mirror of https://github.com/chatwoot/chatwoot.git synced 2026-06-04 21:02:35 +08:00

History

Shivam Mishra 04ac9d3780 fix: use UPN for `imap_login` on Microsoft OAuth callback (#14522 ) Outgoing email on a Microsoft email inbox was failing with `535 5.7.3 Authentication unsuccessful` immediately after the user re-authenticated, while incoming (IMAP) continued to work. The root cause is in our OAuth callback: we persist the id_token's `email` claim into `channel.imap_login`, but Microsoft's SMTP AUTH (XOAUTH2) validates the username against the access token's UPN, not the mailbox's primary SMTP address or aliases. When those diverge — common in tenants that use one domain for sign-in identities and another for mailbox addresses — SMTP rejects every send. This PR makes `OauthCallbackController#update_channel` prefer `preferred_username` (v2.0) / `upn` (v1.0) from the id_token over `email`, with `email` as the fallback so Google flows are unchanged. ## What changed - `app/controllers/oauth_callback_controller.rb` — extract `imap_login_identity` (default: `users_data['email']`, same as before). `update_channel` now calls it instead of inlining the email claim. No behavioural change in the base controller. - `app/controllers/microsoft/callbacks_controller.rb` — override `imap_login_identity` to return `preferred_username \|\| upn \|\| super`. Provider-specific knowledge stays in the provider subclass; Google's flow is literally untouched. - `spec/controllers/microsoft/callbacks_controller_spec.rb` — adds one example that reproduces the divergent shape (`email: <alias>`, `preferred_username: <upn>`) and asserts `imap_login` lands on the UPN while `channel.email` stays on the mailbox alias. `channel.email` and `find_channel_by_email` still key on the id_token's `email` claim everywhere, so customer-facing From identity and reconnect-by-email matching are unchanged. Behavioural matrix: \| Provider / shape \| `imap_login` before \| `imap_login` after \| \|-----------------------------------------------------------------\|---------------------\|--------------------\| \| Microsoft, `upn == email` (common case) \| email \| UPN (= email, same string) \| \| Microsoft, `upn != email` (e.g. UPN on one domain, mailbox alias on another) \| alias (broken) \| UPN (works) \| \| Google \| email \| email (no change, base impl used) \| ## Why this happens (Microsoft side, for context) Two facts that together produce the bug — one is documented, the other we verified empirically because Microsoft's docs don't address it: 1. `email` and `upn` are different claims and can legitimately diverge. In Entra, the UPN is the sign-in identity; the id_token's `email` claim is the user's mailbox property (which can be a proxy address). For v2.0 tokens (which is what we use — `/common/oauth2/v2.0/token` in `MicrosoftConcern`), the documented "username to sign in as" claim is `preferred_username`. See [ID token claims reference](https://learn.microsoft.com/en-us/entra/identity-platform/id-token-claims-reference). So `users_data['email']` was the wrong source for `imap_login`; tenants where UPN == primary SMTP happened to mask the bug. 2. Exchange's XOAUTH2 SMTP rejects aliases in the `user=` field, even though IMAP accepts them. This asymmetry is not in [Microsoft's canonical XOAUTH2 doc](https://learn.microsoft.com/en-us/exchange/client-developer/legacy-protocols/how-to-authenticate-an-imap-pop-smtp-application-by-using-oauth) — the doc treats the SASL `user=` field uniformly across IMAP, POP, and SMTP, with only a shared-mailbox carve-out spelled out. We confirmed the SMTP-strict behaviour empirically by running an XOAUTH2 AUTH probe with the same access token: `user=<alias>` returned `535 5.7.3`, `user=<UPN>` returned `235`. Same token, only the `user=` field differed. That's what tied the symptom to claim-shape. Hence the patch: read the documented "sign-in name" claim and persist that, not the customer-facing mailbox address. ## How to reproduce (before the fix) 1. Set up a Microsoft email inbox in a tenant where the user's UPN domain differs from the user's primary SMTP / mailbox domain (e.g. UPN `user@tenant-a.example`, mailbox `User@tenant-b.example` where `tenant-b.example` is a proxy address on the same mailbox). 2. Connect or re-authenticate the inbox through the standard flow. 3. Inspect the channel: ```ruby ch.imap_login # => "User@tenant-b.example" (alias — wrong) token = ch.provider_config['access_token'] JSON.parse(Base64.urlsafe_decode64(token.split('.')[1].then { \|s\| s + '=' * (-s.length % 4) }))['upn'] # => "user@tenant-a.example" (UPN — what SMTP actually needs) ``` 4. Send a reply. SMTP returns `535 5.7.3 Authentication unsuccessful`. Incoming IMAP continues to work fine. After the fix, `imap_login` is set to the UPN at callback time and SMTP succeeds. ## Notes for review - The id_token decoding path (`users_data`) already exists and is trusted by the rest of the callback — no new attack surface. - Scoped to Microsoft on purpose. A provider-agnostic fallback chain in the base controller would also have worked (Google id_tokens don't carry `preferred_username` / `upn`, so it'd land on email anyway), but keeping Google's code path identical to today removes any risk of an unintended interaction with whatever a future Google update might add to its id_token. Pattern matches the existing `find_channel_by_email` override Google already has. - The [id_token claims reference](https://learn.microsoft.com/en-us/entra/identity-platform/id-token-claims-reference) warns that `preferred_username` is mutable and "can't be used to make authorization decisions." That warning targets apps using the claim as a stable cross-session identifier for app-level authz — we're not. We use it as the SASL `user=` string for SMTP AUTH, validated by Microsoft against the same token it just issued. The claim's mutability matters only if a tenant admin renames a UPN between re-auths, in which case the stored `imap_login` goes stale and SMTP 535s — but the pre-patch code has the identical mutability characteristic on the `email` claim (rename a mailbox's primary SMTP and you get the same stale-then-535). The patch doesn't enlarge that failure surface; it shrinks it, because UPN-vs-alias divergence is now handled rather than always-broken. - Related but out of scope: SMTP failures don't trigger `channel.authorization_error!` today. `ExceptionList::SMTP_EXCEPTIONS` (`lib/exception_list.rb`) only contains `Net::SMTPSyntaxError`; `Net::SMTPAuthenticationError` bubbles unhandled, and `ApplicationMailer#handle_smtp_exceptions` only logs. So a 535 — whether from this bug or any other identity drift — never prompts re-auth in the UI, unlike the IMAP path (`fetch_imap_emails_job` catches `OAuth2::Error` and calls `authorization_error!`). Worth a separate PR; flagging here so we don't pretend stale-identity SMTP errors are self-healing. - The alternative I considered — decoding the access token's `upn` directly — is more correct in principle but relies on Microsoft access tokens being JWTs, which is undocumented behaviour for the `outlook.office.com` audience and contradicts the [docs guidance](https://learn.microsoft.com/en-us/entra/identity-platform/access-tokens) that access tokens are opaque to clients. - Not addressed here, flagging for follow-up: existing channels that were re-authenticated before this fix still hold the alias in `imap_login` and will keep 535ing until the user re-auths again. A one-shot rake task could decode the stored access tokens and reconcile, but fixing forward via natural re-auth is lower-risk; let me know if you want the backfill. - Also out of scope: `app/mailers/conversation_reply_mailer_helper.rb` reads `provider_config['access_token']` raw without going through `Microsoft::RefreshOauthTokenService`, while the IMAP path refreshes. That's a real asymmetry but unrelated to this bug (the token here was minutes-old) and worth its own PR.	2026-06-02 13:26:30 +05:30
..
callbacks_controller.rb	fix: use UPN for `imap_login` on Microsoft OAuth callback (#14522 )	2026-06-02 13:26:30 +05:30

fix: use UPN for imap_login on Microsoft OAuth callback (#14522 )

Outgoing email on a Microsoft email inbox was failing with `535 5.7.3
Authentication unsuccessful` immediately after the user
re-authenticated, while incoming (IMAP) continued to work. The root
cause is in our OAuth callback: we persist the id_token's `email` claim
into `channel.imap_login`, but Microsoft's SMTP AUTH (XOAUTH2) validates
the username against the access token's **UPN**, not the mailbox's
primary SMTP address or aliases. When those diverge — common in tenants
that use one domain for sign-in identities and another for mailbox
addresses — SMTP rejects every send.

This PR makes `OauthCallbackController#update_channel` prefer
`preferred_username` (v2.0) / `upn` (v1.0) from the id_token over
`email`, with `email` as the fallback so Google flows are unchanged.

## What changed

- `app/controllers/oauth_callback_controller.rb` — extract
`imap_login_identity` (default: `users_data['email']`, same as before).
`update_channel` now calls it instead of inlining the email claim. No
behavioural change in the base controller.
- `app/controllers/microsoft/callbacks_controller.rb` — override
`imap_login_identity` to return `preferred_username || upn || super`.
Provider-specific knowledge stays in the provider subclass; Google's
flow is literally untouched.
- `spec/controllers/microsoft/callbacks_controller_spec.rb` — adds one
example that reproduces the divergent shape (`email: <alias>`,
`preferred_username: <upn>`) and asserts `imap_login` lands on the UPN
while `channel.email` stays on the mailbox alias.

`channel.email` and `find_channel_by_email` still key on the id_token's
`email` claim everywhere, so customer-facing From identity and
reconnect-by-email matching are unchanged.

Behavioural matrix:

| Provider / shape | `imap_login` before | `imap_login` after |

|-----------------------------------------------------------------|---------------------|--------------------|
| Microsoft, `upn == email` (common case) | email | UPN (= email, same
string) |
| Microsoft, `upn != email` (e.g. UPN on one domain, mailbox alias on
another) | alias (broken) | UPN (works) |
| Google | email | email (no change, base impl used) |

## Why this happens (Microsoft side, for context)

Two facts that together produce the bug — one is documented, the other
we verified empirically because Microsoft's docs don't address it:

1. **`email` and `upn` are different claims and can legitimately
diverge.** In Entra, the UPN is the sign-in identity; the id_token's
`email` claim is the user's mailbox property (which can be a proxy
address). For v2.0 tokens (which is what we use —
`/common/oauth2/v2.0/token` in `MicrosoftConcern`), the documented
"username to sign in as" claim is `preferred_username`. See [ID token
claims
reference](https://learn.microsoft.com/en-us/entra/identity-platform/id-token-claims-reference).
So `users_data['email']` was the wrong source for `imap_login`; tenants
where UPN == primary SMTP happened to mask the bug.

2. **Exchange's XOAUTH2 SMTP rejects aliases in the `user=` field, even
though IMAP accepts them.** This asymmetry is **not** in [Microsoft's
canonical XOAUTH2
doc](https://learn.microsoft.com/en-us/exchange/client-developer/legacy-protocols/how-to-authenticate-an-imap-pop-smtp-application-by-using-oauth)
— the doc treats the SASL `user=` field uniformly across IMAP, POP, and
SMTP, with only a shared-mailbox carve-out spelled out. We confirmed the
SMTP-strict behaviour empirically by running an XOAUTH2 AUTH probe with
the same access token: `user=<alias>` returned `535 5.7.3`, `user=<UPN>`
returned `235`. Same token, only the `user=` field differed. That's what
tied the symptom to claim-shape.

Hence the patch: read the documented "sign-in name" claim and persist
that, not the customer-facing mailbox address.

## How to reproduce (before the fix)

1. Set up a Microsoft email inbox in a tenant where the user's UPN
domain differs from the user's primary SMTP / mailbox domain (e.g. UPN
`user@tenant-a.example`, mailbox `User@tenant-b.example` where
`tenant-b.example` is a proxy address on the same mailbox).
2. Connect or re-authenticate the inbox through the standard flow.
3. Inspect the channel:
   ```ruby
ch.imap_login # => "User@tenant-b.example" (alias — wrong)
   token = ch.provider_config['access_token']
JSON.parse(Base64.urlsafe_decode64(token.split('.')[1].then { |s| s +
'=' * (-s.length % 4) }))['upn']
   # => "user@tenant-a.example"  (UPN — what SMTP actually needs)
   ```
4. Send a reply. SMTP returns `535 5.7.3 Authentication unsuccessful`.
Incoming IMAP continues to work fine.

After the fix, `imap_login` is set to the UPN at callback time and SMTP
succeeds.

## Notes for review

- The id_token decoding path (`users_data`) already exists and is
trusted by the rest of the callback — no new attack surface.
- Scoped to Microsoft on purpose. A provider-agnostic fallback chain in
the base controller would also have worked (Google id_tokens don't carry
`preferred_username` / `upn`, so it'd land on email anyway), but keeping
Google's code path identical to today removes any risk of an unintended
interaction with whatever a future Google update might add to its
id_token. Pattern matches the existing `find_channel_by_email` override
Google already has.
- The [id_token claims
reference](https://learn.microsoft.com/en-us/entra/identity-platform/id-token-claims-reference)
warns that `preferred_username` is mutable and "can't be used to make
authorization decisions." That warning targets apps using the claim as a
stable cross-session identifier for app-level authz — we're not. We use
it as the SASL `user=` string for SMTP AUTH, validated by Microsoft
against the same token it just issued. The claim's mutability matters
only if a tenant admin renames a UPN between re-auths, in which case the
stored `imap_login` goes stale and SMTP 535s — but the pre-patch code
has the identical mutability characteristic on the `email` claim (rename
a mailbox's primary SMTP and you get the same stale-then-535). The patch
doesn't enlarge that failure surface; it shrinks it, because
UPN-vs-alias divergence is now handled rather than always-broken.
- Related but out of scope: SMTP failures don't trigger
`channel.authorization_error!` today. `ExceptionList::SMTP_EXCEPTIONS`
(`lib/exception_list.rb`) only contains `Net::SMTPSyntaxError`;
`Net::SMTPAuthenticationError` bubbles unhandled, and
`ApplicationMailer#handle_smtp_exceptions` only logs. So a 535 — whether
from this bug or any other identity drift — never prompts re-auth in the
UI, unlike the IMAP path (`fetch_imap_emails_job` catches
`OAuth2::Error` and calls `authorization_error!`). Worth a separate PR;
flagging here so we don't pretend stale-identity SMTP errors are
self-healing.
- The alternative I considered — decoding the access token's `upn`
directly — is more correct in principle but relies on Microsoft access
tokens being JWTs, which is undocumented behaviour for the
`outlook.office.com` audience and contradicts the [docs
guidance](https://learn.microsoft.com/en-us/entra/identity-platform/access-tokens)
that access tokens are opaque to clients.
- **Not addressed here, flagging for follow-up:** existing channels that
were re-authenticated before this fix still hold the alias in
`imap_login` and will keep 535ing until the user re-auths again. A
one-shot rake task could decode the stored access tokens and reconcile,
but fixing forward via natural re-auth is lower-risk; let me know if you
want the backfill.
- Also out of scope: `app/mailers/conversation_reply_mailer_helper.rb`
reads `provider_config['access_token']` raw without going through
`Microsoft::RefreshOauthTokenService`, while the IMAP path refreshes.
That's a real asymmetry but unrelated to this bug (the token here was
minutes-old) and worth its own PR.

2026-06-02 13:26:30 +05:30

callbacks_controller.rb fix: use UPN for imap_login on Microsoft OAuth callback (#14522 ) 2026-06-02 13:26:30 +05:30