Commit Graph

14 Commits

Author SHA1 Message Date
Jordan Whited
8b58bd6c64 net/batching: implement NodeAttrNeverGSOEqualTail
This NodeCapability works around the UDP GSO bugs introduced by
torvalds/linux@b10b446 (v7.0-rc1). These bugs were later fixed by
torvalds/linux@78effd8 and torvalds/linux@5f17ae0 (v7.1-rc5). These
Linux kernel bugs cause mangled UDP headers and UDP checksums, resulting
in high levels of packet loss.

The aforementioned bugs have already made their way downstream into
various distros, e.g. Ubuntu 26.04 LTS. Impacted users are now dealing
with poor UDP performance in tailscaled, and in any other software that
makes use of UDP GSO.

Not all users of the affected kernels are impacted as the relevant
kernel code path sits between kernel and netdev driver, and behaviors
vary by driver/device capability.

We cannot detect impact at runtime, as this would require gathering all
netdevs, and performing loopback tests. This is invasive and in many
cases impossible.

So, we are left to choose between disabling UDP GSO for all users on
affected kernels, whether they experience real impact or not, or try
and work around the bugs. Disabling UDP GSO for a user that is not
impacted can cut max throughput in half, and consume more CPU cycles.

This commit attempts to workaround the bugs by avoiding UDP GSO when
batches are small, and injecting a 1-byte sentinel tail payload when
they are large. This tail payload is smaller than "GSO size", which
sidesteps the primary trigger of all fragments in a batch being
equal in length.

The end result is slightly increased payload and packet overhead, but
functional UDP GSO for all Linux 7.0-7.1.4 users, regardless of
netdev/driver.

Updates #19777

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-05-29 11:36:35 -07:00
James Tucker
25b8ed8d9e control/controlknobs,net/{batching,tstun},wgengine: add nodecaps to disable UDP & TUN GRO/GSO
Add four control-plane node attributes that let us disable UDP GSO/GRO
on the magicsock UDP socket and UDP/TCP GRO on the Tailscale TUN
device.

These complement the pre-existing TS_DEBUG_DISABLE_UDP_{GRO,GSO} and
TS_TUN_DISABLE_{UDP,TCP}_GRO envknobs. They exist so we can mitigate
upstream Linux kernel regressions on a deployed fleet without
requiring a client release, after two incidents (#13041, #19777) where
buggy kernel patches landed upstream and the fix took an excessively
long time to reach downstream distros.

Knob changes are reacted to in setNetworkMapInternal / SetNetworkMap via
a comparison against a cached "last applied" value and only an actual
transition triggers work: magicsock Rebind()+ReSTUN for UDP,
ApplyGROKnobs for TUN. The TUN side is gated by buildfeatures.HasGRO and
is one-way (wireguard-go GRO disablement is sticky); re-enabling
requires a client restart.

Updates #13041
Updates #19777

Change-Id: I802993070afa659cc06809bb0bfbb7f8a0cdb273
Signed-off-by: James Tucker <james@tailscale.com>
2026-05-27 17:10:14 -07:00
James Tucker
dea49bb4da net/batching: add envknobs to disable UDP GRO & GSO
It is sometimes useful when diagnosing subtle and specific performance
problems to rule out GRO/GSO independently and/or toggle them to
influence packet pacing.

Updates #17835
Updates tailscale/corp#31164

Signed-off-by: James Tucker <james@tailscale.com>
2026-05-27 12:05:00 -07:00
Brad Fitzpatrick
5ef3713c9f cmd/vet: add subtestnames analyzer; fix all existing violations
Add a new vet analyzer that checks t.Run subtest names don't contain
characters requiring quoting when re-running via "go test -run". This
enforces the style guide rule: don't use spaces or punctuation in
subtest names.

The analyzer flags:
- Direct t.Run calls with string literal names containing spaces,
  regex metacharacters, quotes, or other problematic characters
- Table-driven t.Run(tt.name, ...) calls where tt ranges over a
  slice/map literal with bad name field values

Also fix all 978 existing violations across 81 test files, replacing
spaces with hyphens and shortening long sentence-like names to concise
hyphenated forms.

Updates #19242

Change-Id: Ib0ad96a111bd8e764582d1d4902fe2599454ab65
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-05 15:52:51 -07:00
Alex Valiushko
330a17b7d7
net/batching: use vectored writes on Linux (#19054)
On Linux batching.Conn will now write a vector of
coalesced buffers via sendmmsg(2) instead of copying
fragments into a single buffer.

Scatter-gather I/O has been available on Linux since the
earliest days (reworked in 2.6.24). Kernel passes fragments
to the driver if it supports it, otherwise linearizes
upon receiving the data.

Removing the copy overhead from userspace yields up to 4-5%
packet and bitrate improvement on Linux with GSO enabled:
46Gb/s 4.4m pps vs 44Gb/s 4.2m pps w/32 Peer Relay client flows.

Updates tailscale/corp#36989


Change-Id: Idb2248d0964fb011f1c8f957ca555eab6a6a6964

Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
2026-03-25 16:38:54 -07:00
Jordan Whited
31d65a909d net/batching: eliminate gso helper func indirection
These were previously swappable for historical reasons that are no
longer relevant.

Removing the indirection enables future inlining optimizations if we
simplify further.

Updates tailscale/corp#38703

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-03-18 10:11:33 -07:00
Jordan Whited
96dde53b43 net/{batching,udprelay},wgengine/magicsock: add SO_RXQ_OVFL clientmetrics
For the purpose of improved observability of UDP socket receive buffer
overflows on Linux.

Updates tailscale/corp#37679

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-03-13 14:27:03 -07:00
Jordan Whited
607d01cdae net/batching: clarify & simplify single packet read limitations
ReadFromUDPAddrPort worked if UDP GRO was unsupported, but we don't
actually want attempted usage, nor does any exist today. Future work
on tailscale/corp#37679 would have required more complexity in this
method, vs clarifying the API intents.

Updates tailscale/corp#37679

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-03-11 10:25:58 -07:00
Will Norris
3ec5be3f51 all: remove AUTHORS file and references to it
This file was never truly necessary and has never actually been used in
the history of Tailscale's open source releases.

A Brief History of AUTHORS files
---

The AUTHORS file was a pattern developed at Google, originally for
Chromium, then adopted by Go and a bunch of other projects. The problem
was that Chromium originally had a copyright line only recognizing
Google as the copyright holder. Because Google (and most open source
projects) do not require copyright assignemnt for contributions, each
contributor maintains their copyright. Some large corporate contributors
then tried to add their own name to the copyright line in the LICENSE
file or in file headers. This quickly becomes unwieldy, and puts a
tremendous burden on anyone building on top of Chromium, since the
license requires that they keep all copyright lines intact.

The compromise was to create an AUTHORS file that would list all of the
copyright holders. The LICENSE file and source file headers would then
include that list by reference, listing the copyright holder as "The
Chromium Authors".

This also become cumbersome to simply keep the file up to date with a
high rate of new contributors. Plus it's not always obvious who the
copyright holder is. Sometimes it is the individual making the
contribution, but many times it may be their employer. There is no way
for the proejct maintainer to know.

Eventually, Google changed their policy to no longer recommend trying to
keep the AUTHORS file up to date proactively, and instead to only add to
it when requested: https://opensource.google/docs/releasing/authors.
They are also clear that:

> Adding contributors to the AUTHORS file is entirely within the
> project's discretion and has no implications for copyright ownership.

It was primarily added to appease a small number of large contributors
that insisted that they be recognized as copyright holders (which was
entirely their right to do). But it's not truly necessary, and not even
the most accurate way of identifying contributors and/or copyright
holders.

In practice, we've never added anyone to our AUTHORS file. It only lists
Tailscale, so it's not really serving any purpose. It also causes
confusion because Tailscalars put the "Tailscale Inc & AUTHORS" header
in other open source repos which don't actually have an AUTHORS file, so
it's ambiguous what that means.

Instead, we just acknowledge that the contributors to Tailscale (whoever
they are) are copyright holders for their individual contributions. We
also have the benefit of using the DCO (developercertificate.org) which
provides some additional certification of their right to make the
contribution.

The source file changes were purely mechanical with:

    git ls-files | xargs sed -i -e 's/\(Tailscale Inc &\) AUTHORS/\1 contributors/g'

Updates #cleanup

Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
2026-01-23 15:49:45 -08:00
Brad Fitzpatrick
7d19813618 net/batching: fix import formatting
From #17842

Updates #cleanup

Change-Id: Ie041b50659361b50558d5ec1f557688d09935f7c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-11-19 18:13:46 -08:00
Sachin Iyer
16e90dcb27
net/batching: fix gro size handling for misordered UDP_GRO messages (#17842)
Fixes #17835

Signed-off-by: Sachin Iyer <siyer@detail.dev>
2025-11-12 10:13:21 -05:00
Jordan Whited
d4b7200129
net/udprelay: use batching.Conn (#16866)
This significantly improves throughput of a peer relay server on Linux.

Server.packetReadLoop no longer passes sockets down the stack. Instead,
packet handling methods return a netip.AddrPort and []byte, which
packetReadLoop gathers together for eventual batched writes on the
appropriate socket(s).

Updates tailscale/corp#31164

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-08-19 14:44:39 -07:00
Jordan Whited
c083a9b053
net/batching: fix compile-time assert (#16864)
Updates #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-08-14 10:48:06 -07:00
Jordan Whited
16bc0a5558
net/{batching,packet},wgengine/magicsock: export batchingConn (#16848)
For eventual use by net/udprelay.Server.

Updates tailscale/corp#31164

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-08-13 13:13:11 -07:00