Replace E.New(fmt.Sprintf("...0x%04x", v)) with variadic E.New(..., fmt.Sprintf("0x%04x", v)) so plain text and integer args flow through F.ToString (sing-box/common/stun/stun.go precedent). Hoist three sysfs_linux.go if-init error checks per .claude/rules/go-syntax.md ("assign first, then check"); two of them also shadowed the outer err.
- Delete docs/adr/0001 and every code reference; the immutability contract
is now a four-line comment on the Export interface where it applies.
- Inline applyStaleClones at the linux/darwin reconcile call sites and
drop the helper plus the parallel staleBusIDs/pendingByBusID maps the
darwin host built only to match its signature.
- Inline single-use sentinel errors in iso_scheduler.go and
endpoint_darwin.go (no errors.Is callers).
- Rename exportLedger.fast/inventory to broadcastAccess/inventoryAccess
to match the Access suffix used by every other lock in the package.
- Strip narrative doc comments on exportLedger, the with* helpers,
IssueLease, ConsumeLeaseAndReserve, TryReserveForImport, host.go
interfaces, cgoCallbackHandle, and cloneDarwinExport; keep only the
non-obvious WHY required by .claude/rules/code-comment.md.
Net: 8 files, 99 insertions / 291 deletions, no behaviour change.
ExportHost.Events spawned goroutines tied to a caller-supplied ctx, and
neither host.Close nor the per-error cleanup in ServerService.Start
cancelled that ctx. A bind failure on port 3240 after Events succeeded
leaked ueventLoop, the netlink fd, and the darwin watcher goroutine.
The host now owns its background lifecycle: Start derives runCtx from
the caller's ctx, Close cancels it, Events binds goroutines to it.
ServerService.Start switches to a deferred cleanup that cancels and
closes the host on any error, so future failure steps cannot forget it.
The accepted limit of 4096 was looser than Linux's hard cap of 1024 in
drivers/usb/usbip/usbip_common.h, so a real peer would reject submits we
considered valid. Tighten the validator to 1024 to match upstream before
multi-packet iso lands.
P2: Darwin IN iso responses accepted malformed RET_SUBMIT descriptors —
aggregate ActualLength was checked but per-descriptor Offset/Length were
not, and ScatterIsoResponse silently clamped bad values into apparent
success. ValidateIsoResponse now lives in iso_scheduler.go beside
EncodeIsoSubmit/RebaseFrame and enforces the single-packet shape (Offset
== 0, Length == requestLen, ActualLength <= Length, sum == header,
payload covers range). pendingTransfer.validateResponse routes every
RET_SUBMIT shape check through one named seam so future defects land in
one place, and ScatterIsoResponse drops its defensive clamps so the
data-movement primitive can no longer mask a validation gap.
P2: kernelHandoffSession.Close mutated h.conn/h.monitorFile/h.relayConn
outside stateAccess while Start reads them under that lock; the server
registers prepared sessions before Start, so Close-before-Start can race
the direct-TCP path during import-time shutdown. Close now snapshots and
nils the three fields under stateAccess before closing the locals via
closeOnce, restoring symmetry with Start's under-lock reads and matching
the copy-under-lock, close-outside idiom established by
exportLedger.CloseAllSubscribers and ServerService.Close.
P1: The IOUSBHostControllerInterface command/doorbell blocks ran on an
Apple-owned dispatch queue, so callbacks could still fire (and panic in
cgo.Handle.Value) after Close deleted the handle. The controller now owns
a serial dispatch queue and drains it via dispatch_sync before tearing
the wrapper down — the watcher's pattern, extracted into a shared
box_usbhost_drain_and_release_queue helper.
P2: scatterResponse only bounds-checked ActualLength on IN transfers, so
an OUT RET_SUBMIT with a negative or oversized actual_length flowed
straight to box_usbhost_endpoint_sm_complete as a wrapped C.size_t.
Validation moves into pendingTransfer.accept, which reconciles every
response against the request before the length leaves the package.
Also wraps cgo.Handle in cgoCallbackHandle so destroy-then-Delete
ordering is encoded structurally for both the watcher and the controller.
- Route every reserved-state mutation through withInventoryWrite; the
lease insert path now broadcasts so extended subscribers see the busy
transition immediately instead of waiting for the next mutation. Folds
cleanupExpiredLocked changes into the broadcast decision on every
IssueLease early-reject path. Rename the field to inventory and force
read/write/write-quiet sites through dedicated accessors so future
callers cannot bypass the invariant.
- Mirror the linux clone-then-swap pattern in darwinExportHost.Reconcile
via cloneDarwinExport, so the ledger's unlocked Snapshot / LeaseCheck
reads never observe a half-mutated stale flag. Documented as
docs/adr/0001-export-pointer-immutability.md and on the Export
interface; both hosts now share the applyStaleClones helper.
- Rewrite 27 if (_, )?err := …; err != nil sites to assign-then-check
per .claude/rules/go-syntax.md.
- Delete parse / builder / tautological tests forbidden by
.claude/rules/code-test.md (option/usbip_test.go,
iso_scheduler_test.go, usbhost_darwin_status_test.go).
- Tag nine more usbip files with linux || (darwin && cgo); fixes
pre-existing windows / android lint failures because the protocol
types were only consumed by tagged code.
control_protocol.go was ungated but referenced ExportLeaseIdentity
from host.go (linux || (darwin && cgo)), breaking the stub-register
contract on Windows and darwin && !cgo. Move serverImportLease and
importLeaseTTL into export_ledger.go alongside their only consumers.
Drop the constructor guard so client import-all (Devices empty/omitted)
actually works, route ledger lease mutations through a new
mutateAndBroadcast helper so Unsubscribe-triggered release reaches all
subscribers, and tighten the Events contract: subscribe at Start so a
host failure aborts service start instead of silently disabling hotplug.
Three review findings shared one shape: a contract documented in
comments but reimplemented at each call site. Each fix collapses
the contract into a single named home.
- RebaseFrame now rounds up to the least frame >= currentFrame
with the matching low 8 bits, so scheduled iso submits never
land in the past. The .m bridge trusts the asap flag instead
of treating start_frame=0 as ASAP, which also fixes a latent
int32-wraparound path.
- NewServerService and NewClientService reject empty devices
with a clear error instead of starting up exporting nothing.
- exportLedger.reservedLocked unifies busy + outstanding lease;
AvailableExports and snapshotDeviceState consult it so legacy
devlist and control snapshots no longer advertise a leased
busid as available. IsBusy renamed to IsReserved across the
ExportHost interface and call sites.
Two supporting deepenings:
- Drop FrameOracle/IsoScheduler; iso submit is now the free
EncodeIsoSubmit(currentFrame, ...). The darwin endpoint holds
currentFrame func() uint64.
- Extract SelectMatches; host_linux and host_darwin Reconcile
loops share the match/dedup step.
The peer now owns submit/cancel lifecycle: pendingSubmit retains the
original DevID per submit, CMD_UNLINK carries it (stub_rx valid_request
was silently rejecting devid=0), and RET_UNLINK finalizes the bound
transaction. UrbTransaction.Cancel delegates instead of assembling the
wire itself. Stale exports stay in linux and darwin host snapshots so
the ledger broadcasts unavailable State updates rather than Removed.
Extract wire, URB lifecycle, iso translation, and per-endpoint behaviour
from client_darwin.go into dedicated modules so EndpointPause/Destroy can
quiesce in-flight URBs via CMD_UNLINK and so iso frame numbers survive the
256-frame wrap.
- usbip_peer.go: owns seqnum, write serialization, pending map, and
RET_SUBMIT/RET_UNLINK dispatch.
- urb_transaction.go: deterministic state machine for one in-flight URB
with Cancel + Wait; late RET_SUBMIT after cancel is swallowed.
- iso_scheduler.go: rebases Apple's 8-bit CI frame against the
controller's monotonic counter; explicit ASAP encoding into
TransferFlags so the server can pass firstFrameNumber=0 deliberately
rather than relying on the start_frame=0 sentinel.
- endpoint_darwin.go: per-endpoint worker selects on cmd/doorbell/
transaction-done; Pause/Destroy cancels the pending transaction,
completes the Apple CI transfer with ciStatusOffline, then writes
the CI success response. client_darwin.go becomes a thin demuxer.
- usbhost_darwin.{m,h,go}, host_darwin.go: plumb the wire-level ASAP
flag through to box_usbhost_device_iso so server-side scheduling
distinguishes "scheduled at frame 0" from ASAP.
Three independent correctness bugs that all violated "decide before
publishing": each surface (wire reply, sysfs path, broadcast state)
committed to a state the code later had to contradict.
darwin: switch IOUSBHostPipe abort to IOUSBHostAbortOptionSynchronous and
add the EP-0 path via abortDeviceRequestsWithOption (previously a silent
no-op for control transfers). After abort returns, await a per-pending
drained channel before writing RET_UNLINK so the late RetSubmit
suppression in finishSubmit has actually taken effect — the drained
channel is allocated lazily by markSubmitUnlinked, so the hot submit
path is unchanged.
linux: collapse the per-controller VHCI abstraction. The kernel exposes
attach/detach/status/status.N only on vhci_hcd.0 with globally unique
port numbers; the previous code wrote vhci_hcd.N/attach for N>0 and
silently failed past the primary controller's port range. Glob status*
on the primary, key reservations on bare port int, and carry the
status-suffix as a diagnostic-only secondary index for Description().
export_ledger: ConsumeLeaseAndReserve no longer publishes busy=true and
then conditionally rolls it back. Read seq under fast first, then make
every decision (including the lease.Generation check) inside one slow
critical section before any busy mutation. This removes the window
where a concurrent BroadcastIfChanged could observe transient busy and
poison l.state with no corrective broadcast on rollback.
Address `golangci-lint` modernize and unused warnings:
- replace map copy loops with `maps.Copy`
- collapse `if x > y { x = y }` clamps to `min`/`max`
- switch `for i := 0; i < n; i++` to `for i := range n`
- use `t.Context()` in the relay handoff test
- delete the unused `removeOnce` field and `openBinaryDevice` helper
`darwinExportHost.Events` closed its producer channel after destroying
the IOKit notification port, but `IONotificationPortDestroy` does not
synchronously drain the dispatch queue: a callback block already
executing or enqueued could still call `signal()` on the closed
channel, or resolve the `cgo.Handle` after `ref.Delete()`. Retain the
serial dispatch queue inside the watcher and `dispatch_sync` an empty
block before releasing it, so no native callback is in flight when
the Go side proceeds.
`ServerService.Close` cancelled the context and immediately closed the
`ExportHost`. On Darwin that calls `IOUSBHostDevice destroy` for every
capture while `darwinServerDataSession.handleSubmit` goroutines may
still be inside `IOUSBHostPipe` requests on the same device. Track
active sessions, close them explicitly, and wait for the per-session
cleanup goroutines via a WaitGroup before tearing down the host.
Darwin URBs handled in concurrent goroutines mutated BoxUSBHostDevice's
pipes/interfaces dictionaries without serialization, so SetConfiguration
racing with endpoint IO could corrupt them or hand back stale pipes.
Guard the collections with os_unfair_lock; keep the slow IOKit calls
outside the lock so unrelated endpoints don't stall.
Iso IN responses were flat-copied buffer[:actual], losing offsets for
multi-packet transfers. Pack the wire payload per IsoPackets[i].Offset/
ActualLength to match Linux's stub_tx.c, and mirror the scatter on the
Darwin client side so a future multi-packet emitter is correct.
On Linux the server was closing the userspace conn inside
NewServerDataSession (handoff.Start closes h.conn synchronously in
direct-TCP mode), so the subsequent WriteOpRepImport always wrote to a
closed socket and every default-mode import failed. Split DataSession
into prepare + Start: the server now writes the reply on the live conn
and only then calls session.Start() to hand the wire to the kernel.
Close also closes h.conn now so the prepare-then-Close error path
doesn't leak the fd.
On Darwin FinishImport returned false after successfully re-capturing
a pending replacement for a stale busy export, so the server skipped
reconcileAndBroadcast and the ledger kept the device hidden until
another IOKit topology event. Return true to trigger reconcile; the
next Reconcile fast-paths the matching registryID back into the
snapshot.
Three independent regressions that silently degraded the USB/IP feature:
- Darwin: when a busy device re-enumerated (same busid, new IOUSBHost
registry ID), Reconcile marked the export stale but never captured
the replacement, and nothing reconciled after FinishImport — the
device only reappeared on an unrelated future topology event. Now
Reconcile records the pending registry ID on the stale entry,
FinishImport rehydrates inline, and the server triggers a
reconcileAndBroadcast after release as a safety net.
- DEVLIST: SerialString and entryDeviceKey expected a "\0serial=..."
trailer in the 256-byte Path field, but encodePathField never wrote
one, so serial-based matching silently failed against plain usbipd
peers and against sing-box peers that fell back to DEVLIST polling.
encodePathField now writes the trailer (wire-compatible with kernel
usbip-utils, which NUL-terminate the path) and ReadOpRepDevListBody
populates DeviceEntry.Serial on decode.
- Control: any write/read I/O failure during the CONTROL handshake was
wrapped as errControlUnsupported, so a single TCP RST or 5s deadline
on startup permanently downgraded the client to 5-second DEVLIST
polling. Now only io.EOF on the ACK read and frame-validation
failures stay unsupported; other I/O failures retry with
exponential backoff (1s→30s) up to 3 consecutive transients before
committing to the polling fallback.
Drop the ConfirmImport reference left over from a prior design (the
function never existed) and remove three doc-comments that restated
the function name or narrated visible lock dances.
Apply the <10-LoC / <3-non-test-caller heuristic across the package:
inline writeControlFrame, removeLeaseWaiter, vhciAttach, vhciHubForSpeed,
applyControlSnapshot, applyRemoteEntries, shouldRetryBusID, entrySerial,
and the trivial darwin/linux import-host constructors at their call
sites.
Delete the darwinClientSession wrapper -- darwinVirtualController already
satisfies DataSession, so attach the Description() method to it directly
and return the controller from darwinImportHost.Attach.
Fold the 11-line data_session.go into host.go beside AttachedSession,
which embeds DataSession.
Inline thin pass-through helpers (< 10 lines, < 3 non-test callers) and
the writeSysfs path wrappers; delete the test-only vhciUsedPorts plus
its singleflight machinery and the never-called clientAssignment.IsActive.
Remove four private single-impl interfaces (usbEventListener,
darwinUSBHostDeviceWatch, darwinServerDataDevice, darwinEndpointStateMachine)
and both function-table DI seams (usbipOps, darwinServerOps); production calls
the underlying functions directly. Inline ~25 single-call helpers and fuse
subordinate routines into their sole caller. Collapse single-field helper
structs into their underlying types. Fold server_linux.go/server_darwin.go
into host_linux.go/host_darwin.go.
Delete change-detector tests per .claude/rules/code-test.md: parse round-trip,
mirror, input-validation, and builder-property suites that never exercised
real syscalls, network, or cgo. Six test files removed; linux_test.go slimmed
to three real-system tests; darwin_integration_test.go retains the six
real-cgo tests. Surviving suite: 16 tests that all exercise real OS APIs or
spawn the official Linux usbip server.
Net: -5100 LoC (~37% of the package).
The ExportHost.Reconcile changed bool is computed by both Linux and
Darwin implementations but discarded at the only call site
(reconcileAndBroadcast); the ledger already detects change in
BroadcastIfChanged. linuxImportHost.trackPort had a dead add=true
branch — all callers passed false. darwinExport.Snapshot was setting
StatusReason to the same string as State (tautological). LeaseCheck
for stale captures returned the State name as the reason.
Also trim narrative comments in export_ledger.go that referenced
refactor history or the deleted lease.go file, inline the trivial
checkAvailableUnderLock helper at its only caller, and replace
cloneIsoPackets with slices.Clone.
ServerService used to own three mutexes coordinating six pieces of
mutable state (exports, busy, controlState, controlSeq, controlSubs,
LeaseManager). The lock-ordering rule that fixed the import/attach
race in 0c892411a lived only as a comment. Two-mutex (fast/slow)
ledger absorbs all of it; the two mutexes are never held
simultaneously, so the ordering rule becomes a property of the type
instead of a discipline callers must observe. TryReserveForImport
closes the window where two concurrent OpReqImports could both pass
the busy check before either marked busy.
Collapse ServerService and ClientService to one definition each by
hiding Linux vs Darwin behind ExportHost, ImportHost, and Export
seams. Along the way, extract three state machines that previously
lived as scattered fields on the service structs:
- LeaseManager owns its own mutex and closes the
availability-vs-insert TOCTOU by checking export busy inside Issue
under the same lock as the insert.
- DataSession gives the three per-import data-plane implementations
(Linux kernel handoff, Darwin server data session, Darwin virtual
controller) a uniform Done/Err/Close interface.
- clientAssignment encapsulates the matched/import-all target state
and exposes ApplyMatched/ApplyAll diffs to ClientService, which
keeps worker goroutine lifecycle.
Service busy tracking moves off the per-platform serverExport struct
onto ServerService.busy, since it follows the lease/import lifecycle
rather than physical claim/release. linux_test.go is migrated to
construct ServerService and ClientService through the new host
interfaces.