Headless server leaks KQUEUE file descriptors on macOS (~1.6 per tool call) #2707

@PureWeen

Problem

The copilot headless server (copilot --headless --port <port>) leaks KQUEUE file descriptors on macOS. Each tool call (bash, grep, etc.) leaks about 1.6 kqueue handles on average. These handles are never released, which eventually exhausts resources and breaks all tool execution with EBADF errors.

Standalone Repro

kqueue-leak-repro.zip

A minimal .NET console app reproduces the leak using only the GitHub.Copilot.SDK:

// 1. Start a copilot headless server:  copilot --headless --port 4321
// 2. Run this repro:                   dotnet run
// 3. Watch kqueue count climb:         lsof -p <server-pid> | grep KQUEUE | wc -l

using GitHub.Copilot.SDK;

var client = new CopilotClient(new CopilotClientOptions
{
    CliUrl = "http://localhost:4321",
    UseStdio = false,
    AutoStart = false,
});
await client.StartAsync();

var session = await client.CreateSessionAsync(new SessionConfig
{
    OnPermissionRequest = PermissionHandler.ApproveAll,
});

for (int i = 1; i <= 50; i++)
{
    var response = await session.SendAndWaitAsync(
        new MessageOptions { Prompt = $"Run this exact bash command and report the output: echo 'hello from round {i}'" },
        timeout: TimeSpan.FromSeconds(30));
    
    // After each round, check: lsof -p <server-pid> | grep KQUEUE | wc -l
    // Count grows monotonically and never decreases.
}

Repro Results (50 tool calls)

Baseline: 112 kqueue handles

Round   1/50: kqueue=114 (+2 leaked)
Round  10/50: kqueue=127 (+15 leaked)
Round  25/50: kqueue=149 (+37 leaked)
Round  50/50: kqueue=191 (+79 leaked)

Baseline kqueue:  112
Final kqueue:     191
Leaked:           79
Leak rate:        ~1.6 per tool call

❌ KQUEUE LEAK CONFIRMED — handles grow monotonically and are never released.
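
As a quick check on the figures above, the leak rate can be recomputed from the baseline and final counts. This is a minimal sketch; the helper name `leak_rate` is hypothetical and is not part of the attached repro.

```shell
#!/bin/sh
# leak_rate BASELINE FINAL CALLS
# Prints the number of handles leaked and the average leak per tool call.
# (Hypothetical helper for illustration; not part of the repro zip.)
leak_rate() {
    awk -v b="$1" -v f="$2" -v n="$3" \
        'BEGIN { printf "leaked=%d rate=%.2f\n", f - b, (f - b) / n }'
}

# Numbers taken from the 50-round run above.
leak_rate 112 191 50   # prints: leaked=79 rate=1.58
```

The result of 1.58 rounds to the ~1.6 per tool call quoted in the summary.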

Passive Monitoring (real-world usage over 45 minutes)

13:48  total=30   kqueue=3     (baseline, fresh server)
13:58  total=43   kqueue=3
14:03  total=62   kqueue=15
14:18  total=98   kqueue=52
14:33  total=149  kqueue=103   (+100 kqueue in 45 min)
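
A log like the one above can be collected with a small sampling loop around `lsof`. The sketch below is an assumption about how such monitoring might be scripted, not the script actually used for this report; `fd_summary`, `SERVER_PID`, `INTERVAL`, and the file name `kqueue-monitor.sh` are hypothetical. It counts every non-header `lsof` row as one open file, which may differ slightly from how the totals above were measured.

```shell
#!/bin/sh
# fd_summary: read `lsof -p <pid>` output on stdin and print the number of
# rows (excluding the header) and how many of them have TYPE == KQUEUE.
fd_summary() {
    awk 'NR > 1 { total++; if ($5 == "KQUEUE") kq++ }
         END    { printf "total=%d kqueue=%d\n", total, kq }'
}

# Sample only when SERVER_PID is set, for example:
#   SERVER_PID=12345 INTERVAL=300 sh kqueue-monitor.sh
if [ -n "${SERVER_PID:-}" ]; then
    # Keep sampling until the server process can no longer be signalled.
    while kill -0 "$SERVER_PID" 2>/dev/null; do
        printf '%s  %s\n' "$(date +%H:%M)" \
            "$(lsof -p "$SERVER_PID" | fd_summary)"
        sleep "${INTERVAL:-300}"
    done
fi
```

Keeping the parsing in a function that reads stdin means it can be checked against saved `lsof` output without a running server.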

Long-term Impact (9-day server uptime)

After 9 days of continuous operation, the server accumulated 9,594 leaked KQUEUE handles (10,321 total FDs):

$ lsof -p <pid> | awk '{print $5}' | sort | uniq -c | sort -rn | head -3
9594 KQUEUE
 436 unix
 242 CHR

All sessions lost bash/shell tool access simultaneously. Sessions reported EBADF or "Failed to start bash process". The only recovery was to kill and restart the headless server.

Analysis

The copilot headless server is a compiled Node.js binary. Node.js uses libuv, which creates kqueue event watchers on macOS for child process management. Each tool subprocess (bash, grep, etc.) appears to get a kqueue watcher that is never closed after the subprocess exits.

  • The leak is in the CLI server process, not in the SDK client
  • The SDK client connects via TCP in persistent/headless mode and does not spawn subprocesses
  • The leak is proportional to tool call volume, not time
  • Kqueue handles are never released during normal operation — only a server restart clears them

Environment

  • macOS ARM64
  • copilot CLI bundled with GitHub.Copilot.SDK 0.2.1
  • Tested on both long-running server (9 days) and fresh server (repro completes in minutes)
