asyncio: _ProactorBasePipeTransport._call_connection_lost leaks OSError [WinError 6] from PipeHandle.close() on Windows #149388

@mxschmitt

Summary

On Windows with the ProactorEventLoop, _ProactorBasePipeTransport._call_connection_lost calls self._sock.close() at Lib/asyncio/proactor_events.py:166 without guarding against OSError. PipeHandle.close() (Lib/asyncio/windows_utils.py:105-108) is not idempotent at the Win32 level: once the self._handle is not None check passes, it calls _winapi.CloseHandle(self._handle) unconditionally. If another code path closes the underlying Win32 handle between that check and the CloseHandle call, or at any earlier point without going through PipeHandle, the call raises OSError: [WinError 6] The handle is invalid. The exception escapes into the loop's default exception handler, producing noisy logs (and crashes where handlers escalate). Sentry reports show this occurring in production across many users on Python 3.13.12 / 3.13.13.

Python version / platform

  • CPython 3.13.12 and 3.13.13, Windows 10 / 11 (x86_64)
  • asyncio.ProactorEventLoop (Windows default)

Related issues (for cross-reference)

Minimal deterministic reproducer

This is the tightest demonstration of the underlying defect — 10 lines, no asyncio event loop, no subprocess, no threading:

import _winapi
from asyncio import windows_utils

r, w = windows_utils.pipe(overlapped=(False, False))
ph = windows_utils.PipeHandle(r)
# Another code path closes the underlying Win32 handle directly,
# without going through PipeHandle. This happens in the wild via
# finalizer races, IOCP cleanup, or DuplicateHandle-based paths.
_winapi.CloseHandle(ph._handle)
ph.close()  # -> OSError: [WinError 6] The handle is invalid

On Python 3.13.12 Windows this reliably raises:

Traceback (most recent call last):
  File "repro.py", line 8, in <module>
    ph.close()
  File ".../asyncio/windows_utils.py", line 107, in close
    CloseHandle(self._handle)
OSError: [WinError 6] The handle is invalid

PipeHandle.__del__ then hits the same error on GC, because _handle was never cleared and __del__ calls close() again. This is a direct trigger of the ValueError: I/O operation on closed pipe / OSError [WinError 6] family tracked in gh-83413.

End-to-end reproducer (asyncio path)

This demonstrates the same defect surfacing through asyncio's _ProactorBasePipeTransport._call_connection_lost. It triggers intermittently on Python 3.13.12 on Windows: the three sample runs below produced 15 events over 60 trials.

"""
Reproduce OSError [WinError 6] from PipeHandle.close() in
_ProactorBasePipeTransport._call_connection_lost.

Requires: Python 3.13, Windows, ProactorEventLoop (default).
"""
import asyncio
import sys
import threading
import time

HITS = []


def on_exc(loop, ctx):
    exc = ctx.get("exception")
    if isinstance(exc, OSError) and getattr(exc, "winerror", None) == 6:
        HITS.append(ctx)
        print(f"[HIT #{len(HITS)}] {ctx.get('message')}")


async def one_trial():
    loop = asyncio.get_running_loop()
    loop.set_exception_handler(on_exc)

    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c",
        "import sys; sys.stdout.write('x'*100); sys.stdout.flush()",
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )

    # Second reference to the same PipeHandle that the transport wraps.
    # In production, the second reference is held elsewhere by asyncio
    # internals OR by user code that retained proc.stdout; here we
    # make it explicit for reproducibility.
    stdout_ph = proc._transport._proc.stdout

    stop = threading.Event()

    def closer():
        while not stop.is_set():
            try:
                stdout_ph.close()
            except OSError:
                pass
            time.sleep(0)

    t = threading.Thread(target=closer, daemon=True)
    t.start()
    try:
        await asyncio.wait_for(proc.communicate(), timeout=3)
    finally:
        stop.set()
        t.join(timeout=1)


def main():
    for _ in range(20):
        asyncio.run(one_trial())
    print(f"\n{len(HITS)} WinError 6 event(s) over 20 trials")


if __name__ == "__main__":
    main()

Sample output (three separate runs of 20 trials each):

3 WinError 6 event(s) over 20 trials
8 WinError 6 event(s) over 20 trials
4 WinError 6 event(s) over 20 trials

Root cause analysis

The race window is inside PipeHandle.close():

# Lib/asyncio/windows_utils.py
def close(self, *, CloseHandle=_winapi.CloseHandle):
    if self._handle is not None:   # (A)
        CloseHandle(self._handle)  # (B) releases GIL during blocking syscall
        self._handle = None        # (C)

During (B), CloseHandle releases the GIL for the blocking Win32 call, so (C) has not yet run when a second thread gets to execute. That thread also passes the self._handle is not None check at (A) and reaches its own (B). Its CloseHandle operates on a handle value that the first call has already closed, and raises OSError with WinError 6.
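The interleaving can be replayed deterministically off Windows. The sketch below is a simulation under stated assumptions, not CPython code: FakeHandleTable and RacyPipeHandle are hypothetical stand-ins for the kernel handle table and for PipeHandle, and two threading.Event objects force the second thread to run while the first is parked inside (B), before (C).

```python
import threading


class FakeHandleTable:
    """Hypothetical stand-in for the Win32 handle table."""

    def __init__(self):
        self._open = set()
        self._lock = threading.Lock()
        self._next = 100

    def open(self):
        with self._lock:
            handle = self._next
            self._next += 4
            self._open.add(handle)
            return handle

    def close(self, handle):
        with self._lock:
            if handle not in self._open:
                # Simulated error; a real one would carry exc.winerror == 6.
                raise OSError("[simulated WinError 6] The handle is invalid")
            self._open.remove(handle)


class RacyPipeHandle:
    """Same check-then-act shape as PipeHandle.close()."""

    def __init__(self, handle, close_handle):
        self._handle = handle
        self._close_handle = close_handle

    def close(self):
        if self._handle is not None:          # (A)
            self._close_handle(self._handle)  # (B)
            self._handle = None               # (C)


def replay_race():
    table = FakeHandleTable()
    first_closed = threading.Event()
    second_done = threading.Event()
    errors = []

    def close_handle(handle):
        table.close(handle)
        if threading.current_thread().name == "first":
            # Park the first thread inside (B), before (C) runs. This stands
            # in for the window in which CloseHandle has released the GIL.
            first_closed.set()
            second_done.wait(timeout=5)

    ph = RacyPipeHandle(table.open(), close_handle)

    def second():
        first_closed.wait(timeout=5)
        try:
            ph.close()  # passes (A) because (C) has not run yet
        except OSError as exc:
            errors.append(exc)
        finally:
            second_done.set()

    t1 = threading.Thread(target=ph.close, name="first")
    t2 = threading.Thread(target=second, name="second")
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return errors, ph._handle


errors, handle = replay_race()
print(len(errors), handle)
```

The second thread collects exactly one simulated error, and the attribute is cleared only after the first thread resumes. This mirrors the ordering described in (A) through (C).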

The raised OSError propagates out of _ProactorBasePipeTransport._call_connection_lost at Lib/asyncio/proactor_events.py:166:

self._sock.close()   # ← raises OSError here
self._sock = None    # never runs
...
self._called_connection_lost = True  # never runs

Neither self._sock = None nor self._called_connection_lost = True executes, because the exception escapes. The loop's exception handler then surfaces:

Exception in callback _ProactorBasePipeTransport._call_connection_lost()
handle: <Handle _ProactorBasePipeTransport._call_connection_lost()>
Traceback (most recent call last):
  File "asyncio/events.py", line 89, in _run
  File "asyncio/proactor_events.py", line 166, in _call_connection_lost
  File "asyncio/windows_utils.py", line 107, in close
    CloseHandle(self._handle)
OSError: [WinError 6] The handle is invalid
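The effect on the transport's bookkeeping can be illustrated without Windows. In this sketch FakeSock, FakeProtocol and FakeTransport are hypothetical stand-ins. Only the order of statements in the finally block is modelled on _call_connection_lost, and the shutdown() guard and server detach are omitted.

```python
class FakeSock:
    """Hypothetical pipe object whose close() fails like a stale PipeHandle."""

    def close(self):
        raise OSError("[simulated WinError 6] The handle is invalid")


class FakeProtocol:
    def __init__(self):
        self.lost_calls = 0

    def connection_lost(self, exc):
        self.lost_calls += 1


class FakeTransport:
    """Models only the statement order of _call_connection_lost."""

    def __init__(self, sock, protocol):
        self._sock = sock
        self._protocol = protocol
        self._called_connection_lost = False

    def _call_connection_lost(self, exc):
        if self._called_connection_lost:
            return
        try:
            self._protocol.connection_lost(exc)
        finally:
            self._sock.close()                   # raises
            self._sock = None                    # skipped
            self._called_connection_lost = True  # skipped


protocol = FakeProtocol()
transport = FakeTransport(FakeSock(), protocol)
try:
    transport._call_connection_lost(None)
except OSError:
    pass

print(transport._sock is not None)        # True
print(transport._called_connection_lost)  # False
print(protocol.lost_calls)                # 1
```

In this model the protocol has already been notified, yet the transport still holds its pipe object and still reports that connection_lost was never called.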

Why this surfaces in production

In production we have seen roughly 20 occurrences across many users running a FastAPI app that spawns ffmpeg subprocesses with stdout=PIPE / stderr=PIPE, uses proc.communicate(), and runs the event loop in a non-main thread. Breadcrumbs consistently show back-to-back subprocess spawns (e.g., two ffmpeg -list_devices queries within 600ms), suggesting that high-churn subprocess traffic naturally widens the race window.

The second close() call is not from user code in those cases — it must come from asyncio itself. The likely path is _ProactorBasePipeTransport.__del__ (Lib/asyncio/proactor_events.py:114-117), which after PR gh-92842 calls self._sock.close() directly without synchronization with the loop's scheduled _call_connection_lost. If the transport is GC'd while _call_connection_lost is dispatched but not yet run (or is in-flight across the GIL boundary at step (B) of PipeHandle.close()), the finalizer races.
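Until the close path itself is fixed, an affected application could keep these events out of its error reporting with a loop exception handler. This is only a sketch of a possible workaround, not part of the proposed change. quiet_winerror6 and demo are hypothetical names, and the filter assumes the context dict carries the OSError under the "exception" key, as the on_exc handler in the end-to-end reproducer does. Anything else is passed on to loop.default_exception_handler.

```python
import asyncio

SUPPRESSED = []


def quiet_winerror6(loop, context):
    """Swallow WinError 6 reports; delegate everything else unchanged."""
    exc = context.get("exception")
    if isinstance(exc, OSError) and getattr(exc, "winerror", None) == 6:
        SUPPRESSED.append(context)
        return
    loop.default_exception_handler(context)


def demo():
    loop = asyncio.new_event_loop()
    try:
        loop.set_exception_handler(quiet_winerror6)
        # Simulated error: the attribute is set by hand here so the sketch
        # also runs off Windows, where OSError has no winerror by default.
        fake = OSError("The handle is invalid")
        fake.winerror = 6
        loop.call_exception_handler({
            "message": "simulated _call_connection_lost failure",
            "exception": fake,
        })
    finally:
        loop.close()
    return len(SUPPRESSED)


suppressed_count = demo()
print(suppressed_count)
```

In a real application the handler would be installed once with loop.set_exception_handler() on the loop that runs the subprocesses. It hides the symptom only. The transport state left behind by the failed close is unchanged.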

Proposed fix

Make PipeHandle.close() idempotent by catching OSError and always clearing _handle:

# Lib/asyncio/windows_utils.py
def close(self, *, CloseHandle=_winapi.CloseHandle):
    if self._handle is not None:
        try:
            CloseHandle(self._handle)
        except OSError:
            # Handle already closed by another code path; nothing to recover.
            pass
        finally:
            self._handle = None

Alternative: catch OSError in _ProactorBasePipeTransport._call_connection_lost around the self._sock.close() call, since the handle is already effectively gone and there is nothing to recover. This follows the same pattern as the shutdown()/hasattr guard immediately above it at lines 164-165.

# Lib/asyncio/proactor_events.py
def _call_connection_lost(self, exc):
    if self._called_connection_lost:
        return
    try:
        self._protocol.connection_lost(exc)
    finally:
        if hasattr(self._sock, 'shutdown') and self._sock.fileno() != -1:
            self._sock.shutdown(socket.SHUT_RDWR)
        try:
            self._sock.close()
        except OSError:
            pass
        self._sock = None
        ...
        self._called_connection_lost = True

Either fix resolves the repro above. The first is preferable because it localizes idempotency to the type that needs it (PipeHandle), matching how socket.socket.close() is already idempotent in CPython.
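To make the intended contract concrete, here is a platform-independent sketch of an idempotent close. IdempotentHandle and the injected close_handle callable are hypothetical stand-ins for PipeHandle and _winapi.CloseHandle. The sketch also clears the attribute before making the call, which is a variation and not part of the proposal above. It is meant to narrow the window in which a second caller still sees a non-None handle, though without a lock it does not strictly eliminate it.

```python
class IdempotentHandle:
    """Hypothetical stand-in for a PipeHandle with an idempotent close()."""

    def __init__(self, handle, close_handle):
        self._handle = handle
        self._close_handle = close_handle

    def close(self):
        # Take the value and clear the attribute first, so that a repeated
        # call observes None and returns without touching the handle.
        handle, self._handle = self._handle, None
        if handle is None:
            return
        try:
            self._close_handle(handle)
        except OSError:
            # The handle was already closed elsewhere; nothing to recover.
            pass


closed = []


def stale_close(handle):
    """Simulates CloseHandle on a handle that is already gone."""
    closed.append(handle)
    raise OSError("[simulated WinError 6] The handle is invalid")


ph = IdempotentHandle(100, stale_close)
ph.close()  # swallows the simulated error
ph.close()  # no-op: the attribute is already None
print(closed, ph._handle)  # [100] None
```

The underlying close is attempted once, and neither call raises.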

Suggested regression test

Lib/test/test_asyncio/test_windows_utils.py:

import unittest
import _winapi
from asyncio import windows_utils


class PipeHandleIdempotencyTest(unittest.TestCase):
    def test_close_is_idempotent_on_stale_handle(self):
        """PipeHandle.close() must not raise if the handle has been
        closed by another code path. Regression test for gh-NNNNNN.
        """
        r, w = windows_utils.pipe(overlapped=(False, False))
        # Close the write end even if an assertion below fails.
        self.addCleanup(_winapi.CloseHandle, w)
        ph = windows_utils.PipeHandle(r)
        handle_value = ph._handle
        # Close the underlying handle directly
        _winapi.CloseHandle(handle_value)
        # PipeHandle doesn't know the handle is gone; it should not raise
        ph.close()                  # must not raise
        self.assertIsNone(ph._handle)

cc @kumaraditya303

Labels: OS-windows, stdlib, topic-asyncio, type-bug
