
[fal.ai/livepeer-staging] Media input loop fails with TransferEncodingError 400 — 'Not enough data to satisfy transfer length header' #805

@livepeer-tessa


Summary

The media input loop and the control channel subscription both crash with TransferEncodingError: 400 (reported by aiohttp as ClientPayloadError) when the trickle connection to the orchestrator fails mid-stream. The worker then attempts a clean disconnect, but its cleanup requests to the orchestrator host fail with ConnectionRefusedError.

Errors

2026-04-02 00:08:00,317 - scope.cloud.livepeer_app - ERROR - Media input loop failed: Response payload is not completed: <TransferEncodingError: 400, message='Not enough data to satisfy transfer length header.'>

2026-04-02 00:08:01,605 - scope.cloud.livepeer_app - ERROR - Control channel subscription error: Trickle JSONL subscription error: ClientPayloadError: Response payload is not completed: <TransferEncodingError: 400, message='Not enough data to satisfy transfer length header.'>

Then cascading connection errors during cleanup:

aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host 34.169.235.70:8935 ssl:default [Connect call failed ('34.169.235.70', 8935)]
ConnectionRefusedError: [Errno 111] Connect call failed ('34.169.235.70', 8935)

Log Context

  • Timestamp: 2026-04-02 at 00:07:59 – 00:08:02 UTC
  • fal_job_id: 89db37ff-*
  • Orchestrator host: 34.169.235.70:8935 (direct IP, not staging-1.daydream.monster)
  • Also seen earlier: 2026-04-01 23:03:55 – 23:03:57 UTC

Relationship to Existing Issues

Impact

The worker terminates the session uncleanly. The trickle DELETE calls also fail with ClientConnectorError, which suggests the orchestrator went down or was restarted while the session was active.

Possible Fix

In livepeer_app.py (and trickle_subscriber.py):

  • Treat TransferEncodingError (surfaced as ClientPayloadError in the logs above) as a network-level disconnect rather than an application error: catch it alongside EOFError in the input loop's error handling and shut down gracefully instead of propagating it as an error
  • Skip the trickle DELETE/POST cleanup when the host is already unreachable, either by pre-checking connectivity or by suppressing ClientConnectorError during teardown
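A minimal sketch of both ideas, using only the standard library so it runs on its own. The function names (`run_input_loop`, `best_effort_teardown`), the `read_chunk` / `handle_chunk` callables, and the timeout value are hypothetical and are not taken from livepeer_app.py or trickle_subscriber.py. In the real worker the disconnect tuple would presumably also include `aiohttp.ClientPayloadError`, which is how the TransferEncodingError appears in the logs above.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Sequence, Tuple, Type

log = logging.getLogger(__name__)

# Errors that mean "the peer went away". EOFError is named in the issue;
# ConnectionError covers ConnectionResetError / ConnectionRefusedError.
# Assumption: the real code would add aiohttp.ClientPayloadError here.
DISCONNECT_ERRORS: Tuple[Type[BaseException], ...] = (EOFError, ConnectionError)


async def run_input_loop(
    read_chunk: Callable[[], Awaitable[bytes]],
    handle_chunk: Callable[[bytes], None],
    disconnect_errors: Tuple[Type[BaseException], ...] = DISCONNECT_ERRORS,
) -> str:
    """Pump chunks until the stream ends; return why the loop stopped."""
    try:
        while True:
            chunk = await read_chunk()
            if not chunk:
                return "eof"
            handle_chunk(chunk)
    except disconnect_errors as exc:
        # Network-level disconnect: log a warning and stop gracefully
        # instead of propagating the exception as an application error.
        log.warning("Input stream disconnected: %r", exc)
        return "disconnected"


async def best_effort_teardown(
    cleanup_calls: Sequence[Callable[[], Awaitable[None]]],
    timeout_s: float = 2.0,
) -> int:
    """Run each cleanup call, suppressing connection failures.

    Returns the number of calls that failed because the host was
    unreachable or too slow to answer.
    """
    failures = 0
    for call in cleanup_calls:
        try:
            await asyncio.wait_for(call(), timeout=timeout_s)
        except (OSError, asyncio.TimeoutError) as exc:
            # ConnectionRefusedError is an OSError; aiohttp's
            # ClientConnectorError also derives from OSError.
            failures += 1
            log.info("Skipping cleanup call, host unreachable: %r", exc)
    return failures
```

Passing the exception tuple as a parameter keeps the loop independent of the HTTP client, so the same handler could serve both the media input loop and the control channel subscription. Suppressing errors during teardown, rather than pre-checking the host, avoids an extra connection attempt against an orchestrator that is already down.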

Metadata

Labels: bug
