Implement PyTorch RLDS data loader with distributed support#920

Open
tahsinkose wants to merge 1 commit into Physical-Intelligence:main from tahsinkose:feat/pytorch-rlds-dataloader
Conversation

@tahsinkose

Summary

Implements the PyTorch RLDS data loader, replacing the NotImplementedError stub in
create_rlds_data_loader. This enables PyTorch DDP training on RLDS-format datasets (e.g.,
DROID).

Design

  • Rank-0-only pipeline: Only rank 0 builds the heavy TF-based RLDS pipeline; other ranks
    receive batches via torch.distributed.broadcast to avoid redundant resource usage.
  • Broadcast + shard: Each full batch is broadcast from rank 0, then sliced so every rank gets
    batch_size // world_size samples.
  • Exhaustion sync: A broadcast flag ensures all ranks break together, preventing deadlocks on
    collectives.
  • Single-GPU: When torch.distributed is not initialized, the loader falls back to a plain
    pass-through with no collective-communication overhead.
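The four design points above can be sketched roughly as follows. This is an illustrative outline, not the PR's actual code: the function name `distributed_batches` and the `feature_shape` parameter (used so non-zero ranks can allocate receive buffers of the right shape) are assumptions for the sketch.

```python
import torch
import torch.distributed as dist


def distributed_batches(rlds_iter, batch_size, feature_shape, device="cpu"):
    """Yield per-rank shards of batches produced only on rank 0.

    Sketch of the rank-0 broadcast + shard scheme described in the PR;
    names and signature are illustrative, not the merged implementation.
    """
    if not (dist.is_available() and dist.is_initialized()):
        # Single-GPU / single-process: plain pass-through, no collectives.
        yield from rlds_iter
        return

    rank, world_size = dist.get_rank(), dist.get_world_size()
    shard = batch_size // world_size

    while True:
        # Exhaustion sync: rank 0 checks whether a next batch exists and
        # broadcasts a continue/stop flag so all ranks break together,
        # avoiding deadlocks on the collectives below.
        if rank == 0:
            batch = next(rlds_iter, None)
            flag = torch.tensor([batch is not None], dtype=torch.uint8, device=device)
        else:
            batch = None
            flag = torch.empty(1, dtype=torch.uint8, device=device)
        dist.broadcast(flag, src=0)
        if not flag.item():
            break

        # Broadcast the full batch from rank 0, then slice out this
        # rank's contiguous shard of batch_size // world_size samples.
        if rank != 0:
            batch = torch.empty((batch_size, *feature_shape), device=device)
        dist.broadcast(batch, src=0)
        yield batch[rank * shard : (rank + 1) * shard]
```

In the single-process case the generator never touches `torch.distributed`, which matches the stated zero-overhead fallback; under DDP, only rank 0 pays the cost of the TF-based RLDS pipeline.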

