[v2.0] Phase 2: Core Rust implementation #24

@fsecada01

Description

Implement the Rust splitting engine exposed to Python via PyO3.

Wiki: Phase 2 detail
Branch: feature/rust-backend

Tasks

  • 2.1 src/lib.rs: PyO3 module entry point; register CharacterTextSplitter and TokenTextSplitter
  • 2.2 src/splitters/character.rs: CharacterTextSplitter with split_text + split_texts (Rayon parallel); split_internal in a separate impl block (not exposed to Python)
  • 2.3 src/splitters/token.rs: TokenTextSplitter with token counting and chunking
  • 2.4 src/splitters/recursive.rs: RecursiveSplitter with configurable separator hierarchy
  • 2.5 src/utils.rs: shared Rust utilities (encoding helpers, etc.)
  • 2.6 cargo clippy --all-targets -- -D warnings passes; cargo fmt --check passes
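
As a rough illustration of task 2.4, here is a minimal sketch of splitting with a separator hierarchy, using only the standard library. The function name `recursive_split`, its parameters, and the choice to measure chunk size in characters are assumptions made for this example; the issue does not specify the real RecursiveSplitter API. The sketch tries each separator in order and only falls back to a hard cut when no separators remain. It does not merge small pieces back together, which a real implementation would likely do.

```rust
/// Hypothetical helper (not the project's API): split `text` into pieces of
/// at most `chunk_size` characters, trying each separator in order.
pub fn recursive_split(text: &str, separators: &[&str], chunk_size: usize) -> Vec<String> {
    assert!(chunk_size > 0, "chunk_size must be positive");

    // Base case: the text already fits (empty text yields no chunks).
    if text.chars().count() <= chunk_size {
        return if text.is_empty() {
            Vec::new()
        } else {
            vec![text.to_string()]
        };
    }

    match separators.split_first() {
        // No separators left: hard-cut on character boundaries.
        None => {
            let chars: Vec<char> = text.chars().collect();
            chars
                .chunks(chunk_size)
                .map(|c| c.iter().collect::<String>())
                .collect()
        }
        // Split on the current separator, then recurse into each piece
        // with the remaining, finer-grained separators.
        Some((sep, rest)) => text
            .split(*sep)
            .filter(|piece| !piece.is_empty())
            .flat_map(|piece| recursive_split(piece, rest, chunk_size))
            .collect(),
    }
}
```

Calling `recursive_split("aa bb\n\ncc dd ee", &["\n\n", " "], 5)` keeps the first paragraph whole because it fits, and splits the second paragraph on spaces because it does not.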

Key constraint

split_internal must live in a separate impl block (no #[pymethods]) so PyO3 does not expose it to Python.
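
The two-block layout can be sketched as follows. To keep the example compilable with the standard library alone, the PyO3 attributes appear only as comments, and the sequential iterator in `split_texts` stands in for a Rayon `par_iter()`. The fields `chunk_size` and `chunk_overlap` and the fixed-window chunking logic are assumptions for illustration, not the project's actual design.

```rust
// Sketch only: PyO3 attributes are shown as comments so this compiles
// without the pyo3 or rayon crates.

// #[pyclass]
pub struct CharacterTextSplitter {
    chunk_size: usize,    // hypothetical: maximum characters per chunk
    chunk_overlap: usize, // hypothetical: characters shared by neighbours
}

// In the real crate this block would carry #[pymethods]; every method in
// such a block becomes callable from Python.
impl CharacterTextSplitter {
    // #[new]
    pub fn new(chunk_size: usize, chunk_overlap: usize) -> Self {
        assert!(chunk_size > 0, "chunk_size must be positive");
        assert!(chunk_overlap < chunk_size, "overlap must be below chunk_size");
        Self { chunk_size, chunk_overlap }
    }

    pub fn split_text(&self, text: &str) -> Vec<String> {
        self.split_internal(text)
    }

    pub fn split_texts(&self, texts: Vec<String>) -> Vec<Vec<String>> {
        // With Rayon this would be `texts.par_iter()`; a sequential
        // iterator keeps the sketch dependency-free.
        texts.iter().map(|t| self.split_internal(t)).collect()
    }
}

// Plain impl block with no #[pymethods]: PyO3 generates no Python binding
// for anything defined here, so split_internal stays Rust-only.
impl CharacterTextSplitter {
    fn split_internal(&self, text: &str) -> Vec<String> {
        let chars: Vec<char> = text.chars().collect();
        let step = self.chunk_size - self.chunk_overlap;
        let mut chunks: Vec<String> = Vec::new();
        let mut start = 0;
        while start < chars.len() {
            let end = (start + self.chunk_size).min(chars.len());
            chunks.push(chars[start..end].iter().collect());
            if end == chars.len() {
                break; // final window reached the end of the text
            }
            start += step;
        }
        chunks
    }
}
```

Keeping the helper in its own block also means its signature is free to use Rust-only types later without having to satisfy PyO3's argument and return conversions.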

Metadata

    Labels

    enhancement (New feature or request), rust (Rust implementation work), tracking (Parent tracking issue with sub-tasks), v2.0 (TextSpitter v2.0 Rust backend)
