Ensure Rust and Python backends produce identical output across all inputs. Quantify the performance gain.

**Wiki:** [Phase 5 detail](https://github.com/fsecada01/TextSpitter/wiki/TextSpitter-2.0-Rust-Roadmap#phase-5-testing-strategy)
**Branch:** `feature/rust-backend`

## Tasks

- [ ] **5.1** `tests/test_parity.py`: happy-path parity, where `split_text` and `split_texts` produce identical output from both backends
- [ ] **5.2** Edge case parity tests: empty string, no separator present, separator-only input, single chunk exceeding `chunk_size`
- [ ] **5.3** `tests/bench_splitting.py`: benchmark script comparing the Python baseline against the Rust batch path at 10,000 docs; prints the speedup factor
- [ ] **5.4** All parity tests pass in CI (both `use_rust=True` and `use_rust=False` paths exercised)

## Expected benchmark result

| Backend | Time   | Speedup |
|---------|--------|---------|
| Python  | ~45s   | 1x      |
| Rust    | ~1.2s  | ~37x    |
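
One possible shape for the parity checks in tasks 5.1 and 5.2 is sketched below. It is a minimal illustration, not the project's test file: `backend_a`, `backend_b`, `buggy_backend`, `CASES`, and `find_mismatches` are all hypothetical stand-ins, and the sample inputs are assumptions. In the real `tests/test_parity.py` the two callables would wrap `split_text` with `use_rust=False` and `use_rust=True`.

```python
from typing import Callable, Dict, List

Splitter = Callable[[str], List[str]]

# One happy-path input plus the edge cases listed in task 5.2.
# The concrete strings are illustrative assumptions.
CASES: Dict[str, str] = {
    "happy_path": "alpha\nbeta\ngamma",
    "empty_string": "",
    "no_separator": "alphabetagamma",
    "separator_only": "\n\n\n",
    "oversized_chunk": "x" * 50,  # one unbroken run, meant to exceed chunk_size
}


def find_mismatches(split_a: Splitter, split_b: Splitter,
                    cases: Dict[str, str]) -> List[str]:
    """Return the names of the cases where the two backends disagree."""
    return [name for name, text in cases.items()
            if split_a(text) != split_b(text)]


# Hypothetical stand-ins for the two backends. They are implemented
# differently on purpose, but should agree on every case above.
def backend_a(text: str) -> List[str]:
    return [piece for piece in text.split("\n") if piece]


def backend_b(text: str) -> List[str]:
    return [line for line in text.splitlines() if line]


# A deliberately divergent stand-in: it keeps empty pieces, so it
# disagrees on exactly the inputs where empty pieces can appear.
def buggy_backend(text: str) -> List[str]:
    return text.split("\n")


if __name__ == "__main__":
    print(find_mismatches(backend_a, backend_b, CASES))
    print(find_mismatches(backend_a, buggy_backend, CASES))
```

Returning the names of the failing cases, rather than asserting on the first difference, makes it easier to see whether a divergence is confined to one edge case or spread across several.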
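
The speedup column is the baseline time divided by the candidate time (45 s / 1.2 s = 37.5, listed as ~37x). A benchmark script along the lines of task 5.3 could be sketched as follows. It is only an outline under stated assumptions: `time_batch`, `speedup`, `format_row`, and the two stand-in batch functions are hypothetical, and the synthetic documents are not the project's benchmark corpus.

```python
import time
from typing import Callable, List, Sequence

BatchSplitter = Callable[[Sequence[str]], List[List[str]]]


def time_batch(split_batch: BatchSplitter, docs: Sequence[str]) -> float:
    """Return wall-clock seconds taken to split every document once."""
    start = time.perf_counter()
    split_batch(docs)
    return time.perf_counter() - start


def speedup(baseline_s: float, candidate_s: float) -> float:
    """Speedup factor: how many times faster the candidate is."""
    return baseline_s / candidate_s


def format_row(name: str, seconds: float, factor: float) -> str:
    """Format one result as a markdown table row."""
    return f"| {name} | {seconds:.2f}s | {factor:.1f}x |"


# Hypothetical stand-ins for the two batch paths; the real script would
# call split_texts with use_rust=False and use_rust=True instead.
def baseline_batch(docs: Sequence[str]) -> List[List[str]]:
    results = []
    for doc in docs:
        results.append(doc.split("\n"))
    return results


def candidate_batch(docs: Sequence[str]) -> List[List[str]]:
    return [doc.split("\n") for doc in docs]


if __name__ == "__main__":
    docs = ["lorem ipsum\n" * 20] * 10_000  # synthetic 10,000-doc batch
    base_s = time_batch(baseline_batch, docs)
    cand_s = time_batch(candidate_batch, docs)
    print(format_row("Baseline", base_s, 1.0))
    print(format_row("Candidate", cand_s, speedup(base_s, cand_s)))
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock intended for measuring short durations. A single run is noisy, so a real benchmark would likely repeat each measurement and report the minimum or median.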