Explore using PyBytesWriter API for compression libraries output buffers #139877

# Feature or enhancement

The new `PyBytesWriter()` API is fast and easy to use. I expect it would improve both the maintainability and the speed of output buffer management in the compression modules.

I have some perf recordings showing that a large portion (>50%!) of time in decompression for a mix of data sizes (1K, 1M, 1G) is in `_BlocksOutputBuffer_Finish`, re-assembling the output buffer.
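To make the re-assembly cost concrete, here is a toy Python model of the two strategies. It is not the actual C code, and `BlockBuffer` and `SingleBuffer` are hypothetical names used only for illustration. The first collects output in separate blocks and copies them all into a new object at the end; the second grows one contiguous buffer and only trims it when finishing.

```python
class BlockBuffer:
    """Toy model: output accumulates in separate blocks, joined at finish."""

    def __init__(self):
        self.blocks = []

    def write(self, chunk: bytes) -> None:
        self.blocks.append(chunk)

    def finish(self) -> bytes:
        # Re-assembly step: every block is copied once more into the result.
        return b"".join(self.blocks)


class SingleBuffer:
    """Toy model: one contiguous buffer that is grown in place."""

    def __init__(self, initial_size: int = 32):
        self.buf = bytearray(initial_size)
        self.used = 0

    def write(self, chunk: bytes) -> None:
        needed = self.used + len(chunk)
        if needed > len(self.buf):
            # Grow geometrically so repeated writes stay cheap on average.
            new_size = max(needed, 2 * len(self.buf))
            self.buf.extend(bytes(new_size - len(self.buf)))
        self.buf[self.used:needed] = chunk
        self.used = needed

    def finish(self) -> bytearray:
        # Trim the unused tail; no per-block copy is needed here.
        del self.buf[self.used:]
        return self.buf
```

Both classes produce the same bytes; the difference is only in where the copying happens. This model returns the trimmed `bytearray` so that finishing does not add a second full copy.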

I also made a very hacky modification to `pycore_blocks_output_buffer.h` to use `PyBytesWriter()` and found that it greatly reduced decompression time.

The two tests below decompress enwiki content compressed with zstd:

| test | main | PyBytesWriter() |
|--------|--------|--------|
| decompress 1M | 2.15ms | 1.65ms |
| decompress 1G | 2.2s | 1.73s |

That is roughly 1.3x faster decompression in both cases (about 21-23% less wall-clock time)!

I think this is enough to motivate a refactor of this code to use `PyBytesWriter()` and benchmark against the current implementation across compression modules and data sizes.
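A minimal sketch of what such a benchmark could look like from the Python side, to be run once on `main` and once on a patched build. Everything here is an assumption for illustration: the payload is synthetic rather than enwiki, the helper names (`make_payload`, `best_time`, `run`) are hypothetical, and `compression.zstd` is only included if it is importable.

```python
import bz2
import lzma
import os
import time
import zlib


def make_payload(size: int) -> bytes:
    """Build compressible synthetic data (an assumption; not enwiki)."""
    unit = os.urandom(64) + b"the quick brown fox jumps over the lazy dog " * 8
    reps = size // len(unit) + 1
    return (unit * reps)[:size]


def best_time(func, arg, repeat: int = 5) -> float:
    """Return the best wall-clock time in seconds over `repeat` calls."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        func(arg)
        best = min(best, time.perf_counter() - start)
    return best


# One-shot compress/decompress functions from the stdlib modules.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}

try:
    # Only available on builds that ship the zstd module.
    from compression import zstd
    CODECS["zstd"] = (zstd.compress, zstd.decompress)
except ImportError:
    pass


def run(sizes=(1 << 10, 1 << 20)) -> dict:
    """Time decompression for every codec and payload size."""
    results = {}
    for name, (compress, decompress) in CODECS.items():
        for size in sizes:
            data = make_payload(size)
            blob = compress(data)
            assert decompress(blob) == data  # sanity check before timing
            results[(name, size)] = best_time(decompress, blob)
    return results


if __name__ == "__main__":
    for (name, size), seconds in run().items():
        print(f"{name:5s} {size:>10d} bytes  {seconds * 1e3:9.3f} ms")
```

Taking the best of several runs reduces noise from the system; a 1G payload can be added to `sizes` when memory allows. For publishable numbers a dedicated tool such as `pyperf` would be a better fit than this hand-rolled timer.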

cc @vstinner for viz


### Linked PRs
* gh-139976

