feat: smooth streaming mode for TUI response rendering #281
Merged
anandgupta42 merged 6 commits into main on Mar 19, 2026
Conversation
…response rendering

During LLM streaming, the `<markdown>` element re-lays out block elements on every token delta, causing visible text jumps and jerky scrolling. This adds an opt-in `ALTIMATE_SMOOTH_STREAMING` feature flag with four optimizations:

- Use `<code filetype="markdown">` during streaming (no layout-shifting blocks), swap to `<markdown>` after message completion for rich rendering
- Pre-merge consecutive `message.part.delta` events in the SDK 16ms batch window (N tokens → 1 store update per part per frame)
- Replace `produce()` with direct store path updates on the delta hot path
- Reduce `toBottom()` scroll delay from 50ms to 0ms
- Memoize `trim()` in `TextPart` (unconditional, no flag needed)

Enable with: `ALTIMATE_SMOOTH_STREAMING=true`
Comment on lines +64 to +74
```ts
      const key = `${props.messageID}:${props.partID}:${props.field}`
      const existing = deltaMap.get(key)
      if (existing !== undefined) {
        const prev = merged[existing] as typeof event
        ;(prev.properties as typeof props).delta += props.delta
        continue
      }
      deltaMap.set(key, merged.length)
    }
    merged.push(event)
  }
```
Comment on lines 1486 to 1496
```diff
 <markdown syntaxStyle={syntax()} streaming={false} content={trimmed()} conceal={ctx.conceal()} />
 </Match>
 <Match when={!Flag.OPENCODE_EXPERIMENTAL_MARKDOWN}>
   <code
     filetype="markdown"
     drawUnstyledText={false}
-    streaming={true}
+    streaming={false}
     syntaxStyle={syntax()}
-    content={props.part.text.trim()}
+    content={trimmed()}
     conceal={ctx.conceal()}
     fg={theme.text}
```
…ssion

- Clear `deltaMap` on non-delta events to preserve causal ordering (prevents folding deltas across intervening `message.part.updated` events)
- Clone event objects instead of mutating in-place during delta merge
- Restore dynamic `streaming` prop in fallback `<markdown>`/`<code>` blocks (`!props.message.time.completed`) to avoid regression for non-opt-in users
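Putting the two merge fixes together, the batch-window merge pass can be sketched as a standalone function. This is a simplified sketch: the event and property shapes, and the names `MergedEvent` and `mergeDeltas`, are assumptions for illustration, not the SDK's actual types.

```typescript
// Simplified event shape for the sketch (assumed, not the SDK's real type).
type MergedEvent = {
  type: string
  properties: { messageID: string; partID: string; field: string; delta: string }
}

function mergeDeltas(events: MergedEvent[]): MergedEvent[] {
  const merged: MergedEvent[] = []
  // Maps "messageID:partID:field" -> index of the merged delta event.
  const deltaMap = new Map<string, number>()
  for (const event of events) {
    if (event.type === "message.part.delta") {
      const props = event.properties
      const key = `${props.messageID}:${props.partID}:${props.field}`
      const existing = deltaMap.get(key)
      if (existing !== undefined) {
        // Clone rather than mutate in place, so callers holding the original
        // event object never observe a changed delta.
        const prev = merged[existing]
        merged[existing] = {
          ...prev,
          properties: { ...prev.properties, delta: prev.properties.delta + props.delta },
        }
        continue
      }
      deltaMap.set(key, merged.length)
    } else {
      // A non-delta event acts as a barrier: clearing the map prevents
      // folding deltas across an intervening `message.part.updated`.
      deltaMap.clear()
    }
    merged.push(event)
  }
  return merged
}
```

With this shape, N consecutive deltas for the same part+field collapse to one event, while any interleaved non-delta event keeps deltas on either side of it separate.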
…xperience

Combines three streaming optimizations under one flag:

- **Smooth streaming** (`ALTIMATE_SMOOTH_STREAMING`): renders with `<code>` during streaming to avoid markdown layout jumps, swaps to `<markdown>` after message completion
- **Line buffering** (`ALTIMATE_LINE_STREAMING`): buffers deltas and flushes only on `\n` (complete lines). Remaining text flushes on message completion. No partial lines ever appear.
- **Width cap** (`ALTIMATE_CONTENT_MAX_WIDTH`): caps text at 100 columns for readability. Automatically disabled on small screens where the cap would exceed available width.

`ALTIMATE_CALM_MODE=true` enables all three with sensible defaults. Individual flags still work independently for fine-grained control.

Includes 38 unit tests covering delta merging, line buffering, flag composition, and width capping edge cases (small screens, empty buffers, consecutive newlines, cross-message isolation).
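The line-buffering behavior described above can be sketched as a small per-part buffer that releases only complete lines. `LineBuffer` and its method names are illustrative stand-ins, not the PR's actual implementation.

```typescript
// Minimal sketch of line buffering: deltas accumulate per part, and only
// text up to (and including) the last "\n" is flushed to the renderer.
class LineBuffer {
  private buffers = new Map<string, string>()

  // Returns the text to render now (complete lines only); buffers the rest.
  push(partKey: string, delta: string): string {
    const pending = (this.buffers.get(partKey) ?? "") + delta
    const lastNewline = pending.lastIndexOf("\n")
    if (lastNewline === -1) {
      // No complete line yet: hold everything, render nothing.
      this.buffers.set(partKey, pending)
      return ""
    }
    // Flush through the last newline; keep the trailing partial line.
    this.buffers.set(partKey, pending.slice(lastNewline + 1))
    return pending.slice(0, lastNewline + 1)
  }

  // On message completion (or abort cleanup), flush whatever remains.
  flush(partKey: string): string {
    const rest = this.buffers.get(partKey) ?? ""
    this.buffers.delete(partKey)
    return rest
  }
}
```

Because `push` never returns a partial line, the TUI only ever repaints on line boundaries, which is what keeps mid-line reflow out of the render path.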
- Flush/delete line buffer entries in the `message.removed` handler to prevent memory leaks when messages are aborted or removed without `time.completed`
- Add clarifying comment explaining the line streaming / smooth streaming interaction (the line streaming branch handles its own direct store updates)
- Add a "Calm Mode Quick Start" section to the CLI docs with usage examples
- Update the `ALTIMATE_LINE_STREAMING` doc to mention abort cleanup
…removed`

When streaming ends, `message.part.updated` writes the full final text via `reconcile()`. Without clearing the line buffer first, `flushAllBuffersForMessage` on `message.updated` would append the remaining buffered text on top of the already-complete content, duplicating the trailing partial line.

Fix: discard all line buffer entries for a part when `message.part.updated` fires (the server's content is authoritative). Also clear on `message.part.removed` to prevent orphaned buffer entries.
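The discard-vs-flush distinction can be illustrated with a minimal handler pair. All names and shapes here are hypothetical (the real code operates on the Solid store); the sketch only shows why `message.part.updated` must drop, not flush, the buffered tail.

```typescript
// "messageID:partID" -> unflushed partial line held back by line buffering.
const lineBuffers = new Map<string, string>()

// message.part.updated: the server's full text is authoritative, so the
// buffered tail is discarded; flushing it would duplicate the trailing line.
function onPartUpdated(partKey: string, fullText: string, parts: Map<string, string>) {
  lineBuffers.delete(partKey) // discard: fullText already contains the tail
  parts.set(partKey, fullText)
}

// Message completed without a final part.updated: flush leftovers instead.
function onMessageCompleted(messageID: string, parts: Map<string, string>) {
  for (const [key, tail] of lineBuffers) {
    if (!key.startsWith(`${messageID}:`)) continue
    parts.set(key, (parts.get(key) ?? "") + tail)
    lineBuffers.delete(key)
  }
}
```

Swapping the delete for a flush in `onPartUpdated` reproduces the duplication bug the commit describes: the tail would be appended after text that already includes it.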
What does this PR do?
Adds an opt-in `ALTIMATE_SMOOTH_STREAMING` feature flag that significantly improves TUI response rendering smoothness during LLM streaming. When enabled:

- **`<code>` during streaming, `<markdown>` after completion.** The `<markdown>` element re-lays out block elements (headers, code blocks, lists) on every token delta, causing visible text jumps and jerky scrolling. The `<code>` element does syntax coloring without block-level layout shifts, then swaps to rich `<markdown>` rendering once the message finishes.
- **Delta merging.** Consecutive `message.part.delta` events for the same part+field are merged within the 16ms batch window, reducing store updates from N-per-part to 1-per-part per frame.
- **Direct store mutation.** Replaces the `produce()` proxy with direct `setStore()` path mutation on the delta hot path, avoiding Immer-style proxy creation on every token.
- **Faster scroll.** Reduces the `toBottom()` delay from 50ms to 0ms.
- **Memoized `trim()`.** `TextPart` now uses `createMemo()` for `trim()` so it runs once per text change instead of 3x per reactive read (unconditional, always active).

Enable with: `ALTIMATE_SMOOTH_STREAMING=true` or `OPENCODE_SMOOTH_STREAMING=true`

Type of change
Issue for this PR
Closes #280
How did you verify your code works?
- `bun run build:local` and tested interactively
- Typecheck passes (`bun turbo typecheck`)

Checklist
- All changes are gated behind `ALTIMATE_SMOOTH_STREAMING` (except the unconditional `trim()` memoization, which is a pure perf improvement)
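The unconditional `trim()` memoization mentioned in the checklist can be illustrated outside Solid with a plain cache. In the real component this is what `createMemo()` provides; `memoTrim` below is just a sketch under that assumption.

```typescript
// Cache the trimmed string and recompute only when the source text changes,
// instead of calling trim() on every reactive read.
function memoTrim(getText: () => string): () => string {
  let lastInput: string | undefined
  let lastOutput = ""
  return () => {
    const input = getText()
    if (input !== lastInput) {
      lastInput = input
      lastOutput = input.trim() // runs once per text change, not per read
    }
    return lastOutput
  }
}
```

During streaming the text changes once per frame but may be read several times per render, so the cache turns repeated `trim()` calls into a single one per change.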