v0.0.4: Session export, reformulation fix, settings defaults, and UI improvements #3

Merged: 9bow merged 12 commits into main from develop on Feb 28, 2026

9bow (Owner) commented on Feb 21, 2026

Summary

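  • Session history export: added an Export… button to the session detail screen; it saves a .txt file containing timestamps, recognition, and translation via NSSavePanel
  • Reformulation duplication bug fix: fixed an issue where SFSpeechRecognizer, when reformulating, made already-finalized text reappear as a suffix. Adds suffix detection logic plus a stripLeadingPunctuation helper that removes leading punctuation
  • Default value updates: font 20pt, opacity 50%, Expire 20s, Speech Pause 3s, Split mode, Overlay unlock
  • Live Preview improvement: the preview in the settings screen now branches on Combined/Split mode
  • README flag emoji: added 🇬🇧 🇰🇷 🇨🇳 🇯🇵 to the language navigation links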

Changes

Bug Fix

  • AppState.swift: in the reformulation else-branch, detect the case where currentText contains lastConsumedPartial as a suffix, remove it from liveText, and update lastConsumedPartial
  • AppState.swift: add a stripLeadingPunctuation() helper, applied to liveText and to every subtitle entry commit

Features

  • SessionHistoryView.swift: Export… button, NSSavePanel, sessionText() formatter
  • UserSettings.swift: 5 default values updated
  • FontSettingsView.swift: Live Preview that branches by Split/Combined mode
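
As a rough illustration of the export item above, here is a minimal sketch of a plain-text formatter and file-name builder. The `ExportEntry` type, its fields, and the line layout are assumptions made for illustration; the real `RecordedSession` model and `sessionText()` output format are not shown in this PR. Only the `OST-` prefix and the replacement of `:` with `-` in the file name come from the reviewed code.

```swift
import Foundation

/// Hypothetical entry shape (assumption): one recognized line plus its translation.
struct ExportEntry {
    let timestamp: String
    let original: String
    let translation: String
}

/// Builds the suggested file name; ":" is replaced because it is awkward in file names.
func exportFileName(formattedDate: String) -> String {
    "OST-\(formattedDate.replacingOccurrences(of: ":", with: "-")).txt"
}

/// Illustrative layout (assumption): timestamp and original text on one line,
/// the translation indented on the next, entries separated by a blank line.
func sessionText(_ entries: [ExportEntry]) -> String {
    entries
        .map { "[\($0.timestamp)] \($0.original)\n    \($0.translation)" }
        .joined(separator: "\n\n")
}
```

Calling `exportFileName(formattedDate: "2026-02-21 17:38:05")` yields `OST-2026-02-21 17-38-05.txt`.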

Docs

  • README*.md: added flag emoji to the 4 language files

Test Plan

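  • In Split mode, verify that liveText does not re-display already-finalized content when a reformulation occurs after speech recognition
  • Verify that no duplicate entries are created in the translation window
  • Verify that no subtitle entry starting with a comma appears
  • Verify that clicking Export… in Session History opens the save panel and that the .txt file is saved in the correct format
  • Verify that a fresh install defaults to Split mode, 20pt font, and 50% opacity
  • Verify that Settings > Display > Live Preview updates immediately when switching between Combined and Split
  • Verify that ./build.sh builds successfully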

🤖 Generated with Claude Code

9bow added 5 commits February 21, 2026 17:38
…pending new content

When SFSpeechRecognizer reformulates text by prepending new content before
already-consumed text, the previous code showed the full reformulated text
(including already-finalized content) as liveText.

- Detect when currentText ends with lastConsumedPartial (suffix match)
- Show only the new prefix portion as liveText and update lastConsumedPartial
- Add stripLeadingPunctuation() helper to remove leading commas/whitespace
  from liveText and all subtitle entry commits

Default values updated to match preferred configuration:
- Font size: 24pt -> 20pt
- Background opacity: 70% -> 50%
- Subtitle expiry: 10s -> 20s
- Speech pause: 2.0s -> 3.0s
- Display mode: combined -> split
- Overlay lock: true -> false (both windows)

FontSettingsView live preview now reflects the active display mode:
- Split mode: two side-by-side boxes (Recognition | Translation)
- Combined mode: single box with original and translated text

Temporarily lower the session window level to .normal before showing
the save panel, then restore the original level after it closes.

Copilot AI left a comment

Pull request overview

Adds session export and UI tweaks while addressing a speech recognizer reformulation duplication issue and updating first-run settings defaults.

Changes:

  • Adds session history export to .txt via NSSavePanel from the session detail view.
  • Improves reformulation handling by stripping already-consumed suffixes and introducing stripLeadingPunctuation to clean live/committed text.
  • Updates default user settings (font size/opacity/expiry/pause/split+unlock) and improves the settings live preview for split vs combined mode.

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| README.md | Adds flag icons to language nav and includes split-mode screenshot. |
| README.ko.md | Adds flag icons to language nav and includes split-mode screenshot. |
| README.ja.md | Adds flag icons to language nav and includes split-mode screenshot. |
| README.zh.md | Adds flag icons to language nav and includes split-mode screenshot. |
| OST/Sources/UI/SessionHistoryView.swift | Adds “Export…” button and export formatting/writing logic. |
| OST/Sources/UI/FontSettingsView.swift | Branches Live Preview UI based on split vs combined display mode. |
| OST/Sources/Settings/UserSettings.swift | Updates AppStorage defaults (font/opacity/expiry/pause/mode/locks). |
| OST/Sources/App/AppState.swift | Reformulation de-dup logic adjustments + leading punctuation stripping applied to live/committed text. |

Comment on lines +134 to +142
    private func exportSession(_ session: RecordedSession) {
        let panel = NSSavePanel()
        panel.allowedContentTypes = [.plainText]
        panel.nameFieldStringValue = "OST-\(session.formattedDate.replacingOccurrences(of: ":", with: "-")).txt"

        // Temporarily lower the session window level so the save panel appears in front
        // when "Always on top" (.floating) is active.
        let hostWindow = NSApp.keyWindow
        let originalLevel = hostWindow?.level

Copilot AI Feb 21, 2026

exportSession ignores file write failures (try?). If the write fails (permissions, disk full, sandbox denial), the user gets no feedback. Handle the write with do/catch and surface/log the error (e.g., AppLogger + an alert) so the export action is observable and debuggable.

Comment on lines +258 to +260
    // Update tracking so subsequent sink calls use this as the new base,
    // preventing repeated stripping of the same consumed suffix
    self.lastConsumedPartial = currentText

Copilot AI Feb 21, 2026

In the reformulation-suffix branch, lastConsumedPartial is set to the full currentText even though newPart is still being shown as live (unconsumed) text. On the next sink update (when currentText extends), the hasPrefix(lastConsumedPartial) path will drop everything up to this full value, which can cause newPart to disappear and never be committed. Consider keeping lastConsumedPartial representing only actually-consumed text and track the reformulated baseline separately (or avoid updating lastConsumedPartial here) so newPart remains visible/consumable until finalized.

Suggested change
-   // Update tracking so subsequent sink calls use this as the new base,
-   // preventing repeated stripping of the same consumed suffix
-   self.lastConsumedPartial = currentText
+   // Keep lastConsumedPartial representing only actually-consumed text;
+   // do not mark the newly prepended part as consumed yet.
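
The failure mode described in this comment can be reproduced with a small, self-contained model. This is only a sketch: `PartialTracker` and `overwriteOnSuffixMatch` are hypothetical names, not part of AppState.swift, and the model keeps just the prefix and suffix branches while omitting the common-prefix path.

```swift
import Foundation

/// Simplified, hypothetical model of the partial-result tracking under review.
struct PartialTracker {
    var lastConsumedPartial = ""
    var liveText = ""
    /// true mimics the reviewed code: the suffix branch overwrites
    /// lastConsumedPartial with the full reformulated text.
    let overwriteOnSuffixMatch: Bool

    mutating func receive(_ currentText: String) {
        if lastConsumedPartial.isEmpty || currentText.hasPrefix(lastConsumedPartial) {
            // Normal growth: everything after the consumed prefix is live.
            liveText = String(currentText.dropFirst(lastConsumedPartial.count))
                .trimmingCharacters(in: .whitespaces)
        } else if currentText.hasSuffix(lastConsumedPartial) {
            // Reformulation: new content was prepended before consumed text.
            liveText = String(currentText.dropLast(lastConsumedPartial.count))
                .trimmingCharacters(in: .whitespaces)
            if overwriteOnSuffixMatch {
                lastConsumedPartial = currentText
            }
        } else {
            // Unrelated text: reset tracking.
            lastConsumedPartial = ""
            liveText = currentText
        }
    }
}
```

Starting from `lastConsumedPartial = "thank you for coming"` with `overwriteOnSuffixMatch: true`, receiving "Well, thank you for coming" shows "Well," as live text, but the next update "Well, thank you for coming today" leaves only "today": the prepended "Well," is dropped without ever being committed. With `overwriteOnSuffixMatch: false`, `lastConsumedPartial` stays at the consumed text, so a later reformulation such as "Well then, thank you for coming" still matches the suffix branch and keeps the prepended part live.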

Comment on lines +476 to +480
    /// Strips leading punctuation characters (commas, periods, etc.) and whitespace from text.
    private func stripLeadingPunctuation(_ text: String) -> String {
        var result = text[...]
        while let first = result.unicodeScalars.first,
              CharacterSet.punctuationCharacters.union(.whitespaces).contains(first) {

Copilot AI Feb 21, 2026

stripLeadingPunctuation only strips CharacterSet.whitespaces, not newlines. Since upstream code trims with .whitespaces in some paths, liveText (and committed entries) can still start with \n/\r and render oddly. Use .whitespacesAndNewlines (and update the doc comment accordingly) to ensure all leading whitespace is removed.

Suggested change
-   /// Strips leading punctuation characters (commas, periods, etc.) and whitespace from text.
-   private func stripLeadingPunctuation(_ text: String) -> String {
-       var result = text[...]
-       while let first = result.unicodeScalars.first,
-             CharacterSet.punctuationCharacters.union(.whitespaces).contains(first) {
+   /// Strips leading punctuation characters (commas, periods, etc.) and all leading whitespace (including newlines) from text.
+   private func stripLeadingPunctuation(_ text: String) -> String {
+       var result = text[...]
+       while let first = result.unicodeScalars.first,
+             CharacterSet.punctuationCharacters.union(.whitespacesAndNewlines).contains(first) {
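
For reference, a complete standalone version of the helper might look like the sketch below. The reviewed snippet stops at the `while` header, so the stripping step here is an assumption; it uses `drop(while:)` on the unicode scalar view rather than the loop from the PR.

```swift
import Foundation

/// Strips leading punctuation and all leading whitespace (including newlines).
/// Assumption: how the PR's loop body advances is not shown; this sketch
/// drops unicode scalars from the front until one should be kept.
func stripLeadingPunctuation(_ text: String) -> String {
    let strippable = CharacterSet.punctuationCharacters.union(.whitespacesAndNewlines)
    let remaining = text.unicodeScalars.drop(while: { strippable.contains($0) })
    return String(String.UnicodeScalarView(remaining))
}
```

With this version, `", hello"` becomes `"hello"` and `"\n\r , world."` becomes `"world."`; trailing punctuation is left untouched.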

Comment on lines +190 to +222
    // Recognition window
    VStack(alignment: .leading, spacing: 2) {
        Text("Recognition")
            .font(.caption2)
            .foregroundColor(.secondary)
        Text("Hello, this is sample speech.")
            .font(.system(size: settings.fontSize))
            .foregroundColor(settings.fontColor)
            .fixedSize(horizontal: false, vertical: true)
    }
    .padding(10)
    .frame(maxWidth: .infinity, alignment: .leading)
    .background(
        RoundedRectangle(cornerRadius: 8)
            .fill(settings.backgroundColor.opacity(settings.backgroundOpacity))
    )

    // Translation window
    VStack(alignment: .leading, spacing: 2) {
        Text("Translation")
            .font(.caption2)
            .foregroundColor(.secondary)
        Text("안녕하세요, 샘플 음성입니다.")
            .font(.system(size: settings.translatedFontSize))
            .foregroundColor(settings.translatedFontColor)
            .fixedSize(horizontal: false, vertical: true)
    }
    .padding(10)
    .frame(maxWidth: .infinity, alignment: .leading)
    .background(
        RoundedRectangle(cornerRadius: 8)
            .fill(settings.backgroundColor.opacity(settings.backgroundOpacity))
    )

Copilot AI Feb 21, 2026

In split mode the Live Preview always renders both the recognition and translation samples, even if showOriginalText or showTranslation is toggled off. This makes the preview inconsistent with the visibility settings. Gate each preview pane (or its text content) on the corresponding toggle so the preview accurately reflects what users will see.

Suggested change
-   // Recognition window
-   VStack(alignment: .leading, spacing: 2) {
-       Text("Recognition")
-           .font(.caption2)
-           .foregroundColor(.secondary)
-       Text("Hello, this is sample speech.")
-           .font(.system(size: settings.fontSize))
-           .foregroundColor(settings.fontColor)
-           .fixedSize(horizontal: false, vertical: true)
-   }
-   .padding(10)
-   .frame(maxWidth: .infinity, alignment: .leading)
-   .background(
-       RoundedRectangle(cornerRadius: 8)
-           .fill(settings.backgroundColor.opacity(settings.backgroundOpacity))
-   )
-   // Translation window
-   VStack(alignment: .leading, spacing: 2) {
-       Text("Translation")
-           .font(.caption2)
-           .foregroundColor(.secondary)
-       Text("안녕하세요, 샘플 음성입니다.")
-           .font(.system(size: settings.translatedFontSize))
-           .foregroundColor(settings.translatedFontColor)
-           .fixedSize(horizontal: false, vertical: true)
-   }
-   .padding(10)
-   .frame(maxWidth: .infinity, alignment: .leading)
-   .background(
-       RoundedRectangle(cornerRadius: 8)
-           .fill(settings.backgroundColor.opacity(settings.backgroundOpacity))
-   )
+   if settings.showOriginalText {
+       // Recognition window
+       VStack(alignment: .leading, spacing: 2) {
+           Text("Recognition")
+               .font(.caption2)
+               .foregroundColor(.secondary)
+           Text("Hello, this is sample speech.")
+               .font(.system(size: settings.fontSize))
+               .foregroundColor(settings.fontColor)
+               .fixedSize(horizontal: false, vertical: true)
+       }
+       .padding(10)
+       .frame(maxWidth: .infinity, alignment: .leading)
+       .background(
+           RoundedRectangle(cornerRadius: 8)
+               .fill(settings.backgroundColor.opacity(settings.backgroundOpacity))
+       )
+   }
+   if settings.showTranslation {
+       // Translation window
+       VStack(alignment: .leading, spacing: 2) {
+           Text("Translation")
+               .font(.caption2)
+               .foregroundColor(.secondary)
+           Text("안녕하세요, 샘플 음성입니다.")
+               .font(.system(size: settings.translatedFontSize))
+               .foregroundColor(settings.translatedFontColor)
+               .fixedSize(horizontal: false, vertical: true)
+       }
+       .padding(10)
+       .frame(maxWidth: .infinity, alignment: .leading)
+       .background(
+           RoundedRectangle(cornerRadius: 8)
+               .fill(settings.backgroundColor.opacity(settings.backgroundOpacity))
+       )
+   }

- SessionHistoryView: handle export write errors with do/catch,
  AppLogger, and NSAlert instead of silently ignoring via try?
- AppState: remove lastConsumedPartial override in suffix-strip branch
  to prevent newPart (live unconsumed text) from being silently dropped
- AppState: use .whitespacesAndNewlines in stripLeadingPunctuation
  to also remove leading newlines
- FontSettingsView: gate split-mode preview panes on showOriginalText
  and showTranslation toggles for accurate visibility reflection

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.


Comment on lines +139 to +158
    // Temporarily lower the session window level so the save panel appears in front
    // when "Always on top" (.floating) is active.
    let hostWindow = NSApp.keyWindow
    let originalLevel = hostWindow?.level
    hostWindow?.level = .normal

    panel.begin { response in
        hostWindow?.level = originalLevel ?? .normal
        guard response == .OK, let url = panel.url else { return }
        let text = sessionText(session)
        do {
            try text.write(to: url, atomically: true, encoding: .utf8)
        } catch {
            AppLogger.shared.log("Session export failed: \(error.localizedDescription)", category: .app)
            let alert = NSAlert()
            alert.messageText = "Export Failed"
            alert.informativeText = error.localizedDescription
            alert.alertStyle = .warning
            alert.runModal()
        }

Copilot AI Feb 21, 2026

exportSession uses NSApp.keyWindow + temporarily changes the window level to get the save panel in front. This is fragile (the key window may be an overlay window, and restoring the level can fight with user setting changes while the panel is open). Prefer presenting the NSSavePanel as a sheet attached to the session history window (e.g., begin as a sheet for the host window) so AppKit handles ordering without mutating window levels.

Suggested change
-   // Temporarily lower the session window level so the save panel appears in front
-   // when "Always on top" (.floating) is active.
-   let hostWindow = NSApp.keyWindow
-   let originalLevel = hostWindow?.level
-   hostWindow?.level = .normal
-   panel.begin { response in
-       hostWindow?.level = originalLevel ?? .normal
-       guard response == .OK, let url = panel.url else { return }
-       let text = sessionText(session)
-       do {
-           try text.write(to: url, atomically: true, encoding: .utf8)
-       } catch {
-           AppLogger.shared.log("Session export failed: \(error.localizedDescription)", category: .app)
-           let alert = NSAlert()
-           alert.messageText = "Export Failed"
-           alert.informativeText = error.localizedDescription
-           alert.alertStyle = .warning
-           alert.runModal()
-       }
+   if let hostWindow = NSApp.keyWindow {
+       hostWindow.beginSheet(panel) { response in
+           guard response == .OK, let url = panel.url else { return }
+           let text = sessionText(session)
+           do {
+               try text.write(to: url, atomically: true, encoding: .utf8)
+           } catch {
+               AppLogger.shared.log("Session export failed: \(error.localizedDescription)", category: .app)
+               let alert = NSAlert()
+               alert.messageText = "Export Failed"
+               alert.informativeText = error.localizedDescription
+               alert.alertStyle = .warning
+               alert.runModal()
+           }
+       }
+   } else {
+       panel.begin { response in
+           guard response == .OK, let url = panel.url else { return }
+           let text = sessionText(session)
+           do {
+               try text.write(to: url, atomically: true, encoding: .utf8)
+           } catch {
+               AppLogger.shared.log("Session export failed: \(error.localizedDescription)", category: .app)
+               let alert = NSAlert()
+               alert.messageText = "Export Failed"
+               alert.informativeText = error.localizedDescription
+               alert.alertStyle = .warning
+               alert.runModal()
+           }
+       }

Comment on lines 222 to +266
@@ -244,11 +245,25 @@ final class AppState: ObservableObject {
     let common = self.findCommonPrefix(currentText, self.lastConsumedPartial)
     if common.count > 10 {
         self.lastConsumedPartial = common
-        self.liveText = String(currentText.dropFirst(common.count)).trimmingCharacters(in: .whitespaces)
+        let tail = String(currentText.dropFirst(common.count)).trimmingCharacters(in: .whitespaces)
+        self.liveText = self.stripLeadingPunctuation(tail)
     } else {
-        // Completely different text (e.g. language change)
-        self.lastConsumedPartial = ""
-        self.liveText = currentText
+        // Check if consumed text reappears at the END of reformulated text
+        // (recognizer prepended new content before already-consumed text)
+        let trimmedConsumed = self.lastConsumedPartial.trimmingCharacters(in: .whitespaces)
+        let trimmedCurrent = currentText.trimmingCharacters(in: .whitespaces)
+        if !trimmedConsumed.isEmpty && trimmedCurrent.hasSuffix(trimmedConsumed) {
+            let newPart = String(trimmedCurrent.dropLast(trimmedConsumed.count))
+            self.liveText = self.stripLeadingPunctuation(newPart)
+            // Do NOT update lastConsumedPartial here: newPart is still live/unconsumed.
+            // Overwriting with currentText would cause newPart to be silently dropped
+            // on the next sink call when currentText extends.
+            AppLogger.shared.log("Reformulation suffix stripped: \(trimmedConsumed.count) chars", category: .speech)
+        } else {
+            // Completely different text (e.g. language change)
+            self.lastConsumedPartial = ""
+            self.liveText = self.stripLeadingPunctuation(currentText)
+        }

Copilot AI Feb 21, 2026

stripLeadingPunctuation is applied to liveText, but lastConsumedPartial tracking/comparisons still use the raw recognizer text. If the recognizer later reformulates by removing/changing leading punctuation (e.g., “, hello” → “hello”), the hasPrefix / common-prefix checks can fail and previously-consumed content can reappear in liveText. Consider normalizing both currentText and lastConsumedPartial consistently for these comparisons (or maintaining a normalized consumed prefix) so punctuation-only reformulations don’t regress the “no re-showing consumed text” behavior.
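
One way to read this suggestion is to normalize both strings before comparing them. The sketch below is an assumption about how that could look, not the PR's code: `normalizedForComparison` and `unconsumedTail` are hypothetical helpers, and only leading punctuation and whitespace are normalized.

```swift
import Foundation

/// Hypothetical normalization used only for comparisons: drops leading
/// punctuation and whitespace so punctuation-only reformulations still match.
func normalizedForComparison(_ text: String) -> String {
    let strippable = CharacterSet.punctuationCharacters.union(.whitespacesAndNewlines)
    let remaining = text.unicodeScalars.drop(while: { strippable.contains($0) })
    return String(String.UnicodeScalarView(remaining))
}

/// Returns the text that follows the consumed prefix, comparing normalized
/// forms of both strings, or nil when the consumed text is not a prefix.
func unconsumedTail(of currentText: String, after consumed: String) -> String? {
    let current = normalizedForComparison(currentText)
    let base = normalizedForComparison(consumed)
    guard base.isEmpty || current.hasPrefix(base) else { return nil }
    return String(current.dropFirst(base.count)).trimmingCharacters(in: .whitespaces)
}
```

For example, with consumed text `", hello there"` and a reformulated `"hello there friend"`, a raw `hasPrefix` check fails, while `unconsumedTail` returns `"friend"`.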

9bow added 5 commits February 25, 2026 14:48
- Replace "Recognition" with "Transcription" throughout UI
- Move on-device recognition toggle from Debug to Languages tab
- Add version display in About tab from Bundle info
- Update Display Mode descriptions to use Transcription/Translation
- Simplify overlay window section titles for split mode
- Add SFSpeechRecognizerDelegate for availability monitoring
- Periodically recreate recognizer to prevent resource exhaustion
- Use taskGeneration counter to filter stale callbacks
- Add retry mechanism (3 attempts, 2s delay) for restart failures
- Replace DispatchQueue.main.asyncAfter with Task-based delay
- Fix thread safety in SystemAudioCapture with lock-protected properties
- Reconfigure translation source on auto-detected language change
- Invalidate translation session on capture stop
- Add HTTP status check in fallback Google Translate API
- Guard auto-scroll with isAtBottom check in all overlay views
- Only auto-scroll when user is at bottom or overlay is locked
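
The "taskGeneration counter to filter stale callbacks" item above names a common pattern. A minimal sketch of it follows, under stated assumptions: `RecognitionSession`, `startTask`, and `acceptedResults` are hypothetical names, the actual implementation in this PR is not shown, and the sketch assumes all calls happen on one thread (for example the main thread).

```swift
import Foundation

/// Hypothetical illustration of a generation counter: each new task bumps the
/// counter, and callbacks created for an older task are ignored.
final class RecognitionSession {
    private(set) var taskGeneration = 0
    private(set) var acceptedResults: [String] = []

    /// Starts a new task and returns a result callback bound to its generation.
    func startTask() -> (String) -> Void {
        taskGeneration += 1
        let generation = taskGeneration
        return { [weak self] text in
            // Ignore results from tasks that have since been replaced.
            guard let self = self, generation == self.taskGeneration else { return }
            self.acceptedResults.append(text)
        }
    }
}
```

If a first task's callback fires after a second task has started, its result is discarded, so only the current task can append to `acceptedResults`.
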
9bow merged commit 5114592 into main on Feb 28, 2026
1 check passed
2 participants