refactor(stage-tamagotchi,crates): no more cuda required and candle removed #323
nekomeowww merged 2 commits into main
Conversation
✅ Deploy Preview for airi-docs ready!
✅ Deploy Preview for airi-vtuber ready!
Summary of Changes
Hello @nekomeowww, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly refactors the audio transcription capabilities by replacing the candle machine learning framework with ONNX Runtime. This strategic shift aims to enhance the application's performance, reduce specific hardware dependencies, and streamline the underlying machine learning inference pipeline. The changes span across the Rust backend, build configurations, and the frontend interface to ensure a seamless transition to the new technology.
Highlights
- Technology Migration: The core audio transcription functionality has been migrated from the `candle` machine learning framework to ONNX Runtime (ORT). This change aims to improve performance and broaden compatibility with various hardware accelerators.
- CUDA Dependency Removal: The direct dependency on CUDA for GPU acceleration within the `stage-tamagotchi` application has been removed. The new ONNX Runtime integration now leverages its own execution providers, including CoreML for macOS, DirectML and CUDA for Windows, and CUDA for Linux, offering more flexible GPU support (see the sketch after this list).
- Plugin Renaming and Restructuring: The audio transcription plugin has been renamed from `tauri-plugin-ipc-audio-transcription-candle` to `tauri-plugin-ipc-audio-transcription-ort` to accurately reflect the underlying technology. This involved creating a new Rust crate and updating all references across the application.
- Build System and Frontend Updates: Extensive updates were made to `Cargo.toml` and `Cargo.lock` files to reflect the new ORT dependencies and remove all `candle`-related packages. The frontend application (`apps/stage-tamagotchi/src/`) was also updated to invoke the new ORT-based plugin and adapt to changes in the transcription result format (e.g., no longer returning the language as a separate value).
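To make the per-platform GPU support concrete, below is a minimal sketch of how an ONNX Runtime session could register CUDA, DirectML, and CoreML execution providers with the `ort` crate. This is not the PR's actual code; the builder method names assume the `ort` 2.0 release-candidate API and may differ in the version this PR pins.

```rust
use ort::{
    CUDAExecutionProvider, CoreMLExecutionProvider, DirectMLExecutionProvider,
    GraphOptimizationLevel, Session,
};

// Hypothetical helper: build a session that prefers a GPU execution provider
// when one is available and falls back to CPU otherwise.
fn build_session(model_path: &str) -> ort::Result<Session> {
    Session::builder()?
        .with_execution_providers([
            // ort skips providers that are not compiled in or not usable on
            // the current machine, so listing all three is safe.
            CUDAExecutionProvider::default().build(),     // Linux / Windows
            DirectMLExecutionProvider::default().build(), // Windows
            CoreMLExecutionProvider::default().build(),   // macOS
        ])?
        .with_optimization_level(GraphOptimizationLevel::Level3)?
        .commit_from_file(model_path)
}
```

The point of this pattern is that provider selection happens at session-build time rather than at compile time, which is what removes the hard CUDA requirement from `stage-tamagotchi`.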
Code Review
This pull request refactors the audio transcription plugin to use ONNX Runtime (ort) instead of candle, removing the CUDA dependency and providing a more flexible backend. I have identified a few critical issues that need to be addressed, concerning event handling, audio processing, and an API mismatch. I have also included some suggestions to improve maintainability by renaming components that still carry the old candle naming. Addressing these points will ensure the stability and correctness of the new implementation.
```diff
  let transcription = processor
    .transcribe(chunk.as_slice(), &config)
    .map_err(|e| e.to_string())?;

  info!("Transcription completed: {}", transcription);

- Ok((transcription, language))
+ Ok(transcription)
```
The frontend expects the `ipc_audio_transcription` command to return a single-element array (a tuple in Rust), but the current implementation returns a single `String`. This will cause a deserialization error or unexpected behavior on the frontend when it tries to destructure the result.
The returned value should be wrapped in a tuple to match the frontend's expectation.
Suggested change:

```diff
  let transcription = processor
    .transcribe(chunk.as_slice(), &config)
    .map_err(|e| e.to_string())?;
  info!("Transcription completed: {}", transcription);
- Ok(transcription)
+ Ok((transcription,))
```
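For context on why the one-element tuple matters: Tauri serializes command return values with serde_json, so the shape of the Rust return type determines the JSON the frontend destructures. A standalone sketch (assuming only a `serde_json` dependency; illustrative, not the plugin's code):

```rust
// Illustration of how serde_json, which Tauri uses for IPC responses,
// encodes the two return shapes differently.
fn main() -> serde_json::Result<()> {
    let transcription = String::from("hello world");

    // `Ok(transcription)` -> the frontend receives a bare JSON string.
    println!("{}", serde_json::to_string(&transcription)?); // "hello world"

    // `Ok((transcription,))` -> the frontend receives a one-element array,
    // which is what a destructure like `const [text] = ...` expects.
    println!("{}", serde_json::to_string(&(transcription,))?); // ["hello world"]

    Ok(())
}
```

With the bare string, an array destructure on the frontend would pick up only the first character of the text (strings are iterable in JavaScript), which is one concrete form of the unexpected behavior described above.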
I'll build and test it later 👀

Let me merge it so we can have a test with our CI/CD.

Build succeeded, @typed-sigterm try this: https://github.com/moeru-ai/airi/actions/runs/16516148177 For @Weathercold, I have another PR about Nix coming, hold on.
Description
Close #310, related to #319
Linked Issues
Additional context