Feature Type
Would make my life easier
Feature Description
This is a feature request for https://community.livekit.io/t/how-to-commit-user-turn-without-generating-a-response/433
We need to force the user to give up their turn and have the agent say something specific. In this case, we do not want the agent to generate a reply automatically when we end the user turn by calling commit_user_turn().
We still need to call commit_user_turn(), though, because it commits the user's latest speech from the STT (i.e. the buffered transcript) into LiveKit's local state, so the LLM is aware of it before the user next ends their turn. In other words, we do not want to discard the user's latest speech whenever the agent interrupts their turn.
Ideally we could do something like commit_user_turn(skip_reply=True)
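To make the requested semantics concrete, here is a minimal toy model of what is being asked for. It is only a sketch: ToySession, on_transcript, and the skip_reply flag are hypothetical names for illustration, not the LiveKit API. It assumes the flag suppresses only the automatic reply, while the buffered transcript is always committed.

```python
from dataclasses import dataclass, field


@dataclass
class ToySession:
    """Hypothetical stand-in for an agent session; not the LiveKit API."""

    chat_ctx: list = field(default_factory=list)    # committed conversation items
    stt_buffer: list = field(default_factory=list)  # transcript not yet committed
    replies_generated: int = 0                      # automatic replies triggered

    def on_transcript(self, text: str) -> None:
        # STT delivers text while the user is still speaking.
        self.stt_buffer.append(text)

    def commit_user_turn(self, *, skip_reply: bool = False) -> None:
        # Move the buffered transcript into the chat context.
        if self.stt_buffer:
            self.chat_ctx.append(("user", " ".join(self.stt_buffer)))
            self.stt_buffer.clear()
        # Only the automatic reply is optional; the commit always happens.
        if not skip_reply:
            self.replies_generated += 1

    def say(self, text: str) -> None:
        # The agent speaks a fixed line, which is recorded in the context.
        self.chat_ctx.append(("assistant", text))


session = ToySession()
session.on_transcript("I was thinking that maybe")
session.on_transcript("we could reschedule")
session.commit_user_turn(skip_reply=True)  # commit, but no automatic reply
session.say("Sorry to cut in, we are out of time.")
```

In this model the user's interrupted speech ends up in the context ahead of the agent's forced line, and no automatic reply is generated.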
Workarounds / Alternatives
We could commit it to local state manually, i.e. consume the transcript event and append the message to the local chat context ourselves, maybe like this:
@self._session.on("user_input_transcribed")
...
agent = session.current_agent
agent._chat_ctx.items.append(chat_msg)
session._conversation_item_added(chat_msg)
The problem is that I do not know how to “flush” the STT to get the latest available transcript as input to this.
Another idea would be to do the following in sequence:
commit_user_turn()
session.interrupt()
generate agent speech
But as it says on DeepWiki, I think that would not be bulletproof either: it is possible to interrupt before the LLM request starts, which makes the interrupt a no-op.
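The race described above can be illustrated with a small asyncio simulation. This is a sketch under assumptions, not LiveKit's implementation: ToySession and its methods are hypothetical, and it assumes the automatic reply is scheduled asynchronously after the commit, so an interrupt issued immediately afterwards finds nothing to cancel.

```python
import asyncio


class ToySession:
    """Hypothetical model of the race; not the LiveKit API."""

    def __init__(self) -> None:
        self._reply_task = None
        self.spoken = []

    def commit_user_turn(self) -> None:
        # The reply is scheduled, but does not start until the loop yields.
        asyncio.get_running_loop().call_soon(self._start_reply)

    def _start_reply(self) -> None:
        self._reply_task = asyncio.ensure_future(self._reply())

    async def _reply(self) -> None:
        await asyncio.sleep(0.01)  # stands in for the LLM request
        self.spoken.append("automatic reply")

    def interrupt(self) -> bool:
        # Nothing to cancel if the reply has not started yet.
        if self._reply_task is None or self._reply_task.done():
            return False
        self._reply_task.cancel()
        return True


async def main():
    session = ToySession()
    session.commit_user_turn()
    interrupted = session.interrupt()  # runs before the reply task exists
    session.spoken.append("forced agent line")
    await asyncio.sleep(0.05)          # let the scheduled reply run
    return interrupted, session.spoken


interrupted, spoken = asyncio.run(main())
```

Here the interrupt returns without cancelling anything, and the automatic reply still runs after the forced agent line, which is the outcome a skip-reply option on the commit itself would avoid.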
Additional Context
No response