Enhancement: "push to talk" and keyboard shortcuts for easier voice prompting (STT) #4807

danielrosehill · 2024-11-29T00:40:58Z

What features would you like to see added?

Hey!

I would really love to be able to prompt by speech (i.e., using voice recognition).

If it would be of interest, I'd also like to contribute some documentation on the various STT features, as I couldn't find the parameters covered on the STT page.

Specifically: what do the "conversation mode" and "auto transcribe audio" toggles do?

I have a couple of ideas for this which I'm batching under one feature enhancement with the intention of looking into the feasibility of trying to work on these myself:

Hotkey support to start and stop voice detection to facilitate (almost) hands-free usage
Some implementation of "push to talk" mode ... hold down an icon (e.g. the mic button) until you're ready to send.
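
To make the "push to talk" idea concrete, here is a minimal sketch of the press/release logic, kept free of any DOM or audio code. Every name in it (`Recorder`, `createPushToTalk`, `onSend`) is hypothetical and is not taken from the project's codebase; the real recorder and submit hooks would need to be wired in wherever the mic button lives.

```typescript
// Hypothetical interface: stands in for whatever actually starts/stops STT capture.
interface Recorder {
  start(): void;
  stop(): void;
}

// Returns handlers intended for mousedown/keydown (press) and mouseup/keyup (release).
function createPushToTalk(recorder: Recorder, onSend: () => void) {
  let recording = false;
  return {
    press(): void {
      // Ignore repeated presses, e.g. keyboard auto-repeat while a key is held.
      if (recording) return;
      recording = true;
      recorder.start();
    },
    release(): void {
      // A release without a matching press does nothing.
      if (!recording) return;
      recording = false;
      recorder.stop();
      onSend();
    },
    isRecording(): boolean {
      return recording;
    },
  };
}

// Usage with stub callbacks that only record the order of events.
const pttLog: string[] = [];
const ptt = createPushToTalk(
  {
    start: () => { pttLog.push("start"); },
    stop: () => { pttLog.push("stop"); },
  },
  () => { pttLog.push("send"); },
);

ptt.press();
ptt.press();   // ignored: already recording
ptt.release(); // stops recording, then sends
ptt.release(); // ignored: not recording
```

The point of the design is that nothing is sent until the user lets go, so pause detection never decides when the prompt ends.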

The second feature is really just a workaround for what I find to be the main frustration with STT, and one that is especially challenging when using it for prompting: the automatic cutoffs / pause detection. I don't know if this is baked into the engine or if it's a parameter that can be adjusted. But it would be really helpful to increase the buffer time to a few seconds so that users have time to think about what they want to instruct.
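
As an illustration of the behaviour being asked for, and not a description of how the STT engine works today, a configurable pause buffer could look roughly like this. `createPauseDetector` and `silenceMs` are made-up names, and timestamps are passed in explicitly so the logic can be checked without a real clock or microphone.

```typescript
// Hypothetical sketch: decide when a pause is long enough to end the recording.
function createPauseDetector(silenceMs: number) {
  let lastSpeechAt: number | null = null;
  return {
    // Call whenever the recognizer reports speech activity.
    onSpeech(nowMs: number): void {
      lastSpeechAt = nowMs;
    },
    // True only once speech has been heard and the silence has lasted silenceMs.
    shouldStop(nowMs: number): boolean {
      return lastSpeechAt !== null && nowMs - lastSpeechAt >= silenceMs;
    },
  };
}

// Usage: a 3-second buffer leaves room to pause and think mid-prompt.
const pauseDetector = createPauseDetector(3000);
const beforeSpeech = pauseDetector.shouldStop(10000);   // false: nothing said yet
pauseDetector.onSpeech(0);
const shortPause = pauseDetector.shouldStop(1500);      // false: only 1.5 s of silence
pauseDetector.onSpeech(2000);
const stillThinking = pauseDetector.shouldStop(4000);   // false: 2 s since last speech
const longPause = pauseDetector.shouldStop(5000);       // true: 3 s since last speech
```

Whether a value like `silenceMs` can actually be exposed depends on the engine in use, which is the open question above.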

More details

I think the above pretty much covers it!

I'm possibly in the minority of LLM users who feel this way, but I find voice prompting potentially much more useful than having real-time chats with LLMs (i.e., simultaneous STT and TTS). I mean, it would be nice to have both. But if I had to choose, voice prompting would actually speed up my workflow the most!

Which components are impacted by your request?

General, UI

Pictures

No response

Code of Conduct

I agree to follow this project's Code of Conduct

berry-13 · 2024-12-16T17:23:36Z

@danielrosehill hey, my bad for the late reply, I didn’t fully get the question at first. I hadn’t thought about the "push to talk" thing, but you can already trigger the STT with Shift + Alt + L

danielrosehill added the enhancement New feature or request label Nov 29, 2024
