What is everyone using as the LLM for HA voice when self-hosting Ollama? I’ve tried Llama and Qwen with varying degrees of success in understanding my commands. I’m currently on Llama as it seems a little better. I just wanted to see if anyone has found a better model.

Edit: as pointed out, this is more of a speech-to-text issue than an LLM issue. I’m looking into alternatives to Whisper.

  • chaospatterns@lemmy.world · 10 days ago

    That’s not going to be fixed with a different LLM, though. I’m experiencing similar problems. If my STT is bad, the LLM just gets even more confused, or it requires a big model that doesn’t run efficiently on my local GPU. It also won’t trigger my custom automations, because the tools don’t consider custom automation phrases.

    Speech2phrase improves accuracy for basic utterances like “turn on X”, or anything specified in an automation, but struggles with other speech.

    My next project is to implement a router that forwards the utterance to both speech2phrase and whisper and tries to estimate which transcript is correct, roughly as sketched below.
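
    Here’s a minimal sketch of that idea, assuming hypothetical transcribe_speech2phrase() and transcribe_whisper() client functions for the two STT backends and a KNOWN_PHRASES list of the command templates speech2phrase was configured with (all of these names are placeholders, not real APIs):

    ```python
    import asyncio
    import difflib

    # Command templates speech2phrase knows about. In practice you'd
    # load these from your Home Assistant sentence/automation config.
    KNOWN_PHRASES = [
        "turn on the living room lights",
        "turn off the living room lights",
        "start the morning routine",
    ]

    async def transcribe_speech2phrase(audio: bytes) -> str:
        ...  # placeholder: call your speech2phrase instance here

    async def transcribe_whisper(audio: bytes) -> str:
        ...  # placeholder: call your whisper instance here

    def phrase_confidence(text: str) -> float:
        """Best fuzzy-match ratio between the transcript and any known
        command phrase (0.0 = nothing close, 1.0 = exact match)."""
        if not text:
            return 0.0
        return max(
            difflib.SequenceMatcher(None, text.lower(), phrase).ratio()
            for phrase in KNOWN_PHRASES
        )

    async def route(audio: bytes, threshold: float = 0.85) -> str:
        # Run both backends concurrently, so the router adds no more
        # latency than the slower of the two.
        s2p_text, whisper_text = await asyncio.gather(
            transcribe_speech2phrase(audio),
            transcribe_whisper(audio),
        )
        # If speech2phrase produced something very close to a known
        # command, trust it; otherwise assume free-form speech and
        # fall back to whisper.
        if phrase_confidence(s2p_text) >= threshold:
            return s2p_text
        return whisper_text
    ```

    The 0.85 threshold is just a starting guess to tune; a fancier estimator could also weigh whisper’s own confidence scores instead of relying on fuzzy matching alone.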