I have really been wanting to try it out. Is there any good off-the-shelf hardware that you can use as “smart speakers” for it yet?
https://www.espressif.com/en/news/ESP32-S3-BOX-3
It comes in two versions: one with a more expensive dock, and one without. The one without worked fine for me, but it had to be the Box 3, not the Box 2. It worked pretty well, and you could create custom images to indicate whether it was listening, thinking, etc.
Instructions here: https://www.home-assistant.io/voice_control/s3_box_voice_assistant/
The box isn’t powerful enough to run an LLM itself. It’s just good enough as an audio conduit. You can either use the cloud integration with ChatGPT or, now, Anthropic’s Claude. But if you have a powerful Home Assistant server, say an Nvidia Jetson or a PC with a beefy Nvidia GPU, you can run local models like Llama and get better privacy.
All of this is from when I tried it earlier this year. I imagine things have advanced more since then.
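If you want a feel for the local-model route, here is a rough sketch of what talking to a Llama model on your own server can look like. It assumes you are running Ollama on that machine at its default port and have pulled a model called `llama3`; both are my assumptions for illustration, not something from the Home Assistant instructions. In practice Home Assistant would handle this step through its own integrations, so treat it only as a way to check that the server can answer prompts.

```python
import json
import urllib.request

# Assumption: an Ollama server is listening on this machine at its default
# port. The URL and the model name below are illustrative, not required values.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a non-streaming generate request for the local model server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt: str, model: str = "llama3") -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=60) as resp:
        return json.loads(resp.read())["response"]


# Example (needs a running server):
#   print(ask_local_model("Which lights should be on for movie night?"))
```

Nothing leaves your network this way, which is where the privacy benefit comes from. The trade-off is that response time depends entirely on your GPU.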