⚙️ Steps To Make It Work
This guide explains only the essential settings.
You can find tooltips for each field directly in the Unity Editor.
Step 1. 🧪 Settings
Go to UnityNeuroSpeech → Main → Create Settings in the Unity toolbar.
Default settings are recommended.
Step 2. 🎤 UNS Manager
UnityNeuroSpeech Manager is a GameObject in your scene that controls all non-agent scripts.
Without it, no agent (talkable AI) will work.
Create a Dropdown in your scene.
Then go to UnityNeuroSpeech → Main → Create UNS Manager.
The important setting there is:
- Whisper model path in StreamingAssets — the path to your downloaded Whisper model (`.bin`) inside the `StreamingAssets` folder (without the `Assets` directory).
Example:
If the full path is
Assets/StreamingAssets/UnityNeuroSpeech/Whisper/ggml-medium.bin
then you should enter
UnityNeuroSpeech/Whisper/ggml-medium.bin
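At runtime, a StreamingAssets-relative path like this is typically resolved against Unity's `Application.streamingAssetsPath`. The sketch below is illustrative only (the class and field names are made up, and this is not UNS's actual loading code); it shows how you can double-check that the model file really sits where your setting points:

```csharp
using System.IO;
using UnityEngine;

public class WhisperPathCheck : MonoBehaviour
{
    // The value you entered in the UNS Manager (relative to StreamingAssets).
    [SerializeField] private string modelRelativePath = "UnityNeuroSpeech/Whisper/ggml-medium.bin";

    private void Awake()
    {
        // In the Editor, Application.streamingAssetsPath points at Assets/StreamingAssets.
        string fullPath = Path.Combine(Application.streamingAssetsPath, modelRelativePath);

        if (!File.Exists(fullPath))
            Debug.LogError($"Whisper model not found at: {fullPath}");
    }
}
```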
Step 3. 🧠 Agent
An Agent in UnityNeuroSpeech is a GameObject that can listen, respond, and talk using an LLM.
Once you create your first agent, you'll be able to talk with your AI!
Add a Button and an AudioSource to your scene.
Then go to UnityNeuroSpeech → Main → Create Agent.
Here are some important settings:
- Agent index — the index mentioned in the QuickStart.
  It links an agent to its voice file.
  ⚠️ Each agent must have a unique index!
- Emotions — the AI can respond with emotion tags.
  Example:
  — How are you, DeepSeek?
  — <happy> I'm feeling grateful. What about you?
  The word inside `< >` is the emotion chosen by the AI.
  Emotions are used for monitoring via the Agent API.
  The system prompt (generated automatically by UNS) defines how emotions are used.
- Actions — optional behavior tags like `"turn_off_lights"`, `"enable_cutscene_123"`, `"play_horror_sound"`, etc.
  They work the same way as emotions.
Click Generate Agent, then Create Agent In Scene — only in that order!
🎉 That's it! When you run the game:
- Select a microphone in the dropdown
- Click the button to start recording
- Speak → click again
- AI responds with voice
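If you need to fill the microphone dropdown yourself, Unity's built-in `Microphone.devices` array lists every capture device Unity can see. A minimal sketch, assuming a standard `UnityEngine.UI.Dropdown` (this is generic Unity code, not part of the UNS API):

```csharp
using System.Linq;
using UnityEngine;
using UnityEngine.UI;

public class MicrophoneDropdown : MonoBehaviour
{
    [SerializeField] private Dropdown dropdown;

    private void Start()
    {
        // Populate the dropdown with the names of all available capture devices.
        dropdown.ClearOptions();
        dropdown.AddOptions(Microphone.devices.ToList());
    }
}
```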
After you click Generate Agent, two files will be created:
- AgentNameController.cs — your agent controller (you don't need to modify it)
- AgentNameSettings.asset — a ScriptableObject with agent settings (system prompt, model name, index, etc.)
You can edit the settings as you wish.
Agent performance (βspeedβ) depends on:
- LLM model size
- Whisper model size
- Voice file length
- AI response size
Small models like deepseek-r1:7b or ggml-tiny.bin run fast but may ignore system prompts (emotions, actions, etc.).
Large models like ggml-large.bin usually work perfectly — but they will be very slow 🐌
On first load, TTS may respond slowly — that's okay. It will work faster next time.