The listening overlay
When you press your dictation hotkey, a small window pops up to show InkSpoke is hearing you. That window is the listening overlay (or HUD). It gives you instant feedback while you speak, lets you switch language or workspace mid-session, and hands you a Send button for finishing without touching the keyboard again. This page is a complete tour of it.
If you just want the basic press-speak-press loop, start with Push-to-talk basics. This page goes deeper into everything the overlay shows you.
What the overlay is
The overlay is a frameless, always-on-top window that appears the moment you trigger dictation and disappears when the text lands. It never steals focus from the app you're writing into — your cursor stays exactly where it was, so the finished text goes to the right place.
Because it's reused across every dictation, it appears instantly and feels like part of the system rather than a separate app.
Anatomy of the overlay
Here's the default Classic style — a compact bar — with every element labeled:
┌──────────────────────────────────────────────────────────┐
│ ● Listening… 0:07 │ ← status dot + label + elapsed timer
├──────────────────────────────────────────────────────────┤
│ ▁▃▅▇█▇▅▃▂▄▆█▆▄▂▁▃▅▇▅▃▁▂▄▆▄▂▁▃▅▇▅▃▁ │ ← live waveform (reacts to your voice)
│ │
│ [ Marketing ▾ ] [ Auto ▾ ] [ Send ] │ ← workspace picker · language picker · Send
├──────────────────────────────────────────────────────────┤
│ Esc cancel · Alt+Space finish 🎙 Studio Mic │ ← footer hint · active input device
└──────────────────────────────────────────────────────────┘
Element by element:
| Part | What it tells you / does |
|---|---|
| Status dot + label | The current phase: Preparing…, Listening…, Processing…, Done, or Error. |
| Elapsed timer | How long the current session has been recording. |
| Live waveform | A real-time equalizer that dances to your voice, so you can see you're actually being heard. |
| Workspace picker | The workspace InkSpoke resolved for this dictation — override it here. |
| Language picker | The dictation language (or Auto) — switch it per session. |
| Send | Stop, transcribe, refine, and inject — the same as pressing your hotkey again. |
| Footer hint | Reminds you: Esc cancel · <your finish hotkey> finish. |
| Device name | The microphone InkSpoke is currently listening through. |
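As a rough illustration of the elapsed timer and live waveform rows, the sketch below formats a duration as m:ss and maps audio levels to block characters. It is not InkSpoke's rendering code: the function names, and the assumption that levels arrive as floats between 0.0 and 1.0, are hypothetical.

```python
BARS = "▁▂▃▄▅▆▇█"  # eight block heights, lowest to highest


def format_elapsed(seconds: int) -> str:
    """Format a recording duration as m:ss, like the 0:07 in the diagram."""
    minutes, secs = divmod(max(0, int(seconds)), 60)
    return f"{minutes}:{secs:02d}"


def render_waveform(levels: list[float]) -> str:
    """Map each level (assumed to lie in 0.0-1.0) to one block character."""
    out = []
    for level in levels:
        clamped = min(1.0, max(0.0, level))
        # Scale to a bar index; a level of exactly 1.0 maps to the tallest bar.
        index = min(len(BARS) - 1, int(clamped * len(BARS)))
        out.append(BARS[index])
    return "".join(out)
```

For example, `format_elapsed(7)` gives `0:07`, and `render_waveform([0.0, 0.5, 1.0])` gives `▁▅█`.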
You can drag the overlay anywhere by its header or grip. If it consistently lands somewhere awkward, set a fixed position with the OverlayPosition setting instead.
Reading the state
The label and dot walk through a short lifecycle as your words become text:
- Preparing…: shown only if the speech model takes longer than roughly half a second to load. Most of the time you'll skip straight to Listening.
- Listening… — InkSpoke is capturing audio. Speak now.
- Processing… — you've finished; InkSpoke is transcribing and (if enabled) refining.
- Done / Error — the text was injected, or something went wrong.
The waveform reinforces the state visually: a gentle breathing pulse while preparing, voice-driven bars while listening, and a scrolling wave while processing.
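The lifecycle above can be read as a small state machine. The sketch below is one way to model it, not InkSpoke's actual implementation: the `Phase` enum, the transition table, and the exact 0.5-second threshold are assumptions drawn from the description, and cancelling with Esc is left out.

```python
from enum import Enum


class Phase(Enum):
    PREPARING = "Preparing…"
    LISTENING = "Listening…"
    PROCESSING = "Processing…"
    DONE = "Done"
    ERROR = "Error"


# Allowed transitions, as read from the lifecycle description (assumed).
TRANSITIONS = {
    Phase.PREPARING: {Phase.LISTENING, Phase.ERROR},
    Phase.LISTENING: {Phase.PROCESSING, Phase.ERROR},
    Phase.PROCESSING: {Phase.DONE, Phase.ERROR},
    Phase.DONE: set(),
    Phase.ERROR: set(),
}

PREPARING_THRESHOLD_S = 0.5  # assumed value for "roughly half a second"


def initial_phase(model_load_seconds: float) -> Phase:
    """Show Preparing only when the model load is slow enough to notice."""
    if model_load_seconds > PREPARING_THRESHOLD_S:
        return Phase.PREPARING
    return Phase.LISTENING


def advance(current: Phase, target: Phase) -> Phase:
    """Move to the target phase, rejecting transitions the lifecycle lacks."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.name} to {target.name}")
    return target
```

Keeping the transitions in a table means the label, dot, and waveform animation can all be derived from a single `Phase` value.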
The two styles: Classic and Capsule
InkSpoke ships two fully-working looks for the overlay. Pick whichever you prefer under Settings (OverlayStyle).
Classic (the default) is the compact bar shown above — small, quick, out of the way.
Capsule is a taller, rounded design with a live-transcript box, a colored status dot, and a coral voice-pulse animation that ripples while you're listening. It's the better choice if you like to watch your words appear as you speak.
╭──────────────────────────────────────────────────────────────╮
│ ◉ Listening 0:12 │ ← colored status dot
│ ┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅┅ (coral voice-pulse) │
│ │
│ "so the plan for next quarter is to focus on…" │ ← live transcript box
│ ▁▃▅▇█▇▅▃▂▄▆█▆▄▂▁▃▅▇▅▃▁ │
│ │
│ [ Marketing ▾ ] [ Auto ▾ ] [ Send ] │
│ Esc cancel · Alt+Space finish 🎙 Studio Mic │
╰──────────────────────────────────────────────────────────────╯
| | Classic | Capsule |
|---|---|---|
| Shape | Compact bar (default) | Rounded capsule |
| Live transcript | Alongside the waveform | In a dedicated box |
| Listening cue | Waveform | Voice-pulse animation + status dot |
| Best for | Staying minimal | Watching text stream in |
The streaming transcript that appears as you speak only shows when your Dictation Mode is set to Live Preview (the default is Standard, which transcribes in one pass after you finish). See Dictation modes and languages for the trade-offs. Either mode works with either overlay style.
The pickers: language and workspace
Both dropdowns let you change how this dictation is handled without opening settings.
Language picker — lists Auto plus your preferred languages. By default InkSpoke auto-detects, so you'll usually see Auto. To force a language for the session, pick it from the dropdown.
While the overlay is up, press ↑ or ↓ to cycle through your languages. No need to open the dropdown.
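A minimal sketch of arrow-key cycling, assuming the list wraps around at both ends, that Auto is part of the cycle, and that ↓ moves forward while ↑ moves back. None of these details are confirmed above, and `cycle_language` is a hypothetical name.

```python
def cycle_language(languages: list[str], current: str, direction: int) -> str:
    """Return the neighbouring entry, wrapping at either end.

    direction is +1 for the down arrow and -1 for the up arrow (assumed mapping).
    """
    index = languages.index(current)
    return languages[(index + direction) % len(languages)]


# Hypothetical picker contents: Auto plus two preferred languages.
languages = ["Auto", "English", "German"]
next_language = cycle_language(languages, "Auto", +1)      # "English"
previous_language = cycle_language(languages, "Auto", -1)  # "German"
```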
Workspace picker — shows the workspace InkSpoke resolved for the app you're dictating into. A smart-matched workspace is pre-selected and marked with a green indicator; you can override it or choose Auto to let InkSpoke decide. (If you've turned smart matching off, the first entry reads Select Workspace instead of Auto.) Workspaces teach InkSpoke your vocabulary, tone, and preferred models — see Smart matching and precedence.
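The pre-selection rules described here can be sketched as follows. This is an illustration under stated assumptions rather than InkSpoke's code: `PickerState` and `build_workspace_picker` are hypothetical names, and the fallback to the first entry when nothing matches is a guess.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PickerState:
    entries: list[str]
    selected: str
    smart_matched: bool  # True means the green indicator is shown


def build_workspace_picker(
    workspaces: list[str],
    matched: Optional[str],
    smart_matching_enabled: bool,
) -> PickerState:
    """Build the dropdown entries and choose what is pre-selected."""
    # The first entry depends on whether smart matching is turned on.
    first = "Auto" if smart_matching_enabled else "Select Workspace"
    entries = [first, *workspaces]
    if smart_matching_enabled and matched in workspaces:
        return PickerState(entries, matched, smart_matched=True)
    # No usable match: fall back to the first entry (assumed behaviour).
    return PickerState(entries, first, smart_matched=False)
```

With `["Marketing", "Legal"]` and a match on `"Marketing"`, the result pre-selects Marketing with the indicator on; with smart matching off, the first entry becomes Select Workspace and nothing is marked as matched.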
Finishing, cancelling, and moving
| Action | How |
|---|---|
| Finish (transcribe + inject) | Click Send, or press your activation hotkey again. |
| Cancel (throw the session away) | Press Esc, or click the ✕. Nothing is transcribed or injected. |
| Cycle language | ↑ / ↓ |
| Move the overlay | Left-drag its header or grip. |
Esc is registered as a global shortcut while the overlay is open, so it cancels even if your focus is elsewhere.
When there's nowhere to type: copy-mode fallback
Sometimes, by the time your text is ready, the place you meant to type into can't accept it — you clicked onto the desktop, a button, or a window with no text field. Instead of losing your words, the overlay switches to copy mode: it shows the full transcript with a Copy button and the label No text field detected.
┌────────────────────────────────────────────────────────────┐
│ ⚠ No text field detected │
├────────────────────────────────────────────────────────────┤
│ "Here's the transcript we couldn't inject anywhere — │
│ it's safe, just copy it and paste it yourself." │
│ │
│ [ Copy ] │
└────────────────────────────────────────────────────────────┘
Click Copy (you'll see a Copied! confirmation) and paste it wherever you like. Your words are never thrown away just because the target moved.
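The fallback amounts to a single decision made when the text is ready. In the sketch below, `Delivery`, `deliver`, and the `target_accepts_text` flag are hypothetical; how InkSpoke actually detects a text field is not covered on this page.

```python
from dataclasses import dataclass


@dataclass
class Delivery:
    mode: str        # "inject" or "copy"
    label: str       # what the overlay shows
    transcript: str  # always kept, whichever mode is chosen


def deliver(transcript: str, target_accepts_text: bool) -> Delivery:
    """Inject when possible; otherwise keep the text and offer a Copy button."""
    if target_accepts_text:
        return Delivery("inject", "Done", transcript)
    return Delivery("copy", "No text field detected", transcript)
```

Either branch carries the full transcript, which is the point of the fallback: the text survives even when injection is impossible.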
Where it appears on screen
By default the overlay anchors in the Center of your active monitor. You can change this with the OverlayPosition setting. Two positions are unique to the overlay — FollowCursor (it appears near your text cursor) and Center — plus eight fixed screen anchors:
┌───────────────────────────────────────────────────┐
│ TopLeft TopCenter TopRight │
│ │
│ MiddleLeft Center ★ MiddleRight │
│ │
│ BottomLeft BottomCenter BottomRight │
└───────────────────────────────────────────────────┘
★ = default · plus FollowCursor (tracks your cursor)
| Position | Where it lands |
|---|---|
| FollowCursor | Near your text cursor (overlay-only) |
| Center | Middle of the active monitor (default) |
| TopLeft … BottomRight | Any of the eight screen edges/corners |
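To make the anchors concrete, the sketch below computes a top-left pixel origin for each position name. Only the position names come from this page; the 16-pixel margin, the offset below the cursor, and the clamping to the monitor bounds are assumptions, and `overlay_origin` is a hypothetical function.

```python
# Fractions of the free horizontal and vertical space for each fixed anchor.
ANCHORS = {
    "TopLeft": (0.0, 0.0), "TopCenter": (0.5, 0.0), "TopRight": (1.0, 0.0),
    "MiddleLeft": (0.0, 0.5), "Center": (0.5, 0.5), "MiddleRight": (1.0, 0.5),
    "BottomLeft": (0.0, 1.0), "BottomCenter": (0.5, 1.0), "BottomRight": (1.0, 1.0),
}


def overlay_origin(position, monitor, size, cursor=None, margin=16):
    """Return the overlay's top-left (x, y) in screen pixels.

    monitor is (x, y, width, height) of the active monitor, size is the
    overlay's (width, height), and cursor is (x, y) for FollowCursor.
    """
    mx, my, mw, mh = monitor
    w, h = size
    if position == "FollowCursor":
        if cursor is None:
            raise ValueError("FollowCursor needs a cursor position")
        x, y = cursor[0], cursor[1] + margin  # just below the cursor (assumed)
    elif position in ANCHORS:
        fx, fy = ANCHORS[position]
        x = mx + margin + fx * (mw - w - 2 * margin)
        y = my + margin + fy * (mh - h - 2 * margin)
    else:
        raise ValueError(f"unknown position: {position}")
    # Keep the whole overlay on the monitor.
    x = min(max(x, mx), mx + mw - w)
    y = min(max(y, my), my + mh - h)
    return int(round(x)), int(round(y))
```

On a 1920×1080 monitor at the origin with a 400×100 overlay, `Center` yields `(760, 490)` and `BottomRight` yields `(1504, 964)`.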
Settings
Both settings live in the desktop app's general settings:
| Setting | Default | What it does |
|---|---|---|
| OverlayStyle | Classic | Switch between the compact Classic bar and the Capsule live-transcript design. |
| OverlayPosition | Center | Anchor the overlay: FollowCursor, Center, or one of eight screen anchors. |
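For readers who think in code, the two settings and their defaults can be modelled as below. This is only a sketch of the allowed values listed in the table: how InkSpoke actually stores its settings is not documented on this page, and the `OverlaySettings` class is hypothetical.

```python
from dataclasses import dataclass

OVERLAY_STYLES = ("Classic", "Capsule")
OVERLAY_POSITIONS = (
    "FollowCursor", "Center",
    "TopLeft", "TopCenter", "TopRight",
    "MiddleLeft", "MiddleRight",
    "BottomLeft", "BottomCenter", "BottomRight",
)


@dataclass(frozen=True)
class OverlaySettings:
    overlay_style: str = "Classic"    # documented default
    overlay_position: str = "Center"  # documented default

    def __post_init__(self):
        # Reject values the settings table does not list.
        if self.overlay_style not in OVERLAY_STYLES:
            raise ValueError(f"unknown OverlayStyle: {self.overlay_style}")
        if self.overlay_position not in OVERLAY_POSITIONS:
            raise ValueError(f"unknown OverlayPosition: {self.overlay_position}")
```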
Platform notes
The overlay looks and behaves the same on Windows, macOS, and Linux. Two small differences are worth knowing:
- Hotkey labels in the footer match your OS. The footer hint shows your real finish hotkey, for example Alt+Space on Windows and Linux, or ⌥Space on macOS.
- On macOS, the overlay is careful never to steal foreground from your target app, and it re-activates that app after you press Send or Copy so the text lands in the right place.
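The per-OS footer label could be produced along these lines. This is a sketch, not InkSpoke's code: only the Alt+Space and ⌥Space examples come from this page, the other symbols follow the usual macOS conventions, and `format_hotkey` and `footer_hint` are hypothetical names.

```python
import sys

# Standard macOS modifier glyphs; only the Alt mapping appears on this page.
MAC_SYMBOLS = {"Alt": "⌥", "Ctrl": "⌃", "Shift": "⇧", "Cmd": "⌘"}


def format_hotkey(keys: list[str], platform: str = sys.platform) -> str:
    """Render a hotkey the way the footer shows it on each OS."""
    if platform == "darwin":
        # macOS style: modifier glyphs run together with the key name.
        return "".join(MAC_SYMBOLS.get(key, key) for key in keys)
    # Windows and Linux style: names joined with plus signs.
    return "+".join(keys)


def footer_hint(keys: list[str], platform: str = sys.platform) -> str:
    """Build the footer text shown at the bottom of the overlay."""
    return f"Esc cancel · {format_hotkey(keys, platform)} finish"
```

Calling `footer_hint(["Alt", "Space"], "win32")` returns `Esc cancel · Alt+Space finish`, while passing `"darwin"` returns `Esc cancel · ⌥Space finish`.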
Next steps
- Push-to-talk basics — the core press-speak-press loop the overlay wraps.
- Dictation modes and languages — Standard vs. Live Preview, and setting up your languages.
- General settings and hotkeys — change the overlay style, position, and your dictation hotkeys.
- Text injection — how the finished text gets into your app, and what copy-mode falls back to.