Audio and model settings

This page is the practical reference for two parts of Settings: the Audio tab, where you tune how InkSpoke captures your voice, and the AI Models section, where you choose the models that turn that voice into polished text. For the concepts behind these choices — the two model roles and the three sources — read Models and providers first; this page tells you where each control lives and what its default is.

Both areas open from the left navigation of the Settings window:

Audio: under Configuration, on the Audio tab.
AI Models: a top-level item of its own, split into three tabs (Global Defaults, On-Device, and Providers).

Audio settings

The Audio tab controls the whole capture side of dictation — which mic you use, the sounds you hear, when a recording stops on its own, and how aggressively InkSpoke cleans up the incoming audio.

┌──────────────────────────────────────────────────────────┐
│  Configuration  ›  Audio                                 │
├──────────────────────────────────────────────────────────┤
│  Microphone     [ Built-in Microphone ▾ ]    ⟳ Refresh   │
│                                                          │
│  Sound Effects              [ ✓ On ]                     │
│     ▶ start    ▶ stop    ▶ cancel      (test each)       │
│                                                          │
│  Dictation Mode        ( ● Standard   ◦ Live Preview )   │
│  Silence timeout           [  −    30 s   + ]            │
│  Max recording duration    [  −   300 s   + ]            │
│                                                          │
│  Noise Suppression (DeepFilterNet)     [ ✓ On ]          │
│  Whisper Mode (quiet speech)           [   Off  ]        │
└──────────────────────────────────────────────────────────┘

Microphone

Pick which input device InkSpoke records from. If you plug in a headset or USB mic after opening Settings, click Refresh to re-scan for it. Your active device is also shown on the listening overlay and can be switched quickly from the system-tray Microphone submenu.

Sound Effects

InkSpoke plays short chimes when a recording starts, stops, and is cancelled, so you get audible confirmation without looking at the overlay. Three test buttons let you preview each cue. Turn the whole set off with a single toggle.

Note

The Sound Effects card only appears on platforms that provide capture sounds. If you don't see it, your build doesn't ship them.

Dictation Mode

This chooses how your speech is transcribed:

Standard (default) transcribes everything in one pass after you stop — the most accurate option.
Live Preview streams a running transcript into the overlay while you're still talking, finalizing at sentence boundaries.

Live Preview runs on the on-device speech path only. If you have a cloud speech model active, InkSpoke shows a warning and quietly falls back to Standard rather than firing a stream of API calls.

Live Preview needs an on-device speech model

Select a cloud (Platform or BYOK) speech model and Live Preview can't stream — you'll get Standard behavior instead. Keep Whisper (on-device) active to see the live transcript. More detail in Dictation modes and languages.
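The fallback rule above can be written as a small decision function. This is only a sketch of the described behavior, not InkSpoke's actual code: the names `resolve_dictation_mode`, `ModeDecision`, and the source labels are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the three model sources named in the pickers.
ON_DEVICE = "on-device"
PLATFORM = "platform"
BYOK = "byok"


@dataclass(frozen=True)
class ModeDecision:
    mode: str                # "standard" or "live_preview"
    warning: Optional[str]   # text to show when a fallback happened


def resolve_dictation_mode(requested: str, speech_source: str) -> ModeDecision:
    """Return the mode that would actually run for the active speech model."""
    if requested not in ("standard", "live_preview"):
        raise ValueError(f"unknown dictation mode: {requested!r}")
    # Live Preview streams only on the on-device speech path;
    # any cloud source (Platform or BYOK) falls back to Standard.
    if requested == "live_preview" and speech_source != ON_DEVICE:
        return ModeDecision(
            "standard",
            "Live Preview needs an on-device speech model; using Standard.",
        )
    return ModeDecision(requested, None)
```

For example, `resolve_dictation_mode("live_preview", BYOK)` yields Standard with a warning, while the same request with `ON_DEVICE` keeps Live Preview.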

Recording limits

Two safety timers stop a session that runs away — for example if you walk off mid-dictation:

Setting	Default	Range	What it does
Silence timeout	30 s	0–300 s	Auto-cancels after this many seconds of continuous silence. `0` disables the timer.
Max recording duration	300 s (5 min)	0–3600 s	Hard cap on a single dictation. `0` means no limit.
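As a rough illustration of how the two timers and their `0` values interact, here is a minimal sketch. The function names are hypothetical, and checking the duration cap before the silence timer is an assumption; the page does not say which wins when both expire together.

```python
from typing import Optional


def clamp(value: int, low: int, high: int) -> int:
    """Keep a setting inside its documented range."""
    return max(low, min(high, value))


def stop_reason(
    elapsed_s: float,
    silent_s: float,
    silence_timeout_s: int = 30,
    max_duration_s: int = 300,
) -> Optional[str]:
    """Return why a session should end, or None to keep recording."""
    silence_timeout_s = clamp(silence_timeout_s, 0, 300)   # 0 disables the timer
    max_duration_s = clamp(max_duration_s, 0, 3600)        # 0 means no limit
    # Assumption: the hard cap is checked first.
    if max_duration_s and elapsed_s >= max_duration_s:
        return "max_duration"
    if silence_timeout_s and silent_s >= silence_timeout_s:
        return "silence"
    return None
```

With the defaults, 30 seconds of continuous silence ends the session; with both values set to `0`, neither timer ever fires.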

Noise Suppression (DeepFilterNet)

A neural denoiser that strips background noise — fans, keyboard clatter, ambient chatter — while preserving your voice. It's on by default and adds only a few milliseconds of latency. Because it relies on a downloadable model, the toggle is disabled while that model is still downloading.

Whisper Mode

Whisper Mode (off by default) tunes InkSpoke for dictating quietly — an open office, a shared room, a sleeping partner nearby. When it's on, InkSpoke lowers the threshold for detecting speech, boosts input gain, and asks Whisper to search harder, so soft, near-silent speech still transcribes cleanly. Leave it off for normal-volume dictation. It's also toggleable from the system-tray menu.

Audio settings at a glance

Setting	Default	What it does
Microphone	System default until you pick one	Chooses the capture device; Refresh re-scans.
Sound Effects	On	Start / stop / cancel chimes, with test buttons.
Dictation Mode	Standard	Standard (batch) vs Live Preview (streaming, on-device only).
Silence timeout	30 s	Auto-cancel after continuous silence (`0` = off).
Max recording duration	300 s	Hard cap per dictation (`0` = no limit).
Noise Suppression (DeepFilterNet)	On	Neural background-noise removal (needs its model downloaded).
Whisper Mode	Off	Quiet-speech tuning (lower threshold, higher gain).

AI Models

The AI Models area is where you choose the two models every dictation flows through — a speech model that hears you and a text model that refines the result — and where you download local models or connect your own providers. It has three tabs.

┌────────────────────────────────────────────────────────────────┐
│  AI Models                                                     │
│  [ Global Defaults ]  [ On-Device · PRO ]  [ Providers · PRO ] │
└────────────────────────────────────────────────────────────────┘

Global Defaults

Your baseline models — used for every dictation unless a workspace pins its own. Two grouped pickers list every available model under Platform, On-Device, and BYOK headings.

┌──────────────────────────────────────────────────────────┐
│  AI Models  ›  Global Defaults                           │
├──────────────────────────────────────────────────────────┤
│  Speech recognition                                      │
│   [ Whisper Small  (On-Device) ▾ ]                       │
│                                                          │
│  Text processing (refinement)                            │
│   [ Platform AI  (Platform) ▾ ]                          │
│      Workspace-default refinement   [ ✓ On ]             │
│      Token limit                    [  −  ····  + ]      │
│                                                          │
│   (Master AI Refinement is ON)                           │
└──────────────────────────────────────────────────────────┘

New installs start on the private-by-default pairing: Whisper Small (on-device, offline, free) for speech and Platform AI (cloud) for refinement during your Pro trial.

The text side adds two controls:

Control	Default	What it does
Workspace-default refinement	On	Whether workspaces that don't pin their own text model still get refined by this global default.
Token limit	—	Caps how long a refined response can be.

Both sit under the master AI Refinement switch (on Configuration → General, on by default). When that master switch is off, these controls are greyed out with a reminder, and InkSpoke injects your raw transcript verbatim. See How refinement works.
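One way to read how the master switch, a workspace pin, and Workspace-default refinement combine is sketched below. The function name and parameters are hypothetical; the precedence shown is inferred from the descriptions above, not taken from InkSpoke's source.

```python
from typing import Optional


def pick_text_model(
    master_refinement_on: bool,
    workspace_model: Optional[str],
    global_model: str,
    workspace_default_refinement: bool = True,
) -> Optional[str]:
    """Return the text model to refine with, or None to inject the raw transcript."""
    if not master_refinement_on:
        return None                  # master switch off: raw transcript, verbatim
    if workspace_model is not None:
        return workspace_model       # a workspace pin overrides the global default
    if workspace_default_refinement:
        return global_model          # unpinned workspaces inherit the global default
    return None                      # unpinned workspace with inheritance turned off
```

A return value of `None` stands for "skip refinement and inject the transcript as dictated".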

On-Device (Pro)

PRO (requires Pro or Perpetual) — download and manage models that run entirely on your machine, so your audio and text never leave the computer. A banner reminds you that Whisper Small (244M) is free; every other on-device model requires Pro or Perpetual.

┌───────────────────────────────────────────────────────────┐
│  AI Models  ›  On-Device   [PRO]                          │
│  Small (244M) is free · other models need Pro             │
│  Storage  ▓▓▓▓▓▓▓░░░░░░░░  used by downloaded models      │
├───────────────────────────────────────────────────────────┤
│  Whisper Small        244M     ● in use                   │
│     SPEED ●●●●○   ACCURACY ●●●○○   many languages         │
│     [ Delete ]                                            │
│  Whisper Large V3 Turbo                                   │
│     SPEED ●●●○○   ACCURACY ●●●●●          [ Download ]    │
│  ──────────────────────────────────────────────────       │
│  Use GPU for dictation                 [ ✓ On ]           │
│  Apple Neural Engine (macOS only)      [ Download ]       │
│  ──────────────────────────────────────────────────       │
│  Local text model (GGUF)                                  │
│     Max context size          [  −   16384   + ]          │
└───────────────────────────────────────────────────────────┘

Speech — Whisper models. InkSpoke ships the Whisper family in several sizes. Each card shows its size chip, an in use pill for the active one, SPEED and ACCURACY dot ratings, its language support, and a Download / Delete button. A storage bar tracks total disk use; the model that's currently in use can't be deleted.

Whisper size	Tier
Tiny	Pro
Base	Pro
Small (244M)	Free — the default
Medium	Pro
Large	Pro
Large V3 Turbo	Pro
Large V3 Turbo Q5 / Q8	Pro

As a rule, smaller models are faster and lighter; larger ones are more accurate. See Models and providers for the full trade-off.

Acceleration. A Use GPU for dictation toggle (on by default) appears only where hardware acceleration is available. On macOS you'll also see an optional Apple Neural Engine accelerator — a small extra download that's used automatically once installed. See Platform notes below.

Text — local model. The On-Device tab also offers a downloadable local text model (GGUF format) so refinement can run fully offline. It has one tunable cloud models don't:

Setting	Default	Range	What it does
Max context size	16,384	512 – 131,072	How much text (in tokens) the local model can consider at once. Larger uses more memory.

Free your memory automatically

On-device models can be RAM-hungry. By default InkSpoke unloads idle models after 10 minutes and reloads them the next time you dictate. This is the Model Memory strategy, described in General and hotkeys.
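The idle-unload behavior can be pictured as a timer that resets on every dictation. The class below is a hypothetical sketch of that idea with an injectable clock so it can be tested; it is not how InkSpoke implements Model Memory.

```python
import time
from typing import Callable, Optional


class IdleUnloader:
    """Tracks one model and unloads it after a period without use."""

    def __init__(
        self,
        idle_limit_s: float = 600.0,                    # 10 minutes, the documented default
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.idle_limit_s = idle_limit_s
        self._clock = clock
        self._last_used: Optional[float] = None
        self.loaded = False

    def use(self) -> None:
        """Call when a dictation starts: (re)load and reset the idle timer."""
        self.loaded = True
        self._last_used = self._clock()

    def tick(self) -> bool:
        """Unload if idle for too long; return True when an unload happened."""
        if self.loaded and self._clock() - self._last_used >= self.idle_limit_s:
            self.loaded = False
            return True
        return False
```

Calling `use()` after an unload simply marks the model as loaded again, which mirrors the reload on the next dictation.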

Providers (BYOK, Pro)

PRO (requires Pro or Perpetual) — Bring Your Own Key. Connect any OpenAI-compatible endpoint (a commercial API, or a local server like Ollama or LM Studio) with your own key. Each provider you add can expose several models that then appear in the Global Defaults and workspace pickers alongside Platform and On-Device options.

┌───────────────────────────────────────────────────────────┐
│  AI Models  ›  Providers   [PRO]                [ + Add ] │
├───────────────────────────────────────────────────────────┤
│  My OpenAI            api.openai.com/v1        3 models   │
│  Local LM Studio      localhost:1234/v1        1 model    │
│                                                           │
│  ── Editing a provider ──────────────────────────────     │
│   Quick Setup   [ preset ▾ ]  [ Apply ]                   │
│   Name          [ My OpenAI            ]                  │
│   API Base URL  [ https://api.openai.com/v1 ]             │
│   API Key       [ ••••••••••••  ] (stored in keychain)    │
│   Timeout       [  30  ] s                                │
│   Models        id · type · token limit · temperature     │
│                 [ + add row ]                             │
│                 [ Test Connection ]   [ Cancel ] [ Save ] │
└───────────────────────────────────────────────────────────┘

Add Provider is disabled unless you're on Pro — but any providers you already added stay viewable and deletable even if your plan lapses; you just can't add new ones until you upgrade again. Each provider card shows its name, base URL, and model count.

Editing a provider gives you these fields:

Field	Notes
Quick Setup	Optional preset picker + Apply to pre-fill a common provider.
Name	Your label for the provider.
API Base URL	The OpenAI-compatible endpoint. Built-in presets already include the `/v1` suffix.
API Key	Masked in the UI and stored in your OS keychain — never in InkSpoke's settings file.
Timeout	5–120 s. Defaults to 10 s for text models, 30 s for speech.
Temperature	Sampling temperature for the provider.
Models table	One row per model: id, type (speech or text), token limit (50k–1M, in 5k steps), and temperature. Add or remove rows as needed.
Test Connection	Verifies your key and URL work before you save.
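To see what a connection check against an OpenAI-compatible endpoint can look like, here is a sketch that lists models with `GET <base URL>/models` and a bearer token, and validates the documented timeout and token-limit ranges. The helper names are hypothetical, and whether InkSpoke's Test Connection uses this exact request is an assumption.

```python
import urllib.request

TIMEOUT_RANGE_S = (5, 120)
TOKEN_LIMIT_RANGE = (50_000, 1_000_000)
TOKEN_LIMIT_STEP = 5_000


def validate_timeout(seconds: int) -> int:
    low, high = TIMEOUT_RANGE_S
    if not low <= seconds <= high:
        raise ValueError(f"timeout must be {low}-{high} s, got {seconds}")
    return seconds


def validate_token_limit(tokens: int) -> int:
    low, high = TOKEN_LIMIT_RANGE
    if not low <= tokens <= high or tokens % TOKEN_LIMIT_STEP != 0:
        raise ValueError(f"token limit must be {low}-{high} in steps of {TOKEN_LIMIT_STEP}")
    return tokens


def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build the request; base_url is expected to already end in /v1."""
    url = base_url.rstrip("/") + "/models"
    headers = {"Authorization": "Bearer " + api_key}
    return urllib.request.Request(url, headers=headers, method="GET")


def check_connection(base_url: str, api_key: str, timeout_s: int = 10) -> bool:
    """Return True if the endpoint answers the model listing with HTTP 200."""
    timeout_s = validate_timeout(timeout_s)
    request = build_models_request(base_url, api_key)
    try:
        with urllib.request.urlopen(request, timeout=timeout_s) as response:
            return response.status == 200
    except OSError:  # covers URLError, HTTPError, and socket timeouts
        return False
```

For a local server such as LM Studio, the same call would use a base URL like `http://localhost:1234/v1`.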

Providers and their models are stored in InkSpoke's local database, with keys in the keychain — none of it lands in settings.json. You can manage the same personal keys from the web account.

Keys live in your keychain, not the cloud

API keys you paste here are held in your operating system's secure credential store on this device. Removing a provider deletes its stored key.

Platform notes

On-device speech models use whatever acceleration your machine offers. The Use GPU for dictation toggle and the Apple Neural Engine card appear only where they apply:

Platform	On-device acceleration
Windows	CUDA GPU when present, otherwise CPU.
macOS	Metal GPU, plus an optional Apple Neural Engine download used automatically once installed.
Linux	CPU only (no GPU toggle shown).
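The table can be restated as a lookup. This sketch uses hypothetical names and covers only the GPU choice; the optional Apple Neural Engine download is left out because the page says only that it is used automatically once installed.

```python
def on_device_backend(os_name: str, cuda_present: bool = False, use_gpu: bool = True) -> str:
    """Return the acceleration backend the table above implies."""
    if os_name == "windows" and cuda_present and use_gpu:
        return "cuda"
    if os_name == "macos" and use_gpu:
        return "metal"
    # Linux, Windows without CUDA, or the GPU toggle switched off.
    return "cpu"
```

Turning off Use GPU for dictation maps to `use_gpu=False`, which is assumed here to fall back to CPU on every platform.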

The Sound Effects card is shown only on platforms that ship capture chimes.

Next steps

Models and providers — the concepts: two roles, three sources, and how workspaces override them.
General and hotkeys — the AI Refinement master switch, Model Memory, and injection settings.
On-device vs. cloud and privacy — decide where your words get processed.
Account, sync, and updates — unlock Pro to use the models above.
