Models and providers

InkSpoke runs on two kinds of AI model working in sequence: one turns your speech into text, and the other polishes that text before it lands in your app. This page explains those two roles, the three places a model can come from, and how you choose which ones are active, both globally and per workspace.

The two model roles

Every dictation flows through up to two models, in order:

Role	What it does	Also called
Speech recognition	Converts your audio into a raw transcript.	ASR, "speech model"
Text processing	Refines the transcript — removes filler, fixes grammar, matches the app's tone.	LLM, "text model", refinement

The speech model always runs. The text model runs only when the master AI Refinement switch is on, which it is by default. Turn refinement off and InkSpoke injects the raw transcript verbatim.
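The flow above can be pictured as a two-stage pipeline. The sketch below is illustrative only: `dictate`, `fake_transcribe`, and `fake_refine` are hypothetical names, not InkSpoke's API, and the two "models" are toy functions standing in for real ones.

```python
from typing import Callable

# Hypothetical stand-ins for the two model roles; not InkSpoke's real API.
SpeechModel = Callable[[bytes], str]  # audio -> raw transcript
TextModel = Callable[[str], str]      # raw transcript -> refined text


def dictate(audio: bytes, transcribe: SpeechModel, refine: TextModel,
            refinement_on: bool = True) -> str:
    """Always run the speech model; run the text model only if refinement is on."""
    raw = transcribe(audio)
    if not refinement_on:
        return raw  # the raw transcript is passed through verbatim
    return refine(raw)


# Toy models for demonstration only.
def fake_transcribe(audio: bytes) -> str:
    return "um so the build is uh failing"


def fake_refine(text: str) -> str:
    fillers = {"um", "uh", "so"}
    words = [w for w in text.split() if w not in fillers]
    return " ".join(words).capitalize() + "."
```

With refinement on, the toy pipeline returns "The build is failing."; with `refinement_on=False` it returns the raw transcript unchanged.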

Note: The two roles are configured independently. You can pair an on-device speech model with a cloud text model, or vice versa, whichever fits your privacy and speed needs. See On-device vs. cloud.

The three sources

Either role can be filled by a model from any of three sources. Every model picker groups them exactly this way:

Source	What it is	Runs	Tier
Platform	InkSpoke's own hosted models — nothing to download or configure.	Cloud	Pro (included in the Pro trial)
On-Device	Models you download and run locally: Whisper for speech, a local GGUF model for text.	On your machine	Small speech model is free; the rest are Pro
BYOK	Bring Your Own Key — connect any OpenAI-compatible provider with your own API key.	That provider's cloud	Pro

New installs start with a deliberately simple, private-by-default pairing:

Speech: Whisper Small (on-device, offline, free).
Text: Platform AI (cloud) — active during your Pro trial.

You can change either side at any time.

Platform models

Platform models are InkSpoke's built-in, hosted models. There's nothing to set up — pick one and it works. Platform text refinement is part of Pro (and available during the Pro trial); when the trial ends, on-device and BYOK options keep working.

On-Device models

On-device models download to your machine and run entirely offline — your audio and text never leave the computer. They live under AI Models → On-Device (a Pro area), where each model card shows its size, speed/accuracy ratings, language support, and a Download/Delete button. A storage bar tracks total disk use, and the model that's currently in use can't be deleted.

Speech — Whisper sizes. InkSpoke ships the Whisper family in several sizes. Only Small is free; every other size (larger and smaller) requires Pro or Perpetual.

Whisper size	Tier	Rough trade-off
Tiny	Pro	Fastest, lowest accuracy.
Base	Pro	A step up from Tiny.
Small (244M)	Free	The default — a balanced choice for most people.
Medium	Pro	More accurate, slower and heavier.
Large	Pro	Highest accuracy, most demanding.
Large V3 Turbo	Pro	Large-model accuracy, noticeably faster.
Large V3 Turbo Q5 / Q8	Pro	Quantized Turbo variants that trade a little accuracy for lower memory use.

As a rule, smaller models are faster and lighter; larger ones are more accurate. The Turbo variants aim for near-Large accuracy at higher speed.

Text — local LLM. The On-Device tab also offers a downloadable local text model (GGUF format) so refinement can run fully offline too. It has one tunable you won't find on cloud models:

Setting	Default	Range	What it does
Max context size	16,384	512 – 131,072	How much text, in tokens, the local model can consider at once. Larger values use more memory.
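As a rough illustration of the range in the table, the sketch below keeps a requested context size inside the documented bounds. The function name is hypothetical, and whether InkSpoke clamps an out-of-range value or rejects it is an assumption; only the three numbers come from the table.

```python
from typing import Optional

# Values from the table above.
MIN_CONTEXT = 512
MAX_CONTEXT = 131_072
DEFAULT_CONTEXT = 16_384


def clamp_context_size(requested: Optional[int]) -> int:
    """Return a context size within the documented range (hypothetical helper).

    Falls back to the default when nothing is requested. Clamping, rather
    than raising an error, is an assumption made for this sketch.
    """
    if requested is None:
        return DEFAULT_CONTEXT
    return max(MIN_CONTEXT, min(MAX_CONTEXT, requested))
```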

Free your RAM automatically

On-device models can be memory-hungry. By default InkSpoke unloads idle models after 10 minutes (the Model Memory strategy on Configuration → General), reloading them the next time you dictate. Power users can switch this to Always loaded for zero warm-up, or Manual with an "Unload models now" button.
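The three memory strategies can be sketched as a small state tracker. Everything named here (`MemoryStrategy`, `ModelSlot`, `tick`) is hypothetical and only models the behavior described above; the 10-minute window is the documented default, and treating exactly 10 minutes as "idle" is an assumption.

```python
import time
from enum import Enum
from typing import Optional


class MemoryStrategy(Enum):
    UNLOAD_WHEN_IDLE = "unload_when_idle"  # hypothetical id for the default
    ALWAYS_LOADED = "always_loaded"
    MANUAL = "manual"


IDLE_TIMEOUT_SECONDS = 10 * 60  # default idle window from the text


class ModelSlot:
    """Tracks whether one on-device model is resident in memory."""

    def __init__(self, strategy: MemoryStrategy = MemoryStrategy.UNLOAD_WHEN_IDLE):
        self.strategy = strategy
        self.loaded = False
        self.last_used: Optional[float] = None

    def use(self, now: Optional[float] = None) -> None:
        """Called on each dictation: (re)load the model and record the time."""
        self.loaded = True
        self.last_used = time.monotonic() if now is None else now

    def tick(self, now: Optional[float] = None) -> None:
        """Called periodically: unload only under the idle-unload strategy."""
        now = time.monotonic() if now is None else now
        idle_strategy = self.strategy is MemoryStrategy.UNLOAD_WHEN_IDLE
        if idle_strategy and self.loaded and self.last_used is not None:
            if now - self.last_used >= IDLE_TIMEOUT_SECONDS:
                self.loaded = False

    def unload_now(self) -> None:
        """Manual unload, as with an "Unload models now" button."""
        self.loaded = False
```

Passing `now` explicitly makes the timing easy to test; in normal use the monotonic clock is read instead.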

BYOK — bring your own key

If you already pay for OpenAI, run a local server such as Ollama or LM Studio, or use any other OpenAI-compatible endpoint, you can connect it directly under AI Models → Providers (a Pro area). Each provider you add can expose several models, which then appear in the pickers alongside the Platform and On-Device options.

Adding a provider takes a few fields:

Quick Setup preset (optional) to pre-fill common providers, or fill in manually.
Name and API Base URL.
API Key — masked in the UI and stored in your OS keychain, never in InkSpoke's settings file.
Timeout and temperature, plus a models table where you list each model's id, type (speech or text), and token limit.
A Test Connection button to confirm the key and URL work before you save.

Your keys stay on your device. You can also manage the same personal keys from the web account.
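To make the provider fields concrete, here is a minimal sketch of a provider record and the kind of request a connection test could send. The class and function names are hypothetical, the default timeout and temperature are placeholders, and this is not necessarily what InkSpoke's Test Connection button does. The one real-world detail it relies on is that OpenAI-compatible servers list their models at `GET <base URL>/models` with an `Authorization: Bearer` header.

```python
import urllib.request
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProviderModel:
    id: str
    type: str          # "speech" or "text"
    token_limit: int


@dataclass
class Provider:
    name: str
    api_base_url: str
    timeout_seconds: float = 30.0  # placeholder default, not from the docs
    temperature: float = 0.2       # placeholder default, not from the docs
    models: List[ProviderModel] = field(default_factory=list)


def build_test_request(provider: Provider, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET to the provider's model-list endpoint."""
    url = provider.api_base_url.rstrip("/") + "/models"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
```

The key is passed in as an argument rather than stored on the `Provider` record, mirroring the idea that secrets live in the OS keychain and not alongside the other settings. Sending the request with `urllib.request.urlopen(req, timeout=provider.timeout_seconds)` and checking for a successful response would complete the test.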

Choosing the active models

Your global choices live under AI Models → Global Defaults. Two pickers — one per role — each list every available model, grouped by source:

┌───────────────────────────────────────────────────────────┐
│  AI Models  ›  Global Defaults                            │
├───────────────────────────────────────────────────────────┤
│  Speech recognition                                       │
│   [ Whisper Small  (On-Device) ▾ ]                        │
│      ── Platform ──   ── On-Device ──   ── BYOK ──        │
│                                                           │
│  Text processing (refinement)                             │
│   [ Platform AI  (Platform) ▾ ]                           │
│      Workspace-default refinement:  [ ✓ On ]              │
│      Token limit:  [  −   2048   +  ]                     │
│                                                           │
│   (Master AI Refinement is ON)                            │
└───────────────────────────────────────────────────────────┘

Whatever you pick here is the default for every dictation — unless a workspace overrides it.

The text-processing side has two extra controls: a workspace-default refinement toggle (on by default) that decides whether workspaces without their own text model still get refined by this default, and a token limit stepper that caps how long a refined response can be. If the master AI Refinement switch is off, these are greyed out with a reminder.

How workspaces override per context

A workspace can pin its own preferred speech model and preferred text model, so dictation into your IDE can use different models than dictation into email — automatically, based on the app you're in.

When it's time to refine, InkSpoke resolves the text model in strict order:

The speech model resolves more simply: a workspace's preferred speech model (if set) wins for that dictation; otherwise the Global Default is used. You can also override the workspace itself — and therefore its models — for a single session from the picker on the listening overlay.
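The speech-model rule can be written as a short lookup. This is a sketch under stated assumptions: `Workspace` and `resolve_speech_model` are hypothetical names, and the model identifiers in the usage note are made up for illustration. It covers only the speech side described in the paragraph above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Workspace:
    name: str
    preferred_speech_model: Optional[str] = None


def resolve_speech_model(global_default: str,
                         matched: Optional[Workspace] = None,
                         session_override: Optional[Workspace] = None) -> str:
    """Pick the speech model for one dictation.

    A workspace chosen on the listening overlay replaces the automatically
    matched one for the session; that workspace's preferred speech model
    wins if set, otherwise the Global Default is used.
    """
    workspace = session_override if session_override is not None else matched
    if workspace is not None and workspace.preferred_speech_model:
        return workspace.preferred_speech_model
    return global_default
```

For example, with a Global Default of `"whisper-small"` and a "Code" workspace that pins `"whisper-large"`, dictation into the IDE resolves to `"whisper-large"`, while a workspace with no preference falls back to `"whisper-small"`.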

Tip: This is why the same words come out differently in different apps: your "Code" workspace might pin an accuracy-first Whisper size and skip refinement, while your "Email" workspace uses a cloud text model tuned for a professional tone. See Smart matching and precedence.

What needs Pro

Most of the model catalog is a Pro feature. Here's the quick map:

Capability	Free	Pro / Perpetual
Whisper Small (on-device speech)	✅	✅
Other on-device Whisper sizes	—	✅
On-device local text (LLM) model	—	✅
Platform (cloud) models	Trial	✅
BYOK providers & keys	—	✅

Note: BYOK providers you already added stay viewable and deletable even if your plan lapses. You just can't add new ones until you're on Pro again.
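The table above can be encoded as a simple lookup. The capability keys and the `is_available` function are hypothetical, and the sketch models only what the table shows: rows marked with a dash are treated as unavailable on the free plan even during a trial, which is an assumption rather than documented behavior.

```python
# Free-column values from the table above: "yes", "trial", or "no".
FREE_TIER = {
    "whisper_small": "yes",
    "other_whisper_sizes": "no",
    "local_text_model": "no",
    "platform_models": "trial",
    "byok_providers": "no",
}

PAID_PLANS = {"pro", "perpetual"}


def is_available(capability: str, plan: str, trial_active: bool = False) -> bool:
    """Return True if the capability is usable on the given plan (sketch only)."""
    if plan in PAID_PLANS:
        return True  # every row is checked in the Pro / Perpetual column
    free_rule = FREE_TIER[capability]
    if free_rule == "trial":
        return trial_active
    return free_rule == "yes"
```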

Platform notes

On-device speech models can use hardware acceleration where it's available:

Platform	On-device acceleration
Windows	CUDA GPU when present, otherwise CPU.
macOS	Metal GPU, plus an optional Apple Neural Engine accelerator (a small extra download that's used automatically once installed).
Linux	CPU only.

The GPU toggle (Use GPU for dictation, on by default) appears only on platforms that support it.
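The acceleration table maps naturally onto a small selection function. The names below are hypothetical, and two behaviors are assumptions made for the sketch: that turning the GPU toggle off falls back to CPU, and that the Neural Engine accelerator is used whenever it is installed, independent of the toggle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Acceleration:
    backend: str         # "cuda", "metal", or "cpu"
    neural_engine: bool  # Apple Neural Engine accelerator in use


def pick_acceleration(os_name: str, cuda_present: bool = False,
                      ane_installed: bool = False,
                      use_gpu: bool = True) -> Acceleration:
    """Choose on-device acceleration per the table above (illustrative only)."""
    if os_name == "windows":
        backend = "cuda" if (use_gpu and cuda_present) else "cpu"
        return Acceleration(backend, neural_engine=False)
    if os_name == "macos":
        backend = "metal" if use_gpu else "cpu"
        return Acceleration(backend, neural_engine=ane_installed)
    # Linux (and anything unrecognized) runs on CPU only.
    return Acceleration("cpu", neural_engine=False)
```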

Next steps

Audio and models settings — where the pickers, downloads, and providers live.
On-device vs. cloud and privacy — decide where your words get processed.
Create and tune a workspace — pin per-context speech and text models.
Choosing your models on the web — set active models from your account.
