
Models and providers

InkSpoke runs on two kinds of AI model working back to back: one turns your speech into text, the other polishes that text before it lands in your app. This page explains those two roles, the three places a model can come from, and how you choose which ones are active — globally and per workspace.

The two model roles

Every dictation flows through up to two models, in order:

| Role | What it does | Also called |
| --- | --- | --- |
| Speech recognition | Converts your audio into a raw transcript. | ASR, "speech model" |
| Text processing | Refines the transcript: removes filler, fixes grammar, matches the app's tone. | LLM, "text model", refinement |

The speech model always runs. The text model only runs when AI refinement is on (its master switch, AI Refinement, is enabled by default). Turn refinement off and InkSpoke injects the raw transcript verbatim.
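The flow described above can be sketched in a few lines. This is only an illustration of the ordering, not InkSpoke's actual code: the `dictate` function, its parameters, and the two stand-in models are hypothetical names invented for the example.

```python
from typing import Callable


def dictate(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    refine: Callable[[str], str],
    refinement_enabled: bool = True,  # mirrors the documented default of the AI Refinement switch
) -> str:
    """Return the text that would be injected into the target app."""
    raw_transcript = transcribe(audio)  # the speech model always runs
    if not refinement_enabled:
        return raw_transcript  # refinement off: raw transcript, verbatim
    return refine(raw_transcript)  # refinement on: the text model polishes it


# Stand-in models, for illustration only.
def fake_speech_model(audio: bytes) -> str:
    return "um so the meeting is at three"


def fake_text_model(text: str) -> str:
    return "The meeting is at three."
```

Calling `dictate(b"", fake_speech_model, fake_text_model)` goes through both stand-ins, while passing `refinement_enabled=False` returns the speech model's output untouched.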

Note: The two roles are configured independently. You can pair an on-device speech model with a cloud text model, or vice versa, whichever fits your privacy and speed needs. See On-device vs. cloud.

The three sources

Both roles can be filled by a model from one of three sources. In every model picker they're grouped exactly this way:

| Source | What it is | Runs | Tier |
| --- | --- | --- | --- |
| Platform | InkSpoke's own hosted models. Nothing to download or configure. | Cloud | Pro (included in the Pro trial) |
| On-Device | Models you download and run locally: Whisper for speech, a local GGUF model for text. | On your machine | Small speech model is free; the rest are Pro |
| BYOK | Bring Your Own Key: connect any OpenAI-compatible provider with your own API key. | That provider's cloud | Pro |

New installs start with a deliberately simple, private-by-default pairing:

  • Speech: Whisper Small (on-device, offline, free).
  • Text: Platform AI (cloud) — active during your Pro trial.

You can change either side at any time.

Platform models

Platform models are InkSpoke's built-in, hosted models. There's nothing to set up — pick one and it works. Platform text refinement is part of Pro (and available during the Pro trial); when the trial ends, on-device and BYOK options keep working.

On-Device models

On-device models download to your machine and run entirely offline — your audio and text never leave the computer. They live under AI Models → On-Device (a Pro area), where each model card shows its size, speed/accuracy ratings, language support, and a Download/Delete button. A storage bar tracks total disk use, and the model that's currently in use can't be deleted.

Speech — Whisper sizes. InkSpoke ships the Whisper family in several sizes. Only Small is free; every other size (larger and smaller) requires Pro or Perpetual.

| Whisper size | Tier | Rough trade-off |
| --- | --- | --- |
| Tiny | Pro | Fastest, lowest accuracy. |
| Base | Pro | A step up from Tiny. |
| Small (244M) | Free | The default, a balanced choice for most people. |
| Medium | Pro | More accurate, slower and heavier. |
| Large | Pro | Highest accuracy, most demanding. |
| Large V3 Turbo | Pro | Large-model accuracy, noticeably faster. |
| Large V3 Turbo Q5 / Q8 | Pro | Quantized Turbo variants that trade a little accuracy for lower memory use. |

As a rule, smaller models are faster and lighter; larger ones are more accurate. The Turbo variants aim for near-Large accuracy at higher speed.

Text — local LLM. The On-Device tab also offers a downloadable local text model (GGUF format) so refinement can run fully offline too. It has one tunable you won't find on cloud models:

| Setting | Default | Range | What it does |
| --- | --- | --- | --- |
| Max context size | 16,384 | 512 – 131,072 | How much text (in tokens) the local model can consider at once. Larger values use more memory. |

Free your RAM automatically

On-device models can be memory-hungry. By default InkSpoke unloads idle models after 10 minutes (the Model Memory strategy on Configuration → General), reloading them the next time you dictate. Power users can switch this to Always loaded for zero warm-up, or Manual with an "Unload models now" button.
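One way to picture the idle-unload behaviour is a small wrapper that loads a model on demand and drops it once it has gone unused for the timeout. This is a minimal sketch under stated assumptions, not InkSpoke's implementation: the `LazyModel` class, its method names, and the idea of a periodic `unload_if_idle` check are all hypothetical. Only the 10-minute default and the three strategies come from the text above.

```python
import time
from typing import Any, Callable, Optional

IDLE_TIMEOUT_SECONDS = 10 * 60  # the documented 10-minute default


class LazyModel:
    """Hypothetical wrapper: load on first use, unload after an idle period."""

    def __init__(
        self,
        loader: Callable[[], Any],
        idle_timeout: Optional[float] = IDLE_TIMEOUT_SECONDS,  # None ~ "Always loaded"
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._idle_timeout = idle_timeout
        self._clock = clock
        self._model: Any = None
        self._last_used = 0.0

    @property
    def loaded(self) -> bool:
        return self._model is not None

    def get(self) -> Any:
        """Return the model, (re)loading it if needed, and record the time of use."""
        if self._model is None:
            self._model = self._loader()
        self._last_used = self._clock()
        return self._model

    def unload_if_idle(self) -> bool:
        """Meant to be called periodically; drops the model once the timeout has elapsed."""
        if self._model is None or self._idle_timeout is None:
            return False
        if self._clock() - self._last_used >= self._idle_timeout:
            self._model = None
            return True
        return False

    def unload_now(self) -> None:
        """Rough analogue of a manual "Unload models now" action."""
        self._model = None
```

In this sketch, passing `idle_timeout=None` approximates Always loaded, and calling only `unload_now()` approximates Manual. The injectable `clock` exists so the timing logic can be exercised without waiting ten minutes.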

BYOK — bring your own key

If you already pay for OpenAI, or run a local server like Ollama or LM Studio, or use any other OpenAI-compatible endpoint, connect it directly under AI Models → Providers (a Pro area). Each provider you add can expose several models that then appear in the pickers alongside Platform and On-Device options.

Adding a provider takes a few fields:

  • Quick Setup preset (optional) to pre-fill common providers, or fill in manually.
  • Name and API Base URL.
  • API Key — masked in the UI and stored in your OS keychain, never in InkSpoke's settings file.
  • Timeout and temperature, plus a models table where you list each model's id, type (speech or text), and token limit.
  • A Test Connection button to confirm the key and URL work before you save.

Your keys stay on your device. You can also manage the same personal keys from the web account.
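To make the provider fields concrete, here is a hedged sketch of what a provider record and a basic connectivity check could look like. The `Provider` and `ProviderModel` classes, the default timeout and temperature values, and the `build_models_request` helper are hypothetical. The source does not say what Test Connection actually sends; listing models via `GET <base URL>/models` with a Bearer token is simply a common check against OpenAI-compatible endpoints.

```python
from dataclasses import dataclass, field
from typing import List
from urllib.request import Request


@dataclass
class ProviderModel:
    id: str           # model id as the provider names it
    type: str         # "speech" or "text"
    token_limit: int


@dataclass
class Provider:
    name: str
    api_base_url: str
    timeout_seconds: float = 30.0  # hypothetical default, not from the docs
    temperature: float = 0.2       # hypothetical default, not from the docs
    models: List[ProviderModel] = field(default_factory=list)


def build_models_request(provider: Provider, api_key: str) -> Request:
    """Build (but do not send) a model-listing request for an OpenAI-compatible API.

    The key is passed in rather than stored on the record, echoing the idea
    that it lives in the OS keychain and not in a settings file.
    """
    url = provider.api_base_url.rstrip("/") + "/models"
    return Request(url, headers={"Authorization": f"Bearer {api_key}"}, method="GET")
```

Sending the request with `urllib.request.urlopen(request, timeout=provider.timeout_seconds)` and checking for a successful response would confirm that both the URL and the key are accepted.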

Choosing the active models

Your global choices live under AI Models → Global Defaults. Two pickers — one per role — each list every available model, grouped by source:

┌───────────────────────────────────────────────────────────┐
│ AI Models › Global Defaults                               │
├───────────────────────────────────────────────────────────┤
│ Speech recognition                                        │
│ [ Whisper Small (On-Device) ▾ ]                           │
│ ── Platform ── ── On-Device ── ── BYOK ──                 │
│                                                           │
│ Text processing (refinement)                              │
│ [ Platform AI (Platform) ▾ ]                              │
│ Workspace-default refinement: [ ✓ On ]                    │
│ Token limit: [ − 2048 + ]                                 │
│                                                           │
│ (Master AI Refinement is ON)                              │
└───────────────────────────────────────────────────────────┘

Whatever you pick here is the default for every dictation — unless a workspace overrides it.

The text-processing side has two extra controls: a workspace-default refinement toggle (on by default) that decides whether workspaces without their own text model still get refined by this default, and a token limit stepper that caps how long a refined response can be. If the master AI Refinement switch is off, these are greyed out with a reminder.

How workspaces override per context

A workspace can pin its own preferred speech model and preferred text model, so dictation into your IDE can use different models than dictation into email — automatically, based on the app you're in.

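When it's time to refine, InkSpoke resolves the text model in strict order:

  1. If the master AI Refinement switch is off, no text model runs and the raw transcript is injected.
  2. If the active workspace has its own preferred text model, that model is used.
  3. Otherwise, if workspace-default refinement is on, the Global Default text model is used.
  4. Otherwise, the transcript is left unrefined.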

The speech model resolves more simply: a workspace's preferred speech model (if set) wins for that dictation; otherwise the Global Default is used. You can also override the workspace itself — and therefore its models — for a single session from the picker on the listening overlay.
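The resolution rules can be sketched as two small functions. The speech rule follows the paragraph above directly. The text rule is an assumption pieced together from the master switch, the workspace-preferred model, and the workspace-default refinement toggle described on this page, so treat its ordering as illustrative. All class and field names are hypothetical, and per-workspace refinement settings and the single-session override are not modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GlobalDefaults:
    speech_model: str
    text_model: str
    ai_refinement: bool = True                 # master switch, on by default
    workspace_default_refinement: bool = True  # on by default


@dataclass
class Workspace:
    name: str
    preferred_speech_model: Optional[str] = None
    preferred_text_model: Optional[str] = None


def resolve_speech_model(ws: Workspace, g: GlobalDefaults) -> str:
    # A workspace's preferred speech model wins; otherwise the Global Default.
    return ws.preferred_speech_model or g.speech_model


def resolve_text_model(ws: Workspace, g: GlobalDefaults) -> Optional[str]:
    """Return the text model to use, or None for 'inject the raw transcript'."""
    if not g.ai_refinement:                 # assumed first: master switch off
        return None
    if ws.preferred_text_model:             # assumed second: workspace pin
        return ws.preferred_text_model
    if g.workspace_default_refinement:      # assumed third: fall back to the default
        return g.text_model
    return None
```

For example, a workspace with no pinned models resolves to the Global Defaults on both sides, and flipping `workspace_default_refinement` to `False` makes its text model resolve to `None`.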

Tip: This is why the same words come out differently in different apps: your "Code" workspace might pin an accuracy-first Whisper size and skip refinement, while your "Email" workspace uses a cloud text model tuned for a professional tone. See Smart matching and precedence.

What needs Pro

Most of the model catalog is a Pro feature. Here's the quick map:

| Capability | Free | Pro / Perpetual |
| --- | --- | --- |
| Whisper Small (on-device speech) | Yes | Yes |
| Other on-device Whisper sizes | No | Yes |
| On-device local text (LLM) model | No | Yes |
| Platform (cloud) models | Trial | Yes |
| BYOK providers & keys | No | Yes |

Note: BYOK providers you already added stay viewable and deletable even if your plan lapses. You just can't add new ones until you're on Pro again.

Platform notes

On-device speech models can use hardware acceleration where it's available:

| Platform | On-device acceleration |
| --- | --- |
| Windows | CUDA GPU when present, otherwise CPU. |
| macOS | Metal GPU, plus an optional Apple Neural Engine accelerator (a small extra download that's used automatically once installed). |
| Linux | CPU only. |

The GPU toggle (Use GPU for dictation, on by default) appears only on platforms that support it.
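The per-platform table amounts to a simple selection rule, sketched below. The `pick_backend` function and its return strings are hypothetical, and the sketch assumes that turning the GPU toggle off falls back to CPU, which the text implies but does not state outright. The optional Apple Neural Engine accelerator is left out.

```python
def pick_backend(platform: str, has_cuda: bool = False, use_gpu: bool = True) -> str:
    """Pick an acceleration backend from a sys.platform-style string.

    use_gpu stands in for the "Use GPU for dictation" toggle (on by default).
    """
    if platform.startswith("win"):
        # Windows: CUDA GPU when present, otherwise CPU.
        return "cuda" if (use_gpu and has_cuda) else "cpu"
    if platform == "darwin":
        # macOS: Metal GPU (assumed to fall back to CPU when the toggle is off).
        return "metal" if use_gpu else "cpu"
    # Linux and anything else: CPU only.
    return "cpu"
```

In practice the first argument could come from `sys.platform`, which reports `"win32"`, `"darwin"`, or `"linux"` on the three supported systems.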

Next steps